\n","\n"," * Distributed under the terms of the GPL License\n"," * Maintainer: ryjo@mbari.org\n"," * Author: John Ryan ryjo@mbari.org"]},{"cell_type":"markdown","metadata":{"id":"oufZYHdskBWn"},"source":["## Fin whale song\n","---\n","Fin whales produce rhythmic structured sequences of sound, typically in a series of pulses described as 'song'. This tutorial describes use of the *Pacific Ocean Sound Recordings* archive to examine temporal patterns of occurrence of fin whale song.\n","\n","If you use this data set, please **[cite our project](https://ieeexplore.ieee.org/document/7761363).**\n"]},{"cell_type":"markdown","metadata":{"id":"9HiGo0WNkBWn"},"source":["## Data Overview\n","---"]},{"cell_type":"markdown","metadata":{"id":"tn2mT9DEkBWn"},"source":["### Recording site\n","The [recording site](https://www.mbari.org/at-sea/cabled-observatory/) is located on the continental slope of the eastern North Pacific, within [Monterey Bay National Marine Sanctuary](https://montereybay.noaa.gov/). The region is known to be [important foraging habitat](https://www.cascadiaresearch.org/publications/biologically-important-areas-selected-cetaceans-within-us-waters-%E2%80%93-west-coast-region) for the regional whale populations."]},{"cell_type":"markdown","metadata":{"id":"dk15J9HEkBWn"},"source":["### Hydrophone calibration\n","For the low-frequency (2 kHz) data, calibration data are not frequency dependent; a single low-frequency calibration value is used. Its value depends on time of data collection, as two hydrophones have been deployed sequentially at the same site. Before 14 June 2017, the calibration value is -168.8 dB re V / uPa (measured at 26 Hz). After this date the value is -177.9 dB re V / uPa (measured at 250 Hz). 
See also:\n","\n","\n","* https://bitbucket.org/mbari/pacific-sound/src/master/MBARI_MARS_Hydrophone_Deployment01.json\n","* https://bitbucket.org/mbari/pacific-sound/src/master/MBARI_MARS_Hydrophone_Deployment02.json\n","\n","The first hydrophone exhibited calibration drift, while the second (deployed 13 June 2017 and currently operational) has not. This observation is consistent with differences in the technologies of the two instruments. However, for this application the calibration drift of the first hydrophone is not problematic because the CI is computed as a signal-to-noise ratio. Therefore, time-series analysis of CI can reliably span the full archive."]},{"cell_type":"markdown","metadata":{"id":"Qxt8sRQWkBWo"},"source":["### Data files and archive organization\n","The decimated audio data are in daily [WAV](https://en.wikipedia.org/wiki/WAV) files in an S3 bucket named pacific-sound-2khz, grouped by year and month. S3 stores data as objects, so the files are not physically stored in folders or directories as you may be familiar with, but you can think of the archive conceptually as follows:\n","\n","```\n","pacific-sound-2khz\n"," |\n"," ----2020\n"," |\n"," |----01\n"," ...\n"," |----12\n","```\n"]},{"cell_type":"markdown","metadata":{"id":"0gCxAK9NkBWo"},"source":["## Install required dependencies\n","\n","First, let's install the required software dependencies.\n","\n","If you are using this notebook in a cloud environment, select a Python 3 compatible kernel and run this next section. This only needs to be done once for the duration of this notebook.\n","\n","If you are working on a local computer, you can skip this next cell. 
Change your kernel to *pacific-sound-notebooks*, which you installed according to the instructions in the [README](https://github.com/mbari-org/pacific-sound-notebooks/); it includes all of the required dependencies."]},{"cell_type":"code","execution_count":1,"metadata":{"id":"PdgRR34ykBWp","colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"status":"ok","timestamp":1704651117597,"user_tz":480,"elapsed":39498,"user":{"displayName":"John Ryan","userId":"06274207741251648949"}},"outputId":"6cfc0c78-cd6f-49a4-8905-fb15e280be05"},"outputs":[{"output_type":"stream","name":"stdout","text":["\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m139.3/139.3 kB\u001b[0m \u001b[31m1.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n","\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m11.9/11.9 MB\u001b[0m \u001b[31m26.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n","\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m82.1/82.1 kB\u001b[0m \u001b[31m8.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n","\u001b[?25h"]}],"source":["!pip install -q boto3 --quiet\n","!pip install -q soundfile --quiet\n","!pip install -q scipy --quiet\n","!pip install -q numpy --quiet\n","!pip install -q matplotlib --quiet"]},{"cell_type":"markdown","metadata":{"id":"cnvdJE7GkBWp"},"source":["### Import all packages"]},{"cell_type":"code","execution_count":2,"metadata":{"id":"RXuZEXTvkBWq","executionInfo":{"status":"ok","timestamp":1704651118661,"user_tz":480,"elapsed":1069,"user":{"displayName":"John Ryan","userId":"06274207741251648949"}}},"outputs":[],"source":["import boto3, botocore\n","from botocore import UNSIGNED\n","from botocore.client import Config\n","from urllib.request import urlopen # Python 3 standard library; no dependency on six\n","import io\n","import scipy\n","from scipy import signal\n","import numpy as np\n","import soundfile as sf\n","import matplotlib.pyplot as 
plt"]},{"cell_type":"markdown","metadata":{"id":"ncMMqR0wkBWq"},"source":["## Data Access\n","---\n","This section covers file listing, metadata retrieval, and data loading."]},{"cell_type":"markdown","metadata":{"id":"Z6NHGnrmkBWq"},"source":["### List files\n","Files are organized by year and month; list all of the files available for one month of one year."]},{"cell_type":"code","execution_count":3,"metadata":{"id":"CR8tNSNCkBWq","executionInfo":{"status":"ok","timestamp":1704651118662,"user_tz":480,"elapsed":5,"user":{"displayName":"John Ryan","userId":"06274207741251648949"}}},"outputs":[],"source":["s3 = boto3.client('s3',\n"," aws_access_key_id='',\n"," aws_secret_access_key='',\n"," config=Config(signature_version=UNSIGNED))"]},{"cell_type":"code","execution_count":4,"metadata":{"id":"WVuPmzvskBWq","colab":{"base_uri":"https://localhost:8080/"},"outputId":"7aa571a9-ee02-467b-db12-6d861697a666","executionInfo":{"status":"ok","timestamp":1704651118858,"user_tz":480,"elapsed":200,"user":{"displayName":"John 
Ryan","userId":"06274207741251648949"}}},"outputs":[{"output_type":"stream","name":"stdout","text":["2023/12/MARS-20231201T000000Z-2kHz.wav\n","2023/12/MARS-20231202T000000Z-2kHz.wav\n","2023/12/MARS-20231203T000000Z-2kHz.wav\n","2023/12/MARS-20231204T000000Z-2kHz.wav\n","2023/12/MARS-20231205T000000Z-2kHz.wav\n","2023/12/MARS-20231206T000000Z-2kHz.wav\n","2023/12/MARS-20231207T000000Z-2kHz.wav\n","2023/12/MARS-20231208T000000Z-2kHz.wav\n","2023/12/MARS-20231209T000000Z-2kHz.wav\n","2023/12/MARS-20231210T000000Z-2kHz.wav\n","2023/12/MARS-20231211T000000Z-2kHz.wav\n","2023/12/MARS-20231212T000000Z-2kHz.wav\n","2023/12/MARS-20231213T000000Z-2kHz.wav\n","2023/12/MARS-20231214T000000Z-2kHz.wav\n","2023/12/MARS-20231215T000000Z-2kHz.wav\n","2023/12/MARS-20231216T000000Z-2kHz.wav\n","2023/12/MARS-20231217T000000Z-2kHz.wav\n","2023/12/MARS-20231218T000000Z-2kHz.wav\n","2023/12/MARS-20231219T000000Z-2kHz.wav\n","2023/12/MARS-20231220T000000Z-2kHz.wav\n","2023/12/MARS-20231221T000000Z-2kHz.wav\n","2023/12/MARS-20231222T000000Z-2kHz.wav\n","2023/12/MARS-20231223T000000Z-2kHz.wav\n","2023/12/MARS-20231224T000000Z-2kHz.wav\n","2023/12/MARS-20231225T000000Z-2kHz.wav\n","2023/12/MARS-20231226T000000Z-2kHz.wav\n","2023/12/MARS-20231227T000000Z-2kHz.wav\n","2023/12/MARS-20231228T000000Z-2kHz.wav\n","2023/12/MARS-20231229T000000Z-2kHz.wav\n","2023/12/MARS-20231230T000000Z-2kHz.wav\n","2023/12/MARS-20231231T000000Z-2kHz.wav\n"]}],"source":["year = 2023\n","month = 12\n","bucket = 'pacific-sound-2khz'\n","\n","for obj in s3.list_objects_v2(Bucket=bucket, Prefix=f'{year:04d}/{month:02d}')['Contents']:\n"," print(obj['Key'])"]},{"cell_type":"markdown","metadata":{"id":"t9tfOzx1kBWr"},"source":["### Retrieve metadata\n","Read and show metadata for a single daily 
file."]},{"cell_type":"code","execution_count":5,"metadata":{"id":"fUiQcjgNkBWr","colab":{"base_uri":"https://localhost:8080/"},"outputId":"ff5ac190-d796-42e1-a3d4-fb3e611519ef","executionInfo":{"status":"ok","timestamp":1704651119065,"user_tz":480,"elapsed":210,"user":{"displayName":"John Ryan","userId":"06274207741251648949"}}},"outputs":[{"output_type":"execute_result","data":{"text/plain":["<_io.BytesIO object at 0x7a4c6ebcad40>\n","samplerate: 2000 Hz\n","channels: 1\n","duration: 222 samples\n","format: WAV (Microsoft) [WAV]\n","subtype: Signed 24 bit PCM [PCM_24]\n","endian: FILE\n","sections: 1\n","frames: 222\n","extra_info: \"\"\"\n"," Length : 1000\n"," RIFF : 518400324 (should be 992)\n"," WAVE\n"," fmt : 16\n"," Format : 0x1 => WAVE_FORMAT_PCM\n"," Channels : 1\n"," Sample Rate : 2000\n"," Block Align : 3\n"," Bit Width : 24\n"," Bytes/sec : 6000\n"," LIST : 280\n"," INFO\n"," INAM : MBARI ocean audio data, start 20231201T000000 UTC\n"," ICMT : If you use these data, please cite https://doi.org/10.1109/OCEANS.2016.7761363. 
Recording metadata can be found at https://bitbucket.org/mbari/pacific-sound/src/master/MBARI_MARS_Hydrophone_Deployment02.json.\n"," data : 518400000 (should be 668)\n"," End\n"," \"\"\""]},"metadata":{},"execution_count":5}],"source":["filename = 'MARS-20231201T000000Z-2kHz.wav'\n","key = f'{year:04d}/{month:02d}/{filename}'\n","\n","url = f'https://{bucket}.s3.amazonaws.com/{key}'\n","\n","sf.info(io.BytesIO(urlopen(url).read(1_000)), verbose=True)"]},{"cell_type":"markdown","metadata":{"id":"pfoQaAtAkBWr"},"source":["### Load data\n","Read a single daily file."]},{"cell_type":"code","execution_count":6,"metadata":{"id":"3_55ErkDkBWr","colab":{"base_uri":"https://localhost:8080/"},"outputId":"5f856e24-ff66-4870-cf81-18fa205c7fb7","executionInfo":{"status":"ok","timestamp":1704651130500,"user_tz":480,"elapsed":11438,"user":{"displayName":"John Ryan","userId":"06274207741251648949"}}},"outputs":[{"output_type":"stream","name":"stdout","text":["Reading from https://pacific-sound-2khz.s3.amazonaws.com/2023/12/MARS-20231201T000000Z-2kHz.wav\n","Read 86400.0 seconds of data\n"]}],"source":["# read full-day of data\n","print(f'Reading from {url}')\n","v, sample_rate = sf.read(io.BytesIO(urlopen(url).read()),dtype='float32')\n","v = v*3 # convert scaled voltage to volts\n","nsec = (v.size)/sample_rate # number of seconds in vector\n","print(f'Read {nsec} seconds of data')"]},{"cell_type":"markdown","metadata":{"id":"IP3JXXk0kBWr"},"source":["## A view of fin whale song\n","---\n","To understand the method of quantifying song occurrence using an energy metric, it is useful to first consider the attributes of fin whale song. Analysis approaches include (1) detecting, classifying, and counting calls, and (2) quantifying the energy within the frequency band of the call, relative to that at background frequencies. The first approach becomes difficult during periods when the whales chorus because the presence of overlapping calls thwarts distinction of individual calls. 
The second approach can be applied consistently regardless of whether whale calls overlap, and it is effective for quantifying the integrated signal received from many calling whales.\n","\n"]},{"cell_type":"code","execution_count":7,"metadata":{"id":"YfTAc5BGkBWr","colab":{"base_uri":"https://localhost:8080/"},"outputId":"371b3ea3-f8a4-4ca4-9d10-d05b3be6caf4","executionInfo":{"status":"ok","timestamp":1704651132525,"user_tz":480,"elapsed":2045,"user":{"displayName":"John Ryan","userId":"06274207741251648949"}}},"outputs":[{"output_type":"stream","name":"stdout","text":[":: psd.shape = (1001, 86400)\n",":: f.size = 1001\n",":: t.size = 86400\n"]},{"output_type":"stream","name":"stderr","text":[":5: RuntimeWarning: divide by zero encountered in log10\n"," psd = 10*np.log10(psd) - sens\n"]}],"source":["# Compute spectrogram\n","w = scipy.signal.get_window('hann',sample_rate)\n","f, t, psd = scipy.signal.spectrogram(v, sample_rate,nperseg=sample_rate,noverlap=0,window=w,nfft=sample_rate)\n","sens = -177.9 # hydrophone sensitivity (dB re V/uPa) for data recorded after 14 June 2017\n","psd = 10*np.log10(psd) - sens\n","print(f':: psd.shape = {psd.shape}')\n","print(f':: f.size = {f.size}')\n","print(f':: t.size = {t.size}')\n"]},{"cell_type":"code","source":["# View 15 minutes\n","start_hour = 15\n","start_sec = int(start_hour * 3600 + 1)\n","end_sec = start_sec+900-1\n","psd_subset = psd[:,start_sec:end_sec]\n","plt.figure(dpi=200, figsize = [9,3])\n","plt.imshow(psd_subset,aspect='auto',origin='lower',vmin=45,vmax=95)\n","plt.colorbar()\n","plt.ylim(10,50)\n","plt.xlabel('Second of hour 15')\n","plt.ylabel('Frequency (Hz)')\n","plt.title(r'Spectrum level (dB re 1 $\\mu$Pa$^2$/Hz)')\n","plt.annotate(\"fin whale pulses\",(400,30),color='w')"],"metadata":{"colab":{"base_uri":"https://localhost:8080/","height":0},"id":"8bsX0R2NcSda","outputId":"fdac025c-fb6a-4a0d-d1ba-dba90de26dd8","executionInfo":{"status":"ok","timestamp":1704651134204,"user_tz":480,"elapsed":1685,"user":{"displayName":"John 
Ryan","userId":"06274207741251648949"}}},"execution_count":8,"outputs":[{"output_type":"execute_result","data":{"text/plain":["Text(400, 30, 'fin whale pulses')"]},"metadata":{},"execution_count":8},{"output_type":"display_data","data":{"text/plain":["