Acoustic Feature Demo

[1]:
import avn.dataloading as dataloading
import avn.acoustics as acoustics

import pandas as pd

In this tutorial, we will use the avn.acoustics module to calculate a suite of acoustic features, first for a single period of song, then for all syllables in a syllable table. The song features included in avn are based on those in Sound Analysis Pro (SAP). They include:

  • Goodness of Pitch

  • Mean Frequency

  • Wiener Entropy

  • Amplitude

  • Amplitude Modulation

  • Frequency Modulation

  • Pitch

I plan to add a notebook with more information about each of these features in the future, but in the meantime more detailed descriptions of the features can be found in the Sound Analysis Pro documentation here: http://soundanalysispro.com/manual/chapter-4-the-song-features-of-sap2
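As a rough illustration of what one of these features measures, Wiener entropy is commonly defined as the log of the ratio between the geometric mean and the arithmetic mean of the power spectrum. The sketch below is a minimal pure-Python version of that definition. The function name wiener_entropy is hypothetical, and this is not avn's implementation, which may differ in windowing, frequency range, and scaling.

```python
import math

def wiener_entropy(power_spectrum):
    """Log ratio of the geometric to the arithmetic mean of a power spectrum.

    Hypothetical helper for illustration only; not part of avn.
    The value is 0 for a perfectly flat spectrum and becomes more negative
    as the energy concentrates in fewer frequency bins.
    """
    if len(power_spectrum) == 0:
        raise ValueError("power_spectrum must not be empty")
    if any(p <= 0 for p in power_spectrum):
        raise ValueError("power values must be strictly positive")
    n = len(power_spectrum)
    log_geometric_mean = sum(math.log(p) for p in power_spectrum) / n
    log_arithmetic_mean = math.log(sum(power_spectrum) / n)
    return log_geometric_mean - log_arithmetic_mean

flat = [1.0] * 8               # noise-like: equal power in every bin
peaked = [1e-6] * 7 + [1.0]    # tone-like: almost all power in one bin

print(wiener_entropy(flat))    # 0.0
print(wiener_entropy(peaked))  # strongly negative
```

Under this definition, noisy sounds score close to 0 and tonal sounds score well below 0.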

Calculating Acoustic Features for a Single Song Interval

Creating a SongInterval Object

AVN’s acoustics module centers around a class called SongInterval. This class stores information about a particular interval of audio, be that a song motif, a single syllable, or a full .wav file. It also has many methods which allow you to quickly calculate each acoustic feature across the interval and to extract summary statistics for each feature.

Let’s begin by loading an example .wav file using AVN’s dataloading module:

[2]:
song = dataloading.SongFile("../sample_data/G402/G402_43362.23322048_9_19_6_28_42.wav")

We create our SongInterval object by passing this SongFile object as an input to acoustics.SongInterval(). You can also pass the onset and offset timestamps of your interval of interest within the file; if these aren’t specified, the SongInterval will consist of the full file. In this tutorial, we will look at only the first 2 seconds of the file.

[3]:
song_interval = acoustics.SongInterval(song, onset = 0, offset = 2)

acoustics.SongInterval() also has many other optional parameters that change how the acoustic features are calculated. The default parameters are well suited for the analysis of zebra finch song and match those used by Sound Analysis Pro. You are unlikely to need to change them, but if you’re curious, the full list of parameters, their default values, and a brief explanation of how each is used are available in the docstring of the SongInterval class.

Calculating Acoustic Features

Now that we have our SongInterval object ready, we can calculate any single acoustic feature over this interval by calling one of the following methods:

  • .calc_goodness() to calculate the goodness of pitch

  • .calc_mean_frequency() to calculate the mean frequency

  • .calc_frequency_modulation() to calculate the frequency modulation

  • .calc_amplitude_modulation() to calculate the amplitude modulation

  • .calc_entropy() to calculate the Wiener entropy

  • .calc_amplitude() to calculate the amplitude

  • .calc_pitch() to estimate the fundamental frequency

Each of these methods returns a numpy array containing the value of the acoustic feature for each short time window in the song interval. As an example, let’s calculate the goodness of pitch.

[4]:
goodness = song_interval.calc_goodness()
print("Shape of Output: " + str(goodness.shape))
goodness
Shape of Output: (2206,)
[4]:
array([0.08748554, 0.08418637, 0.08150075, ..., 0.07466519, 0.09283684,
       0.08723036])
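The length of the returned array depends on how the interval is cut into short time windows. As a back-of-the-envelope sketch, the number of frames can be estimated from the interval duration, the sample rate, and the hop between successive windows. The sample rate (44100 Hz) and hop (40 samples) used below are assumptions chosen for illustration, not values read from avn, and estimate_n_frames is a hypothetical helper. The actual windowing parameters are listed in the SongInterval docstring.

```python
def estimate_n_frames(duration_s, sample_rate, hop_length):
    """Estimate the frame count, assuming one frame per hop plus one at the start.

    Hypothetical helper; padding and edge conventions vary between libraries.
    """
    n_samples = int(round(duration_s * sample_rate))
    return 1 + n_samples // hop_length

# Assumed values for illustration only (not read from avn)
n = estimate_n_frames(duration_s=2.0, sample_rate=44100, hop_length=40)
print(n)  # 2206
```

With these assumed values the estimate happens to match the 2206 frames above, but that agreement alone does not confirm which parameters avn uses.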

If you want to calculate multiple acoustic features, it is more efficient to use the .calc_all_features() method, which returns the features as a dictionary. By default, all available acoustic features are returned. If you only need some of them, pass a list of their names as the features parameter. The names must be spelled exactly as follows, including capitalization: ['Goodness', 'Mean_frequency', 'Frequency_modulation', 'Amplitude_modulation', 'Amplitude', 'Entropy', 'Pitch'].

Let’s calculate just the goodness of pitch and the amplitude in this example:

[5]:
features = song_interval.calc_all_features(features = ['Goodness', 'Amplitude'])
features
[5]:
{'Goodness': array([0.08748554, 0.08418637, 0.08150075, ..., 0.07466519, 0.09283684,
        0.08723036]),
 'Amplitude': array([44.05477002, 45.21658021, 46.0164633 , ..., 47.51012431,
        46.92642835, 46.14961101])}
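Because the feature names are case-sensitive, it can be convenient to check a list of names before passing it on. The helper below is a hypothetical sketch, not part of avn. It only uses the list of names given in this tutorial and suggests the correctly spelled name for near misses.

```python
# Feature names exactly as listed in this tutorial
VALID_FEATURES = ['Goodness', 'Mean_frequency', 'Frequency_modulation',
                  'Amplitude_modulation', 'Amplitude', 'Entropy', 'Pitch']

def check_feature_names(requested):
    """Return the requested names as a list, or raise ValueError on an unknown name.

    Hypothetical helper for illustration; not part of avn.
    """
    by_lowercase = {name.lower(): name for name in VALID_FEATURES}
    for name in requested:
        if name in VALID_FEATURES:
            continue
        # Look for a match that differs only in case or in spaces vs underscores
        suggestion = by_lowercase.get(name.lower().replace(" ", "_"))
        hint = f" Did you mean '{suggestion}'?" if suggestion else ""
        raise ValueError(f"Unknown feature '{name}'.{hint}")
    return list(requested)

print(check_feature_names(['Goodness', 'Amplitude']))
```

The returned list could then be passed as the features parameter.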

Visualizing Acoustic Features

SongInterval also has a handy plotting function which can plot any acoustic feature over a spectrogram of the interval. Let’s have a look at some of those features:

[6]:
song_interval.plot_feature(feature = "Goodness")
[Figure: Goodness plotted over the spectrogram of the song interval]
[7]:
song_interval.plot_feature(feature = "Entropy")
[Figure: Entropy plotted over the spectrogram of the song interval]

This will work for any of the acoustic features in avn.acoustics ('Goodness', 'Mean_frequency', 'Frequency_modulation', 'Amplitude_modulation', 'Amplitude', 'Entropy', or 'Pitch').

Calculating Acoustic Feature Summary Statistics

Now we know how to calculate acoustic features for each frame in a song interval. These can be interesting if you are curious about how a feature changes over the course of a syllable, for example, but it is generally more useful to have a single score per feature per interval. Thankfully, SongInterval also has a method to calculate summary statistics for each acoustic feature over the interval. Let’s do that for all the features available (which is the default if we don’t specify a subset of features) by calling .calc_feature_stats().

[8]:
feature_stats = song_interval.calc_feature_stats()
feature_stats
[8]:
Goodness Mean_frequency Entropy Amplitude Amplitude_modulation Frequency_modulation Pitch
mean 0.137592 2891.814561 -1.803505 56.846695 0.000021 0.454302 3511.723555
std 0.091330 747.591179 0.696087 11.296620 0.399036 0.354409 2397.807850
min 0.059758 1417.312546 -5.164130 44.054770 -4.353593 0.001553 538.585520
25% 0.096266 2385.826963 -2.058598 47.804789 -0.011437 0.156986 1133.268704
50% 0.112000 2833.490716 -1.590509 49.840404 -0.000314 0.388846 4277.993715
75% 0.135132 3322.131753 -1.339322 66.323051 0.008567 0.695680 5967.726910
max 0.799963 5229.989962 -0.792397 90.562486 4.662347 1.565718 7297.810765

It is a bit silly to calculate these statistics over such a long interval as in this example, but they can be very valuable when calculated over single syllables. They can be used to cluster syllables, to detect unusual syllable types, to measure the variability of individual syllable types, or to detect changes in song structure before and after a manipulation.
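The row labels of the summary table (mean, std, min, the three quartiles, and max) are the same as those produced by pandas' DataFrame.describe(). Whether avn calls describe() internally is not stated in this tutorial, but comparable statistics can be computed from any dictionary of equal-length per-frame arrays. The sketch below uses random stand-in data rather than real avn output.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Stand-in per-frame values; in practice this could be the dictionary
# returned by song_interval.calc_all_features()
features = {
    "Goodness": rng.random(100),
    "Amplitude": 40 + 10 * rng.random(100),
}

frame_table = pd.DataFrame(features)   # one row per frame, one column per feature
stats = frame_table.describe().drop(index="count")   # keep mean, std, min, quartiles, max
print(stats)
```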

Saving Features

SongInterval also has built-in methods to save acoustic features or acoustic feature statistics as .csv files. You can save the full time series of features with the method .save_features(), which takes as input out_file_path (the path to the folder where you want to save the features), file_name (the root name you want to give the file), and features (the list of features you want to save; if unspecified, all will be saved). This will save a file called file_name_features.csv in the designated output folder, as well as a file called file_name_metadata.csv, which contains all the parameter values used to calculate the features, the avn version, the original file name, and the onset and offset timestamps of the interval within the file. This metadata ensures that your acoustic feature calculations will be reproducible in the future.

[9]:
song_interval.save_features(out_file_path="../sample_data/", file_name = "G402_sample_song")

You can also save the summary statistics in exactly the same way with the method .save_feature_stats(). The output files will be called file_name_feature_stats.csv and file_name_metadata.csv.

[10]:
song_interval.save_feature_stats(out_file_path="../sample_data/", file_name = "G402_sample_song")
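To locate the saved files later, the naming pattern described above can be written out explicitly. The helper below is a hypothetical sketch based only on the file names given in this section. It builds the expected paths and does not read or write anything.

```python
import os

def expected_output_files(out_file_path, file_name, kind="features"):
    """Build the paths of the data and metadata files for a SongInterval save.

    Hypothetical helper; the naming pattern is taken from the tutorial text.
    kind is 'features' for .save_features() or 'feature_stats' for
    .save_feature_stats().
    """
    if kind not in ("features", "feature_stats"):
        raise ValueError("kind must be 'features' or 'feature_stats'")
    data_file = os.path.join(out_file_path, f"{file_name}_{kind}.csv")
    metadata_file = os.path.join(out_file_path, f"{file_name}_metadata.csv")
    return data_file, metadata_file

data_file, metadata_file = expected_output_files("../sample_data/", "G402_sample_song")
print(data_file)
print(metadata_file)
```

The data file can then be opened with pd.read_csv(data_file) once it exists.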

Calculate Acoustic Features for Many Syllables

In the first section of this tutorial we outlined how to use the avn.acoustics module to calculate acoustic features for a single interval of audio. However, a more common workflow is to calculate the acoustic features for many syllables at a time so that they can be compared, clustered etc. Thankfully, the avn.acoustics module also has a dedicated class to make it easier to handle large tables of song syllables!

Creating an AcousticData Object

Similarly to the avn.syntax.SyntaxData class, the avn.acoustics module has an AcousticData class which stores a table of syllables, and has convenient methods to calculate the acoustic features of each of those syllables.

To begin, we need a table with one row for every syllable that we want to analyze. This table should have the following columns:

  • files, which should contain the name of the .wav file in which the syllable is found

  • onsets, which should contain the onset timestamp of the syllable in seconds within the file

  • offsets, which should contain the offset timestamp of the syllable in seconds within the file

Such a table can be generated using the avn.segmentation module, or with your preferred syllable segmentation method, as long as its output has the columns described above.

The syllable table can also contain any number of other columns (such as syllable labels, notes, etc.) which will be preserved but are not necessary.

Here is an example of what such a syllable table could look like:

[11]:
syll_df = pd.read_csv("../sample_data/G402_syll_df.csv")
syll_df
[11]:
files onsets offsets labels
0 G402_43362.23322048_9_19_6_28_42.wav 0.166009 0.225170 i
1 G402_43362.23322048_9_19_6_28_42.wav 0.321043 0.385828 i
2 G402_43362.23322048_9_19_6_28_42.wav 0.570295 0.626757 i
3 G402_43362.23322048_9_19_6_28_42.wav 0.690363 0.751837 i
4 G402_43362.23322048_9_19_6_28_42.wav 0.828390 0.898912 i
... ... ... ... ...
589 G402_43362.43586086_9_19_12_6_26.wav 1.808186 1.976712 b
590 G402_43362.43586086_9_19_12_6_26.wav 2.011905 2.095646 c
591 G402_43362.43586086_9_19_12_6_26.wav 2.114762 2.263175 d
592 G402_43362.43586086_9_19_12_6_26.wav 2.318934 2.402177 e
593 G402_43362.43586086_9_19_12_6_26.wav 2.433810 2.563447 f

594 rows × 4 columns
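If the syllable table comes from another segmentation tool, a quick check of the required columns can catch problems early. The function below is a hypothetical sketch, not part of avn. It verifies that the files, onsets, and offsets columns exist and that every onset precedes its offset.

```python
import pandas as pd

REQUIRED_COLUMNS = ["files", "onsets", "offsets"]

def check_syll_table(syll_df):
    """Raise ValueError if required columns are missing or an onset is not before its offset.

    Hypothetical helper for illustration; not part of avn.
    """
    missing = [col for col in REQUIRED_COLUMNS if col not in syll_df.columns]
    if missing:
        raise ValueError(f"Missing required columns: {missing}")
    bad_rows = syll_df[syll_df["onsets"] >= syll_df["offsets"]]
    if not bad_rows.empty:
        raise ValueError(f"{len(bad_rows)} rows have onset >= offset")
    return syll_df

# Two rows copied from the example table above
example = pd.DataFrame({
    "files": ["G402_43362.23322048_9_19_6_28_42.wav"] * 2,
    "onsets": [0.166009, 0.321043],
    "offsets": [0.225170, 0.385828],
    "labels": ["i", "i"],   # extra columns are allowed
})
check_syll_table(example)
```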

We can create a new AcousticData object by passing this syllable table as input to acoustics.AcousticData(), along with the Bird_ID of the subject bird and the path to the folder containing the .wav files in the syllable table, like so:

[12]:
acoustic_data = acoustics.AcousticData(Bird_ID = "G402", syll_df = syll_df, song_folder_path="../sample_data/G402/")

Calculate Acoustic Features

We can now calculate any combination of acoustic features from this list: ['Goodness', 'Mean_frequency', 'Frequency_modulation', 'Amplitude_modulation', 'Amplitude', 'Entropy', 'Pitch'] for each syllable in the syllable table with the method .calc_all_features().

For example, let’s calculate the Frequency modulation and the Pitch:

[13]:
features = acoustic_data.calc_all_features(features = ['Pitch', 'Frequency_modulation'])
features
[13]:
Frequency_modulation Pitch files onsets offsets labels
0 [1.554138752062315, 1.522665756024348, 1.39044... [1342.9870358953176, 1343.6980823646277, 1347.... G402_43362.23322048_9_19_6_28_42.wav 0.166009 0.22517 i
0 [1.1951050687546656, 1.2754715619300896, 1.390... [1337.81447860608, 1340.4311096227414, 1344.11... G402_43362.23322048_9_19_6_28_42.wav 0.321043 0.385828 i
0 [1.5256539450074136, 1.4687677891619524, 1.055... [1334.094750632218, 1332.705804681969, 1337.61... G402_43362.23322048_9_19_6_28_42.wav 0.570295 0.626757 i
0 [1.2974545304143592, 1.1919562335359768, 0.876... [1379.7435850832105, 1394.0666879679236, 1405.... G402_43362.23322048_9_19_6_28_42.wav 0.690363 0.751837 i
0 [1.4339299730529087, 0.9783459018679432, 0.460... [1341.9634801933178, 1357.344164568718, 1365.5... G402_43362.23322048_9_19_6_28_42.wav 0.82839 0.898912 i
... ... ... ... ... ... ...
0 [1.175703506965015, 1.0612232718381467, 0.9003... [4838.566260423787, 4807.5771214231645, 4803.9... G402_43362.43586086_9_19_12_6_26.wav 1.808186 1.976712 b
0 [1.5624430371801388, 1.5534304047914287, 0.858... [4059.867476677284, 4130.549387945176, 4148.93... G402_43362.43586086_9_19_12_6_26.wav 2.011905 2.095646 c
0 [1.3678523064969446, 1.3134255155920922, 1.232... [1808.7585386523256, 1808.5561471447486, 1809.... G402_43362.43586086_9_19_12_6_26.wav 2.114762 2.263175 d
0 [1.3641694039563599, 1.325291611030776, 1.2633... [568.0846693657795, 567.4237886580611, 566.769... G402_43362.43586086_9_19_12_6_26.wav 2.318934 2.402177 e
0 [1.5496912993782248, 1.5060051496775306, 1.328... [3933.414997549934, 3942.7519148349297, 3918.4... G402_43362.43586086_9_19_12_6_26.wav 2.43381 2.563447 f

594 rows × 6 columns

As you can see, this method returns a new dataframe with all the columns of our original syllable table, plus one column for each acoustic feature we specified. Each cell in a feature column contains a numpy array with the value of that feature at every short time window within the syllable. This table is also saved as an attribute of acoustic_data called all_features.

[16]:
acoustic_data.all_features.head()
[16]:
Frequency_modulation Pitch files onsets offsets labels
0 [1.554138752062315, 1.522665756024348, 1.39044... [1342.9870358953176, 1343.6980823646277, 1347.... G402_43362.23322048_9_19_6_28_42.wav 0.166009 0.22517 i
0 [1.1951050687546656, 1.2754715619300896, 1.390... [1337.81447860608, 1340.4311096227414, 1344.11... G402_43362.23322048_9_19_6_28_42.wav 0.321043 0.385828 i
0 [1.5256539450074136, 1.4687677891619524, 1.055... [1334.094750632218, 1332.705804681969, 1337.61... G402_43362.23322048_9_19_6_28_42.wav 0.570295 0.626757 i
0 [1.2974545304143592, 1.1919562335359768, 0.876... [1379.7435850832105, 1394.0666879679236, 1405.... G402_43362.23322048_9_19_6_28_42.wav 0.690363 0.751837 i
0 [1.4339299730529087, 0.9783459018679432, 0.460... [1341.9634801933178, 1357.344164568718, 1365.5... G402_43362.23322048_9_19_6_28_42.wav 0.82839 0.898912 i
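In the output above, every row carries the index label 0, so label-based lookups such as .loc[0] would return all rows rather than one. The sketch below uses a small stand-in table (not real avn output) to show one way to restore a unique index and to reduce each array-valued cell to a scalar.

```python
import numpy as np
import pandas as pd

# Stand-in for acoustic_data.all_features: array-valued cells and a repeated index
all_features = pd.DataFrame(
    {
        "Pitch": [np.array([1342.9, 1343.7, 1347.0]), np.array([1337.8, 1340.4])],
        "labels": ["i", "i"],
    },
    index=[0, 0],
)

tidy = all_features.reset_index(drop=True)          # unique index 0..n-1
tidy["n_frames"] = tidy["Pitch"].apply(len)         # arrays differ in length per syllable
tidy["Pitch_mean"] = tidy["Pitch"].apply(np.mean)   # one scalar per syllable
print(tidy[["labels", "n_frames", "Pitch_mean"]])
```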

Calculate Acoustic Feature Summary Statistics

As mentioned in the first section of this tutorial, it is often more useful to have summary statistics, such as a single mean and std value per feature per syllable, than a full time series. Time series take more disk space to save, and their lengths differ between syllables, which makes them difficult to work with. To get just the summary statistics for the acoustic features for each syllable in the acoustic_data syllable table, we can use the method .calc_all_feature_stats().

Let’s calculate the summary statistics for the mean frequency for every syllable in our table:

[22]:
feature_stats = acoustic_data.calc_all_feature_stats(features = ['Mean_frequency'])
feature_stats
[22]:
Syll_info Mean_frequency
files onsets offsets labels mean std min 25% 50% 75% max
0 G402_43362.23322048_9_19_6_28_42.wav 0.166009 0.22517 i 2907.785023 1121.457694 1412.581905 2144.890471 2500.659910 3946.849894 4906.114381
1 G402_43362.23322048_9_19_6_28_42.wav 0.321043 0.385828 i 2703.070150 941.459831 1498.764199 1765.620166 2566.293475 3330.237660 4636.205351
2 G402_43362.23322048_9_19_6_28_42.wav 0.570295 0.626757 i 3012.216848 1175.352043 1562.812118 2162.649309 2471.829333 4066.341596 5047.588192
3 G402_43362.23322048_9_19_6_28_42.wav 0.690363 0.751837 i 2916.965205 1049.030479 1504.505563 2223.822274 2500.249679 3732.941696 4843.951932
4 G402_43362.23322048_9_19_6_28_42.wav 0.82839 0.898912 i 2724.163727 944.713327 1504.298356 1965.995565 2532.514305 3406.932164 4577.235024
... ... ... ... ... ... ... ... ... ... ... ...
589 G402_43362.43586086_9_19_12_6_26.wav 1.808186 1.976712 b 4211.076484 930.050590 1923.876639 3771.950718 4486.205261 4859.350242 5467.731391
590 G402_43362.43586086_9_19_12_6_26.wav 2.011905 2.095646 c 3909.581357 326.175365 3197.165752 3650.098384 3947.938012 4124.447948 4458.822812
591 G402_43362.43586086_9_19_12_6_26.wav 2.114762 2.263175 d 3771.857709 348.871318 3010.147580 3600.451037 3732.176595 3915.852528 5122.593444
592 G402_43362.43586086_9_19_12_6_26.wav 2.318934 2.402177 e 3997.790057 272.362455 3387.536710 3781.849900 4105.547617 4231.509360 4317.167893
593 G402_43362.43586086_9_19_12_6_26.wav 2.43381 2.563447 f 3770.705568 554.947440 1941.667383 3547.984968 3958.680979 4109.973595 4351.028939

594 rows × 11 columns
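The table above has two-level column headers (Syll_info and Mean_frequency on the top level). For downstream analysis it is often easier to flatten these into single-level names. The sketch below uses a small stand-in table with the same layout (the values and file names are made up) and then averages the per-syllable mean frequency within each syllable label.

```python
import pandas as pd

# Stand-in with the same two-level column layout as the table above
columns = pd.MultiIndex.from_tuples([
    ("Syll_info", "files"), ("Syll_info", "labels"),
    ("Mean_frequency", "mean"), ("Mean_frequency", "std"),
])
feature_stats = pd.DataFrame(
    [["a.wav", "i", 2900.0, 1100.0],
     ["a.wav", "i", 2700.0, 900.0],
     ["b.wav", "b", 4200.0, 930.0]],
    columns=columns,
)

flat = feature_stats.copy()
# Join the two header levels, e.g. ('Mean_frequency', 'mean') -> 'Mean_frequency_mean'
flat.columns = ["_".join(pair) for pair in flat.columns]

# Average of the per-syllable mean frequency for each syllable label
per_label = flat.groupby("Syll_info_labels")["Mean_frequency_mean"].mean()
print(per_label)
```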

Saving Features

Like the SongInterval class, the AcousticData class also has methods to automatically save your acoustic features, along with a metadata file with all the information necessary to reproduce their calculation.

While it is possible to save the time series of each acoustic feature for every syllable with the method .save_features(), this will occupy considerable disk space and usually isn’t necessary. Instead, I suggest saving just the summary statistics for each feature with the method .save_feature_stats().

Both .save_features() and .save_feature_stats() take as input out_file_path (the path to the folder where you want the .csv files to be saved), file_name (the root name you want to give the file), and features (the list of features you want to save; if this is not specified, all features will be saved).

Let’s save the summary statistics for Goodness of Pitch and Amplitude for all the syllables in our table:

[23]:
acoustic_data.save_feature_stats(out_file_path="../sample_data/", file_name = "G402_syll_table", features =  ["Goodness", "Amplitude"])

This calculates the summary statistics for each specified acoustic feature and saves them in a file called G402_syll_table_all_feature_stats.csv, along with a metadata file called G402_syll_table_metadata.csv.