warbleR

Streamlining acoustic analysis

Marcelo Araya-Salas and Grace Smith Vidaurre
2017-04-05


Here we present a case study of how warbleR functions can be used for bioacoustics, as well as some tips on managing bioacoustic data in R. For more details about function arguments, input or output, please read the documentation for the function in question (e.g. ?querxc).

Please note that most tools in warbleR use functions from the seewave package internally. warbleR has been designed to make the analyses more accessible to average R-users. However, acoustic analysis in R would not be possible without the tools provided by seewave. Hence, both packages should be given credit when using warbleR and both should be appropriately cited in publications (try citation("warbleR") and citation("seewave")).
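
For instance, to pull up the citation information for both packages:

citation("warbleR")
citation("seewave")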

This vignette can be run without an advanced understanding of R, as long as you know how to run code in your R console. However, knowing more about data and file manipulation will be very helpful when modifying the code to fit your own needs.

warbleR workflow


To start, we will use warbleR to download recordings from Xeno-Canto.

Prepare for data collection


First, we need to install and load warbleR. Note that you will need R version ≥ 2.10 and seewave version ≥ 2.0.1. Users on UNIX machines (Mac OS X or Linux) will also need to install fftw3, pkg-config and libsndfile prior to installing warbleR; these external libraries must be installed through a UNIX terminal.

install.packages("warbleR")
library(warbleR)


And set up a working directory

# Create a new directory
dir.create(file.path(getwd(),"warbleR_example"))
setwd(file.path(getwd(),"warbleR_example"))

# Check the location of the directory
getwd()


Obtain metadata and recordings from Xeno-Canto

Next, we can query the Xeno-Canto database for a species or genus of interest. The function querxc has two types of output:

  1. Metadata of recordings: geographic coordinates, recording quality, recorder, type of signal, etc.

  2. Sound files: mp3 files are downloaded into the working directory if the argument download is set to TRUE (the default is FALSE).


You can query Xeno-Canto by genus:

# Query Xeno-Canto for all recordings of the hummingbird genus Phaethornis
Phae <- querxc(qword = "Phaethornis", download = FALSE) 
## Obtaining recording list...
## 769 recordings found!
# Find out what kind of metadata we have
names(Phae) 
##  [1] "Recording_ID"      "Genus"             "Specific_epithet" 
##  [4] "Subspecies"        "English_name"      "Recordist"        
##  [7] "Country"           "Locality"          "Latitude"         
## [10] "Longitude"         "Vocalization_type" "Audio_file"       
## [13] "License"           "Url"               "Quality"          
## [16] "Time"              "Date"
View(Phae)


Or you can query by species:

# Query Xeno-Canto for all recordings of the species Phaethornis longirostris
Phae.lon <- querxc(qword = "Phaethornis longirostris", download = FALSE) 
View(Phae.lon)


If you’re interested in the geographic spread of the recording locations, you can use the function xcmaps to visualize locations. xcmaps will create an image file of a map per species in your current directory if img = TRUE. If img = FALSE, maps will be displayed in the graphic device.

# Image type default is jpeg, but tiff files have better resolution
xcmaps(X = Phae, img = TRUE, it = "tiff") 
xcmaps(X = Phae.lon, img = FALSE) 

Filter Xeno-Canto recordings by signal type

In most cases, you will need to filter the type of signal you want to analyze. You can filter the recordings prior to download by subsetting the metadata. Then, you can input the filtered metadata back into querxc to download only the selected recordings. There are many ways to filter data in R, and the example below can be modified to fit your own data.

Some of the metadata is not quite consistent across recordings, such as signal type or recording quality. These are characteristics of the recordings that you will need to explore visually with downstream functions before proceeding with data collection and analysis. However, if you are dealing with many recordings, we advise removing the lowest quality recordings (D quality level) or selecting specific vocalization types.
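
For instance, here is a quick sketch of dropping the D-quality recordings from the metadata (the Phae.lon.good object is hypothetical and not used below, where we instead keep only A-quality recordings):

# Hypothetical filter: remove the lowest quality (D) recordings
Phae.lon.good <- Phae.lon[Phae.lon$Quality != "D", ]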

For example, we are interested in assessing microgeographic variation in long-billed hermit (Phaethornis longirostris) songs. Variation at small geographic scales has already been described in this species [1]. We will therefore look for the site with the highest number of songs. Our goal is to search for visible differences in song structure within a site, and then determine whether underlying differences in acoustic parameters are representative of spectrographic distinctiveness.


# Find out number of available recordings
nrow(Phae.lon) 
## [1] 74
# Find out how many types of signal descriptions exist in the Xeno-Canto metadata
levels(Phae.lon$Vocalization_type)
## [1] "song"        "song at lek" "calls"       "call"        "lekking"    
## [6] "flight call" "300"
# How many recordings per signal type?
table(Phae.lon$Vocalization_type)
## 
##        song song at lek       calls        call     lekking flight call 
##          63           2           1           5           1           1 
##         300 
##           1


# There are many levels to the Vocalization_type variable. 
# Some are biologically relevant signals, but most just 
# reflect variation in data entry.

# Luckily, it's very easy to filter the signals we want 
Phae.lon.song <- Phae.lon[grep("song", Phae.lon$Vocalization_type, 
                                ignore.case = TRUE), ]

# Check resulting data frame
str(Phae.lon.song) 
## 'data.frame':    65 obs. of  17 variables:
##  $ Recording_ID     : Factor w/ 74 levels "355350","355346",..: 1 2 3 4 5 6 7 10 11 12 ...
##  $ Genus            : Factor w/ 1 level "Phaethornis": 1 1 1 1 1 1 1 1 1 1 ...
##  $ Specific_epithet : Factor w/ 1 level "longirostris": 1 1 1 1 1 1 1 1 1 1 ...
##  $ Subspecies       : Factor w/ 3 levels "","baroni","cephalus": 1 1 1 2 1 1 2 1 1 1 ...
##  $ English_name     : Factor w/ 1 level "Long-billed Hermit": 1 1 1 1 1 1 1 1 1 1 ...
##  $ Recordist        : Factor w/ 15 levels "Kent Livezey",..: 1 1 2 3 4 4 5 8 8 8 ...
##  $ Country          : Factor w/ 7 levels "Panama","Nicaragua",..: 1 1 2 3 4 1 3 4 4 4 ...
##  $ Locality         : Factor w/ 20 levels "Torti, San Francisco Nature Reserve, Panama ",..: 1 1 2 3 4 5 6 8 8 8 ...
##  $ Latitude         : Factor w/ 20 levels "8.939","13.237",..: 1 1 2 3 4 5 6 8 8 8 ...
##  $ Longitude        : Factor w/ 19 levels "-78.465","-86.0546",..: 1 1 2 3 4 5 6 8 8 8 ...
##  $ Vocalization_type: Factor w/ 7 levels "song","song at lek",..: 1 1 1 2 1 1 2 1 1 1 ...
##  $ Audio_file       : Factor w/ 74 levels "http://www.xeno-canto.org/355350/download",..: 1 2 3 4 5 6 7 10 11 12 ...
##  $ License          : Factor w/ 5 levels "http://creativecommons.org/licenses/by-nc-sa/4.0/",..: 1 1 1 1 2 2 1 3 3 3 ...
##  $ Url              : Factor w/ 74 levels "http://www.xeno-canto.org/355350",..: 1 2 3 4 5 6 7 10 11 12 ...
##  $ Quality          : Factor w/ 4 levels "A","B","C","D": 1 1 1 1 1 1 1 1 1 1 ...
##  $ Time             : Factor w/ 60 levels "09:12","17:24",..: 1 2 3 4 5 6 7 10 11 12 ...
##  $ Date             : Factor w/ 32 levels "2017-02-09","2016-02-24",..: 1 1 2 3 4 5 6 9 10 10 ...


# Now, how many recordings per locality?
table(Phae.lon.song$Locality)
# In case you want more than one signal type you can try something like this:
Phae.lon.sc <- Phae.lon[grep("song|call", Phae.lon$Vocalization_type, ignore.case = TRUE), ]
str(Phae.lon.sc)


Let’s focus on the high quality song recordings from La Selva Biological Station:

# First, filter by location
Phae.lon.LS <- Phae.lon.song[grep("La Selva Biological Station, Sarapiqui, Heredia", Phae.lon.song$Locality,
                              ignore.case = FALSE), ]

# And only those of the highest quality
Phae.lon.LS <- Phae.lon.LS[Phae.lon.LS$Quality == "A", ]

We can check whether the location coordinates make sense (all recordings should come from a single site in Costa Rica) by making a map of these recordings using xcmaps:

# map in the graphic device (img = FALSE)
xcmaps(Phae.lon.LS, img = FALSE)


Once you’re sure you want the recordings, use querxc to download the files. It is also a good idea to save the metadata as .csv files at this point. This data will likely be useful at a later point, especially if you’re aiming for a publication.

# Download sound files
querxc(X = Phae.lon.LS) 

# Save each data frame object as a .csv file 
write.csv(Phae.lon.LS, "Phae_lon.LS.csv", row.names = FALSE)


Convert Xeno-Canto mp3 recordings to wav format

Xeno-Canto maintains recordings in mp3 format, as this compressed format yields smaller files. However, we require wav format for all downstream analyses. Compression from wav to mp3 and back involves some information loss, but recordings that have undergone this transformation have been successfully used in research [2].

To convert mp3 to wav, we can use the warbleR function mp32wav, which relies on an underlying function from the tuneR package. However, this function does not always work (and it remains unclear why!). This bug should be fixed in future versions of tuneR. If RStudio aborts when running mp32wav, use an online mp3 to wav converter, or download the open source software Audacity (available for Mac, Linux and Windows). We have made the selected wav files available for download (see next section).
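
Another fallback is to do the conversion directly in R with tuneR (a sketch only; mp32wav relies on similar tuneR machinery, and the file handling here is illustrative):

# Convert each mp3 in the working directory to wav using tuneR
library(tuneR)
mp3s <- list.files(pattern = "mp3$")
for (f in mp3s) writeWave(readMP3(f), filename = sub("mp3$", "wav", f))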

After mp3 files have been converted, we need to check that the wav files are not corrupted and can be read into RStudio (some wav files can’t be read due to format or permission issues).

# Neither of these functions requires arguments
# Always check you're in the right directory beforehand
# getwd()
mp32wav() 

# You can use checkwavs to see if wav files can be read
checkwavs() 

# Let's create a list of all the recordings in the directory
wavs <- list.files(pattern="wav$")

# We will use this list to downsample the wav files so the following analyses go a bit faster
lapply(wavs, function(x) writeWave(downsample(readWave(x), samp.rate = 22050),
                                  filename = x))


Filter recordings by visual inspection

The function lspec produces image files with spectrograms of whole sound files split into multiple rows. It is a useful tool for:

  • vocal repertoire analyses
  • filtering by visual inspection

This is the first time we can visualize the recordings since the Xeno-Canto download, and we can make the most of it.

If your research focuses on assessing variation between individuals or groups, lspec can provide you with important information about how to steer your analysis. If there is obvious variation in vocalization structure across groups (e.g. treatments or geographic regions), you can focus your analysis on a visual classification of vocalizations. You can use lspec to your advantage here, printing spectrograms on paper and classifying signal types by hand.

Whether or not you decide to proceed with visual classification, lspec allows you to visually inspect the quality of the recording (e.g. amount of background noise) or the type, number, and completeness of the vocalizations of interest. You can discard the image files and recordings that you no longer want to analyze, as this will become very useful for downstream functions.

We should first adjust the function arguments to fit our recording characteristics. We can work on a subset of the recordings by specifying their names with the flist argument.

# Let's first create a subset for playing with arguments 
# This subset is based on the list of wav files we created above
sub <- wavs[c(1,3)]

# ovlp = 10 speeds up process a bit 
# tiff image files are better quality and are faster to produce
lspec(flist = sub, ovlp = 10, it = "tiff")

# We can zoom in on the frequency axis by changing flim, 
# the number of seconds per row, and number of rows
lspec(flist = sub, flim = c(1.5, 11), sxrow = 6, rows = 15, ovlp = 10, it = "tiff")

Once satisfied with the argument settings we can run all files

lspec(flim = c(1.5, 11), ovlp = 10, sxrow = 6, rows = 15, it = "tiff")

The image files (in the working directory) should look like this:

lspec example spectrograms

Note that the sound file name and page number are placed in the top right corner. Also, the image dimensions fit letter-size paper for printing.

Now we should inspect the spectrograms. Throwing away image files for recordings that look poor quality at first glance (e.g. lots of background noise) will help us in later steps. For instance, the spectrogram for recording ID 154143 does not look good, so we will manually delete its image files (yes, go and delete those image files!). Then we can select only the sound files that still have an image file in the folder. This looks like overkill for just one recording, but it is really useful when working with many recordings.

# List the image files in the directory
# Change the pattern to "jpeg" if you used that image type
imgs <- list.files(pattern = ".tiff") 

# If the maps we created previously are still there, you can remove them from this list easily
imgs <- imgs[grep("Map", imgs, invert = TRUE)]

# Extract the recording IDs of the files for which image files remain 
kept <- unique(sapply(imgs, function(x){
  strsplit(x, split = "-", fixed = TRUE)[[1]][3]
  }, USE.NAMES = FALSE))

# Now we can get rid of sound files that do not have image files 
snds <- list.files(pattern = ".wav", ignore.case = TRUE) 
file.remove(snds[grep(paste(kept, collapse = "|"), snds, invert = TRUE)])

Data Collection

We can move on to data collection with the filtered recordings, using the functions autodetec and manualoc.

Automatically detect signals with autodetec

Both of these functions work faster with shorter recordings, but there are ways to deal with longer recordings (an hour long or more). As its name suggests, autodetec automatically detects the start and end of vocalizations in sound files based on amplitude, duration, and frequency range attributes.

Here are some points that will help us tailor autodetec for our use:

  1. autodetec has 2 types of output:
    • data frame with the recording name, selection, and start and end times. These temporal coordinates will be passed on to downstream functions to measure acoustic parameters. Save this output as an object; otherwise it will only be printed in the console, not stored in the environment.
    • a spectrogram per recording, with red dotted lines marking the start and end of each detected signal, saved in your working directory. In the case of long recordings, as in our example, it’s preferable to create long spectrograms (ls = TRUE). For shorter recordings (a few seconds), spectrograms of the individual selections may work better (ls = FALSE).
  2. Some important detection parameters to fiddle with:
    • threshold controls detection by relative amplitude (%)
    • bp serves as a frequency bandpass filter
    • ssmooth controls combination of window length and overlap to smooth signals that have many peaks and would otherwise be detected as multiple signals
    • mindur & maxdur determine the minimum and maximum duration of the signals to be detected
    • set controls whether the settings of the autodetection process are included in the image file name. If TRUE, threshold (th), envelope (envt), bandpass (bp), power (pw), smooth (smo), maxdur (mxdu), and mindur (midu) are included. This can be very helpful for comparing the performance of different detection settings

To set these parameters, we need some idea of the frequency range and duration of the signals we want to detect. The spectrograms produced above can help us figure this out. Phaethornis longirostris songs have frequencies between 2 and 10 kHz and durations between 0.05 and 0.5 s.

If you need to detect all or most of the signals within a recording, play around with different argument values to increase detection accuracy. It may be necessary to do several rounds of optimization with different subsets of your recordings. If just a few signals are needed per recording, a low-accuracy detection could yield enough selections, for instance when the species you study produces stereotyped signals, like Phaethornis longirostris.

Finally, although autodetec performs automatic signal detection, it doesn’t remove all manual labor from your data collection. Take the time to visually inspect your selections.

# Select a subset of the recordings
wavs <- list.files(pattern = ".wav", ignore.case = TRUE)

# Set a seed so we all have the same results
set.seed(1)
sub <- wavs[sample(1:length(wavs), 3)]

# Run autodetec() on subset of recordings
autodetec(flist = sub, bp = c(1, 10), threshold = 10, mindur = 0.05, maxdur = 0.5, envt="abs",
          ssmooth = 300, ls = TRUE, res = 100, 
          flim = c(1, 12), wl = 300, set =TRUE, sxrow = 6, rows = 15, 
          redo = FALSE, it = "tiff")

The image files (in the working directory) should look like this (shown below is recording ID 154161):

autodetec example spectrogram

Notice that some songs were correctly detected, but other undesired sounds were detected as well. In most cases these have a shorter duration than our target signals.

We won’t save the autodetec output in an object until we’re satisfied with the detection. To improve the detection we should play around with the argument values. Below are some detection parameters that work well for these Phaethornis longirostris recordings:

autodetec(flist = sub, bp = c(2, 9), threshold = 20, mindur = 0.09, maxdur = 0.22, 
                     envt = "abs", ssmooth = 900, ls = TRUE, res = 100, 
                     flim= c(1, 12), wl = 300, set =TRUE, sxrow = 6, rows = 15, 
                     redo = TRUE, it = "tiff", img = TRUE, smadj = "end")

This seems to provide a good detection for most recordings (recording ID 154161):

autodetec example spectrogram

Once we’re satisfied with the detection, we can run autodetec on all the recordings, removing the argument flist (so it runs over all sound files in the working directory). We also need to save the output with the temporal coordinates at this point.

Phae.ad <- autodetec(bp = c(2, 9), threshold = 20, mindur = 0.09, maxdur = 0.22, 
                     envt = "abs", ssmooth = 900, ls = TRUE, res = 100, 
                     flim= c(1, 12), wl = 300, set =TRUE, sxrow = 6, rows = 15, 
                     redo = TRUE, it = "tiff", img = TRUE, smadj = "end")

str(Phae.ad)

Let’s look at the number of selections per sound file

table(Phae.ad$sound.files)
## 
## Phaethornis-longirostris-154070.wav Phaethornis-longirostris-154072.wav 
##                                  17                                 108 
## Phaethornis-longirostris-154123.wav Phaethornis-longirostris-154129.wav 
##                                  30                                  60 
## Phaethornis-longirostris-154138.wav Phaethornis-longirostris-154161.wav 
##                                  21                                  81


Select signals based on signal-to-noise ratio (SNR)


Use snrspecs to prepare signal to noise measurements

Filtering your selected signals by signal-to-noise ratio (SNR) is often a good idea. Signals with a ratio close to 1 (or lower) have very poor quality, since an SNR of 1 means the signal and background noise have the same amplitude.

An SNR filter can be applied at any point in your workflow after using autodetec or manualoc. However, if you just need a sample of the signals in each recording, it makes sense to use the SNR functions as another quality filter prior to making acoustic measurements. Like the other functions downstream of autodetec or manualoc, the signal-to-noise functions require the start and end times of the signals.

snrspecs is another function in the family of spectrogram creators. It has very similar arguments to specreator, but it also has additional arguments for picking a margin over which to measure noise. These margins are very important for calculating SNR, especially when you’re measuring signals with short silence in between. You want to be sure to pick a noise margin that doesn’t overlap neighboring signals.

# A margin that's too large causes other signals to be included in the noise measurement
# Re-initialize X as needed, for either autodetec or manualoc output

# Let's try it on 10% of the selections so it goes a bit faster
# Set a seed first, so we all have the same results
set.seed(5)

X <- Phae.ad[sample(1:nrow(Phae.ad), round(nrow(Phae.ad) * 0.1)), ]

snrspecs(X = X, flim = c(2, 11), snrmar = 0.5, mar = 0.7, it = "tiff")

The image files should look like this:

snrspecs example spectrogram

This margin is far too large! It overlaps the whole previous signal. Let’s try a smaller margin.

# This smaller margin is better
snrspecs(X = X, flim = c(2, 11), snrmar = 0.2, mar = 0.7, it = "tiff")

snrspecs example spectrogram

These smaller margins seem to work well. Now we can run the function over all the selections and inspect the spectrograms to be sure the margin holds throughout.

snrspecs(X = Phae.ad, flim = c(2, 11), snrmar = 0.2, mar = 0.7, it = "tiff")


Calculate signal to noise ratio for recordings

Once you’ve picked a margin for all recordings, you can move forward with the SNR calculation. This calculation can allow you to remove poor quality recordings with a SNR close to 1. Since you’ve already performed several visual filters in the workflow, this step is not always necessary, but it can provide you with quantitative information about recording quality.

We will measure SNR on every other selection to speed up the process:

Phae.snr <- sig2noise(X = Phae.ad[seq(1, nrow(Phae.ad), 2), ], mar = 0.04)
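
If you wanted a hard quality filter instead, you could drop selections whose SNR approaches 1 (a sketch; the cutoff of 2 and the object name Phae.snr.filt are arbitrary, for illustration only):

# Hypothetical filter: keep only selections well above the noise floor
Phae.snr.filt <- Phae.snr[Phae.snr$SNR > 2, ]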

As we just need a few songs to characterize each sound file and individual, we can choose selections with the highest SNR per sound file. In this example, we will choose 5 selections with the highest SNRs.

# Rank selections within each sound file by descending SNR and keep the top 5
Phae.hisnr <- Phae.snr[ave(-Phae.snr$SNR, Phae.snr$sound.files, FUN = rank) <= 5, ]

# Double-check the number of selections per sound file
table(Phae.hisnr$sound.files)
## 
## Phaethornis-longirostris-154070.wav Phaethornis-longirostris-154072.wav 
##                                   5                                   5 
## Phaethornis-longirostris-154123.wav Phaethornis-longirostris-154129.wav 
##                                   5                                   5 
## Phaethornis-longirostris-154138.wav Phaethornis-longirostris-154161.wav 
##                                   5                                   5

At this point it would be a good idea to save the selections to a file:

write.csv(Phae.hisnr, "Phae_lon_autodetec_selecs.csv", row.names = FALSE)


Manually select signals with manualoc

In some cases manual selection may be preferable, especially if you have shorter recordings or if the automatic detection is not as accurate as you’d like.

manualoc is a function that provides a graphical interface in which you can select the start and end of signals. You can zoom in on a segment of the sound file by clicking at the start and then the end (left side, then right side) of the segment. To select the start and end of a vocalization unit, click in the reverse order: at the end and then at the start (right side, then left side) of the unit. In addition, 6 “buttons” at the upper right of the spectrogram allow you to display a full view of the spectrogram (“Full view”), go back to the previous view (“Previous view”), stop the analysis (“Stop”), go to the next sound file (“Next rec”), play the current view using external software (“Play”, see the player argument), or delete the last manual selection in the current sound file (“Del-sel”).

It can often run slowly, depending on the size of the sound files. However, manualoc can be most helpful when you only need a few signals from short recordings.

We recommend reading the documentation for manualoc prior to running this example. Once you’ve done so, here are some points to keep in mind:

  1. The sole output for this function is a .csv file:
    • contains the time coordinates, selection information and any comments made
    • similar to the autodetec output, these coordinates will be used in downstream functions
  2. Be very precise with your clicks
    • stray clicks will cause manualoc to fail
    • don’t double-click, instead click once and wait for blue bubble to pop up before clicking again
    • suspect a stray click when manualoc starts responding to single clicks
    • if so, stop manualoc with Stop button
    • manualoc retains all previous selections in the .csv file and will start up where you left off
  3. Selections can be deleted directly in the manualoc interface
    • use Del-sel button
  4. Selections can also be deleted in the manualoc_output.csv
    • stop manualoc, open the .csv
    • delete the rows corresponding to the unwanted selection(s) prior to starting manualoc again
  5. Run manualoc within the expected frequency range for your species
    • use argument flim to facilitate signal selection
  6. Run manualoc with oscillograms enabled to improve signal selection
    • when osci = TRUE, the oscillogram (waveform) serves as a visual aid
    • use changes in amplitude to select start and end of the signal
    • oscillogram will print to screen when the resolution of the projected spectrogram
      improves (depends on the seltime argument)
    • seltime determines the time interval in seconds at which the spectrograms are produced with higher resolution and oscillograms

Some other uses for manualoc:

  1. manualoc can be used in combination with autodetec if you have large recordings:
    • you can select specific time points to break up the recording and speed up autodetec
    • then you can feed these time coordinates to autodetec using the data frame argument X (see the sketch after this list)
    • this can help customize autodetec if you have recordings with different noise or playback treatments
  2. manualoc can also be used for visual classification:
    • run manualoc with selcomm = TRUE
    • tag individual selections with song or element types using selcomm
    • use specreator to create spectrograms with selcomm text and check visual classifications

Note that you can stop the function at any point by clicking twice on the stop button.

# Run manualoc() with frequency range set for Phaethornis longirostris
# Recording comments are enabled to mark recording quality
# Selection comments enabled to include visual classifications
manualoc(flim = c(1, 11), reccomm = TRUE, selcomm = TRUE, osci = TRUE)

# Read manualoc() output back into RStudio as an object
# This data frame object can be used as input for the functions that follow
manualoc_out <- read.csv("manualoc_output.csv", header = TRUE)

The graphic device should display something like this:

manualoc interface view


Visualize autodetec or manualoc selections with specreator

specreator serves as yet another option for visual inspection, this time at the level of the individual selections made with autodetec or manualoc.

Like the other members of the spectrogram-creating family, specreator contains many options related to graphical parameters. With some fiddling, it’s possible to make publication-quality images. However, some of these graphical parameters do not play well together (especially osci, gr and sc); see the documentation for suggestions.

# Create a subset of 5 recordings analyzed by autodetec() or manualoc()
# Speeds up process of playing around with arguments 
# Run either line below to reinitialize X with either autodetec 
# or manualoc subset as desired

set.seed(50)
X <- Phae.hisnr[sample(1:nrow(Phae.hisnr), 5), ]

# Plot selection lines from manualoc() or autodetec()
specreator(X, osci = FALSE, line = TRUE, wl = 300, flim = c(1, 11), it = "tiff")

# Change frequency limits of y-axis
specreator(X, flim = c(1, 11), osci = TRUE, line = TRUE, wl = 300, it = "tiff")

# Change width of spectrogram to be proportional to signal duration
specreator(X, flim = c(1, 11), osci = TRUE, line = TRUE, propwidth = TRUE, wl = 300, it = "tiff")

# Change spectrogram size 
# Changing inner.mar and outer.mar arguments improves picsize results
specreator(X, flim = c(1, 11), osci = TRUE, line = TRUE, picsize = 1.5, wl = 300, 
           ovlp = 90, inner.mar = c(4, 4.5, 2, 1), outer.mar = c(4, 2, 2, 1), it = "tiff")

# Run function for all recordings, with final argument settings
specreator(Phae.hisnr, flim = c(1, 11), osci = TRUE, line = TRUE, wl = 300,
           ovlp = 90, it = "tiff", res = 300)

specreator example spectrogram

Measure acoustic parameters


Visualize frequency measurements with trackfreqs

Prior to calculating acoustic measurements, it’s good practice to visualize the accuracy of some important measurements, namely frequency measurements. The function trackfreqs is the last in the family of spectrogram-creators. It allows you to create spectrograms with dominant frequency and fundamental frequency measurements plotted on top of each selected signal.

In general, the fundamental frequency measurements are not as reliable as the dominant frequency measurements. Use trackfreqs on all the recordings for which you want to measure acoustic parameters, and scroll through all the spectrograms to get a feel for how well the frequency measurements perform across your recordings.

Like its sister functions, trackfreqs has many graphical arguments, including additional ones to change the colors of the plotting symbols and the size and position of legend labels. These arguments will largely depend on the nature of your selections.

# Note that the dominant frequency measurements are almost always more accurate
trackfreqs(Phae.hisnr, flim = c(1, 11), bp = c(1, 12), it = "tiff")

# We can change the lower end of bandpass to make the frequency measurements more precise
trackfreqs(Phae.hisnr, flim = c(1, 11), bp = c(2, 12), col = c("purple", "orange"),
           pch = c(17, 3), res = 300, it = "tiff")

# If the frequency measurements look acceptable with this bandpass setting,
# that's the setting we should use when running specan() 
trackfreqs example spectrogram

As fundamental frequency does not seem to be adequately tracked, we will remove it from the acoustic parameters measured in the next step.


Batch-process acoustic measurements with specan

We’re close to finishing the warbleR workflow. We can now perform acoustic measurements with the function specan, which calculates several acoustic parameters across all the specified recordings. It’s a batch process that is much faster than calculating measurements one recording at a time. specan uses and customizes several functions available in the seewave package.

specan uses the time coordinates in the autodetec or manualoc output. It will measure acoustic parameters within the start and end times of the selections.

# Use the bandpass filter to your advantage, to filter out low or high background
# noise before performing measurements
# The amplitude threshold will change the amplitude at which noises are
# detected for measurements 
params <- specan(Phae.hisnr, bp = c(1, 11), threshold = 15)

View(params)

str(params)
## 'data.frame':    30 obs. of  24 variables:
##  $ sound.files: Factor w/ 6 levels "Phaethornis-longirostris-154070.wav",..: 1 1 1 1 1 2 2 2 2 2 ...
##  $ selec      : int  13 15 19 27 29 36 76 104 126 148 ...
##  $ duration   : num  0.176 0.174 0.175 0.159 0.163 ...
##  $ meanfreq   : num  6.46 6.41 6.42 6.54 6.45 ...
##  $ sd         : num  1.76 1.83 1.76 1.86 1.79 ...
##  $ median     : num  6.31 6.22 6.21 6.38 6.32 ...
##  $ Q25        : num  5.46 5.39 5.38 5.36 5.4 ...
##  $ Q75        : num  7.54 7.6 7.52 7.71 7.54 ...
##  $ IQR        : num  2.09 2.22 2.14 2.34 2.14 ...
##  $ skew       : num  2.15 2.46 2.45 2.5 2.21 ...
##  $ kurt       : num  8.8 10.4 10.8 12.57 9.95 ...
##  $ sp.ent     : num  0.929 0.931 0.927 0.935 0.933 ...
##  $ sfm        : num  0.494 0.551 0.49 0.546 0.517 ...
##  $ mode       : num  5.87 6.13 6.14 6.12 6.14 ...
##  $ centroid   : num  6.46 6.41 6.42 6.54 6.45 ...
##  $ peakf      : num  5.8412 5.6854 5.7821 6.0871 0.0429 ...
##  $ meanfun    : num  4.48 4.79 4.75 5.04 5.36 ...
##  $ minfun     : num  0.182 0.126 0.441 2.756 2.756 ...
##  $ maxfun     : num  7.35 7.35 7.35 7.35 7.35 7.35 7.35 7.35 7.35 7.35 ...
##  $ meandom    : num  6.41 6.36 6.34 6.42 6.43 ...
##  $ mindom     : num  5.38 5.51 5.64 5.38 5.47 ...
##  $ maxdom     : num  7.62 7.62 7.58 7.58 7.62 ...
##  $ dfrange    : num  2.24 2.11 1.94 2.2 2.15 ...
##  $ modindx    : num  0.439 0.395 0.333 0.435 0.39 ...
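
# As always, it's a good idea to write .csv files to your working directory
# (the file name here is just an example)
write.csv(params, "acoustic_parameters.csv", row.names = FALSE)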

Now let’s remove the parameters derived from fundamental frequency, along with peak frequency (based on the trackfreqs results):

params <- params[, grep("fun|peakf", colnames(params), invert = TRUE)]


Analysis of geographic variation using specan measurements


Now we can evaluate whether the observed variation in song structure is actually reflected by the acoustic parameters we just measured. For this we will conduct a Principal Component Analysis on scaled (z-transformed) acoustic parameters and look at the grouping of songs (data points) in the scatter plot.

# Run the PCA with only numeric variables of params
pca <- prcomp(x = params[, sapply(params, is.numeric)], scale. = TRUE)

# Check the variance explained by each component
summary(pca)
## Importance of components:
##                           PC1    PC2    PC3     PC4     PC5     PC6    PC7
## Standard deviation     2.9427 2.1208 1.4828 1.06118 0.96515 0.72535 0.6040
## Proportion of Variance 0.4558 0.2367 0.1157 0.05927 0.04903 0.02769 0.0192
## Cumulative Proportion  0.4558 0.6925 0.8082 0.86747 0.91649 0.94419 0.9634
##                            PC8     PC9    PC10    PC11    PC12    PC13
## Standard deviation     0.55713 0.44265 0.30472 0.18092 0.16071 0.13171
## Proportion of Variance 0.01634 0.01031 0.00489 0.00172 0.00136 0.00091
## Cumulative Proportion  0.97972 0.99003 0.99492 0.99664 0.99800 0.99892
##                           PC14    PC15    PC16      PC17      PC18
## Standard deviation     0.10486 0.07596 0.06171 3.528e-15 4.434e-16
## Proportion of Variance 0.00058 0.00030 0.00020 0.000e+00 0.000e+00
## Cumulative Proportion  0.99950 0.99980 1.00000 1.000e+00 1.000e+00
##                             PC19
## Standard deviation     1.315e-16
## Proportion of Variance 0.000e+00
## Cumulative Proportion  1.000e+00
# Extract the PCA scores (the "x" element of the prcomp output)
pcascor <- as.data.frame(pca$x)

# Plot the first 2 PCs
plot(pcascor[, 1], pcascor[, 2], col = as.numeric(params$sound.files), pch = 20, 
     cex = 1, xlab = "PC1", ylab = "PC2")

# Add recordings/individuals labels 
x <- tapply(pcascor[, 1], params$sound.files, mean)
y <- tapply(pcascor[, 2], params$sound.files, mean)

labs <- gsub(".wav", "", unique(sapply(as.character(params$sound.files), function(x){
  strsplit(x, split = "-", fixed = TRUE)[[1]][3]
  }, USE.NAMES = FALSE)))

text(x, y, labs, cex=0.5)

It seems like the songs group by sound file. As each sound file represents a single individual, this suggests that songs have individual signatures. Let’s look at the song type level. First, we need to classify the songs by song type, which we can do by checking the spectrograms we created earlier.

song type example spectrograms

Songs from sound files 154070 and 154072 seem to belong to the same song type. Sound files 154129 and 154161 represent a different song type. Finally, the songs from each of the other 2 sound files have a unique structure, so each one represents a different song type. We can add this information to the plot by using symbols to represent song types.

# Create a song type variable

# First, extract recording ID
songtype <- gsub(".wav", "", sapply(as.character(params$sound.files), function(x){
  strsplit(x, split = "-", fixed = TRUE)[[1]][3]
  }, USE.NAMES = FALSE))

# Now change IDs for letters representing song types
songtype <- gsub("154070|154072", "A", songtype)
songtype <- gsub("154129|154161", "B", songtype)
songtype <- gsub("154123", "C", songtype)
songtype <- gsub("154138", "D", songtype)

# Add song type as a variable representing symbol type
plot(pcascor[, 1], pcascor[, 2], col = as.numeric(params$sound.files), 
pch = as.numeric(as.factor(songtype)), 
     cex = 1, xlab = "PC1", ylab = "PC2")

# Add song type labels 
x <- tapply(pcascor[, 1], songtype, mean)
y <- tapply(pcascor[, 2], songtype, mean)

text(x, y, unique(songtype), cex = 1)

Songs of the same song type are more similar (they cluster together). This PCA confirms that the visually obvious differences in the song structures are well described by the acoustic parameters measured in warbleR. Likewise, it also confirms that we can detect biologically relevant differences from sound files that have undergone mp3 compression and conversion back to wav format (see also [2]).


References

  1. Araya-Salas, M. and T. Wright. 2013. Open-ended song learning in a hummingbird. Biology Letters. 9 (5). doi: 10.1098/rsbl.2013.0625

  2. Medina-García, A., M. Araya-Salas, and T. Wright. 2015. Does vocal learning accelerate acoustic diversification? Evolution of contact calls in Neotropical parrots. Journal of Evolutionary Biology. doi: 10.1111/jeb.12694