Objectives

  • Index global maxima and minima
  • Interpret the relative position of time points within a speech utterance
  • Slice data and perform analyses on the extracted data
  • Use slicing to find local maxima and minima
  • Understand relative indexing and index offsets

Exercises

We’re going to be working this week with the same data set we worked with last week:

load("/shared/groups/jrole001/pals0047/data/aurora_data.Rda")

Exercise 1: pitch indexing

Let’s kick things off with an easy warm-up exercise. Start by plotting the fundamental frequency values over time:

plot(time, f0)

Using the techniques from this week’s lecture, determine the time point of the f0 peak (i.e. maximum f0), in milliseconds.
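As a reminder of the general pattern, here is a minimal sketch on made-up values. The names toy_time and toy_f0 are hypothetical and are not part of the course data. It assumes the time values are in seconds, so multiplying by 1000 converts them to milliseconds:

```r
# Toy data for illustration only (not the aurora data)
toy_time <- c(0.00, 0.01, 0.02, 0.03, 0.04)  # seconds
toy_f0   <- c(120, 125, 131, 128, 122)       # Hz

peak_index <- which.max(toy_f0)          # index of the largest f0 value
peak_ms <- toy_time[peak_index] * 1000   # time of that sample, in milliseconds
peak_ms
```

The same two steps (find the index, then use it to look up the time) should carry over to the real time and f0 vectors.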

Exercise 2: pitch slicing

Now that we’re warmed up, we’re going to introduce a concept that we’ll explore in detail next week: a data frame.

The following code will create a data frame that contains information about the individual phones (NB: @ is the SAMPA convention for the schwa vowel), the syllable number within the three-syllable word, and the time points (in milliseconds) of the start and end of each phone segment:

segments <- data.frame(
  phone = c("o","r","o","r","@"),
  syllable = c(1,2,2,3,3),
  start = c(0,84,166,285,384),
  end = c(84,166,285,384,490)
)

Now let’s see what this newly created data frame looks like. As we’ll see next week, this data frame is formed of 5 rows and 4 columns:

segments
##   phone syllable start end
## 1     o        1     0  84
## 2     r        2    84 166
## 3     o        2   166 285
## 4     r        3   285 384
## 5     @        3   384 490

Time to think: why are most of the time points exactly the same in the “start” and “end” columns, but offset by exactly one row? What is the relationship between these two columns?

Given your answer to Exercise 1, determine the answers to the following questions:

  1. Within which phone does the F0 peak occur?
  2. Within which syllable does the F0 peak occur?
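You can answer these questions by reading the table, but the lookup can also be sketched with a Boolean expression. The example below uses a hypothetical time point (query_ms, which is not the answer to Exercise 1) and the $ operator for pulling out columns, which we will only cover properly next week, so treat it as a preview rather than the expected method:

```r
segments <- data.frame(
  phone = c("o","r","o","r","@"),
  syllable = c(1,2,2,3,3),
  start = c(0,84,166,285,384),
  end = c(84,166,285,384,490)
)

query_ms <- 300  # hypothetical time point in milliseconds

# TRUE only for the row whose [start, end) interval contains the query
inside <- segments$start <= query_ms & query_ms < segments$end

segments$phone[inside]     # phone containing the query time
segments$syllable[inside]  # syllable containing the query time
```

With this made-up query, the matching row is the fourth one (the second "r", in syllable 3).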

Exercise 3: variable assignment

We saw in this week’s lecture how we can use Boolean expressions and operators directly to slice and extract data that meet the relevant Boolean conditions.

Using this technique, create three different variables that contain the F0 values associated with the three syllables of the word. Name your three variables s1, s2, and s3.

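# time is in seconds, so the segment boundaries (given in ms) are divided by 1000
s1 <- f0[time >= 0 & time < 0.084]       # syllable 1: 0-84 ms
s2 <- f0[time >= 0.084 & time < 0.285]   # syllable 2: 84-285 ms
s3 <- f0[time >= 0.285 & time <= 0.490]  # syllable 3: 285-490 ms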

If you did the slicing correctly, your three variables should contain the following data (NB: the ; operator allows you to place different commands on the same line of code):

s1; s2; s3
## [1] 129 130 129 129 129 131 132 132 131
##  [1] 129 127 126 125 124 126 129 131 132 133 133 133 135 136 138 141 143 145 146
## [20] 146
##  [1] 146 145 143 140 138 134 129 125 117 110 103  98  92  90  87  88  87  86  85
## [20]  85  81

After creating the three separate variables, you will be able to answer the following questions:

  1. What is the range of F0 values in each of the syllables?
  2. What is the mean F0 value in each of the syllables?
  3. What is the median F0 value in each of the syllables?
  4. What is the duration (in milliseconds) of each of the three syllables? (see hints)

hint #1: use the length() function to determine the number of SAMPLES

hint #2: the increment of the time values is 0.01 seconds, so the duration of data with 3 samples would be exactly 0.03 seconds (or 30 ms), for example
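To illustrate how the hints fit together, here is a small sketch on a made-up vector. toy_syllable and step_s are hypothetical names, and the values are not taken from the course data:

```r
toy_syllable <- c(110, 112, 115, 113, 111)  # hypothetical f0 values (Hz)
step_s <- 0.01                              # time increment from hint #2, in seconds

range(toy_syllable)    # smallest and largest value
mean(toy_syllable)     # arithmetic mean
median(toy_syllable)   # middle value after sorting

# number of samples x increment, converted from seconds to milliseconds
duration_ms <- length(toy_syllable) * step_s * 1000
duration_ms
```

Five samples at 0.01 s each give a duration of 50 ms, following the same logic as the 3-sample example in hint #2.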

Exercise 4: finding peaks using slicing

For this last exercise, you’ll be combining many of the techniques you learned this week, plus learning a new one, in order to find the location of peaks within a signal. You’ll be working with the F1 data, so let’s go ahead and plot it:

plot(time, f1)

Your goal for this exercise is to determine, algorithmically, the time points associated with three different peaks (two minima, and one maximum), indicated here by the solid red dots:

plot(time, f1)
points(0.12, f1[time==0.12], col='red', pch=16)
points(0.22, f1[time==0.22], col='red', pch=16)
points(0.32, f1[time==0.32], col='red', pch=16)

The first point is the easiest, because we can see that it is the minimum (i.e. smallest value) across the entire range of F1 values. Because of this, we can simply use which.min() to find its time point:

time[which.min(f1)]
## [1] 0.12

That’s easy enough. But what about the second point, the peak that occurs around 0.2 seconds? If we use the which.max() function to index across the entire data range, we get a value that isn’t close at all:

time[which.max(f1)]
## [1] 0.49

The problem here is that the functions which.max() and which.min() will find the index of the absolute maximum and minimum across the entire range of data. This is called the global maximum (or global minimum). However, the second point indicated in red is what’s called a local maximum: it is a maximum relative to its neighboring values, even though it may not be the maximum of all of the values.

What this means is that you will need to reduce the range of data that you put into the which.max() function, by first slicing the data using a Boolean expression. In other words, you will first create a local subset of data, and then find the global maximum within that local range. Make sense?
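The difference between a global and a local maximum can be sketched on a tiny made-up signal. The names toy_signal and toy_x are hypothetical and the values are invented for illustration:

```r
toy_signal <- c(2, 5, 3, 1, 4, 9)  # made-up values
toy_x <- 1:6                       # made-up positions

# Global maximum: the largest value in the whole vector
global_max <- max(toy_signal)

# Local maximum: slice first, then take the maximum of the slice
local_slice <- toy_signal[toy_x >= 1 & toy_x <= 4]
local_max <- max(local_slice)

global_max
local_max
```

Here the global maximum is 9, at the very end of the signal, while the maximum within the first four positions is 5.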

An added complication is that once you slice the data, the range of indices changes, which you will need to account for when determining your final answers by creating an offset value. What do I mean by this? Let’s take a look at an example using a ruler:

If we consider each 0.5 cm tick on this ruler as an index, then the 3 cm mark (denoted in red) is the 7th tick. In other words, the 3 cm mark has an index of 7.

Now let’s “zoom in” to only the range of values between 3 and 7 cm (inclusive of 3 and 7). In Boolean terms, this is the range of tick marks that are greater than or equal to 3 and less than or equal to 7:

Now that we have “zoomed in” on the data, the index for the 3 cm tick mark is no longer 7… it is 1! However, the “real world” value has not changed: the tick mark still corresponds to 3 cm. To account for this difference, we need to be aware of how much data was excluded on the left edge, i.e. how many FALSE Boolean values were on the left edge (the values that were less than 3). This is an offset that will need to be added to the new index, 1:

offset <- 6
index <- 1

index + offset
## [1] 7
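The ruler example can also be reproduced in R. This is only a sketch: it assumes a 10 cm ruler, and the names ticks, zoomed, and ruler_offset are made up for the illustration:

```r
ticks <- seq(0, 10, by = 0.5)  # one tick every 0.5 cm (assumed 10 cm ruler)

full_index <- which(ticks == 3)  # position of the 3 cm mark on the full ruler

# "Zoom in" to the ticks between 3 and 7 cm, inclusive
zoomed <- ticks[ticks >= 3 & ticks <= 7]
zoomed_index <- which(zoomed == 3)  # position of the 3 cm mark after zooming

# The offset is the number of ticks excluded on the left edge
ruler_offset <- sum(ticks < 3)

full_index
zoomed_index
zoomed_index + ruler_offset
```

The last line recovers the original index of 7 from the zoomed-in index of 1 and the offset of 6.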

Let’s take a look at an example of how we can approach this problem using the first minimum, which we have already found:

time[which.min(f1)]
## [1] 0.12

If we look at the plot above, we know that this minimum is somewhere between 0.05 s and 0.15 s. So let’s use that range to create a Boolean expression:

time >= 0.05 & time <= 0.15
##  [1] FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [13]  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [49] FALSE FALSE

There are exactly 5 FALSE values on the left edge of the result, so the offset is 5. We don’t have to count this offset manually, however. We can instead use the which() function:

which(time >= 0.05 & time <= 0.15)
##  [1]  6  7  8  9 10 11 12 13 14 15 16

The output of the which() function tells us which indices give a TRUE value for the expression. So, given the above expression, the indices 6-16 fall within the range of 0.05 s to 0.15 s. What this means is that our offset value is 1 less than the first TRUE index. We can therefore determine the offset in an algorithmic way by following this logic:

matches <- which(time >= 0.05 & time <= 0.15)
offset <- matches[1] - 1

Now that we have the offset, we can add it to the local minimum within the selected range:

peak <- which.min(f1[time >= 0.05 & time <= 0.15])
peak <- peak + offset

Our result should now match what we found previously:

time[peak]
## [1] 0.12
time[which.min(f1)]
## [1] 0.12

Using these techniques, determine the time points (in milliseconds) of the local maximum near 0.2 s and the local minimum near 0.3 s.

# local maximum near 0.2 s
t1 <- 0.15
t2 <- 0.25

matches <- which(time >= t1 & time <= t2)
offset <- matches[1] - 1
peak <- which.max(f1[time >= t1 & time <= t2])
peak <- peak + offset
time[peak]
## [1] 0.22

# local minimum near 0.3 s
t1 <- 0.25
t2 <- 0.4

matches <- which(time >= t1 & time <= t2)
offset <- matches[1] - 1
peak <- which.min(f1[time >= t1 & time <= t2])
peak <- peak + offset
time[peak]
## [1] 0.32
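Since the same four steps are repeated for each peak, they could be wrapped in a small helper function. This is an optional sketch, not part of the exercise: local_extreme_index and its argument names are hypothetical, and the toy data below are made up (with time in whole milliseconds):

```r
# Return the index (in the full vector) of the maximum or minimum
# of `signal` within the window lower <= x <= upper.
local_extreme_index <- function(signal, x, lower, upper, find = which.max) {
  matches <- which(x >= lower & x <= upper)  # full-vector indices inside the window
  offset <- matches[1] - 1                   # number of samples excluded on the left
  find(signal[matches]) + offset             # local index shifted back to the full vector
}

toy_ms  <- seq(0, 90, by = 10)               # made-up time points (ms)
toy_sig <- c(5, 3, 1, 4, 7, 6, 2, 3, 8, 9)   # made-up signal values

max_index <- local_extreme_index(toy_sig, toy_ms, 30, 60)
min_index <- local_extreme_index(toy_sig, toy_ms, 50, 70, find = which.min)

toy_ms[max_index]  # time of the local maximum
toy_ms[min_index]  # time of the local minimum
```

With these toy values, the local maximum falls at 40 ms and the local minimum at 60 ms. Note that the sketch assumes at least one sample falls inside the window; if none does, matches[1] is NA and the function cannot return a usable index.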