Objectives

  • Learn to convert data types
  • Use a for loop to make iterative measurements
  • Perform correlation analyses on articulatory and acoustic data

EMA & EMG data

We’re going to use the same data set that we introduced in last week’s lab. However, whereas last week we used the data set only for plotting, this week we’ll be taking actual measurements from the data.

load("/shared/groups/jrole001/pals0047/data/EMA-EMG_data.Rda")

Exercises

Exercise 1: tongue tip measures

Data conversion

The first exercise will involve using a for loop to measure the minimum and maximum vertical position of the tongue tip EMA sensor for all phone segments in the data set. Let’s walk through the process for the first phone segment to understand how the process can then be iterated automatically. Let’s first grab the start and end time points of the first phone, as well as the EMA sampling rate, and save them as variables:

t1 <- data$segments$start[1]
t2 <- data$segments$end[1]

sr <- data$SR$EMA

Now let’s examine the data we’ll be working with, the vertical tongue tip dimension (TT_z):

head(data$EMA$TT_z)
## Error in data$EMA$TT_z: $ operator is invalid for atomic vectors

Whoops! R is telling us that we can’t use the $ operator here. Why not? What data type are we dealing with here?

class(data$EMA)
## [1] "matrix" "array"

Ahhh, so that’s the problem. Remember from last week’s lecture that we can only use the $ operator on data frames and lists, not on matrices or arrays. Being able to reference columns with the $ operator will be very handy for today’s exercises, though, and there is an easy solution: we can convert the EMA matrix to a data frame using the as.data.frame() function!

data$EMA <- as.data.frame(data$EMA)

head(data$EMA$TT_z)
## [1] 3.798042 3.794778 3.780786 3.764642 3.758525 3.768067
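
To see the difference between the two data types in isolation, here is a minimal sketch using a small made-up matrix. The object names (toy, toy_df) and the values are invented for illustration and are not part of the lab data. It suggests that a named matrix column can still be reached with bracket indexing, and that $ becomes available once the matrix has been converted.

```r
# Toy matrix with named columns (hypothetical values, not the lab data)
toy <- matrix(c(1.5, 2.5, 3.5, -1, -2, -3), ncol = 2,
              dimnames = list(NULL, c("TT_z", "TT_x")))

# A matrix is an atomic vector with dimensions, so $ raises an error;
# try() catches the error so that the script keeps running
dollar_works <- !inherits(try(toy$TT_z, silent = TRUE), "try-error")

# A named column can still be extracted from a matrix with brackets
col_by_name <- toy[, "TT_z"]

# After conversion, each column is a list element and $ works
toy_df <- as.data.frame(toy)
col_by_dollar <- toy_df$TT_z
```

Both routes return the same numeric vector; the conversion simply makes the shorter $ syntax available.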

Data measurement

Now that we have a data frame format, we can easily index the range of data that we want. Let’s find the maximum tongue height in the first phone segment, and add it to the corresponding first index of a new column in data$segments which we’ll name TTz_max:

data$segments$TTz_max[1] <- max(data$EMA$TT_z[round(t1*sr):round(t2*sr)])
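
The indexing in the line above converts times in seconds to sample numbers by multiplying by the sampling rate and rounding. Here is a minimal sketch of that step on invented data: the names sig, seg and sr_toy, and the 100 Hz rate, are assumptions for illustration only. The sketch also creates the new column filled with NA before writing to it, which is one possible way to keep unmeasured rows visibly empty.

```r
sr_toy <- 100                                    # hypothetical sampling rate (Hz)
sig    <- sin(seq(0, 2 * pi, length.out = 200))  # 2 s of made-up signal
seg    <- data.frame(start = c(0.10, 0.90), end = c(0.60, 1.80))

# Seconds -> sample numbers for the first segment
i1 <- round(seg$start[1] * sr_toy)
i2 <- round(seg$end[1] * sr_toy)

# Create the column first, so rows that have not been measured stay NA
seg$sig_max <- NA_real_
seg$sig_max[1] <- max(sig[i1:i2])
```

Without the NA column, assigning to row 1 of a column that does not exist yet makes R recycle that single value into every row of the data frame. That is harmless if every row is overwritten later, but it can hide rows that were never measured.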

Since we’ll need to do this for every segment, this is the perfect opportunity for a loop! Your goal for this first exercise is to use a for loop to measure the maximum and minimum tongue tip positions in each phone segment, and to add them to columns named TTz_max and TTz_min in the data$segments field. If you’ve done it correctly, your data should look like the following:

data$segments
##     word phone     start       end     TTz_max     TTz_min
## 1  asas@     a  1.540144  1.706447  2.54847133 -4.09755213
## 2  asas@     s  1.706447  1.917608  4.29187893  2.54847133
## 3  asas@     a  1.917608  2.105793  3.44575580 -4.23246091
## 4  asas@     s  2.105793  2.293978  4.52813548  0.96624229
## 5  asas@     @  2.293978  2.477787  3.31009577 -2.83205554
## 6  olol@     o  3.487426  3.665764  2.06580198 -6.35449624
## 7  olol@     l  3.665764  3.787209  4.47762063  1.16246682
## 8  olol@     o  3.787209  4.015876  2.10288920 -5.33099759
## 9  olol@     l  4.015876  4.138415  4.17636645  0.72911233
## 10 olol@     @  4.138415  4.305812  0.72911233 -4.64848357
## 11 epep@     e  5.399845  5.522384 -1.32956077 -3.99495397
## 12 epep@     p  5.522384  5.710570  0.18837983 -2.14108260
## 13 epep@     e  5.710570  5.851708 -1.56527816 -2.36715357
## 14 epep@     p  5.851708  6.001600 -0.55879101 -3.60921969
## 15 epep@     @  6.001600  6.185409 -1.97232188 -4.41110334
## 16 afaf@     a  7.245524  7.395416 -1.95156077 -5.31095088
## 17 afaf@     f  7.395416  7.573754 -0.13421389 -1.95156077
## 18 afaf@     a  7.573754  7.779445 -1.06225810 -4.06737884
## 19 afaf@     f  7.779445  7.943560 -0.19368307 -2.82346882
## 20 afaf@     @  7.943560  8.138310 -0.36762814 -2.41778215
## 21 {s{s@     {  9.271109  9.436318  0.02933077 -4.56383327
## 22 {s{s@     s  9.436318  9.619033  4.55545623  0.02933077
## 23 {s{s@     {  9.619033  9.821441  3.66402490 -3.12882622
## 24 {s{s@     s  9.821441  9.981180  4.48511063  1.42136153
## 25 {s{s@     @  9.981180 10.152954  3.48642351 -1.38420334
## 26 {r{r@     { 11.733739 11.900042 -0.40617802 -4.57663291
## 27 {r{r@     r 11.900042 12.067439  9.73559733 -0.40617802
## 28 {r{r@     { 12.067439 12.325647  8.01562076 -2.53395513
## 29 {r{r@     r 12.325647 12.445998  9.33192148  3.88548375
## 30 {r{r@     @ 12.445998 12.604643  7.86432459 -1.38026630
## 31 itit@     i 13.669756 13.793389  6.69426491 -0.45947150
## 32 itit@     t 13.793389 14.045032  9.31006561  0.93752036
## 33 itit@     i 14.045032 14.180701  4.92477498  0.63855831
## 34 itit@     t 14.180701 14.365604  6.69665261  2.27525222
## 35 itit@     @ 14.365604 14.563636  2.27525222 -4.58266512
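
If you get stuck, the general shape of such a loop can be sketched on invented data. Everything here (sig, seg, sr_toy, the column names) is hypothetical and is not the lab data set, so treat it as one possible pattern rather than the expected solution.

```r
sr_toy <- 100                                    # hypothetical sampling rate (Hz)
sig    <- sin(seq(0, 4 * pi, length.out = 400))  # 4 s of made-up signal
seg    <- data.frame(start = c(0.10, 1.00, 2.50),
                     end   = c(1.00, 2.50, 3.90))

# Pre-allocate the result columns
seg$sig_max <- NA_real_
seg$sig_min <- NA_real_

# One pass per row: find the sample range, then store both measures
for (i in seq_len(nrow(seg))) {
  idx <- round(seg$start[i] * sr_toy):round(seg$end[i] * sr_toy)
  seg$sig_max[i] <- max(sig[idx])
  seg$sig_min[i] <- min(sig[idx])
}
```

Because the end of one segment is also the start of the next, the boundary sample falls inside both ranges. This is why, in the table above, a value can appear as the maximum of one row and the minimum of the following row.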

Time to think: Are there any patterns you notice here? Which types of sounds are produced with the highest maximum, and lowest minimum, vertical tongue tip position? Why do you think this is the case?

Exercise 2: amplitude and height

In this second exercise, you’ll use the same for loop structure to measure the minimum vertical position of the middle of the tongue (TM_z) and of the tongue dorsum (TD_z), and add them to columns named TMz_min and TDz_min, respectively, just as you did for TTz_min. Within the same loop, you’ll also compute a measurement of acoustic amplitude and add it to a column named RMS.

RMS amplitude

Sound is characterized by local changes to atmospheric pressure, which means that it is an alternating signal: the air pressure fluctuates between states of pressure that is higher than atmospheric pressure (“compression”) and pressure that is lower than atmospheric pressure (“rarefaction”). So amplitude can’t be measured by simply taking the average of the signal, since the states of compression and rarefaction would cancel each other out.

To obtain a measurement of the absolute difference from atmospheric pressure (i.e. amplitude), the root mean square (RMS) of the acoustic signal is often used. As the name implies, RMS is the square root of the mean of the squared values of the data. Thus, for any data signal x, the RMS can be calculated as:

sqrt(mean(x^2))
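
To illustrate why the plain mean fails while the RMS works, here is a small sketch with a synthetic sine wave. The sampling rate, frequency and amplitude are arbitrary values chosen for the example, not properties of the lab recording.

```r
sr_toy <- 1000                           # hypothetical sampling rate (Hz)
time_s <- (0:(sr_toy - 1)) / sr_toy      # 1 s of time points
x <- 0.5 * sin(2 * pi * 10 * time_s)     # 10 Hz sine with peak amplitude 0.5

mean_x <- mean(x)                        # positive and negative halves cancel
rms_x  <- sqrt(mean(x^2))                # squaring removes the sign first
```

For a sine wave spanning a whole number of cycles, the mean is zero (up to rounding error), while the RMS equals the peak amplitude divided by sqrt(2), here about 0.354.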

Using this knowledge, add these new measures into your for loop. The end result should look like this:

data$segments
##     word phone     start       end     TTz_max     TTz_min    TMz_min
## 1  asas@     a  1.540144  1.706447  2.54847133 -4.09755213  0.8646487
## 2  asas@     s  1.706447  1.917608  4.29187893  2.54847133  3.4604227
## 3  asas@     a  1.917608  2.105793  3.44575580 -4.23246091  1.7615404
## 4  asas@     s  2.105793  2.293978  4.52813548  0.96624229  3.6183587
## 5  asas@     @  2.293978  2.477787  3.31009577 -2.83205554  3.8145834
## 6  olol@     o  3.487426  3.665764  2.06580198 -6.35449624  0.8131888
## 7  olol@     l  3.665764  3.787209  4.47762063  1.16246682  2.2203934
## 8  olol@     o  3.787209  4.015876  2.10288920 -5.33099759  2.0070579
## 9  olol@     l  4.015876  4.138415  4.17636645  0.72911233  2.7306206
## 10 olol@     @  4.138415  4.305812  0.72911233 -4.64848357  1.5677006
## 11 epep@     e  5.399845  5.522384 -1.32956077 -3.99495397  9.9319069
## 12 epep@     p  5.522384  5.710570  0.18837983 -2.14108260  9.5326802
## 13 epep@     e  5.710570  5.851708 -1.56527816 -2.36715357 10.1805743
## 14 epep@     p  5.851708  6.001600 -0.55879101 -3.60921969  2.9932604
## 15 epep@     @  6.001600  6.185409 -1.97232188 -4.41110334  1.6790136
## 16 afaf@     a  7.245524  7.395416 -1.95156077 -5.31095088  1.0908505
## 17 afaf@     f  7.395416  7.573754 -0.13421389 -1.95156077  2.3543239
## 18 afaf@     a  7.573754  7.779445 -1.06225810 -4.06737884  1.2168604
## 19 afaf@     f  7.779445  7.943560 -0.19368307 -2.82346882  1.7906092
## 20 afaf@     @  7.943560  8.138310 -0.36762814 -2.41778215  2.8063199
## 21 {s{s@     {  9.271109  9.436318  0.02933077 -4.56383327  5.1184877
## 22 {s{s@     s  9.436318  9.619033  4.55545623  0.02933077  4.6942912
## 23 {s{s@     {  9.619033  9.821441  3.66402490 -3.12882622  4.4506495
## 24 {s{s@     s  9.821441  9.981180  4.48511063  1.42136153  5.6514237
## 25 {s{s@     @  9.981180 10.152954  3.48642351 -1.38420334  5.0487260
## 26 {r{r@     { 11.733739 11.900042 -0.40617802 -4.57663291  6.6577876
## 27 {r{r@     r 11.900042 12.067439  9.73559733 -0.40617802  8.9481795
## 28 {r{r@     { 12.067439 12.325647  8.01562076 -2.53395513  6.3253801
## 29 {r{r@     r 12.325647 12.445998  9.33192148  3.88548375 10.3500723
## 30 {r{r@     @ 12.445998 12.604643  7.86432459 -1.38026630  4.3955582
## 31 itit@     i 13.669756 13.793389  6.69426491 -0.45947150 15.4639283
## 32 itit@     t 13.793389 14.045032  9.31006561  0.93752036 13.6387579
## 33 itit@     i 14.045032 14.180701  4.92477498  0.63855831 14.2497980
## 34 itit@     t 14.180701 14.365604  6.69665261  2.27525222  9.0048556
## 35 itit@     @ 14.365604 14.563636  2.27525222 -4.58266512  2.8581860
##       TDz_min        RMS
## 1   1.7692839 0.21341366
## 2   0.2561617 0.02628028
## 3   3.0627321 0.22366990
## 4   2.9999151 0.02504962
## 5   3.4705340 0.11400046
## 6   5.7130302 0.26782050
## 7   5.4460862 0.14981204
## 8   5.3162659 0.25029961
## 9   7.4895393 0.13692561
## 10  5.0229051 0.14642305
## 11 11.1282641 0.20156290
## 12  8.8772668 0.03375459
## 13 10.0433870 0.21673725
## 14  5.3519735 0.01695539
## 15  3.8695140 0.12323786
## 16  2.9132552 0.20308963
## 17  3.4587193 0.01587763
## 18  3.9373829 0.20515820
## 19  4.4131487 0.01869326
## 20  4.4449821 0.10738471
## 21  3.9379478 0.20843238
## 22  2.6192245 0.02760202
## 23  4.8941013 0.18626453
## 24  4.7912468 0.02110424
## 25  4.8623153 0.08750954
## 26  6.8074264 0.29118905
## 27  7.1324980 0.16219857
## 28  6.3685775 0.23495124
## 29  7.7238796 0.08319145
## 30  3.5727220 0.07660697
## 31 13.2667758 0.11699480
## 32  9.0866071 0.01718078
## 33 10.9417301 0.12504367
## 34  5.5001755 0.01511269
## 35  2.9231444 0.10693698

Correlation: RMS and height

The final step of the exercise is to determine which portions of the tongue correlate most strongly with acoustic amplitude. To do this, you need to use the cor.test() function to determine the separate correlations between RMS and TTz_min, between RMS and TMz_min, and between RMS and TDz_min.

As an example of how to use the cor.test() function, we can see that there is a very strong positive correlation (Pearson’s r = 0.82) between the vertical minima of the middle of the tongue and the vertical minima of the tongue dorsum:

cor.test(data$segments$TMz_min, data$segments$TDz_min)
## 
##  Pearson's product-moment correlation
## 
## data:  data$segments$TMz_min and data$segments$TDz_min
## t = 8.2614, df = 33, p-value = 1.533e-09
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.6714968 0.9063009
## sample estimates:
##       cor 
## 0.8210203
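
The printed summary is convenient for reading, but the individual numbers can also be pulled out of the object that cor.test() returns. The sketch below uses randomly generated vectors (a and b are invented, not lab measurements) to show where the coefficient, p-value, confidence interval and degrees of freedom are stored, and how the t statistic relates to r.

```r
set.seed(1)
a <- rnorm(35)                  # made-up data, 35 values like the segment table
b <- a + rnorm(35, sd = 0.5)    # correlated with a by construction

ct <- cor.test(a, b)            # Pearson correlation is the default method

r_val <- unname(ct$estimate)    # correlation coefficient
p_val <- ct$p.value             # p-value
ci    <- ct$conf.int            # 95 percent confidence interval
dof   <- unname(ct$parameter)   # degrees of freedom, n - 2

# For Pearson's test, t can be recovered from r and the degrees of freedom
t_check <- r_val * sqrt(dof / (1 - r_val^2))
```

Applying the same relation to the output above, 0.8210203 * sqrt(33 / (1 - 0.8210203^2)) is approximately 8.26, which agrees with the reported t value.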

Time to think #1: Why does the height of the middle of the tongue correlate strongly with the height of the tongue dorsum?

Time to think #2: Which portion of the tongue has the strongest correlation with acoustic amplitude? Why do you think this is the case?

Finally, let’s save the resulting data to use next week:

save(data, file="EMA-EMG_data2.Rda")
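
As a reminder of how save() and load() pair up, here is a small round trip using a throwaway object and a temporary file (toy and the temporary path are placeholders for illustration). Note that load() restores objects under the names they were saved with, so next week’s file will again give you an object called data.

```r
toy <- list(values = 1:3)           # throwaway object for the demonstration
f <- tempfile(fileext = ".Rda")     # temporary file, not the lab file

save(toy, file = f)                 # write the object to disk
rm(toy)                             # remove it from the workspace

restored <- load(f)                 # load() returns the restored object names
```

After the call to load(), toy exists again in the workspace, and restored holds its name as a character string.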