The goal of today’s analysis is to make phone-wise measurements from an
audio file, using the time points included in a Praat TextGrid. Rather
than use a pre-prepared R data file, you’ll use phonTools to read a raw audio file, which is more like a real-world speech data analysis scenario!
Both group A and B will use the same audio file, which can be loaded from here:
audio <- phonTools::loadsound("/shared/groups/jrole001/pals0047/data/peer_review_2.wav")
The sampling rate is in the field audio$fs and the audio data itself is in audio$sound.
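As a quick check on those two fields, the duration of a recording in seconds is the number of samples divided by the sampling rate, and a time point in seconds can be converted to a sample index with the same relationship. The sketch below uses a made-up stand-in list (toy_audio, a hypothetical name) rather than the real file, so it runs anywhere; only the field names fs and sound are taken from the text above, and the helper time_to_sample() is likewise invented for illustration.

```r
# Stand-in for the object returned by phonTools::loadsound(); the values are
# invented for illustration, only the field names fs and sound come from above.
toy_audio <- list(fs = 16000,
                  sound = sin(2 * pi * 100 * (0:15999) / 16000))

# Duration in seconds = number of samples / samples per second
duration <- length(toy_audio$sound) / toy_audio$fs

# Convert a time point (in seconds) to the nearest sample index.
# R indexes from 1, so time 0 corresponds to sample 1.
time_to_sample <- function(t, fs) {
  round(t * fs) + 1
}

# Extract the samples between 0.25 s and 0.50 s
first <- time_to_sample(0.25, toy_audio$fs)
last  <- time_to_sample(0.50, toy_audio$fs)
chunk <- toy_audio$sound[first:last]
```

The same conversion is what lets TextGrid time points (in seconds) pick out a stretch of audio$sound for a given interval.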
Additionally, both groups will use the same TextGrid file, which can be loaded from here:
tg <- rPraat::tg.read("/shared/groups/jrole001/pals0047/data/peer_review_2.TextGrid")
If you have been assigned data set A, use this command to retrieve your Rmd file:
file.copy("/shared/groups/jrole001/pals0047/review/PALS0047_review2a.Rmd","~")
If you have been assigned data set B, use this command to retrieve your Rmd file:
file.copy("/shared/groups/jrole001/pals0047/review/PALS0047_review2b.Rmd","~")
The analysis task for both groups should follow a similar pipeline; the only difference is which measurements you will make and which kind of results plot you will create.
Therefore, I recommend that you follow a pipeline similar to the following:
1. Step through the intervals of the phone tier (for loop)
2. Check whether each interval contains a target phone segment (if statement)
3. Make the measurements inside the for loop (step 1, above), so that the measurements are only added if the if condition is met (step 2, above)

Because this task is more complicated than the first peer code review task, I will give you partially completed code and you will need to fill in the missing parts. In this way, I am leading you toward the final assessment… which you will do on your own!
You will need to pre-allocate a data frame for the
results, and this data frame should include a number of rows equal to
the number of phone segments in the recording. However, there is a small
problem: since there are sections of the recording that
don’t contain the target phone segments, those
intervals will be blank. This means that you can’t simply use
length(tg$phone$label)
to determine the number of phone
segments!
An easy way around this problem is to use the knowledge that the label of each
empty interval contains 0 characters and the label of each target phone segment
interval contains 1 character. Since the phone tier labels are
character data, you can use the
nchar()
(“number of characters”) function to create a
vector of character counts:
nchar(tg$phone$label)
## [1] 0 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 1 0
## [38] 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 1 0 1
## [75] 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 1 0 1 1
## [112] 1 1 1 0 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1
## [149] 1 1 0 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1
## [186] 1 0 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 1
## [223] 0 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 1 0
All of the 0 elements correspond to blank intervals and all of the 1 elements correspond to target phone segments. Knowing this, you can simply sum this vector to create a total count of target phone segments:
N <- sum(nchar(tg$phone$label))
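Note that summing the character counts only gives the number of segments because every target label here is exactly one character long. If a label were ever longer (a diphthong written with two characters, say), counting the non-empty labels directly would be safer. A small sketch with invented labels, where toy_labels is a hypothetical stand-in for tg$phone$label:

```r
# Invented labels standing in for tg$phone$label: blank intervals are ""
toy_labels <- c("", "a", "i", "u", "", "ai", "o", "")

# Summing character counts over-counts the two-character label "ai"
n_by_sum <- sum(nchar(toy_labels))

# Counting non-empty labels gives one count per labelled interval
n_by_count <- sum(nchar(toy_labels) > 0)
```

With these toy labels n_by_sum is 6 while n_by_count is 5. For today's TextGrid the two approaches agree, since all target labels are single characters.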
Now that you know how many target phone segments there are in the recording, you can create a new data frame with different variables that you might need for your results and plotting.
For example, neither group A nor group B will be measuring pitch, but a data frame that is pre-allocated to include f0 measurements (numeric data type) for each target phone segment (character data type) might look like this:
data <- data.frame(phone = character(N),
                   f0 = numeric(N))
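To show how the pre-allocated data frame fits together with the for loop and if statement, here is a minimal sketch on invented data. It assumes an interval tier stored as a list with t1 (start times), t2 (end times), and label, which is how rPraat represents interval tiers; toy_tier and toy_results are hypothetical names, and duration is used purely as a placeholder measurement, not the one either group has been assigned.

```r
# Invented stand-in for tg$phone: an interval tier with start times (t1),
# end times (t2), and labels, where blank labels mark non-target intervals.
toy_tier <- list(t1    = c(0.0, 0.2, 0.5, 0.6),
                 t2    = c(0.2, 0.5, 0.6, 1.0),
                 label = c("",  "a", "",  "i"))

# Pre-allocate one row per target (non-blank) interval
N <- sum(nchar(toy_tier$label))
toy_results <- data.frame(phone    = character(N),
                          duration = numeric(N),
                          stringsAsFactors = FALSE)

row <- 0  # index of the most recently filled row in toy_results
for (i in seq_along(toy_tier$label)) {   # visit every interval in the tier
  if (nchar(toy_tier$label[i]) > 0) {    # only act on target intervals
    row <- row + 1
    toy_results$phone[row]    <- toy_tier$label[i]
    # Placeholder measurement: interval duration in seconds
    toy_results$duration[row] <- toy_tier$t2[i] - toy_tier$t1[i]
  }
}
```

A separate row counter is needed because the loop index i counts all intervals, blank ones included, whereas the data frame only has rows for the target segments.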
During the second part of today’s session you will practice peer code review, where you will review a partner’s code and they will review yours. Feel free to make suggestions! And most importantly: learn from other people’s unique perspectives to help make your own code more efficient in the future.
During the code review, try to focus on the following aspects, which are those that I will be using to review your final assessment this term: