You’re going to use the same data set that you used for the first peer code review session last week:
load("/shared/groups/jrole001/pals0047/data/EMA-EMG_data2.Rda")
Let’s remind ourselves of the speech segments in the data:
head(data$segments)
## word phone start end TTz_max TTz_min TMz_min TDz_min
## 1 asas@ a 1.540144 1.706447 2.548471 -4.0975521 0.8646487 1.7692839
## 2 asas@ s 1.706447 1.917608 4.291879 2.5484713 3.4604227 0.2561617
## 3 asas@ a 1.917608 2.105793 3.445756 -4.2324609 1.7615404 3.0627321
## 4 asas@ s 2.105793 2.293978 4.528135 0.9662423 3.6183587 2.9999151
## 5 asas@ @ 2.293978 2.477787 3.310096 -2.8320555 3.8145834 3.4705340
## 6 olol@ o 3.487426 3.665764 2.065802 -6.3544962 0.8131888 5.7130302
## RMS
## 1 0.21341366
## 2 0.02628028
## 3 0.22366990
## 4 0.02504962
## 5 0.11400046
## 6 0.26782050
In today’s session, you’re going to incorporate 3 dimensions of data into a single 2-dimensional figure. To demonstrate, we’ll investigate the dynamics of tongue tip movement during the VCVC sequence “asas” in the first word, “asas@”. Let’s first get the start point of the first “a” and the end point of the second “s”:
t1 <- round(data$segments$start[1]*data$SR$EMA)
t2 <- round(data$segments$end[4]*data$SR$EMA)
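The conversion from a time in seconds to a sample index can be checked by hand. The sketch below assumes a sampling rate of 250 Hz purely for illustration (the real value is whatever is stored in data$SR$EMA) and uses the two segment boundaries shown in the table above; the names sr_ema, i1 and i2 are made up for this example.

```r
# Hypothetical sampling rate in Hz; the real value is stored in data$SR$EMA.
sr_ema <- 250

# Start of the first "a" and end of the second "s", in seconds (from data$segments).
start_s <- 1.540144
end_s   <- 2.293978

# Seconds multiplied by samples-per-second gives a fractional sample position;
# round() turns it into a whole number that can be used as an index.
i1 <- round(start_s * sr_ema)
i2 <- round(end_s * sr_ema)

c(i1, i2)        # sample indices of the two boundaries
length(i1:i2)    # number of samples in the selected window
```

With a different sampling rate the indices change, but the logic stays the same: time multiplied by rate, then rounded.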
If we want to plot the tongue tip movement over time on the x-axis, we have to choose only one EMA dimension to plot on the y-axis. For example, we can plot the horizontal tongue position over time:
plot(data$EMA$TT_x[t1:t2],
ylab="Horizontal tongue tip (<- back | forward ->)")
…or we can plot the vertical tongue position over time:
plot(data$EMA$TT_z[t1:t2],
ylab="Vertical tongue tip (<- down | up ->)")
But what if we want to plot both of these dimensions together? To do so, we need to plot one dimension against the other:
plot(data$EMA$TT_x[t1:t2],
data$EMA$TT_z[t1:t2],
xlab="Horizontal tongue tip (<- back | forward ->)",
ylab="Vertical tongue tip (<- down | up ->)")
Wow, that’s cool! This is the 2-dimensional path that the tongue tip follows during the articulation of the VCVC sequence. But there is an important dimension we’ve lost here: time. For example, how do we know which dot is the beginning of the sequence and which is the end? And how can we follow the path between these two points?
Today, you’ll learn how to retain this time information by incorporating a third dimension into your plotting using color.
In this tutorial session, you’ll be creating a function that uses a color gradient, but before you do we first need to discuss how color is represented in a computer. You may be familiar with the concept of what’s called the RGB “color space”, which defines the relationship between three colors: Red, Green, Blue. Any color can be expressed by the relative amount/intensity of each of these three color channels:
RGB color space. Source: https://ieeexplore.ieee.org/document/8650504
For example, BLACK can be represented as no intensity in any of the RGB channels: (0, 0, 0), WHITE can be represented as full intensity in each of the RGB channels: (1, 1, 1), and 50% gray can be represented as half intensity in each of the RGB channels: (0.5, 0.5, 0.5). Any other color can be represented in the same way. RED is defined as full intensity in the R channel and no intensity in the G and B channels: (1, 0, 0). Conversely, CYAN is defined as no intensity in the R channel and full intensity in the G and B channels: (0, 1, 1).
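If you want to check these values yourself, base R can report the RGB channels of any named color. The sketch below uses col2rgb(), which returns the channels on a 0-255 scale, so dividing by 255 gives the 0-1 scale used in this tutorial; the variable names are only for illustration.

```r
# A named vector, so that the columns of the result are labeled.
named_cols <- c(black = "black", white = "white", red = "red", cyan = "cyan")

# col2rgb() returns one column per color and one row per channel (0-255).
channels <- col2rgb(named_cols)

# Rescale to the 0-1 intensities used in the text.
channels / 255
```

Each column should match the triplets given above, for example (0, 1, 1) for CYAN.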
To represent time in our EMA trajectory plot, we’re going to plot changing colors along a gradient, i.e. equal changes of color at each step along a scale from one color to another. To do this, we’ll incorporate a function that we’ve used before: seq(), which creates a vector of evenly-spaced values from one value to another. For example, we can create a vector of 10 values between 0 and 1:
seq(0,1,length.out=10)
## [1] 0.0000000 0.1111111 0.2222222 0.3333333 0.4444444 0.5555556 0.6666667
## [8] 0.7777778 0.8888889 1.0000000
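Two properties of seq() matter for building a gradient: the steps are all the same size, and the sequence can run downwards as well as upwards. A quick sketch to confirm both (the variable names are only for illustration):

```r
# Ten evenly-spaced values from 0 to 1.
up <- seq(0, 1, length.out = 10)

# diff() returns the step between neighbouring values; every step is 1/9.
steps <- diff(up)
steps

# Swapping the endpoints gives a descending sequence.
down <- seq(1, 0, length.out = 5)
down
```

A descending sequence is what lets a color channel fade out while another fades in.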
We can use three calls to this function to create a matrix with 3 columns, each containing evenly spaced values:
cbind(seq(0,1,length.out=10),
seq(0,1,length.out=10),
seq(0,1,length.out=10))
## [,1] [,2] [,3]
## [1,] 0.0000000 0.0000000 0.0000000
## [2,] 0.1111111 0.1111111 0.1111111
## [3,] 0.2222222 0.2222222 0.2222222
## [4,] 0.3333333 0.3333333 0.3333333
## [5,] 0.4444444 0.4444444 0.4444444
## [6,] 0.5555556 0.5555556 0.5555556
## [7,] 0.6666667 0.6666667 0.6666667
## [8,] 0.7777778 0.7777778 0.7777778
## [9,] 0.8888889 0.8888889 0.8888889
## [10,] 1.0000000 1.0000000 1.0000000
If the values in this matrix were RGB values—R in column 1, G in column 2, B in column 3—the colors in each row would change evenly from BLACK in the first row (0,0,0) to WHITE in the last row (1,1,1). We can apply this same logic to create a similar matrix with RGB values that change evenly from RED (1,0,0) to BLUE (0,0,1):
cbind(seq(1,0,length.out=10),
seq(0,0,length.out=10),
seq(0,1,length.out=10))
## [,1] [,2] [,3]
## [1,] 1.0000000 0 0.0000000
## [2,] 0.8888889 0 0.1111111
## [3,] 0.7777778 0 0.2222222
## [4,] 0.6666667 0 0.3333333
## [5,] 0.5555556 0 0.4444444
## [6,] 0.4444444 0 0.5555556
## [7,] 0.3333333 0 0.6666667
## [8,] 0.2222222 0 0.7777778
## [9,] 0.1111111 0 0.8888889
## [10,] 0.0000000 0 1.0000000
There is a handy function in R that can change these RGB values to hexadecimal (hex) color codes, which is called… you guessed it… rgb():
mycols <- cbind(seq(1,0,length.out=10),
seq(0,0,length.out=10),
seq(0,1,length.out=10))
rgb(mycols)
## [1] "#FF0000" "#E3001C" "#C60039" "#AA0055" "#8E0071" "#71008E" "#5500AA"
## [8] "#3900C6" "#1C00E3" "#0000FF"
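To see what a hex color code contains, it can help to decode one by hand. Each code is a "#" followed by three two-digit hexadecimal numbers, one per channel, each between 00 (0) and FF (255). The sketch below decodes the second code from the output above using base R; the variable names are only for illustration.

```r
# Split one of the hex codes above into its three two-digit channel values.
hex <- "#E3001C"
channels <- substring(hex, c(2, 4, 6), c(3, 5, 7))   # "E3" "00" "1C"

# strtoi() with base 16 converts each hexadecimal pair to an integer in 0-255.
vals <- strtoi(channels, base = 16L)
vals

# Dividing by 255 returns to the 0-1 scale. The result is close to, but not
# exactly, the original 0.8888889 / 0 / 0.1111111, because a hex code only
# stores 256 levels per channel.
vals / 255

# col2rgb() performs the same decoding in one step.
col2rgb(hex)
```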
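These hex color codes can be used for plotting colors. R recycles the color vector when it is shorter than the data, so a single color applies to every point, while a gradient needs a vector of colors with the same length as the data vectors to be plotted. For example, you can plot 10 large squares* and give them all the color red: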
*pch is used to change the shape of the plotting symbol and cex is used to change its size (see details)
plot(1:10, 1:10,
col="red",
pch=15, cex=3)
Or instead you can use the vector of 10 hex color codes we just created to plot along a gradient from red to blue:
plot(1:10, 1:10,
col=rgb(mycols),
pch=15, cex=3)
Applying a color gradient to the plot of EMA data follows the exact same logic as for these 10 squares, but instead of creating a vector with exactly 10 colors, the vector of colors has the same length as the number of samples in the EMA data:
N <- length(data$EMA$TT_x[t1:t2])
mycols <- cbind(seq(1,0,length.out=N),
seq(0,0,length.out=N),
seq(0,1,length.out=N))
plot(data$EMA$TT_x[t1:t2],
data$EMA$TT_z[t1:t2],
col=rgb(mycols))
In this version of the 2-dimensional plot, time is now represented by color: the speech sequence “asas” begins at red (the start of the first “a”), traverses through the changing color space, and ends at blue (the end of the second “s”). This new plot gives us 3 dimensions of information in one figure!
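Because the mapping from color to time is not shown on the plot itself, it can be useful to label the endpoints. The sketch below is one possible way to do this, using a synthetic spiral as a stand-in for the EMA trajectory (it is not the real data) so that it runs without the data file; the variable names tt, x, z and cols are made up for this example.

```r
# Synthetic stand-in for an EMA trajectory (NOT the real data): a spiral path.
N  <- 200
tt <- seq(0, 2 * pi, length.out = N)
x  <- tt * cos(tt)
z  <- tt * sin(tt)

# Same red-to-blue gradient as above, one color per sample.
mycols <- cbind(seq(1, 0, length.out = N),
                seq(0, 0, length.out = N),
                seq(0, 1, length.out = N))
cols <- rgb(mycols)

plot(x, z, col = cols, xlab = "x", ylab = "z")

# Label the first and last samples so the direction of travel is explicit.
text(x[c(1, N)], z[c(1, N)], labels = c("start", "end"), pos = 3)

# A small legend tying the two end colors to time.
legend("topleft", legend = c("start of sequence", "end of sequence"),
       col = cols[c(1, N)], pch = 1)
```

The same text() and legend() calls could be added after the EMA plot, using the EMA vectors in place of x and z.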
The goal of this exercise is for you to create a function, called gradientEMA(), which will plot any length of EMA data with a color gradient from any set of RGB values to any other set of RGB values.
Your function should contain the following arguments without default values:
x: data to be plotted along the x-axis
y: data to be plotted along the y-axis
The function should also contain the following arguments with default values:
from: the starting color, represented by RGB values in the scale [0,1], with a default value representing RED
to: the ending color, represented by RGB values in the scale [0,1], with a default value representing BLUE
pch: a numeric value of the plotting symbol to be used, with a default value of 1 (i.e. open circle)
Your function should create a plot similar to the following when using only the x and y arguments…
gradientEMA(data$EMA$TT_x[t1:t2], data$EMA$TT_z[t1:t2])
…but also a plot similar to the following when using different values for the arguments from, to, and pch:
gradientEMA(data$EMA$TT_x[t1:t2], data$EMA$TT_z[t1:t2], from=c(1,0,1), to=c(0,1,0), pch=19)
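As a reminder of how arguments with and without default values behave, here is a small sketch using a hypothetical helper called rescale(), which is unrelated to plotting and is not part of the exercise solution. The argument x has no default and must always be supplied, whereas lo and hi fall back on their defaults unless they are overridden by name.

```r
# Hypothetical helper: rescale a numeric vector to the range [lo, hi].
# x has no default value; lo and hi default to 0 and 1.
rescale <- function(x, lo = 0, hi = 1) {
  (x - min(x)) / (max(x) - min(x)) * (hi - lo) + lo
}

# Only the required argument is given, so the defaults are used.
r1 <- rescale(c(2, 4, 6))
r1

# Both defaults are overridden by naming the arguments.
r2 <- rescale(c(2, 4, 6), lo = -1, hi = 1)
r2
```

A default value does not have to be a single number: a vector such as c(1, 0, 0) works in exactly the same way.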
Time to think: what does the time-course of the tongue tip motion in the sequence “asas” teach us about the nature of speech targets and speech dynamics?