[MUSIC] Welcome to Module 2. In Module 2, we're going to talk about applied laryngeal physiology. Our objectives here are to discuss the source-filter theory of sound production and to use that acquired knowledge of voice production techniques to understand vocal activity for speaking and singing. Remember, in the source-filter theory of sound production, the power supply is the lungs. The lungs passively exhale air, which drives the vocal folds into vibration. The vibration of the vocal folds creates sound. It's a neutral vowel sound, the schwa vowel. And then that is resonated through the supraglottic vocal tract, or the resonating cavity, to allow that neutral vowel sound to be shaped into words. Laryngeal function for voice requires intact laryngeal anatomy, such that the vocal folds are brought into an adducted, nearly closed configuration for vibration to begin. Vibration occurs according to the myoelastic aerodynamic theory, such that vibration is a passive phenomenon that produces the sound source, or the buzzing, that is then shaped into words by our lips and tongue. Sound is simply the transmission of air pressure waves that are processed by a receiver. The vibrations of the vocal folds alter the air pressure at the level of the larynx, or the level of the glottis, to create that change in air pressure that our human ear receives. So we can see here, vocal fold vibration begins with the vocal folds slightly apart. Because of the Bernoulli principle, the conservation of energy in flowing fluids, as the air passes through the vocal folds, or the glottis, it speeds up and the pressure drops. The pliable vocal fold membranes, the myoelastic portion, are drawn together by this pressure drop and then are slowly blown open again as the pressure builds up below the vocal folds. This vibration creates changes in air pressure, which creates sound. The quality or the richness of that sound is determined by how regularly these vocal fold vibrations occur.
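To put rough numbers on the Bernoulli effect described above, here is a minimal Python sketch. It treats air as an ideal, incompressible, frictionless fluid, which is a simplification of real glottal airflow. The air density, the areas, and the airspeed are illustrative assumptions, not measurements from this lecture, and the function name `glottal_pressure_drop` is hypothetical.

```python
AIR_DENSITY = 1.2  # kg/m^3, assumed typical value for air; not from the lecture


def glottal_pressure_drop(speed_below, area_below, area_glottis,
                          density=AIR_DENSITY):
    """Return (airspeed in the glottis in m/s, pressure drop in pascals).

    Idealized model: steady, incompressible, frictionless flow.
    """
    if area_below <= 0 or area_glottis <= 0:
        raise ValueError("areas must be positive")
    # Continuity: the same volume of air per second crosses both sections,
    # so the air must speed up where the passage narrows.
    speed_glottis = speed_below * area_below / area_glottis
    # Bernoulli: pressure + 0.5 * density * speed^2 stays constant,
    # so a rise in speed means a drop in pressure.
    drop = 0.5 * density * (speed_glottis ** 2 - speed_below ** 2)
    return speed_glottis, drop


if __name__ == "__main__":
    # Assumed example: airway of 2 cm^2 narrowing to a 0.1 cm^2 glottal gap.
    speed, drop = glottal_pressure_drop(1.0, 2.0e-4, 0.1e-4)
    print(f"glottal airspeed: {speed:.1f} m/s, pressure drop: {drop:.1f} Pa")
```

The narrower the gap, the faster the air and the larger the pressure drop, which is the pull that draws the pliable vocal folds together in this simplified picture.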
The intensity or loudness of that sound is determined by the degree of air pressure change, how complete the closure is, and how rapidly the closing phase occurs. Both of these require intact nerves to bring the vocal folds into this pre-phonatory configuration, and then pliable, elastic tissue of the vocal folds to allow them to be drawn inward rapidly and close. That determines the intensity and the richness of the sound. Sound is the product of laryngeal vibration. The vocal folds are fixed at both ends, the posterior end and the anterior end, the back and the front. These fixed points are referred to as nodes. This is different from a clinical node that may form in the mid portion of the vocal folds, where the majority of the vibration occurs. Actually, in this context, this mid portion is referred to as the antinode, or the widest point of excursion from the fixed point on either end. Because the vocal folds are actually living tissue, as they begin to be driven into vibration by the air coming through them, they can be broken up into an infinite number of different segments. Each of these different segments then can be represented as a sine wave. The number of sine waves determines the richness or the quality of the sound. So if our vocal folds are very pliable, they'll have multiple different sine waves or multiple different areas of vibration. Remember, the waves that are in phase with each other are propagated. They're only in phase with each other if they're a whole-number multiple of the primary wave. Seen here, the pink wave is the primary wave. It's the largest sine wave in this cycle and is the fundamental frequency, or the relative pitch, of voice production. This dark blue wave is occurring twice as fast as the primary or pink wave. This is the second harmonic, a wave that's occurring twice as fast, a whole-number multiple of the primary wave. This light blue wave here is occurring three times as fast as the primary wave.
Again, it is in phase because it's a whole-number multiple of the primary pink wave. Through mathematical modeling then, each of these waves can be represented by the log of its amplitude, or loudness, and the log of its frequency, or pitch. So if our pink wave, our primary wave, is the pitch of our voice, it's also referred to as the first harmonic in a harmonic source spectrum. The dark blue wave, which is twice as fast as the primary or pink wave, is referred to as the second harmonic. And then the third harmonic is the wave that's happening three times as fast as the first harmonic or fundamental frequency, this pink line here. All of the other waves in between here have been cancelled out, and these waves are being propagated. This forms the source spectrum for laryngeal vibration. In the source-filter theory of sound production, the source spectrum is produced by vocal fold vibration. Again, as air is exhaled by the lungs, it drives the vocal folds into vibration. And then the vocal folds vibrate, producing this source spectrum, which we discussed on our last slide. The vocal tract then amplifies or dampens certain areas of this source spectrum. The amplified areas are resonated even more. They're referred to as the formants. Again, just as the source spectrum goes on to infinity, the areas that are amplified or dampened by the shape of our throat go on to infinity. But mathematically, we can only model the first three to five of these regions. It becomes very complex after this. The areas that are amplified, again, are referred to as the formant frequencies. They're determined by the length, the shape, and the openings of the supraglottic vocal tract. Simply put, the supraglottic vocal tract is a tube. Every tube has certain resonant or vibratory characteristics.
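A small Python sketch can make the source and the filter concrete. It lists the harmonics of a fundamental frequency, then the resonances of an idealized tube, and reports which harmonic falls nearest each resonance. The uniform tube closed at the vocal folds and open at the lips, the 17.5 cm length, and the 350 m/s speed of sound are textbook-style assumptions that do not come from this lecture, and all function names are hypothetical. A real vocal tract is not a uniform tube, so these numbers only approximate a neutral vowel.

```python
SPEED_OF_SOUND = 350.0  # m/s, assumed value for warm air in the vocal tract


def harmonics(fundamental_hz, count):
    """Source spectrum frequencies: whole-number multiples of the fundamental."""
    return [n * fundamental_hz for n in range(1, count + 1)]


def tube_resonances(length_m, count, speed=SPEED_OF_SOUND):
    """Resonances of a uniform tube closed at one end and open at the other.

    They fall at odd multiples of speed / (4 * length).
    """
    if length_m <= 0:
        raise ValueError("tube length must be positive")
    return [(2 * k - 1) * speed / (4.0 * length_m) for k in range(1, count + 1)]


def nearest_harmonic(resonance_hz, fundamental_hz):
    """Number of the harmonic closest to a resonance (never below 1)."""
    return max(1, round(resonance_hz / fundamental_hz))


if __name__ == "__main__":
    f0 = 100.0  # assumed fundamental in cycles per second
    print("first harmonics:", harmonics(f0, 5))
    for i, resonance in enumerate(tube_resonances(0.175, 3), start=1):
        n = nearest_harmonic(resonance, f0)
        print(f"resonance {i}: about {resonance:.0f} Hz, nearest harmonic {n}")
```

Shortening or lengthening the assumed tube shifts every resonance, which is the same effect as changing the length of the vocal tract.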
If I produce a neutral vowel with my larynx and cup my hands in front of my mouth, I can change the sound of that vowel [SOUND], simply by changing the length of the tube through which that neutral vowel is moving. In a similar manner, I don't need my hands. I can use my tongue, the side walls of my throat, and my lips to change the shape of that tube. And that's what I'm doing when I produce these different formant frequencies. The vowel is determined by the absolute amplitude and frequency of F1 and F2 and the relationship between them. A, E, I, O, U all have different F1 and F2 frequencies. That's what allows us to produce different vowels. Now, the color of the vowels then is determined by the later formants. So I can say A with a Southern accent, or I can say A with a Northern accent. And that is determined by the shape of my throat, and that's what gives me an accent. Similarly, singers can change their style of singing voice production by slightly changing the position of the later formants, F3 through F5. This is important because singing styles began prior to the development of electronic amplification. If a singer in an opera house wanted to be heard, they had to have something that separated them from the noise of the instruments in the orchestra pit. That something that separates them is the singer's formant. And what happens is they cluster the later formants together by slightly changing the shape of their throat, so that these later formants have higher amplitude or intensity. This rise in amplitude or intensity occurs at about 2,800 to 3,000 cycles per second. That's the natural vibratory frequency of the human ear bones. Because singers can take advantage of that when they sing in classical styles, they can be heard over the background noise of the musical instruments. We sometimes do this in the speaking voice.
If we want to call to our children or if we want to yell to the dog to be heard, we subtly change the shape of our throat to amplify formant structure in these different regions, so that our voice will be heard farther and above the background noise of the street. There are also principles by which these formants relate to each other. The loudness or amplitude of each formant is determined by the loudness or amplitude of the formant just before it. So if I want to be heard, I have to use volume. I have to speak up. The louder these earlier harmonics are, the louder my F1 will be. Just by definition, the louder my F1, the louder my F2. So my vowel can be heard. If I want to make F3 through F5 louder, if I change the shape of my throat to cluster these formants closer together, they will give each other a relative rise in amplitude. Louder can be perceived as better. And this rise is dependent on the absolute loudness of the earlier harmonics in the earlier formants. And it's also dependent on how densely packed these harmonics are. So if I have a low-pitched sound, it's easier to make that louder and have more resonance in it, because there is going to be more harmonic structure. If the pitch is only 100 cycles per second, I'm going to have twice as many harmonics in a given frequency range as if it were 200 cycles per second, a higher pitch. For all styles or components of voice production in speaking and singing, we have three basic mechanisms of control. The first is our lung air pressure. This is referred to as the subglottic pressure, or the pressure below the vocal folds. And it helps determine how rapidly the vocal folds close and are blown open. The second is we can choose between different tensions in the thyroarytenoid muscle and cricothyroid muscle to control the vibratory frequency of our vocal folds, so we can go down in pitch and up in pitch. We can stiffen the vocal folds.
The last thing we can do is alter the shape of the filter, or vocal tract, to create different formant structure. We do this by controlling the length and shape of the vocal tract and the degree of opening at the mouth and just above the vocal folds. Let's talk about these individually. Subglottic pressure is the pressure below the vocal folds. It's determined by two components, the force of the expelled air and the degree of laryngeal closure. Why do I want to get loud? I want to get loud so I can be heard. How do I get loud? I close my vocal folds completely, and I close them rapidly. I can't really control the elastic nature of my vocal folds on a moment-to-moment basis. But I can certainly control how much air I push through them or how tightly I keep them closed together by contracting the muscles. These give me volume. If I want louder volume, I can either use more air or use more laryngeal pressure or closure. Scientists believe that using more air is kinder or gentler to the vocal folds. It provides a cushion between the vocal folds and is less traumatic. Again, the degree of this closure and the rate of change of this closure are directly related to the loudness of the sound. Volume allows us to be heard. The next thing I can do to communicate with you is stiffen my vocal folds to change the vibratory frequency. So I can go down in pitch and up in pitch. Remember, if I spoke in one tone all the time, this would be a very boring lecture. So I want to create a balance between vocal fold tension and mass. This is the only formula you'll see in this entire presentation. The frequency of vibration is proportional to the square root of tension over mass. Tension refers to how much tension I have in the vibratory tissue through action of the cricothyroid muscle primarily and the thyroarytenoid muscle secondarily. The mass refers to the natural mass of my vocal folds, which I can control slightly by tensing them.
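The proportionality above can be tried out numerically. This Python sketch compares two settings by their ratio only, since the lecture gives a proportionality and not a full formula with units. The function names `pitch_ratio` and `semitones` and the sample values are hypothetical, and the conversion to semitones is an added convenience that assumes the usual twelve equal semitones per octave.

```python
import math


def pitch_ratio(tension_ratio, mass_ratio=1.0):
    """Factor by which vibration frequency changes when tension and mass
    are scaled by the given ratios, using f proportional to sqrt(T / m)."""
    if tension_ratio <= 0 or mass_ratio <= 0:
        raise ValueError("ratios must be positive")
    return math.sqrt(tension_ratio / mass_ratio)


def semitones(frequency_ratio):
    """Express a frequency ratio as a musical interval in semitones."""
    return 12.0 * math.log2(frequency_ratio)


if __name__ == "__main__":
    # Assumed example: four times the tension, same vibrating mass.
    ratio = pitch_ratio(4.0)
    print(f"frequency changes by a factor of {ratio:.2f}, "
          f"about {semitones(ratio):.1f} semitones")
```

Quadrupling the tension with the mass unchanged doubles the frequency, which is one octave. Reducing the vibrating mass at the same tension raises the frequency as well.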
If you think of the vocal folds as a rubber band, when we stretch a rubber band, the thickness and the width of that rubber band decrease. Similarly, when I contract my cricothyroid muscle, I stretch my vocal ligament, the transition zone, and the cover. As I stretch it out, that area becomes more tense, but the stretch also reduces the overall mass that's available to participate in vibration. Again, this is the body-cover theory of vocal fold vibration that was worked out in the 1970s by these Japanese researchers. The last thing I can do is alter the shape of my vocal tract. This allows us to produce sound, or words rather, from that neutral vowel sound of vocal fold vibration, the schwa vowel. I can influence the color by boosting the singer's formant, so that I can be heard over background noise, so I subtly change the shape of my throat. I can change the length of this supraglottic vocal tract, the length of this tube, by changing the position of my lips or by using the strap muscles to lower the larynx in the neck or to raise the larynx in the neck. I alter the shape of my vocal tract by using nerves and muscles. Most of these we've talked about previously in this lecture. We haven't really talked about the pharyngeal muscles. Just like the laryngeal muscles, the pharyngeal muscles that control the overall shape of this tube are under control of the 10th cranial nerve. So my brain has the 10th cranial nerve represented inside of it. That 10th nerve is giving signals to the recurrent laryngeal nerve and the superior laryngeal nerve in order to control the length and tension in that vocal fold and whether or not the vocal fold is open or closed. At the same time, a very similar area in my brain is also giving signals to the 10th cranial nerve that controls the shape of my throat, which is altering the color of my sound. I then have the strap muscles in the neck that raise and lower the larynx.
These, combined with the tongue musculature, are all under control of the 12th cranial nerve, which helps me determine the overall length of this tube. Plus, with the tongue, I can determine where the narrowed areas and the widened areas are, and I can use my lips, under control of the 7th cranial nerve. And all of these things together extend the length of this tube and subtly change its shape, so that I can change my formant structure and provide interest in my voice production. I can filter the sound that's produced by my vocal folds. Lastly, I can articulate by using my teeth, tongue, jaw, and lips to stop the sound, for example to create a T sound by holding my tongue against the top of my mouth, just behind my teeth. And I can use all of these stops to chop the air stream and create the consonants that we use for word production. So in summary, in the source-filter theory of sound production, the lungs act as the power source that forces the air out of the trachea. As the air passes through the trachea and then into the glottis or the larynx, the vocal folds are driven into vibration. That vibration changes the air pressure, which is sound. The supraglottic vocal tract above the vocal folds is the tube that changes the sound produced by the vocal fold vibration into words by amplifying certain parts of that sound, the formant frequencies, and dampening others. The sound source for human speech is vocal fold vibration. Because the vocal folds are living tissue, as they vibrate, they break up into an infinite number of segments. The segments that are vibrating as whole-number multiples of the primary or fundamental frequency, what we perceive as pitch, form the harmonic series of the sound source. The formant frequencies are the amplified regions of that harmonic sound source. They create the vowels, and they color the sound. Formants are created by subtly changing the shape of our throat through motion of the tongue, the throat or pharyngeal muscles, the lips, and the teeth.
Speech and singing production requires a person to manipulate the following properties: air pressure through the vocal folds, the subglottic pressure; vocal fold tension for pitch; and the length, shape, and opening of the vocal tract to create the formant structure. Thank you for your attention during this module. Please join us for Module 3. [MUSIC]