Native language influence on brass instrument performance

Matthias Heyne, myself, and Jalal Al-Tamimi recently published “Native language influence on brass instrument performance: An application of generalized additive mixed models (GAMMs) to midsagittal ultrasound images of the tongue”. The paper contains the bulk of the results from Matthias’ PhD dissertation. The study is huge, with ultrasound tongue recordings of 10 New Zealand English (NZE) and 10 Tongan trombone players. There are 12,256 individual tongue contours of vowel tokens (7,834 for NZE, 4,422 for Tongan) and 7,428 tongue contours of sustained note production (3,715 for NZE, 3,713 for Tongan).

The results show that native language influences tongue position during trombone note production. The influence extends to both tongue position and note variability. The results also support Dispersion Theory (Liljencrants and Lindblom 1972; Lindblom, 1986; Al-Tamimi and Ferragne, 2005) in that vowel production is more variable in Tongan, which has few vowels, than in NZE, which has many.

The results also show that note production at the back of the tongue maps to low-back vowel production (schwa and ‘lot’ for NZE, /o/ and /u/ for Tongan). These two result sets support an analysis of local optimization with semi-independent tongue regions (Ganesh et al., 2010; Loeb, 2012).

The results do not, however, support the traditional brass pedagogy hypothesis that higher notes are played with a closer (higher) tongue position. That said, Matthias is currently working with MRI data that *does* support the brass pedagogy hypothesis, so we might not have seen the effect because of the ultrasound transducer stabilization system needed to keep the ultrasound probe aligned to the participant’s head.

Liljencrants, Johan, and Björn Lindblom. 1972. “Numerical Simulation of Vowel Quality Systems:
The Role of Perceptual Contrast.” Language, 839–62.

Lindblom, Björn. 1963. Spectrographic study of vowel reduction. The Journal of the Acoustical
Society of America 35(11): 1773–1781.

Al-Tamimi, J., and Ferragne, E. 2005. “Does vowel space size depend on language vowel inventories? Evidence from two Arabic dialects and French,” in Proceedings of the Ninth European Conference on Speech Communication and Technology, Lisbon, 2465–2468.

Ganesh, Gowrishankar, Masahiko Haruno, Mitsuo Kawato, and Etienne Burdet. 2010. “Motor
Memory and Local Minimization of Error and Effort, Not Global Optimization, Determine
Motor Behavior.” Journal of Neurophysiology 104 (1): 382–90.

Loeb, Gerald E. 2012. “Optimal Isn’t Good Enough.” Biological Cybernetics 106 (11–12): 757–65.

Tri-modal speech: Audio-visual-tactile integration in speech perception

Doreen Hansmann, Catherine Theys, and I just published our article on “Tri-modal Speech: Audio-visual-tactile Integration in Speech Perception” in the Journal of the Acoustical Society of America. This paper was also presented as a poster at the American Speech-Language-Hearing Association (ASHA) Annual Convention in Orlando, Florida, November 21-22, 2019, winning a meritorious poster award.

TL;DR: People use auditory, visual, and tactile speech information to accurately identify syllables in noise. Auditory speech information is the most important, then visual information, and lastly aero-tactile information – but we can use them all at once.

Abstract: Speech perception is a multi-sensory experience. Visual information enhances (Sumby and Pollack, 1954) and interferes (McGurk and MacDonald, 1976) with speech perception. Similarly, tactile information, transmitted by puffs of air arriving at the skin and aligned with speech audio, alters (Gick and Derrick, 2009) auditory speech perception in noise. It has also been shown that aero-tactile information influences visual speech perception when an auditory signal is absent (Derrick, Bicevskis, and Gick, 2019a). However, researchers have not yet identified the combined influence of aero-tactile, visual, and auditory information on speech perception. The effects of matching and mismatching visual and tactile speech on two-way forced-choice auditory syllable-in-noise classification tasks were tested. The results showed that both visual and tactile information altered the signal-to-noise threshold for accurate identification of auditory signals. Similar to previous studies, the visual component has a strong influence on auditory syllable-in-noise identification, as evidenced by a 28.04 dB improvement in SNR between matching and mismatching visual stimulus presentations. In comparison, the tactile component had a small influence resulting in a 1.58 dB SNR match-mismatch range. The effects of both the audio and tactile information were shown to be additive.
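To get a feel for the two match-mismatch ranges in the abstract, here is a minimal sketch that converts them from decibels to linear power ratios. It assumes the standard definition of a decibel as ten times the base-10 logarithm of a power ratio; the function name `db_to_power_ratio` is my own and does not come from the paper.

```python
def db_to_power_ratio(db):
    """Convert a level difference in decibels to a linear power ratio."""
    return 10 ** (db / 10)

# Match-mismatch ranges reported in the abstract.
visual_range_db = 28.04
tactile_range_db = 1.58

visual_ratio = db_to_power_ratio(visual_range_db)
tactile_ratio = db_to_power_ratio(tactile_range_db)

print(f"Visual range:  {visual_range_db} dB is a power ratio of about {visual_ratio:.0f}")
print(f"Tactile range: {tactile_range_db} dB is a power ratio of about {tactile_ratio:.2f}")
```

Under that assumption, the visual range corresponds to a several-hundred-fold change in the signal-to-noise power ratio, while the tactile range corresponds to a change of less than one and a half times.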

Derrick, D., Bicevskis, K., and Gick, B. (2019a). “Visual-tactile speech perception and the autism quotient,” Frontiers in Communication – Language Sciences 3(61), 1–11, doi: http://dx.doi.org/10.3389/fcomm.2018.00061

Gick, B., and Derrick, D. (2009). “Aero-tactile integration in speech perception,” Nature 462, 502–504, doi: https://doi.org/10.1038/nature08572.

McGurk, H., and MacDonald, J. (1976). “Hearing lips and seeing voices,” Nature 264, 746–748, doi: https://doi.org/10.1038/264746a0

Calculating an Erdös-Chomsky-Bacon number – 13

Some days it is hard to focus on work – any day where I have to look at large-scale copy-edits is one of them. So I decided to procrastinate by calculating my Erdös-Chomsky-Bacon number (modified), which sums the shortest chains of co-authored publications linking me to Paul Erdös and to Noam Chomsky, plus the shortest chain of filmed acting appearances linking me to Kevin Bacon. That last part is a cheat because a Bacon number is supposed to be movie-only connections, but I’m OK with that because I was paid to do the acting.

My Erdös-Chomsky-Bacon number is 13:

Erdös Number = 4

Donald Derrick -> Daniel Archambault
Derrick, Donald and Archambault, Daniel. TreeForm: Explaining and exploring grammar through syntax trees. Literary and Linguistic Computing, (2010). 25(1):53–66.

Daniel Archambault -> David G. Kirkpatrick
Archambault, Daniel; Evans, William; Kirkpatrick, David Computing the set of all the distant horizons of a terrain. Internat. J. Comput. Geom. Appl. 15 (2005), no. 6, 547–563.

David G. Kirkpatrick -> Pavol Hell
Kirkpatrick, D. G.; Hell, P. On the complexity of general graph factor problems. SIAM J. Comput. 12 (1983), no. 3, 601–609.

Pavol Hell -> Paul Erdős
Erdös, P.; Hell, P.; Winkler, P. Bandwidth versus bandsize. Graph theory in memory of G. A. Dirac (Sandbjerg, 1985), 117–129, Ann. Discrete Math., 41, North-Holland, Amsterdam, 1989.

Chomsky number = 5

Donald Derrick -> Michael I. Proctor
Examining speech production using masked priming.
Chris Davis, Jason A. Shaw, Michael I. Proctor, Donald Derrick, Stacey Sherwood, Jeesun Kim
Proceedings of the 18th International Congress of Phonetic Sciences, 2015

Michael I. Proctor -> Louis Goldstein
Analysis of speech production real-time MRI.
Vikram Ramanarayanan, Sam Tilsen, Michael I. Proctor, Johannes Töger, Louis Goldstein, Krishna S. Nayak, Shrikanth Narayanan
Computer Speech & Language, 2018

Louis Goldstein -> Srikantan S. Nagarajan
A New Model of Speech Motor Control Based on Task Dynamics and State Feedback.
Vikram Ramanarayanan, Benjamin Parrell, Louis Goldstein, Srikantan S. Nagarajan, John F. Houde
Proceedings of the Interspeech 2016, 2016

Srikantan S. Nagarajan -> David Poeppel
Asymptotic SNR of scalar and vector minimum-variance beamformers for neuromagnetic source reconstruction.
Kensuke Sekihara, Srikantan S. Nagarajan, David Poeppel, Alec Marantz
IEEE Trans. Biomed. Engineering, 2004

David Poeppel -> Noam Chomsky
Governing Board Symposium: The Biology of Language in the 21st Century.
Noam Chomsky, David Poeppel, Patricia Churchland, Elissa L. Newport
Proceedings of the 33rd Annual Meeting of the Cognitive Science Society, 2011

Bacon number = 4

Donald Derrick -> Earl Quewezance
“Frontrunners” by European News at the “Play the Game” conference (2005)

Earl Quewezance -> Rob Morrow
“The Mommy’s Curse”, episode 6, Northern Exposure (1995)

Rob Morrow -> Embeth Davidtz
The Emperor’s Club (2002)

Embeth Davidtz -> Kevin Bacon
Murder in the First (1995)
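Each of the three numbers above is a shortest-path length in a collaboration graph, so the calculation can be sketched as a breadth-first search. The sketch below is only an illustration: the graphs contain nothing but the links listed in this post, so the search can only confirm the chain lengths given here, not prove that no shorter chain exists. The function names `build_graph` and `collaboration_distance` are my own.

```python
from collections import deque

def build_graph(chains):
    """Build an undirected graph from chains of linked collaborators."""
    graph = {}
    for chain in chains:
        for a, b in zip(chain, chain[1:]):
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set()).add(a)
    return graph

def collaboration_distance(graph, start, target):
    """Breadth-first search: fewest links from start to target, or None."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, dist = queue.popleft()
        if person == target:
            return dist
        for neighbour in graph.get(person, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None

# Only the links listed in this post (an assumption: the real graphs are far larger).
publications = build_graph([
    ["Donald Derrick", "Daniel Archambault", "David G. Kirkpatrick",
     "Pavol Hell", "Paul Erdős"],
    ["Donald Derrick", "Michael I. Proctor", "Louis Goldstein",
     "Srikantan S. Nagarajan", "David Poeppel", "Noam Chomsky"],
])
acting = build_graph([
    ["Donald Derrick", "Earl Quewezance", "Rob Morrow",
     "Embeth Davidtz", "Kevin Bacon"],
])

erdos = collaboration_distance(publications, "Donald Derrick", "Paul Erdős")
chomsky = collaboration_distance(publications, "Donald Derrick", "Noam Chomsky")
bacon = collaboration_distance(acting, "Donald Derrick", "Kevin Bacon")

print(f"Erdős {erdos} + Chomsky {chomsky} + Bacon {bacon} = {erdos + chomsky + bacon}")
```

Publications and acting credits are kept in separate graphs so that a film link can never shorten a publication chain, or the other way around.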

“Articulatory Phonetics” Resources

Back in 2013, Bryan Gick, Ian Wilson and I published a textbook on “Articulatory Phonetics”. This book contained many assignments at the end of each chapter and in supplementary resources. After many years of using those assignments, Bryan, Ian, and many colleagues figured out that they needed some serious updating.

These updates have been completed, and are available here. The link includes recommended lab tools to use while teaching from this book, as well as links and external resources.

P.S. Don’t get too excited students – I’m not posting the answers here or anywhere else 😉

Aero-tactile integration during speech perception: Effect of response and stimulus characteristics on syllable identification

Jilcy Madappallimattam, Catherine Theys and I recently published an article demonstrating that aero-tactile stimuli do not enhance speech perception during open-choice experiments the way they do during two-way forced-choice experiments.

The abstract (with citation method modified) is as follows:

Integration of auditory and aero-tactile information during speech perception has been documented during two-way closed-choice syllable classification tasks (Gick and Derrick, 2009), but not during an open-choice task using continuous speech perception (Derrick et al., 2016). This study was designed to compare audio-tactile integration during open-choice perception of individual syllables. In addition, this study aimed to compare the effects of place and manner of articulation. Thirty-four untrained participants identified syllables in both auditory-only and audio-tactile conditions in an open-choice paradigm. In addition, forty participants performed a closed-choice perception experiment to allow direct comparison between these two response-type paradigms. Adaptive staircases (Watson, 1983) were used to identify the signal-to-noise ratio for identification accuracy thresholds. The results showed no significant effect of air flow on syllable identification accuracy during the open-choice task, but found a bias towards voiceless identification of labials, and towards voiced identification of velars. Comparison of the open-choice results to those of the closed-choice task shows a significant difference between the two response types, with audio-tactile integration shown in the closed-choice task, but not in the open-choice task. These results suggest that aero-tactile enhancement of speech perception is dependent on response type demands.

Derrick, D., O’Beirne, G. A., De Rybel, T., Hay, J., and Fiasson, R. (2016). “Effects of aero-tactile stimuli on continuous speech perception,” Journal of the Acoustical Society of America, 140(4), 3225.

Gick, B., and Derrick, D. (2009). “Aero-tactile integration in speech perception,” Nature 462, 502–504.

Watson, A. B. (1983). “QUEST: A Bayesian adaptive psychometric method,” Perception & Psychophysics, 33(2), 113–120.

TreeForm for Windows reverted to version 1.03

Apologies to all Windows users, but my revised version of TreeForm does not seem to run on your system because Java developers have effectively ruined internationalization when running on Windows. It will be at least one month before I can even begin to have time to address this issue.

I now have SourceForge automatically send you version 1.03 (as TreeFormWindows.zip) if you are running Windows, and that one should work for most users still. Apple users still benefit from the new and improved version.

Apologies for the inconvenience.

Ultrasound Transducer Stabilizer for Children

Our three-dimensional printable ultrasound transducer stabilizer has been a huge success. It is in use here at the University of Canterbury, as well as the University of Michigan, Hiroshima University, University of California, Los Angeles, and soon at the University of British Columbia. (And it is available at Western Sydney University).

However, Phil Hoole at Ludwig Maximilian University of Munich figured out that the transducer stabilizer does *not* work with children. He developed a solution to that problem, and I am making it available here. Within this zip file, there is a new probe holder. The base and clip-holder should be printed as is. Each remaining file needs to be scaled to 75% of its size and then printed. Each file marked with X2 needs to be printed *twice*.

I will put photos of this version of the probe-holder online once I have printed new copies and sewn all the pieces together sometime in October.

Thoughts on ecological activism

Before I return to my normal posts on Linguistics and Speech research, I have one more thought on my post-ICPhS trip to Cairns. After the dive, I went to the edge of the rain-forest on a half-day 4×4 tour. It was more sitting and less walking than I would normally go for, but the views were pleasant.

The trip showed us the amazing strangler fig, which is essentially an immortal tree that has serious ill-intent toward the trees it grows next to. If you are dumb enough to grow near one of these monsters, within 100 years you are dead, dead, dead!

And the waterfall we went to at the end of the trip was stunning.

But there was one long part where the guide had us standing still for 30 minutes listening to a discussion of local wildlife mixed with the usual guilt-trip about ecological destruction. In one sense, that is fair enough. Humans have an enormous impact on this planet, and plenty of it is negative. But in another sense, I just wanted to crawl out of my skin. Not because I felt guilty for what I’ve done, but because I have absolutely no idea how this approach can help make the world a better place.

I can appreciate that the Australian government is not letting Cairns reuse brown-space for a new boat launch but instead is forcing them to tear down a valuable mangrove. But I can’t do anything about it. I am not Australian, I don’t vote in Australia, and I can’t force the Australian government to save the mangroves. Even though I would LOVE to because I want the Great Barrier Reef to keep growing spectacular fish! There was also a lot about how tourists should support family businesses over large-scale tourism businesses.

But it went too long. We had old people on this trip, and one of them had lost circulation in her legs listening to the over-long presentation. She fell trying to walk back to the vehicle after the talk. She wasn’t badly hurt, but that is the kind of thing that can break a hip, greatly shortening the life of the elderly person in question!

The guide also complained about the large influx of people into Cairns, who then demand a quieter place, which involves cutting down the trees bats live in and otherwise reducing the wonders of nature in the area to make the place more like the big cities they came from. Fair enough, but I heard no solutions. And I thought “stronger insulation and noise-control laws, or education about good construction standards, would end that nonsense.” I thought “there are really effective solutions that we can implement ourselves, so tell everyone about them!” And as a result, I was frustrated because of the missed opportunity.

I compare this approach to that of Reef Encounters. They brought us to a beautiful place full of natural wonders. When we complimented them on their good job, they made it clear it was *nature* that did the good job, and we all benefit from what nature does. When we went diving, the guides always picked up any trash they saw on the ocean floor, and taught us to do the same. When the great food was served and the good times were had, they thanked us for supporting a local family business instead of one of the large-scale tourism businesses.

And there it is. They let nature speak for itself. They embodied solutions. They did a great job and thanked us for supporting local businesses *after* they did that great job. People who experience such things will appreciate nature, know how to engage in good ecological behaviour, and continue to make better choices for local communities.

So here is to all those who embody good ecological behaviour, cleaning up after themselves and others. Here’s to the people who build improved technologies that waste less and are more efficient. Here’s to those who keep track of nature – and trade – exposing it to the light where it can be made as good as possible, a little better every day. And yes, here’s to those who vote to preserve mangroves and re-use brown space for boat-docks.

Diving the Great Barrier Reef

After the International Congress of the Phonetic Sciences in Melbourne, my friend Phil Howson and I went diving in the Great Barrier Reef off the coast of Cairns. The trip was truly amazing. During this time, I did 10 dives, 5 of them to train for advanced open-water conditions – diving to 30 meters (100 ft).

The conditions were absolutely amazing, as you can see from the boat shots from the professional photographer. (These are all Tilly’s shots. I saw similar things, but I have neither the gear nor the eye to take shots like this!)

My friend Phil and I had a lot of fun, above and below water.

And the reef was amazing.

And that was just the coral. I most definitely found Nemo. Often. More often than Tilly photographed them.

And I might have encountered a couple of elder things. Tilly even got a shot with the face-hugger look. For me, the cuttlefish was always closed like photo 1 and 2.

I saw lots of little fish like these.

And crazy schools of fish – some even more impressive than these.

I cannot count the number of times I saw scenes like this, but with much wider views and more variety of fish.

I saw rays quite often.

And I played light with heaps of these little doggies of the sea. If you ever told me I’d ever play light with a shark, I’d have called you barking mad! I clearly have no actual sense! (Sharks tend to like the light as they use it to catch fish, but other fish such as fusiliers are super-keen on using your light and they will surround you like crazy!)

I swam with these turtles, but I did NOT see the one eating the jellyfish. My buddy saw that, as did Tilly, of course, who took the photos.

And I even have some proof of swimming with the turtles.

I also enjoyed the slower creatures. Giant clams!

Unfortunately, I did not see the moray pictured here. Tilly got great shots though!

And I never saw a starfish on the trip either, though we do have shots from Tilly.

But, I did see these guys:

This trip was truly amazing. It really does look like this under the ocean at the Great Barrier Reef, and even more amazing than this. My first night-dive was a kaleidoscopic fever-dream better than my wildest imaginings. I cannot recommend diving enough.

EDIT: I now have a photo of my deep dive to 30 m during dive training. The depths are an eerie place, where cracked eggs stay intact, and red tomatoes look green. They are worth a quick and carefully planned visit. Running out of air is EASY. During my training, my instructors deliberately shared air with me, and I deliberately used the back-up bottle at 5 m depth, as skill practice.

ICPhS 2019

The Nineteenth International Congress of the Phonetic Sciences was held in Melbourne from August 5-9, 2019. It was an amazing success with over 950 delegates, and nearly unlimited opportunities to forge new collaborations and improve the quality of phonetic science research worldwide.

I was especially impressed with the entire science committee who organized over 400 reviewers for the conference, dealt with difficult-to-administer programme software, and kept every talk and poster well coordinated despite the inevitable last-minute changes. Paola Escudero, Sasha Calhoun, and Paul Warren are to be commended!

I also commend Rosey Billington, the social media liaison. Social media was the knife’s edge between success and failure. I’m not good at the stuff to the point of having to effectively leave Facebook of late, but I admire those who can bend social media to their will – especially when their will is goodwill.

I also commend the keynote speakers. My former PhD supervisor Bryan Gick gave an amazing presentation on how bodies talk. I really enjoyed seeing the old research, and seeing the new stuff I haven’t been involved with as much. It was great to see that Connor Meyer is joining in on writing a new book on that same topic – I await it with great anticipation!

Lucie Menard presented on “Production-perception relationships in sensory deprived populations: the case of visual impairment”. Her talk really helped me see how seeing helps with speaking. I cannot recommend reading her papers enough.

And of course the media darling of the event was Jonas Beskow on “On talking heads, social robots and what they can teach us”.

His talk showed us some of the state of the art in human-robot interactive systems, which, while super-interesting, also made it clear to me how much we can still do to improve human/computer interaction. We have only just begun to exploit such opportunities.

Visual Prosody

I also really enjoyed the visual prosody contest – the poster on the left showed a method of highlighting both pitch and intensity at the same time. Visual prosody requires innovative techniques for showing multi-dimensional information in an intuitive way that people can grasp using the built-in abilities of their visual systems. I intend to write a blog post on this topic, highlighting the incredible multidimensionality of some of the greatest visualizations used in data presentation today – weather maps. The best of these present rain, wind, maps, and pressure systems all at the same time, and in a manner nearly anyone can decipher instantly with but a little training.

The conference dinner was also fun, with really good food and a spectrogram contest with participants who were insanely fast. The winners of two of the contests had answers before I could even finish a draft segmentation. I’m not sure who taught them to read spectrograms faster than I read text, but someone did, and I was impressed!

I was glad to be a member of the organizing committee, despite being quite bad at getting corporate sponsors. I contacted over 200 companies, and got 0 sponsors. We had only a couple, mostly publishers, and mostly organized by other committee members. Only one company contacted us on their own. If I were to do it again, I would contact the delegates from the previous congress 4 years before, and ask each three questions: “What research tools do you use that you like? What have you bought in the last year? What is the contact information for the salesperson who sold you those items?” With this information, it becomes possible to build a database of exactly how we as phonetics researchers can benefit companies, with contacts to those who would care the most.