Human Aeroecology

Bryan Gick (as co-first author), Mark Jermy, and I recently published “Human Aeroecology” in Frontiers in Ecology and Evolution. To quote: “Airspace has been recognized as habitat for at least a decade (Diehl, 2013). However, the ecology of airspace has generally been defined with respect to airborne lifeforms such as birds and insects (e.g., Chilson et al., 2017). Humans are as much creatures of the air as lifeforms that walk the ocean floor are creatures of the sea. Yet, little is understood about the full scope of human interaction with the airspace, much of which is normally invisible and intangible. Topics relating to human aeroecology have long remained isolated at the periphery of many disparate fields.”

“Here we identify five broad areas within human aeroecology that researchers have developed over the past years, and which we argue would benefit from focused collaboration. These include but are not limited to: Airscape Design; Air Quality for Comfort, Health, Education and Productivity (Air Quality for CHEaP); Shared Airspaces for Social Connection; Auditory, Aerotactile, Olfactory, and Visual Communication; and Pathogen Transmission, as seen in Figure 1.”

Uniformity in speech: The economy of reuse and adaptation across contexts

Connor Mayer, Bryan Gick, and I recently published “Uniformity in speech: The economy of reuse and adaptation across contexts” in Glossa. This article compares how Kiwis and North Americans produce flap sequences like “editor” in North America, or “added a” in New Zealand. Kiwis produce these similarly during slow and fast speech, whereas North Americans often have two different methods, one for slow speech and one for fast speech. We show that this difference likely stems from the extreme variability built into the “r”s of rhotic dialects of English, which reaches flaps through reuse and adaptation of motor “chunks”.

To illustrate our claim: the image below shows tongue tip frontness for the second vowel in 3-vowel sequences for words like “editor”. For faster speech (6-7 syllables/second), there is a jump where the tongue tip is not nearly as fronted, but only for North American English (NAE) vowels, not for New Zealand English (NZE) vowels or NAE rhotic vowels. Here the high variability intrinsic to NAE rhotic vowels (and commonly seen in other contexts) is visible in adjacent NAE non-rhotic vowels. NZE, however, has no access to rhotic vowels at all, so its non-rhotic vowels do not have a source of such motor control variability, even though such variability would provide mechanical advantage.

The abstract for this article, which explains the findings in more technical but also more accurate terms, can be seen here:

“North American English (NAE) flaps/taps and rhotic vowels have been shown to exhibit extreme variability that can be categorized into subphonemic variants. This variability provides known mechanical benefits in NAE speech production. However, we also know languages reuse gestures for maximum efficiency during speech production; this uniformity of behavior reduces gestural variability. Here we test two conflicting hypotheses: Under a uniformity hypothesis in which extreme variability is inherent to rhotic vowels only, that variability can still transfer to flaps/taps and non-rhotic vowels due to adaptation across similar speech contexts. But because of the underlying reliance on extreme variability from rhotic vowels, this uniformity hypothesis does not predict extreme variability in flaps/taps within non-rhotic English dialects. Under a mechanical hypothesis in which extreme variability is inherent to all segments where it would provide mechanical advantage, including flaps/taps, such variability would appear across all English dialects with flaps/taps, affecting adjacent non-rhotic vowels through coarticulation whenever doing so would provide mechanical advantage. We test these two hypotheses by comparing speech-rate-varying NAE sequences with and without rhotic vowels to sequences from New Zealand English (NZE), which has flaps/taps, but no rhotic vowels at all. We find that NZE speakers all use similar tongue-tip motion patterns for flaps/taps across both slow and fast speech, unlike NAE speakers who sometimes use two different stable patterns, one for slow and another fast speech. Results show extreme variability is not inherent to flaps/taps across English dialects, supporting the uniformity hypothesis.”

Hearing, seeing, and feeling speech: the neurophysiological correlates of trimodal speech perception

Doreen Hansmann, Catherine Theys, and I recently published a partially null-result article on the neurophysiological correlates of trimodal speech in Frontiers in Human Neuroscience: Hearing: Speech and Language. The short form is that while we saw behavioural differences showing integration of audio, visual, and tactile speech in closed-choice experiments, we could not extend that result to show an influence of tactile speech on brain activity – the effect is just too small:

Figure 3. Accuracy data for syllable /pa/ for auditory-only (A), audio-visual (AV), audio-tactile (AT), and audio-visual-tactile (AVT) conditions at each SNR level (–8, –14, –20 dB). Error bars are based on Binomial confidence intervals (95%).
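The 95% binomial confidence intervals mentioned in the caption can be illustrated with a short sketch. The caption does not say which interval method was used, so the Wilson score interval below is an assumption, and the trial counts are made-up numbers rather than data from the study.

```python
import math


def wilson_interval(correct, trials, z=1.959964):
    """Return (low, high) Wilson score bounds for a binomial proportion.

    z=1.959964 corresponds to a two-sided 95% interval.
    The choice of the Wilson method is an assumption, not the article's stated method.
    """
    if trials <= 0 or not 0 <= correct <= trials:
        raise ValueError("need trials > 0 and 0 <= correct <= trials")
    p_hat = correct / trials
    z2 = z * z
    denom = 1 + z2 / trials
    centre = (p_hat + z2 / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / trials + z2 / (4 * trials ** 2)) / denom
    )
    # Clamp to [0, 1] to guard against tiny floating-point overshoot.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not taken from the article).
    for condition, correct, trials in [("A", 62, 100), ("AV", 85, 100)]:
        low, high = wilson_interval(correct, trials)
        print(f"{condition}: {correct / trials:.2f} [{low:.3f}, {high:.3f}]")
```

Intervals like these are what error bars on an accuracy plot would typically span: the bar for each condition runs from the lower to the upper bound around the observed proportion correct.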

The abstract for this article is below:

Introduction: To perceive speech, our brains process information from different sensory modalities. Previous electroencephalography (EEG) research has established that audio-visual information provides an advantage compared to auditory-only information during early auditory processing. In addition, behavioral research showed that auditory speech perception is not only enhanced by visual information but also by tactile information, transmitted by puffs of air arriving at the skin and aligned with speech. The current EEG study aimed to investigate whether the behavioral benefits of bimodal audio-aerotactile and trimodal audio-visual-aerotactile speech presentation are reflected in cortical auditory event-related neurophysiological responses.

Methods: To examine the influence of multimodal information on speech perception, 20 listeners conducted a two-alternative forced-choice syllable identification task at three different signal-to-noise levels.

Results: Behavioral results showed increased syllable identification accuracy when auditory information was complemented with visual information, but did not show the same effect for the addition of tactile information. Similarly, EEG results showed an amplitude suppression for the auditory N1 and P2 event-related potentials for the audio-visual and audio-visual-aerotactile modalities compared to auditory and audio-aerotactile presentations of the syllable /pa/. No statistically significant difference was present between audio-aerotactile and auditory-only modalities.

Discussion: Current findings are consistent with past EEG research showing a visually induced amplitude suppression during early auditory processing. In addition, the significant neurophysiological effect of audio-visual but not audio-aerotactile presentation is in line with the large benefit of visual information but comparatively much smaller effect of aerotactile information on auditory speech perception previously identified in behavioral research.
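The signal-to-noise levels mentioned in the abstract and in the Figure 3 caption (–8, –14, –20 dB) describe how much louder the noise is than the speech. As a rough sketch of what presenting a syllable at a given SNR involves, the code below scales a noise track so that the speech-to-noise ratio hits a target value in dB. This is a generic illustration, not the procedure used in the study; the sine-wave "speech" and Gaussian noise are stand-ins, and all function names are hypothetical.

```python
import math
import random


def rms(samples):
    """Root-mean-square amplitude of a list of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def snr_db(signal, noise):
    """Signal-to-noise ratio in decibels, based on RMS amplitudes."""
    return 20 * math.log10(rms(signal) / rms(noise))


def scale_noise_to_snr(signal, noise, target_snr_db):
    """Return a copy of noise rescaled so snr_db(signal, result) equals the target."""
    gain = rms(signal) / (rms(noise) * 10 ** (target_snr_db / 20))
    return [gain * n for n in noise]


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-in "speech": 0.1 s of a 200 Hz sine at a 16 kHz sampling rate.
    signal = [math.sin(2 * math.pi * 200 * t / 16000) for t in range(1600)]
    noise = [rng.gauss(0, 1) for _ in signal]
    for target in (-8, -14, -20):
        scaled = scale_noise_to_snr(signal, noise, target)
        mixture = [s + n for s, n in zip(signal, scaled)]
        print(target, round(snr_db(signal, scaled), 2), len(mixture))
```

At negative SNRs the noise carries more energy than the speech, which is why extra visual or tactile cues have room to improve identification accuracy.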

Confirming authorship on papers

Recently, I was working on a paper where we all made mistakes regarding authorship. We withdrew the paper in question before publication, and we have all been writing new guidelines for our labs in order to prevent similar mistakes in the future.

Our new guidelines require that active authors on a paper ensure that everyone who has touched any of the data or intellectual contributions on the paper read and respond to the email message below. Responses are then stored on the authors’ computers to document who does and does not wish to be an author.

(We will have a later blog post on authorship order – those policies are currently being rewritten.)

Progress on the paper does not occur until all are in agreement on authorship AND authorship order:

Dear {name}, potential author on {article/project}

A paper on the above topic is currently in preparation. You are receiving this email because you may have had some contact with some aspect of this project.

According to what is sometimes called the Vancouver Convention, there are four key components to justify authorship on a given poster, proceedings paper, journal article, or project:

1) Substantial contributions* to the conception or design of the work, or the acquisition, analysis, or interpretation of data for the work; AND

2) Drafting the work or revising it critically for important intellectual content; AND

3) Final approval of the version to be published; AND

4) Agreement to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

*We consider the term “substantial contributions” here to be equivalent to “substantive intellectual contributions” as described in the Vancouver Convention protocols, as well as to “substantial professional contributions” as described in section 8.12 of the “Ethical Principles of Psychologists and Code of Conduct” of the American Psychological Association: https://www.apa.org/ethics/code.

Details on the Vancouver Convention protocols can be found on the website of the International Committee of Medical Journal Editors: https://www.icmje.org/recommendations/browse/roles-and-responsibilities/defining-the-role-of-authors-and-contributors.html

All those designated as authors should meet all four criteria for authorship, and all who meet the four criteria should be identified as authors.  Those who meet some but not all four criteria should be acknowledged.

In view of the above considerations, we are asking you as an individual for a statement of your contributions relative to the four points above.

Referring to the above points, please answer the following questions regarding your own contributions:

1) Do you consider your contributions to satisfy the requirement of “substantial contributions” as described above? If so, please describe your contributions here:

2) Have you or will you contribute to drafting the work or revising it critically for important intellectual content (Yes or No)? If so, please describe:

3) Have you or will you commit to providing final approval of the version to be published? (Yes or No):

4) Do you agree to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved? (Yes or No):
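For labs that want to keep track of replies to this email, the decision rule in the message (all four criteria means author, some but not all means acknowledgement) can be sketched in a few lines. This is only an illustration under stated assumptions: the class and function names are hypothetical, the "neither" category for someone who meets no criteria is my own addition, and the sketch does not cover agreement on authorship order.

```python
from dataclasses import dataclass

# Opening line of the email above, with its two placeholders.
GREETING = "Dear {name}, potential author on {article/project}"


@dataclass
class Response:
    """One person's answers to the four questions (hypothetical record format)."""
    name: str
    substantial_contribution: bool
    drafting_or_revising: bool
    final_approval: bool
    accountability: bool

    def criteria(self):
        return (
            self.substantial_contribution,
            self.drafting_or_revising,
            self.final_approval,
            self.accountability,
        )


def fill_greeting(name, project):
    """Fill in the placeholders of the greeting line."""
    return GREETING.replace("{name}", name).replace("{article/project}", project)


def classify(response):
    """Apply the rule from the email: all four criteria -> author,
    some but not all -> acknowledge. "neither" is an assumed extra label."""
    met = sum(response.criteria())
    if met == 4:
        return "author"
    if met > 0:
        return "acknowledge"
    return "neither"


if __name__ == "__main__":
    # Made-up respondents for illustration only.
    replies = [
        Response("Respondent A", True, True, True, True),
        Response("Respondent B", True, False, False, False),
    ]
    for reply in replies:
        print(fill_greeting(reply.name, "example project"), "->", classify(reply))
```

Storing each reply as a record like this makes it easy to see at a glance who has answered and which criteria they say they meet, but the stored emails themselves remain the actual documentation.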

Phonological conditioning of affricate variability in Emirati Arabic

Today, Marta Szreder (first author) and I published an article on “Phonological conditioning of affricate variability in Emirati Arabic”. The article studies the [k∼tʃ] and [dʒ∼j] alternations in Emirati Arabic. In the article, we show that coronal obstruents [t,d] and the coronal postalveolar fricative [ʃ] inhibit production of the affricate variant in the [dʒ∼j] alternation, but not in the [k∼tʃ] alternation, as seen in Figure 5 from the paper (below). The results suggest the [k∼tʃ] alternation is a completed phonemic change, while the [dʒ∼j] alternation is an ongoing process.

Figure 5: Interaction graph showing relative affrication in /k/ and /dʒ/ phonemes based on whether there was a /t,d,ʃ/ within one vowel, somewhere further away in the word, or completely absent from the word.

The full abstract is quoted below:

This study investigates the conditioning effects of neighbouring consonants on the realisation of the phonemes /k/ and /dʒ/ in Emirati Arabic (EA), which are optionally realised as [tʃ] and [j], respectively. Based on previous accounts of EA and other Gulf Arabic (GA) dialects, we set out to test the prediction that proximity of other, phonetically similar coronal (COR) obstruents [COR, −son, −cont] and coronal postalveolar fricatives [COR, −ant] inhibit the surface realisation of the affricate variants of these phonemes. We examine elicitation data from twenty young female native speakers of EA, using stimuli with the target segment in the presence of a similar neighbour, as compared to words with the neighbour at a longer distance or with another coronal consonant. The results point to an asymmetry in the behaviour of the voiced and voiceless targets, such that the predicted inhibitory effect is confirmed for the voiced, but not the voiceless target. We argue that this finding, coupled with a consideration of the intra-participant and lexical trends in the data, is compatible with an approach that treats the two processes as being at different stages of development, where the [k∼tʃ] alternation is a completed phonemic change, while the [dʒ∼j] alternation is a synchronic phonological process.

Szreder, Marta & Derrick, Donald (2023) Phonological conditioning of affricate variability in Emirati Arabic, Journal of the International Phonetic Association. 1-19.

Red Wolf: My new video game

Red Wolf

My first commercial video game is now available on the Android Google Play Store, and you can see an advertisement on my YouTube Channel. This game is inspired by the fairy tale of Little Red Riding Hood. It retells the story through the eyes of a farmer named Crimson who is trying to protect his cows, sheep, and chickens.

Will Crimson hear the call to protect his animals?

Will he rush foolishly into battle, ignore the plight of his animals and go back to sleep, or visit the local town of Wolfville?

There are 27 endings to this game, and upon completion, you can replay the game to see each of the memorials to Crimson’s possible lives in the cemetery of the possible.

Exploring how speech air flow may impact the spread of airborne diseases

I am participating in an American Association for the Advancement of Science (AAAS) 2022 meeting panel on “Transmission of Airborne Pathogens through Expiratory Activities” on Friday, February 18th from 6:00 to 6:45 AM Greenwich Mean Time. You can register for the meeting by clicking here. In advance of that meeting, the University of British Columbia asked me some questions in a Q&A exploring how speech air flow may impact the spread of airborne diseases.

The AAAS meeting itself is hosted by Prof. Bryan Gick of the University of British Columbia. It has individual talks by Dr. Sima Asadi on “Respiratory behavior and aerosol particles in airborne pathogen transmission”, Dr. Nicole M. Bouvier on “Talking about respiratory infectious disease transmission”, and myself on “Human airflow while breathing, speaking, and singing with and without masks”.

Dr. Sima Asadi’s talk focuses on the particles emitted during human speech, and the efficacy of masks in controlling their outward emission. For this work, Sima received the Zuhair A. Munir Award for the Best Doctoral Dissertation in Engineering from UC Davis in 2021. She is currently a postdoctoral associate in Chemical Engineering at MIT (Boston).

Dr. (Prof) Nicole M. Bouvier is an associate professor of Medicine and Infectious Diseases and Microbiology at the Icahn School of Medicine at Mount Sinai (New York). Nicole discusses how we understand the routes by which respiratory microorganisms, like viruses and bacteria, transmit between humans, an understanding that is fundamental to how we develop both medical and public health countermeasures to reduce or prevent their spread. However, much of what we think we know is based on evidence that is incomplete at best, and full of confusing terminology, as the current COVID-19 pandemic has made abundantly clear.

I myself am new to airborne transmission research, coming instead from the perspective that visual and aero-tactile speech help with speech perception, and so masks would naturally interfere with clear communication. They would do this by potentially muffling some speech sounds, but mostly by cutting off the perceiver from visual and even tactile speech signals.

However, since my natural interests involve speech air flow, I was ideally suited to move into research studying how these same air flows may be reduced or eliminated by face masks. I conduct this research with a Mechanical Engineering team at the University of Canterbury, and some of their results are featured in my individual presentation. Our most recent publication on Speech air flow with and without face masks was highlighted in previous posts on Maps of Speech, and in a YouTube video found here.

Speech air flow with and without face masks

It took a while due to the absolutely shocking amount of work required for the “Gait change in tongue movement” article, but Natalia Kabaliuk, Luke Longworth, Peiman Pishyar‑Dehkordi, Mark Jermy and I were able to get our article on “Speech air flow with and without face masks” accepted to Scientific Reports (Nature Publishing Group). The article is now out (though a pre-review version had been available since we submitted this article to Sci Rep). You can also watch my YouTube video describing many of the results.

Here is an example of low-stiffness air flow from a porous mask, which allows air to leak from the top, bottom, and sides while preventing forward flow, as taken from Figure 5 of the article.

Figure 5. Audio and Schlieren of speech through a porous face mask (Frame 621, 1st block, CORI Supermask). Image from 88 ms after the release burst for the [kʰ] in “loch”. Note that the k’s puff is smoother and less well defined than the one in Fig. 2, but still has eddies that change air-density across the span of the puff. The red-dashed line in the audio waveform indicates the timing of the schlieren frame.

And here is an example of typical higher-stiffness flow from a less porous mask from Figure 8.

Figure 8. Audio and Schlieren of speech with a tightly fitting surgical mask (Frame 334, 1st block, Henry Schlein surgical mask [level 2]). Air slowly flows out above the eyes, floating out and upward continuously. The red-dashed line in the audio waveform indicates the timing of the schlieren frame.

Masks can be made to fit tighter, as in well-designed KN95/N95 masks and masks with metal strips at the nose to prevent upward-escaping air flow. However, for all the masks we studied, the tradeoff was not entirely avoided. And with that, here is our abstract:

Face masks slow exhaled air flow and sequester exhaled particles. There are many types of face masks on the market today, each having widely varying fits, filtering, and air redirection characteristics. While particle filtration and flow resistance from masks has been well studied, their effects on speech air flow has not. We built a schlieren system and recorded speech air flow with 14 different face masks, comparing it to mask-less speech. All of the face masks reduced air flow from speech, but some allowed air flow features to reach further than 40 cm from a speaker’s lips and nose within a few seconds, and all the face masks allowed some air to escape above the nose. Evidence from available literature shows that distancing and ventilation in higher-risk indoor environment provide more benefit than wearing a face mask. Our own research shows all the masks we tested provide some additional benefit of restricting air flow from a speaker. However, well-fitted mask specifically designed for the purpose of preventing the spread of disease reduce air flow the most. Future research will study the effects of face masks on speech communication in order to facilitate cost/benefit analysis of mask usage in various environments.

Gait Change in Tongue Movement

Bryan Gick and I recently published an article on “Gait Change in Tongue Movement” in Scientific Reports (Nature Publishing Group). Below is the abstract, with images alongside. However, if you want an easy-to-follow walkthrough of the paper, I also published a YouTube video on the paper on my YouTube Channel for Maps of Speech.

During locomotion, humans switch gaits from walking to running, and horses from walking to trotting to cantering to galloping, as they increase their movement rate. It is unknown whether gait change leading to a wider movement rate range is limited to locomotive-type behaviours, or instead is a general property of any rate-varying motor system. The tongue during speech provides a motor system that can address this gap. In controlled speech experiments, using phrases containing complex tongue-movement sequences, we demonstrate distinct gaits in tongue movement at different speech rates. As speakers widen their tongue-front displacement range, they gain access to wider speech-rate ranges.

At the widest displacement ranges, speakers also produce categorically different patterns for their slowest and fastest speech. Speakers with the narrowest tongue-front displacement ranges show one stable speech-gait pattern, and speakers with widest ranges show two. Critical fluctuation analysis of tongue motion over the time-course of speech revealed these speakers used greater effort at the beginning of phrases—such end-state-comfort effects indicate speech planning.

Based on these findings, we expect that categorical motion solutions may emerge in any motor system, providing that system with access to wider movement-rate ranges.

Colorized Schlieren – with and without face masks

This is my first professional (as opposed to personal) YouTube video, on my work channel for Maps of Speech. Today I’m making my debut with a scientific report on colorized schlieren images of speech air flow with and without masks. Please share widely, and encourage others to share widely. This is intended to be a worldwide release, and has been approved by the New Zealand Ministry of Business, Innovation, and Employment’s communications team, as well as our COVID-research team. In general, videos on this channel are more formal than my HoT videos, and will often be made in collaboration with whatever research team is working on the related projects. The details below this video link to three other unlisted videos that show the entirety of the schlieren videos referenced, without any audio commentary.

Frame of Colorized Schlieren data from a speaker saying “The beige hues on the waters of the loch impressed all”, no face mask.