Automated analysis of language production in aphasia and right hemisphere damage: Frequency and collocation strength.

Zimmerer, V.C., Newman, L., Thomson, R., Coleman, M., & Varley, R.A. (2018). Automated analysis of language production in aphasia and right hemisphere damage: Frequency and collocation strength. Aphasiology, 32(11), 1267-1283. DOI: 10.1080/02687038.2018.1497138

People with aphasia rely on more common words, and more strongly collocated word combinations, in spontaneous language production.

In aphasia, the effects that make a word or sentence easier or harder to process become intensified. Words that take milliseconds longer for a healthy speaker may become out of reach after brain damage. Sentences that are a bit more taxing for grammatical systems may become uninterpretable.

With regard to word processing, theories have long been “usage-based”. How common a word is, when it is typically acquired in development, and how abstract its meaning is: all of these factors are understood to have an impact on how it is represented in the brain. In a nutshell, common words that were learned early and have concrete meanings (e.g. dog) are easier than rare words that are learned later in life and have abstract meanings (e.g. levity).

However, suggesting that the same factors also play a role at the level of word combinations is still considered crazy in some circles. The field has gotten used to describing grammar using abstract phrase structure rules that have no place for usage patterns. Our paper makes a good argument that rules are not the entire story. We focus on usage frequency as a variable and assume that combinations of words that commonly co-occur are easier to process: “I don’t know” is easier than “I can’t row”. Some people with aphasia may find it impossible to produce rarer, or entirely novel, combinations.

We looked at samples from Rosemary Varley’s PhD dissertation: interviews with 10 people with fluent aphasia, 10 people with non-fluent aphasia, 10 non-aphasic people with damage to the right hemisphere, and 10 neurotypical controls. We used the Frequency in Language Analysis Tool (FLAT), which is “my baby” in a sense, and about which I will write a separate blog post soon. The FLAT does something very dumb which, astonishingly, no one had done before. It takes every word and word combination (of two or three words) from a written sample, looks up its frequency in a normative corpus, and thereby determines how common the unit is. Based on frequency it computes related values (such as collocation strength). It also does some additional things, like categorizing words as content or function words.
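To make that concrete, here is a minimal sketch of this kind of lookup. It is not the FLAT’s actual code; the corpus counts are invented, and the function names and the frequency-per-million convention are my own illustrative choices.

```python
# Hypothetical sketch of the kind of lookup the FLAT performs; this is not
# the tool's actual code. The corpus counts, tokenisation and names below
# are illustrative assumptions.
from collections import Counter

def ngrams(tokens, n):
    """Return every n-gram (as a tuple) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def per_million(count, corpus_size):
    """Convert a raw corpus count into a frequency per million tokens."""
    return 1e6 * count / corpus_size

def analyse(sample_tokens, corpus_words, corpus_bigrams, corpus_size):
    """For every word and two-word combination in the sample, look up how
    common it is in the normative corpus (unseen units get frequency 0)."""
    word_freqs = [per_million(corpus_words.get(w, 0), corpus_size)
                  for w in sample_tokens]
    bigram_freqs = [per_million(corpus_bigrams.get(b, 0), corpus_size)
                    for b in ngrams(sample_tokens, 2)]
    return word_freqs, bigram_freqs

# Toy normative corpus of one million tokens (counts made up for illustration).
corpus_words = Counter({"i": 50000, "don't": 8000, "know": 12000, "row": 300})
corpus_bigrams = Counter({("i", "don't"): 4000, ("don't", "know"): 3500})

print(analyse(["i", "don't", "know"], corpus_words, corpus_bigrams, 1_000_000))
# -> ([50000.0, 8000.0, 12000.0], [4000.0, 3500.0])
```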

Results from the study using the FLAT. Bars represent results for each group (fA: fluent aphasia; nfA: non-fluent aphasia; RHD: right-hemisphere damage) relative to the control group, e.g. positive values mean that in interviews the group produced more frequent or more strongly collocated forms than controls. We found significant increases for the aphasia groups for both content words and two-word combinations (bigrams; three related variables for measuring frequency/collocation strength), and non-significant decreases for the RHD group.

We found that people with aphasia used more common words and more common word combinations than controls. Importantly, our measures for word combinations take word frequency into account. The effect for word combinations is therefore not simply a reflection of word selection, but comes on top of it. With regard to people with right-hemisphere damage, previous work, mostly by Diane van Lancker-Sidtis and colleagues, would lead one to expect them to produce less common combinations. We found this effect; however, it was smaller and not statistically significant. There was a way to use linear regressions and show the effect to be significant, but I decided not to include it in the manuscript since regressions are a bit misleading given our sample size.
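As a rough illustration of what “taking word frequency into account” means here: this post does not spell out the paper’s three measures, but pointwise mutual information (PMI) is one standard collocation-strength statistic of this kind. The sketch below uses invented counts and is an assumed example, not necessarily one of the measures used in the study.

```python
# Illustrative only: PMI corrects for how common the individual words are,
# so a pair only scores highly if it co-occurs more often than its parts
# alone would predict. Counts below are made up.
import math

def pmi(bigram_count, w1_count, w2_count, corpus_size):
    """PMI = log2( p(w1,w2) / (p(w1) * p(w2)) ). Positive values mean the
    pair co-occurs more often than expected from its parts alone."""
    p_pair = bigram_count / corpus_size
    p_w1 = w1_count / corpus_size
    p_w2 = w2_count / corpus_size
    return math.log2(p_pair / (p_w1 * p_w2))

# Toy counts from a one-million-token corpus.
print(round(pmi(3500, 8000, 12000, 1_000_000), 2))  # "don't know": ~5.19, strongly collocated
print(round(pmi(2, 8000, 300, 1_000_000), 2))       # "don't row": ~-0.26, weakly collocated
```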

Why are more common combinations easier to process, and why are they more resilient to brain damage? We could be looking at a simple practice effect as speakers become used to combining these particular words. However, some common utterances may not involve combining words at all. Instead, they may be stored like words themselves. For more thoughts about this, check out my post on formulaic language.

This is also my first publication that includes the work of project dissertation students. Loveday Newman and Rosalind Thomson each conducted part of the FLAT analysis. Both did an excellent job working with new methods.

Finally, we had two fantastic reviewers who really made me do my homework before they accepted the paper. Kathy Conklin inspired me to dig deeper into the data and provide a clearer picture about what kind of word combinations we encountered. One anonymous reviewer pointed me towards some very important literature about frequency effects. It’s good when the process works.