Predicting EFL Learners’ Susceptibility to Various Disfluency Types Based on Gender and Age

© Minavandchal Amirmahdi & Salimi Mahmood
The aim of the current study is to investigate, through regression modeling, how gender and age relate to the production of speech disfluencies in EFL learners. Gender and age have been reported to influence the production of disfluencies in both native and nonnative speakers. This is an important issue, since fluency and disfluency are crucial aspects of language learning; however, the influence of age and gender on disfluency remains controversial, with studies often producing conflicting results.
Methods. This study took a new approach to the subject: we produced regression models that can predict the likelihood of production of each disfluency type based on speakers’ age and gender. To this end, 40 Iranian advanced EFL learners (20 male, 20 female) in four age groups (youth 19–24, young adults 25–30, adults 31–44, and older adults 45+) took part in the study. Semi-structured interviews with questions on a variety of topics were conducted, and participants’ responses were recorded and then transcribed. The frequency of occurrence of each disfluency type in participants’ speech samples formed our data, which was then used for our regression analysis.
Results. Our findings indicated that, while filled pauses are the most frequently produced disfluency in both genders and all age groups, female speakers are more likely than male speakers to produce hesitations in their speech. We also found that older adults are less likely than younger speakers to produce filled pauses. With further analyses, we also investigated the likelihood of producing certain disfluency types over others based on age and gender, and how this may help instructors.
Conclusions. Based on our findings, it can be concluded that all six types of disfluencies are produced by Iranian EFL learners. We also found that filled pauses, hesitations, and repetitions, in that order, are by far the most frequently produced disfluency types.


Introduction
English, as the working language of 85.0% of international organizations (Crystal, 1997), has established itself as the de facto lingua franca of modern times. As postulated by Crystal (1997), the English language achieved this status primarily due to the colonial past of Great Britain and the economic power of the US in the 20th century. This high status makes interaction in English crucial for people who intend to interact with others outside their native tongue's milieu in order to pursue academic or professional careers. Speaking is one of the essential language skills for EFL learners on account of its significance for interacting with others around the world. Speaking ability has been shown to be crucial for finding jobs and better work-related opportunities (Namaziandost & Ahmadi, 2019; Nasri & Biria, 2017). The main aspects of speaking skill are accuracy and fluency. Naturally, it is desirable for language learners to become fluent in the target language, and this may entail attaining cognitive fluency in constructing spoken utterances that are perceived as fluent. Nevertheless, for a language learner to reach a degree of fluency that matches that of a native speaker, it might likewise be sensible for the learner to reach the degree of disfluency typical of a native speaker. This could be done by acquiring native-like disfluencies and then being able to repair them as a native speaker would (Tavakoli & Skehan, 2005).
The complex process of speech production involves the closely coordinated interaction of different processes such as utterance planning, formulation, and motor planning. For messages to be conveyed quickly and smoothly, speech needs to be fluent; any breakdown in fluency can be considered a disfluency (Lickley, 1994; Shriberg, 1994; Schnadt, 2009; Miller, 2010; Finlayson, 2014). Disfluency is an important issue not only in language learning but also in native speakers' speech production and in speech pathology, since speech disfluency can be pathological in cases of stuttering and cluttering (Redford, 2015). In this study, however, we focus on disfluency in the normal speech of English language learners.
Typically for language learners, fluency can be interrupted by a number of problems such as difficulty in finding words or formulating grammatically sound utterances, pronunciation and articulatory problems, and intrusion of the speaker's L1 or interlanguage at any level of speech production. These difficulties normally manifest as filled pauses, hesitations, repetitions, insertions, substitutions, and deletions (Redford, 2015).
Different linguistic and environmental factors can influence disfluency; therefore, different approaches to this phenomenon have been taken in the literature. In the current study, we investigated the issue through a psycholinguistic lens.
The abstract notion of fluent speech does not include disfluencies; however, this is not the case in typical speech (Redford, 2015). It would be of value to attempt to decipher the psychological and physiological factors that affect fluency. The effects of stress, anxiety, reward and punishment, and other psychological factors have been examined in various studies (Christenfeld & Creager, 1996; Marshall & Cullinan, 1971; Martin & Hasbrouck, 1977; Martin & Rangaswamy, 1972; Siegel et al., 1969). In addition to these psychological factors, gender and age may also affect the fluency and disfluency of speech. Although a relationship between gender and disfluency is well established, exactly how gender influences the production of disfluencies remains controversial, and studies at times report conflicting results. The same could be said of the relationship between age and disfluency. Given these controversies, more studies on the impact of gender and age on disfluency are in order.
Fluency is often disregarded in EFL classes in Iran, where instruction tends to focus on the accuracy of speech production in terms of grammatical competence and vocabulary learning; as a result, many Iranian EFL learners are not fluent speakers (Ghonsooly & Hoseinpour, 2009; Namaziandost et al., 2018). It would be of great value if instructors could form educated opinions on the production of disfluencies among their students, as they would then be better equipped to remedy such problems.
Regression analysis is a statistical tool that produces a mathematical model, in the form of an equation, that can explain and, more importantly, predict the effects of one or more independent variables on a dependent variable (Pedhazur, 1997). In other words, regression analysis yields a predicted value for the criterion resulting from a linear combination of the predictors (Palmer & O'Connell, 2009). Another benefit of regression modeling is that, unlike other methods, it does not require the effect of each variable to be isolated separately: the analysis itself isolates the effect of each variable, since the other variables are held constant while one variable changes (Frost, 2019).
Numerous studies have been conducted regarding the accuracy of EFL learners in the context of Iran. However, despite the crucial role that fluency plays in oral communication, the vast majority of studies have neglected the fluency and disfluency of speech production. This study aims to investigate the relationship between disfluency and those learner characteristics that are easily observable by instructors, namely age and gender, and, based on them, to predict the kind of disfluency that each group is more likely to produce and its production rate, so that instructors can be better prepared for such issues.

Literature review
According to Bulc, Hadzi, and Horga (2010), speech fluency could be defined as "speech at a natural rate without many hesitations, pauses, repetitions, reformulation, filler words and filled or unfilled pauses" (p. 88).
The typical definition is based on the listeners' perception and refers to the smoothness of flow (Redford, 2015). However, this is not sufficient, since listeners' perception may not always be as reliable as it first appears: there might be minor disturbances that listeners rarely detect. Another concern with restricting the definition of fluency to listeners' perception is that difficulties at the planning or formulation stages are often resolved so quickly that they do not show up in speech, or are fixed at a rate at which listeners miss the disturbance that actually occurred, so a more meticulous investigation is required. For EFL learners, fluency is the ability to produce long utterances with as few pauses as possible (Fillmore et al., 1979: 93) at the same speech rate as native speakers, unhindered by hesitations (Lennon, 1990: 390). However, given the complexity of speech production, disfluency is inevitable even in native speakers, so for a second language learner to achieve near-native proficiency, it is desirable to become familiar with how native speakers deal with disfluency and the typical repairs they employ.
The term disfluency first appeared in Johnson's (1961) list of types of stutter in typical speech; since then, it has been used increasingly in the literature on speech production across a variety of disciplines.
Even though there is only a weak consensus on the definition of disfluency, the literature often defines it as any disturbance, interruption, or irregularity in the flow of speech (Shriberg, 1994; Redford, 2015). Disfluency is a normal part of speech even for native speakers: corpus studies reveal that disfluencies occur at an average rate of 6 per 100 words (Bortfeld et al., 2001; Eklund, 2004; Shriberg, 1994). Some 43.0% of cognitively demanding utterances include some sort of disfluency (Lickley, 2001); similarly, complex and long utterances have been observed to generate more disfluencies (Oviatt, 1995; Shriberg, 1994; Lickley, 2001).

Views on Disfluency
Disfluency can be viewed from two broad perspectives: formal description and functional description. The formal description aims to describe the "patterns of words and syntactic units that disfluencies display", while a functional description makes assumptions about what went wrong in the underlying processes of speech production (Redford, 2015). Formal categorization of disfluencies can be traced back to the 1950s and early 1960s, with Mahl's (1956) categorization of disturbances in pathologically disfluent speech and other studies of hesitation phenomena (Blankenship & Kay, 1964; Maclay & Osgood, 1959). In later decades, with the advent of speech technologies and recorded corpora, the need for reliable labeling patterns and formal annotation schemes grew. According to Redford (2015), careful inspection of the related studies suggests a consensus among researchers on several types of disfluency: filled pauses, hesitations, repetitions, insertions, substitutions, and deletions.
A filled pause is a pause in speech that includes fillers like 'um', 'ah', 'er' or similar sounds (Kormos & Dénes, 2004). Silent pauses occur even in native speakers' fluent speech, so only a longer duration of silence can be considered a hesitation. According to Redford (2015), hesitations normally occur when the flow of speech is momentarily suspended. Hesitations may result from difficulty in accessing lexical items, due either to a lack of familiarity with the words or to contextual considerations. They may also occur when the speaker has rival words to select from, or when other words that share phonological features with the word being articulated are being planned at the same time. Hesitations are typically realized by temporarily stopping speech altogether, prolonging a syllable, producing a filled pause or a filler, repeating parts of speech, or expressing uncertainty about what word to say next. According to Butcher (1981), 75.0% of listeners notice hesitations when they are 220 ms or longer; long pauses between tone groups were also detected more often by listeners, so pauses are recognized as hesitations not only by their duration but also by their position relative to tone groups. Syntactic structure also affects the perception of hesitation; for example, a pause between a determiner and a noun is unequivocally considered a hesitation (Redford, 2015). Since exact durations are hard to measure, silent pauses are normally detected through the subjective perception of the researcher (Nakatani & Hirschberg, 1994). Duez (1993) found that most of what people perceive as pauses are actually prolongations of a syllable. Silent pauses caused by prosodic structure are likely to be followed by the prolongation of a syllable (Cooper & Paccia-Cooper, 1980; Ferreira, 1993). Eklund (2001) also found that this prolongation usually occurs on the final syllable of a word.
Repetition as a hesitation does not involve repeating words for rhetorical purposes (such as emphasis) or other forms of repetition that are part of natural fluent speech; rather, it occurs when a speaker pauses in the middle of an utterance, starts over, and fluently repeats part of what he or she had said. Studies have revealed that repeated words are often function words rather than content words (Lickley, 1994; Maclay & Osgood, 1959; Shriberg, 1994); function words can account for as much as 96.0% of repeated words (Lickley, 1994). Substitution happens when a speaker replaces a part-word, word, or string of words with another word or words. Insertion happens when a speaker repeats his or her words but adds one or more words to them. Deletion happens when a speaker abandons the utterance mid-stream. Table 1 summarizes this classification of disfluency types based on the formal description given by Redford (2015).
Table 1. Disfluency Types

| Disfluency type | Example |
| --- | --- |
| Hesitation | My brother is twenty o-twenty-two years old |
| Filled Pauses | I'm uh um a good person |
| Repetition | Straight up f-from there |
| Substitution | Have you got a-some gorillas on the left |
| Insertion | to the mona-just to the monastery |
| Deletion | Heading back up sort of two thir-have you got allotments? |

Relationship of Age and Gender on Disfluency
The effect of gender on the production of speech disfluencies has been a controversial issue in the literature, and the findings are at times contradictory. Johnson (1961) conducted a study with 100 male and 100 female participants, 50 of each gender being stutterers. The participants completed two different speaking tasks and one reading task. After analyzing the collected data, the researcher concluded that male stutterers produced more revisions than female stutterers. Nonstutterer males also produced more revisions and interjections (extraneous sounds such as 'uh', 'er', and 'hmmm' and extraneous words such as 'well') than nonstutterer females in both speaking tasks. He concluded that, overall, males tend to produce more disfluencies than females, irrespective of whether they are stutterers or nonstutterers. Shriberg (1994), analyzing over 5000 hand-annotated disfluencies from a database of 250,000 words, found that filled pauses were more typical of male speech than of female speech. Similarly, Lickley (1994) conducted informal interviews with 3 male and 3 female participants aged 25 to 45 and found that male speakers produced more disfluencies than female speakers. However, other researchers have asserted that female speakers produce more disfluencies. Menyhárt (2003) conducted a study with 15 male and 15 female speakers in which the participants' spontaneous speech on various topics was sampled and analyzed, and concluded that female speakers produced more disfluencies. Acton (2011) showed, in two corpus-based examinations, that women's average um/uh (filled pause) ratios were higher than those of men. Conversely, other researchers have suggested that gender does not affect disfluency in general (Andrade & Martins, 2011; Shin & Lee, 2017).
Age, as another factor that may influence disfluency, has likewise proven controversial in the literature on speech disfluency (Leeper & Culatta, 1995; Menyhárt, 2003; Yairi & Clifton, 1972). In the aforementioned research, Menyhárt (2003) conducted a series of experiments with 30 Hungarian-speaking persons in three age groups: children (9-12 years), adults (22-45), and elderly people (60-90). The study concluded that all age groups produced disfluencies at the same level, with hesitations constituting the majority of disfluencies, followed by filled pauses and repetitions. Leeper and Culatta (1995) examined the effects of age and gender on speech in three speaking conditions in 78 elderly participants (55-92 years), who were compared with a control group of young speakers (25-35 years). The results indicated that disfluencies increase as people age. These results were similar to those of Yairi and Clifton (1972) and Manning and Monte (1981), who, after examining spontaneous speech samples of 40 nonstutterers and 4 stutterers above the age of 50, concluded that fluency breaks (especially fillers and interjections) increase in older speakers' speech. However, other researchers have conducted similar studies whose results conflict with these findings. Andrade and Martins (2011) analyzed speech samples of 136 fluent speakers of Brazilian Portuguese in the age groups of preschoolers, early adolescents, late adolescents, adults, and elders. They noted an increase in instability between childhood and late adolescence, followed by a period of stabilization during adulthood, a decrease at the ages of 60-70 years, and an increase at the age of 80; nevertheless, they concluded that age does not distinguish speakers' occurrence of disfluencies, as the observed differences were not statistically significant.
These controversies regarding the effects of age and gender on speech disfluency, given the importance of perceived fluency for EFL learners, hint at an increasing need for further studies of these issues from different perspectives.

Regression analysis
Regression analysis is a statistical tool for investigating relationships between variables. It normally produces a mathematical equation or model that enables us, first, to isolate the effect of each independent variable on the dependent variable and, second, to predict how a change in each independent variable would change the dependent variable (Frost, 2019). Usually, the investigator seeks to ascertain the causal effect of one variable upon another. Regression techniques have long been central to the field of economic statistics. For example, Tabasi, Aslani, and Forotan (2016) used regression analysis to predict energy consumption.
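As a minimal sketch of how a regression equation isolates each predictor's effect while the other is held constant, the following Python example fits an ordinary least-squares model with two predictors by solving the normal equations. The toy data, the 0/1 coding of gender, the 1-4 coding of age group, and all function names are assumptions made for illustration; they are not the study's data or its SPSS procedure.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations update both sides together.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def fit_ols(rows, outcome):
    """Return [intercept, b1, b2, ...] from the normal equations X'X b = X'y."""
    design = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * y for d, y in zip(design, outcome)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical, noise-free toy data (NOT from the study):
# gender coded 0/1, age group coded 1-4, one observation per cell.
rows = [(g, a) for g in (0, 1) for a in (1, 2, 3, 4)]
rates = [10.0 + 2.0 * g - 1.5 * a for g, a in rows]

intercept, b_gender, b_age = fit_ols(rows, rates)
```

Because the toy outcome was generated from a known equation without noise, the fitted coefficients recover the generating values; with real data, each coefficient would be an estimate of one predictor's effect with the other predictor held fixed.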
Multinomial logistic regression is a simple extension of binary logistic regression that allows for more than two categories of the dependent or outcome variable and it is normally used to predict categorical placement or the probability of category membership on a dependent variable based on multiple independent variables.
Multinomial logistic regression has been used in many different studies. To name a few, Meng and Miller (1995) modeled sex differences in occupations in China, Spector and Mazzeo (1980) examined the effect of different experimental teaching methods on class performance, and Stevens (1992) analyzed language choice in multilingual societies.
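To illustrate how a multinomial logistic model turns linear predictors into probabilities of category membership, the sketch below assumes the usual reference-category parameterization, in which the reference category has an implicit score of zero. The category labels, the scores, and the function name are hypothetical and are not estimates from this study.

```python
import math


def category_probabilities(linear_scores, reference_label="reference"):
    """Convert per-category linear scores into membership probabilities.

    `linear_scores` maps each non-reference category to its linear
    predictor (intercept plus coefficients times predictor values).
    The reference category is given an implicit score of 0.
    """
    exp_scores = {cat: math.exp(score) for cat, score in linear_scores.items()}
    denominator = 1.0 + sum(exp_scores.values())
    probs = {cat: value / denominator for cat, value in exp_scores.items()}
    probs[reference_label] = 1.0 / denominator
    return probs


# Hypothetical scores for one speaker profile (illustrative values only).
scores = {"insertion": 0.4, "deletion": -1.2}
probs = category_probabilities(scores, reference_label="substitution")
```

In this parameterization, the ratio of a category's probability to that of the reference category equals the exponential of its score, which is why exponentiated coefficients are read as odds ratios relative to the reference category.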
The research questions and hypotheses can be formulated as follows:
RQ 1. Does Iranian English learners' gender predict the production rate of each disfluency type in their speech?
RQ 2. Does Iranian English learners' age predict the production rate of each disfluency type in their speech?
RQ 3. Which disfluency type are Iranian English learners more likely to produce in their speech, based on their gender and age?
The following null hypotheses are formulated in this study:
H0 1. Iranian English learners' gender does not predict the production rate of each disfluency type.
H0 2. Iranian English learners' age does not predict the production rate of each disfluency type.

Participants
The sample of the study consists of 20 Iranian male and 20 Iranian female advanced learners of English in four age categories (youth 19-24, young adults 25-30, adults 31-44, older adults 45+). Sampling was non-random: the participants were chosen from people who volunteered to take part in the research by responding to an ad on social media. The final participants were accepted only if they passed the online Cambridge Assessment English general test at the C1 or C2 proficiency level. Afterward, the researcher contacted them via online video calls to conduct the interviews. Before each interview, the researcher asked for the participant's verbal consent to record his or her voice for transcription and analysis.

Instruments
Online Cambridge Assessment English general test. This placement test, also known as Linguaskill, is a quick online test consisting of 25 questions that determines the English level of individuals and groups of candidates. The link to the test was sent to participants so that they could take it on their own schedule and report back the results.
A C2 proficiency level shows that the learner has mastered English to an exceptional level and can speak with fluency and accuracy. A C1 proficiency level shows that the learner is a confident and flexible language user. Volunteers who achieved C1 or C2 were invited to participate in the study.

Computer software
To conduct the interviews, the researcher called the participants via WhatsApp on a laptop, and their voices were recorded with Adobe Audition software in separate audio files for further analysis. Several computer-assisted tools can help with the process of transcription; Praat is one such tool. The researcher transcribed the data manually while listening to the audio files, with Praat serving as a complementary tool.

Interview questions
The questions for the semi-structured interviews were chosen from the speaking units of Cambridge English Objective Advanced. Some speech acts and topics have been observed to cause more turbulence in the flow of speech than others. Long, cognitively demanding, and grammatically complex utterances are more likely to cause disfluencies (Lickley, 2001). The topics and questions of the interview were chosen with this in mind so that we could obtain a richer dataset. Giving instructions or directions has been observed by Lickley (2001) to cause more disfluencies, and abstract ideas and conceptual figures tend to do so as well (Bortfeld et al., 2001). The questions concerned various topics such as childhood memories, moral judgments, and politics, and speech acts such as giving instructions and directions.

Data Collection Procedure
First, the researcher placed several ads on different social media platforms. The volunteers were asked to take the general English placement test, Linguaskill, and report back the results. The volunteers who were placed at the C1 or C2 level were eligible to participate. After we had found our 20 male and 20 female participants in 4 age groups (equal numbers in each group), each participant took part in a semi-structured interview via WhatsApp video call lasting 8-10 minutes. Ten questions chosen from the speaking units of Cambridge English Objective Advanced were asked of the participants. The questions covered topics that have been shown in the literature to induce disfluency, such as asking for directions and discussing abstract ideas. The answers were recorded with Adobe Audition software for further analysis. The collected audio files were then carefully transcribed; Praat, a popular tool for transcribing speech samples, was used as a complementary tool in this task. The number of disfluencies for each gender and age group was then counted. Afterward, a second rater went through the same procedure. To ensure inter-rater reliability, the Pearson correlation coefficient was employed. The collected dataset was used as the resource for our regression analyses.
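The agreement check between the two raters can be sketched as a Pearson correlation over their disfluency counts. The counts below are invented for illustration and the function name is an assumption; the study's own coefficient is not reported in this section.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical filled-pause counts for five speakers, one list per rater.
rater_a = [12, 7, 9, 15, 4]
rater_b = [11, 8, 9, 14, 5]
agreement = pearson_r(rater_a, rater_b)
```

A coefficient close to 1 would suggest that the two raters rank and scale the speakers' disfluency counts similarly, although a correlation alone does not show that the raters' absolute counts match.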

Data Analysis
In this quantitative study, the collected data was analyzed to test the null hypotheses. The researcher employed SPSS version 26 to run regression analyses in order to model and predict the types of disfluency that speakers are susceptible to and their production rates, with gender and age as independent variables. Multinomial logistic regression makes a few assumptions that our data must meet before we may employ it. The dependent variable should be categorical (nominal), and its categories must be mutually exclusive. There should be no multicollinearity, i.e., two or more independent variables that are highly correlated, since this may confound the results: it would not be possible to distinguish which variable explains the observed changes in the dependent variable. Multinomial logistic regression is often considered an attractive analysis because it does not assume normality, linearity, or homoscedasticity. It also does not necessitate careful consideration of the sample size or examination for outlying cases, although normally there should be at least 20 participants for each independent variable.
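One common way to screen for multicollinearity is the variance inflation factor (VIF). The sketch below is an illustration rather than the procedure used in the study: it assumes gender coded 0/1 and age group coded 1-4 as numeric predictors, and uses the two-predictor special case, where the VIF reduces to 1 / (1 - r^2) with r the correlation between the predictors.

```python
import math


def correlation(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors when a model has exactly two of them."""
    r = correlation(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Balanced design as described in the text: 20 male and 20 female speakers,
# with equal numbers in each of four age groups (numeric coding is assumed).
gender = [0] * 20 + [1] * 20
age_group = [g for _ in range(2) for g in (1, 2, 3, 4) for _ in range(5)]

vif = vif_two_predictors(gender, age_group)
```

In a fully balanced design such as this one, the two predictors are uncorrelated by construction, so the VIF takes its minimum value of 1; values well above 1 would signal that the predictors overlap.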
Data Availability: The data underlying this article are available in Mendeley Data Search (Minavand, 2021).

Adherence to ethical standards
No funds, grants, or other support was received for the conduct of this research. The authors do not have any potential conflict of interest (financial or non-financial) that may influence the decision to publish this article. All participants volunteered to take part in this research and gave verbal consent to the authors; participation was voluntary, and participants were free to withdraw at any time without giving a reason.

Findings
In this section the statistical analyses regarding the relationship of gender and age and speech disfluencies are presented.
As can be seen in Table 2, filled pauses, hesitations, and repetitions are by far the most frequently observed disfluency types in both genders and all age groups. Separate linear regression models were produced for each of these disfluency types in order to see how gender and age predict their production rates.

Statistical relationship between the production rate of filled pauses and gender and age
Table 3 presents the analysis of variance (ANOVA) for our model. The p-value of .003 indicates that the null hypothesis is rejected and that age and gender predict the production rate of filled pauses in speech. Table 4 presents the parameter estimates of our model. According to Table 4, gender is not statistically significant and is therefore not a good predictor of the production rate of filled pauses in speech. Age, however, is statistically significant and is therefore a good predictor of the production rate of filled pauses. A standardized coefficient (beta) of -.436 indicates that a one-unit change in age group (e.g., from youth to young adults) is likely to decrease the production rate of filled pauses by .436 units.

Statistical relationship between the production rate of hesitations and age and gender
Table 5 presents the analysis of variance (ANOVA) for our model. The p-value of .004 indicates that the null hypothesis is rejected and that age and gender predict the production rate of hesitations in speech. Table 6 presents the parameter estimates of our model. According to Table 6, age is not statistically significant and is therefore not a good predictor of the production rate of hesitations in speech. Gender, however, is statistically significant and is therefore a good predictor of the production rate of hesitations. A standardized coefficient (beta) of .490 indicates that female speakers are likely to produce .490 more units of hesitation in their speech than male speakers.

Statistical relationship between the production rate of repetitions, insertions, substitutions and deletions and age and gender
Table 7 presents the analysis of variance (ANOVA) for our models. The p-values of .118 for the repetitions model (model 1), .736 for the insertions model (model 2), .438 for the substitutions model (model 3), and .069 for the deletions model (model 4) indicate that the null hypothesis cannot be rejected and that age and gender do not predict the production rates of these disfluency types in speech.
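The standardized coefficients reported for filled pauses and hesitations can be related to unstandardized slopes by a simple rescaling. As a hedged sketch, a standardized beta is conventionally the unstandardized slope multiplied by the ratio of the predictor's standard deviation to the outcome's standard deviation, so it is expressed in standard-deviation units. The function name and the toy numbers below are assumptions for illustration and are not taken from the study's tables.

```python
import statistics


def standardized_beta(unstandardized_b, predictor_values, outcome_values):
    """Rescale an unstandardized slope into standard-deviation units."""
    sd_x = statistics.stdev(predictor_values)
    sd_y = statistics.stdev(outcome_values)
    return unstandardized_b * sd_x / sd_y


# Hypothetical toy data in which the outcome is exactly twice the predictor,
# so the unstandardized slope is 2 and the relationship is perfectly linear.
age_codes = [1, 2, 3, 4]
toy_rates = [2, 4, 6, 8]
beta = standardized_beta(2.0, age_codes, toy_rates)
```

For a single predictor, the standardized beta equals the Pearson correlation, which is why the perfectly linear toy data yield a value of 1; with several predictors the same rescaling is applied to each slope separately.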

Statistical relationship between the production rates of insertions, substitutions and deletions and age and gender
The prominence of insertions, substitutions, and deletions in the samples is comparable. Therefore, a multinomial logistic model can be produced to investigate the relationship between the production rates of these disfluency types and age and gender.
From Table 8, we can affirm that our model fits the data: the Pearson (8.150) and deviance (9.552) goodness-of-fit statistics are not significant.

The odds of production of disfluency types in speech
From Table 9, we can see that male speakers are 24.3% more likely than female speakers to produce insertions rather than substitutions in their speech, while from Table 10 we can say that they are 582.4% more likely to produce insertions rather than deletions.
With regard to age groups, Table 9 shows that the odds of the youth group producing insertions rather than substitutions are 59.7% lower than those of older adults (with gender held constant). For young adults, the odds are 70.1% lower than those of older adults, and for adults they are 59.7% lower than those of older adults.
Table 10 shows that the odds of the youth group producing insertions rather than deletions are 27.8% higher than those of older adults (with gender held constant). For young adults, the odds are 41.2% lower than those of older adults, while for adults they are 27.8% higher than those of older adults.

The odds of production of deletions in disfluent speech
From Table 11, we can say that male speakers are 81.8% less likely than female speakers to produce deletions rather than substitutions in their speech.
Table 11 also shows that the odds of the youth group producing deletions rather than substitutions are 68.5% lower than those of older adults (with gender held constant). For young adults, the odds are 49.1% lower than those of older adults, and for adults they are 68.5% lower than those of older adults.
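The percentage statements above can be read as odds ratios. The sketch below assumes that each percentage was obtained as (odds ratio - 1) x 100, which is the usual way of reporting exponentiated multinomial logistic coefficients; the function names are illustrative, and the odds ratios are back-calculated from the percentages in the text rather than copied from Tables 9-11.

```python
import math


def percent_change_in_odds(odds_ratio):
    """Express an odds ratio as a percentage change relative to the reference group."""
    return (odds_ratio - 1.0) * 100.0


def odds_ratio_from_percent(percent_change):
    """Recover the odds ratio implied by a reported percentage change in odds."""
    return 1.0 + percent_change / 100.0


def odds_ratio_from_coefficient(b):
    """Exponentiate a logit coefficient to obtain its odds ratio."""
    return math.exp(b)


# Odds ratios implied by percentages reported in the text (males vs. females).
insertion_vs_substitution = odds_ratio_from_percent(24.3)
insertion_vs_deletion = odds_ratio_from_percent(582.4)
deletion_vs_substitution = odds_ratio_from_percent(-81.8)
```

Under this reading, "582.4% more likely" corresponds to odds almost seven times as large, while "81.8% less likely" corresponds to odds under one fifth as large, which may be easier to compare than the raw percentages.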

Discussion and Conclusions
In this study, we attempted to investigate how basic characteristics of Iranian English learners, namely gender and age, affect the production of disfluencies in speech. Based on our findings, it can be concluded that all six types of disfluencies are produced by Iranian EFL learners. We also found that filled pauses, hesitations, and repetitions, in that order, are by far the most frequently produced disfluency types among Iranian EFL learners, whereas the production frequencies of insertions, substitutions, and deletions are comparable. Our findings answered our research questions as follows. As to the first research question, we found that the gender of the speakers does not predict the production rate of filled pauses, repetitions, insertions, substitutions, or deletions. However, gender is a good predictor of the production rate of hesitations in speech. Female speakers are likely to produce more hesitations in their speech than male speakers.
As to the second research question, we found that speakers' age does not predict the production rate of hesitations, repetitions, insertions, substitutions, or deletions. However, it is a good predictor of the production rate of filled pauses. Older speakers are less likely than younger groups to produce filled pauses in their speech.
As to the third research question, we found that both genders and all four age groups are likely to produce filled pauses more than other types of disfluencies, followed by hesitations and then repetitions. However, since the production rates of insertions, substitutions, and deletions are similar, a multinomial logistic regression model was produced. This model could predict the likelihood of production of these disfluency types compared to one another.
Based on our model, we found that male speakers, compared to female speakers, are only slightly more likely to produce insertions rather than substitutions. However, they are far more likely to produce insertions rather than deletions. We also found that male speakers, compared to female speakers, are less likely to produce deletions rather than substitutions (81.8% lower odds, as reported above from Table 11).
In relation to the age groups, we found that the odds of the age group of youth producing insertions rather than substitutions are lower than those of older adults (59.7% lower). The same holds for young adults (70.1% lower) and for adults (59.7% lower). On the other hand, the odds of producing insertions rather than deletions are higher than those of older adults for youth and for adults (27.8% higher in both cases), while for young adults they are lower (41.2% lower). We also found that the odds of youth, young adults, and adults producing deletions rather than substitutions are lower than those of older adults (68.5%, 49.1%, and 68.5% lower, respectively).
Overall, gender only minimally influenced the production rates of disfluencies: female speakers are likely to produce disfluencies at the same rate as male speakers, except for one category (hesitations), in which female speakers produced more disfluencies. This could be due to sociolinguistic factors such as gender roles in society, especially since the interviews were conducted by a male researcher. It is also possible that these results are due to psycholinguistic factors, which should be investigated in further studies. Other studies have also investigated the impact of gender on disfluency. Menyhárt (2003) found female speakers to be more disfluent, while Altıparmak and Kuruoğlu (2018) found no differences between men and women in terms of production of disfluencies. Shriberg (1994), however, found male speakers to be more disfluent.
In our study, age and gender were both found to only minimally impact the production of disfluencies. Age was found to be statistically significant in the production of filled pauses: we found older adults to be less likely to produce filled pauses. Among other studies, Andrade and Martins (2011) found the production of disfluencies to stabilize during the adult years, whereas Menyhárt (2003) found that age does not impact the production of disfluencies in a meaningful way. With regard to our findings, it must be noted that the aforementioned studies investigated the production of speech disfluencies in native speakers, while our study was conducted with non-native speakers. The slight decrease in the production of disfluencies in older adults could be due either to the fact that older learners are more experienced or to sociolinguistic factors, such as social status. Therefore, further studies could be conducted regarding the issue of disfluency in EFL learners through a sociolinguistic lens.
Normally in regression analysis, 20 items need to be incorporated in the dataset for each independent variable. Studies similar to ours with a greater number of participants could therefore produce much stronger and more accurate models. It may also be fruitful to investigate the issue of disfluency with regard to psychological, social, or prosodic factors in order to get a fuller understanding of the nature of disfluency in language learners. The current study might shed some light on the issue of disfluency in language learners and help us understand the underlying factors that may cause these breaks in fluency. As discussed earlier, speech fluency is a crucial aspect of language learning, and studies such as ours could help instructors, learners, and material developers mitigate such problems.