The Acquisition of Prosody: Evidence from French- and English-learning Infants*

Haskins Laboratories Status Report on Speech Research 1993, SR-113, 41-50
Author: Daniella Turner

Andrea G. Levitt†

The reduplicative babbling of five French- and five English-learning infants, recorded when the infants were between the ages of 7;3 months and 11;1 months on average, was examined for evidence of language-specific prosodic patterns. Certain fundamental frequency and syllable-timing patterns in the infants' utterances clearly reflected the influence of the ambient language. The evidence for language-specific influence on syllable amplitudes was less clear. The results are discussed in terms of a possible order of acquisition for the prosodic features of fundamental frequency, timing, and amplitude.

1. INTRODUCTION

Prosody is generally described in terms of three main suprasegmental features that vary in language-specific ways: the fundamental frequency contours, which give a language its characteristic melody; the duration or timing measures, which give a language its characteristic rhythm; and the amplitude patterns, which give a language its characteristic patterns of loud versus soft syllables. When does the prosody of infants' utterances begin to show language-specific effects? To answer this question it is important first to understand the linguistic environment of the child, which is characterized by a special sociolinguistic register called child-directed speech (CDS). CDS has marked grammatical as well as prosodic characteristics, for which a number of possible uses have been suggested. It is also important to understand what is known about infants' sensitivity to the three prosodic features of speech. Since English and French provide very different prosodic models for young infants, they are excellent choices for investigating the issue of language-specific prosodic influences on infants' utterances. Analyzing the reduplicative babbling of two groups of infants, one learning French and the other English, Doug Whalen, Qi Wang and I have found evidence for the early acquisition of certain language-specific prosodic features. These results can be discussed in terms of a possible order of acquisition for language-specific prosodic features and in terms of evidence for possible regression in children's apparent sensitivity to prosodic information.

2. CHILD-DIRECTED SPEECH (CDS)

In the last twenty-five years or so, researchers have documented the existence of child-directed speech (CDS), also known as "motherese," a special style of speech or linguistic register used with young first language learners (e.g., Ferguson et al., 1986). Most researchers consider CDS to be universal (e.g., Fernald et al., 1989; but cf. Bernstein Ratner & Pye, 1984; Heath, 1983). Compared to speech between adults, or adult-directed speech (ADS), CDS shows both special grammatical and prosodic features. From a grammatical perspective, CDS consists of shorter, simpler, and more concrete sentences, and uses more repetitions, questions, and imperatives and more emphatic stress. From a prosodic perspective, CDS includes high pitch, slow rate, exaggerated pitch contours, long pauses, increased final-syllable lengthening, and whispering. Some researchers have attributed adults' production of the higher pitch and more variable fundamental frequency of CDS to the preference of very young children for higher pitched sounds

This work was supported by NIH grant OC00403 to Catherine Best and Haskins Laboratories. We thank the families of our French and American infants for their participation in this research.


(Sachs, 1977), whereas others have focused on these prosodic characteristics as contributing to the expression of affection (Brown, 1977) or as attracting the child's attention (Garnica, 1977). More recently, however, some investigators have argued for a more linguistically significant role for the prosodic characteristics of CDS. Thus, researchers have variously suggested that the prosodic patterns of CDS may help infants in learning how to identify their native language (Mehler et al., 1988); to identify important linguistic information, such as names for unfamiliar objects (Fernald & Mazzie, 1983); and even to parse the syntactic structures of their native language (Hirsh-Pasek et al., 1987). Some of our own current work suggests that the prosodic features of CDS may also serve to enhance speaker-specific properties of the speech signal.

As it turns out, not all of the linguistic features attributed to CDS are present at once. Indeed, certain features are quite unlikely to co-occur. Other sociolinguistic registers remain relatively stable over time, but CDS does not. In fact, it is characterized by notable systematic changes that appear linked to the developmental stage of the child spoken to (Bernstein Ratner, 1984, 1986; Malsheen, 1980; Stern, Spieker, Barnett, & MacKain, 1983). As do the other features of CDS, the prosodic aspects also appear to change over time. For example, pitch height and the use of whispering are reduced as children grow older (Garnica, 1977). There may even be changes in the types of fundamental frequency contours that a child hears over time. A recent study (Papoušek & Hwang, 1991) has shown that Mandarin CDS prosody, as produced for presyllabic infants, may even distort the fundamental lexical tones, which are each marked by specific fundamental frequency contours in the adult language. But Chinese children do go on to learn the appropriate tones, and indeed our preliminary analyses of Mandarin CDS, produced to an infant between 9 and 11 months of age, suggest that for the older infant there is considerably less distortion.

Even if very early CDS has more universal than language-specific prosodic patterns (Papoušek, Papoušek, & Symmes, 1991), the CDS addressed to older infants, as well as all other forms of speech which young infants are likely to hear, provides ample exposure to language-specific prosodic patterns as well. What is known about young infants' sensitivity to the prosodic patterns of language?

3. INFANT RESPONSE TO PROSODY

Bull and his colleagues (Bull, Eilers, & Oller, 1984, 1985; Eilers, Bull, Oller, & Lewis, 1984) have shown that infants in the second half year of life can detect changes in each of the three prosodic parameters under discussion. Researchers have found that infants' response to fundamental frequency variation or intonation is particularly strong. Indeed, infants' strong response to CDS (Fernald, 1985) can be interpreted as a preference on their part for its special fundamental frequency contours (Fernald & Kuhl, 1987). In terms of early pitch production, Kessen, Levine, and Wendrick (1979) found that infants between 3 and 6 months of age were able to match with their voices the pitches of certain notes, and there have also been reports of young infants being able to match the fundamental frequency contours of spoken utterances (Lieberman, 1986). Once children have begun to speak, they can make communicative use of pitch, e.g., contrast a request with a label, even at the one-word stage (Galligan, 1987; Marcos, 1987).

Other research has demonstrated that infants show an early perceptual sensitivity to some specific rhythmic properties of language. For example, it has been shown that very young infants can discriminate two bisyllabic utterances when they differ in syllable stress (Jusczyk & Thompson, 1978; Spring & Dale, 1977). Infants could perform this task both when the syllable stress was cued by all three typical prosodic markers and when the stress was cued by duration alone (Spring & Dale, 1977). Furthermore, Fowler, Smith, and Tassinary (1985) found evidence that the basis for infants' perception of speech timing is stress beats, just as it is for adults.

Relatively little attention has been paid to infants' sensitivity to amplitude or loudness differences, independently of their role in stress, except for the work of Bull and his colleagues (1984), mentioned above.

4. EARLY LANGUAGE-SPECIFIC PROSODIC INFLUENCES ON PRODUCTION

Although not all attempts to find support for early language-specific effects on infant utterances have been successful (see Locke, 1983 for a review), Boysson-Bardies and her colleagues were able to find such evidence in their crosslinguistic investigations of infant utterances. For example, using acoustic analysis, they found that


the vowel formants of 10-month-old infants varied in ways consistent with the formant-frequency patterns in the adult languages (Boysson-Bardies, Halle, Sagart, & Durand, 1989). Some of our own research (Levitt & Utman, 1991), along with results from another study by Boysson-Bardies and her colleagues (Boysson-Bardies, Sagart, & Durand, 1984), suggested that young infants from different linguistic communities might also show early language-specific prosodic differences, which we decided to explore by comparing the utterances of French-learning and English-learning infants.

4.1 Prosodic differences in English and French

French and English differ in a number of ways on each of the three prosodic features. In terms of fundamental frequency contours, there are several differences, including contour shape (Delattre, 1965) and incidence of rising contours (Delattre, 1961). It is the difference in the incidence of rising contours that we investigated. Delattre (1961), who analyzed the speech of Simone de Beauvoir and Margaret Mead, found that the French speaker had many more rising continuation contours (93%) than her American counterpart (11%).

In our investigation of timing differences between French and English we focused on the syllable level, where we found at least three clear differences: the salience of final-syllable lengthening, the timing of nonfinal syllables, and the interval between stressed syllables or, in other terms, the typical length of the prosodic word. In Figure 1, we can see the first two rhythmic properties illustrated for French and English. The graphs (taken from Levitt [1991]) show syllable timing measures based on reiterant productions by adult native speakers of French and English of words of two to five syllables in the two languages. The native speakers replaced the individual syllables of each word with the syllable /ma/, while preserving natural intonation and rhythm. To provide these data, ten native speakers of English and ten native speakers of French were asked to produce reiterant versions of a series of 30 words in their own language. The top graph represents timing measures for French words of two to five syllables, the middle graph represents English words of two and three syllables with all possible stress patterns, and the bottom graph represents English words of four and five syllables with a selection of stress patterns.

The first property, final-syllable lengthening, is a more salient feature of French than of English. Although both French and English exhibit final-syllable lengthening (breath-group final lengthening in French), final-syllable lengthening is more salient in French because French nonfinal syllables are not typically lengthened due to word stress, as are nonfinal syllables in English.

[Figure 1 panels: "Two- to five-syllable words in French" (ws, wws, wwws, wwwws); "Two- and three-syllable words in English" (sw, ws, sww, wsw, ww); "Four- and five-syllable words in English" (swww, wsww, wwsw, wwsww, wwwsw). Vertical axis ticks run from 0 to 300.]

Figure 1. Syllable timing measures for words of two to five syllables in French (top panel) and English (bottom two panels).

The second property, nonfinal syllable timing, is also clearly different for the two languages. French has been classified as syllable-timed, with a rhythmic structure known as "isosyllabicity," which is characterized by syllables generally equal in length. However, this description ignores the obvious, important final-syllable lengthening we see in French. On the other hand, aside from the effects of emphatic stress and inherent segmental


length differences, nonfinal syllables in French generally are equal in length. In Figure 1 in the top panel, the nonfinal syllables in French words of three to five syllables are quite equal in length, whereas English nonfinal syllables (in the bottom two panels) are not because of variable word stress.

Finally, the third rhythmic property that we investigated, the length of a prosodic word, here defined as the number of syllables from one stressed syllable to the next, may be expected to differ in English and French, again because of differences in the stress patterns in the two languages. Information about the typical length of the prosodic word in French and English comes from studies by Fletcher (1991) and by Crystal and House (1990). Fletcher analyzed the conversational speech of six native speakers of French. Reanalyzing a portion of her data, we found that 56% of the speakers' polysyllabic "prosodic words," which included all unaccented syllables preceding an accented final syllable, were 4 or more syllables in length, on average. On the other hand, when we examined similar data from Crystal and House, who had analyzed the read speech of six English subjects, we found that prosodic words of 4 or more syllables accounted for only six percent of the total, on average. Thus, there is some evidence that interstress intervals or prosodic words tend to be longer in French.

How do the amplitude patterns of the two languages differ? Figure 2 shows the waveforms of the French word "population" with its reiterant version, spoken by a male native speaker of French on top, and the waveforms of the English word "population" with its reiterant version, spoken by a male native speaker of English on the bottom. The patterns in Figure 2 are very representative. Basically, French words tend to start high in amplitude and generally decline, so that final syllables, which are systematically longer than nonfinal syllables, tend to be lowest in amplitude or loudness. The French reiterant version of "population," on the right, which avoids loudness variations due to inherent amplitude differences in different segments, as on the left, looks rather like a Christmas tree on its side. On the other hand, as can be seen from the waveforms for the English words, nonfinal stressed syllables in English tend to have greater amplitude than surrounding syllables (as well as greater duration), although there is also a tendency for the last syllable in an English word to have lower amplitude if it is not stressed.
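The prosodic-word measure described above (the share of polysyllabic prosodic words that are 4 or more syllables long) can be illustrated with a minimal sketch. It assumes that each stretch of speech has already been coded as a string of "w" (unaccented) and "s" (accented) syllables, in the notation of Figure 1, and that a prosodic word ends at each accented syllable. The function names and the treatment of trailing unaccented syllables are hypothetical choices made for illustration, not the procedure used in the studies cited.

```python
def prosodic_words(stress_pattern):
    """Split a string of 'w'/'s' syllable codes into prosodic words.

    Assumption: each prosodic word consists of the unaccented syllables
    preceding an accented syllable, plus that accented syllable.
    """
    words, current = [], ""
    for syllable in stress_pattern:
        current += syllable
        if syllable == "s":
            words.append(current)
            current = ""
    # Trailing unaccented syllables with no following accent are dropped.
    return words


def share_of_long_words(stress_pattern, min_syllables=4):
    """Proportion of polysyllabic prosodic words with >= min_syllables."""
    polysyllabic = [w for w in prosodic_words(stress_pattern) if len(w) >= 2]
    if not polysyllabic:
        return 0.0
    long_words = [w for w in polysyllabic if len(w) >= min_syllables]
    return len(long_words) / len(polysyllabic)
```

Applied to one coded sample per speaker and then averaged over speakers, a function of this kind would yield percentages comparable in form to the 56% and six percent figures reported above.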

What sorts of evidence for language-specific prosodic structure might we find in the vocal productions of young infants themselves?


Figure 2. Waveforms (showing characteristic amplitude patterns) of French population and its reiterant version (top panel) and English population and its reiterant version (bottom panel).

4.2 Reduplicative babbling studies

In order to investigate whether prosodic differences in fundamental frequency contour, rhythm, or amplitude emerged in the vocal productions of French and American infants between the ages of 5 and 12 months, the babbling of five English-learning infants (three male and two female) and five French-learning infants (four male and one female) was recorded weekly by their parents at home. The French-learning infants were recorded in Paris and the English-learning infants were recorded in cities in the northeastern United States. Recordings began when the infants were between 4 and 6 months old and continued until they were between 9 and 17 months old. Each tape was phonetically transcribed, and all infant speechlike vocalizations were digitized for computer analysis. The vocalizations were divided into utterances, or breath groups, which were defined as sequences of syllables that were separated from adjacent utterances by at least 700 ms of silence and which contained no internal silent periods longer than 450 ms. From the transcribed and digitized utterances, we selected all the reduplicative babbles, that is, those which contained the same consonant-like element as well


as the same vowel-like element, repeated in an utterance of two or more syllables, according to our transcriptions. Using these criteria, we obtained 208 reduplicative utterances, approximately half (102) from the English-learning infants and half (106) from the French-learning infants. (See Table 1, taken from Levitt & Wang [1991].)

Table 1. Description of the source of the 208 stimuli.

Infant    Ages (in months) at    Ages (in months) at which    Number of
          which recordings       reduplicative babbles        reduplicative
          were made              were detected                babbles

French Infants
MB        5-11                   7-11                         24
EC        6-12                   6-12                         42
MS        5-16                   7-12                         23
IZ        4-9                    5-7                           9
NB        4-14                   8-13                          8

American Infants
MA        5-16                   8-12                         24
MM        5-17                   7-12                         35
CR        5-17                   9-10                          7
AB        5-17                   8-11                         18
VB        4-15                   7-12                         18
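The utterance and selection criteria described above can be made concrete with a minimal sketch. It assumes that each syllable is available as an (onset, offset) pair in milliseconds and that each transcribed syllable is a (consonant, vowel) pair; both representations, the function names, and the decision to discard any group containing an internal silence longer than 450 ms are assumptions made for illustration, not the original analysis procedure.

```python
def segment_utterances(syllables, min_gap=700, max_internal=450):
    """Group (onset_ms, offset_ms) syllables into utterances (breath groups).

    A new group starts whenever the silence before a syllable is at least
    min_gap ms. Groups that contain an internal silence longer than
    max_internal ms do not meet the definition and are discarded
    (an assumption about how such cases were handled).
    """
    if not syllables:
        return []
    groups, current = [], [syllables[0]]
    for prev, nxt in zip(syllables, syllables[1:]):
        gap = nxt[0] - prev[1]
        if gap >= min_gap:
            groups.append(current)
            current = [nxt]
        else:
            current.append(nxt)
    groups.append(current)

    def internal_gaps_ok(group):
        return all(b[0] - a[1] <= max_internal
                   for a, b in zip(group, group[1:]))

    return [g for g in groups if internal_gaps_ok(g)]


def is_reduplicative(transcription):
    """True if two or more syllables all share one consonant and one vowel.

    transcription: list of (consonant, vowel) pairs, one per syllable.
    """
    return len(transcription) >= 2 and len(set(transcription)) == 1
```

For example, syllables at 0-200, 300-500, 1500-1700, 1800-2000, and 2500-2700 ms form two groups split by the 1000 ms silence; the second group contains a 500 ms internal silence and is therefore dropped under the assumption stated above.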

4.2.1. Fundamental Frequency Contours. For our analysis of the fundamental frequency contours of the reduplicative babbles of the French and American infants, we decided to obtain contour judgments for the reduplicative babbles and to analyze them acoustically as well (Whalen, Levitt, & Wang, 1991). First, we asked a group of experienced listeners to judge whether each infant babble had a falling, a rising, a fall/rise, a rise/fall, or a flat contour. In order to make the perceptual judgments feasible, we limited our data set to those reduplicative babbles that were two or three syllables in length. We found both acoustic and perceptual evidence for language-specific effects in the F0 contours of the reduplicative babbles of the French- and English-learning infants. Although about 65% of the perceptual judgments made for both the French and the American reduplicative babbles were either rise or fall, these two categories were about equally divided in the judgments of the French babbles, whereas about 75% were labelled fall for the American subjects. Thus, in agreement with the higher incidence of


rising intonations in adult French speech (Delattre, 1961), the reduplicative babbles of our French infants showed a significantly higher incidence of rising F0 contours by comparison to those of our American infants. The results of our acoustic analysis of the reduplicative babbles also supported our perceptual finding. All of the reduplicative babbles were categorized according to the contour opinion of the majority of the listeners and then acoustically analyzed. The contour patterns were then averaged for each language. The mean patterns for each of the contour types revealed an appropriate fundamental frequency curve, and statistical tests of the fundamental frequency values also support the finding that French infants produced more rising contours.

4.2.2. Timing Measures. What about timing measure differences in the infants' reduplicative babbling? We investigated this aspect of the French-learning and English-learning infants' production in another recent study (Levitt & Wang, 1991). Recall that final-syllable lengthening is more salient in French, which also has more regularly timed nonfinal syllables and longer prosodic words. Using the entire corpus of 208 utterances, we measured each syllable. A conservative criterion for measuring syllable length was adopted: the duration as measured included only the visibly voiced portion of each syllable. In order to test for final-syllable lengthening, we compared the length of the final syllables with the penultimate syllables in each utterance. To test for regularity in the timing of nonfinal syllables, we calculated the mean standard deviation of the nonfinal syllables in each utterance of three or more syllables, and to test for length of prosodic word, we looked at the number of syllables per utterance per child. In Figure 3, the three graphs represent the results of our investigation of the timing properties. The top graph shows that the French infants had a significantly greater proportion of long final syllables (54%) than did the English-learning American infants (29%). In terms of the regularity of nonfinal syllables as shown in the middle graph, French infants produced more regular nonfinal syllables overall, although that difference was not significant. However, when we analyzed the nonfinal syllable timing measurements in terms of an early and a late stage of reduplicative babbling production for each of the infants, we found a significant interaction, in that the nonfinal syllables of the French infants tended to become


more regular whereas the nonfinal syllables of the English-learning American infants tended to become more irregular. Finally, we also found a significant difference in the length of prosodic words, with the French infants producing considerably more long utterances (of 4 syllables or more) than the Americans, as shown in the bottom graph of Figure 3.
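The three timing measures described in this section can be sketched as follows. The sketch assumes each utterance is a list of syllable durations in milliseconds (voiced portion only), treats a final syllable as "long" when it exceeds the penultimate syllable, and uses the sample standard deviation; these operational details and all names are assumptions for illustration, since the text does not specify them.

```python
from statistics import stdev


def has_long_final(durations):
    """Assumed criterion: the final syllable is longer than the penultimate."""
    return len(durations) >= 2 and durations[-1] > durations[-2]


def nonfinal_sd(durations):
    """Sample SD of nonfinal syllable durations; None if under 3 syllables."""
    if len(durations) < 3:
        return None
    return stdev(durations[:-1])


def timing_summary(utterances, long_utterance=4):
    """Summarize one child's utterances (each a list of durations in ms)."""
    sds = [sd for sd in map(nonfinal_sd, utterances) if sd is not None]
    return {
        # Proportion of utterances with a long final syllable.
        "long_final": sum(map(has_long_final, utterances)) / len(utterances),
        # Mean SD of nonfinal syllables (utterances of 3+ syllables only).
        "mean_nonfinal_sd": sum(sds) / len(sds) if sds else None,
        # Proportion of utterances with long_utterance or more syllables.
        "long_utterances": sum(len(u) >= long_utterance
                               for u in utterances) / len(utterances),
    }
```

A lower mean nonfinal SD corresponds to more regular nonfinal timing, so under these assumptions the French pattern would appear as a higher "long_final" value, a lower "mean_nonfinal_sd", and a higher "long_utterances" value.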


Figure 3. Comparison of French and English infants' syllable timing patterns for final-syllable lengthening (top panel), regularity of nonfinal syllables (middle panel), and number of syllables per utterance (bottom panel).

4.2.3. Amplitude Measures. What then about the last of the prosodic factors, amplitude or loudness? In order to answer this question, we first analyzed adult amplitude patterns in the two languages from the reiterant speech study mentioned earlier (Levitt, 1991). We chose five speakers of each language, 3 men and 2 women, at random. We measured the peak amplitude in each of the reiterant syllables produced by the adults and also of each of the reduplicated syllables produced by our French and American infants. Duration measures for each of the syllables had already been obtained. Our results are pictured in the two graphs in Figure 4. As indicated in the top graph, we found that, as mentioned earlier, French adults tend to produce long final syllables with lowest amplitude (81%) significantly more often than American adults (45%) in their utterances overall [t(8)=3.2, p=.0061, one-tailed], although Americans did show a similar tendency for long finals with low amplitudes, especially for words without a final-syllable stress. The infants showed a similar pattern of results, with French infants linking long final syllables with lowest amplitude more often (33%) than English-learning American infants (21%), but this difference in the infant populations was not significant [t(8)=1.3, ns].

As displayed in the bottom graph of Figure 4, when we looked at nonfinal syllables in utterances of three or more syllables, we found that American adults tended to link long nonfinal syllables with highest amplitude or loudness (80%) significantly more often than the French adults (45%) [t(8)=4.1, p=.0016, one-tailed]. Similarly, American infants tended to produce their highest amplitudes on the longest nonfinal syllables (57%), whereas the French infants did so less often (41%). This latter difference between the two groups of infants approached significance [t(8)=1.6, p=.0711].

[Figure 4: bar graphs with bars for French Adults, French Infants, English Adults, and English Infants.]

Figure 4. Comparison of duration-linked amplitude patterns for French- and English-speaking adults and French- and English-learning infants. The top panel shows the typical French pattern and the bottom panel shows the typical English pattern.
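The amplitude comparison above can be illustrated with a short sketch. It assumes that an utterance counts as showing the French pattern when its final syllable is both the longest and the lowest in peak amplitude, and that the reported t(8) values come from an independent two-sample t-test with pooled variance on per-speaker proportions (five speakers per group gives 5 + 5 - 2 = 8 degrees of freedom). Both points are assumptions: the text does not state the exact criterion or test, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def long_final_low_amp(durations, amplitudes):
    """Assumed criterion: final syllable is longest and lowest in amplitude."""
    return (durations[-1] == max(durations)
            and amplitudes[-1] == min(amplitudes))


def speaker_rate(utterances):
    """Proportion of a speaker's utterances that meet the criterion.

    utterances: list of (durations, peak_amplitudes) pairs, one per utterance.
    """
    hits = [long_final_low_amp(d, a) for d, a in utterances]
    return sum(hits) / len(hits)


def pooled_t(group_a, group_b):
    """Two-sample t statistic with pooled variance; returns (t, df)."""
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(group_a)
              + (nb - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df
```

With one rate per speaker in each language group, pooled_t returns a statistic on 8 degrees of freedom, which would then be compared against a one-tailed critical value as in the tests reported above.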

5. EVIDENCE FOR PROSODIC INFLUENCES IN CHILDREN'S LATER PRODUCTIONS

By the age of two, children have already largely mastered a number of the syllabic timing properties of their language. Thus, Allen (1983) has shown that French children exhibit final-syllable lengthening in polysyllabic words by two years of age. Although the patterns of final lengthening produced by the children were more variable than those produced by French adults, the children's median ratios of final to nonfinal syllables were very comparable to those of French adults, roughly 1.6:1. Similarly, Smith (1978) has shown that English-speaking children between two and three years of age have mastered final-syllable lengthening as well, with a final to nonfinal ratio of close to 1.4:1 for both the adults and the children. Some research with two-year-old children learning tone languages (Li & Thompson, 1977) suggests that children can reproduce tonal patterns more accurately than speech segments, although certain tone contours appear easier to acquire than others.

6. POSSIBLE ORDER OF ACQUISITION OF PROSODIC FEATURES

Our results, taken together with some of the other research concerning infants' early perception (and occasional production) and young children's production of certain fundamental frequency and rhythmic properties, lend support to the notion that infants begin to imitate some of the prosodic properties of their native language before they fully master its segmental properties. Specifically, it would appear that the more global properties of fundamental frequency and syllabic timing are acquired before amplitude patterns. Within each prosodic domain, there also appears to be some evidence for a learning sequence. Li and Thompson (1977), as noted above, found that children learning Mandarin acquired some tone contours, which are based on fundamental frequency, earlier than others. Similarly, our results on the acquisition of syllable timing suggest an early beginning for children's development of control of final-syllable lengthening and of utterance length, whereas acquiring the regular timing of nonfinal syllables in French appears more difficult. Children's vocal productions are notably


more variable than those of adults and gradually move towards more adult-like stability as they gain increasing motor control (e.g., Kent, 1976). Producing regularly-timed syllables would thus be considerably more difficult for children than for adults. However, before we relegate the child's control of the amplitude patterns of his/her language to the status of prosody's stepchild, we have to keep in mind that relatively little exploration has been done of the infant's sensitivity to language amplitude patterns and that the present results dealt with two languages, English and French, for which differences in the amplitude patterns of syllables may be less important in perception than are the other prosodic variables. Until more direct tests are undertaken of infants' sensitivities to all of the prosodic features and comparisons are made between languages such as English or French, on the one hand, and Ik, a language spoken in eastern Sudan, which contrasts voiceless and presumably low amplitude versus voiced, presumably high amplitude vowels, on the other hand (Maddieson, 1984), our conclusion that amplitude pattern control is acquired later than other prosodic features must be provisional.

7. EVIDENCE FOR REGRESSION IN PROSODIC LEARNING

We would also speculate, based on some of our own findings as well as on suggestions from the literature, that infants show a special sensitivity to prosody beginning perhaps at birth and lasting until about 9 or 10 months of age, when there may be some regression in the child's sensitivity to prosody. It would come, of course, at a time when Werker and her colleagues (e.g., Werker, 1989; Werker & Lalonde, 1988; Werker & Tees, 1984) and Best and her colleagues (Best, in press; Best, McRoberts, & Sithole, 1988) have shown that there is a shift in infants' phonetic perception of some nonnative segmental contrasts as their focus turns to learning the words of their native language. Our evidence comes from both perception and production studies of prosody. Recently Catherine Best, Gerald McRoberts, and I (1991) investigated the ability of infants who were either 2-4, 6-8, or 10-12 months old to discriminate a prosodic contrast (questions versus statements) in English
