The components of intonation.
As already mentioned, intonation consists of the following components: sentence stress, melody, rhythm, tempo, pausation and timbre. Although the components of intonation are closely interrelated and interdependent in expressing its intellectual, emotional and attitudinal (modal) content, and none of them can be isolated or separated from the others in actual speech, it is possible to single out each component for purposes of analysis.
Sentence-stress. In comparing word-stress with sentence-stress, we see that their functions are different. The function of word-stress is to mould the word by indicating its strongest syllable. The function of sentence-stress is more complicated. Sentence-stress organizes the phrase phonetically, helps to make speech articulate, and provides the basis for identifying and understanding the content by contributing to a clear rendering of the meaning. It indicates the end of the syntagm by means of strengthening the last syllable, by a definite pitch-pattern and frequently also by a pause. Sentence-stress is also used to indicate the important words in a syntagm (from the point of view of grammar, meaning or the speaker’s attitude).
In accordance with these functions of sentence-stress, we may distinguish three types of it: (1) syntagm stress (unemphatic or normal sentence-stress); (2) logical sentence-stress; (3) emphatic sentence-stress. Each type is characterized by a different degree of stress.
Syntagm stress is used in unemphatic speech to break up connected speech into syntagms and to indicate the important words in syntagms. Some linguists distinguish between syntagmatic (or primary) stress which singles out only the semantic centre of a syntagm and is usually realized in the last stressed word, and syntactic (or subsidiary) stress which emphasizes all the other notional elements of speech.
Logical stress is used to bring into prominence a word or words in a syntagm that are significant from the point of view of meaning or of the speaker’s attitude to the subject discussed. It consists in shifting the syntagmatic stress from its normal place on the last stressed word to one of the preceding words.
So there are two positions of syntagmatic stress – unmarked, or normal position on the last lexical item of the syntagm, and marked, or special position on an earlier part of the syntagm, when the speaker wants to draw attention to it, usually to contrast it with something already mentioned, or understood in the context. In the first case the nucleus is called the end-focus. In the second case the nucleus is called contrastive-focus.
Ex. “Did your brother study in Moscow?” “No, he was born in Moscow.”
In a marked position, the syntagmatic stress may be on any word in a syntagm. Even words which are not normally stressed at all can receive nuclear stress for special contrastive purposes.
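The two positions of the nucleus described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the function name place_nucleus, the tiny FORM_WORDS list and the contrast parameter are hypothetical and not part of any established tool, and real nucleus placement depends on context that such a word list cannot capture.

```python
# Hypothetical, deliberately small list of form (function) words; not exhaustive.
FORM_WORDS = {"a", "an", "the", "in", "on", "at", "to", "of", "he", "she", "it",
              "was", "is", "did", "your", "no"}


def place_nucleus(words, contrast=None):
    """Return the index of the word carrying the nuclear (syntagmatic) stress.

    contrast: optional word the speaker sets against the context (marked
    position). Without it the nucleus falls on the last lexical item
    (unmarked position, end-focus).
    """
    lowered = [w.lower().strip(".,?!") for w in words]
    if contrast is not None:
        return lowered.index(contrast.lower())  # contrastive focus
    for i in range(len(lowered) - 1, -1, -1):
        if lowered[i] not in FORM_WORDS:
            return i  # end-focus: last word that is not a form word
    return len(words) - 1  # only form words: fall back to the last word


question = "Did your brother study in Moscow?".split()
answer = "No, he was born in Moscow.".split()
print(question[place_nucleus(question)])               # Moscow?
print(answer[place_nucleus(answer, contrast="born")])  # born
```

In the question the nucleus stays in its unmarked position on the last lexical item; in the answer the speaker's contrast moves it to an earlier word.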
In exceptional cases, contrastive stress in a word of more than one syllable may shift to a syllable which does not normally have word stress. For example, if you want to make a contrast between the two words normally pronounced bu'reaucracy and 'autocracy you may do so as follows: 'bureaucracy and 'autocracy.
Emphatic stress is used to express the speaker’s emotions or to suggest to the listener some idea or some shade of meaning which is not expressed in words. Sentence stress is made emphatic by widening the range of pitch of the nucleus, increasing the degree of loudness of the syllable, slowing down the tempo.
Degrees of stress in an utterance correlate with the pitch range system. Nuclear stress is the strongest: it carries the most important information. Non-nuclear stresses are subdivided into full and partial. Full stress occurs only in the head; partial stress occurs also in the pre-head and tail. Words given partial stress do not lose prominence completely: they may retain the full quality of their vowels.
Sentence-stress and word-stress are mutually dependent. Their relationships consist in the modifications which the accent of a word undergoes when this word is used in a sentence. These modifications are as follows:
a) The word accent of a monosyllabic word may disappear in a sentence. This is usually the case with form words, in which the loss of stress usually results in their quantitative, qualitative or zero reduction.
b) The word accent of a monosyllabic word may be retained in a sentence without any marked diminution or increase. This is usually the case with words forming the scale of a syntagm in unemphatic speech.
c) The word accent of a monosyllabic word may be increased in different degrees in a sentence. A slight increase is observed when such a word forms the accentual nucleus of a syntagm. The increase may be very great in emphatic and emotional speech.
d) The main word accent of a disyllabic or polysyllabic word never disappears altogether in a sentence. It may only become weaker, i.e. have the force of secondary or even tertiary stress, when such a word has no sentence-stress.
The functions of sentence-stress are accomplished in the English language by means of two main principles: the dynamic (the greater force of utterance) and the musical (changes in the direction of voice pitch), as well as by two subsidiary principles: the qualitative and the quantitative.
The dynamic principle applies also to word-stress; however, sentence-stress makes use of the emphatic degree of stress which is expressed partly by pitch variations, partly by the following methods:
a) glottal stop (Ex. It was "utterly im'possible! [it wəz "ʔʌtəli im'posibl]);
b) modifications of stress (Ex. "No! "Absolutely 'nothing. "Im"possible!);
c) specially distinct articulation of words, syllable by syllable (Ex. "Absolutely! ["æb-so-"lu:-tli]).
The activity of the musical principle is expressed in the pitch-patterns that are used in final stressed elements of syntagms, and also in the variations of pitch among the stressed elements within the same syntagm.
The quantitative principle, which plays a subsidiary role in English, mostly concerns consonants, which are frequently lengthened for the sake of emphasis, especially sonorants (except [w] and [j]). Ex. Marvellous! ['m:ɑ:vləs]; How late you are! [hau "l:eit ju ɑ:]. Even a voiceless consonant may be lengthened: It's filthy! [its "f:ilθi].
As a rule, vowels in English are not subject to emphatic lengthening, especially short vowels. In Russian, by contrast, vowel-lengthening is used freely for the purpose of creating emphasis. In English, the length of long monophthongs and diphthongs may be increased only when they are final or when followed by voiced consonants; in this position, even in unemphatic speech, vowels are longer. As to short vowels, they are lengthened only in two special cases: under the influence of emphatic tones (for example, the fall-rise) and in singing.
The presence of the qualitative principle is based not only on the fact that words may have no sentence-stress, but also upon the fact that the quality of the vowel may change. The word “many” has the vowel [e] in the first stressed syllable; but the quality of the vowel changes if the word receives no sentence-stress, and the vowel [e] of the first syllable is reduced to [ə]. Ex. How many pennies are there in a shilling? ['hau məni 'peniz ɑ: ðər in ə 'ʃiliŋ]
In unemphatic speech there is a certain uniformity in the distribution of sentence-stress in a syntagm. Of course, these principles vary in different languages. In an English syntagm, stress mostly marks groups of words and, less frequently, single words. These so-called “stress-groups” give an English syntagm, and consequently English speech in general, a peculiar rhythmical pattern. Thus, an English syntagm consists of a number of “stress-groups”; a “stress-group”, in its turn, consists of a number of unstressed syllables which are grouped around a stressed one.
It is possible to formulate general rules for the distribution of stress in unemphatic English sentences. The stressed elements are those which are more essential in rendering the meaning, namely: nouns; adjectives; notional verbs; auxiliary and modal verbs in negative contracted forms, when introducing a question, or when substituting for a notional verb; adverbs; numerals; demonstrative, negative, reciprocal, interrogative and emphasizing pronouns; the indefinite pronouns somebody, someone, something, anybody, anyone, anything used as subject; possessive pronouns in the absolute form; interjections; two-word prepositions and conjunctions; and the particles only, also, too, even, just.
The following words are usually not stressed in unemphatic sentences: articles; one-word prepositions and conjunctions; personal, relative and reflexive pronouns; the indefinite pronouns somebody, someone, something, anybody, anyone, anything used as object; possessive pronouns in the conjoint form; the particles there and to; auxiliary, semi-auxiliary and modal verbs. Their number in English is great and they form clusters, grouping themselves around the stressed notional words in a syntagm.
A word that has just been used is not stressed. In exclamatory sentences such words as what, how, etc. are not stressed, if an “emphatic” word follows. (Ex. What 'crowds of people! How 'beautiful!).
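The general rules above can be approximated by a short Python sketch. It is a rough illustration, not a full implementation of the rules: the UNSTRESSED word list is a hypothetical, non-exhaustive sample, the function name mark_stress and the test sentence are invented for the example, and context-dependent cases (auxiliaries introducing a question, negative contracted forms, a word that has just been used, exclamatory what and how) are deliberately ignored.

```python
# Hypothetical, non-exhaustive sample of words the rules leave unstressed.
UNSTRESSED = {
    "a", "an", "the",                      # articles
    "in", "on", "at", "to", "of", "for",   # one-word prepositions, particle "to"
    "and", "but", "or",                    # one-word conjunctions
    "i", "he", "she", "it", "we", "they",  # personal pronouns
    "my", "his", "her", "our", "their",    # possessive pronouns (conjoint form)
    "is", "are", "was", "have", "can",     # auxiliary and modal verbs
}


def mark_stress(sentence):
    """Prefix every word treated as stressed with the stress mark (')."""
    marked = []
    for word in sentence.split():
        key = word.lower().strip(".,?!")
        marked.append(word if key in UNSTRESSED else "'" + word)
    return " ".join(marked)


print(mark_stress("She went to the station and bought a ticket."))
# She 'went to the 'station and 'bought a 'ticket.
```

The output shows how the unstressed form words cluster around the stressed notional words, which is what produces the stress-groups of an English syntagm.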
Melody. Each syllable of the speech chain has a special pitch colouring. Some of the syllables have significant moves of tone up and down. Pitch movements are inseparably accompanied by variations in loudness. This can be explained by the fact that on the acoustic level pitch correlates with the fundamental frequency of the vibration of the vocal cords, while loudness correlates with the amplitude of the vibrations. The pitch parameters include the distinct variations in the direction of pitch, pitch level, pitch range and pitch angle, or rate.
Pitch range is the interval between two pitch levels or two differently-pitched syllables or parts of a syllable. The pitch range of a whole syntagm is the interval between the highest-pitched and the lowest-pitched syllables. Variations in pitch range occur within the normal range of the human voice, i.e. within its upper and lower limits. The whole range may be normal, which is used in unemphatic delivery, wide and narrow which are brought into use in emphatic speech. These ranges, even in the case of an individual speaker, are not fixed, either absolutely or relatively to one another. They may, according to circumstances, be shifted slightly up or down, or expanded or contracted to a moderate degree.
Within the normal range of the speaking voice, i.e. within the interval between its lower and upper limits in unemphatic speech, most phoneticians distinguish three pitch levels: low, mid (or medium), and high. These pitch levels are, of course, relative, not absolute: a man’s voice produces the three in a lower register than a woman’s. There exist not only the obvious differences in the pitches used by men and women respectively, but also smaller though noticeable differences between individuals of the same sex. In emphatic and emotional speech, extra-high and extra-low pitch levels may be distinguished in addition to the three unemphatic pitch levels. The pitch level of a whole syntagm is determined by the pitch of its highest-pitched syllable which, in unemphatic speech, is usually the first stressed syllable of the syntagm.
Pitch ranges should not be confused with pitch levels, although the two are closely interdependent. For instance, the pitch range between two syllables or two parts of a syllable is narrow when the first of them is pronounced on a high level and the second on a mid level or the first on a mid level and the second on a low level. But the pitch range is wide, when the first syllable is pronounced on a high level and the second on a low one. The more the height of the pitch contrasts within the intonation pattern the more emphatic the syntagm sounds.
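The relation between pitch levels and pitch ranges described above can be sketched in Python as follows. This is only a schematic illustration: the numeric values assigned to the levels, the function name pitch_range and the "none" label for two syllables on the same level are assumptions made for the example, and the use of the absolute difference extends the rule to rising intervals, which the text does not state explicitly.

```python
# Relative pitch levels; the numbers are an assumption used only for comparison.
LEVELS = {"low": 0, "mid": 1, "high": 2}


def pitch_range(first, second):
    """Classify the interval between the pitch levels of two syllables."""
    step = abs(LEVELS[first] - LEVELS[second])
    if step == 0:
        return "none"  # same level: no interval (case not discussed in the text)
    return "narrow" if step == 1 else "wide"


print(pitch_range("high", "mid"))  # narrow
print(pitch_range("mid", "low"))   # narrow
print(pitch_range("high", "low"))  # wide
```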
The significant change in pitch direction takes place in the nucleus where the pitch goes distinctly up or down. In terms of pitch ranges the high-falling tone is a tone with a wide pitch range (from high to low), whereas a low-falling tone has a narrow pitch range (from mid to low).
In English there are also cases when no audible nuclear tone movement precedes a syntagm boundary. In such a circumstance one may consider it to be the level nuclear tone. The tone of a nucleus determines the pitch of the rest of the intonation pattern following it which is called the tail. Thus after a falling tone, the rest of the intonation pattern is at a low pitch. After a rising tone the rest of the intonation pattern moves in an upward pitch direction. The nucleus and the tail form what is called terminal tone. The pre-nuclear part, consisting of the pre-head and the head can take a variety of pitch patterns. Variation within the pre-nucleus does not usually affect the grammatical meaning of the utterance, though it often conveys meanings associated with attitude or phonetic styles. The pitch of the pre-nuclear part may gradually descend or ascend to the nucleus or stay more or less on the same level. The pitch pattern of a syntagm is formed by the combination of the pitch movements in the nucleus and in the pre-nuclear part within a pitch range of different pitch levels.
The changes of pitch are not haphazard variations. The rules of such changes are highly organized. No matter how variable the individual variations of this prosodic component are they tend to become formalized or standardized, so that all speakers of the language use them in similar ways under similar circumstances.
Rhythm is a very general term. From the materialistic point of view rhythm is one of the means of matter organization. The rhythmical arrangement of different phenomena of objective reality is presented in the form of periodicity in time and space, or a tendency towards proportion and symmetry. We find it everywhere in life. In nature rhythm is observed in the succession of seasons, days and nights, the changes of the moon phases, high and low tide. The work of all kinds of machinery is rhythmical. We feel and appreciate very well the artistic rhythm in music, dance and other fields of art. Rhythm as a linguistic notion is realized in lexical, syntactical and prosodic means and mostly in their combinations. For instance, sound or word repetition, syntactical parallelism, intensification and others are perceived as rhythmical on the lexical, syntactical and prosodic levels. Most human activities appear to be rhythmical: swimming, running, skiing, knitting and other muscular movements. The most evident illustration of rhythm in the physiology of living beings is the heartbeat and breathing. Speech production is naturally closely connected with the process of breathing. So speech activity, like any other human activity, is conditioned by physiological factors among others and is characterized by rhythm. A more detailed definition of speech rhythm is “the regular alternation of acceleration and slowing down, of relaxation and intensification, of length and brevity, of similar and dissimilar elements within a speech event”.
The basic unit of the rhythmical structure of an utterance is called a stress group, accentual group, pause group, breath group or rhythmic group. The term “pause group” underlines that this unit contains a group of words between two pauses. The term “breath group” emphasizes the physiological factor that this unit can be uttered within a single breath. The term “rhythmic group” is used by most linguists as it implies more than a stress group or breath group. It is a speech segment which contains a stressed syllable with or without unstressed syllables attached to it. The most frequent type of rhythmic group includes 2-4 syllables, one of them stressed, the others unstressed. Most rhythmic groups are simultaneously sense units. A rhythmic group may comprise a whole phrase, like “I can’t do it”, or just one word: “Unfortunately...”, or even a one-syllable word: “Well...”; “Now...”. So a syllable is sometimes taken for a minimal rhythmic unit.
The stressed syllable is the prosodic nucleus or peak of prominence. The initial unstressed syllables preceding the nucleus of the rhythmic group are called proclitics, those following it are called enclitics. In qualifying the unstressed syllables located between the stressed ones there are two main alternative views among phoneticians. According to the so-called semantic viewpoint, the unstressed syllables tend to be drawn towards the stressed syllable of the same word or of the lexical unit with which they are semantically connected. According to the other viewpoint, the unstressed syllables in between the stressed ones tend to join the preceding stressed syllable. This is the so-called enclitic tendency. The enclitic tendency is more typical of the English language, where, as a rule, only initial unstressed syllables cling to the following stressed syllable; non-initial unstressed syllables cling to the preceding stressed syllables, though in the speech flow it is sometimes difficult to define the borders of rhythmic groups. The speech tempo and style often regulate the division into rhythmic groups. The enclitic tendency is more characteristic of informal speech whereas the semantic tendency prevails in accurate, more explicit speech.
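The enclitic tendency described above amounts to a simple grouping rule, sketched here in Python. The sketch assumes that the utterance has already been divided into syllables and that each syllable is marked as stressed or unstressed; the function name rhythmic_groups, the input format and the sample sentence ("He was 'walking to the 'station") are hypothetical choices made for the illustration.

```python
def rhythmic_groups(syllables):
    """Split (text, is_stressed) pairs into rhythmic groups, enclitic style.

    Initial unstressed syllables (proclitics) join the first stressed
    syllable; every later unstressed syllable (enclitic) joins the
    preceding stressed syllable.
    """
    groups = []
    current = []
    seen_stress = False
    for text, stressed in syllables:
        if stressed and seen_stress:
            groups.append(current)  # a new stressed syllable opens a new group
            current = []
        current.append(text)
        seen_stress = seen_stress or stressed
    if current:
        groups.append(current)
    return groups


# He was 'walking to the 'station (syllable division given by hand)
utterance = [("he", False), ("was", False), ("walk", True), ("ing", False),
             ("to", False), ("the", False), ("sta", True), ("tion", False)]
print(rhythmic_groups(utterance))
# [['he', 'was', 'walk', 'ing', 'to', 'the'], ['sta', 'tion']]
```

Under the semantic viewpoint the same utterance would instead be divided as "he was walking" and "to the station", since "to the" is semantically connected with the following noun.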
The more organized the speech is, the more rhythmical it appears, poetry being the most extreme example of this. Prose read aloud or delivered in the form of a lecture is more rhythmic than colloquial speech. On the other hand, rhythm is also individual: a fluent speaker may sound more rhythmical than a person searching for the right word and refining the structure of his phrase while actually pronouncing it. There are some obvious differences between the rhythmic patterns of various speech realizations. For instance, the rhythmic organization of a dispassionate monologue will differ greatly from that of a familiar conversation.
But regularity in a speech chain is not realized in its exact isochronous form: the intervals between the stressed syllables are not physically equal, and absolutely regular speech would produce the effect of monotony. Some “strokes” may often be missing or mistimed. Whenever short rhythmic groups are mixed with longer ones the speaker minimizes the differences by means of changes in his rate of delivery. Any number of unstressed syllables occurring between the stressed ones are actually compressed to allow the next stressed syllable to come on the regular beat. In other words, the length of the intervals is perceived by the listener as equal despite the changing number of unstressed syllables between the peaks of the rhythmic groups.
Linguists divide languages into two groups: syllable-timed languages, such as French and Spanish, and stress-timed languages, such as English, German and Russian. In a syllable-timed language the speaker gives an approximately equal amount of time to each syllable, whether the syllable is stressed or unstressed, and this produces the effect of an even, rather staccato rhythm. In a stress-timed language the rhythm is based on a larger unit than the syllable. Though the amount of time given to each syllable varies considerably, the total time of uttering each rhythmic unit is practically unchanged. The stressed syllables of a rhythmic unit form peaks of prominence. They tend to be pronounced at regular intervals no matter how many unstressed syllables are located between every two stressed ones. Thus the distribution of time within the rhythmic unit is unequal. The regularity is provided by the strong “beats”.
The markedly regular stress-timed pulses of speech seem to create the strict, abrupt and spiky effect of English rhythm. Russian rhythm is perceived as more flexible, liquid and smooth. The analytical character of English explains the presence of a considerable number of monosyllabic form words which are normally unstressed in a stretch of English speech. To bring the meaning of the utterance to the listener, the stressed syllables of the notional words are given more prominence by the speaker and unstressed monosyllabic form words are left very weak. Speech rhythm has an immediate influence on vowel reduction and elision. Prepositions, conjunctions, auxiliary and modal verbs, and personal and possessive pronouns are usually unstressed and pronounced with reduced or even elided vowels to secure equal intervals between the stressed syllables. Under the influence of rhythm, words which are normally pronounced with two equally strong stresses may lose one of them, or may have their word stress realized differently.
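The difference between the two types of timing can be shown with an idealized numerical sketch in Python. The figures are assumptions chosen for the illustration, not measurements: 400 ms per rhythmic group for stress timing and 200 ms per syllable for syllable timing. The function name group_durations is hypothetical, and the model reports only the mean syllable duration in each group, whereas in real speech the time inside a group is distributed unequally and isochrony is only approximate.

```python
def group_durations(group_sizes, timing, unit_ms=400):
    """Return (total_ms, mean_syllable_ms) for each rhythmic group.

    group_sizes: number of syllables in each rhythmic group
    timing: "stress"   - every group lasts unit_ms in total
            "syllable" - every syllable lasts unit_ms
    """
    result = []
    for n in group_sizes:
        if timing == "stress":
            total = unit_ms        # every group takes the same time
        elif timing == "syllable":
            total = unit_ms * n    # every syllable takes the same time
        else:
            raise ValueError("timing must be 'stress' or 'syllable'")
        result.append((total, total / n))
    return result


sizes = [1, 2, 4]  # rhythmic groups of one, two and four syllables
print(group_durations(sizes, "stress"))
# [(400, 400.0), (400, 200.0), (400, 100.0)]
print(group_durations(sizes, "syllable", unit_ms=200))
# [(200, 200.0), (400, 200.0), (800, 200.0)]
```

In the stress-timed model the syllables of a longer group are compressed so that the next stressed syllable arrives on the beat; in the syllable-timed model the groups themselves grow longer.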
The sphere in which rhythm functions is actually very wide. Rhythm is a complicated language system, comprising well-organized elements of different sizes in which smaller rhythmic units are joined into more complex ones: a rhythmic group → an intonation group → a phrase (a line in poetry) → a phonopassage. Thus, the rhythmic structure of the speech continuum is a hierarchy of rhythmic units of different levels. The regular recurrence of the stressed syllables at relatively isochronous intervals is perceived as rhythmicality. Rhythmic groups blend together into syntagms which reveal the similarity of a number of prosodic features. Thus, a syntagm includes from 1 to 4 stressed syllables and usually lasts 1-2 seconds. The tone and loudness vary from maximum at the beginning of a syntagm to minimum at the end. A syntagm is characterized by the lengthening of the first and the last rhythmic groups in comparison with the others, the descending character of the melody and a short pause after it. The similarity of the prosodic organization of the syntagm makes it a rhythmic unit. A phrase often coincides either with a syntagm or even with the phonopassage. In both cases a phrase is perceived as a rhythmic unit having all the parameters of either a syntagm or a phonopassage. The recurrence of similar and equal text segments makes them rhythmic units, so that rhythmicality marks every text segment. The rhythmic effect of the text units is obtained by the prosodic parameters: the pitch of the voice, its level and range, loudness, duration, pausation and other phenomena of a stretch of speech. The rhythm constituents vary not only in different rhythm units but also in different speech realizations and different linguistic activities. Rhythmically organized speech is easily perceived. From the psycholinguistic point of view the accuracy of the temporal similarity in rhythm has a definite effect on the human being. The regularity in rhythm seems to be in harmony with human biological rhythms.
Rhythm serves to connect elements in speech: smaller units are organized into larger ones, larger units include smaller ones. So rhythm unites text segments into a whole and at the same time cuts the discourse into elements. This integrative and delimitative function of rhythm illustrates the dialectical unity of the contrary manifestations of rhythm. Besides, rhythm is a very effective means of speech expressiveness, conveying different degrees of emotional effect on the listener (Ex. 'Will you 'stop that 'dreadful 'noise.).
The tempo of speech can be normal, slow or fast. The parts of the utterance which are particularly important sound slower. Unimportant parts are commonly pronounced at a greater speed than normal. Each syntagm of the sentence is pronounced in approximately the same period of time; unstressed syllables are pronounced more rapidly: the greater the number of unstressed syllables, the quicker they are pronounced. Proclitics are pronounced faster than enclitics.
Pausation is closely connected with the other components of intonation. A pause is a stop of phonation for a short period of time before starting again. Any stretch of speech can be split by means of pauses into smaller portions, i.e. phonetic wholes (chains of oral speech which are semantically and intonationally complete), phrases and syntagms.
Functionally, pauses may be classified as syntactic (or temporal), emphatic, hesitation and breathing pauses. Syntactic pauses serve for the segmentation of the speech continuum into units and are considered an additional means of unifying and delimiting syntagms or sentences by showing relations between them. They play a semantic and syntactic role.
Syntactic pauses are subdivided into:
a) short optional pauses which may be used to separate syntagms within a phrase;
b) longer obligatory pauses which normally manifest the end of the phrase;
c) very long pauses, approximately twice as long as the first type, which are used to separate phonetic wholes.
The length of syntactic pauses varies and depends on the degree of semantic importance, completeness and connection of the syntagm with the following one. The more important the syntagm is, the longer the pause after it. The length of pauses is also connected with the rate of speech. It is relative to the tempo and rhythmicality norms of an individual. Sometimes pauses may even disappear in fast speech and the delimiting function is performed by the nuclear tone alone. The length of the “end-of-utterance” pauses is controlled by the person who is about to speak.
Emphatic pauses serve to make certain parts of the utterance especially prominent and to attach special importance to the word which follows. Hesitation pauses serve as signals of doubt or suspense and are mainly used in spontaneous speech to gain some time to think over what to say next. They may be unfilled or filled, corresponding to silent and voiced pauses. The latter have the quality of the central vowels [ə, ɜ:] or [m, ɜ:m]. Emphatic and hesitation pauses are made within syntagms as well. They are an additional means of expressing the speaker’s emotions, thus performing an attitudinal function.
Timbre expresses various emotions, attitudes and moods of the speaker, such as joy, anger, sadness, indignation, etc. Timbre should not be equated only with voice quality, which is the permanently present person-identifying background; it is a more general concept, applicable to the inherent resonances of any sound. Timbre is studied along the lines of quality (whisper, breathy, creak, husky, falsetto, resonant) and qualification (laugh, giggle, tremulousness, sob, cry).