The main theories of syllable formation and syllable division.

There are several theories of syllable formation and syllable division, and none of them is accepted by all linguists.

The most ancient theory states that there are as many syllables in a word as there are vowels. This theory is primitive and insufficient: it does not take into consideration consonants, which can also form syllables in some languages, nor does it explain where the boundaries between syllables lie.
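The vowel-counting rule can be sketched in a few lines of Python. The vowel inventory and the list-of-symbols transcription format are assumptions made for this illustration only. The sketch shows where the rule breaks down: a word with a syllabic consonant, such as sudden [s ʌ d n], is credited with one syllable although two are heard.

```python
# Illustrative vowel inventory (an assumption; not a complete inventory of English).
VOWELS = {"i", "ɪ", "e", "æ", "ɑ", "ɒ", "ɔ", "ʊ", "u", "ʌ", "ə", "ɜ"}


def count_syllables_by_vowels(phones):
    """Count syllables as the number of vowel symbols in a transcription."""
    return sum(1 for p in phones if p in VOWELS)


print(count_syllables_by_vowels(["s", "ɪ", "t", "ɪ"]))  # city   -> 2
print(count_syllables_by_vowels(["s", "ʌ", "d", "n"]))  # sudden -> 1, syllabic [n] is missed
```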


The so-called breath-puff (expiratory, chest-pulse, or pressure) theory is based on the fact that expiration in speech is not a continuous, uninterrupted process, as it is in ordinary breathing, but a pulsating one. According to this theory, there are as many syllables in a word as there are expiration pulses made during its utterance, because each syllable corresponds to a single expiration. Each vowel sound is pronounced with a fresh expiration, so vowel sounds are always syllabic. The borderline between syllables is, according to this theory, at the point where a fresh expiratory pulse begins, that is, at the moment of the weakest expiration. The American scholar Stetson tried to prove the validity of the expiratory theory of the syllable by an instrumental investigation of syllable formation and syllable division, the results of which he published in 1951 in his book Motor Phonetics: A Study of Speech Movements in Action. He used a number of instrumental techniques to record lip, tongue and chest movements and to measure variations in lung and subglottic air pressure during phonation.

The expiratory theory is strongly criticized by both Russian and foreign linguists. N.I. Zhinkin questions the correctness of the instrumental techniques used by Stetson and doubts the validity of his conclusions, which run counter to easily observable facts: more than ten syllables are easily uttered on a single expiration. G.P. Torsuyev writes that in a phrase a number of words, and consequently of syllables, can be pronounced with a single expiration without breaking it up into pulses. The theory is also inadequate because it cannot explain all cases of syllable formation.


The so-called relative sonority theory of the syllable was put forward by O. Jespersen and further developed by other Western linguists, who often also refer to it as the prominence theory. By the term sonority is meant here the prevalence in a speech sound of musical tone over noise (hence the word sonorant). In this theory the term sonority is used in the sense conveyed by the more precise acoustic term carrying power, that is, the acoustic property of speech sounds which determines the degree of their perceptibility. Thus the sonority theory is based on the fact that different sounds have different carrying power. By means of linguistic experiments Jespersen established that the most sonorous sounds are the vowels; low vowels are more sonorous than high ones, and back vowels are more sonorous than front vowels of corresponding height. Next in descending order of sonority come the semi-vowels [w, j], the frictionless continuants [l, r, m, n, ŋ], the voiced fricatives [v, ð, z, ʒ], the voiced stops (plosives) [b, d, g], the voiceless fricatives [f, θ, s, ʃ] and, least sonorous of all, the voiceless stops (plosives) [p, t, k], which apart from their closure and release have no sound at all. The sonority theory states that there are as many syllables in a word as there are peaks of prominence according to the scale of sonority.

Ex. In the word sudden the most sonorous sound is the vowel [ʌ]; next comes the nasal sonorant [n], which forms the second peak of prominence. [s] and [d] are sounds of low sonority and cannot be considered syllable-forming:

[s ʌ d n]

So in any sequence the most sonorous sounds tend to form the center of the syllable and the least sonorous the marginal segments; a syllable contains one peak of sonority (or prominence) separated from other peaks by valleys of lower sonority (or prominence).
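The peak-counting idea can be illustrated with a minimal Python sketch. The numeric ranks (1 to 7) are an assumption introduced here to encode the order of the scale; the theory itself gives only the order. All vowels share one rank, so the finer distinctions of height and backness are ignored, and the transcription is assumed to be a list of single symbols.

```python
# Ranks follow the order of the sonority scale; the numbers themselves are
# an assumption made for illustration.
SCALE = [
    (1, "p t k"),                    # voiceless stops
    (2, "f θ s ʃ"),                  # voiceless fricatives
    (3, "b d g"),                    # voiced stops
    (4, "v ð z ʒ"),                  # voiced fricatives
    (5, "l r m n ŋ"),                # frictionless continuants
    (6, "w j"),                      # semi-vowels
    (7, "i ɪ e æ ɑ ɒ ɔ ʊ u ʌ ə ɜ"),  # vowels (illustrative subset)
]
SONORITY = {p: rank for rank, group in SCALE for p in group.split()}


def count_sonority_peaks(phones):
    """Count peaks of sonority: each rise that is followed by a fall or by the end."""
    levels = [SONORITY[p] for p in phones]
    peaks = 0
    rising = True  # the silence before the word is treated as a valley
    prev = 0
    for level in levels:
        if level > prev:
            rising = True
        elif level < prev and rising:
            peaks += 1  # sonority has just started to fall: a peak was passed
            rising = False
        prev = level
    if rising and levels:
        peaks += 1  # the word ends while sonority is still high
    return peaks


print(count_sonority_peaks(["s", "ʌ", "d", "n"]))  # sudden -> 2
print(count_sonority_peaks(["k", "æ", "t"]))       # cat    -> 1
```

The sketch returns only the number of peaks; it gives no indication of which syllable a low-sonority sound between two peaks belongs to.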

The sonority theory helps to establish the number of syllables in a word, but it fails to explain the actual mechanism of syllable formation and syllable division, because it does not state to which syllable the weak sound at the boundary of two syllables belongs. Besides, it is valid only for the artificial conditions under which it was established. In actual speech, length, force and pitch are constantly varying, so that the sonority of different speech sounds varies considerably from the established scale of sonority. Thus in the word puzzle [z] can be made much more sonorous than [l]. The drawbacks of this theory were admitted by its creator and by its adherents, D. Jones and A. Gimson.


The muscular tension theory (articulatory tension, or energy theory) was put forward by L.V. Shcherba. He explained the phenomenon of syllable formation by muscular tension impulses. The fact that syllables cannot be further subdivided in connected speech proves that in speaking muscular tension impulses follow one another. Each impulse has its strongest point, the peak of prominence, and its weakest point, the valley of prominence. Valleys of prominence correspond to points of syllable division. In the center of the syllable there is a syllabic phoneme, which is usually a vowel. In pronouncing a syllable, the energy of articulation increases within the range of the prevocalic consonants and then decreases within the range of the postvocalic consonants.

Unfortunately, Shcherba left no further explanation of his theory of the syllable, with the result that some of its points remain unclear.

This theory was modified by V.A. Vassiliev, who stated that the syllable, like any other pronounceable unit, can be characterized by three physical parameters: pitch, intensity and length. Within the range of the syllable these parameters rise from a minimum on the prevocalic consonants to a maximum at the center of the syllable and then decrease again within the postvocalic consonants. So the acoustic properties rise and fall together with the tension of articulation and thus form an arc.


The three types of consonants theory was also put forward by Shcherba. To explain the mechanism of syllable division he distinguished three types of consonants: initially strong, finally strong and geminate, or double. The difference between these types is in the way they are pronounced. In initially strong consonants the beginning is more energetic, while the end is weaker. In finally strong consonants the beginning is weak and the end is more energetic. Geminate (or double) consonants are pronounced in such a way that both the beginning and the end are energetic, with a weakening of muscular tension in the middle. Acoustically, they give the impression of two consonants. The more energetic part of a consonant is attached to a vowel, so that initially strong consonants occur at the end of a syllable, while finally strong consonants occur at the beginning of a syllable.

Ex. initially strong consonants: it, us, oath, add;

finally strong consonants: may, tea, new;

geminate (double) consonants: penknife, what time, midday.

In English, geminate (double) consonants usually occur at the juncture of words or morphemes. Initially strong consonants follow short vowels, while finally strong consonants follow long vowels or diphthongs. Acoustically, finally strong consonants produce the impression of an intimate blend with the vowel which follows.


Ex. finally strong      initially strong
    not a tall one      not at all
    a name              an aim

The use of a finally strong consonant instead of an initially strong one in similar sound sequences strikes the ear of a native speaker as incorrect.

Since in syllable division the character of the end of a consonant is more important than that of its beginning, it is more convenient to use the terms strong-end (finally strong) and weak-end (finally weak) consonants.


The so-called loudness theory was put forward by N.I. Zhinkin. On the basis of his analysis of X-ray moving pictures, together with sound spectrograms and kymograms, he identified the organ which is immediately responsible for syllable formation. This organ is the pharyngeal cavity, or rather its walls. Their contraction gradually narrows the pharyngeal passage, and this, together with the resulting increase in the muscular tension of the walls just at the vocalic peak of the syllable, increases the amplitude of the sound waves and correspondingly the actual loudness of the vocalic element to such an extent that it becomes the peak of the syllable.

So, according to this theory, the syllable can be thought of as an arc (or curve) of loudness which correlates with the arc of articulatory effort, since variations in loudness are due to the work of all the speech mechanisms. This arc is weak at the beginning and at the end and strong in the middle.

In terms of the loudness theory, there are as many syllables in a word as there are arcs of loudness, and the point of syllable division corresponds to the moment when an arc of loudness begins or ends; that is, finally strong consonants begin a syllable and finally weak consonants end it.
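As a rough illustration of this counting rule, the following Python sketch splits a loudness contour into arcs at its interior minima. The contour values are invented for the example, not measured data, and assigning the minimum itself to the following arc is an assumption of the sketch rather than a claim of the theory.

```python
def split_into_arcs(loudness):
    """Split a loudness contour into arcs, cutting at each interior local minimum."""
    if not loudness:
        return []
    arcs, start = [], 0
    for i in range(1, len(loudness) - 1):
        # A valley: loudness has been falling and does not fall any further.
        if loudness[i] < loudness[i - 1] and loudness[i] <= loudness[i + 1]:
            arcs.append(loudness[start:i])
            start = i  # assumption: the valley sample opens the next arc
    arcs.append(loudness[start:])
    return arcs


# Hypothetical contour with two rises and falls, i.e. two syllables.
contour = [1, 4, 6, 3, 2, 5, 7, 4, 1]
arcs = split_into_arcs(contour)
print(len(arcs))  # 2
print(arcs)       # [[1, 4, 6, 3], [2, 5, 7, 4, 1]]
```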

The loudness theory combines the level of production and the level of perception of the syllable. The syllable can therefore be defined as a phonetic unit which is pronounced with one articulatory effort, that is, one muscular contraction, and which is perceived auditorily as one uninterrupted arc of loudness. But Zhinkin did not investigate the mechanism by which sonorants form syllables, and as far as English is concerned it is not clear how the pharyngeal contraction theory can account for it.

So it is obvious that the syllable is not a simple phenomenon. No phonetician has succeeded so far in giving an exhaustive and adequate explanation of what it is. The difficulties seem to arise from the various possibilities of approach to this unit.



