The acoustic signal leaving the speaker's mouth and entering the hearer's ear is a sound wave whose properties vary continuously along the dimensions of time, pitch, intensity and timbre. It is an object of phonetics. In the diagram, the waveform visualizes the vibration of the vocal cords.
What a listener perceives is a complex acoustic object of the form of the above example. His task is to find the sense that the speaker has encoded in it. Different occurrences of the same word differ in their physical composition along the above acoustic dimensions. Therefore, the raw acoustic shape cannot be the form in which the significans of a linguistic sign is stored in the mental lexicon. Instead, it is stored in a form closer to its alphabetic representation in writing, i.e. as composed of discrete units. The sound system of a language, i.e. its phonology, consists of several categories of discrete units, such as phonemes and features, whose selection and combination obey phonological rules.
Viewed systematically, speech perception is a sequence of steps running through the levels of speech production essentially in reverse order (Arjmandi & Behroozmand 2024):
Level | Operations/processes |
Acoustics | acoustic percept is analyzed to produce a phonetic representation |
Phonology | phonological rules analyze the phonetic representation to produce a phonological representation |
Morphology | phonological representation is analyzed morphologically to produce a morpho-phonological representation |
Symbolization | morpho-phonological representation is matched with a schematic semantic representation |
Semantics | meaningful elements with their structural relations build a complex meaning |
Pragmatics | understood meaning is integrated with other knowledge to reconstruct speaker's idea |
This, however, is not how speech perception actually works. First of all, the listener has a certain measure of empathy with the speaker, if only because both are human beings. The listener understands the speaker by putting himself in the speaker's place. He thus undertakes forward construction of the growing idea just as the speaker does. On the basis of the speech situation and the context, he can anticipate to a considerable extent what the speaker is going to say. Whenever his expectations are met, they help him in decoding the message. The hearer thus does not primarily run bottom-up through the same series of steps that the speaker ran through top-down. Instead, he accompanies the speaker in the generation of sense. To a large extent, he uses what he hears only as a corrective for his own construction of sense. The entire process is accompanied by self-monitoring, in which the hearer can check at every step whether the result currently reached is compatible with everything else in the speech situation and with world knowledge.
A complex phonological unit, e.g. the significans of a word, is constructed from a percept and serves as the input for motor commands given to the speech apparatus. Consequently, speech perception activates the same phonological unit in memory that is used in speech production.
The mirror-neuron system becomes involved when we perceive somebody executing his own motor commands, and it may activate the corresponding motor channels in the perceiver. The motor theory of speech perception maintains that when we perceive speech, the phonological units that we reconstruct activate the motor neurons which drive the speech apparatus. There is, however, no direct connection between acoustic input and motor output; i.e., acoustic features are not directly mapped onto gestures of the speech organs. Instead, the connection is mediated by the mental representation of abstract phonological units. To the extent that motor neurons are actually activated during perception, this is a consequence of spreading activation which passes beyond the phonological representation.
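The mediated connection just described can be illustrated with a toy network. The following is a minimal sketch under stated assumptions, not a model of actual neural processing and not an implementation of any published model: the node names, the edge weights, the threshold and the function `spread` are all hypothetical and chosen only for illustration. The one structural assumption taken from the text is that acoustic nodes are linked to phonological nodes and phonological nodes to motor nodes, with no direct acoustic-to-motor edge.

```python
from collections import defaultdict

# Hypothetical toy network (all names and weights are invented for illustration).
# Edges run only acoustic -> phonological -> motor; there is deliberately
# no direct edge from an acoustic node to a motor node.
EDGES = {
    "acoustic:burst+voicing": {"phoneme:/b/": 0.9},
    "phoneme:/b/": {"motor:lip_closure": 0.5, "motor:voicing": 0.5},
}


def spread(source, edges, steps=2, threshold=0.1):
    """Propagate activation from `source` along weighted edges.

    Each round, every node activated in the previous round passes
    (its activation * edge weight) to its neighbours; amounts below
    `threshold` are dropped. Returns a dict of node -> total activation.
    """
    activation = defaultdict(float)
    activation[source] = 1.0
    frontier = {source: 1.0}
    for _ in range(steps):
        next_frontier = defaultdict(float)
        for node, level in frontier.items():
            for target, weight in edges.get(node, {}).items():
                passed = level * weight
                if passed >= threshold:
                    next_frontier[target] += passed
        for node, level in next_frontier.items():
            activation[node] += level
        frontier = next_frontier
    return dict(activation)


# Two rounds: activation reaches the motor nodes, but only via the phoneme.
result = spread("acoustic:burst+voicing", EDGES, steps=2)

# One round: activation stops at the phonological representation.
perception_only = spread("acoustic:burst+voicing", EDGES, steps=1)

# Without the phonological node's outgoing edges, no motor node is reached.
no_phonology = spread(
    "acoustic:burst+voicing",
    {"acoustic:burst+voicing": {"phoneme:/b/": 0.9}},
    steps=2,
)
```

In this sketch, `perception_only` contains no motor node, which corresponds to perception ending at the phonological unit. `result` contains weakly activated motor nodes, which corresponds to activation passing beyond the phonological representation. `no_phonology` shows that, in this toy setup, motor activation depends entirely on the mediating phonological unit.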
Spreading activation (Dell 1986) implies that the process of perception and understanding does not necessarily stop when the sense of the heard utterance has been construed. Activation of cells may spread further and stimulate even such cells as are only needed to produce an utterance with that sense. Thus,