Evolution of language

The lexicon of a language of contemporary occidental civilization comprises hundreds of thousands of entries; the Oxford English Dictionary Online, for instance, contains over 500,000 entries. The lexicon of a language lacking a written tradition – and consequently, science – typically comprises up to 10,000 entries. Even at this smaller size, a language that is to acquire such a vocabulary needs double articulation; no sign inventory of such a size is known whose signs differ from each other globally, i.e. as unanalyzable wholes.

Words are linguistic signs with an expression and a meaning. For a long time, there have been attempts to reconstruct the first words of humanity. For this to succeed, we would need to know both the meanings and the expressions of the first signs. These are two different tasks.

The methods hitherto employed to establish the expressions of the original words are essentially those of historical-comparative linguistics. In a first step, one uses the reconstructed vocabularies of the protolanguages of all known language phyla. In a second step, one compares elements of these in order to reduce them to common roots that would belong to the protolanguage of mankind. Now given the observable rate of language change and, specifically, of lexical and phonological change, a scientific reconstruction may be able to bridge some 3,000 years. Thus, the protolanguages that can be responsibly reconstructed from historical languages may have been spoken at most some 6,000 years ago. The protolanguage of mankind must have originated about 160,000 years ago. Since then, its vocabulary has been replaced many times in all the languages brought into the comparison. Thus, beyond some general patterns of phonological structure, the expression side of specific first words of mankind cannot be known by scientific method.

The meaning of the original words is a different issue. Given a set of axioms for human language, the basic semantic structure of these words can be deduced with some confidence. All languages have two basic classes of words: lexical items like ‘apple’ and grammatical items like ‘the’. Most grammatical formatives originate in lexical items and acquire their grammatical function by grammaticalization. Formatives like ‘the’, ‘be’ and ‘will’ are therefore not among the first words. There remain a few grammatical formatives which are necessary operators in the primal propositional operations:

For the operation of reference, a word with deictic function, thus with a meaning like ‘that/there’, is needed.
For the operation of denial, a negator, i.e. a word meaning ‘no/not’, is needed.

Such notions were initially coded by gestures and can be so coded to this day. Thus, the first words did not complement gestures, putatively coding what could not be conveyed by gestures – sign languages possess effability! Instead, Homo sapiens transferred existing gestural signs into the new, vocal medium. Further grammatical formatives are created in subsequent phases, together with grammar.

Now as to lexical items, pre-linguistic lore, starting with Genesis 2, 19f and not ending with Plato's Cratylus, has it that the first namegiver gave all things their names. This implies that the first lexical words were nouns. This is not so. At the origin of human language, there are no word-classes to begin with. The first utterances are holophrastic; an elementary sign is, at the same time, a speech act.¹ The first speech acts are of two kinds:

Some have directive illocutionary force. They are utterances conveying ideas like ‘Come!’, ‘Go away!’, ‘Let's go!’, ‘Look!’.
Some are existential sentences with declarative force. They are exclamations conveying ideas like ‘There is fire!’, ‘There is a bear!’

In a further step, the meaning of such signs is reduced to simple notions like ‘come’, ‘go’, ‘look’, ‘fire’ and ‘bear’. This happens when these signs are used in different speech acts and combined with other signs into utterances by primitive grammatical operations.

As to the type of bond between expression and content, the most primordial signs are indexes (Keller 1995). This is true not only of gestures but also of the vocal signs which replace them. Exclamations like ‘aah’ indicating pain are primitive indexical signs. At a more controlled level, the indexes ‘there’ and ‘that’ are among the first words. In further development, indexes play a minor role.

In second place come iconic signs. To the extent possible, the first lexical items, e.g. for notions like ‘bear’, are onomatopoetic. This is because iconic signs do not require a convention to be understood.

Finally, symbolic signs are developed. These presuppose a full-fledged double articulation and then provide the bulk of the lexicon. The first symbolic signs are introduced per ostensionem. Later on, signs may be defined by other signs.

¹ Otto Jespersen (1922, ch. XXI/15) already defended this hypothesis against its earlier misunderstandings.