Kinds of data

The phenomena that provide the heuristic point of departure for a discipline are only its ultimate substrate, which is not itself processed by the scientist (though it may be by the practitioner). The data of a discipline are always representations of such phenomena. That is, they are essentially third-order entities in the ontology of naive realism. Thus, the ultimate substrate of linguistics is the totality of actual linguistic activity occurring in the world. Linguistic data are recordings and representations of a set of such phenomena. Linguistic data may be classified by several parameters:

This categorization of linguistic data types is visualized in the following schema:

Types of linguistic data (from Lehmann 2004, § 3.4)

To give some examples:

For more discussion, see Lehmann 2004 [Data].