The Critique of "Language" and "Speech"
In the study of human communication our attention has long been captured by our vocal conduct - talk is a particularly attentionally demanding form of human behavior. Both in our everyday assessment of social interactions (i.e. "having a conversation") and in our theoretical and methodological analysis of the same, vocal conduct takes center stage. Vocal conduct is, however, but one of the activities, no matter how important, going on in social interaction. An analysis of vocal conduct must situate this activity within a context of other events and interactive processes in order to construct an account of its nature and what function it serves in human living.
Limitations of the Linguistic and Sociolinguistic Approach to the Analysis of Vocal Conduct
All too frequently, research proceeds without any explicit attention being given to the presuppositions which provide the initial direction to the research. Investigators proceed on the basis of "what everyone knows," and what everyone knows as concerns vocal conduct is that one begins with the language-speech distinction. These two terms are generally read as referencing different but related domains that may best be analyzed and understood as formal systems of rules which constitute a cultural member's competence or knowledge. These rules may be generative of infinite vocal possibilities; they may represent obligatory and compelling interpretive practices; they may be used as resources by social participants for making judgments and guiding action. However they are conceived, the order and coherence of social discourse is explained by postulating the operation of a pre-existent, finite set of formal rules, rules which pre-specify certain features of, and which assume that certain functions are satisfied by, the ordered vocal output.
These formal rules constitute the participant's "competence." Originally this applied to an individual's linguistic competence and pertained to the study of language as a formal system of rules. What actually came out of a person's mouth was "speech," and this was subject to various circumstantial factors which rendered it something less than an accurate reflection of linguistic competence. Speech therefore pertained to "performance" as opposed to competence (see De Saussure, 1915).
As a result of the recognition that speech is context sensitive, and can be observed to change with changes in context, the original notion of language competence was expanded to include pragmatic rules of speaking for explaining how we are able to produce appropriate vocal behavior in particular situations (e.g. Hymes, 1974; Bates, 1976). It is considered that identification of these rules will explain the orderliness of conversations and how we are able to communicate effectively. On this basis it is held that the individual becomes adaptively fit as a cultural member when these rules are acquired, since they make possible the adjustment and coordination of behavior between members via the exchange of information. This has come to be known as the study of sociolinguistics.
All theories of competence postulate internal mechanisms, cognitive structures that have evolved and develop because of their success in generating orderly, predictable, and therefore adaptive, behaviors. The success of this type of mechanism, of course, depends on at least two individuals sharing the same knowledge. Language competence as an abstract, formal set of rules appears to be so characteristic of human beings that strong biological constraints are generally presumed to account for it. Speech practices on the other hand, since they are culturally varied and appear to be acquired during development, are presumed to have an experiential origin.
Both the linguistic and sociolinguistic perspectives attempt to handle the complexity of human social conduct by asserting the status of "talk" or of "semantics" as an independent level of organization. As a reaction against other approaches which tend to reduce human conduct to physical processes, micro S-R chains, or to the output of an "automaton," the linguistic approach defines a process that is uniquely human and theoretically incapable of reduction to any other process or set of processes. In essence, it is made immune from other aspects of the organism or the organism's surround. A closed level of organization is created, and social interaction is in effect considered to operate independently of, and be uninfluenced by, processes occurring on other levels. An understanding of vocal conduct will supposedly emerge through an analysis that maintains the integrity of talk as talk: "Social behavior is to be explained through the collection and analysis of participant's accounts" (Harre and Secord, 1973, p. 151). Or, if the analyst feels that such after-the-fact accounts by participants are not reliable, he may analyze the features of actual behavior in order to recover the meaning of the situation for the participants. Gumperz and Herasimchuk (1975) write:
...such terms as role, status, social identities, and social relationships will be treated as communicative symbols. They are signalled in the act of speaking and have a function in the communication process akin to that of syntax in the communication of referential meaning. Just as grammatical knowledge enables the speaker to distinguish potentially meaningful sentences from non-sentences, knowledge of the social values associated with the activities, social categories, and social relationships implied in the message is necessary in order to understand the situated meaning of a message, i.e., its interpretation in a particular context (p. 81).
Clearly, the primary function of speaking is limited here to the conveyance of encapsulated meanings.
Researchers also focus on the establishment of procedural rules of conduct, both the developmental acquisition of general conventions and the negotiation of specific rules within a situation. From Elinor O. Keenan (1974):
Two interlocutors who wish to communicate with one another are faced with...a coordination problem. To interact effectively, they need to share not only a linguistic code, but also a code of conduct. That is to say, interlocutors need to establish a loose set of conversational conventions. These conventions establish certain expectations on the part of the speaker and hearer. For example, speaker-hearers may establish speech conventions concerning turn-taking, points of interruption, audibility (p. 165).
In sum, interactive coordination is considered to depend on establishing shared expectations or a shared understanding regarding the appropriate "moves" to be taken in the exchange. This "structure" is necessary for successful exchange of meanings via the linguistic code. Although the shared understandings in specific situations are in a sense "mutually accomplished," this approach is still an attempt to explain the orderliness of social encounters by locating its origin within individuals as their "pragmatic competence." The context is once more treated as secondary in that the same form of cognition operates on all situations and subjects all contextual elements to the same set of transformations.
From the present perspective of situated activity, a purely individualistic account of rule-governed or semantically generated behavior is necessarily reductionistic. In seeking to avoid other forms of reductionism, the linguistic and sociolinguistic approaches fall prey to a particular reductionism of their own. By uncritically allowing terms such as "language", "talk", or "meaning" to define independent phenomena of investigation, analytic boundary conditions are drawn around the individual and the individual's conduct. The individual's space-time locus is relatively unimportant, and is certainly not constitutive of the individual's performance. The reason that such boundary conditions are drawn around the individual seems to be due to three interrelated factors: a taken-for-granted cognitivism, an ideal notion of competence, and an idealized "etiquette" or normative view of a successful/valid vocal interaction. As a result, the production of order often becomes essentially a matter of the participants following internal instructions, albeit adjusted to the "particularities" of the situation as processed information. Through its emphasis upon internal rules and meanings, vocal behavior is decontextualized and separated from its various functional roles in human living.
The work of producing "talk" is seen to be done entirely "by" the participants via the use of rules. Therefore the question of how much support is present from the surround is simply not considered. The relation between the individual's behavioral performance and the context is mediated by cognitive operations and the processing of information. The context is significant mainly as information for the individual to process (see Bates, 1976, on "contextual-dependent rules"), or as information that allows the analyst to "repair" an utterance's indexical character and accurately assess what interpretations are being made by the participants (e.g., see Gumperz and Herasimchuk, 1975, p. 85; Keenan and Schieffelin, 1976, p. 345). Most approaches tend to follow Gumperz and Herasimchuk in reducing roles, status, social identities, and social relationships to communicative symbols, thereby linking interactants to the context via a theory of signs. An interesting consequence of this "cognitivization" of context is that the appropriate use of rules by the participants and their recovery by the analyst requires an "accurate" knowledge of contextual conditions (as information) (Labov and Fanshel, 1977, p. 73). Hence the desirability of having "objective" videotaped records to assist analysis. It is not clear what meaning "accurate" can have from the epistemological position outlined earlier. How much information is considered "accurate" if the information is an output of the coupling between analyst and material or the participants and their immediate situation? How accurate does the information have to be to guarantee closure, which is presumably required in order to act? It appears within this perspective that the participants in an encounter are themselves engaged in a process of "discovery" in order to warrant their interpretation.
They have, supposedly, the exact same resources available in the form of criteria for establishing the socially warranted "empirical" truth of their interpretations as has the analyst. Though meaning is continually emerging out of the situation, it is an emergence of discovery and not of construction.
That the situated, contextual character of vocal behavior is not seriously confronted is most clearly evident in the practice of extracting brief samples from a number of exchanges for identifying the cross-situational occurrence of resources. The significance of the selected features is given in terms of the ideal form of competence they are presumed to demonstrate. For example, in a volume on discourse analysis of children's speech (Ervin-Tripp and Mitchell-Kernan, 1977), all studies present excerpts from ongoing interactions. Information about the historical relationship between the participants, the interactive structure from which the samples were chosen, or the temporal location of the selection within the interaction, is seldom recoverable. When it is given, it is reduced to the status of cognitive information that directly determines lexical and paralinguistic choice, rather than approached in an open-ended manner as indexing possible constraints on the structural character of the encounter. Other kinds of contextual information regarding ecological features, kinesics, pauses, laughter, and prosodic and intonational features are sometimes provided and sometimes not, depending on what form of competence is being discussed. By selecting examples in this manner the idea of "competence" becomes self-validating. Only those features are chosen for display that demonstrate its presence. The lack of other information makes it impossible to formulate alternative accounts. In addition, resources are abstracted from their occurrence in particular situations and given the same functional significance wherever they occur. Discovering order in the material then involves comparing it to a large body of observations of repeated patterns.
By doing this, a particular generalized "reading" of numerous encounters is imposed upon the immediate one, warranted by "empirical", normative findings that hold across situations. Contextual information pertaining to the use of resources in managing the unique particularities of the immediate situation is not employed, regardless of how complete the original record of the encounter.
In the attempt to elicit the competence that underlies vocal production, certain contexts are often considered to give a more "true" picture than others. Labov (1972) provides an interesting example of a black child who responds minimally to an adult, white male researcher but exhibits complex vocal abilities when with peers. This is often interpreted entirely as an analyst's issue of how to create conditions that will elicit maximum performance. As such, the interactive context is treated as something to be overcome. Alternatively, if adaptive functioning is the primary issue, a child's "competence" can only be assessed by witnessing performance across a range of contexts. If context is partially constitutive of performance, then it is not clear what it means to say that the child had resources available that were just not used. Instead, an open, contextual perspective must make the assumption that the child had no other resources available for managing the particular situation than those employed. Thus, the differences in speech production point to the manner in which context is constitutive of speech, and therefore ought to be seen as evidence demanding that the context be taken seriously.
A similar non-concern for taking the context seriously is found in the use of role-playing situations. For example, Brenneis and Lein (1977) had children role play disputes, and Mitchell-Kernan and Kernan (1977) had children role play in the use of directives. If the focus is on an underlying competence then a role-play is as good as a "real" situation. In fact, when the object of study is an emically defined speech event the role-play may actually be "better", since it will more directly get at the participant's knowledge of how such an event is constructed. On the other hand, though it may index an individual's order sensitivity (to surface features, at least), its significance in that type of situation must be viewed as qualitatively different than in other contexts. All contexts are equally "real", and equally deserving of analytic attention.
The establishment of "talk" as an independent level of organization also has two implications specifically concerning the functional analysis of vocal behavior, which also contribute to its decontextualization. First, it imposes a priori restrictions on the range and types of functions vocal behavior may serve in social interaction. Second, it establishes "talk" as the primary resource employed by participants.
The extended quote by Gumperz and Herasimchuk at the beginning of this paper expresses the prevailing view regarding the basic function of speech. It is presumed to be adaptively significant because it makes possible the conveyance of meaning. From this primary function is derived its value in coordinating social conduct. If the primary function is satisfied so is the second. This characterizes almost all functional investigations in the child discourse literature. Vocal conduct is "adaptive" if it succeeds in conveying thoughts, intentions, meaning. It is for this reason that the "rules of conduct" discussed by Keenan in her quote at the beginning of this paper are considered to be so important. They are meant to be that which assists participants in constructing a regularized conversational structure that will provide proper conditions for the free exchange of meaning via the "linguistic code". They concern such activities as "breaking and entering", "securing the attention of the other", "articulating utterances for the listener", "identifying referents in discourse topic", "locating a referent in physical space", etc. "Children must develop means to accomplish each of these steps, if they are both to contribute to, and sustain, a coherent discourse" (Keenan and Schieffelin, 1976, p. 378).
One must consider that this approach takes something that is at best (if it makes sense at all) an occasional characteristic of social interaction, "coherent discourse", and sets it up as the goal of the entire enterprise. Instead of being a candidate as one possible form of exchange between participants it is taken to be the ideal form, defined according to a preconceived notion regarding the basic function of language. As a result, the range of functions served by all interactive behavior is limited to the construction of "talk" or conveying meaning, and the models employed do not allow us to consider other possible functions, except as secondary derivations. The approach also treats interactants as if they were in only partial contact with the world, in the sense that it reduces behavioral resources to a very narrow range. The rules or interpretive processes operate within a narrow time domain with regard to the "total" performance, and thus the activities are not regarded as part of more micro and macro temporal processes on which their operation may depend.
Again, this may be seen as a consequence of the too precipitous imposition of analytic boundary conditions around the domain of study. Taking one form of exchange as the ideal form defines a domain of successful, valid, or adaptive exchanges and functions. Anything outside that domain is a priori considered defective, maladaptive, or a breakdown requiring "repair". Thus the categories of the valid and the defective are read into the data and analysts only find what they are looking for. This leads to another form of decontextualizing vocal behavior. By limiting the range of functions vocal behavior appears to serve, it is treated as if it were separated from human needs, problems, and concerns. It is a step removed from the functional requirements it presumably satisfies in our lives, these other functions being satisfied only if meaning is successfully conveyed. The alternative, a heuristic approach, suggests that we withhold the imposition of such boundary conditions. In doing this the analyst remains open to the multi-functional possibilities of vocal behavior.