There is no language instinct
University of Sussex
This paper is an updated version of what began as an invited talk to the Third International Conference on the Evolution of Language (EVOLANG-2000) at the Ecole Nationale Supérieure des Télécommunications, Paris, April 2000.
Steven Pinker’s The Language Instinct persuaded many readers that much of the complex structure of human language is encoded in the human genetic inheritance. I systematically analyse this body of argument. Some lines of argument were first developed 40 to 50 years ago by Noam Chomsky and others; other arguments were original with Pinker; and in the 21st century Chomsky has developed his ideas further. But all the arguments fail. They are based on false premisses, or they embody logical fallacies (or both). There is no reason to doubt that languages are wholly learned cultural creations.
Steven Pinker’s book The Language Instinct (Pinker 1994) has achieved great influence, with good reason: it is superbly well written. Pinker convinced a wide readership that many detailed structural properties of human language are encoded in our genes – they are an “instinct”. However, skilled writing does not in itself make an argument correct. This paper aims to show that Pinker is mistaken.
Pinker’s arguments are empirical – they are based on observational evidence. He must therefore concede that a contrary view is equally logically admissible. (If Pinker’s linguistic nativism were a logical necessity, observational evidence would be irrelevant.) I shall argue that languages are purely cultural artefacts.
I shall identify each strand of argument which Pinker either states explicitly or draws on through allusions to earlier literature, and show that in each instance the argument is based on false premisses, or logically self-refuting (or both).
Within the limits of this paper I can give only a skeleton version of the case. It is spelled out fully in my book The “Language Instinct” Debate (Sampson 2005), which covers many detailed considerations that must be omitted here. And see also, now, The Linguistics Delusion (Sampson 2017).
Pinker’s thesis draws on several groups of arguments, some of which have been current for several decades while others are new:
1 Pinker frequently relied on arguments developed by Noam Chomsky in the 1960s and 1970s relating to grammatical structure
2 Others at the same period developed nativist arguments based on separate categories of data:
2.a categorical perception of speech sound
2.b vocal tract shape
2.c colour vocabulary
3 Pinker put forward new arguments for linguistic nativism, not widely aired before
4 Since about 1990 other writers have developed further arguments, relating to:
4.a discontinuity between “protolanguage” and true language
4.b sign language
5 In the 21st century, Chomsky has developed a new argument of his own – the “snowflake” argument.
I shall examine and refute these various arguments successively.
My strategy is frankly negative rather than positive: I aim to overturn a widely-held view of the human language capacity, rather than to explore an alternative in detail. The “Language Instinct” Debate, pp. 149–51, does include positive argument for the belief that languages are wholly culturally-evolved and individually-learned systems of behaviour, but this material does not form a major part of that book and I shall not discuss it here.
Logically speaking, the fact that all current arguments for a proposition are fallacious does not entail the falsity of the proposition. But, if my criticisms of Pinker’s and others’ arguments are justified, realistically it is grossly implausible that their conclusions are true. When intelligent people jointly try over several decades to persuade the public to believe a novel idea, and every argument fails, then in practice (though not in logic) one is entitled to infer that the idea is false. If it were true, better arguments would be available; why has none of these able people managed to formulate one of them?
1 Chomsky’s original arguments
These are of two kinds:
1.a logical arguments in the standard sense, proceeding from observations to conclusions
1.b rhetorical moves based on counterfactual “idealizations”
1.a Chomsky’s “arguments proper”
Having combed through Chomsky’s writings from about 1960 to 1980 – the period when he was reaching out and making converts outside the technical discipline of theoretical linguistics – I find that his significant arguments reduce to five:
1.a.i speed of language acquisition
1.a.ii age-dependence
1.a.iii poverty of data
1.a.iv convergence among individuals
1.a.v language universals
1.a.i Speed of language acquisition
This argument, briefly, is that children pick up their mother tongue so fast that they must “know” a lot to begin with. The argument comes in two varieties. Variant (I) claims that language-acquisition is absolutely fast:
Mere exposure to the language, for a remarkably short period, seems to be all that the normal child requires … (Chomsky 1962: 529)
Variant (II) claims that language-acquisition is relatively fast, compared with acquisition of other bodies of knowledge:
Grammar … [is] acquired … effortlessly, rapidly … Knowledge of physics, on the other hand, is acquired … through generations of labor (Chomsky 1976: 144)
Variant (I) of the argument works only if we can judge how long language-acquisition would take, without innate knowledge. Unless we can identify some rough figure, significantly larger than the time children actually do take, then (I) is vacuous. Chomsky and Pinker nowhere offer any basis for an estimate: would language-acquisition from scratch take ten years? fifty years? Variant (I) is empty: we have no grounds for describing the time taken by normal children (say, on the order of three years) as “remarkably short” rather than “what we would expect”.
Variant (II) fails to compare like with like. Explicit knowledge of physics is analogous to the knowledge about language structures set out in the academic literature of linguistics. Both of these kinds of knowledge have developed slowly, and are acquired only by small minorities. The ordinary person’s ability to speak competently is analogous to his ability to conform to the laws of physics in doing things like pouring water into a jug, or riding a bicycle. Both of these sets of skills develop with little or no formal, explicit instruction, and it is not obvious that the physical skills take notably more time or effort than speech. (Of course, cycling has to wait until the child has a bicycle and is physically large enough to use it.)
1.a.ii Age-dependence
Chomsky claims that human language-acquisition ability is governed by a biological clock which causes the ability to diminish sharply “… at a relatively fixed age, apparently by puberty or somewhat earlier” (Piattelli-Palmarini 1980: 37). He relies in making this claim on Eric Lenneberg (1967).
We must distinguish two cases: acquisition of a first language, and acquisition of a second or subsequent language.
So far as L2 learning is concerned, Lenneberg quotes no relevant evidence, and the point does not seem to be true: plenty of adult immigrants acquire the language of their adopted society rapidly and well (though schoolchildren who have little motivation to study a foreign language usually achieve far less than native mastery). There may be evidence that even successful adult L2 learners progress more slowly than children exposed to a second language in their early years; but it is uncontroversial that learning in general tends to slow with age. Lenneberg gives no evidence that language-learning is “special”.
As for L1 learning: a person who is first exposed to language at puberty or after must be a very exceptional case, given the way that children are normally reared. The case commonly discussed is “Genie”, the daughter of an insane father who isolated her from normal human interactions until she was discovered by the authorities at age 13. But the scholar who documented the Genie case, Susan Curtiss, saw her as refuting the claim that “natural language acquisition cannot occur after puberty” (Curtiss 1977: 209). (Curtiss later wrote about Genie in ways that were inconsistent with her 1977 book, but no fresh evidence was available to her and no explanation for the volte-face was given. See Jones 1995.) Genie certainly did not progress as a language-learner as successfully as an infant, but she also had difficulty in acquiring other social skills, which is not surprising in view of the psychological trauma implied by such an appalling childhood.
1.a.iii Poverty of data
The argument here is that the data available to a child through observation of elders’ speech are not adequate for learning the language successfully.
Again the argument has two variants.
Variant (I) claims that the speech heard by a child is of poor quality, containing slips of the tongue, incomplete utterances, and so forth:
… much of the actual speech observed consists of fragments and deviant expressions of a variety of sorts (Chomsky 1965: 201)
Variant (II) argues that the child’s language experience will typically include no evidence bearing on specific features which children nevertheless succeed in mastering. Chomsky always, to my knowledge, uses the same example, based on the English question rule. A yes/no question in English is formed from the corresponding statement by operating on a verb: in the simplest case, by moving the verb to the beginning of the sentence. If the sentence contains multiple clauses, a learner must select among alternative hypotheses about which verb to move. The correct rule is to move the verb of the main clause, but an alternative hypothesis would be to move the first verb. That is, from the statement:
the man who is tall is sad
the correct question rule forms:
is … the man who is tall __ sad?
while the alternative hypothesis would give the nonsense-sequence:
is … the man who __ tall is sad?
According to Chomsky, the average child could not choose between these hypotheses by observation, because relevant examples
rarely arise; you can easily live your whole life without ever producing a relevant example … you can go over a vast amount of data of experience without ever finding such a case. (Piattelli-Palmarini 1980: 114–15)
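The difference between the two hypothetical rules can be made concrete in a short sketch. The Python below is purely illustrative: the token list, the hand-marked position of the main-clause verb, and the function names are assumptions introduced here, and nothing in it models how a child (or a parser) would identify the main clause.

```python
# Illustrative only: the token list and the hand-marked index of the
# main-clause verb are assumptions of this sketch, not the output of a parser.
STATEMENT = ["the", "man", "who", "is", "tall", "is", "sad"]
MAIN_VERB_INDEX = 5  # the second "is", i.e. the verb of the main clause


def front_verb(tokens, verb_index):
    """Form a yes/no question by moving the verb at verb_index to the front."""
    rest = tokens[:verb_index] + tokens[verb_index + 1:]
    return [tokens[verb_index]] + rest


def first_verb_rule(tokens, verbs=("is",)):
    """Structure-independent hypothesis: move the first verb in linear order."""
    first = next(i for i, word in enumerate(tokens) if word in verbs)
    return front_verb(tokens, first)


def main_clause_rule(tokens, main_verb_index):
    """Structure-dependent hypothesis: move the verb of the main clause."""
    return front_verb(tokens, main_verb_index)


correct = " ".join(main_clause_rule(STATEMENT, MAIN_VERB_INDEX)) + "?"
wrong = " ".join(first_verb_rule(STATEMENT)) + "?"
print(correct)  # is the man who is tall sad?
print(wrong)    # is the man who tall is sad?
```

For a single-clause statement such as the man is sad, the two rules yield the same question; only statements containing more than one verb can decide between the hypotheses.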
But, in answer to variant I: when people began to look at the language used to small children, they found that its quality was far better than Chomsky imagined: “the speech of mothers to children is unswervingly well formed” (Newport et al. 1977: 121). As for Chomsky’s extravagant claims (in variant II) about the rarity of utterances distinguishing between the alternative English question rules, the truth is that relevant examples are common. Geoffrey Pullum and Barbara Scholz (2002: 45) quoted evidence suggesting that a child is likely to hear during its first three years thousands of questions of the kind that Chomsky believes “you can easily live your whole life” without using. (I reinforced Pullum & Scholz’s argument by searching a corpus of spontaneous spoken English that is more representative than those available to Pullum & Scholz: Sampson 2005: 77–89.)
Furthermore, there seems to be a circularity in the argument (under either variant). If we agree that a language has some property P, which children belonging to the language community all manage to learn, then we must have encountered evidence that the language has property P – unless one assumes that we know things about the language without encountering evidence, which is what the nativists are trying to prove.
1.a.iv Convergence among individuals
This argument says that individuals growing up in a language community each encounter different finite samples, yet they all acquire essentially the same language – which the nativists find inexplicable if a child’s experience is the only information available to him:
every child … acquires knowledge of his language, and the knowledge acquired is, to a very good approximation, identical to that acquired by others on the basis of their equally limited … experience (Chomsky 1975: 30)
Again there were at one time two variants of this argument, though one of these has been abandoned. Variant I claimed that individuals of widely different intelligence and education mastered their mother tongue to closely similar levels of skill; Chomsky eventually withdrew this claim, because he realized that it is observably false (Piattelli-Palmarini 1980: 175–6). However, Chomsky and other nativists continued to maintain variant II, the claim that speakers of a language coincide surprisingly well in the structural properties of their language models.
If English-speaking linguists regularly agreed with one another about the status of particular strings of words, this argument might seem to have some substance. In reality, disagreements along the lines “This is a good sentence for me” – “I couldn’t possibly say it” are incessant. But that is a side-issue, because again there is a logical fallacy in the variant II argument. The obvious response to a claim that speakers A, B, C, etc. have highly similar grammars is “How do you know?” To test the claim fairly, one would need to construct and compare separate grammars of A’s language, B’s language, … Yet Chomsky says that the empirical data are inadequate for constructing one grammar; so how could anyone construct grammars for numerous idiolects? (The grammars of A’s language, B’s language, etc. would need to be based purely on observed data, to avoid the danger of contamination by prior assumptions or hypothetical innate language knowledge, which might make the grammars more similar than they would otherwise be and thus destroy the fairness of the test.) Variant II is self-refuting.
1.a.v Language universals
Chomsky and other linguistic nativists have claimed that all human languages share specific structural properties which are not logically necessary (they are not part of what we mean by calling something a “language”), and which have no functional explanation. An example would be the “structure dependence” of transformational rules, such as the English question-forming rule discussed above:
it is natural to postulate that the idea of “structure-dependent operations” is part of the innate schematism applied by the mind to the data of experience (Chomsky 1972: 30)
When writing The “Language Instinct” Debate, I found this argument stronger than the nativists’ other arguments. I met it by developing an alternative explanation for the universals. Herbert Simon (1969: ch. 4) showed that, for purely formal, statistical reasons, complex systems which emerge from gradual evolutionary processes will have a certain, hierarchical type of structure. Simon’s theory is particularly applicable to cultural evolution (it works in that domain better than in the domain of biological evolution); and I argued that, if present-day human languages were the outcome of cultural evolution, Simon predicts that they should share very much the structural properties which Chomsky identified.
However, it later emerged that this part of my case may be redundant. The argument from universals has force only if languages do share non-trivial properties. Chomsky and his followers have modified their theories of language structure extensively down the decades. Under their recent, “Minimalist” theory, it is not clear that there are any language universals left. Culicover (1999: 137–8) pointed out that this version of generative theory has rejected all of the surprising, non-trivial structural universals which formed important parts of earlier versions.
Thus, by moving to the Minimalist Theory (which they presumably found more descriptively adequate), generative linguists undercut their best argument for nativism.
1.b Counterfactual idealizations
Part of the force of Chomsky’s nativist case stemmed not from his arguments, but from counterfactual “simplifying assumptions” he chose to impose on the discussion.
Chomsky often urged that it is both harmless and necessary, in analysing complex empirical phenomena, to abstract away from many of their complexities:
Opposition to idealization is simply objection to rationality. … you must eliminate those factors which are not pertinent … if you want to conduct an investigation which is not trivial. (Chomsky 1979: 57)
Chomsky is right to say that non-trivial scientific investigation requires one to simplify the domain of study. But if a scientific debate involves abstractions which not merely omit true-but-irrelevant considerations, but also introduce assumptions that are counterfactual (i.e. false) – as Chomsky explicitly does – then this is harmless only if the false assumptions are compatible with both sides of the debate. Consistently, Chomsky’s simplifying assumptions are compatible with his theory but not with that of his opponents. That is, the assumptions are question-begging.
The obvious alternative to linguistic nativism is the idea that children learn their first language through a process similar to the process of scientific advance, as described by Sir Karl Popper (e.g. 1963). The child formulates hypotheses to account for small-scale observed regularities, tests them against further experience, abandons those hypotheses which are refuted, and builds on the unrefuted hypotheses by formulating higher-level, more inclusive conjectures – so that he gradually builds up a model of the language, starting with simple features, and moving on to its large-scale architecture.
Chomsky makes three kinds of counterfactual simplifying assumption:
1.b.i instantaneous acquisition
1.b.ii steady state
1.b.iii speaker perfection
We shall see that each of them is incompatible with the Popperian scenario.
1.b.i Instantaneous acquisition
Chomsky conceptualizes language acquisition as “an instantaneous process” (Chomsky 1976: 51): he sees no harm in discussing language-acquisition as if children moved from their complete data-set to their mature grammar in one fell swoop. But Popperian learning through conjectures and refutations is crucially gradual rather than instantaneous.
1.b.ii Steady state
Chomsky describes language acquisition as a process terminating in
a “steady state” attained fairly early in life and not changing in significant respects from that point on (Chomsky 1976: 119)
There is in fact ambiguity in Chomsky’s writings about whether he sees the steady-state assumption as a harmless counterfactual idealization, or believes that it is actually true.
But Popperian learning (in any domain) is an “unended quest” which never terminates. The idea that we might cease to deepen our knowledge of our mother tongue after childhood was explicitly rejected by eminent linguists before Chomsky; if Chomsky does believe it, he offers no evidence.
1.b.iii Speaker perfection
According to Chomsky,
Linguistic theory is concerned primarily with an ideal speaker-listener, in a completely homogeneous speech community, who knows its language perfectly … (Chomsky 1965: 3)
Chomsky describes this idealization as being “of critical importance” (1980: 24–5), although he recognizes that such speakers and speech communities do not exist.
Again the idealization excludes the Popperian account of language acquisition. For Popper, learning (in any domain) crucially involves making mistakes – knowledge advances by trying conjectures all of which are fallible and many of which will be wrong; individuals reach different provisional structures of belief. For Popperians, an individual never has perfect knowledge, and a community of individuals who learn a topic independently will not be homogeneous in their structures of knowledge about it.
Thus Chomsky’s “innocent” counterfactual assumptions are assumptions that eliminate a plausible alternative theory, without taking the trouble to construct arguments against it.
2 Arguments by others in the 1960s–1970s
While Chomsky was winning converts to linguistic nativism through his rather formal, abstract arguments, his case was reinforced by others who quoted more concrete evidence that seemed to point in the same direction.
2.a Categorical perception of speech sound
Alvin Liberman and colleagues (Liberman et al. 1967) noted that parameters of speech sound which, physically, are continuous clines are perceived by humans in a yes-or-no fashion. For instance, pairs of voiced and voiceless consonants, such as /d/ and /t/, are distinguished by voice-onset timing. Time is a continuous variable; but, if artificial stimuli are created in which voice-onset timing varies by small steps, hearers do not perceive them as differing along a cline. Stimuli with voice-onset timing below a critical value are all heard as one and the same sound; stimuli with voice-onset timing above that value are likewise heard as the same as one another, and as different from the former stimuli.
Categorical perception looks like a biological mechanism tightly coupled to language. It is useful to hear sharp contrasts between phonemes, but there is no obvious non-linguistic value in categorical perception.
It turns out, though, that other species (e.g. chinchillas (Kuhl & Miller 1975), crickets (Wyttenbach et al. 1996)) have very similar categorical discrimination functions. We do not know why biological evolution has equipped crickets or chinchillas with categorical perception of acoustic parameters such as voice-onset timing; but it certainly was not so that they could distinguish word pairs like dip and tip. Evidently this is a case where humans have exploited, for language purposes, a feature of our hearing mechanism which had already evolved to serve some other, non-linguistic function(s).
2.b Vocal tract shape
Philip Lieberman (Lieberman et al. 1972) argued that the vocal tract of modern Man is significantly different from that of Neanderthal Man and the chimpanzee. The difference allows modern Man to produce a greater variety of speech sounds, though it is otherwise disadvantageous (it allows death by choking). Thus, again, it seemed that Man has a biological preadaptation for language.
But John Ohala (1994) pointed out, first, that much of what Lieberman describes applies only to adult males, not to females or children; yet women and children also speak. He further argued that explanations referring to language are redundant. The same anatomical feature makes men’s voices deep in pitch, which is advantageous for defence (many species recognize that deep notes come from big things). Similar developments, serving that same function, have occurred in distant species. In later writing, Lieberman (2000) withdrew the support which his earlier work had lent to linguistic nativism, although that work gained many converts to the belief in its day.
2.c Colour vocabulary
A natural corollary of the idea that language structure is innate would be that the mind might impose conceptual categories, so that there would be more in common among the vocabularies of the world’s languages than could be explained by reference to the things to which words apply.
Brent Berlin and Paul Kay (1969) impressed many readers by claiming to discover common structure in colour vocabularies. Physically, colour is a property determined by wavelength and other continuous parameters of light. It had long been notorious that colour words of different languages could not be neatly mapped onto one another. But Berlin and Kay claimed that there is a common system underlying this diversity. Although boundaries between colours vary from language to language, languages agree on the points in colour space which are “focal” (the “best” examples of different colour words); and these focal colours form a universal implicational hierarchy:
[white, black] < [red] < [green, yellow] < [blue] < [brown] < [purple, pink, orange, grey]
Any language has colour terms for a continuous segment of this hierarchy, starting from the left (i.e., if a language has just three colour words, they will translate as white, black, and red).
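Stated in this way, the Berlin–Kay claim is a checkable constraint on colour vocabularies. The following is a minimal sketch in Python, assuming the ordering usually cited from Berlin and Kay (1969) and treating the terms within one group as unordered; the names HIERARCHY and conforms are illustrative only, and words outside the listed terms are simply ignored.

```python
# Assumed ordering: the sequence usually cited from Berlin and Kay (1969).
# Terms inside one group are treated as unordered relative to each other.
HIERARCHY = [
    {"white", "black"},
    {"red"},
    {"green", "yellow"},
    {"blue"},
    {"brown"},
    {"purple", "pink", "orange", "grey"},
]


def conforms(vocabulary):
    """True if any term present implies all terms of every earlier group."""
    vocabulary = set(vocabulary)
    required = set()  # all terms from groups to the left of the current one
    for group in HIERARCHY:
        if vocabulary & group and not required <= vocabulary:
            return False
        required |= group
    return True


print(conforms({"white", "black", "red"}))   # True
print(conforms({"white", "black", "blue"}))  # False
```

A three-term vocabulary of white, black and red satisfies the constraint, whereas white, black and blue does not, because a term for blue would imply terms for red, green and yellow.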
But, first, Berlin and Kay’s cross-linguistic survey is methodologically suspect. They taught a course on the topic and invited students to write term papers on different languages; much of the book comprises summaries of research by students of very uneven ability. For instance, whoever took Homeric Greek somehow failed to notice the word melas, “black” (although this is by far the most frequent colour word in Homer). Because Berlin and Kay required every language to have a word for “black”, the error was compounded by an unsupported claim that glaukos, normally translated as something like “silvery blue-green”, probably meant “black” for Homer. Thus confirmation of the Berlin-Kay theory was produced at the cost of severely misrepresenting the language; other examples could be quoted.
Furthermore, as Collier (1973) pointed out, Berlin and Kay’s implicational hierarchy merely reflects, quite accurately, the points in colour space where human visual sensitivity is greatest. The colour space is a featureless continuum in terms of the physics of light, but the mechanisms of human eyesight make some colours much more vivid for us than others. The reason why any language which has colour terms other than “white” and “black” has a word for “red” in particular is that our eyes are more sensitive to red than to any other hue.
So there are biological universals in this area, but they are not linguistic universals. There is nothing controversial in recognizing that the human visual system is genetically determined.
3 Pinker’s own arguments
Pinker introduced many new lines of argument:
3.a “language mutants”
3.b irregular inflexions
3.c avoidance of “tempting” errors
3.d late learning of non-linguistic skills
3.e fewness of grammar-relevant factors
3.f cultures where parents don’t talk to children
3.h L2-learning success not predicted by motivation
3.a “Language mutants”
Pinker publicized research by Myrna Gopnik (e.g. 1990), who located a multi-generation family, many members of which are alleged to share a congenital disability specific to grammar. Gopnik, and Pinker, claim that the affected individuals cannot generalize from particular inflected forms (e.g. watched, dressed, washed) to general rules (such as the rule for forming past tenses). Pinker further claimed that “Most of the [affected individuals] were average in intelligence” (1994: 324).
An inherited disability affecting just language structure might be evidence for the relevant aspects of language structure being innate in normal individuals. But this family was studied in greater depth by another team, who found (Vargha-Khadem 1995) that Gopnik’s and Pinker’s claims do not stand up:
(i) the average IQ scores of the affected individuals were very low (Pinker did not source his statement about “average” intelligence);
(ii) many other skills, including non-linguistic skills, were affected;
(iii) affected individuals, claimed by Gopnik and Pinker to be unable to apply general rules, were observed repeatedly to produce “over-generalizations”. An over-generalization in this context means treating an irregular root as if it were regular, for instance saying bringed instead of brought. Someone who produces an over-generalization must be applying a general rule; he cannot be merely imitating a competent speaker’s utterance, because competent speakers do not utter such forms.
So the family documented by Gopnik, while undoubtedly suffering from an inherited mental disability, are not “language mutants”.
3.b Irregular inflexions
Pinker asserted that there are hidden regularities in word formation; as he puts it (1994: 146), “children’s minds seem to be designed with the logic of word structure built in”.
Probably the most striking example is the case of “headless” compounds based on irregular nouns, such as sabre-tooth from tooth. Tooth is irregular, because its plural is teeth. The compound sabre-tooth is called “headless”, because it denotes a kind of tiger, not a kind of tooth. According to Pinker, children instinctively know that plurals of headless compounds are regular, even when the root is irregular: everyone says sabre-tooths, not sabre-teeth, for “sabre-tooth tigers”.
The trouble is that, as a general rule, they don’t. I checked headless compounds based on foot, and quickly found examples of pinkfoot geese being called pinkfeet, and Blackfoot Indians being called Blackfeet. According to Pinker, everyone should instinctively say pinkfoots and Blackfoots.
Pinker has responded to my objection by suggesting (1999: 171–2) that my examples may be compatible with his theory. Some speakers may say pinkfeet or Blackfeet because, for those speakers, the compounds are not “headless” – they may actually think of the birds and Indians as feet rather than complete organisms.
I find this incredible. But, even if my reaction is unjustified, Pinker’s suggestion reduces his theory to circularity. Pinker predicts that the way people form plurals depends on how they think about the entities denoted – but the only way we can tell how they think is by seeing how they form the plurals.
3.c Avoidance of “tempting” errors
Pinker claims that young children, whose language skills are still imperfect, systematically avoid mistakes which we should expect them to make, if their only information came from observation.
For instance, one might expect children to inflect modal verbs like ordinary verbs, and say things like *he cans go, on the analogy:
I like going : he likes going :: I can go : ??
But Pinker (1994: 272) quotes research by Karin Stromswold which found no evidence of such errors in a large body of child speech, containing many opportunities for such mistakes.
Here, we must ask what the hypothetical innate language knowledge is, which (according to Pinker) tells children that *he cans go is impossible. Perhaps “Verbs like can never inflect for subject agreement”? But, in the sixteenth century, they did: Elizabethans said Thou canst go, not *Thou can go (and cf. modern German du kannst, er kann, sie können). The concept of innate linguistic knowledge which is specific to 21st-century English makes no sense.
If Karin Stromswold’s data show that children make fewer mistakes with modal verbs than with other aspects of grammar, one plausible hypothesis is that children get more and earlier experience of modal verbs. Pinker does not consider that hypothesis.
3.d Late learning of non-linguistic skills
Pinker notes that children acquire their mother tongue earlier than other skills which, on the face of it, look simpler. A three-year-old who speaks fluently may “be flummoxed by no-brainer tasks like sorting beads in order of size, [etc.]” (Pinker 1994: 276). Pinker takes this to show that language is built-in.
In some abstract, formal sense, sorting beads in order of size is doubtless simpler than speaking a language. But complexity is not the only variable relevant to learning priorities. Motivation is surely also crucial. What would a three-year-old’s motive be for putting effort into learning to sort beads (or various other skills listed by Pinker as late-acquired)? Very weak, one might think, relative to a child’s motive for communicating with the adults who constitute his social world.
3.e Fewness of grammar-relevant factors
Pinker argues that only a few factors are ever relevant to grammar rules, in any language; “a noun’s inflection might depend on whether it is in subject or object position” (1994: 288), but on few other things. Children must know this before they start to learn their mother tongue; otherwise,
the task of learning inflections would be intractable – logically speaking, an inflection could depend on whether the third word … referred to a reddish or a bluish object … whether the sentence was being uttered indoors or outdoors, and billions of other fruitless possibilities (ibid.)
The truth is, though, that grammar rules depend on factors far more diverse than Pinker imagined. In Biblical Hebrew, the factor of whether the preceding word happens to be “and” flips the interpretation of verb forms from past to future and vice versa. In many Australian languages, the rules are extensively affected by whether or not the speaker’s mother-in-law is present.
Why is “indoors v. outdoors” different from “mother-in-law v. no mother-in-law”? If Australia had remained undiscovered, Pinker might assert with equal confidence that children instinctively know that “mother-in-law languages” are impossible – but he would be wrong.
3.f Cultures where parents don’t talk to children
Within middle-class British and American societies, young children are sometimes described as getting systematic language tuition. Mothers are said by some researchers to talk to their children in “Motherese” – language carefully graded to meet the child’s language-learning needs.
In these circumstances, perhaps children’s success at L1 acquisition is not too remarkable. But, Pinker said, there are many societies which make things far less easy. Parents hardly speak to their children – yet children succeed in language-acquisition, using the fragmentary data encountered through “overhearing adults and other children” (Pinker 1994: 40). How could this be, unless children have information independent of experience?
However, although Pinker repeatedly described this pattern as common, he quoted only one example: a rural Black community in South Carolina documented by Shirley Brice Heath (1983). And the language experience described by Heath is not really as impoverished as Pinker suggested. Young children are spoken to, though more by older children and less by adults than in societies familiar to me; they are present at occasions where plenty of talking goes on. Why should it matter whether mothers, or other individuals, provide a child’s main experience of language? What matters is that the child gets the experience.
We saw that the “Genie” case proves little. Apart from the fact that Genie did demonstrate some ability to acquire English, the psychological scars of her dreadful childhood might explain any subsequent cognitive disabilities. Pinker accepts this.
However, he quoted a later case, “Chelsea”, who was born deaf, and provided with hearing aids only at age 32; she failed subsequently to learn to speak in meaningful sentences. According to Pinker, this cannot be explained in terms of emotional trauma, because Chelsea was brought up by a loving family; she constitutes genuine evidence that L1 acquisition is impossible after a “critical period” expires.
But the only documentation on the Chelsea case quoted by Pinker is one page in an article by Susan Curtiss (1989), reporting an unpublished talk given by a speaker at a local dyslexia society. So it is difficult to judge what weight, if any, to give to this case.
Pinker’s source says nothing either way about the emotional tone of Chelsea’s family background. More important, there is no information about the period which elapsed between Chelsea getting hearing aids, and being observed to produce meaningless utterances. For all the reader can tell, the observations may have been made after a period that would not be long enough for a young child to learn to speak.
3.h L2 learning success not predicted by motivation
If older individuals are less efficient at mastering a second language, even when circumstances provide opportunity and motive (as in the case of emigrants), an obvious explanation would be that older people tend to be more thoroughly acculturated into their original society, and correspondingly feel less identification with a new society and language. Pinker questioned this: “recent evidence is calling these social and motivational explanations into doubt” (1994: 290). He referred to research by Jacqueline Johnson and Elissa Newport (1989), who found that English-speaking ability among Chinese and Korean students and staff at a US university showed little correlation with the individuals’ self-ratings in response to questions like “How strongly would you say you identify with the American culture?”
But differences between high and low ratings on this sort of questionnaire, from respondents all of whom have moved from one viable society to another as adults, seem minor, relative to the difference between a typical adult emigrant’s attachment to his adoptive society, and a small child’s passionate emotional involvement with his mother or nanny. Typical cases of highly-successful adult learners of English, in my experience, include Jewish immigrants to Britain who arrived as refugees from Nazism. Such people were not just temporarily distanced from their original culture and family; the culture, and often the family, was permanently destroyed. Crises of this order may create an emotional reorientation, including motivation for L2 learning, that simply could not be studied through Johnson and Newport’s research.
4 Others’ supporting arguments since 1990
Although achieving less impact on the general reading public than Pinker, some other writers have produced arguments during the past dozen or so years which tend to support his idea of an inherited “language instinct”.
4.a Discontinuity between “protolanguage” and true language
Derek Bickerton (e.g. 1990) has argued that there is not a continuous cline linking the complex languages typically used by adult humans to simple precursor communication systems, but a sharp discontinuity. This alleged discontinuity is claimed to mark the difference between, on the one hand, the special linguistic abilities inherited by our species alone, which are genetically constrained to come into operation during a “critical period”, and on the other hand cruder and more general communicative abilities, called by Bickerton “protolanguage”, which are shared with other species and for which there is no critical age.
Bickerton argues this mainly with respect to two phenomena:
4.a.i creoles v. pidgins
4.a.ii grammar development in young children
4.a.i Creoles v. pidgins
A pidgin is a crude communicative system which sometimes comes into being when speakers of different languages have dealings for limited purposes, e.g. barter. When children grow up speaking a pidgin as their first language, it is called a creole; creoles are typically richer in structure than pidgins which are no-one’s mother tongue. That much is neither controversial nor surprising, but Bickerton claimed that the difference is extreme: while creoles are essentially similar to “ordinary” languages, pidgins (according to Bickerton) have no grammar at all. “… [W]ords and utterances are simply strung together like beads, rather than assembled according to syntactic principles” (Bickerton 1990: 122).
Bickerton illustrated this from Russenorsk, used at one period for barter between Russian and Norwegian sailors. He quoted the Russenorsk example:
big expensive flour on Russia this year
– meaning “Flour is very expensive in Russia this year”.
This example does not seem grammarless. It has a reasonably complex structure:
[[big expensive] flour [on Russia] [this year]]
and is syntactically consistent: modifiers precede heads. If Russenorsk words were “strung together like beads”, without syntax, then a random permutation – say, “expensive on this Russia big year flour” – should be equally appropriate. Bickerton did nothing to persuade us that this is so. I see no reason to believe in “discontinuity” between pidgins and creoles; on Bickerton’s evidence, creoles have more of something that pidgins have a moderate amount of.
4.a.ii Grammar development in young children
Bickerton used records of the speech of a boy “Seth” to argue that child language development shows a sharp discontinuity between a stage of one-word or few-word utterances, and a later stage of ramified syntactic structures.
At 21 months, typical utterances by Seth were “Get up!” and “Apple.” Six months later, he was producing utterances like “I want to put the squeaky shoes some more, Daddy.” Bickerton’s explanation was that, at the earlier period, Seth could only use the psychological mechanisms of “protolanguage”; by 27 months, the “critical period” had kicked in and Seth could exploit the species-specific language mechanisms.
But there is a simpler explanation. Uncontroversially, adult grammars are “recursive” – constructions may contain subordinate examples of the same grammatical category, so that a finite range of constructions gives rise to structures of unlimited complexity. If the rules of a recursive grammar are learned one by one, there must necessarily be some point before which the set of rules learned so far is nonrecursive, and after which it is recursive. That is, a sharp transition from a grammar permitting only a finite range of simple utterances, to a grammar permitting an infinitely numerous range of complex utterances, is a mathematical inevitability.
Hence, although I accept Bickerton’s point about the difference between Seth’s utterances at the two stages, this does not argue for two psychological systems of language-processing machinery.
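The inevitability of such a transition can be illustrated with a minimal sketch in Python. Everything in it is an assumption made for illustration: the seven rewrite rules and the order in which they are acquired are invented (they are not drawn from Bickerton’s records of Seth), and the function name is_recursive is hypothetical. Upper-case symbols stand for grammatical categories and lower-case symbols for words. The check also assumes that every category can eventually be spelled out in words, which holds for this toy grammar.

```python
# Toy illustration with hypothetical rules: a learner acquires rewrite rules
# one at a time, and after each step we ask whether the rules learned so far
# allow a category to contain another instance of itself.

def is_recursive(rules, start="S"):
    """Return True if some category reachable from `start` can contain itself."""
    # Map each category to the categories appearing on its right-hand sides.
    graph = {}
    for lhs, rhs in rules:
        graph.setdefault(lhs, set()).update(s for s in rhs if s.isupper())

    def reachable(src):
        """Categories that can appear (at any depth) inside `src`."""
        seen, stack = set(), list(graph.get(src, ()))
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(graph.get(node, ()))
        return seen

    candidates = {start} | reachable(start)
    return any(cat in reachable(cat) for cat in candidates)


# Hypothetical order of acquisition (an assumption, for illustration only).
rules = [
    ("S", ["VP"]),
    ("VP", ["get", "up"]),
    ("S", ["NP"]),
    ("NP", ["apple"]),
    ("S", ["NP", "VP"]),
    ("VP", ["want", "NP"]),
    ("VP", ["want", "to", "VP"]),  # a verb phrase inside a verb phrase
]

# Recursiveness of the grammar after each successive rule has been learned.
history = [is_recursive(rules[:n]) for n in range(1, len(rules) + 1)]
print(history)
```

With these rules the check is false after each of the first six rules and becomes true only when the seventh is added. Since adding rules can never remove a category’s ability to contain itself, the switch from a finite to an unbounded range of utterances happens at exactly one step, however gradually the rules themselves are acquired.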
4.b Sign language
Ray Jackendoff’s Patterns in the Mind (1993), like Pinker (1994), was a book designed to persuade the general educated reader that the detailed structure of human language is part of our genetic endowment.
Many of Jackendoff’s arguments coincided with those of other writers, already discussed. But Jackendoff had a distinctive topic of his own: “ASL” (American Sign Language), the manual language used in the USA by people who are deaf or unable to speak.
Jackendoff made two points about ASL:
(i) it has distinctive structural features making it interestingly different from any spoken language
(ii) the kinds of evidence which allegedly suggest that spoken languages are biologically governed rather than culturally learned apply equally to ASL
Between them, these points imply that human beings inherit a genetic predisposition enabling them to master specialized communication systems which are useful only to the tiny minority of individuals who are deaf or unable to speak. This would be grossly implausible in terms of the axioms of evolutionary biology. Pinker’s and Chomsky’s “language instinct” idea is superficially plausible, because it is easy to agree that language ability might increase individuals’ reproductive “fitness”, and we know that evolution tends to create predispositions which increase fitness. Evolution is not usually thought of as creating complex predispositions that are irrelevant to the fitness of almost all their possessors.
Jackendoff’s two claims about ASL are unlikely both to be true. I find (i) plausible, and reject (ii) – in the sense that I assume ASL, like spoken languages, is learned rather than biologically governed.
5 Chomsky’s “snowflake” argument
Chomsky has recently been putting forward a new argument, involving a more extreme version of the point of view I discussed under 1.a.v above: Chomsky now urges that the main structural features of human languages are not only common to all languages but are a matter of “(virtual) conceptual necessity” (see e.g. Chomsky 2005: 10, or references in Postal 2003). He suggests that “Language is something like a snowflake, assuming its particular form by virtue of laws of nature”.
I shall not waste many words on Chomsky’s analogy with snowflakes. It is true that, while the shapes of individual snowflakes differ in detail, all snowflakes resemble one another in overall structure, being hexagonal and symmetrical, as a consequence of basic laws of physics. Chomsky appears to believe that this analogy strengthens his case for linguistic nativism. He does not seem to realize that, on the contrary, it undermines that case. When some biological trait might be other than it is, and yet is shared by all members of a species, then it is reasonable to infer that the feature is controlled by some aspect of the genetic code which defines the species. But if a shared feature is “logically necessary”, that inference fails. (We might well ask what aspect(s) of human DNA account for the fact that everybody’s eyes are sensitive to radiation between the red and violet frequencies, and not to higher or lower frequencies. But if we find that everyone has a propensity to say that the sum of three and four is seven, we are not tempted to look for a 3 + 4 = 7 gene.)
It is interesting, as a matter of biography, to see a man who has made a towering reputation by arguing for a genetic basis for cognition proceeding in effect to destroy his own case in old age. But if our concern is with the facts of the human language faculty, rather than with the history of one man’s intellectual evolution, then we can afford to forget about snowflakes.
I have examined every significant argument that has been used over half a century to convince us that humans master language by virtue of a “language instinct”. None of the arguments works.
I conclude that there is no language instinct. On the available evidence, languages seem to be products of cultural evolution only. The biological foundations on which they depend are an open-ended ability to formulate and test hypotheses, which we use to learn about anything and everything that life throws at us, and perception and phonation mechanisms which evolved to serve other functions and have no special relationship with language.
The question how cultural evolution developed the complex languages used during recorded history out of simple precursors is an interesting, worthwhile question. But it is surely a very different question, to which different kinds of evidence are relevant and different sorts of answer available, from the question how an alleged “language instinct” might have evolved biologically.
Berlin, B. & P. Kay (1969) Basic Color Terms. University of California Press.
Berwick, R.C. & A.N. Chomsky (2011) “The biolinguistic program: the current state of its development”. In Anna Maria di Sciullo & C. Boeckx, eds., The Biolinguistic Enterprise. Oxford University Press. [My quotation is taken from an online prepublication version of the chapter.]
Bickerton, D. (1990) Language & Species. University of Chicago Press.
Chomsky, A.N. (1962) “Explanatory models in linguistics”. In E. Nagel et al., eds., Logic, Methodology and Philosophy of Science. Stanford University Press.
Chomsky, A.N. (1965) Aspects of the Theory of Syntax. MIT Press.
Chomsky, A.N. (1972) Problems of Knowledge and Freedom. Fontana/Collins.
Chomsky, A.N. (1975) The Logical Structure of Linguistic Theory. Plenum.
Chomsky, A.N. (1976) Reflections on Language. Temple Smith.
Chomsky, A.N. (1979) Language and Responsibility. Harvester (Hassocks, Sussex).
Chomsky, A.N. (1980) Rules and Representations. Blackwell (Oxford).
Chomsky, A.N. (2005) “Three factors in language design”. Linguistic Inquiry 36.1–22.
Collier, G.A. (1973) Review of Berlin & Kay (1969). Language 49.245–8.
Culicover, P.W. (1999) “Minimalist architectures”. Journal of Linguistics 35.137–50.
Curtiss, Susan (1977) Genie. Academic Press.
Curtiss, Susan (1989) “The independence and task-specificity of language”. In M. Bornstein & J. Bruner, eds., Interaction in Human Development. Erlbaum (Hillsdale, N.J.).
Gopnik, Myrna (1990) “Feature-blind grammar and dysphasia”. Nature 344.715.
Heath, Shirley B. (1983) Ways with Words. Cambridge University Press.
Jackendoff, R. (1993) Patterns in the Mind. Harvester Wheatsheaf.
Johnson, Jacqueline & Elissa L. Newport (1989) “Critical period effects in second language learning”. Cognitive Psychology 21.60–99.
Jones, P. (1995) “Contradictions and unanswered questions in the Genie case”. Language and Communication 15.261–80.
Kuhl, P.K. & J.D. Miller (1975) “Speech perception by the chinchilla”. Science 190.69–72.
Lenneberg, E.H. (1967) Biological Foundations of Language. Wiley.
Liberman, A.M. et al. (1967) “Perception of the speech code”. Psychological Review 74.431–61.
Lieberman, P. (2000) Human Language and our Reptilian Brain. Harvard University Press.
Lieberman, P. et al. (1972) The Speech of Primates. Mouton (The Hague).
Newport, Elissa L., et al. (1977) “Mother, I’d rather do it myself”. In Catherine E. Snow & C.A. Ferguson, eds., Talking to Children. Cambridge University Press.
Ohala, J. (1994) “The frequency code underlies the sound-symbolic use of voice pitch”. In Leanne Hinton et al., eds., Sound Symbolism. Cambridge University Press.
Piattelli-Palmarini, M., ed. (1980) Language and Learning. Routledge & Kegan Paul.
Pinker, S. (1994) The Language Instinct. William Morrow (New York); my page references are to the 1995 Penguin edition.
Pinker, S. (1999) Words and Rules. Weidenfeld & Nicolson.
Popper, K.R. (1963) Conjectures and Refutations. Routledge and Kegan Paul.
Postal, P.M. (2003) “(Virtually) conceptually necessary”. Journal of Linguistics 39.599–620. [A revised version is in Postal, Skeptical Linguistic Essays, Oxford University Press, 2004.]
Pullum, G.K. & Barbara C. Scholz (2002) “Empirical assessment of stimulus poverty arguments”. In Nancy A. Ritter, ed., A Review of “The Poverty of Stimulus Argument”. Special issue of The Linguistic Review, vol. 19, nos. 1–2.
Sampson, G.R. (2005) The “Language Instinct” Debate. (An enlarged and updated edition of a book first published in 1997 as Educating Eve: The “Language Instinct” Debate.) Continuum.
Sampson, G.R. (2017) The Linguistics Delusion. Equinox (Sheffield and Bristol, Conn.).
Simon, H.A. (1969) The Sciences of the Artificial. MIT Press.
Vargha-Khadem, F., et al. (1995) “Praxic and nonverbal cognitive deficits in a large family …”. Proceedings of the National Academy of Sciences of the USA 92.930–3.
Wyttenbach, R.A., et al. (1996) “Categorical perception of sound frequency by crickets”. Science 273.1542–4.