Literature & Language
Linguistics
The structure of language: phonology, syntax, semantics, and historical change.
Phonetics
Phonetics is the study of the physical properties of speech sounds, independent of their function within a particular language.
- Phonetics vs phonology — phonetics studies actual sound production and acoustics; phonology studies how sounds function as contrastive units within a language system.
- IPA (International Phonetic Alphabet) — a standardized notation system in which each symbol represents a unique speech sound; enables cross-linguistic description.
- Place of articulation — where in the vocal tract the primary constriction occurs; major places include bilabial (both lips), labiodental, dental, alveolar (behind upper teeth), palatal, velar (soft palate), and glottal.
- Manner of articulation — how airflow is modified: stops (complete closure then release), fricatives (turbulent airflow through narrow constriction), affricates (stop + fricative, e.g., /tʃ/ in “church”), nasals (velum open, air through nose), liquids (/l/, /r/), and glides/approximants.
- Voiced vs voiceless — voiced sounds involve vocal-cord vibration (/b/, /d/, /v/); voiceless do not (/p/, /t/, /f/). Minimal pairs like “bat”/“pat” illustrate voicing contrast.
- Vowels — described by tongue height (high/mid/low), tongue backness (front/central/back), and lip rounding. The IPA vowel chart plots these dimensions.
- Consonants — described by voicing, place, and manner; e.g., /p/ is a voiceless bilabial stop.
- Aspiration — a puff of air following a stop; English initial /p/ in “pin” is aspirated [pʰ]; in “spin” it is unaspirated [p].
- Ejective — a consonant produced with a glottalic egressive airstream (closure of the glottis + upward larynx movement); common in Caucasian languages and in many Indigenous languages of the Americas; written with a following apostrophe in IPA (e.g., /p’/).
- Implosive — a consonant produced with a glottalic ingressive airstream while the vocal cords vibrate; the larynx moves downward; common in Hausa and other African languages; written with a hooked symbol in IPA (e.g., /ɓ/).
- Click consonant — produced by a velaric ingressive airstream; clicks are regular phonemes in the Khoisan languages of southern Africa and in some neighboring Bantu languages such as Zulu and Xhosa; the IPA click symbols are ʘ, ǀ, ǃ, ǂ, ǁ.
- Suprasegmental features — properties spanning more than one segment: stress, tone, length, and intonation.
- Tone languages — languages in which pitch is lexically contrastive; Mandarin Chinese has four tones (plus a neutral tone); Cantonese has six.
- Prosody — the rhythmic and melodic aspects of speech; includes stress patterns and intonational contours.
- Spectrogram — a visual display of acoustic energy across frequency and time, used in instrumental phonetics.
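The voicing, place, and manner description of consonants given above can be sketched as a small lookup table. This is only an illustration: the `CONSONANTS` table and both function names are hypothetical, and the table covers just a handful of English consonants.

```python
# Hypothetical mini-inventory: symbol -> (voicing, place, manner).
CONSONANTS = {
    "p": ("voiceless", "bilabial", "stop"),
    "b": ("voiced", "bilabial", "stop"),
    "t": ("voiceless", "alveolar", "stop"),
    "d": ("voiced", "alveolar", "stop"),
    "f": ("voiceless", "labiodental", "fricative"),
    "v": ("voiced", "labiodental", "fricative"),
    "m": ("voiced", "bilabial", "nasal"),
}


def describe(symbol):
    """Return the conventional voicing-place-manner label for a consonant."""
    return " ".join(CONSONANTS[symbol])


def differ_only_in_voicing(a, b):
    """True if two consonants share place and manner but differ in voicing."""
    voicing_a, place_a, manner_a = CONSONANTS[a]
    voicing_b, place_b, manner_b = CONSONANTS[b]
    return voicing_a != voicing_b and (place_a, manner_a) == (place_b, manner_b)


print(describe("p"))                     # voiceless bilabial stop
print(differ_only_in_voicing("p", "b"))  # True
print(differ_only_in_voicing("p", "d"))  # False
```

Pairs for which `differ_only_in_voicing` returns True are the ones that can form voicing minimal pairs such as “bat”/“pat.”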
Phonology
Phonology analyzes how sounds function as abstract, contrastive units within the grammar of a language.
- Phoneme — the smallest unit of sound that distinguishes meaning; /p/ and /b/ are separate phonemes in English (contrast: “pit” vs “bit”).
- Allophone — a phonetic variant of a phoneme that does not change meaning; aspirated [pʰ] and unaspirated [p] are allophones of /p/ in English.
- Minimal pair — two words differing by exactly one phoneme in the same position, proving that phoneme’s contrastive status; e.g., “cat”/“bat.”
- Complementary distribution — allophones that never appear in the same phonetic environment (i.e., they do not contrast); aspiration in English /p/ follows this pattern.
- Free variation — two sounds that can substitute for each other in the same environment without changing meaning.
- Phonological rules — systematic processes governing sound alternations: assimilation (sounds become more alike), dissimilation, deletion, insertion (epenthesis), and metathesis.
- Assimilation — a sound takes on a feature of a neighboring sound; the “n” in “impossible” becomes [m] by assimilation to the following bilabial /p/.
- Syllable structure — syllables have an onset (initial consonant/s), nucleus (vowel), and coda (final consonant/s); nucleus + coda = rhyme, with the onset standing outside the rhyme.
- Phonotactics — language-specific constraints on permissible sound sequences; English permits “str-” onsets but not “tl-.”
- Stress — prominence given to a syllable by increased loudness, pitch, and/or length; English has lexical stress (“PERmit” noun vs “perMIT” verb).
- Vowel harmony — a phonological process requiring vowels within a word to share a feature (e.g., backness); prominent in Turkish and Finnish.
- Markedness — the tendency for some phonological categories to be more “basic” or cross-linguistically common (unmarked) than others (marked).
- Optimality Theory (OT) — a constraint-based framework (Prince and Smolensky, 1993) in which surface forms are evaluated by ranked, violable universal constraints; the winning candidate best satisfies the ranking.
- Feature geometry — the representation of phonological features in a hierarchical tree, allowing rules to target natural classes by referring to nodes rather than lists of features.
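The minimal-pair test described above can be sketched in a few lines. This is a simplified illustration, assuming each word is already transcribed as a list of phoneme symbols; the function name and the transcriptions are chosen for the example.

```python
def is_minimal_pair(word_a, word_b):
    """True if two phoneme sequences differ in exactly one position."""
    if len(word_a) != len(word_b):
        return False
    mismatches = sum(1 for x, y in zip(word_a, word_b) if x != y)
    return mismatches == 1


# Illustrative broad transcriptions (assumed for this sketch).
cat = ["k", "æ", "t"]
bat = ["b", "æ", "t"]
pit = ["p", "ɪ", "t"]
bit = ["b", "ɪ", "t"]

print(is_minimal_pair(cat, bat))  # True
print(is_minimal_pair(pit, bit))  # True
print(is_minimal_pair(cat, bit))  # False: two positions differ
```

Comparing phoneme lists rather than spellings matters because English orthography does not map one-to-one onto phonemes.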
Morphology
Morphology is the study of word structure and the units of meaning within words.
- Parts of speech (grammatical categories) — the traditional classification of words by their syntactic and semantic function; the classical eight (from Latin grammar): noun (names persons, places, things, ideas), verb (expresses action or state), adjective (modifies a noun), adverb (modifies a verb, adjective, or another adverb), pronoun (substitutes for a noun), preposition (links a noun phrase to another element), conjunction (connects clauses or phrases), and interjection; participle is a non-finite verbal form (present: “running”; past: “broken”) that functions adjectivally; modern linguistics prefers the language-neutral term lexical category and identifies categories distributionally (by where words can appear) rather than by meaning alone.
- Morpheme — the smallest meaningful unit of a language; “unbreakable” contains three morphemes: un-, break, -able.
- Free morpheme — can stand alone as a word (“cat,” “run”).
- Bound morpheme — cannot stand alone; must attach to another form (“-ing,” “un-,” “-ed”).
- Affix types — prefix (before the root), suffix (after), infix (inside), circumfix (around); English uses prefixes and suffixes most.
- Inflectional morphology — morphemes that mark grammatical categories (tense, number, case, agreement) without changing word class; English “-s” (plural), “-ed” (past tense).
- Derivational morphology — creates new words, often changing part of speech; “-ness” turns adjectives into nouns (“kind” → “kindness”).
- Root / stem — the core form to which affixes attach; distinct from base (any form that receives an affix).
- Allomorph — a phonological variant of a morpheme; the English plural has allomorphs [-s], [-z], and [-ɪz] (as in “cats,” “dogs,” “buses”).
- Compounding — combining two or more free morphemes into one word (“blackbird,” “deadline”).
- Morphological typology — languages are classified (loosely) as isolating/analytic (minimal morphology; Mandarin), agglutinative (transparent, stackable morphemes; Turkish), fusional/inflectional (morphemes fuse; Latin), and polysynthetic (many morphemes per word; many Indigenous North American languages).
- Suppletion — irregular alternation in morpheme forms with no phonological cause; “go”/“went,” “good”/“better.”
- Zero morpheme — a morphologically meaningful but phonologically null morpheme; the plural “sheep” has a zero plural morpheme.
- Agglutinative language — type in which each morpheme carries one grammatical meaning and boundaries between morphemes are clear; Turkish and Swahili are classic examples (“ev-ler-im-de” = “in my houses”).
- Fusional / inflectional language — type in which a single morpheme simultaneously encodes multiple grammatical categories; Latin “-ō” on a verb marks first person, singular, present, indicative, active.
- Isolating / analytic language — type with minimal or no bound morphology; grammatical relations expressed by word order and free particles; Mandarin Chinese is the canonical example.
- Polysynthetic language — type in which entire propositions can be expressed in a single complex word; many Indigenous North American languages (e.g., Mohawk) are polysynthetic.
- Genitive case — a grammatical case that primarily marks possession or association; in English it is expressed by the ’s clitic (“the dog’s bone”) or the preposition “of”; in highly inflected languages (Latin, German, Russian, Greek) it is a distinct case form; also used to mark the subject of certain nominal constructions; one of the core cases alongside nominative, accusative, dative, and ablative in classical grammars.
- Ergative-absolutive alignment — a morphosyntactic pattern in which the subject of an intransitive verb and the object of a transitive verb receive the same case (absolutive), while the transitive subject receives a distinct case (ergative); found in Basque, many Australian and Caucasian languages.
- Nominative-accusative alignment — the dominant pattern in European languages: the subject of both transitive and intransitive verbs takes nominative case; the object takes accusative.
- Split ergativity — a language shows ergative alignment in some contexts and accusative in others; often conditioned by tense/aspect (e.g., Hindi-Urdu is ergative in the perfective, accusative in the imperfective).
- Noun incorporation — a morphological process in which a noun is absorbed into a verb to form a complex predicate; common in polysynthetic languages.
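The choice among the English plural allomorphs [-s], [-z], and [-ɪz] mentioned above depends on the final sound of the stem. The sketch below follows the usual textbook rule; the sound classes are deliberately incomplete and the function name is hypothetical.

```python
# Simplified sound classes (assumed for this sketch, not exhaustive).
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}
VOICELESS = {"p", "t", "k", "f", "θ"}


def plural_allomorph(stem):
    """Pick the regular plural allomorph from the stem's final phoneme."""
    final = stem[-1]
    if final in SIBILANTS:
        return "ɪz"   # vowel inserted after a sibilant: "buses"
    if final in VOICELESS:
        return "s"    # voiceless after a voiceless sound: "cats"
    return "z"        # voiced elsewhere: "dogs"


print(plural_allomorph(["k", "æ", "t"]))  # s
print(plural_allomorph(["d", "ɔ", "g"]))  # z
print(plural_allomorph(["b", "ʌ", "s"]))  # ɪz
```

The sibilant check has to come first, because /s/ and /ʃ/ are voiceless and would otherwise be given plain [-s].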
Syntax
Syntax is the study of how words combine into phrases and sentences according to structural rules.
- Generative grammar — Noam Chomsky’s framework (from Syntactic Structures, 1957) proposing that human linguistic competence consists of a finite set of rules capable of generating an infinite set of sentences.
- Competence vs performance — competence is the internalized grammatical knowledge of an ideal speaker-listener; performance is actual language use, which may include errors and hesitations (Chomsky, parallel to Saussure’s langue/parole).
- Universal Grammar (UG) — Chomsky’s hypothesis that all humans are born with an innate, language-specific faculty that constrains possible grammars; supported by poverty-of-the-stimulus arguments.
- Phrase structure (constituency) — sentences are organized into hierarchically nested phrases: NP (noun phrase), VP (verb phrase), PP (prepositional phrase), AP (adjective phrase).
- X-bar theory — a framework within generative grammar positing a uniform phrase structure: specifier + head + complement, with intermediate levels.
- Deep structure vs surface structure — in early generative grammar, deep structure encodes underlying semantic relations; transformations map it onto surface structure. Later replaced by more abstract representational levels.
- Transformation — a syntactic rule that moves or deletes elements; e.g., passivization moves the object to subject position.
- Passive construction — “The cake was eaten by the dog” derives from an active counterpart by passivization.
- Head of a phrase — the obligatory, category-determining word in a phrase; the noun is the head of an NP, the verb of a VP.
- Argument structure — the required participants a verb takes; “sleep” is intransitive (one argument), “eat” is transitive (two), “give” is ditransitive (three).
- Theta roles (thematic roles) — semantic roles assigned to arguments: agent (doer), patient/theme (undergoer), experiencer, beneficiary, goal, source.
- Binding theory — governs the interpretation of pronouns and reflexives; reflexives (“herself”) must be bound within their local clause, while pronouns (“her”) must be free within that same domain.
- Wh-movement — fronting a wh-phrase to the beginning of a clause (“What did she eat?”); leaves an abstract trace in the original position.
- SOV / SVO typology — languages are typed by their dominant order of Subject, Object, and Verb. English is SVO; Japanese and Turkish are SOV; classical Arabic and Welsh are VSO.
- Head-initial vs head-final — in head-initial languages the head precedes complements (English VPs: “eat cake”); head-final languages place heads after (Japanese).
- Government and Binding (GB) — Chomsky’s 1981 theoretical framework articulating syntax through interacting modules: theta theory, case theory, binding theory, bounding theory (subjacency), control theory; succeeded by the Minimalist Program.
- Minimalist Program — Chomsky’s framework from the 1990s onward; seeks a minimal, elegant computational system for language; key operations are Merge (combining two syntactic objects) and Move (feature-driven displacement); posits that grammar interfaces with the conceptual-intentional (semantics) and sensorimotor (phonology) systems.
- Greenberg’s universals — Joseph Greenberg’s (1963) survey of a sample of 30 languages yielded implicational universals; e.g., Universal 3: languages with dominant VSO order are always prepositional.
- Principles and Parameters — Chomsky’s approach in which UG consists of fixed universal principles and parameters that are set by exposure to input (e.g., the head-direction parameter, the pro-drop parameter).
- Pro-drop parameter — languages like Spanish and Italian allow subject pronouns to be omitted (“habla” = “(he/she) speaks”); English and French require an overt subject.
- Specifier-Head-Complement — the canonical X-bar structure: a phrase XP consists of a specifier (left branch), a head X, and a complement; all phrasal categories share this template.
- CP and IP/TP — in Government and Binding, sentences are analyzed as CPs (complementizer phrases housing wh-elements and that-clauses) embedding TPs (tense phrases), which embed VPs.
- Locality constraints — syntactic rules are bounded; islands (Ross’s island constraints) prevent movement out of certain embedded structures (relative clauses, noun complements).
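The head-initial vs head-final contrast described above can be illustrated by linearizing the same head-complement structure in two ways. This is a toy sketch, not a model of any particular theory: phrases are (head, complement) pairs, and the function name and the example phrase are assumptions.

```python
def linearize(node, head_initial=True):
    """Flatten a (head, complement) structure into a word list.

    A node is either a word (str) or a (head, complement) pair.
    """
    if isinstance(node, str):
        return [node]
    head, complement = node
    head_words = linearize(head, head_initial)
    complement_words = linearize(complement, head_initial)
    if head_initial:
        return head_words + complement_words
    return complement_words + head_words


# A VP whose complement is a PP: [VP go [PP to Tokyo]]
vp = ("go", ("to", "Tokyo"))

print(" ".join(linearize(vp, head_initial=True)))   # go to Tokyo
print(" ".join(linearize(vp, head_initial=False)))  # Tokyo to go
```

The head-final output mirrors the verb-final, postpositional order typical of SOV languages such as Japanese, using English words only for readability.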
Semantics
Semantics is the formal and informal study of meaning in language.
- Lexical semantics — the study of word meanings and their organization in the lexicon.
- Sense vs reference (Frege) — the sense (Sinn) of an expression is its mode of presentation or descriptive content; the reference (Bedeutung) is the entity it picks out. “Morning Star” and “Evening Star” have different senses but the same reference (Venus).
- Extension vs intension — the extension of a predicate is the set of entities it applies to; the intension is its meaning (the property it expresses).
- Synonymy — two expressions with the same meaning (“begin”/“commence”); strict synonymy is rare.
- Antonymy — opposite-meaning pairs: gradable antonyms (“hot”/“cold”), complementary pairs (“alive”/“dead”), and converses (“buy”/“sell”).
- Hyponymy / hypernymy — “rose” is a hyponym (more specific) of “flower”; “flower” is the hypernym (more general).
- Polysemy — a single word with multiple related meanings (“mouth” of a person, of a river).
- Homonymy — identical form, unrelated meanings (“bank” as financial institution vs riverbank).
- Entailment — sentence A entails B if whenever A is true, B must be true; “She assassinated the president” entails “The president died.”
- Presupposition — a proposition assumed to be true for a sentence to be felicitously used; “John stopped smoking” presupposes he smoked.
- Semantic compositionality (Frege’s principle) — the meaning of a complex expression is determined by the meanings of its parts and the rules by which they are combined.
- Truth-conditional semantics — a formal approach (associated with Tarski and Montague) that represents meanings as conditions under which sentences are true.
- Prototype theory — Rosch’s cognitive-semantic view that categories are organized around best examples (prototypes) rather than strict necessary-and-sufficient conditions.
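Extension, hyponymy, and compositionality, as described above, can be illustrated with a tiny set-based model. This sketch rests on strong simplifying assumptions: the entities and the `EXTENSION` table are invented, hyponymy is approximated as extension inclusion within this one model, and modifiers are treated as purely intersective.

```python
# Invented model: each predicate maps to its extension (a set of entities).
EXTENSION = {
    "rose": {"r1", "r2"},
    "flower": {"r1", "r2", "tulip1"},
    "red": {"r1", "tulip1"},
}


def holds(predicate, entity):
    """Truth condition for a simple predication: entity is in the extension."""
    return entity in EXTENSION[predicate]


def holds_all(predicates, entity):
    """Compositional meaning of a modified noun: every part must hold."""
    return all(holds(p, entity) for p in predicates)


def is_hyponym(specific, general):
    """In this model, a hyponym's extension is a subset of its hypernym's."""
    return EXTENSION[specific] <= EXTENSION[general]


print(holds_all(["red", "rose"], "r1"))  # True
print(holds_all(["red", "rose"], "r2"))  # False: r2 is a rose but not red
print(is_hyponym("rose", "flower"))      # True
print(is_hyponym("flower", "rose"))      # False
```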
Pragmatics
Pragmatics studies how context shapes the interpretation of utterances beyond literal linguistic meaning.
- Speech act theory — J. L. Austin distinguished locutionary acts (saying), illocutionary acts (the intended communicative force: promising, asserting, requesting), and perlocutionary acts (the effect on the hearer).
- Performative vs constative — Austin’s initial distinction: performatives enact something by being uttered (“I promise”); constatives describe states of affairs.
- Grice’s Cooperative Principle — H. P. Grice proposed that conversation is governed by a general principle of cooperation, elaborated into four maxims:
- Maxim of Quantity — say no more or less than required.
- Maxim of Quality — do not say what you believe false or lack evidence for.
- Maxim of Relation — be relevant.
- Maxim of Manner — be clear, brief, and orderly.
- Conversational implicature — what is communicated beyond what is literally said; derived by reasoning from apparent maxim violations. “Can you pass the salt?” implicates a request, not a question about ability.
- Scalar implicature — the use of a weaker term implicates the falsity of stronger ones on a scale; saying “some” implicates “not all.”
- Deixis — expressions whose interpretation depends on context: person deixis (“I,” “you”), place deixis (“here,” “there”), time deixis (“now,” “yesterday”), discourse deixis, social deixis.
- Indexical — an expression whose reference shifts with context of utterance; all deictics are indexicals.
- Reference and anaphora — the way noun phrases pick out entities in the world or in prior discourse; anaphors (pronouns, “do so”) point back to an antecedent.
- Relevance theory — Sperber and Wilson’s framework proposing that communication is governed by a single principle of relevance: maximizing cognitive effects while minimizing processing effort.
- Politeness theory — Brown and Levinson’s model based on face (public self-image); positive face wants (desire to be liked) and negative face wants (desire for autonomy) are managed through politeness strategies.
- Indirect speech act — an utterance that performs an illocutionary act different from its literal form; “It’s cold in here” is literally a statement but functions as a request to close the window.
- Flouting a maxim — deliberately and openly violating a Gricean maxim to generate an implicature; e.g., obvious overstatement (hyperbole) or irony flouts the Maxim of Quality.
- Presupposition triggers — lexical items or constructions that introduce presuppositions: factive verbs (“know,” “regret”), change-of-state verbs (“stop”), cleft constructions (“It was John who left”).
- Discourse marker — a word or phrase (e.g., “well,” “however,” “so”) that signals relations between discourse segments rather than carrying propositional content.
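The reasoning behind scalar implicature described above can be sketched over an ordered scale: using a term implicates the negation of every stronger term on the same scale. The scale below and the function name are assumptions made for illustration.

```python
# An assumed quantifier scale, ordered from weakest to strongest.
SCALE = ["some", "most", "all"]


def scalar_implicatures(term):
    """Return the implicated negations of all stronger scale-mates."""
    position = SCALE.index(term)
    return [f"not {stronger}" for stronger in SCALE[position + 1:]]


print(scalar_implicatures("some"))  # ['not most', 'not all']
print(scalar_implicatures("most"))  # ['not all']
print(scalar_implicatures("all"))   # []
```

These are implicatures rather than entailments, so a speaker can cancel them without contradiction (“some, in fact all”), which this sketch does not model.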
Historical and Comparative Linguistics
- Historical linguistics — the study of how languages change over time in phonology, morphology, syntax, and vocabulary.
- Comparative method — systematic procedure for reconstructing a proto-language by comparing regular correspondences across daughter languages; the foundation for establishing language families.
- Sound correspondence — a regular relationship between sounds in cognate words across related languages; e.g., English /f/ : Latin /p/ in “father”/“pater,” “fish”/“piscis.”
- Reconstruction — the inference of ancestral proto-language forms (starred forms, e.g., *pəter) from systematic correspondences; these are hypothetical, not attested.
- Sir William Jones (1786) — British philologist who, in a lecture to the Asiatic Society in Calcutta, observed the structural similarities among Sanskrit, Greek, Latin, Gothic, Celtic, and Old Persian, arguing for a common ancestor.
- Proto-Indo-European (PIE) — the hypothetical ancestor of the Indo-European family, reconstructed from descendants; spoken roughly 5,000–6,000 years ago, likely on the Pontic steppe (Kurgan hypothesis, Gimbutas).
- Indo-European family — the world’s largest language family by number of speakers and one of the most widely distributed; major branches include Indo-Iranian (Sanskrit, Hindi-Urdu, Persian), Hellenic (Greek), Italic (Latin, Romance), Germanic, Slavic, Baltic, Celtic, Armenian, Albanian, and others.
- Grimm’s Law — a set of systematic consonant shifts distinguishing Proto-Germanic from other Indo-European branches; formulated by Jacob Grimm in the 1820s. Voiceless stops shifted to fricatives (PIE *p → Germanic /f/), voiced stops became voiceless, and voiced aspirates became voiced fricatives or stops.
- Great Vowel Shift — a major series of changes spanning late Middle English and Early Modern English (roughly 1400–1700) in which the long vowels of English systematically raised or diphthongized, largely explaining the gap between English spelling and pronunciation.
- Sound change — tends to be regular and exceptionless within a speech community (the Neogrammarian hypothesis); exceptions arise from analogy, borrowing, or dialect mixture.
- Analogy — the reshaping of irregular forms toward regular patterns; many strong verbs in English have been regularized over time.
- Borrowing / loanword — a word taken from another language; English borrowed heavily from French and Latin after the Norman Conquest (1066).
- Cognate — a word in one language related by common ancestry to a word in another (English “night,” German “Nacht,” Latin “nox/noctis”).
- False cognate / false friend — words that look similar across languages but have different meanings (Spanish “embarazada” = pregnant, not embarrassed).
- Language family vs language isolate — a family is a group of genetically related languages; an isolate has no demonstrated genetic relatives (e.g., Basque).
- Glottochronology — a controversial method estimating the time of language divergence from the rate of change in core vocabulary (basic wordlist); criticized for assuming constant rates.
- Verner’s Law — Karl Verner’s 1875 observation that the Proto-Germanic voiceless fricatives (produced by Grimm’s Law) became voiced when the PIE accent did not immediately precede them; it showed that the apparent exceptions to Grimm’s Law were themselves regular once accent position was factored in.
- Grassmann’s Law — a dissimilation rule independently operative in Ancient Greek and Sanskrit: when a word has two successive aspirated consonants, the first loses its aspiration; e.g., PIE *bheudh- → Greek peuth- (not *pheuth-); formulated by Hermann Grassmann (1863).
- Bartholomae’s Law — a Proto-Indo-Iranian (and Sanskrit) rule of progressive voicing and aspiration assimilation: a voiced aspirate before a voiceless consonant causes the following stop to become voiced and aspirated; e.g., Sanskrit “labdha” from *labh + ta.
- High German consonant shift (Second Germanic consonant shift) — a set of sound changes affecting the southern High German dialects (c. 500–800 CE) that distinguish High German (and Standard German) from Low German, English, and Dutch; voiceless stops shifted to affricates or fricatives (Germanic *p → High German pf or ff).
- Neogrammarian hypothesis — the late 19th-century claim by the Junggrammatiker (Brugmann, Osthoff, etc.) that sound change is exceptionless within a speech community at a given time; apparent exceptions result from analogy, borrowing, or dialect mixture; a cornerstone of comparative linguistics.
- Internal reconstruction — inferring earlier stages of a single language by analyzing paradigm irregularities within that language alone, without recourse to related languages.
- Swadesh list — a standardized wordlist of 100–200 culturally universal concepts used in glottochronology and comparative work; compiled by Morris Swadesh in the 1950s.
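The three series of Grimm’s Law described above form a chain shift, which can be sketched as a single mapping applied once per segment. This is a simplified illustration: the table uses one common notation, renders the voiced aspirates as plain voiced stops, and ignores Verner’s Law and other conditioning; the names are hypothetical.

```python
# Simplified Grimm's Law correspondences (PIE -> Proto-Germanic).
GRIMM = {
    "p": "f", "t": "θ", "k": "x",      # voiceless stops -> fricatives
    "b": "p", "d": "t", "g": "k",      # voiced stops -> voiceless stops
    "bʰ": "b", "dʰ": "d", "gʰ": "g",   # voiced aspirates -> voiced stops
}


def apply_grimm(segments):
    """Shift each segment at most once; unlisted segments pass through.

    One lookup per segment keeps the output of one shift from feeding
    another (PIE *b becomes p, and that p does not go on to become f).
    """
    return [GRIMM.get(segment, segment) for segment in segments]


print(apply_grimm(["p", "t", "k"]))     # ['f', 'θ', 'x']
print(apply_grimm(["b", "d", "g"]))     # ['p', 't', 'k']
print(apply_grimm(["bʰ", "dʰ", "gʰ"]))  # ['b', 'd', 'g']
```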
Major world language families
- Indo-European — largest family by speakers; includes English, Spanish, Hindi, Russian, Bengali, Portuguese, German, French, and many others.
- Sino-Tibetan — includes Mandarin Chinese, Cantonese, Tibetan, Burmese; Mandarin is among the most widely spoken native languages globally.
- Afro-Asiatic — includes Arabic, Hebrew, Amharic, Hausa; spread across North Africa and the Middle East.
- Niger-Congo — a large African family including the Bantu subfamily (Swahili, Zulu) as well as non-Bantu languages such as Yoruba.
- Austronesian — from Madagascar to Hawaii and New Zealand; includes Malay, Tagalog, Hawaiian, Māori.
- Dravidian — South Indian family including Tamil, Telugu, Kannada, Malayalam; not related to Indo-European.
- Turkic — includes Turkish, Uzbek, Kazakh, Azerbaijani; characterized by vowel harmony and agglutination.
- Japonic — Japanese and the Ryukyuan languages; genetic affiliation with Korean remains debated.
- Koreanic — Korean; sometimes grouped with Japonic in a proposed Transeurasian or Altaic hypothesis, but this remains contested.
- Uralic — Finnish, Estonian, Hungarian; notable as non-Indo-European languages in Europe.
- Niger-Congo / Bantu — the Bantu subfamily within Niger-Congo (itself the largest African family) includes ~500 languages spoken across sub-Saharan Africa; characterized by noun class systems with agreement prefixes.
- Semitic family — a major branch of Afro-Asiatic; includes Arabic, Hebrew, Aramaic, Amharic, Tigrinya; characterized by triconsonantal roots to which vowel patterns and affixes are applied.
- Nilo-Saharan — a large but internally disputed African family (if it is a single family); includes Luo, Kanuri, Songhay.
- Basque — a language isolate spoken in the Pyrenees region of Spain and France; unrelated to any known living or extinct language; notable for an ergative-absolutive case system; sometimes cited as a relic pre-Indo-European European language.
- Sumerian — an extinct language isolate; the language of ancient Sumer (Mesopotamia), written in cuneiform from c. 3200 BCE; no demonstrated genetic relationship to any other language.
- Altaic hypothesis — a proposed macrofamily grouping Turkic, Mongolic, Tungusic, and sometimes Koreanic and Japonic; once influential but now largely rejected by historical linguists due to insufficient evidence of shared morphology and probable borrowing as an alternative explanation.
- Nostratic hypothesis — a proposed macrofamily (Pedersen, Illič-Svityč) linking Indo-European, Afro-Asiatic, Uralic, Dravidian, Kartvelian, and Altaic into a single super-family; highly speculative and rejected by most mainstream historical linguists.
- Austronesian expansion — the spread of Austronesian-speaking peoples from Taiwan across the Pacific and Indian Oceans; the family includes ~1,300 languages, the most geographically widespread family; characterized by a focus/topic-prominence system in Philippine-type languages.
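The triconsonantal root system noted above for Semitic languages can be sketched as slotting root consonants into a vowel pattern. The example uses the familiar Arabic root k-t-b in romanized form; the `C` placeholder notation and the function name are assumptions for this sketch.

```python
def apply_pattern(root, pattern):
    """Fill each 'C' slot in the pattern with the next root consonant."""
    consonants = iter(root)
    return "".join(next(consonants) if ch == "C" else ch for ch in pattern)


root = "ktb"  # Arabic root associated with writing

print(apply_pattern(root, "CaCaCa"))  # kataba  ("he wrote")
print(apply_pattern(root, "CāCiC"))   # kātib   ("writer")
print(apply_pattern(root, "maCCūC"))  # maktūb  ("written")
```

Only the uppercase `C` marks a slot, so lowercase letters in a pattern, including affixal consonants like the m- of maCCūC, are copied through unchanged.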
Sociolinguistics
- Dialect — a regional or social variety of a language; all speakers speak a dialect (standard varieties are dialects with prestige).
- Dialect vs language — the distinction is more political than linguistic; the quip “a language is a dialect with an army and a navy” is attributed to Max Weinreich.
- Sociolect — a social variety of language tied to class, age, ethnicity, or other social factors.
- Register — a variety of language tied to situation or purpose (e.g., legal register, medical register, casual speech).
- Code-switching — alternating between two or more languages or dialects within a conversation or even a single utterance; a skilled behavior in multilingual communities.
- Diglossia — the coexistence of two varieties of the same language in a community with distinct functions; high (H) variety used in formal contexts, low (L) in everyday speech. Classical Arabic vs. colloquial Arabic dialects is a textbook example.
- Pidgin — a reduced contact language with no native speakers, developed for communication between groups without a shared language; has simplified grammar and a restricted lexicon.
- Creole — a fully developed language with native speakers that arose from a pidgin; has expanded grammar and vocabulary. Haitian Creole is a well-studied example.
- Language death / endangerment — when a language loses all native speakers or is no longer transmitted to children. UNESCO estimates thousands of languages are endangered.
- Esperanto — the most widely spoken constructed (planned) international auxiliary language; created by L. L. Zamenhof, a Polish ophthalmologist, and published in 1887 under the pseudonym “Doktoro Esperanto” (“one who hopes”); designed to be simple, regular, and politically neutral, with vocabulary drawn primarily from Romance and Germanic roots; estimates of speaker numbers vary widely, up to about two million, with a small native-speaker community; the canonical example in linguistics of a consciously constructed language (conlang) that achieved real-world use.
- Labov’s sociolinguistics — William Labov pioneered the empirical study of linguistic variation and change; studies on New York City speech and African-American English were foundational.
- Speech community — a group sharing rules for the use and interpretation of speech.
- Linguistic relativity (Sapir-Whorf hypothesis) — the claim that the language one speaks influences or determines how one thinks. The strong version (linguistic determinism) is largely rejected; the weak version (linguistic relativity, i.e., language influences cognition to some degree) has empirical support in limited domains (e.g., color perception, spatial reasoning).
- Whorfian experiments — studies testing whether grammatical or lexical differences (e.g., in color terms, gendered nouns, tense systems) correlate with non-linguistic cognition; results are mixed and often contested.
- Variationist sociolinguistics — Labov’s framework treating linguistic variation as structured and quantifiable; uses socioeconomic class, age, gender, and ethnicity as independent variables to explain variation and predict change.
- Martha’s Vineyard study — Labov’s 1963 landmark study showing that centralization of the diphthongs /ay/ and /aw/ correlated with island identity and resistance to mainland influence; demonstrated that variation is socially meaningful.
- New York City vowels — Labov’s The Social Stratification of English in New York City (1966) documented class-stratified variation and style-shifting, especially for (r) and the raised /oh/; foundational to sociolinguistics.
- Observer’s paradox — Labov’s term for the methodological problem that the presence of an observer changes the language behavior being studied; the goal is to observe vernacular speech.
- Language change from below — changes that originate in the vernacular, below the level of social consciousness, and spread upward; contrasted with change from above (prestige-driven borrowing).
- Accommodation theory — Giles’s model in which speakers adjust their speech toward (convergence) or away from (divergence) an interlocutor’s style; a mechanism of short-term dialect contact.
Writing Systems
- Logographic system — symbols represent morphemes or words rather than sounds; Chinese characters are the canonical example. No writing system is purely logographic.
- Syllabary — symbols represent syllables; Japanese hiragana and katakana are syllabaries. The Japanese writing system combines syllabaries with Chinese-derived kanji (logographic characters).
- Alphabet — symbols represent individual consonants and vowels; the Phoenician alphabet (c. 1050 BCE) is the ancestor of most modern alphabets including Greek, Latin, Arabic, and Hebrew scripts.
- Abjad — a consonantal alphabet that does not write vowels (or writes them as optional diacritics); Arabic and Hebrew scripts are abjads.
- Abugida (alphasyllabary) — each symbol represents a consonant with an inherent vowel; vowel changes are marked by diacritics. Devanagari (Sanskrit, Hindi) is an abugida.
- Featural system — symbols encode phonological features; the Korean Hangul script (created 1443–1444 under King Sejong) is designed so that symbol shapes reflect place and manner of articulation.
- Braille — a tactile writing system for blind and visually impaired readers in which characters are raised-dot patterns; developed by Louis Braille (France) in 1824, building on Charles Barbier’s earlier military night-writing system. Each cell holds up to six dots arranged in two columns of three, giving 64 possible patterns (including the blank cell); it has been adapted to most of the world’s scripts.
- Cuneiform — generally regarded as the earliest known writing system, developed in Mesopotamia for Sumerian (c. 3200 BCE); a logosyllabic script pressed into clay with a reed stylus.
- Hieroglyphics — the ancient Egyptian writing system; a mixture of logographic, consonantal (abjad-like), and determinative signs.
- Linear B — a syllabic script used to write Mycenaean Greek; deciphered by Michael Ventris in 1952.
- Direction of writing — most scripts run left-to-right (Latin, Devanagari) and some right-to-left (Arabic, Hebrew); other attested directions include top-to-bottom columns (classical Chinese and Japanese) and boustrophedon, in which alternating lines reverse direction (found in some early Greek inscriptions).
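The six-dot Braille cell described in the list above maps neatly onto a small bit pattern, which is how the Unicode "Braille Patterns" block (starting at U+2800) encodes it. The following is a minimal sketch under that assumption; the function names `braille_cell` and `render_grid` are hypothetical and chosen only for illustration. Dots 1 to 3 run down the left column and dots 4 to 6 down the right.

```python
BRAILLE_BASE = 0x2800  # first code point of the Unicode "Braille Patterns" block


def braille_cell(dots):
    """Return the Unicode character for a set of raised dots numbered 1-6."""
    code = BRAILLE_BASE
    for dot in set(dots):
        if not 1 <= dot <= 6:
            raise ValueError(f"dot number out of range: {dot}")
        code |= 1 << (dot - 1)  # dot n is stored in bit n-1
    return chr(code)


def render_grid(dots):
    """Draw the cell as three rows of two positions ('o' raised, '.' flat)."""
    rows = [(1, 4), (2, 5), (3, 6)]  # left column is dots 1-3, right is 4-6
    return "\n".join(
        "".join("o" if d in dots else "." for d in row) for row in rows
    )


# Dot patterns for the first three letters of the basic Braille alphabet.
letters = {"a": {1}, "b": {1, 2}, "c": {1, 4}}
for letter, dots in letters.items():
    print(letter, braille_cell(dots))
    print(render_grid(dots))
```

Because each of the six positions is either raised or flat, the cell has 2^6 = 64 possible patterns, which is why one bit per dot is sufficient.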
Language Acquisition
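- First language acquisition (L1) — the process by which children acquire their native language; the core grammar is largely in place by around age five, although vocabulary and some complex constructions continue to develop through later childhood.
- Critical period hypothesis — Eric Lenneberg proposed (1967) that there is a biologically determined period (roughly before puberty) during which language acquisition occurs most naturally; after this, acquisition becomes harder and less complete. The case of “Genie” (a child raised in severe isolation and studied in the 1970s) is often cited as evidence, though her history of abuse and deprivation makes the case hard to interpret.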
- Babbling — canonical repetitive syllable production (e.g., “ba-ba-ba”) beginning around 6–8 months; a universal stage in infant language development.
- Holophrastic stage — the one-word stage (around 12 months) where a single word conveys a complex meaning.
- Telegraphic speech — two- and three-word utterances omitting function words and morphology; typical around 18–24 months.
- Overgeneralization — applying a regular rule to irregular forms (“goed,” “foots”), evidence that children internalize rules rather than simply memorizing forms.
- Language Acquisition Device (LAD) — Chomsky’s hypothetical innate mechanism enabling children to acquire language from limited input; the theoretical basis for Universal Grammar.
- Input hypothesis (Krashen) — learners acquire language when exposed to input slightly beyond their current level (“i + 1”); influential but contested in second language acquisition research.
- Second language acquisition (SLA) — the study of how people learn additional languages after L1 is established; influenced by age, input, motivation, and transfer from the L1.
- Interlanguage — Selinker’s term for the developing linguistic system of an L2 learner, which has systematic properties of its own and is neither L1 nor target L2.
- Poverty of the stimulus — the argument that children are exposed to insufficient data to induce the complex grammar they acquire, implying innate linguistic knowledge.
- Usage-based acquisition — constructivist alternative to nativist theories (Tomasello); children acquire language through domain-general social-cognitive mechanisms such as intention-reading and pattern-finding, without a language-specific LAD.
- Connectionist models — computational models of acquisition (Rumelhart and McClelland, 1986) that learn grammatical regularities from input via neural networks, without explicit rules; sparked major debate with generativist accounts over past-tense learning.
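The overgeneralization entry in the list above can be pictured as a default rule that applies whenever no memorized irregular form is available. The following is a toy sketch of that idea only, not a model of how children actually learn; the names `regular_past`, `past_tense`, and the two example lexicons are hypothetical, and the spelling rule is deliberately simplified.

```python
def regular_past(verb):
    """Apply the regular English past-tense rule (spelling simplified)."""
    return verb + "d" if verb.endswith("e") else verb + "ed"


def past_tense(verb, memorized_irregulars):
    """Use a stored irregular form if there is one; otherwise apply the rule."""
    return memorized_irregulars.get(verb, regular_past(verb))


# Hypothetical learner at two points in time: "went" is not yet stored
# in the early lexicon, so the regular rule fills the gap.
early_lexicon = {"come": "came"}
later_lexicon = {"come": "came", "go": "went"}

for verb in ["walk", "bake", "go"]:
    print(verb, past_tense(verb, early_lexicon), past_tense(verb, later_lexicon))
```

With the early lexicon the sketch yields "goed", the kind of error the entry describes; once "went" is stored, the memorized form blocks the rule. Connectionist accounts dispute that an explicit rule of this kind is needed at all, so the sketch illustrates only the rule-based reading.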
Key Figures
- Ferdinand de Saussure (1857–1913) — Swiss linguist whose posthumous Cours de linguistique générale (1916) founded structuralism. Key distinctions: langue (the abstract system of language) vs parole (actual instances of use); synchronic (at one point in time) vs diachronic (across time) analysis; signifier (sound-image) and signified (concept) as the two faces of the sign.
- Noam Chomsky (b. 1928) — American linguist who revolutionized the field with generative grammar (Syntactic Structures, 1957); argues for Universal Grammar and an innate language faculty; critiqued behaviorist accounts of language learning in his 1959 review of Skinner’s Verbal Behavior.
- Roman Jakobson (1896–1982) — Russian-American linguist who contributed to phonological theory (distinctive features), structural analysis of poetry, and communication functions; his six functions of language (referential, emotive, conative, phatic, metalingual, poetic) are widely cited.
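- Edward Sapir (1884–1939) — American anthropological linguist; documented dozens of Native American languages; his ideas on the relation between language and thought, later developed by his student Whorf, became known as the linguistic relativity (Sapir–Whorf) hypothesis, although the two never co-authored a statement of it; a founder of American structuralism.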
- Benjamin Lee Whorf (1897–1941) — American linguist and insurance inspector; extended Sapir’s ideas into the strong linguistic relativity claim based on analysis of Hopi.
- Leonard Bloomfield (1887–1949) — American structuralist whose textbook Language (1933) defined the field for a generation; rigorous behaviorist approach.
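- William Labov (1927–2024) — founder of quantitative sociolinguistics; showed that linguistic variation is socially structured, not random.
- William Jones (1746–1794) — British philologist whose 1786 address to the Asiatic Society argued that Sanskrit, Greek, and Latin descend from a common source, the observation usually credited with launching Indo-European comparative linguistics.
- Jacob Grimm (1785–1863) — German philologist and one of the Brothers Grimm of fairy-tale fame; formulated Grimm’s Law (1822), describing the systematic consonant shifts that separate Germanic from other Indo-European languages.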
- Karl Verner (1846–1896) — Danish linguist who explained the exceptions to Grimm’s Law with accent-conditioned voicing (Verner’s Law, 1875).
- J. L. Austin (1911–1960) — British philosopher of language; developed speech act theory in How to Do Things with Words (1962, posthumous).
- H. P. Grice (1913–1988) — British philosopher; formulated the Cooperative Principle and conversational maxims governing implicature.
- Gottlob Frege (1848–1925) — German mathematician and philosopher; introduced the sense/reference distinction and the principle of compositionality, foundational to formal semantics.
- Pāṇini (c. 4th century BCE) — ancient Indian grammarian who composed the Ashtadhyayi (“Eight chapters”), a systematic, near-complete grammar of Sanskrit using ~4,000 rules in a metalanguage anticipating formal grammar; considered one of the greatest intellectual achievements of antiquity.
- John Searle (1932–2025) — American philosopher who extended Austin’s speech act theory; distinguished five categories of illocutionary acts: assertives, directives, commissives, expressives, and declarations; argued (in the Chinese Room argument) against strong AI.
- Joseph Greenberg (1915–2001) — American linguist who established Niger-Congo and Afroasiatic as families through mass comparison; formulated the implicational typological universals (1963); proposed the Amerind macrofamily hypothesis (contested).
- Kenneth Pike (1912–2000) — American linguist who developed tagmemics and coined the emic/etic distinction: emic descriptions are meaningful within a specific language/culture; etic descriptions are language-universal and observer-derived.
- Otto Jespersen (1860–1943) — Danish linguist; authored A Modern English Grammar and The Philosophy of Grammar; described the cyclical weakening and renewal of negative markers (later named Jespersen’s Cycle) and argued that language change tends toward greater efficiency.
- Karl Brugmann (1849–1919) — German Neogrammarian; with Osthoff published the manifesto asserting the exceptionlessness of sound change; major contributor to Indo-European reconstruction.
- Antoine Meillet (1866–1936) — French historical linguist; student of Saussure; advanced the comparative grammar of Indo-European; argued for the social basis of language change.
- Zellig Harris (1909–1992) — American structuralist linguist; teacher of Chomsky; developed distributional analysis and transformational analysis before Chomsky; also pioneered computational linguistics and discourse analysis.
- Richard Montague (1930–1971) — American logician who applied intensional logic to natural language semantics (Montague Grammar); showed that natural language could be treated with the same formal rigor as logical languages.
- Eleanor Rosch (b. 1938) — cognitive psychologist whose research on prototype theory and basic-level categories reshaped cognitive semantics and challenged classical category theory.
- Uriel Weinreich (1926–1967) — American linguist; Languages in Contact (1953) is foundational for bilingualism research; son of Max Weinreich, who popularized the quip that a language is a dialect with “an army and a navy.”