Word Grammar
A brief introduction for graduate students
by Richard Hudson
Last changed 12 April 2008
Scope and main distinctive ideas
Main headings: lexical relations - morphology - syntax - lexical semantics - combinatorial semantics
The main idea of WG is that language is a cognitive network - a network
of concepts for all the elements of a linguistic analysis such as words, phonemes,
relations, meanings, etc. So:
- the theory of language is embedded in a (fairly uncontroversial) theory
of knowledge, and stresses the similarities and links between language
and other kinds of knowledge;
- language turns out to be a somewhat separate sub-network, but probably
not a distinct module (certainly not in the strict sense of Fodor);
- the same theory applies to all aspects of language: syntax, morphology
and semantics have been quite thoroughly developed, but phonology is more
or less untouched;
- as in general knowledge, the basic logic is multiple default inheritance.
These ideas are easiest to explain in lexical relations and in morphology,
but they also apply to syntax, where they are much more controversial.
Within semantics, they are less controversial in lexical semantics
(where there is no clear orthodoxy) than in combinatorial semantics.
Lexical relations
A word is obviously and uncontroversially a node in a network of formal and semantic
relations which link it to a variety of other words; for example, the word one
is linked by form to the word won or wan (depending on accent) and
by meaning to the word you (think of One has to laugh, meaning the
same, except for style, as You have to laugh). This little network is shown
in Figure 1.

Figure 1
- This diagram is typical of WG notation, which consists of nodes
connected to each other by arrows, with labels attached to the
arrows (here 'meaning' and 'form') as well as to the nodes.
- Like other WG networks it constitutes a grammar because it sanctions
any structure in which those nodes and arrows carry those labels - e.g. this
network sanctions a structure in which ONE means '1' and has one of the forms
indicated.
- Because linguistic structures, e.g. the structures for particular words,
are sanctioned by a network they themselves must be networks; for example,
the structure of any example of ONE is a little network which links this word
to its meaning and form (as well as to other nodes such as grammatical word
classes). Figure 2 shows this structure for a particular token which we call
one (contrasting with the lexeme ONE). We shall see that this is
as true of syntax as it is of lexical relations.

Figure 2
This figure introduces another basic idea of WG: that information is generalised
by means of default inheritance down an is-a hierarchy.
- The diagram includes two extra links labelled 'is-a', meaning that
ONE is-a pronoun and that one is-a (i.e. is an example of) ONE. Is-a
links are very common and very important, so they have a special kind of line
with a small triangle resting on the super-category. The label 'is-a' is redundant
and will be omitted in later diagrams.
- Default inheritance transmits properties down the is-a hierarchy
by default (i.e. unless they are blocked by more specific properties): so
because one is-a ONE, and the meaning of ONE is 'people', the meaning
of one is also 'people'; and likewise for their forms and word classes.
(This is why the structures for individual linguistic items such as word tokens
or sentences must have the same network form as the grammar itself.)
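The way a token inherits from its lexeme can be sketched in a few lines of Python. This is only an illustration, not part of WG itself: the dictionary layout, the function name `inherit`, the placeholder form `{one}` and the token-specific meaning used to show blocking are all assumptions made for the example. The node names and the meaning 'people' are taken from the description of Figures 1 and 2.

```python
# Toy network for Figures 1-2. The data layout is an assumption made for
# illustration; WG is a theory of language, not a program.
ISA = {
    "one": "ONE",       # the token one is-a the lexeme ONE
    "ONE": "pronoun",   # the lexeme ONE is-a pronoun
}

# Labelled links, stored on the node they start from.
LINKS = {
    "ONE": {"meaning": "people", "form": "{one}"},  # "{one}" is a placeholder form
}


def inherit(node, relation):
    """Return the nearest value for `relation` on `node` or its super-categories."""
    while node is not None:
        if relation in LINKS.get(node, {}):
            return LINKS[node][relation]   # a more specific value blocks the default
        node = ISA.get(node)               # climb one step up the is-a hierarchy
    return None


print(inherit("one", "meaning"))   # people (inherited from ONE)

# A token may carry a more specific value of its own, which blocks the default.
LINKS["one"] = {"meaning": "the speaker"}   # hypothetical token-specific meaning
print(inherit("one", "meaning"))   # the speaker
print(inherit("one", "form"))      # {one} (still inherited from ONE)
```

Because the token and the lexeme are stored in the same kind of structure, the same lookup serves both, which is the point made in the parenthesis above.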
Morphology
Morphology illustrates three other important characteristics of WG:
- how defaults are overridden by exceptions,
- how relations can be sub-classified,
- how one node may is-a several super-categories at the same time and inherit
from all of them (multiple default inheritance).
Figure 3 gives a somewhat simplified analysis of English past tenses, showing
the default regular pattern and the exception of went which overrides
it.

Figure 3
- The default pattern shows that a past-tense verb's 'whole' (its complete
word-form) consists of its stem plus its suffix {ed}; the whole
is a word-form which contains these two morphemes as its parts. The relations
'whole', 'stem' and 'suffix' are all special cases of the relation 'form'
used in Figure 1, so they each is-a form. These is-a relations between
relations are shown in Figure 4, along with the hierarchy for 'part' (the
relationship between the entire word-form and its parts).
- The default applies by inheritance to any past-tense verb, such as SOW:past
(the past tense of SOW); but it is overridden in the case of GO:past,
whose own value for 'whole' overrides the default value that it would
otherwise inherit from past, because GO:past is-a past.
- A past-tense verb normally inherits from two super-categories: its lexeme
and the inflectional category; e.g. SOW:past inherits its stem from SOW and
its suffix from past. This is multiple default inheritance.

Figure 4
The main theoretical points that we carry forward from lexical relations and
morphology to syntax are:
- Both the grammar and individual linguistic items that we analyse have the
form of a network. I haven't tried to define 'network', but the examples
show that a network is more complex than a phrase-structure tree - nodes may
be connected to multiple nodes in all directions.
- Not only nodes but also the links between them are classified in is-a hierarchies
which allow multiple default inheritance.
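The past-tense analysis can be mimicked with a small lookup over multiple is-a links. The sketch below is an assumption-laden simplification: the dictionaries, the helper names `inherit` and `whole`, and the plain strings used for morphemes are invented for the example, and 'nearest value wins' is only a rough stand-in for the logic of default inheritance.

```python
from collections import deque

# Is-a links between word nodes; a node may have several super-categories.
ISA = {
    "SOW:past": ["SOW", "past"],   # inherits from its lexeme and its inflection
    "GO:past": ["GO", "past"],
    "SOW": ["verb"],
    "GO": ["verb"],
    "past": ["verb"],
}

# Properties stored on individual nodes (morphemes shown as plain strings).
PROPS = {
    "SOW": {"stem": "sow"},
    "GO": {"stem": "go"},
    "past": {"suffix": "ed"},
    "GO:past": {"whole": "went"},   # the exception, stored on the specific node
}


def inherit(node, relation):
    """Search breadth-first up the is-a links; the nearest value wins."""
    queue, seen = deque([node]), set()
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        if relation in PROPS.get(current, {}):
            return PROPS[current][relation]
        queue.extend(ISA.get(current, []))
    return None


def whole(node):
    """Use a stored 'whole' if there is one; otherwise apply the default rule."""
    stored = inherit(node, "whole")
    if stored is not None:
        return stored
    return inherit(node, "stem") + inherit(node, "suffix")


print(whole("SOW:past"))   # sowed
print(whole("GO:past"))    # went
```

The sketch does not say what should happen when two equally near super-categories offer conflicting values; it simply takes whichever is found first.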
Syntax
The most controversial idea in WG syntax is that phrase structure is redundant
because all its work can be done, and done better, by means of dependencies
between individual words. (Coordination is an exception - see
below.) For example, Figure 5 gives the complete syntactic analysis of the
relations among the words in the sentence Syntactic dependencies make phrase
structure redundant. (The term 'sharer' is the same as the traditional 'complement',
and corresponds to XCOMP in LFG and PRED in HPSG; the idea is that structure
doubles up as object of make and subject of redundant, so the
word redundant 'shares' this word with make.)
Figure 5
This kind of syntactic analysis follows naturally from the general theory established
so far:
- Each word in the sentence is in the centre of a small network of
links to other words, and these networks combine into a network for the whole
sentence. Notice that this network is not equivalent to a phrase structure
tree because of the double dependency link between structure and the
words make and redundant. This is structure sharing as
in HPSG, where one element provides the value for two different attributes.
- The network is generated by inheritance from the grammar network
which sanctions each of the individual dependencies; for example, the dependencies
round make are sanctioned by the part of the entry for MAKE which allows
it to have a subject, an object and a sharer. This is a constraint-based
grammar, though it allows defaults to be overridden.
These two points balance each other: the network basis allows an unlimited
range of possible inter-connections, of which the grammar only sanctions a limited
subset. As in other theories, the notation actually puts very few limits on
possible structures, so linguistic universals are expressed as limits on what
grammars can allow (e.g. rules for raising but not for lowering - see
below).
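To make the contrast with a phrase-structure tree concrete, the dependencies described for Figure 5 can be written as (head, relation, dependent) triples. This is a rough sketch: the labels on syntactic and phrase are not given in the text, so 'adjunct' is an assumption, as are the toy valency entries and the function names.

```python
# Dependencies for "Syntactic dependencies make phrase structure redundant".
DEPS = [
    ("make", "subject", "dependencies"),
    ("make", "object", "structure"),
    ("make", "sharer", "redundant"),
    ("redundant", "subject", "structure"),      # structure is shared
    ("dependencies", "adjunct", "Syntactic"),   # label assumed
    ("structure", "adjunct", "phrase"),         # label assumed
]

# Toy grammar: the valents that each head word allows (assumed entries).
VALENCY = {
    "make": {"subject", "object", "sharer"},
    "redundant": {"subject"},
}


def heads_of(word):
    """All the words on which `word` depends."""
    return [head for head, _, dependent in DEPS if dependent == word]


def shared_words():
    """Words with more than one head; a tree would allow at most one."""
    dependents = {dependent for _, _, dependent in DEPS}
    return sorted(w for w in dependents if len(heads_of(w)) > 1)


def unsanctioned():
    """Non-adjunct dependencies that the head's entry does not allow."""
    return [
        (head, rel, dep)
        for head, rel, dep in DEPS
        if rel != "adjunct" and rel not in VALENCY.get(head, set())
    ]


print(heads_of("structure"))   # ['make', 'redundant']
print(shared_words())          # ['structure']
print(unsanctioned())          # []
```

The data structure itself would accept any set of triples; it is the check against the grammar (`unsanctioned`) that limits what counts as a possible sentence structure.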
Some arguments for dependencies:
- Any rule which relates two words implies a dependency between them:
- one depends on the other for its choice of word class, inflection or
lexical item (government or agreement);
- one takes its position from the other;
- one modifies the meaning of the other.
- Typically these different kinds of dependency coincide - e.g. if one word
governs the inflectional class of the other, the latter takes its position
from the former and modifies the meaning of the former. In other words, grammatical
dependencies are usually cluster concepts. For example, because dependencies
in Figure 5 above is the subject of make, we can predict several facts
about it:
- make agrees with it;
- dependencies precedes make;
- in semantic structure, dependencies provides the 'maker' of the
verb's meaning.
- Almost all modern theories of syntax accept endocentricity: every
phrase has a single head which determines the characteristics of the entire
phrase. 'Dependency' is just the name for the relation between all the other
words in the phrase and its head word (more accurately: between the heads
of all the other phrases within the phrase and its head word).
- It is uncontroversial that many languages are consistently head-final or
head-initial. In such a language we must be able to generalise across all
dependents, so the notion 'dependent' must be available. (It is not enough
to have a collection of more specific categories such as 'complement', 'specifier',
'adjunct'.)
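The last argument can be illustrated with a check that generalises over every dependency regardless of its label. The sketch below is illustrative only: the triples repeat the relations described for Figure 5 (with 'adjunct' as an assumed label), the function names are invented, and the verb-final ordering at the end is made up to show the consistent case and is not data from any real language.

```python
WORDS = ["Syntactic", "dependencies", "make", "phrase", "structure", "redundant"]

DEPS = [
    ("make", "subject", "dependencies"),
    ("make", "object", "structure"),
    ("make", "sharer", "redundant"),
    ("redundant", "subject", "structure"),
    ("dependencies", "adjunct", "Syntactic"),   # label assumed
    ("structure", "adjunct", "phrase"),         # label assumed
]


def dependent_sides(words, deps):
    """For each dependency, say whether the dependent stands before or after its head."""
    position = {word: index for index, word in enumerate(words)}
    return {
        (head, rel, dep): "before" if position[dep] < position[head] else "after"
        for head, rel, dep in deps
    }


def head_direction(words, deps):
    """Classify the ordering by looking at all dependents at once."""
    sides = set(dependent_sides(words, deps).values())
    if sides == {"before"}:
        return "head-final"
    if sides == {"after"}:
        return "head-initial"
    return "mixed"


print(head_direction(WORDS, DEPS))   # mixed

# Hypothetical verb-final ordering, invented for the example.
toy_words = ["dependencies", "structure", "make"]
toy_deps = [("make", "subject", "dependencies"), ("make", "object", "structure")]
print(head_direction(toy_words, toy_deps))   # head-final
```

The test for consistency never mentions 'subject', 'object' or 'adjunct'; it needs only the general notion 'dependent'.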
Some non-arguments for phrase structure:
- Phrasal nodes are necessary for sub-categorisation; e.g. a verb's
subject must be an NP, not just a noun, because mere nouns are not allowed
(e.g. *Boy came).
- No: the subject is just a word, but (of course) this word's syntactic
requirements must be satisfied; since boy always needs a determiner,
it needs one when it is subject. It is redundant to require dependents
to be maximal projections.
- In any case endocentricity means that the head word carries precisely
the same information as a mother node would (apart from bar levels, which
aren't needed).
- Very often a head word selects a particular lexical item (e.g. a particular
preposition) as a dependent - e.g. rely selects on, cope
selects with; in this case a phrase node (PP) would just get in
the way because it would not distinguish one lexical head from another.
In short, it is more explanatory to say that rely needs on
than to say that it needs a PP whose head is on.
- Phrasal nodes are necessary for the semantics; e.g. the phrase syntactic
dependencies carries a meaning which is not available on the individual
words, and the whole clause's meaning is different again.
- No: remember that the word dependencies in Figure 5 is distinct from
DEPENDENCY:plural; the relation between them is that the former is-a
the latter. Consequently the former can have a meaning which is distinct
from the latter - thanks to the presence of syntactic, the word
dependencies in this sentence actually means 'syntactic dependencies'.
Similarly, the word-token make carries the meaning of the whole
sentence, which is a specific example of the general meaning of the verb
MAKE. In every case, the phrase's meaning is the meaning of its head word.
- Moreover, dependencies can be added one at a time, so a standard phrase
structure for syntax can be matched in the semantics even though the syntax
is completely flat. For example, Figure 6 shows how easily a semantic structure
can be added to a dependency structure and how it automatically shows the
kind of bracketing you find in a typical phrase-structure analysis; this
is called 'semantic phrasing'. Notice
that every syntactic dependency is matched by a dependency in the semantics,
and that every semantic dependency defines a new concept which is a hyponym
of the head-word's meaning. (For more on semantic phrasing see Combinatorial
semantics.)

Figure 6
- Phrases move as single units, so we must be able to move the entire
phrase rather than just one of its words - i.e. rules are always 'structure-dependent';
e.g. if the subject moves round the verb, it is the entire subject phrase
that moves, not just its head word.
- No: dependencies guarantee that phrases will be intact because each
word always takes its position from the word on which it depends. If we
move the head word, all the other words that take their position from
it will move with it.
- We can make a simple distinction in dependency analyses between dependencies
that are relevant to word order and those that are not: a word's 'highest'
dependency is, while all others are not (i.e. words can raise
into higher structure but they cannot lower). For example, in Figure 5
structure depends on both make and redundant, but
only the former dependency is relevant to word order (because redundant
depends on make, so make is 'higher' than redundant). Consequently,
structure follows the word-order rules for objects rather than
those for subjects, hence the possibility of stylistic reordering so that
a long object can stand after the sharer/complement: ... make redundant
phrase structure of the familiar kind with ....
- In any case, some phrases do not move around as single units; e.g. in
German it is possible to move a participle or infinitive to the start
of the clause without moving the rest of the 'verb phrase': Paul hat
einen Apfel gegessen (Paul has an apple eaten i.e. Paul ate an apple)
can change to Gegessen hat Paul einen Apfel. This is easier to
explain in terms of dependencies than in terms of phrases because we can
say that any dependents of the participle may 'raise' so that they are
also dependents of the auxiliary. This choice is made separately for each
dependent rather than once and for all for the whole phrase.
- Phrases are justified by coordination,
because the conjuncts of a coordinate structure must be complete phrases.
- No: conjuncts are often less than a complete phrase; e.g. in I drink
coffee at 11 and tea at 4, the conjuncts are coffee at 11 and
tea at 4, neither of which is a phrase.
- In WG coordination is handled in terms of a kind of constituent structure
which is much more primitive than phrase structure: continuous strings
of words whose only unifying factor is that all 'external dependencies'
apply equally to all conjuncts. They are shown by brackets: two or more
[conjuncts] inside {a complete coordinated structure}. Figure 7 gives
the structure for the above example.

Figure 7
- Phrase structure is needed in order to define the c-command relation,
which in turn is needed in order to define possible relations between anaphors
(e.g. reflexives) and their antecedents: an anaphor must be c-commanded by
its antecedent; e.g. John hurt himself but not *Himself hurt John.
C-command is defined specifically in terms of the dominance relations which
exist in phrase structure but not in dependency structure; in particular,
subjects and objects are asymmetrical because the former c-commands the latter.
- This one is harder to dispose of because it is true that dependency
structures show no structural asymmetry between subjects and objects:
both are simply dependents of the verb. However, semantic phrasing (defined
above) does make them asymmetrical if we require verbs to combine semantically
with their objects before their subjects. (This is probably necessary
anyway in order to explain why there are so many idioms that consist of
a verb and its object, but so few that consist of a subject and verb with a
variable object.) The meaning of hurt himself would then be 'x
hurt x', with x ready to be bound by the subject.
- In any case, others have pointed out that c-command may not be the relevant
relation for examples like John talked to Mary about herself or
For himself, John bought a book.
- Phrase structure is needed in order to distinguish dependents. For
example, objects are sisters of the verb whereas subjects are sisters of the
verb-phrase (e.g. V').
- No. Dependents can be distinguished more efficiently by means of labels,
as in WG. This is more efficient because it allows classification and
generalisation of the relations. For example, we saw above that some languages
have (more or less) consistent head-final (or head-initial) word order,
and that at least in these languages it is important to be able to generalise
across all dependents; but even in these languages subjects, objects,
adjuncts and so on also need to be distinguished. There is no natural
way to achieve both of these goals in terms of phrase-structure geometry,
but it is easy if one relation may 'be-a' another. Figure 8 gives
a simplified analysis of the dependency relations for English, where:
- valents are head-licensed, adjuncts are self-licensed;
- pre-dependents precede the head word, post-dependents follow it;
- subjects and complements are-a valent, but one is-a pre-dependent,
the other post-dependent;
- pre- and post-adjuncts both are-a adjunct, once again differing
between pre- and post-dependent.
- Classified dependencies in syntax are very similar to the classified
semantic roles (theta roles) that most linguists assume in semantics.
If they are OK in semantics, why not in syntax too?

Figure 8
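The classification described for Figure 8 can be sketched as an is-a hierarchy over relations rather than over words. This is a minimal illustration under stated assumptions: the top node 'dependent', the property names `licensed_by` and `position`, and the helper functions are invented for the example; the is-a links themselves follow the bullet points above.

```python
# Is-a links between dependency relations.
REL_ISA = {
    "valent": ["dependent"],            # top node 'dependent' is assumed
    "adjunct": ["dependent"],
    "pre-dependent": ["dependent"],
    "post-dependent": ["dependent"],
    "subject": ["valent", "pre-dependent"],
    "complement": ["valent", "post-dependent"],
    "pre-adjunct": ["adjunct", "pre-dependent"],
    "post-adjunct": ["adjunct", "post-dependent"],
}

# Each general relation states one fact; specific relations inherit them.
REL_PROPS = {
    "valent": {"licensed_by": "head"},
    "adjunct": {"licensed_by": "self"},
    "pre-dependent": {"position": "before head"},
    "post-dependent": {"position": "after head"},
}


def supers(rel):
    """All relations that `rel` is-a, nearest first."""
    found, queue = [], list(REL_ISA.get(rel, []))
    while queue:
        current = queue.pop(0)
        if current not in found:
            found.append(current)
            queue.extend(REL_ISA.get(current, []))
    return found


def is_a(rel, category):
    """True if `rel` is `category` or a special case of it."""
    return rel == category or category in supers(rel)


def prop(rel, name):
    """Inherit a property from the relation itself or its super-relations."""
    for candidate in [rel] + supers(rel):
        if name in REL_PROPS.get(candidate, {}):
            return REL_PROPS[candidate][name]
    return None


print(prop("subject", "position"))          # before head
print(prop("subject", "licensed_by"))       # head
print(prop("post-adjunct", "licensed_by"))  # self
print(is_a("complement", "dependent"))      # True
```

A word-order rule can thus be stated once for 'pre-dependent' or even for 'dependent', while 'subject' and 'complement' remain distinct labels.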
Lexical semantics
Lexical semantics is inseparable from the study of encyclopedic knowledge because
the sense of a word such as DOG is the same everyday concept Dog which we use
when dealing with dogs - recognising them, thinking about them, running from
them etc. Like other concepts, this is represented as a node linked to other
nodes by arrows, but the other nodes in this case are mostly not words but other
non-linguistic concepts - the concepts for mammals, kennels, barking, meat,
etc. Some of these concepts are sensory, e.g. schematic images, smells, sounds
and so on. Figure 9 shows a few of the multiplicity of relations that, in combination,
define the concept Dog. The network structure of WG makes it ideally suited
to this kind of lexical semantics, where encyclopedic concepts are used directly
as word-meanings.

Figure 9
Some words have meanings which may be more 'linguistic' than 'encyclopedic'
in the sense that we may never use them except when speaking and listening -
i.e. they involve Slobin's 'thinking for speaking'. For example, the verb RIDE
refers to Riding, a super-category which embraces travel on a horse or a bicycle
but not a car (compare He rode the horse/bicycle/*car to the station.).
Although this concept is clearly important when we are speaking, it may not
be relevant otherwise; for example, there are other languages (e.g. German)
which have no word for this concept, and yet there is no reason to think that
Germans 'think differently' in matters of transport.
The main features of WG lexical semantics are these:
- Word meanings are concepts which are more or less tightly embedded in ordinary
encyclopedic knowledge.
- Concepts are defined by their relations to other concepts - there is no
'definition' apart from these relations. All the characteristics of a concept
are defined in terms of relations to other concepts.
- Concepts are applied by the logic of multiple default inheritance, so each
concept really stands for 'the typical ...'. For example, Dog (the typical
dog) has four legs and is about a foot high, but three-legged and four-foot-high
dogs can be accommodated as exceptions.
- Most of the concepts that are referred to in lexical semantics are the meanings
of words, so lexical semantics is a network of relations among word-meanings.
- The relations involved go well beyond the traditional 'lexical relations'
of semantics: hyponymy, synonymy, etc. For example, a dog is the 'barker'
of Barking and the 'inhabitant' of Kennel.
Combinatorial semantics
Combinatorial semantics explains how dependents modify the meaning of the head
word. As explained earlier, each dependent
defines a different concept which is-a the sense of the head-word, and if the
various dependents build on each other's definitions the result is 'semantic
phrasing'. For example, if loves means X Loving Y, then loves Mary
means X Loving Mary and John loves Mary means John Loving Mary, which
is-a X Loving Mary and X Loving Y. (Verb meanings are given as gerunds so as
to separate them from the effects of tense and mood.) Because the effect of
the dependents is shown on the meaning of the head word itself, the latter carries
the meaning of the entire phrase.
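Semantic phrasing can be pictured as a chain of concepts, each of which is-a the one before and adds the contribution of one dependent. The following is a minimal sketch: the class `Concept`, the helper names and the role name 'lover' are assumptions for illustration ('object-of-affection' is the label used for Mary later in this section), and the object is combined before the subject, as suggested in the discussion of c-command.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Concept:
    label: str
    isa: Optional["Concept"] = None             # the more general concept
    roles: dict = field(default_factory=dict)   # relations added at this step

    def role(self, name):
        """Look for a role here, then up the is-a chain."""
        node = self
        while node is not None:
            if name in node.roles:
                return node.roles[name]
            node = node.isa
        return None


def add_dependent(head_meaning, role, value, label):
    """Each dependent defines a new concept which is-a the head word's meaning so far."""
    return Concept(label, isa=head_meaning, roles={role: value})


def chain(concept):
    """Labels from the most specific concept up to the most general."""
    labels = []
    while concept is not None:
        labels.append(concept.label)
        concept = concept.isa
    return labels


loving = Concept("X Loving Y")   # sense of loves
loving_mary = add_dependent(loving, "object-of-affection", "Mary", "X Loving Mary")
john_loving_mary = add_dependent(loving_mary, "lover", "John", "John Loving Mary")

print(chain(john_loving_mary))
# ['John Loving Mary', 'X Loving Mary', 'X Loving Y']
print(john_loving_mary.role("object-of-affection"))   # Mary
```

The most specific concept, which belongs to the head word, gives access to everything contributed by the dependents, so no separate phrasal node is needed to hold the meaning of the phrase.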
Each concept in the semantic structure is thus defined by its is-a link to
some more general concept plus its relationship to the meaning of one dependent;
for example, the concept X Loving Mary is defined by its is-a link to X Loving
Y, plus its relationship to the concept Mary. This semantic relationship reflects
the syntactic dependency between the words concerned, and in general there is
a close match between the syntactic dependencies and the semantic relationships.
However a syntactic dependent may correspond to at least three patterns in semantics:
- a semantic subordinate - i.e. a distinct concept in a specified semantic
relationship, e.g. Mary is the 'object-of-affection' in loves Mary;
- is-a - i.e. the word's meaning is-a the meaning of its dependent, e.g. will
love (where love depends syntactically on will) refers to
a single event, defined as 'Loving in the future' (i.e. a special case of
Loving, whose time follows Now);
- a semantic superordinate - i.e. a 'predication' about the head-word, e.g.
John will obviously love Mary refers to an example of John
Loving Mary in the future which is obvious. Figure 10 shows the relevant structure.

Figure 10
The main points to notice about this semantic analysis are these:
- Thanks (in part) to the network notation, it is totally integrated into
the rest of the linguistic analysis as well as into a (potential) analysis
of world knowledge.
- The semantics is totally combinatorial and local because each word's contribution
is defined in relation to its dependent and/or the word on which it depends.
(Each word is linked to its sense by a straight arrow.)
- The deictic semantics of tense is handled by relating the time of the event
to that of the word itself - i.e. the time of utterance.
- Each concept is defined by its relations to other concepts, so in principle
the node labels carry no analytical information at all - they are just mnemonics.
The major gap in WG semantics is the treatment of quantifiers, where work so
far has been rather tentative.
History of Word Grammar
WG has been developing since the early 1980s:
- The first version is described in Hudson 1984.
- Many changes since then are described in more recent publications (Hudson
1990; Hudson 2001).
It grew out of 'Daughter Dependency Grammar' (Hudson 1976), which grew
out of a mixture of:
- Systemic Functional Grammar (Halliday 1985, Hudson 1971)
- Dependency Grammar (Anderson 1971; Tesnière 1959)
- Generative Grammar (Chomsky 1965)
Other theories
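WG can be compared with the following alternative theories: Chomsky's
generative grammar, Head-Driven Phrase Structure Grammar (HPSG; Pollard and
Sag 1994), Cognitive Grammar (Langacker 1990) and Construction Grammar
(Goldberg 1995). The table under 'Comparison with other theories' compares
them point by point.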
References
Anderson, J. (1971). The grammar of case: towards a localistic theory. Cambridge: Cambridge University Press.
Chomsky, N. (1965). Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
Goldberg, A. (1995). Constructions. A Construction Grammar Approach to Argument Structure. Chicago: University of Chicago Press.
Halliday, M. (1985). An Introduction to Functional Grammar. London: Arnold.
Hudson, R. (1971). English Complex Sentences. An introduction to systemic grammar. Amsterdam: North Holland.
Hudson, R. (1976). Arguments for a Non-transformational Grammar. Chicago: University of Chicago Press.
Hudson, R. (1984). Word Grammar. Oxford: Blackwell.
Hudson, R. (1990). English Word Grammar. Oxford: Blackwell.
Hudson, R. (2001). Encyclopedia of English grammar and Word Grammar.
Langacker, R. (1990). Concept, Image and Symbol. The Cognitive Basis of Grammar. Berlin: Mouton de Gruyter.
Pollard, C. and Sag, I. (1994). Head-Driven Phrase Structure Grammar. Chicago: University of Chicago Press.
Tesnière, L. (1959). Éléments de syntaxe structurale. Paris: Klincksieck.
Comparison with other theories
| WG | Chomsky | HPSG | Cognitive Grammar | Construction Grammar |
| --- | --- | --- | --- | --- |
| Language is a cognitive network. | no | no | yes | yes |
| Multiple default inheritance is used. | no | yes | yes ('schematicity') | disputed |
| Lexical and general facts are in the same hierarchy. | no | yes? | yes | yes |
| Syntactic structure is separate from semantic. | unclear (Is LF semantics?) | no | no | yes |
| Syntax is monostratal. | no | yes | (yes) | yes |
| Dependencies, not PS, are basic in syntax. | no | maybe | maybe | no |
| Syntax allows structure sharing. | yes (traces) | yes | no | no? |
| Syntactic dependencies are labelled. | no | yes | no | yes |
| All relations are classified hierarchically. | no | no | no | no |
For more information about WG
A great deal more information is available on the WG
home page. This includes:
- a freely downloadable Encyclopedia of English Grammar and Word Grammar,
- a variety of articles on WG, including several that are general introductions
to the theory,
- some chapters from the most recent monograph about WG, English Word Grammar,
- some teaching materials from two lecture courses for undergraduates.