68. How Computers Get Grammar Wrong (1)



Word processors sometimes say correct grammar is wrong, especially when interrupted structures or ellipsis are involved.



Most computer word processors include a grammar-checking facility to help writers find and correct their own grammar mistakes. However, this kind of aid is nowhere near as reliable as spellchecking software. It often misses grammar mistakes altogether (see 138. Test your Command of Grammar), or it underlines wording that is perfectly grammatical. In the latter case, many writers who cannot see what they have done wrong understandably choose to accept the computer’s suggested rewording, trusting that their own poor understanding of grammar is the cause of their confusion.

Recognising the failings of grammar-checking software is an important way of gaining confidence as a writer, and it also helps to prevent computer companies from having an undue influence on language use. In this and the next post, I wish to present a number of examples of computer software wrongly criticising English grammar usage, and to offer some general guidelines on when this is likely to happen.



There seem to be a variety of reasons why computer word processors incorrectly criticise written grammar. Two of the most important involve interrupted structures and ellipsis.


1. Failure to Recognise an Interrupted Structure

Interrupted structures (my own term; the technical name is “discontinuity”) are examined in detail in the Guinlist post 2. Interrupted Structures. They are partner words split by other words that do not belong to the partnership. For example, in the phrase the industrialization/food security conundrum in China, the word the is a partner of conundrum, and not of any other noun in the phrase. The reason is the English grammar rule that the before two or more nouns placed directly next to each other “goes with” the last of them (see 38. Nouns Used Like Adjectives). Hence, the “structure” the … conundrum is “interrupted” by the words industrialization/food security.

This particular interrupted structure is one of many that are not prone to underlining by my word processor. However, for some reason, certain others are. Consider this:

(a) Learner motivation may occur because of the possibility mentioned above that learners can enjoy reading aloud.

The word sometimes given coloured underlining here is learners, the proposed “correction” being to remove –s. The explanation is that “a noun and the words that modify that noun must agree in number”. This means the computer treats the word that as partnering (“modifying”) the noun learners, as if it were the singular of adjectival those, and hence expects learners to mirror its singular form.

The true partner of that is the earlier noun possibility. That and the words after it are describing possibility in an adjective-like way. They have to come after it and not before because they are multi-word (see 109. Placing an Adjective after its Noun). They are needed instead of the more common of + noun (see 78. Infinitive versus Preposition after Nouns) because they contain a verb with its subject.

This is a completely different kind of that from the singular of those, even being pronounced differently (with /ə/ instead of /æ/). Its accompaniment by its own subject and verb in fact means it is a conjunction. Describing an earlier noun is one of many uses that it has (see 153. Conjunction Uses of “that”).

The interrupting words here – the probable cause of the computer’s error – are the participle phrase mentioned above. They are also clarifying the meaning of possibility (see 52. Participles Placed Just After their Noun). They seem to have made the computer program “forget” possibility by the time it comes to that, so that the link with learners is made instead.
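The kind of mistake described above can be imitated with a very small program. The following Python sketch is purely hypothetical: it is not taken from any real word processor, and the function names and the crude plural test are my own assumptions. It simply flags any this or that placed directly before a plural-looking word, without ever looking back for an earlier partner such as possibility.

```python
# Toy illustration only: real grammar checkers are far more elaborate,
# and nothing here is copied from any actual word processor.

SINGULAR_DETERMINERS = {"this", "that"}


def looks_plural(word):
    """Crude, assumed plural test: ends in -s but not in -ss."""
    cleaned = word.lower().strip(".,;:")
    return cleaned.endswith("s") and not cleaned.endswith("ss")


def naive_agreement_flags(sentence):
    """Flag every 'this'/'that' directly followed by a plural-looking word."""
    words = sentence.split()
    flags = []
    for current, following in zip(words, words[1:]):
        if current.lower() in SINGULAR_DETERMINERS and looks_plural(following):
            flags.append((current, following.strip(".,;:")))
    return flags


sentence_a = ("Learner motivation may occur because of the possibility "
              "mentioned above that learners can enjoy reading aloud.")
print(naive_agreement_flags(sentence_a))  # [('that', 'learners')]
```

A rule of this sort only looks at neighbouring words, so it has no means of noticing that the word that in sentence (a) belongs with possibility.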

Other examples of interrupted structures confusing the computer are:

(b) Catholic worshippers in those pre-Vatican II days had to answer in Latin.

(c) Prepositions have to be used in sentences with a noun, often called their “object”, that is usually positioned straight after them.

In (b), the computer wants those to be that because Vatican II is singular. It appears to miss the fact that Vatican II is a noun used like an adjective and hence not the one that that goes with (that/those goes with the last noun in a group – days – just like the). Perhaps the presence of the Roman numeral II is a factor in this computer error, but the interruptedness of the structure may contribute too.
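The rule that the, that and those go with the last noun in a group can be sketched in Python as follows. This is only an illustration under stated assumptions: the word-class labels are supplied by hand (a real program would have to work them out for itself, which is where errors can creep in), and the function name determiner_partner is my own invention.

```python
def determiner_partner(tagged_words, det_index):
    """Return the last noun in the unbroken run of nouns after the determiner."""
    partner = None
    for word, tag in tagged_words[det_index + 1:]:
        if tag != "NOUN":
            break  # the noun group has ended
        partner = word
    return partner


# Hand-labelled words from sentence (b); the labels are assumptions.
phrase_b = [("those", "DET"), ("pre-Vatican II", "NOUN"),
            ("days", "NOUN"), ("had", "VERB")]
print(determiner_partner(phrase_b, 0))  # days
```

With the labels given, those is paired with the plural days rather than with the singular Vatican II, which is the analysis the computer appears to have missed.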

In (c), the computer points out that the relative pronoun that cannot have a comma before it. This is generally true (see 34. Relative Pronouns and Commas), but not here because of the interrupting words often called their object. These words form a parenthesis of the kind that needs to be surrounded by commas, separating that from the word it really partners (noun not object). Other examples of a parenthesis making a comma necessary where one would not normally go are in the Guinlist post 50. Right and Wrong Comma Places.


2. Failure to Recognise Ellipsis

Ellipsis is leaving “obvious” words unsaid. It is considered in some detail within this blog in the post 36. Words Left Out to Avoid Repetition. It appears to give problems to computers just as it can to readers of English. The following ellipsis-containing sentences were all wrongly questioned by the word processor on my computer:

(d) One of these housed the chapel, another a library.

(e) Such an argument is of course highly subjective and thus open to dispute.

(f) Initially, “sweetshop” was a pile of sweets that my father would sometimes buy as a treat and invite each of us to choose from in turn.

The suggested correction of sentence (d) was to remove either another or a. The computer failed to recognise the unmentioned repetition of the verb housed between them, of which another (the pronoun) is the subject and a library is the object. It thought that the writer was trying to use both another (the adjective/determiner) and a with the same noun library – a grammatically impossible combination (see 110. Nouns without “the” or “a”).

The advice for sentence (e) was to add -s to open. The computer thought open was a verb when in fact it is an adjective. The problems given by words like OPEN, which can be either a verb or an adjective, are considered in detail in the posts 66. Types of Passive Verb Meaning and 140. Words with Unexpected Grammar 2 (problem #[f]). These words can be identified as either a verb or an adjective in a text by their form after the verb BE: if they have an ending (-ing or -ed) they are verbs, while without one they are adjectives.

The word open in (e) is in a combination with BE and lacks an ending, so it must be an adjective. However, the BE form in question – the word is – is left out (it should appear either before or after thus) because it is a repetition of an earlier is in the sentence. The reader should be able to recognise the repetition from the word and, a common accompaniment of ellipsis.

In sentence (f), the computer also wanted -s to be added. Here, although the word in question, invite, is a verb with a third-person singular subject (my father), ellipsis explains why it still cannot have -s. The ellipted word is would, a “modal” verb that needs a verb without -s after it (see 148. Infinitive Verbs without “to”). It is recognizable from the earlier verb would … buy, the link again being the conjunction and.

Interrupted structures and ellipsis appear to be two especially important reasons why word processors wrongly analyse grammar. For some others, see 69. How Computers Get Grammar Wrong (2).

