Much has been written about general categories of reading difficulty, but there is little about the detail, particularly with respect to lexis and structure. The need for more information is evidenced by the fact that those who teach and test reading are unable to predict with certainty the difficulties that learners will encounter in particular reading texts.  This article briefly considers ways in which detailed information about text difficulties has been gathered, and proposes an additional strategy that is based on learners themselves reporting problematic extracts from texts of their own choice. An outcome of this strategy, a list of 18 short academic text extracts, is presented as a justification of it, and is briefly analysed for the information that it provides about students’ reading problems.


Why Identify Learner Problems? 

One of the requirements for teaching reading in any foreign language is a good knowledge of the most usual reasons why learners misunderstand texts.  One must know not just general sources of difficulty like linguistic structure, assumed background knowledge, implicitness, etc, but also the circumstances where each of these areas becomes difficult.  Knowing this detail enables the most difficult parts of texts to be identified for teaching or testing, and inspires the creation of learning materials. 

To illustrate the latter, the elucidation some years ago of the nature and importance of textual cohesive devices – pronouns, repetition, use of synonyms, etc – by writers such as Halliday & Hasan (1976) and Widdowson (1978) established cohesive reference questions (e.g. “What does they in line 5 refer to?”) as a staple in EFL reading courses. In my own teaching, a worksheet entitled “Hidden Negatives” resulted from a realisation that learners often extracted the opposite of an intended meaning when negation was lexical rather than grammatical.  My worksheet highlighted the concept of lexical negatives as an alternative to familiar forms like “not”, “no” or “never”, then presented and practised a useful reference list containing items like “bogus”, “mythical”, “tempting to conclude”, and “beyond the bounds of belief”.


What We Know and Do Not Know about Learners’ Reading Difficulties

Our existing knowledge of common reading problems, however, is incomplete.  This is evidenced by our frequent failures to predict all of the problems that learners encounter in the texts we present to them, and by the fact that what we do expect to be difficult sometimes turns out not to be.  Alderson and Urquhart (1984: xxiii) suggest a number of reasons why the latter might be so, while Bernhardt (1991: 186) speaks of “inappropriate assumptions about what students already know” leading to “misanalyses of student difficulties”.  We might reason that structural errors at least, in reading, ought to be similar to writing ones, so that existing lists of common written errors should allow predictions about reading.  However, Issidorides and Hulstijn (1992) have shown that students often do understand what they struggle to produce correctly themselves.  Urquhart and Weir (1998) believe that the role particularly of grammar in comprehension has never been systematically investigated (p. 255), and they suggest that a receptive grammar of English would fill a major gap (p. 268). 

Our fallibility in predicting reading difficulties exists despite some admirable classifications. Nuttall (1982) presents four general sources of reading difficulty (code, assumed knowledge, concept complexity and vocabulary), each of which is further broken down, so that “code” for example includes complex noun phrases, nominalization, coordination, subordination, and participle or prepositional phrases as modifiers. A syntactic classification in Berman (1984) may be summarised as follows:


Syntactic Causes of Reading Difficulty (adapted from Berman, 1984) 


OPACITY
Constituent Structure: variable syntactic function of lexical items (e.g. “down” as preposition, adverb, noun or verb); unexpected position of adverbial in sentence or clause
Structural Items: deletion (e.g. of “that”); multiple meanings; irregular forms
Dependencies[1]: ambiguity of elements within a discontinuous structure (e.g. of “as” in “such … as”)

HEAVINESS
Constituent Structure: single constituent is extended and informationally dense
Dependencies[1]: long distance between discontinuous elements



Frameworks such as these are certainly helpful, but we have no way of knowing the proportion they represent of all the possible common errors encountered by EFL readers.  Berman herself indicates this in continuing to document unpredicted learner difficulties. 

Ways of Gathering Data about Reading Difficulties 

A major obstacle to compiling a comprehensive list of common reading errors is the fact that reading rarely has a natural observable outcome in the way that writing does, and can be observed only indirectly by such means as answers to comprehension questions. 

Indirect approaches have their failings. Reading comprehension questions indicate common reading errors very inefficiently.  This is because the quantity required to cover every single facet of a reading text is likely to be impractical (cf Urquhart & Weir, 1998: 86), so that the researcher is forced to make choices about what and what not to base questions on, with no guarantee that the right choices will be made.  Indeed, if there were, we might conclude that logically there was no need to identify common errors through questions at all, since we would know already what we were trying to discover! 

As an alternative to comprehension questions, Bernhardt (op. cit.) advocates a “recall protocol procedure”, where students write up for the researcher what they remember about the content of a previously-read text, thus allowing any misunderstandings to show up naturally. Berman (op. cit.), on the other hand, tried going through a text sentence by sentence, eliciting students’ understanding of it. Both procedures proved useful for identifying unpredicted reader difficulties. Their problem, however, is the time that they still require to produce a reasonable list of difficulty types. A single session concentrates on a single text, so that the examples of reader difficulty are limited to what is in that text. Given the very large sample of errors that is needed to identify what is frequently troublesome[2] in texts for readers, there would need to be a very large number of sessions.

There is an alternative approach which avoids this time problem.  Students could work with texts of their own choice and report reading difficulties that they themselves identify.  This is the approach that I decided to adopt in the context of a first year undergraduate EAP course.  Inspiration came from a procedure first reported by Scott et al (1984) and refined by Walker (1987).  Their aim was simply to expand the amount and variety of student reading without correspondingly increasing the burden on tutors.  They utilised the idea of universal comprehension questions (i.e. identical whatever text the students were reading) to which students submitted a list of answers for each text read.  Most of the questions required students to extract and record main points, but Walker’s “standard reading exercise” also included an exhortation to identify specific problems in the texts which might later be an object of focussed study, with or without tutorial help.  However, Walker appears not to have attempted to record such problems systematically. 

To collate reading difficulties I asked students to quote a single problematic extract from each of three texts submitted, plus an explanation of what was causing the problem.  This promised a much richer variety of problems coming out of a single reading assignment.  The disadvantage, of course, is that students can only report problems they are aware of – and any reading tutor knows that many common reading errors take place without reader awareness at all.  This problem should not invalidate the procedure, however, provided it is complemented by other approaches, like those already described.


Students’ Reports of Reading Difficulty and Issues Raised 

The appendix lists eighteen reading difficulties that students reported. The list is disappointingly short, considering that it was collected over a number of years. There are two main reasons for this brevity: the infrequency of the data-gathering opportunities that our situation allowed, and the number of samples that we had to reject. We gathered data once a year, because that was how often the course ran and because the means of gathering data (an assessed project) was not repeatable outside the course. The number of students each time averaged about 20. The main reason for rejecting responses was that the source of the reading problem was often surprisingly unclear: the student either omitted to mention it, or said something hopelessly vague such as “difficult language”. Another common reason for rejection was that students simply cited unfamiliar vocabulary as the cause of the difficulty, despite our request not to do so because lexical difficulties were not our major concern. The effect of this shortfall is that the present report is perhaps more useful for the procedure described and the issues raised than for the reading problems found.

Before trying to analyse the 18 text extracts in the appendix, a word should be said about the possibility of poor writing being a source of reading difficulty, rather than reader failings. There is no doubt that in some texts the writer has to take all or most of the blame for the reader’s puzzlement. Academic texts seem to be particularly prone to this failing, perhaps because the academic stature of authors often impresses publishers more than their literary skill. The main criterion for poor writing has to be the inability of the text to be understood by a recognised expert. As an example, I, a university study skills tutor, once had an international student who booked individual tutorials with me on a regular basis for the slightly unusual purpose of obtaining my help with his reading difficulties. Although this student had some fairly major problems with English, his difficult reading extracts were invariably as incomprehensible to me as they were to him. He turned out to be a much better reader than writer (perhaps because of his unusual obsession with understanding every part of any text he encountered), who could sort out by himself problems caused by his own weaknesses in English, but was flummoxed by problems for which, though he did not know it, he could not be blamed. Usually the weakness of the writing was not lexical or grammatical, but information-based: unjustified assumptions about the reader’s existing knowledge or about the recoverability of unmentioned ideas (such as omitted “by” phrases after passive verbs).

Such stories are well worth passing on in reading classes where student morale is in need of a boost.  However, they do also raise problems for analysing the sort of data listed in the appendix.  We may spend much time identifying lexis and grammar that must have caused the reported difficulty, when in fact the real problem is culpable inexplicitness or complexity (some of the data is certainly challenging even for skilled native speakers).  Asking the students to detail the exact problematic word(s) helps overcome the problem, but not always. 

Many of the listed problems are easily linked to Nuttall’s or Berman’s lexico-syntactic classifications. For instance, examples 6, 7, 9, 16 and 17 all involve interrupted phrases (discontinuity); “heavy” constituent structure is perhaps evidenced in 4, 10, 12, 16 and 17; multiple meanings seem a clear problem in 1, 2, 5, 11 and 13; and 3, 4, 9 and 18 show where variable syntactic function might cause problems. Example 15 illustrates beautifully how coordination can be a problem.

Certain individual cases, however, raise interesting issues.  Example 1 may illustrate no more than lexical ignorance.  But it could also provide a valuable reminder of the way cultural background needs constantly to be accommodated.  There is now plenty of evidence that many Far East languages are much more reluctant than English to personify inanimate entities by making them subjects of active verbs.  Master (1991:15), for example, found clear indications of this trend among Japanese speakers.  The Thai student who reported this difficulty may have similarly struggled to accept that “prompting”, something usually done to living beings or machines, could possibly act on anything as abstract as a “need”. 

In example 5 the student seems to be identifying the ambiguity of “state” (condition vs country) as the problem.  But could there be more?  What is also happening here is a failure to use the context to sort out the ambiguity.  The student has missed the opposition between “global” and “individual”, which is emphasised by both “while” and “nonetheless”.  Various possible underlying problems might be posited: failure to recognise cohesion and coherence relations (because of either unfamiliarity with language like “nonetheless”, or lack of general reading experience), Nuttall’s “complex NPs”, or Berman’s “heavy” constituent structure.  Yet the ambiguity of “state” does also contribute to the student’s confusion.  The reason for the particular difficulty reported here might thus be not any one of the commonly-listed errors, but a combination.  Example 4 is similarly complex, involving demanding concepts, complex linguistic structure, and commonly problematic individual items. 

The above analysis brings into question the reliability of the students’ own explanations of their difficulty. There is clearly a need to treat what they say with caution. However, there is a danger of taking this too far, so that we ignore what they say in favour of some more sophisticated applied linguistic explanation. As an example, in example 17 there is a temptation to focus on the anaphoric opacity of “their” as a factor in the student’s confusion, when in fact the student’s report ignores it totally. The fact that students’ ideas can sometimes bring us back to earth seems very much to be a justification for getting them to tell us about their difficulties, rather than our trying to deduce them independently.

Example 3 raises a further point.  It illustrates Berman’s structural item opacity, since it arises from multiple meanings of BE (copula vs auxiliary).  The question is whether, for the purposes of listing common reading errors, it is better to generalise in the way Berman does, or to itemise every multiple-meaning structural item that causes reading problems.  We should perhaps do the former if we could be sure that all members of the general class had equal problem potential; the latter would be necessary if the problem potential was variable.  Only empirical investigations of the sort described here can allow comparisons of problem potential.  Certainly before example 3 was obtained the problem potential of BE did not appear obvious. 

A similar point about subclassification also arises from example 13, one of the more surprising reports.  The lexical ambiguity here (classes in general vs social classes) is different from that occurring in, say, example 1 (meanings of “to prompt”).  Of the two possible meanings of “class”, one is a hyponym (subcategory) of the other.  Since this was a surprising difficulty, it might make sense to list this sort of ambiguity separately as a reminder to look out for it in future reading texts.



Readers might wish to continue for themselves the examination of the 18 student reports, in order to decide the usefulness of this approach to understanding the linguistic difficulties of reading texts.  It is clear that only a few new insights can be gleaned from such a small sample, but those that can are exciting and motivate further study.  The greatest need if similar studies are continued is to generate enough samples to enable us to separate common errors from idiosyncratic ones, and to decide the right level of generality at which to describe them.


Appendix: Reading Difficulties Reported by Students


1.         … says that two reasons in particular prompted the need for …


REPORT:  “Prompt” is a word I know from computing but seems to have a different use here. (Thai). 

HYPOTHESIS:  Prompting is literally something that happens to living things.  To make its object an abstract idea like “need” may be a figurative usage that is outside the experience of people from the Far East.


2.         … the monolingualism of the scientific community.


REPORT:  Confusing use of “scientific”. (Thai) 

HYPOTHESIS:  Student may be familiar with only one of the meanings of “scientific” (= “done in a scientific way”, as in “scientific approach”).  Here the meaning is “associated with science”.


3.         … the most apt term could be determined, or driven, or focused.


REPORT:  Doesn’t make sense. (Ghanaian). 

HYPOTHESIS:  Ambiguity of BE (copular vs. auxiliary).  It is copular here, but is probably being interpreted as an auxiliary because the next word ends -ed. 


4.         An important concern in decoding images should be that of undermining the ways in which dominant forms of visual representation reduce complex issues … to a few “recognisable” aspects which appear to constitute an acceptable totality.


REPORT:  two relative clauses, and the verb “reduce” has two main objects.  (Japanese). 

HYPOTHESIS:  The student’s own analysis may be correct (except that one of the “objects” – “aspects” – is actually a complement of the object “issues”). Also noticeable are (a) polysemy of “concern” (= “desire”, not “worry”); (b) double negative (“undermine … a few”); and (c) structural complexity of the object of “undermining”.


5.         While concern in that first environmental crisis was with global collapse, the appropriate focus for social action and political pressure was nonetheless seen to be the individual state.


REPORT:  “… it is quite hard to get the meaning, especially the words ‘individual state’ which are easily misunderstood as they are a simile (sic) here but cannot define (sic) as the original meaning”. (Cantonese)

HYPOTHESIS:  Polysemy of “state” (political entity vs condition), compounded by opaque opposition between “global” and “individual state” (long noun phrases?).



6.         (Sentence 1 in a paragraph introduced the topic of the succeeding few paragraphs, but the rest of that paragraph gave only background information to the topic.  The topic was picked up again in a later paragraph).


REPORT:  Unclear link between first sentence of a paragraph and the rest of that paragraph (Cantonese). 

HYPOTHESIS:  Student is not aware of the existence of topic sentences that introduce many paragraphs rather than just their own.  This is perhaps a phenomenon of discoursal discontinuity. 


7.         One new product in the grocery trade out of seven survives to the third year.


REPORT:  ‘The sentence “trade out of seven survives to the third year” is not straightforward for me to grasp’ (Nigerian). 

HYPOTHESIS:  Unfamiliarity with the expression “one out of n” compounded by the complexity of the noun phrase after “one”.


8.         These skills have hitherto been the domain of what are called typographers.


REPORT:  “Difficult phrasing” (Nigerian). 

HYPOTHESIS:  (a) Non-transparency of figurative phrase “be the domain of” (= belong to); (b) Unfamiliarity with adjectival “what” clauses. 


9.         Marketing is a philosophy of running a business that should dominate every major decision.


HYPOTHESIS:  Student summarised as “Marketing is an idea of running a business”.  Possible problems are (a) lexical misunderstanding (equating “philosophy” with “idea”); and (b) failure to recognise the qualifying nature of the “that” clause.  Possible reasons are discontinuity of head-relative construction (“philosophy … that …”); and variable syntactic class (demonstrative/relative) of “that”.


10.       For example, if success and failure are to be judged solely by the quality of legislative reform …, the conclusion has to be that recognition of animals as anything other than commodities that exist mainly for human benefit has been limited.


REPORT:  “It is difficult to see a link between legislative reform and animal welfare” (Indian bilingual). 

HYPOTHESIS:  Implicitness of information is probably a factor, since the reader has to recognise a link between “recognition” and “legislative reform”.  Added to this is the distance between “recognition” and its complement “limited”, as well as the double negative “other than commodities”/“limited” (“other than” = “not”, “limited” = “not much”).


11.       There is general agreement that the human number 2 chromosome is the product of the fusion of two chimpanzee chromosomes with many other differences between the chromosome complements in the two species being due to …


HYPOTHESIS:  Structural ambiguity of “with”. It could easily be taken as any of the following: (a) introducing an adjectival phrase describing the preceding noun “chromosomes”; (b) introducing an adverbial phrase relating to the earlier verbal noun “fusion”; or (c) introducing (in combination with the following –ing verb “being”) an adverbial adjunct similar in meaning to “and … are”.  Logic, real-world knowledge, and familiarity with the slightly unusual structure (c) seem the most helpful for recognising (c) as the correct interpretation.


12.       The speed with which countries have accomplished an industrial revolution varies, and there is debate as to the exact chronology of events in particular instances.


REPORT:  “Sentence is long and has difficult new words, especially from “industrial varies” to the end.”  (Turkish) 

HYPOTHESIS:  There are various factors: (a) the subject of “varies” is long; (b) the word “instances” is an unexpected synonym of “countries”; and (c) the link between “events” and “revolution” depends on real-world knowledge.



13.       Bourdieu’s main argument is … that culture is used to distinguish among classes and fractions of classes, and to disguise the social nature of these distinctions by locating them in the universals of aesthetics or taste.


REPORT:  I do not understand the use of “universals” and the meaning of “classes”.  (Danish). 

HYPOTHESIS:  The writer assumes the reader knows the particular modifiers that apply here to these normally transparent words, something like “human” universals and “social” classes.  Such an assumption may be standard in the discourse of his academic discipline, or clear from the surrounding text. 


14.         At the heart of the debate about globalization is an argument, hidden in a description, informed by a utopian projection.


REPORT:  “I do not distinguish which the subject and the predicate in a sentence are.”  (Japanese). 

HYPOTHESIS:  This is an existential sentence without “there” (because of the initial adverbial phrase?).  The absence of “there” might remove the clue that the subject comes after the verb.  The participles “hidden” and “informed” might be taken as the main verbs.


15.       Minimal risk … is defined as anticipated risks that are no greater than those ordinarily encountered in daily life or during the performance of routine physical or psychological tests or procedures.


REPORT:  “Too many or’s.”  (Japanese)

HYPOTHESIS:  There are three (!) different lists or alternatives.  They require conscious processing to sort out what belongs to which list in the sentence.


16.       Of equal … importance is the broader issue of the effects of what the information media communicate on individuals and on society.


REPORT:  “I had to read it several times very concentrated in order to understand it as I had at first the impression that the punctuation was incorrect or the sentence was incomplete.” (German) 

HYPOTHESIS:  Here “on” could link grammatically with either “effects” or “communicate”.  The nearness of the latter (which is the wrong interpretation), together with the fact that many learners are unfamiliar with “effects of … on …”, may lead to the right linkage being missed.  The use of a “what” clause as a noun phrase after one of the prepositions is probably also a factor.



17.       However, there is no doubt that heat conservation was behind the abiding preference for thatched roofs, despite the fire risks which led municipal authorities to forbid their use, within urban areas at least.


REPORT:  “Difficult because the second part consists of additions and a relative clause which was slightly confusing at first sight.”  (German). 

HYPOTHESIS:  Consecutive preposition phrases at the end of the sentence, each preceded by a comma, create ambiguity.  Does the final “within” phrase go with “preference” (making the “despite” phrase a parenthesis), or does it go with “forbid their use”?


18.       We have to get away from this assumption the industry can deliver everything everybody wants immediately.


REPORT:  “I thought this assumption was the object of can deliver.  I couldn’t see how the last part fitted in.”  (Mandarin). 

HYPOTHESIS:  Student either failed to see the deleted reporting “that” after “assumption”, or took it to be a relative pronoun.  The juxtaposition of “everything everybody” may also be relevant.



References

Alderson, J.C. & Urquhart, A.H. (eds.). 1984. Reading in a Foreign Language. Harlow: Longman.

Berman, R.A. 1984. “Syntactic components of the foreign language reading process”, in J.C. Alderson & A.H. Urquhart (eds.).

Bernhardt, E.B. 1991. Reading Development in a Second Language: Theoretical, Empirical and Classroom Perspectives. New Jersey: Ablex.

Halliday, M. & Hasan, R. 1976. Cohesion in English. Harlow: Longman.

Master, P. 1991. “Active verbs with inanimate subjects in scientific prose”. English for Specific Purposes 10(1): 15-33.

Nuttall, C. 1982. Teaching Reading Skills in a Foreign Language. London: Heinemann.

Scott, M., Caroni, L., Zanatta, M., Bayer, E. & Quintanilha, T. 1984. “Using a standard exercise in teaching reading comprehension”. ELT Journal 38(2): 114-120.

Urquhart, S. & Weir, C. 1998. Reading in a Second Language: Process, Product and Practice. Harlow: Longman.

Walker, C. 1987. “Individualising reading”. ELT Journal 41(1): 46-50.

Widdowson, H.G. 1978. Teaching Language as Communication. Oxford: Oxford University Press.



[1] Dependencies involve phrases whose words are not all next to each other, e.g. “so … that” (such phrases are also sometimes called “discontinuous”).

[2] We need to identify frequent errors in order to filter out errors caused by something other than inherent difficulty of the language in question, such as errors caused by language not previously encountered by the reader.

