Readability formulas


 

Readability measures are primarily based on factors such as the number of words in the sentences and the number of letters or syllables per word (i.e., as a reflection of word frequency). Two of the most commonly used measures are the Flesch Reading Ease formula and the Flesch-Kincaid Grade Level.

 

Flesch Reading Ease

The output of the Flesch Reading Ease formula is a number from 0 to 100, with a higher score indicating easier reading. The average document has a Flesch Reading Ease score between 60 and 70. The formula reads as follows:

 

 

206.835 – (1.015 x ASL) – (84.6 x ASW)

where:

ASL = average sentence length (the number of words divided by the number of sentences)

ASW = average number of syllables per word (the number of syllables divided by the number of words)
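As a minimal sketch, the formula can be computed directly from three counts. The function name and the decision to pass hand-counted words, sentences, and syllables are assumptions made for illustration; automatic syllable counting is a separate problem and is not attempted here. The function returns the raw value, which can fall outside the 0 to 100 range for very short or very dense texts.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Raw Flesch Reading Ease from pre-counted totals (hypothetical helper)."""
    asl = words / sentences   # ASL: average sentence length
    asw = syllables / words   # ASW: average number of syllables per word
    return 206.835 - (1.015 * asl) - (84.6 * asw)


# "The streets were wet because it had rained." counted by hand:
# 8 words, 1 sentence, 9 syllables.
print(round(flesch_reading_ease(8, 1, 9), 1))  # 103.5, i.e. above the nominal maximum of 100
```

For a one-sentence text this short the raw value exceeds 100, so a tool that reports scores on the 0 to 100 scale would presumably cap it.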

 

 

Flesch-Kincaid Grade Level

The more common Flesch-Kincaid Grade Level formula expresses readability as a U.S. grade-school level, using the same two inputs as the Reading Ease score.

 

 

(0.39 x ASL) + (11.8 x ASW) – 15.59

where:

ASL = average sentence length (the number of words divided by the number of sentences)

ASW = average number of syllables per word (the number of syllables divided by the number of words)
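The grade-level formula can be sketched the same way. As before, the function name and the use of hand-counted totals are assumptions for illustration only.

```python
def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Raw Flesch-Kincaid Grade Level from pre-counted totals (hypothetical helper)."""
    asl = words / sentences   # ASL: average sentence length
    asw = syllables / words   # ASW: average number of syllables per word
    return (0.39 * asl) + (11.8 * asw) - 15.59


# "Since John always jogs, a mile and a half seems a short distance to him."
# counted by hand: 15 words, 1 sentence, 17 syllables.
print(round(flesch_kincaid_grade(15, 1, 17), 1))  # 3.6
```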

 

In addition, more than 40 readability formulas have been developed over the years (Klare, 1974-1975). Readability measures guide the construction of textbooks such that the readability conforms to the intended grade level. However, there are at least three major problems with readability formulas that prevent valid predictions of text comprehension.

1. Surface characteristics. Readability scores are based on the surface characteristics of the text. Comprehension and learning, however, depend to a greater extent on processing at the textbase and situation levels (Kintsch, Welsch, Schmalhofer & Zimny, 1990; McNamara et al., 1996). Measuring text elements that are primarily needed for surface processing does not adequately capture comprehension and learning, which are the concern of educators. Recent advances in discourse processing and computational linguistics make more advanced measures of readability possible, because they allow more precise predictions of which text characteristics improve comprehension and learning.

2. Reader's cognitive aptitudes. Predicting reading, understanding, and learning requires consideration of the reader’s knowledge, language skills, and other cognitive aptitudes. Although text characteristics can certainly predict aspects of readability, readability should be viewed as an interaction between a text and a reader’s cognitive aptitudes (Kintsch, 1994; Miller & Kintsch, 1980; McNamara et al., 1996).

3. Cohesion and coherence. Readability formulas cannot capture the cohesion or coherence of a text. Research has clearly shown that readers have less difficulty reading cohesive texts (Beck, McKeown, Sinatra, & Loxterman, 1991; Britton & Gulgoz, 1991; Gernsbacher, 1997; Graesser, Gernsbacher & Goldman, 2002; McNamara, 2001; McNamara & Kintsch, 1996; McNamara et al., 1996). We would therefore expect high-cohesion texts to score as more readable than low-cohesion texts; however, this is not generally the case. In the following examples the low-cohesion sentences have lower or equal Flesch-Kincaid grades, but are intuitively more difficult to read than the high-cohesion sentences. Similarly, the Flesch Reading Ease scores do not necessarily differentiate between low- and high-cohesion sentences. Hence, traditional readability measures can run orthogonal to cohesion measures. Average sentence length and average number of syllables per word alone cannot sufficiently predict coherence and therefore understanding of a text.

 

 

 

| Cohesion | Text | Flesch Reading Ease | Flesch-Kincaid Grade |
|---|---|---|---|
| low | The streets were wet. It had rained. | 100 | 0.0 |
| high | The streets were wet because it had rained. | 100 | 0.8 |

 

 

 

 

| Cohesion | Text | Flesch Reading Ease | Flesch-Kincaid Grade |
|---|---|---|---|
| low | One part of the cloud develops a downdraft. Rain begins to fall. | 80.8 | 3.4 |
| high | One part of the cloud develops a downdraft, which causes rain to fall. | 83 | 4.9 |

 

 

 

 

| Cohesion | Text | Flesch Reading Ease | Flesch-Kincaid Grade |
|---|---|---|---|
| low | Among Glaswegians the precipitation caused havoc and vexation. | 8.3 | 12.0 |
| high | The rainfall caused devastation and irritation among citizens of Glasgow. | 9.7 | 12.0 |

 

 

 

 

| Cohesion | Text | Flesch Reading Ease | Flesch-Kincaid Grade |
|---|---|---|---|
| low | Since John always jogs a mile and a half seems a short distance to him. | 95.7 | 3.6 |
| high | Since John always jogs, a mile and a half seems a short distance to him. | 95.7 | 3.6 |
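The scores for the first and last example pairs can be reproduced from the two formulas given earlier. The sketch below is illustrative only: the word, sentence, and syllable counts were made by hand, the helper name is hypothetical, and the clamping (Reading Ease limited to 0 to 100, grade limited to 0 or above) is an assumption inferred from the reported values of 100 and 0.0, not a documented rule.

```python
def scores(words: int, sentences: int, syllables: int) -> tuple[float, float]:
    """Return (Flesch Reading Ease, Flesch-Kincaid Grade), rounded to one decimal."""
    asl = words / sentences   # average sentence length
    asw = syllables / words   # average syllables per word
    ease = 206.835 - (1.015 * asl) - (84.6 * asw)
    grade = (0.39 * asl) + (11.8 * asw) - 15.59
    # Assumed clamping: keep Reading Ease within 0..100 and the grade non-negative.
    ease = min(100.0, max(0.0, ease))
    grade = max(0.0, grade)
    return round(ease, 1), round(grade, 1)


# Hand-counted (words, sentences, syllables) for each example sentence.
examples = {
    "The streets were wet. It had rained.": (7, 2, 7),
    "The streets were wet because it had rained.": (8, 1, 9),
    "Since John always jogs a mile and a half ...": (15, 1, 17),
    "Since John always jogs, a mile and a half ...": (15, 1, 17),
}

for text, counts in examples.items():
    print(text, scores(*counts))
```

The comma in the last pair changes none of the three counts, so both versions necessarily receive identical scores, even though only one of them is easy to parse.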

 

Although it must be noted that a text should generally have more than 200 words before the Flesch Reading Ease and Flesch-Kincaid Grade Level scores can successfully be applied, the conclusion is the same. To measure the readability, coherence, and comprehensibility of a text, more than surface features alone need to be taken into consideration. Quantitative and qualitative factors such as the number of anaphora, the number of overlapping text segments, vocabulary difficulty, sentence and text structure, and concreteness and abstractness are equally needed. It is the sum of these and other factors that constitutes cohesion.