A Simplified Flesch Reading Ease Formula
Davis Foulger
1977
Florida Technological University (now the University of Central Florida)
Modified by author in 2003
As originally submitted to Journalism Quarterly as Research
in Brief.
Readability formulas have appeared and disappeared regularly for
about forty-five years now.(1) Of these formulas, three, all developed in the
late 1940's and early 1950's, stand out. The first, the Cloze procedure,(2)
isn't really a formula at all. It is a behavioral research method which requires
time for survey and analysis. The second, the Dale-Chall formula,(3)
is the result of the collaboration of two researchers who had been working on
the problem of readability for several years prior to their successful joint
venture. Although their formula is the best of the many formulas for estimating
the readability of a document, it is complicated, requiring consultation of
lists of commonly used words.(4) Neither the Dale-Chall formula nor the Cloze
procedure is practical for the editor who wants to compare his newspaper to
others, the reporter who wants to make his articles more readable, or the student
who is learning what makes good journalism.
A more practical formula is the Reading Ease formula,(5) the
best of several such formulas developed by Rudolph Flesch.(6) Although
very nearly as accurate a measure as the Dale-Chall formula, the reading ease
formula is considerably easier to use, requiring no comparisons with word lists.
The computations involve only the counting of words, syllables, and sentences.
From these counts, sentence length and word length are combined to compute the
actual scale score. This score can range from zero, for extremely difficult
reading, to one hundred, for very easy reading.
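As a minimal sketch of the computation just described, the three counts can be combined as below. The coefficients are those of the published Flesch Reading Ease formula; the function and parameter names are illustrative assumptions, not part of the original procedure.

```python
def flesch_reading_ease(words: int, syllables: int, sentences: int) -> float:
    """Flesch Reading Ease from three raw counts of a text sample."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    wl = 100.0 * syllables / words  # word length: syllables per 100 words
    sl = words / sentences          # sentence length: average words per sentence
    return 206.835 - 0.846 * wl - 1.015 * sl


# A hypothetical 100 word sample with 150 syllables in 5 sentences.
score = flesch_reading_ease(words=100, syllables=150, sentences=5)
print(f"{score:.1f}")
```

Nothing in the arithmetic confines the result to the zero to one hundred range; unusually dense or unusually simple samples can fall outside it.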
Although the Flesch reading ease formula is probably the best
combination of simplicity and meaningfulness yet built, many researchers have
not considered the scale simple enough. As a result, several scales requiring
less counting have been developed.(7) The work reduction has, however, been
invariably accompanied by a change in the equation and a decline in the accuracy
of the measurement. Perhaps the most complete counting reduction was that accomplished
by Irving Fang.(8) His formula collapsed the elements of the Flesch scale into
a single measure of the excess syllables in each sentence. The logic of the
scale is that readability declines as word length and/or sentence length increases.
By counting the excess syllables (the number of syllables in a word minus one),
both short sentences with overlong words and overlong sentences are detected.
The beauty of the scale is that it only requires one count of a manuscript.
The scale accomplishes this, however, at some expense to accuracy.
Measurement of the role of sentence length is minimized in the formula.
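The excess syllable count can be sketched as follows. This illustrates only the counting step described above, not Fang's complete formula. It assumes the syllables in each word have already been counted by hand, and the function name is hypothetical.

```python
def excess_syllables(syllables_per_word: list[int]) -> int:
    """Sum of (syllables - 1) over the words of one sentence."""
    return sum(count - 1 for count in syllables_per_word)


# "The method saves two counts of the manuscript."
sample = [1, 2, 1, 1, 1, 1, 1, 3]
print(excess_syllables(sample))  # 3
```

A sentence made up entirely of one-syllable words contributes nothing to the count however long it is, which is the sense in which sentence length is minimized.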
The original Flesch reading ease procedure:
Words:            | 1   | 2     |    | 3     | 4   | 5      | 6  | 7   | 8    |    |         |
Syllables:        | 1   | 2     | 3  | 4     | 5   | 6      | 7  | 8   | 9    | 10 | 11      |
Sentences:        |     |       |    |       |     |        |    |     |      |    | 1       |
Sample sentence:  | The | meth- | od | saves | two | counts | of | the | man- | u- | script. |
The simplified Flesch reading ease procedure:
Excess syllables: |     |       | 1  |       |     |        |    |     |      | 2  | 3       |
Words:            | 1   | 2     |    | 3     | 4   | 5      | 6  | 7   | 8    |    |         |
Figure 1: The original and revised reading ease procedure.
The Flesch procedure entails three counts of the manuscript. The simplified
procedure entails only two.
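The two tallies of the simplified procedure in Figure 1 can be kept as one record per sentence, from which the number of sentences falls out as the length of the list. The sketch below assumes hand-made tallies. The record type, the function name, and the second sentence's counts are hypothetical illustrations.

```python
from typing import NamedTuple


class SentenceTally(NamedTuple):
    words: int             # words counted in this sentence
    excess_syllables: int  # syllables beyond the first in each word


def summarize(tallies: list[SentenceTally]) -> tuple[int, int, int]:
    """Return (total excess syllables, number of sentences, total words)."""
    total_excess = sum(t.excess_syllables for t in tallies)
    total_words = sum(t.words for t in tallies)
    return total_excess, len(tallies), total_words


tallies = [
    SentenceTally(words=8, excess_syllables=3),  # the sample sentence in Figure 1
    SentenceTally(words=5, excess_syllables=1),  # a hypothetical second sentence
]
print(summarize(tallies))  # (4, 2, 13)
```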
The irony of this work, which undoubtedly consumed many
hours of research, is that the reading ease formula and procedures can be simplified
to the point where calculation of the Flesch formula requires very little
more effort than would be required to compute the Fang formula. This can
be accomplished, moreover, with no loss in accuracy. The key to this
simplification is the Fang procedure. Instead of counting words, syllables,
and sentences in three separate counts, as required by the Flesch formula,
one need only count the words and excess syllables in each sentence. A list
of words and excess syllables, by sentence, will yield the number of sentences
without additional reference to the document. In Figure 1, the counts required
by the original Flesch procedure are shown above the sentence. Those required
by the revised procedure are shown below. It should be apparent that the revised
procedure results in a considerable savings in counting time, as the overall
count will go no higher than it would, in the original procedure, for the
syllable count alone. The results can then be summed and inserted into the
revised Flesch reading ease formula:
R.E. = 122.235 - 84.6 (ΣX / ΣZ) - 1.015 (ΣZ / Y)
Figure 2: Formula for a simplified Flesch Reading
Ease procedure
where R.E. is the reading ease score, X is the excess syllables in each sentence,
Y is the number of sentences, and Z is the number of words in each sentence.
Although the equation may appear more complicated than the original, it is actually
much more flexible and simple to compute. The use of a hand calculator makes
the job even simpler, easy enough, in fact, for almost anyone to compute rather
quickly. The revised formula was derived from the original Flesch reading ease
formula as follows:
- The original Flesch Reading Ease formula, where R.E. is reading
ease, wl is the number of syllables in a 100 word sample, and sl
is the average length of the sentences in that 100 word sample, is:
R.E. = 206.835 - 0.846 wl - 1.015 sl
- If we decompose wl into its constituents (i.e. the number of words (100)
and the number of additional syllables (X)), where X = wl - 100,
we can derive a series of changes to the reading ease formula. 100 is the
defined length, in words, of a sample in the Flesch Reading Ease procedure.
The results of inserting X into the Flesch Reading Ease formula are
as follows:
R.E. = 206.835 - 0.846 (X + 100) - 1.015 sl
R.E. = 206.835 - 84.6 - 0.846 X - 1.015 sl
R.E. = 122.235 - 0.846 X - 1.015 sl
Note in particular the final reduction of the constant to 122.235 as
we derive X (additional syllables) from wl. This change may
be the most important one we will make to the formula here, insofar as it
instantly clarifies the meaning of the formula. To achieve a fourth grade reading
level in text (Flesch's benchmark for a score of 100 points on a text sample),
the weighted sum of the number of additional syllables and the average sentence
length (0.846 X + 1.015 sl) can be no more than about 22. In other words,
reading ease is maximized when you use short sentences
and minimize use of long words.
- It is possible to dispose of sample size as a limitation in computing
the Flesch Reading Ease formula. To enable this we will define a variable
Z as the number of words, noting again that in the original Flesch Reading
Ease formula Z is assumed to equal 100. Note that this assumption caused
some problems for Flesch in computing the Reading Ease formula, and the procedure
documented for computing R.E. includes a workaround for solving the problem.
Here we will solve the problem directly by modifying the formula to allow
samples of any length.
- The average length of a sentence is defined by Flesch as the number of
words in a sample divided by the number of sentences in that sample. If we
define Y to be the number of sentences, we can decompose sl as follows:
sl = Z / Y. The results of inserting Y into the Flesch Reading
Ease formula are as follows:
R.E. = 122.235 - 0.846 X - 1.015 (Z / Y)
Note in particular that the derived equation only assumes raw values (i.e.
the number of sentences, the number of words, and the number of additional
syllables associated with those words). This simplifies computation a little
by allowing Reading Ease to be computed in one computation. More importantly,
however, it allows us to treat sentences as the fundamental unit of computation
for Flesch Reading Ease. We aren't quite there yet, but this move will allow
us to estimate the Flesch Reading Ease of individual sentences.
- At this point we have removed most of the assumptions that force a Flesch
Reading Ease procedure sample to be 100 words. The remaining assumption involves
the number of excess syllables (X) and its ratio to the number of words (presumed
to be 100). We can resolve this problem by multiplying X by the ratio of the
number of words in Flesch's assumed sample size to the number of words in the
actual sample (i.e. the ratio of 100 to Z), which restates X as excess syllables
per 100 words. This multiplication is neutral
to the original Flesch formula, as Z is forced to 100 and the result is multiplication
of X by 1. The results of inserting this multiplication into the formula are
as follows:
R.E. = 122.235 - 0.846 X (100 / Z) - 1.015 (Z / Y)
R.E. = 122.235 - 84.6 (X / Z) - 1.015 (Z / Y)
This change substantially increases the range of applications in which the
Flesch Reading Ease formula can be used reliably. Documents or document components
of any size can be assessed for their readability, right down to a single
sentence. Hence the sentence "This is a short sentence" (X = 1, Y = 1, Z = 5)
would have a readability score of about 100 (122.235 - 16.92 - 5.075 = 100.24),
corresponding to a fourth grade reading level.
It is up to the reader to decide if this is useful in normal practice (e.g. flagging
particularly difficult sentences in a word processor). It does, however, set
up the modified Flesch Reading Ease procedure which is the core of this paper's
earlier argument.
- A document is, at one level, a collection of sentences. It is convenient,
in the current procedure, to view Flesch Reading Ease as being the result
of the sum of the readability of all of the individual sentences in the document,
where each sentence (Y=1) has some number of words (Z) and additional syllables
(X). If we view the Reading Ease formula in this way, the last derived equation
above can be usefully restated as follows:
R.E. = 122.235 - 84.6 (ΣX / ΣZ) - 1.015 (ΣZ / Y)
where the sums run over the sentences in the document and Y is the number of
sentences.
Note that this formula is identical to that presented in Figure 2.
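The derivation can be checked numerically. The sketch below is one reading of the derived formula, not a definitive implementation: it rescales excess syllables to a rate per 100 words (X times 100 / Z), the form that reproduces the original Flesch score, and it compares the result with the original three-count formula. Function and parameter names are assumptions made for illustration.

```python
def flesch_original(words: int, syllables: int, sentences: int) -> float:
    """Original formula: wl is syllables per 100 words, sl is words per sentence."""
    wl = 100.0 * syllables / words
    sl = words / sentences
    return 206.835 - 0.846 * wl - 1.015 * sl


def flesch_simplified(words_by_sentence: list[int],
                      excess_by_sentence: list[int]) -> float:
    """Simplified formula from per-sentence word and excess syllable tallies."""
    z = sum(words_by_sentence)   # total words
    x = sum(excess_by_sentence)  # total excess syllables
    y = len(words_by_sentence)   # number of sentences
    return 122.235 - 84.6 * x / z - 1.015 * z / y


# "This is a short sentence": 5 words, 6 syllables, 1 excess syllable.
single = flesch_simplified([5], [1])
print(f"{single:.2f}")  # 100.24

# A hypothetical three sentence sample: total syllables = words + excess.
words = [8, 12, 20]
excess = [3, 7, 11]
original = flesch_original(sum(words), sum(words) + sum(excess), len(words))
simplified = flesch_simplified(words, excess)
print(abs(original - simplified) < 1e-9)  # True
```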
Footnotes
1. For a detailed history and description of readability formulas, see: George
R. Klare, The Measurement of Readability (Ames, Iowa: Iowa State University
Press, 1963).
2. Wilson L. Taylor, "Cloze Procedure: A New Tool for
Measuring Readability." Journalism Quarterly, 30: 415-433 (1953).
"Recent Developments in the Use of the Cloze Procedure." Journalism
Quarterly, 33: 42-48 (1956).
3. Edgar Dale and Jeanne S. Chall, "A Formula for Predicting Readability."
Educational Research Bulletin, 27: 11-20 (21 January 1948).
4. Edgar Dale and Jeanne S. Chall, "A Formula for Predicting Readability: Instructions."
Educational Research Bulletin, 27: 37-54 (18 February 1948).
5. Rudolph F. Flesch, "A New Readability Yardstick." Journal of Applied
Psychology, 32: 221-233 (1948).
6. Rudolph F. Flesch, "Estimating the Comprehension Difficulty of Magazine
Articles." Journal of General Psychology, 28: 63-80 (1943). "Measuring
the Level of Abstraction." Journal of Applied Psychology, 34: 384-390
(1950).
7. James N. Farr, James J. Jenkins, and Donald G. Patterson, "Simplification
of the Flesch Reading Ease Formula." Journal of Applied Psychology,
35: 333-337 (1951). R. Gunning, The Technique of Clear Writing (New
York: McGraw-Hill, 1952).
8. Irving E. Fang, "The 'Easy Listening Formula.'" Journal of Broadcasting,
11: 63-68 (Winter 1966-67).
Notes on the 2003 Modifications to this Paper
An edited version of this paper appeared as "Research in Brief"
in Journalism Quarterly in 1978. A number of important elements of that paper,
including detail of the measurement procedures and detail of the derivation
of the formula, were edited out of the published version. It is hoped that republication
of the complete paper will have value to someone.
In republishing this paper on the web, some alterations have been made to
the text. Those alterations are indicated by the use of italics. The reader
will quickly observe that most of the paper remains unaltered (although even
the unaltered sections deserve some additional comment). The primary alterations
are at the end of the paper. In scanning this paper in for republication on
the web, an error was discovered in the logic of the equation and its derivation.
While the error is a small one, it does result in an overestimation of readability
for small sample sizes and an underestimation of readability for large samples.
For what it's worth, if anyone ever noticed the problem, no one has ever commented
on it that I am aware of. In any case, here is the original equation as published
in Journalism Quarterly:
Anyone with a spreadsheet can quickly model the errors that are made in
this formula. They can also just as easily model the newly derived formula and
see its consistency with the original Flesch formula.
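The same modeling can be sketched in code. The equation published in Journalism Quarterly is not reproduced in the sketch below. It uses instead a hypothetical variant that leaves the excess syllable count unscaled, purely as an assumption, to illustrate how a formula that treats every sample as 100 words drifts with sample size. All function names and sample values are illustrative.

```python
def flesch_original(words: int, syllables: int, sentences: int) -> float:
    """Original formula: wl is syllables per 100 words, sl is words per sentence."""
    return 206.835 - 0.846 * (100.0 * syllables / words) - 1.015 * words / sentences


def unscaled_variant(words: int, excess: int, sentences: int) -> float:
    """Hypothetical: raw excess syllables used as if the sample had 100 words."""
    return 122.235 - 0.846 * excess - 1.015 * words / sentences


def rescaled(words: int, excess: int, sentences: int) -> float:
    """Excess syllables restated per 100 words before weighting."""
    return 122.235 - 84.6 * excess / words - 1.015 * words / sentences


# Samples with identical style (1.5 syllables per word, 25 words per sentence)
# but different lengths.
for words in (50, 100, 200):
    excess = words // 2
    sentences = words // 25
    row = (
        flesch_original(words, words + excess, sentences),
        unscaled_variant(words, excess, sentences),
        rescaled(words, excess, sentences),
    )
    print(words, " ".join(f"{value:8.2f}" for value in row))
```

In this model the unscaled variant reads higher than the original formula at 50 words, agrees with it at 100 words, and reads lower at 200 words, while the rescaled form tracks the original at every length.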
The primary modifications to the paper, then, are the insertion of a somewhat
different formula (note that most of the essentials remain unchanged) and an
elaborated explanation of the derivation of the formula from the original Flesch
Reading Ease Formula. The derivation follows the same logic as the original
derivation, but corrects for an error in the computation of the effects of sentence
length that resulted from an oversimplification of sentence length computation.
I have added some additional discussion and explanation to the derivation in
the interests of:
- illustrating why and how the Flesch Reading Ease formula works. In some
sense Flesch adds nothing to that which any good writing teacher will tell
you: People have an easier time understanding shorter sentences and shorter
words.
- illustrating how and why the procedure and formula described here are
more generally applicable (down to the level of individual sentences) than
the Flesch Reading Ease formula is. The world has changed since this paper
was originally written. References to calculators at one point in the article
are, if not obsolete (the author still pulls out a hand calculator here and
again), certainly less appropriate today than references to spreadsheets and,
more importantly, word processors would be. Estimates of writing quality have
become a routine feature of word processors today (the author can recall designing
such features into IBM word processors in the late 1980's). Interestingly,
however, estimates rarely descend to the level of sentences. The possibility,
offered by this formula, to resolve readability at the level of sentences
is an interesting one which might be usefully added to writing software.
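As a rough sketch of what sentence-level flagging might look like in writing software, each sentence can be scored with Y = 1 and compared against a cutoff. The word and excess syllable counts below were made by hand, the threshold of 60 is an arbitrary assumption rather than a value taken from this paper, and the function names are hypothetical.

```python
def sentence_reading_ease(words: int, excess_syllables: int) -> float:
    """Simplified Flesch Reading Ease for a single sentence (Y = 1)."""
    return 122.235 - 84.6 * excess_syllables / words - 1.015 * words


def flag_difficult(sentences: list[tuple[str, int, int]],
                   threshold: float = 60.0) -> list[str]:
    """Return the text of each sentence scoring below the threshold."""
    return [
        text
        for text, words, excess in sentences
        if sentence_reading_ease(words, excess) < threshold
    ]


# (text, words, excess syllables), counted by hand.
sentences = [
    ("This is a short sentence.", 5, 1),
    ("Readability formulas have appeared and disappeared regularly.", 7, 12),
]
for text in flag_difficult(sentences):
    print("Hard to read:", text)
```

Only the second sentence is flagged here; the first scores about 100.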
The only other change to the manuscript is at the end of the discussion
of Fang's procedure and formula. Dr. Fang wrote to me about this article about
a year after its appearance in JQ complaining that I had used "strong words"
in describing his procedure as accomplishing its goals "at the expense
of accuracy". That description is unduly harsh. In accomplishing most of
the work of Flesch's formula with a single count of excess syllables, Fang highlights
the dominant importance of word complexity to readability and provides a tool
that accomplishes most of the work of the Flesch Reading Ease Formula with minimal
measurement effort. It remains true that, in discounting the importance of
sentence length, Fang's formulation does sacrifice accuracy in favor of
rapid measurement (one can be sure that Flesch had the option of not measuring
sentence length and chose to include sl in the R.E. equation). Still, it doesn't give
up much compared with Flesch's formulation, and easy measurement mattered at
the time of his article. While minimizing measurement effort is probably less
important now, it remains that Fang's article was the "shoulders"
(to paraphrase Isaac Newton) that this article stands on. Hence I have modified
my strong language to read "at some expense to accuracy"
and have added the clarifying sentence: "Measurement of the role of
sentence length is minimized in the formula."