A frequency distribution is a tool for organizing data. We use it to group data into categories and show the number of observations in each category. Here are some test scores from a math class.
| 65 | 91 | 85 | 76 | 85 | 87 | 79 | 93 |
| 82 | 75 | 100 | 70 | 88 | 78 | 83 | 59 |
| 87 | 69 | 89 | 54 | 74 | 89 | 83 | 80 |
| 94 | 67 | 77 | 92 | 82 | 70 | 94 | 84 |
| 96 | 98 | 46 | 70 | 90 | 96 | 88 | 72 |
It's hard to get a feel for the data in this format because it is unorganized. To construct a frequency distribution, first identify the lowest and highest values in the list, so that every value in the list fits into one of the categories. The low value here is 46, and the high is 100. A set of categories that would work here is 41-50, 51-60, 61-70, 71-80, 81-90, and 91-100. Here's the finished product:
| Class | Frequency |
| --- | --- |
| 41-50 | 1 |
| 51-60 | 2 |
| 61-70 | 6 |
| 71-80 | 8 |
| 81-90 | 14 |
| 91-100 | 9 |
We can now see that the largest number of test scores fell between 81
and 90, and most of the scores (31 of 40) fell between 71 and 100.
The low number in each category (or class) is called the lower
class limit, and the high number is called the upper class limit.
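The tallying step described above can be sketched in Python. This is a minimal illustration, not the only way to do it; the names `scores`, `classes`, and `frequency_distribution` are made up for this example, and each class is written as a (lower class limit, upper class limit) pair with both limits included.

```python
# The 40 test scores from the table above.
scores = [
    65, 91, 85, 76, 85, 87, 79, 93,
    82, 75, 100, 70, 88, 78, 83, 59,
    87, 69, 89, 54, 74, 89, 83, 80,
    94, 67, 77, 92, 82, 70, 94, 84,
    96, 98, 46, 70, 90, 96, 88, 72,
]

# Each class as (lower class limit, upper class limit), both inclusive.
classes = [(41, 50), (51, 60), (61, 70), (71, 80), (81, 90), (91, 100)]


def frequency_distribution(values, class_limits):
    """Count how many values fall inside each class."""
    counts = {limits: 0 for limits in class_limits}
    for value in values:
        for lower, upper in class_limits:
            if lower <= value <= upper:
                counts[(lower, upper)] += 1
                break
        else:
            # Every value must fit into exactly one class.
            raise ValueError(f"{value} does not fit any class")
    return counts


freq = frequency_distribution(scores, classes)
for (lower, upper), count in freq.items():
    print(f"{lower}-{upper}: {count}")
```

The `ValueError` branch reflects the reason for finding the low and high values first: a value that fits no class means the categories were chosen badly.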
Now for some guidelines for constructing a frequency distribution.
After the first two rules above, the rest are merely suggestions.
Each set of data may require you to violate some of these suggestions.
The best advice is to try to follow them whenever possible.
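One further extension to the frequency distribution is to look at the percentage of values that fall in each category, that is, the class frequency divided by the total number of values (40 here). This is called a relative frequency distribution or percent frequency distribution. Here's how the above data would be presented in this way.
| Class | Relative Frequency |
| --- | --- |
| 41-50 | 2.5% |
| 51-60 | 5% |
| 61-70 | 15% |
| 71-80 | 20% |
| 81-90 | 35% |
| 91-100 | 22.5% |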
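As a rough sketch, the percentages can be computed by dividing each class count by the total. The dictionary `class_counts` below is an assumption: it holds the counts obtained by tallying the 40 scores above into the six classes, and the variable names are illustrative only.

```python
# Class counts assumed from tallying the 40 scores into the six classes.
class_counts = {
    "41-50": 1,
    "51-60": 2,
    "61-70": 6,
    "71-80": 8,
    "81-90": 14,
    "91-100": 9,
}

total = sum(class_counts.values())

# Relative frequency = class count / total, expressed as a percentage.
relative = {label: 100 * count / total for label, count in class_counts.items()}

for label, pct in relative.items():
    print(f"{label}: {pct:.1f}%")
```

A quick sanity check is that the percentages add up to 100.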
The final frequency distribution that we will discuss is the cumulative frequency distribution. Think about the word cumulative: it generally refers to some sort of running total. A cumulative frequency distribution is a way to list how many values fit into the first class, the first 2 classes, the first 3 classes, etc., or the last class, the last 2 classes, etc. Here's a cumulative "less than" frequency distribution for the above set of data.
| Scores | Cumulative Frequency |
| --- | --- |
| 50 or less | 1 |
| 60 or less | 3 |
| 70 or less | 9 |
| 80 or less | 17 |
| 90 or less | 31 |
| 100 or less | 40 |
The 1 means that there is 1 value that is 50 or less, the 3 means
that there are 3 values that are 60 or less, the 9 means that
there are 9 values that are 70 or less, and so on.
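The running totals described above can be sketched with `itertools.accumulate` from the Python standard library. The list `counts` is assumed to hold the class frequencies tallied from the 40 scores, in order from the lowest class to the highest; the variable names are illustrative.

```python
from itertools import accumulate

# Upper class limits and class counts, lowest class first
# (counts assumed from tallying the 40 scores above).
upper_limits = [50, 60, 70, 80, 90, 100]
counts = [1, 2, 6, 8, 14, 9]

# Running total: first class, first 2 classes, first 3 classes, ...
cumulative_less = list(accumulate(counts))

for upper, running_total in zip(upper_limits, cumulative_less):
    print(f"{upper} or less: {running_total}")
```

The last running total should always equal the number of observations, which is 40 here.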
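Now for a cumulative "greater than" frequency distribution.
| Scores | Cumulative Frequency |
| --- | --- |
| 41 or more | 40 |
| 51 or more | 39 |
| 61 or more | 37 |
| 71 or more | 31 |
| 81 or more | 23 |
| 91 or more | 9 |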
The 40 means that there are 40 values that are 41 or more, the
39 means that there are 39 values that are 51 or more, the 37
means that there are 37 values that are 61 or more, and so on.
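The "greater than" version can be sketched the same way by accumulating from the last class backward. As before, `counts` is assumed to hold the class frequencies tallied from the 40 scores, lowest class first, and the names are illustrative.

```python
from itertools import accumulate

# Lower class limits and class counts, lowest class first
# (counts assumed from tallying the 40 scores above).
lower_limits = [41, 51, 61, 71, 81, 91]
counts = [1, 2, 6, 8, 14, 9]

# Accumulate from the last class backward (last class, last 2 classes, ...),
# then flip the result so it lines up with lower_limits again.
cumulative_more = list(accumulate(reversed(counts)))[::-1]

for lower, running_total in zip(lower_limits, cumulative_more):
    print(f"{lower} or more: {running_total}")
```

Here the first entry equals the number of observations, since every score is 41 or more.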
Practice Questions