A Note on Statistics
CPWR - The Center for Construction Research and Training
Suggestions on how to interpret statistics and charts
eLCOSH makes every effort to present the best-quality statistics available, but all numbers need interpretation. Here are some guidelines.
- Look carefully at the source for any data and whether the source might have reason to be biased.
- Look at how the data were collected and how they are presented. Two good ways to collect data are a census - in which, for instance, every known work-related death is counted nationwide - or a statistical sample, such as when questionnaires are given to a scientifically selected group of people and the results are used to estimate what is true for the whole population. For instance, a study about senior citizens' exercise habits in the U.S. might be based on interviews with a randomly chosen group of men and women over age 65.
- The best source for construction safety and health data in the U.S. is the Bureau of Labor Statistics, part of the U.S. Department of Labor. Most other data sources on safety and health are based on incomplete information. What is included or missing can affect the results.
- Be clear about what the data are supposed to show. For instance, are they showing deaths of all construction workers on highway construction sites in the U.S. or all deaths on those sites in the U.S.? The second group would include people driving through who are not construction workers.
- Be wary of reported numbers or percentages that are based on a small data set. If, for instance, you’re comparing information about occupations and your sample includes only 10 of 10,000 aerobics instructors nationwide, don’t place much value on the statistics for aerobics instructors. Or, if you’re looking at statistics about deaths from a given hazard, be careful if a census shows only a few deaths per year. (This is why some reports combine and average numbers over 3 or 5 years – to avoid fluctuations from year to year that don’t mean much statistically.) When the numbers are small, even a slight change in circumstances could completely change the rankings of items.
Here’s an example of the problem of small numbers: A chart might list serious injuries/illnesses in a city park in 2003 this way:
Falls 10 (40%)
Boating 7 (28%)
Plants 5 (20%)
Other 3 (12%)
Total 25 (100%)
If the numbers are small (as shown here), don’t pay attention to the percentages. With such small numbers, it is best to conclude simply that falls, boating, and contact with plants such as poison ivy cause injuries and illnesses in that park. Do not focus on which was ranked first in 2003 or whether falls out of trees and on the ground caused about 43% more injuries than boating accidents did. Why? Because, for instance, a single rowboat tipover in the pond could injure 4 people and completely switch the ranking of the hazards. Or, unfortunately, a vehicle might jump a sidewalk curb and injure people walking along a park path, thus expanding the category labeled “other.”
- Before you compare two sets of statistics, check that they can actually be compared. For instance, injury data for two parks cannot be compared directly if one park has boating and the other does not.
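The small-numbers problem in the city park example can be checked with a few lines of arithmetic. The sketch below is only an illustration: it uses the counts from the park chart, and the function names (`shares`, `ranking`) and the single extra incident are assumptions made for the example, not part of any eLCOSH or Bureau of Labor Statistics method.

```python
# Injury/illness counts from the city park chart in the example above.
counts = {"Falls": 10, "Boating": 7, "Plants": 5, "Other": 3}


def shares(counts):
    """Return each category's percentage of the total, rounded to a whole number."""
    total = sum(counts.values())
    return {name: round(100 * n / total) for name, n in counts.items()}


def ranking(counts):
    """Return category names ordered from most to fewest cases."""
    return sorted(counts, key=counts.get, reverse=True)


print(shares(counts))   # percentages as listed in the chart
print(ranking(counts))  # Falls is ranked first

# Hypothetical single event: one rowboat tips over and injures 4 more people.
after_tipover = dict(counts, Boating=counts["Boating"] + 4)

print(shares(after_tipover))
print(ranking(after_tipover))  # Boating now outranks Falls
```

With only 25 cases in total, one added incident moves boating from second place to first and shifts every percentage, which is why the rankings and percentages in such a chart should not be given much weight.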