By Thomas C. Grubb, Jr., Department of Zoology, Ohio State University
The Boxwood Press; Pacific Grove; 1986
(Used with Permission)



Chapter Two  Analytical Ornithology
In the Preface, I mentioned the descriptive study of Song Sparrows made by Mrs. Margaret Morse Nice, and noted that I planned to illustrate procedures that would show how descriptive ornithology, such as Mrs. Nice's research, could lead to the analytical phase of hypothesis testing.
In order to do ornithology of an analytical, hypothesis-testing nature, we need to employ analytical statistics. The word statistic means a numerical fact, a measurement. Any statistic is classified as being either descriptive or analytical. Descriptive statistics measure by describing something and analytical statistics measure by comparing things. Why do we need statistics at all? In the Chapter 1 example of white birds and aquatic food, we tested predictions without using any statistics. In general, the reason we need statistics is so we can deal with the variation shown by all biological phenomena. Let's explore further the idea of variation in biology. Suppose I asked you to tell me how tall American Woodcocks are, or how red male Northern Cardinals are, or how loudly American Robins sing. First, you would have to decide what unit of measurement to use: centimeters of height for the woodcocks, wavelength of light for the cardinals, decibels of sound for the robins. However, you would encounter a problem immediately. You would find that no two woodcocks were exactly the same height, no two cardinals exactly the same color, and no two robins exactly the same loudness. Indeed, for our test of white birds and aquatic food, we assumed that all species of white birds were the same "whiteness," an assumption which is unlikely to be true.
In ornithology, we have the problem of how to describe things which are variable. We respond to this problem by using, together, two kinds of descriptive statistics: measures of central tendency and measures of dispersion. One measure of central tendency everyone knows is the average. Familiar examples are baseball players' batting averages, average annual family income, and average monthly rainfall. The average, known in science as the mean, is a statistic which describes the central tendency of a group of values. The mean equals the sum of the values in a group divided by the number of values in that group. Other descriptive measures of central tendency are the median, which is the middle value in a group, and the mode, which is the most frequent value in a group.
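The three measures of central tendency just defined can be computed in a few lines. The following is a minimal sketch in Python; the wing lengths are invented for illustration and do not come from any real sample.

```python
from statistics import mean, median, mode

# Hypothetical wing lengths in millimeters, invented for illustration only.
wing_lengths_mm = [64, 66, 66, 67, 68, 70, 75]

sample_mean = mean(wing_lengths_mm)      # sum of the values / number of values
sample_median = median(wing_lengths_mm)  # middle value once the group is sorted
sample_mode = mode(wing_lengths_mm)      # most frequent value in the group

print("mean:", sample_mean)
print("median:", sample_median)
print("mode:", sample_mode)
```

In this made-up group the three measures differ slightly (68, 67, and 66), because the single long wing of 75 mm pulls the mean upward more than it moves the median or the mode.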
Measures of dispersion are the second major form of descriptive statistic. The best known measure of dispersion is called the variance. It reveals how much variation exists among the values clustered about a mean or average, that is, how spread out those values are. Two groups of values with the same mean can have quite different variances, as illustrated in Fig. 2.1. As a further example, consider two unusual decks of playing cards. One deck contains only fours, fives and sixes, while the second deck has aces (ones) through nines. Now suppose that you drew one card from a deck, wrote down its value, replaced it in the deck, shuffled the cards thoroughly, then drew another card in the same way, and you did this over and over for each of the two decks. You would find that your samples of values from both decks had just about the same mean of five, but that their measures of dispersion were different. The sample from the deck with aces through nines would have the greater variation in values, the greater variance. The variance is simply a mathematical way of describing how extensively values are spread about the mean or average.
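The card-drawing exercise above can be imitated on a computer. The sketch below assumes each deck holds four copies of every card value, draws 2,000 cards with replacement from each deck, and compares the resulting means and variances. The deck sizes, the number of draws, and the random seed are assumptions made only for this illustration.

```python
import random
from statistics import mean, pvariance, variance

def draw_with_replacement(deck, n_draws, rng):
    """Draw a card, record its value, return it to the deck, and repeat."""
    return [rng.choice(deck) for _ in range(n_draws)]

rng = random.Random(1986)  # fixed, arbitrary seed so the run is repeatable

# Assumed composition: four copies of each card value in both decks.
narrow_deck = [4, 5, 6] * 4           # only fours, fives, and sixes
wide_deck = list(range(1, 10)) * 4    # aces (ones) through nines

narrow_sample = draw_with_replacement(narrow_deck, 2000, rng)
wide_sample = draw_with_replacement(wide_deck, 2000, rng)

# Both sample means should fall close to five.
print("sample means:", mean(narrow_sample), mean(wide_sample))
# The aces-through-nines sample should show the larger variance.
print("sample variances:", variance(narrow_sample), variance(wide_sample))
# Variances of the decks themselves, for comparison: 2/3 and 20/3.
print("deck variances:", pvariance(narrow_deck), pvariance(wide_deck))
```

The sample variances will not match the deck variances exactly, which is itself a small demonstration of sampling error.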
Figure 2.1
Imaginary means and variances. Parts A
and B of the figure show imagined groups of values. The size
of a value increases from left to right along the horizontal
axes, and the number of values increases from bottom to top
along the vertical axes. The two groups of values have
identical means, but the values in group B are more
dispersed about the mean. Because of this greater variation,
group B would have a larger calculated variance than would
group A.
The amateur ornithologists of Mrs. Nice's era and earlier were quite comfortable using descriptive statistics like the mean, as we all are. The disadvantage from which amateurs have had difficulty recovering arose when ornithology became an analytical, hypothesis-testing science as well as continuing to be a descriptive one. At that point, more involved procedures termed analytical statistics were called upon to overcome the problems posed by the variation inherent in biological records. This chapter and this book will consider these analytical statistics.
Analytical statistics are necessary tools for testing predictions. Indeed, it seems we find predictions fascinating not only in ornithology or biology, but in almost any field of endeavor. Here are a few representative hypotheses and their predictions. If your nephew had earned his Master's degree (hypothesis), he would now be earning a higher salary (prediction). If the combined French and Spanish fleet had not been positioned perpendicular to Nelson's ships-of-the-line (hypothesis), it would have triumphed at Trafalgar (prediction). If I cut my fingernails (hypothesis), I won't wear holes in my gloves so fast (prediction). Two of these predictions cannot be tested; they are not scientific predictions. In the fingernails example, however, we can compare what would happen to my gloves if I cut my fingernails with what would happen if I did not cut my fingernails. That is, we can establish a controlled test of the prediction to determine whether it is true or false. This procedure of evaluating a prediction (and, thereby, a hypothesis) by means of a controlled test is the heart of doing analytical science of any kind. If you perform a controlled test of a hypothesis, you are doing analytical science. If you don't, you're not. We will never be able to test the predictions about the Spanish-French fleet or about your nephew's Master's degree because a controlled test is impossible. The distinction between doing analytical science and doing any other intellectual pursuit is hard, clear, and permanent.
Analytical ornithologists evaluate hypotheses by subjecting the hypotheses' predictions to testing. Why are analytical statistics necessary for these tests? They are necessary because they help us to decide whether any test result is real, or could be due just to an error in sampling. Suppose we consider the hypothesis, based on physiological principles, that within a bird species, body size is inversely related to environmental temperature. That is, the colder the environment, the larger the individuals of a given species that live in that environment will be. Such a hypothesis is known to ecologists as Bergmann's Rule. What testable predictions can we deduce from this hypothesis? One prediction might be that, since average temperature drops progressively from the equator to the poles, any North American bird species with a large latitudinal range, the Northern Bobwhite for instance, should be represented by smaller individuals in the southern part of its range than in the northern part. Let's say we test this prediction by capturing and weighing samples of bobwhites in Alabama and Ohio. We will prove true the prediction and support the hypothesis (Bergmann's Rule) if the Ohio birds are heavier than the Alabama birds. Here is where a problem immediately arises which can only be solved with analytical statistics. The problem occurs because there isn't just one size bobwhite in Ohio and one size in Alabama. If there were only one size in each place, it would be easy to tell whether Ohio birds were larger, but in both places there will be variation in the size of bobwhites in our samples, and this variation is the problem because it can have either of two causes. The variation can be real, truly reflecting the variation in size of all the bobwhites in Ohio or Alabama, or it can be the result of sampling error, a sample of bobwhite weights that does not reflect the variation of weights within a whole statewide population.
From our sample of weights taken in each state, we can calculate a measure of central tendency such as the mean, and a measure of dispersion such as the variance. If the two means are different, the question we have to deal with is this: Are they different because bobwhites in Ohio and Alabama really are of different size, or are they different because, purely by chance, we happened to draw heavy birds from one state and lighter birds from the other state? Fig. 2.2 illustrates the uncertainty that increasing variance introduces into the testing of predictions. For simplicity, Fig. 2.2 ignores the known difference in weights between male and female bobwhites.
Figure 2.2
Imaginary means and variances of the
weights of Northern Bobwhites in Alabama and Ohio. In parts
A, B, and C, the vertical lines indicate the positions of
the mean weights in Alabama and Ohio. The mean values for
birds from each state are identical in all three parts of
the figure. In part A, all birds in the same state have the
same weight. Because the variance in part A is zero, it
appears safe to conclude that Ohio birds are heavier. In
part B, some intrastate variation in weights is evident,
but because every Ohio bobwhite sampled was heavier than
every Alabama bobwhite sampled, it still seems reasonable to
conclude that there is a real difference between Alabama and
Ohio in the weight of Northern Bobwhites. In part C, the
variation within a state is extensive, and some Alabama
birds sampled were heavier than some Ohio birds sampled.
Because the resulting variances are large, we cannot be
sure, initially, whether a real difference exists between
the weights of Alabama and Ohio bobwhites in spite of the
overlap between the two samples of weights. The use of
analytical statistics provides a method for determining
whether we should believe that the difference between the
means shown in part C is real, or whether we should conclude
that the difference could have been caused solely by chance
sampling error.
This point about the sources of variation is important enough to be approached another way. Suppose you have two jars, each containing 1,000 red or white marbles. How would you test the hypothesis that one jar has a higher percentage of red marbles than the other? The surest way, of course, would be to count every last red marble in each jar. Then you could be absolutely sure whether the hypothesis was correct. But suppose you don't have enough time to make a complete count and you must evaluate the hypothesis based on samples of only 100 marbles from each jar. You could then predict that if one jar has a higher percentage of red marbles, then a sample of 100 marbles from that jar should have more red ones than a sample of 100 marbles from the second jar. Let's imagine we take the two samples and find that in one case, 25 of the 100 marbles are red, while in the second case, 35 of the 100 marbles are red. We are aware that in each sample the percentage of red marbles is probably not exactly the same as the percentage of red marbles in the whole jar. The difference of 10 percentage points in the proportion of red marbles in the samples from the two jars could have occurred either because one jar actually contained a higher percentage of red marbles, or it could have been caused solely by chance. To belabor the obvious, the two jars could represent Ohio and Alabama and the marbles could denote bobwhites, with the simplification that each state has only "big" or "little" bobwhites, not the continuous range of sizes we see in the real world.
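For the marble samples, the role of chance can be put into numbers. The sketch below is one possible approach, not a procedure taken from this book: it supposes that jar identity makes no difference, so that the 60 red marbles observed in total are split between the two samples of 100 purely by chance, and it adds up the probability of every split at least as lopsided as 25 versus 35. The function name is hypothetical.

```python
from math import comb

def chance_probability(red_1, red_2, sample_size):
    """Probability of a red-marble split at least as uneven as the one
    observed, assuming the observed red marbles fall into the two samples
    purely by chance (an exact calculation of the Fisher type)."""
    total_red = red_1 + red_2
    total_marbles = 2 * sample_size
    observed_gap = abs(red_1 - red_2)

    # Number of ways to choose which marbles make up the first sample.
    all_splits = comb(total_marbles, sample_size)

    uneven_splits = 0
    lowest = max(0, total_red - sample_size)
    highest = min(total_red, sample_size)
    for reds_in_first in range(lowest, highest + 1):
        reds_in_second = total_red - reds_in_first
        if abs(reds_in_first - reds_in_second) >= observed_gap:
            ways_red = comb(total_red, reds_in_first)
            ways_white = comb(total_marbles - total_red,
                              sample_size - reds_in_first)
            uneven_splits += ways_red * ways_white
    return uneven_splits / all_splits

p = chance_probability(25, 35, 100)
print("probability that chance alone gives a gap this large:", round(p, 3))
```

Under these assumptions the probability comes out well above one in twenty, so samples of 25 and 35 red marbles, taken alone, leave real doubt about whether the jars differ.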
Are we dealing with a difference in weights of bobwhites between Alabama and Ohio, or with a chance event? We must know the likelihood of a chance outcome being responsible before we can declare the prediction true or false. The use of analytical statistics tells us this likelihood, the probability that the difference we found was due to random chance events associated with taking a small sample from a large population. Suppose we find that the Ohio bobwhites we sample have a higher mean weight than that of the birds in our sample from Alabama, as illustrated in Fig. 2.2. Analytical statistics can tell us what the probability is that the difference in weights was due to sampling error. If that probability is very low, we might confidently support the prediction that the average bobwhite in Ohio is larger than the average Alabama bird.
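One way to estimate such a probability, shown here only as a sketch, is a shuffling (permutation) test: pool the weights from both states, deal them at random into two groups over and over, and count how often chance alone produces a difference in means at least as large as the one observed. The weights below are invented for illustration, and the function name, number of shuffles, and seed are assumptions, not part of any procedure described in this book.

```python
import random

def average(values):
    """The mean: sum of the values divided by the number of values."""
    return sum(values) / len(values)

def shuffle_test(lighter_group, heavier_group, n_shuffles=10000, seed=0):
    """Estimate the probability that random assignment of the pooled
    weights yields a mean difference at least as large as observed."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    observed = average(heavier_group) - average(lighter_group)
    pooled = list(lighter_group) + list(heavier_group)
    n_first = len(lighter_group)

    at_least_as_large = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        difference = average(pooled[n_first:]) - average(pooled[:n_first])
        if difference >= observed:
            at_least_as_large += 1
    return at_least_as_large / n_shuffles

# Hypothetical bobwhite weights in grams, invented for this example.
alabama_weights = [158, 162, 165, 170, 171, 176, 180, 183]
ohio_weights = [168, 172, 177, 181, 185, 188, 190, 196]

p_value = shuffle_test(alabama_weights, ohio_weights)
print("mean difference (g):", average(ohio_weights) - average(alabama_weights))
print("estimated probability due to chance:", p_value)
```

The test is one-sided because the prediction is directional: Bergmann's Rule predicts heavier birds in Ohio, not merely different ones. Note that the two invented samples overlap, as in part C of Fig. 2.2, yet the estimated probability can still be low.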
Ornithologists and many other scientists have adopted what is called the 5% level of confidence. This means that if the probability is 5% or less that a difference between two samples could be due to chance alone, then the difference is considered to be real, actually to exist in nature. If the probability is greater than 5% that chance alone could have caused the difference, the difference is not considered to be real. Because of this procedure, ornithologists and other scientists have an objective basis for rejecting or failing to reject hypotheses. That is, they have a method for determining which explanations about nature they should believe. An optional section at the end of each of the following 20 projects in field ornithology involves using analytical statistics and the 5% level of confidence to evaluate hypotheses about the behavior and ecology of wild birds.