Chapter 6 Inductive Reasoning and Arguments
This chapter is taken from “Logical Reasoning” by Bradley H. Dowden
Inductive reasoning consists of several independent topical areas, each of which focuses on a particular kind of inductive argument. This chapter introduces some of these different kinds of inductive reasoning.
Generalizations
If it looks like a duck, walks like a duck, and quacks like a duck, then it’s a duck. This is usually good reasoning. It’s probably a duck. Just don’t assume that it must be a duck for those reasons. The line of reasoning is not sure-fire. It is strong inductive reasoning, but it is not strong enough to be deductively valid. Deductive arguments are arguments intended to be judged by the deductive standard of, “Do the premises force the conclusion to be true?” Inductive arguments are arguments intended to be judged by the inductive standard of, “Do the premises make the conclusion probable?” So, the strengths of inductive arguments range from very weak to very strong.
This chapter focuses specifically on the nature of the inductive process because inductive arguments play such a central role in our lives. We will begin with a very important and very common kind of inductive argument, generalizing from a sample. Then later we will consider the wide variety of inductive arguments. As we shall see, inductive reasoning is about seeing patterns and making claims that extend beyond the data at hand.
Generalizing from a Sample
Scientists collect data not because they are in the business of gathering facts at random but because they hope to establish a generalization that goes beyond the individual facts. The scientist is in the business of sampling a part of nature and then looking for a pattern in the data that holds for nature as a whole. For example, a sociologist collects data about murders in order to draw a general conclusion, such as “Most murders involve guns used on acquaintances.” A statistician would say that the scientist has sampled some cases of murder in order to draw a general conclusion about the whole population of murders. The terms sample and population are technical terms. The population need not be people; in our example it is the set of all murders. A sample is a subset of the population. The population is the set of things you are interested in generalizing about. The sample is examined to get a clue to what the whole population is like. We sample in order to discover a pattern that is likely to hold across the whole population.
The goal in drawing a generalization based on a sample is for the sample to be representative of the population, to be just like it. If your method of selecting the sample is likely to be unrepresentative, then you are using a biased method, and that will cause you to commit the fallacy of biased generalization. If you draw the conclusion that the vast majority of philosophers write about the meaning of life because the web pages of all the philosophers at your university do, then you’ve got a biased method of sampling philosophers’ writings. You should use a more diverse sampling method. Sample some of the philosophers at another university.
Whenever a generalization is produced by generalizing on a sample, the reasoning process (or the general conclusion itself) is said to be an inductive generalization. It is also called an induction by enumeration or an empirical generalization. Inductive generalizations are a kind of argument by analogy with the implicit assumption that the sample is analogous to the population. The more analogous or representative the sample, the stronger the inductive argument.
Generalizations may be statistical or non-statistical. The generalization, “Most murders involve guns,” contains no statistics. Replacing the term most with the statistic 80 percent would transform it into a statistical generalization. The statement “80 percent of murders involve guns” is called a simple statistical claim because it has the form
x percent of the group G has characteristic C.
In the example, x = 80, G = murders, and C = involving guns.
A general claim, whether statistical or not, is called an inductive generalization only if it is obtained by a process of generalizing from a sample. If the statistical claim about murders were obtained by looking at police records, it would be an inductive generalization, but if it were deduced from a more general principle of social psychology, then it would not be an inductive generalization, although it would still be a generalization.
Here’s an example of what inductive reasoning looks like:
You return from the grocery store with your three cans of tomato sauce for tonight’s spaghetti dinner. You open the cans and notice that the sauce in two of the cans is spoiled. You generalize and say that two-thirds of all the cans of that brand of tomato sauce on the shelf in the store are bad. Here is the pattern of your inductive generalization:
x percent of sample S has characteristic C.
————————————————————-
x percent of population P has characteristic C.
In this argument x = 66.7 (for two-thirds), P = all the tomato sauce cans of a particular brand from the shelf of the grocery store, S = three tomato sauce cans of that brand from the shelf of the grocery store, and C = spoiled. Alternatively, this is the pattern:
Sample S has characteristic C. So, population P has characteristic C.
C is now not the property of being spoiled but instead is the property of being 66.7 percent spoiled. Either form is correct, but be sure you know what the C is.
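The tomato sauce example can be sketched in a few lines of code; this is just an illustration of the inductive pattern above, using the figures from the story (nothing here is part of the original argument beyond those figures).

```python
# Inductive generalization: project a sample proportion onto the population.
# Figures are from the tomato sauce example in the text.

def generalize(sample_with_C, sample_size):
    """Return the proportion of the sample having characteristic C,
    which the inductive step projects onto the whole population P."""
    return sample_with_C / sample_size

# Sample S: 3 cans; characteristic C: being spoiled; 2 cans were spoiled.
x = generalize(2, 3)
print(f"{x:.1%} of sample S has C, so (by induction) "
      f"{x:.1%} of population P has C")
```

The code makes the risk of the inference visible: nothing in the arithmetic guarantees that the shelf matches the sample; the projection from S to P is exactly where the inductive leap happens.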
The goal in taking samples is for the sample to be representative of the population it is taken from, in regard to the characteristics that are being investigated.
The more the sample represents the population, the more likely the inductive generalization is to be correct. By a representative sample we mean a sample that is perfectly analogous to the whole population in regard to the characteristics that are being investigated. If a population of 888 jelly beans in a jar is 50 percent red and 50 percent white, a representative sample could be just two jelly beans, one red and one white. A method of sampling that is likely to produce a nonrepresentative sample is a biased sampling method. A biased sample is a non-representative sample.
The fallacy of hasty generalization occurs whenever a generalization is made too quickly, on insufficient evidence. Technically, it occurs whenever an inductive generalization is made with a sample that is unlikely to be representative. For instance, suppose Jessica says that most Americans own an electric hair dryer because most of her friends do. This would be a hasty generalization, since Jessica’s friends are unlikely to represent everybody when it comes to owning hair dryers. Her sampling method shows too much bias toward her friends.
Random Sample
Statisticians have discovered several techniques for avoiding bias. The first is to obtain a random sample. When you sample at random, you don’t favor any one member of the population over another. For example, when sampling tomato sauce cans, you don’t pick the first three cans you see.
Definition A random sample is any sample obtained by using a random sampling method.
Definition A random sampling method is taking a sample from a target population in such a way that any member of the population has an equal chance of being chosen.
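Python’s standard library happens to implement exactly this definition; in the sketch below, every member of a hypothetical population of cans has an equal chance of being chosen, and none is favored over another.

```python
import random

# A random sampling method: every member of the population has an
# equal chance of being chosen (the definition in the text).
population = [f"can_{i}" for i in range(100)]   # hypothetical: 100 cans

random.seed(42)                            # fixed seed so the sketch repeats
sample = random.sample(population, k=3)    # draw 3, without replacement
print(sample)
```

Contrast this with grabbing the first three cans you see: that method gives the cans at the front of the shelf a much higher chance of selection, so it is biased.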
It is easy to recognize the value of obtaining a random sample, but achieving this goal can be difficult. If you want to poll students for their views on canceling the school’s intercollegiate athletics program in the face of the latest school budget crisis, how do you give everybody an equal chance to be polled? Some students are less apt to want to talk with you when you walk up to them with your clipboard. If you ask all your questions in three spots on campus, you may not be giving an equal chance to students who are never at those spots. Then there are problems with the poll questions themselves. The way the questions are constructed might influence the answers you get, and so you won’t be getting a random sample of students’ views even if you do get a random sample of students.
Purposely not using a random sample is perhaps the main way to lie with statistics. For one example, newspapers occasionally report that students in American middle schools and high schools are especially poor at math and science when compared to students in other countries. This surprising statistical generalization is probably based on a biased sample. It is quite true that those American students taking the international standardized tests of mathematics and science achievement do score worse than foreign students. The problem is that school administrators in other countries try too hard to do well on these tests. “In many countries, to look good is very good for international prestige. Some restrict the students taking the test to elite schools,” says Harold Hodgkinson, the director of the Center for Demographic Policy in Washington and a former director of the National Institute of Education. For example, whereas the United States tests almost all of its students, Hong Kong does not. By the 12th grade, Hong Kong has eliminated all but the top 3 percent of its students from taking mathematics and thus from taking the standardized tests. In Japan, only 12 percent of their 12th grade students take any mathematics. Canada has especially good test results for the same reason. According to Hodgkinson, the United States doesn’t look so bad when you take the above into account.
The following passage describes a non-statistical generalization from a sample. Try to spot the conclusion, the population, the sample, and any bias.
David went to the grocery store to get three cartons of strawberries. He briefly looked at the top layer of strawberries in each of the first three cartons in the strawberry section and noticed no fuzz on the berries. Confident that the berries in his three cartons were fuzz-free, he bought all three.
David’s conclusion was that the strawberries in his cartons were not fuzzy. His conclusion was about the population of all the strawberries in the three cartons. His sample was the top layer of strawberries in each one. David is a trusting soul, isn’t he? Some grocers will hide all the bad berries on the bottom. Because shoppers are aware of this potential deception, they prefer their strawberries in see-through, webbed cartons. If David had wanted to be surer of his conclusion, he should have looked more carefully at the cartons and sampled equally among bottom, middle, and side berries, too. Looking at the top strawberries is better than looking at none, and looking randomly is better than looking non-randomly.
When we sample instances of news reporting in order to draw a conclusion about the accuracy of news reports, we want our sample to be representative in regard to the characteristic of “containing a reporting error.” When we sample voters about how they will vote in the next election, we want our sample to be representative in regard to the characteristic of “voting for the candidates.” Here is a formal definition of the goal, which is representativeness:
Definition A sample S is a (perfectly) representative sample from a population P with respect to characteristic C if the percentage of S that are C is exactly equal to the percentage of P that are C.
A sample S is less representative of P according to the degree to which the percentage of S that are C deviates from the percentage of P that are C.
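The definition can be restated as a small function. The 50 percent figure below is from the earlier jelly bean example; the 40 percent sample is a hypothetical comparison added for illustration.

```python
def representativeness_error(pct_sample_C, pct_population_C):
    """Deviation of the sample from perfect representativeness with
    respect to characteristic C (0.0 means perfectly representative)."""
    return abs(pct_sample_C - pct_population_C)

# Jelly bean jar from the text: the population is 50 percent red.
print(representativeness_error(50.0, 50.0))  # perfectly representative
print(representativeness_error(40.0, 50.0))  # less representative
```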
If you are about to do some sampling, what can you do to improve your chances of getting a representative sample? The answer is to follow these four procedures, if you can:
1. Pick a random sample.
2. Pick a large sample.
3. Pick a diverse sample.
4. Pick a stratified sample.
We’ve already discussed how to obtain a random sample. After we explore the other three procedures, we’ll be in a better position to appreciate why it can sometimes be a mistake to pick a random sample.
Concept Check:
For the following statistical report, (a) identify the sample, (b) identify the population, (c) discuss the quality of the sampling method, and (d) find other problems either with the study or with your knowledge of the study.
Voluntary tests of 25,000 drivers throughout the United States showed that 25 percent of them use some drug while driving and that 85 percent use no drugs at all while driving. The conclusion was that 25 percent of U.S. drivers do use drugs while driving. A remarkable conclusion. The tests were taken at random times of the day at randomly selected freeway restaurants.
(a) The sample is 25,000 U.S. drivers, (b) The population is U.S. drivers, (c) The sample size is large enough, but it is not random, for four reasons: (1) Drivers who do not stop at roadside restaurants did not have a chance of being sampled, (2) the study overemphasized freeway drivers rather than other drivers, (3) it overemphasized volunteers, (4) it overemphasized drivers who drive at 4 a.m. (d) The most obvious error in the survey, or in the report of the survey, is that 25 percent plus 85 percent is greater than 100 percent. Even though the survey said these percentages are approximate, the 110 percent is still too high. Also, the reader would like more information in order to assess the quality of the study. In particular, how did the study decide what counts as a drug, that is, how did it operationalize the concept of a drug? Are these drugs: Aspirin? Caffeine? Vitamins? Alcohol? Only illegal drugs? Did the questionnaire ask whether the driver had ever used drugs while driving, or had ever used drugs period? Did the pollster do the sampling on one day or over many days?
Sample Size
If you hear a TV commercial say that four out of five doctors recommend the pain reliever in the drug being advertised, you might be impressed with the drug. However, if you learn that only five doctors were interviewed, you would be much less impressed. Sample size is important.
Why? The answer has to do with the fact that estimations based on sampling are inductive and thus inherently risky. The larger the sample, the better its chance of being free of distortions from unusually bad luck during the selection of the sample. If you want to predict how California voters will vote in the next election, it would be better to have a not-quite random sample of 10,000 future voters than a perfectly random sample of two future voters.
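A quick simulation illustrates the point about bad luck. The electorate below is hypothetical, with exactly 60 percent favoring one candidate; the code compares the worst estimation error produced by many tiny polls against many large ones.

```python
import random

# Simulation: larger samples are less vulnerable to "bad luck."
# Hypothetical population: exactly 60% of voters favor candidate A.
random.seed(0)
TRUE_P = 0.60

def worst_error(sample_size, trials=1000):
    """Largest deviation of the sample proportion from the truth
    over many repeated polls of the given size."""
    worst = 0.0
    for _ in range(trials):
        hits = sum(random.random() < TRUE_P for _ in range(sample_size))
        worst = max(worst, abs(hits / sample_size - TRUE_P))
    return worst

print("worst error with n = 2:   ", worst_error(2))     # often wildly wrong
print("worst error with n = 1000:", worst_error(1000))  # far smaller
```

With a sample of two, some unlucky poll will find zero supporters and be off by the full 60 points; with a thousand voters per poll, even the unluckiest poll stays close to the truth.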
To maximize the information you can get about the population, you will want to increase your sample size. Nevertheless, you usually face practical limits on the size; sampling might be expensive, difficult, or both.
In creating the government census, it is extremely difficult to contact and count those people who live temporarily on the couch at a friend’s apartment, those who live in their cars and have no address, and those who are moving to a new job in a different state. You can make good estimates about these people, but if you’re required to disregard anyone you haven’t talked to during your census taking, then you’ll under-represent these sorts of people in your census results. People who complain that the government census will make an educated guess about how many people live in a city even if it hasn’t counted all of the people never seem to complain when their doctor samples their own blood rather than takes all of it to examine.
So, when is your sample size big enough for your purposes? This is a fascinating and difficult question. To illustrate, suppose you are interested in selling mechanical feeding systems to the farmers in your state. You would like to know what percentage of them do not already own a mechanical feeding system—they will be your potential customers. Knowing that this sort of information has never been collected, you might try to collect it yourself by contacting the farmers. Since it would be both difficult and expensive to contact every single farmer, you would be interested in getting your answer from a sample of small size. If you don’t care whether your estimate of the percentage of farmers without a mechanical feeding system is off by plus or minus 10 percent, you can sample many fewer farmers than if you need your answer to be within 1 percent of the (unknown) correct answer. Statisticians would express this same point by saying that a 10 percent margin of error requires a smaller sample size than a 1 percent margin of error. All other things being equal, you’d prefer to have a small margin of error than a large one.
Let’s suppose you can live with the 10 percent margin of error. Now, how sure do you need to be that your estimate will fall into that interval of plus or minus 10 percent? If you need only to be 90 percent sure, then you will need a much smaller sample size than if you need to be 97 percent sure. Statisticians would express this same point by saying that a 90 percent confidence level requires a smaller sample size than a 97 percent confidence level. Just exactly how much smaller is a matter of intricate statistical theory that we won’t go into here, although we will explore some specific examples later.
A margin of error is a margin of safety. Sometimes we can be specific and quantify this margin, that is, put a number on it such as 6%. We can say that our sampling showed that the percentage of farmers without a mechanical feeding system is 60 percent plus or minus 6 percent. Sometimes we express the idea vaguely by saying that the percentage is about 60 percent. At any rate, whether we can be specific or not, the greater the margin of error we can permit, the smaller the sample size we need.
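For readers who want a peek at the statistical theory the text sets aside, the sketch below uses the standard normal-approximation formula for required sample size, n = z² · p(1 − p) / E², where E is the margin of error. The formula and the z-scores are standard statistics results, not something derived in this chapter.

```python
import math

# Required sample size for a given margin of error and confidence level,
# using the common normal-approximation formula n = z^2 * p(1-p) / E^2.

Z = {0.90: 1.645, 0.95: 1.96, 0.97: 2.17}   # z-scores by confidence level

def needed_sample_size(margin_of_error, confidence=0.95, p=0.5):
    """p = 0.5 is the worst case, which gives the largest required n."""
    z = Z[confidence]
    return math.ceil(z**2 * p * (1 - p) / margin_of_error**2)

# The trade-off described in the text:
print(needed_sample_size(0.10, 0.90))  # 10% margin, 90% confidence: small n
print(needed_sample_size(0.01, 0.97))  # 1% margin, 97% confidence: large n
```

Running the two cases shows the trade-off concretely: tolerating a 10 percent margin at 90 percent confidence needs only a few dozen farmers, while a 1 percent margin at 97 percent confidence needs more than ten thousand.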
Sample Diversity
In addition to selecting a random, large sample, you can also improve your chances of selecting a representative sample by sampling a wide variety of members of the population. That is, aim for diversity, so that the diversity in the sample is just like the diversity in the population. If you are interested in how Ohio citizens will vote in the next election, will you trust a pollster who took a random sample and ended up talking only to white, female voters? No. Even though those 50 white women were picked at random, you know you want to throw them out and pick 50 more. You want to force the sample to be diverse. The greater the diversity of relevant characteristics in your sample, the better the inductive generalization, all other things being equal.
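One common way to force diversity is the stratified sample mentioned in the list of four procedures: draw from each subgroup in proportion to its share of the population. The Ohio voter mix below is entirely hypothetical, chosen only to make the arithmetic visible.

```python
import random

# Stratified sampling: sample each subgroup (stratum) in proportion
# to its share of the population, so no group can dominate by luck.
# The population mix below is hypothetical.
random.seed(7)
population = (["white woman"] * 400 + ["white man"] * 350 +
              ["black woman"] * 130 + ["black man"] * 120)

def stratified_sample(pop, k):
    """Draw about k members, each stratum in proportion to its size."""
    strata = {}
    for member in pop:
        strata.setdefault(member, []).append(member)
    sample = []
    for group, members in strata.items():
        share = round(k * len(members) / len(pop))
        sample.extend(random.sample(members, share))
    return sample

sample = stratified_sample(population, 50)
print(len(sample), "voters, with every group represented in proportion")
```

A purely random draw of 50 could, by bad luck, land mostly on one group; the stratified draw cannot, which is why a stratified sample can sometimes beat a random one.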
Because one purpose of getting a large, random sample is to get one that is sufficiently diverse, if you already know that the population is homogeneous—that is, not especially diverse—then you don’t need a big sample, or a particularly random one. For example, in 1906 the Chicago physicist R. A. Millikan measured the electric charge on electrons in his newly invented oil-drop device. His measurements clustered around a precise value for the electron’s charge. Referring to this experiment, science teachers tell students that all electrons have this same charge. Yet Millikan did not test all electrons; he tested only a few and then generalized from that sample. His sample was very small and was not selected randomly. Is this grounds for worry about whether untested electrons might have a different charge? Did he commit the fallacy of hasty generalization? No, because physical theory at the time said that all electrons should have the same charge. There was absolutely no reason to worry that Tuesday’s electrons would be different from Wednesday’s, or that English electrons would be different from American ones. However, if this theoretical backup weren’t there, Millikan’s work with such a small, nonrandom sample would have committed the fallacy of hasty generalization. The moral: Relying on background knowledge about a population’s lack of diversity can reduce the sample size needed for the generalization, and it can reduce the need for a random sampling procedure.
When you are sampling electrons or protons, if you’ve seen one you’ve seen them all, so to speak. The diversity just isn’t there, unlike with, say, Republican voters, who vary greatly from each other. If you want to sample Republican voters’ opinions, you can’t talk to one and assume that his or her opinions are those of all the other Republicans. Republicans are heterogeneous, the fancy term for being diverse. A group having considerable diversity in the relevant factors affecting the outcome of interest is said to be a heterogeneous group.
A group with a relatively insignificant amount of diversity is said to be a homogeneous group. For example, in predicting the outcome of measuring the average height of two groups, Americans and Japanese, the diversity of American ethnicity makes Americans a heterogeneous group compared to the more homogeneous Japanese group. It is easier to make predictions for homogeneous groups than for heterogeneous groups.
Being homogeneous is relative, however. The Japanese might be more homogeneous than Americans relative to measurements about height, but the Japanese might be more heterogeneous than Americans when it comes to attitudes about socialism and about how to care for infants.
Obstacles to Collecting Reliable Data
So far in our discussion of significant statistics, we have worried about how to make decisions using reliable information from a sample of our population. To obtain significant statistics, we try to obtain a representative sample by getting one that is diverse, random, and large. A major obstacle to obtaining a representative sample is that unreliable data too easily creep into our sample.
If you own a radio station and decide that over 80% of your listeners like that song by singer Katy Perry because over 80% of those who texted your station (about whether they like that song) said they liked it, then you’ve made too risky an assumption. Those who texted you weren’t selected at random from your pool of listeners; they selected themselves. Self-selection is a biased selection method that is often a source of unreliable data.
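A short simulation shows how badly self-selection can distort an estimate. All of the rates below are hypothetical: 40 percent of listeners truly like the song, but fans are ten times more likely to text in.

```python
import random

# Self-selection bias: fans text in at a much higher rate, so the
# texters overstate the true like-rate. All numbers are hypothetical.
random.seed(1)
listeners = [random.random() < 0.40 for _ in range(100_000)]  # 40% like it

def texted(likes_song):
    """Fans text at a 10% rate; non-fans at only 1% (self-selection)."""
    return random.random() < (0.10 if likes_song else 0.01)

texters = [likes for likes in listeners if texted(likes)]
print(f"true like-rate among listeners: {sum(listeners) / len(listeners):.0%}")
print(f"like-rate among texters:        {sum(texters) / len(texters):.0%}")
```

Even though only 40 percent of listeners like the song, the great majority of texters do, so the station owner who trusts the texts badly overestimates the song’s popularity.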
There is the notorious problem of lying to pollsters. The percentage of polled people who say they’ve voted in the election is usually higher than the percentage of people who actually did. More subtly, people may practice self-deception, honestly responding “yes” to questions such as “Are you sticking to your diet?” when they aren’t. Another problem facing us pollsters is that even though we want diversity in our sample, the data from some groups in the population may be easier to obtain than from other groups, and we may be tempted to favor ease over diversity. For example, when counting Christians worldwide, it is easier for us to get data from churches of people who speak some languages rather than others and who are in some countries rather than others and who are in modern cities rather than remote villages.
There are other obstacles to collecting reliable data. Busy and more private people won’t find the time to answer our questions. Also, pollsters occasionally fail to notice the difference between asking “Do you favor Jones or Smith?” and “Do you favor Smith or Jones?” The moral is that natural obstacles and sloppy methodology combine to produce unreliable data and so to reduce the significance of our statistics.
Argument from Authority
Suppose a high school science teacher says to you, “The scientists I’ve read agree that Neptune is a cold planet compared to Mars, Earth, and Venus. So, Neptune is definitely a cold planet.” This argument from authority does not jump to conclusions. The high school teacher offers expert testimony, although it is secondhand. It might be called hearsay in a courtroom, but it is reasonable grounds for accepting the conclusion. So, the conclusion follows with probability. But with how much probability? Nobody knows, not even the scientists. Nobody can say authoritatively whether the conclusion is 85 percent probable or instead 90 percent probable. All they can properly say is that the appeal to authority makes the conclusion a safe bet because the proper authorities have been consulted, they have been quoted correctly, and it is well known that the experts do not significantly disagree with each other about this.
The conclusion of the following argument is not such a safe bet:
The scientists say astral travel is impossible. That is, our spiritual bodies can’t temporarily leave our physical bodies and travel to other places. So they say. However, my neighbor and several of her friends told me they separately traveled to Egypt while their physical bodies were asleep last night. They visited the pyramids. These people are sincere and reliable. Therefore, the scientists are wrong about astral travel.
Is this a successful inductive argument? The arguer asks us to accept stories from his neighbor and her friends. These anecdotes are pitted against the claims of the scientists. Which should you believe? Scientists have been wrong many times before; couldn’t they be wrong here, too? Yes, they could, but it wouldn’t be a good bet. If you had some evidence that could convincingly show the scientists to be wrong, then you, yourself, would likely soon become a famous scientist. You should be cautious about jumping to the conclusion that the scientists are wrong. The stories are so extraordinary that you really need extraordinarily good evidence to believe them. The only evidence in favor of the stories is the fact that the neighbors and friends, who are presumed to be reasonable, agree on their stories and the fact that several times in history other persons also have claimed to be astral travelers.
The neighbor might say that she does have evidence that could convincingly show the scientists to be wrong but that she wouldn’t get a fair hearing from the scientists because their minds are closed to these possibilities of expanding their consciousness. Yes, the scientists probably would give her the brush-off, but by and large the scientific community is open to new ideas. She wouldn’t get the scientists’ attention because they are as busy as the rest of us, and they don’t want to spend much time on unproductive projects. However, if the neighbor were to produce some knowledge about the Egyptian pyramids that she probably couldn’t have gotten until she did her astral traveling, then the scientists would look more closely at what she is saying. Until then, she will continue to be ignored by the establishment.
Most of what we know we have gotten from believing what the experts said, either firsthand or, more likely, secondhand. Not being experts ourselves, our problem is to be careful about sorting out the claims of experts from the other claims that bombard us, while being aware of the possibility that experts are misinterpreted, that on some topics they disagree, and that occasionally they themselves cannot be trusted to speak straightforwardly. Sensitive to the possibility of misinterpreting experts, we prefer firsthand testimony to secondhand, and secondhand to thirdhand. Sensitive to disagreement among the experts, we prefer unanimity and believe that the greater the consensus, the stronger the argument from authority.
Also, we are sensitive to when the claim is made and to what else is known about the situation. For example, a man returning from a mountaintop might say to you, “Wow, from there the world looks basically flat.” Twenty anecdotes from twenty such people who independently climbed the same mountain do not make it twenty times more likely that the world is flat. You can’t trust the twenty stories because you know there is much better evidence to be had. However, in the days when the Egyptians were building their pyramids, the twenty anecdotes would actually have made it more reasonable to believe that the world is flat, although even then it wouldn’t have been twenty times more.
It’s important to resist the temptation to conclude that in ancient times people lived on a flat world but that now they live on a round one. This is just mumbo jumbo; the world stayed the same—it was people’s beliefs about the world that changed. Do not overemphasize the power of the mind to shape the world.
Arguments from Analogy
Analogies are used for all kinds of things, including jokes, descriptions of things or events, and in arguments. Here’s an example:
Suppose that for several months a scientist gives experimental drug D to a variety of dogs confined to cages. A group of similar caged dogs does not receive the drug. The scientist then tests to see whether the dogs receiving drug D are more cardiovascularly fit than the ones not receiving the drug. The scientist checks blood pressure, stamina, and other physiological measures. The scientist’s initial conclusion is that dogs that get the drug are no more cardiovascularly fit than the other dogs. The scientist’s final conclusion is that, for humans, taking drug D will be no substitute for getting lots of exercise, as far as cardiovascular fitness is concerned.
This argument uses what analogy? Let’s figure it out. Here is the argument in standard form:
Dogs are like humans in many ways.
Dogs cannot use drug D as a substitute for exercise.
———————————————————————–
Humans cannot use drug D as a substitute for exercise.
The conclusion follows with probability.
Whether this argument succeeds depends on accepting the analogy between people and dogs. If the analogy is unacceptable, the argument breaks down. Scientists get into serious disputes about whether testing drugs on rats, dogs, and rabbits gives reliable information about how these drugs will affect human beings. These disputes are about analogy. To generalize, the simplest inductive arguments from analogy have the following form:
As are analogous to Bs in several respects.
As have characteristic C.
──────────────────
Bs have characteristic C.
Characteristics are the same thing as properties or qualities. In the drug-testing example, A = dogs, B = humans, and C = the characteristic of not being able to use drug D as a substitute for exercise. If A’s have characteristic C but B’s do not, the analogy between A and B is a faulty analogy as far as C is concerned. The phrase “in several respects” is there to remind us that when we are assessing some piece of reasoning that uses an analogy, we always need to keep in mind which aspects of the analogy should be taken seriously and which should be ignored.
Advertising that uses testimonials often promotes an argument by analogy. Take the Hollywood beauty who testifies to the TV viewer: “I got a silicone breast implant from Dr. Wrigley, and I got the lead part in a commercial. His plastic surgery can help you, too.” You, the female viewer, are being asked implicitly to accept the analogy with your own situation and conclude that the surgery will get you what you want. But as a logical reasoner you will confront the analogy directly by thinking something like this: “That’s fine for her, but I’m not trying to get a part in a commercial, so realistically what does her testimony have to do with me in my situation?”

By criticizing the analogy in the argument that the TV program encourages you to create, you are using the technique of pointing out the disanalogies. The disanalogies are the differences, the ways in which the two are not analogous. We point out disanalogies when we say, “Yes, they’re alike, but not in the important ways.” We are apt, also, to use this method in response to the analogy between people and shrimp by pointing out that we are not like shrimp in terms of sensitivity to pain, or intelligence, or moral worth.

A second method of attacking an argument by analogy is to extend the analogy. We do this when we find other ways the two things are similar and then draw obviously unacceptable conclusions from this similarity. For example, we can attack the argument that uses the analogy between people and dogs by saying, “Dogs are like people in other ways, too. For example, we both like to eat meat. Since dogs enjoy their meat raw, you won’t mind eating your hamburger raw tonight, will you?” When the original advocate of the cardiovascular argument answers, “No, we aren’t that much like dogs,” you can respond with “I agree, so how can you be so sure we are like dogs when it comes to taking drug D?”
Let’s evaluate this argument by analogy from 1940:
Armies are like people. If you cut off the head, the body may thrash around a bit, but very soon it quits fighting. So, a good way to win this European war against the Nazis and Fascists would be to concentrate all our energies on killing Hitler and Mussolini.
Argument evaluation:
There is no doubt that if you cut off someone’s head, the person will soon stop fighting. The problem is whether there is a message here for how to win World War II against the German and Italian armies led by Hitler and Mussolini, respectively. To some extent armies are like people. They eat, they sleep, they move, they fight. On the other hand, to some extent armies are not like people. They are composed of more than one person, they can be in many places at once, and a new head can easily be appointed, and so forth. The most important disanalogy, however, is that the person without a head has to stop fighting, but an army without a supreme leader does not have to stop fighting. Maybe the two armies would stop fighting if their supreme leaders were killed, but the argument by analogy does not provide a strong reason for this conclusion. In short, a person without a head has no brains; an army without a head still has the brains of its officer corps and individual soldiers. A much better case could be made for killing the supreme leader if it could be shown that, throughout history, armies have stopped fighting when their supreme leaders have been killed.
Reasoning Based on Cause and Effect
Causal arguments are arguments in support of a causal explanation or causal claim. You know what a causal claim is. If I say it’s raining, then my claim is not a causal claim. If I say God made it rain, then I’m making the causal claim that God caused the rain. A causal claim is a claim that an effect has a cause. All of us are interested in finding causes; without finding them we’d never understand why anything happens. In the remainder of the chapter, we will first explore reasoning about correlations and then conclude with reasoning about causes and effects. We will investigate how to recognize, create, justify, and improve these arguments. Cause-effect reasoning often involves arguing by analogy from the past to the present, but it also can involve appealing to scientific theories and to other aspects of logical reasoning.
Correlations
A correlation is a connection or association between two kinds of things. For example, scientists are interested not only in statistics about who has lung cancer, but also in how smoking is related to lung cancer. This relationship is one of apparent connection, and it is described mathematically by saying that the values of the variable “number of smokers in a group” and the variable “number of lung cancer cases in that group” are correlated. The word correlated is a technical term. Finding a correlation in your data between two variables A and B is a clue that there may be some causal story for you to uncover, such as that A is causing B, or vice versa.
Suppose that a scientific article reports that smoking is positively correlated with lung cancer. What this means or implies is that groups of people with a high percentage of smokers usually also have a high percentage of lung cancer cases, and groups with a low percentage of smokers usually also have a low percentage of lung cancer cases. Here is another way to make the same point. The two percentages tend to rise and fall together across many groups. If A = percent of smokers in any group and B = percent of lung cancer cases in the same group, then the scientific article is reporting that values of the variable A tend to go up and down as values of the variable B also go up and down.
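The idea that two variables “rise and fall together” across groups can be made concrete with a short calculation. The numbers below are invented purely for illustration; the Pearson correlation coefficient is one standard way to measure how tightly two variables move together.

```python
# Invented data for six groups of people:
# A = percent of smokers in each group,
# B = percent of lung cancer cases in each group.
A = [5, 10, 20, 30, 40, 50]
B = [1, 2, 4, 5, 8, 10]

def pearson(xs, ys):
    """Pearson correlation coefficient: +1 means the two lists
    rise and fall together perfectly; 0 means no linear link."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

print(round(pearson(A, B), 2))  # close to +1: a strong positive correlation
```

Because the two percentages rise together across the groups, the coefficient comes out near +1, which is what the article’s phrase “positively correlated” is reporting.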
Significant Correlations
Given an observed correlation, how can you tell whether it is significant rather than accidental? The problem of telling when an association is significant is akin to the problem of telling when any statistic is significant. The point is that the correlation is significant if you can trust it in making your future predictions. Conversely, an observed correlation is not significant if there is a good chance that it is appearing by accident and thus wouldn’t be a reliable sign of future events. However, you usually aren’t in a position to calculate the significance of a correlation directly. If you are faced with a set of data that show a correlation, you might never be able to figure out whether the correlation is significant unless you collect more data. If the correlation is just an accident, it will disappear in the bigger pool of data when more data are collected. In short, significant correlations are those that continue into the future. They are the correlations that occur because some real causal activity is at work.
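The point that accidental correlations fade in a bigger pool of data can be sketched with a small simulation. Everything here is invented: two variables that are genuinely unrelated can show a sizable correlation in a small sample, but the correlation shrinks toward zero as the sample grows.

```python
import random

random.seed(1)

def pearson(xs, ys):
    # Pearson correlation coefficient of two equal-length lists.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def accidental_correlation(n):
    # Two variables drawn independently of each other:
    # any correlation between them is pure accident.
    xs = [random.random() for _ in range(n)]
    ys = [random.random() for _ in range(n)]
    return pearson(xs, ys)

print(abs(accidental_correlation(10)))     # small sample: can look sizable
print(abs(accidental_correlation(10000)))  # big sample: close to zero
```

This is why collecting more data is the practical test: a real, significant correlation survives in the larger sample, while an accidental one washes out.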
Causal Claims
Magic doesn’t cause food to appear on your table at dinnertime. Someone has to put some effort into getting it to the table. Effort causes the effect, we say. Similarly, houses don’t just “poof” into existence along the edge of the street of a new subdivision, except perhaps in Harry Potter books. It takes a great deal of physical labor to create these effects. Although effort causes some events, other events are caused with no effort at all. For example, the moon’s gravity is the cause of tides on Earth, yet the moon is not making an effort. It just happens naturally.
Cause-effect claims don’t always contain the word cause. You are stating a cause-effect relationship if you say that heating ice cubes produces liquid water, that eating chocolate cures skin rashes, that the sun’s gravity makes the Earth travel in an ellipse, or that the pollen in the air triggered that allergic reaction. The terms produces, cures, makes, and triggered are causal indicators; they indicate that the connection is more than a mere accidental correlation. Not all causal claims are true, of course. Which one of the previous ones isn’t?
If you insert a cup of sugar into the gas tank of your gasoline-driven car this afternoon, its engine will become gummed up. This is a specific causal claim. More generally, if you put sugar into any engine’s gas tank, the engine will get gummed up. This last causal claim is more general; it doesn’t apply only to this sugar in your car, nor to this date. Because it mentions kinds of objects rather than specific ones, it is a general causal claim—a causal generalization. So causal claims come in two flavors, general and specific. Scientists seeking knowledge of the world prefer general claims to specific ones. You can imagine why.
An event can have more than one cause. If John intentionally shoots Eduardo, then one cause of Eduardo’s bleeding is a bullet penetrating his body. Another cause is John’s intention to kill him. Still another is John’s action of pulling the trigger. All three are causes. We say they are contributing causes or contributing factors or partial causes.
Some contributing causes are more important to us than others, and very often we call the most important one the cause. What counts as the cause is affected by what we are interested in. If we want to cure Eduardo, we might say the bullet’s penetrating the skin is the cause. If we are interested in justice, we might say that John’s actions are the cause, and we would leave all the biology in the background.
Causal claims come in two other flavors in addition to specific and general: those that say causes always produce a certain effect, and those that say causes only tend to produce the effect. Heating ice cubes in a pan on your stove will always cause them to melt, but smoking cigarettes only tends to cause lung cancer. Scientists express this point by saying heating is a determinate cause of ice melting, but smoking is not a determinate cause of lung cancer. Rather, smoking is a probable cause of cancer. The heating is a determinate cause because under known proper conditions its effect will happen every time; it doesn’t just make the event happen occasionally or make its occurrence more likely, as is the case with smoking causing lung cancer. If our knowledge is merely of causes that tend to make the effect happen, we usually don’t know the deep story of what causes the effect. We understand the causal story more completely when we have found the determinate cause.
The verb causes can be ambiguous. Speakers often say, “Smoking causes cancer,” when they don’t mean determinate cause but only probable cause. We listeners must be alert so that we correctly interpret what is said.
Eating peanuts tends to cause cancer, too. But for purposes of good decision making about whether to stop eating peanuts, we would like to know how strong the tendency is. How probable is it that eating peanuts will be a problem for us? If there is one chance in a million, then we are apt to say that the pleasure of peanut eating outweighs the danger; we will risk it. For practical decision making we would also like to overcome the imprecision in the original claim. How much cancer? How many peanuts? How does the risk go up with the amount? If we would have to eat a thousand peanuts every day for ten years in order to be in significant danger, then pass the peanuts, please.
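The weighing described here is simple arithmetic once the numbers are filled in. The figures below are hypothetical, chosen only to show how a one-in-a-million added risk compares with an ordinary baseline risk:

```python
# Hypothetical numbers for illustration only: suppose the lifetime
# baseline risk of cancer is about 1 in 3, and that daily peanut
# eating adds a one-in-a-million risk on top of that.
baseline_risk = 1 / 3
added_risk = 1 / 1_000_000

relative_increase = added_risk / baseline_risk
print(f"relative increase in risk: {relative_increase:.6%}")
```

Seen as a relative increase over the baseline, the added risk is vanishingly small, which is the arithmetic behind “pass the peanuts, please.”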
Inferring Causation from Correlation
Unfortunately, additional problems occur with the process of justifying causal claims. If we know that A causes B, we can confidently predict that A is correlated with B because causation logically implies correlation. But we cannot be so confident about the reverse—a correlation doesn’t usually provide strong evidence for causation. Consider the correlation between having a runny nose and having watery eyes. We know neither causes the other. Instead, having a cold causes both. So, in this case we say that the association between the runny nose and the watery eyes is spurious, because a third factor is at work causing the correlation.
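The runny-nose example can be simulated to show what a lurking factor looks like in data. All probabilities below are invented: a cold raises the chance of both a runny nose and watery eyes, yet neither symptom causes the other. The association between the two symptoms is strong overall but nearly vanishes once the lurking factor is held fixed.

```python
import random

random.seed(0)

# Simulate 20,000 people. A cold (the lurking factor) causes both
# symptoms; the symptoms have no causal influence on each other.
rows = []
for _ in range(20000):
    cold = random.random() < 0.2
    runny = random.random() < (0.9 if cold else 0.05)
    watery = random.random() < (0.8 if cold else 0.05)
    rows.append((cold, runny, watery))

def p_watery(given_runny, subset):
    """Estimated P(watery eyes | runny nose == given_runny) in subset."""
    matches = [w for c, r, w in subset if r == given_runny]
    return sum(matches) / len(matches)

# Across everyone, the two symptoms look strongly associated:
print(p_watery(True, rows), p_watery(False, rows))

# Hold the lurking factor fixed (no cold) and the association
# between the symptoms nearly disappears:
no_cold = [(c, r, w) for c, r, w in rows if not c]
print(p_watery(True, no_cold), p_watery(False, no_cold))
```

This is the signature of a spurious association: it is real in the aggregated data but evaporates within groups where the third factor does not vary.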
In general, when events of kind A are associated (correlated) with events of kind B, three types of explanation might account for the association.
- The association is accidental, a coincidence.
- A is causing B, or B is causing A.
- Something else C is lurking in the background and is causing A and B to be significantly associated. That is, the association is spurious because of lurking factor C.
Given an observed correlation, how can you figure out how to explain it? How can you tell whether A is accidentally correlated with B, or A causes B, or B causes A, or some C is causing both? Here is where scientific sleuthing comes in. You have to think of all the reasonable explanations and then rule them out one by one until only the truth remains. An explanation is ruled out when you collect data inconsistent with it. This entire process of searching out the right explanation is called the scientific method of justifying a causal claim. Let’s see it in action.
There is a strong positive correlation between being overweight and having high blood pressure. The favored explanation of this association is that being overweight puts stress on the heart and makes it pump at a higher pressure. Such an explanation is of type 2 (from the above list). One alternative explanation of the association is that a person’s inability to digest salt is to blame. This inability makes the person hungry, which in turn causes overeating. Meanwhile, the inability to digest salt also makes the heart pump faster and thereby helps distribute what little salt there is in the blood. This pumping requires a high blood pressure. This explanation, which is of type 3, is saying that the association is spurious and that a lurking factor, the inability to digest salt, is producing the association.
When someone suggests a possible explanation—that is, proposes a hypothesis—it should be tested if you want to know whether to accept the explanation as being correct. The test should look at some prediction that can be inferred from the explanation, some prediction that otherwise would not be expected. If the actual findings don’t agree with that prediction, then the explanation is refuted. On the other hand, if the prediction does come out as expected, we hold onto the hypothesis.
However, it is not always an easy matter to come up with a prediction that can be used to test the hypothesis. Good tests can be hard to find. Suppose, for example, my hypothesis is that the communist government of the U.S.S.R. (Russia, Ukraine, and so on) disintegrated in the early 1990s because it was fated to lose power then. How would you test that? You can’t.
The process of guessing a possible explanation and then trying to refute it by testing is the dynamic that makes science succeed. The path to scientific knowledge is the path of conjecturing followed by tough testing. There is no other path. (Philosophers of science say that the path to scientific knowledge is more complicated than this, and they are correct, but what we’ve said here is accurate enough for our purposes.)
Notice the two sources of creativity in this scientific process. First, it takes creativity to think of possible explanations that are worth testing. Second, it takes creativity to figure out a good way to test a suggested explanation.
Criteria for a Causal Relationship
I would commit the post hoc fallacy if I said that the sun regularly comes up after the rooster crows, so he’s the cause of the sun coming up. The fallacy lies in supposing that A caused B when the only evidence is that A has been followed by B. For another example, suppose you get a lot of headaches and you are trying to figure out why. You note that you are unusual because you are the sort of person who often leaves the TV set on all day and night. You also note that whenever you sleep near a TV set that is on, you usually develop a headache. You suspect that being close to the TV is the cause of your headaches. If you were to immediately conclude that the closeness to the TV does cause the headaches, you’d be committing the post hoc fallacy. If you are going to avoid the post hoc fallacy, then how should you proceed?
First you should ask someone with scientific expertise whether there’s any scientific evidence that sleeping close to a TV set that is on should cause headaches. If it should, then concluding that being close to the TV is the cause of your headaches does not commit the post hoc fallacy. Let’s assume that there is no convincing evidence one way or the other, although a few statistical studies have looked for a correlation between the two. If the data in those studies show no association between sleeping near the TV and getting headaches, you can conclude that your suspicions were wrong. But let’s suppose such an association has been found. If so, you are not yet justified in claiming a causal connection between the TV and the headaches. You, or the scientific community, need to do more. Before making your main assault on the causal claim, you need to check the temporal relation, the regularity, the strength, and the coherence of the association. What does that mean?
1. Temporal Relation: To be justified in saying that A causes B, A should occur before, not after, B. The future never causes anything in the present. This temporal relation is important because effects never precede their causes. Fear of sleeping near a TV in the future might cause a headache, but the future sleeping itself cannot cause it now. That is one of the major metaphysical presuppositions of all the sciences. Our claim, or hypothesis, that sleeping close to a TV causes headaches, does pass this first test.
2. Regularity: Suppose that three scientific studies have examined the relationship between sleeping near a TV and having headaches. In two of the studies an association has been found, but in one, none was found. Therefore, the association has not been so regular. Sometimes it appears; sometimes it doesn’t. The greater the regularity, the more likely that the association is significant.
3. Strength: Even when an association is regular across several scientific studies, the strength of the association makes a difference. The weaker the association between sleeping near a TV and getting headaches, the less justified you can be in saying that sleeping near the TV causes headaches. If, after sleeping near the TV, you get a headache 98 percent of the time, that’s a much stronger association than a 50 percent rate.
4. Coherence: The coherence of an association must also be taken into account when assessing whether a causal claim can be inferred from an association. Coherence is how well the causal claim fits with the standard scientific ideas of what is a possible cause of what. Suppose a researcher notices an association between color chosen by painters for the Chinese government’s grain silos and the frequency of headaches among Canadian schoolchildren. On years when the percentage of blue silos in China goes up, so do the Canadian headaches. In green years the headaches go down. Suppose the researcher then speculates that the Chinese colors are causing the Canadian headaches and provides these data about the correlation to make the case. Other scientists would write this off as a crackpot suggestion. They would use their background knowledge about what could possibly cause what in order to deny the causal claim about the colors and the headaches. This odd causal claim does not cohere with the rest of science. It is too bizarre. It is inconsistent with more strongly held beliefs.
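Of the four criteria, strength is the easiest one to quantify: it is just a comparison of two conditional rates. The sleep-diary counts below are hypothetical, but they show how a 98 percent rate makes a far stronger case than one only modestly above the alternative:

```python
# Hypothetical sleep-diary counts for the TV/headache example.
nights_near_tv = 50
headaches_near_tv = 49       # headache on 49 of 50 nights near the TV
nights_away = 50
headaches_away = 10          # headache on 10 of 50 nights away from it

strength_near = headaches_near_tv / nights_near_tv   # rate near the TV
strength_away = headaches_away / nights_away         # rate away from it
print(strength_near, strength_away)
```

A 0.98 rate near the TV against a 0.20 rate away from it is a strong association; if the two rates were close together, the association would lend little support to the causal claim even if it were perfectly regular.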
Key Terms
Deductive arguments: arguments intended to be judged by the deductive standard of, “Do the premises force the conclusion to be true?”
Inductive arguments: arguments intended to be judged by the inductive standard of, “Do the premises make the conclusion probable?”
Sample: a subset of the population.
Population: the set of things you are interested in generalizing about.
Inductive generalization: whenever a generalization is produced by generalizing on a sample, the reasoning process (or the general conclusion itself) is said to be an inductive generalization.
Representative sample: a sample that is perfectly analogous to the whole population in regard to the characteristics that are being investigated.
Biased sampling method: a method of sampling that is likely to produce a nonrepresentative sample.
Hasty generalization: a generalization made too quickly, on insufficient evidence.
Random sample: any sample obtained by using a random sampling method.
Random sampling: taking a sample from a target population in such a way that any member of the population has an equal chance of being chosen.
Homogeneous group: a group with a relatively insignificant amount of diversity.
Self-selection: a biased selection method that is often a source of unreliable data.
Proper appeal to authority: when the conclusion is a safe bet because the proper authorities have been consulted, they have been quoted correctly, and it is well known that the experts do not significantly disagree with each other about this.
Argument from analogy: As are analogous to Bs in several respects, and As have characteristic C; so, probably, Bs have characteristic C. If As have characteristic C but Bs do not, the analogy between As and Bs is a faulty analogy as far as C is concerned.
Cause-effect reasoning: reasoning that often involves arguing by analogy from the past to the present, but that also can involve appealing to scientific theories and to other aspects of logical reasoning.
Correlation: a connection or association between two kinds of things.
Significant correlation: a correlation is significant if you can trust it in making your future predictions.
Causal claim: a claim that an effect has a cause, that one action or thing causes another.
Temporal relation: to be justified in saying that A causes B, A should occur before, not after, B.
Regularity: the greater the regularity of an association, the more likely that the association is significant.
Coherence: how well a causal claim fits with the standard scientific ideas of what is a possible cause of what.