Presented by the Pew Research Center

Pew Social and Demographic Trends

By D’Vera Cohn — Pew Research Center

After a 72-year wait required by law, the National Archives has released indi­vidual records from the 1940 Census, opening a gold mine for people researching their family histories. But the 1940 Census also played a notable role in the history of census-taking: It helped usher in the modern era of sample surveys.

With the nation deep in the Great Depression, government offi­cials planning for the 1940 Census hoped to expand the topics they asked about in order to guide federal policy-making in an era of expanded government. They wanted to add ques­tions to the census about people’s incomes, migration histories and housing conditions.

But adding these new questions would have made the census form too long and the count too costly. After lobbying by some of its younger statisticians, the Census Bureau turned to the new science of survey sampling. To make room for new topics, the bureau decided for the first time to ask some questions of a sample of respondents rather than of everyone. Those responses were then extrapolated to the total population.

New Questions

The 1940 Census included 34 popu­lation ques­tions and 31 housing ques­tions for the general popu­lation, and an addi­tional 16 ques­tions (on six topics) that were asked of a sample of respon­dents. Some ques­tions asked of the sample had been asked in previous counts, and others were asked for the first time in 1940. The housing ques­tions were listed on a separate form, but in practice enumer­ators asked them along with the popu­lation questions.

Census Bureau offi­cials knew that some of the new ques­tions would be contro­versial, and that some people would find them intrusive. There has been debate since the first census, in 1790, about how much it is appro­priate for the government to ask beyond a basic count.

One question that stoked contro­versy was about income, asking respon­dents the “amount of money wages or salary received (including commis­sions)”; a follow-up question asked whether the person received $50 or more in income from other sources. In fact, a U.S. senator from New Hamp­shire had lobbied unsuc­cess­fully for the question to be dropped.

In 1940, not many Amer­icans had previ­ously been asked to give financial infor­mation to the federal government, according to historian Margo J. Anderson’s book, “The American Census: A Social History.” Only a minority of Amer­icans then filed income tax forms; only 15 million forms were filed in 1940, when the national popu­lation was counted at 132 million.

Census officials knew that the strongest opposition to the income question came from high-income Americans, but the information they needed most was from low- and middle-income Americans. So census-takers were told to ask for actual wage or salary amounts only up to $5,000 a year; at the time, three-quarters of American families made no more than that. Incomes of $5,000 or more would be reported as “$5,000+.” The instructions to census-takers said: “Some persons who might otherwise be reluctant to report wages or salary would be quite willing to do so if they learn that the amount above $5,000 need not be specified.”
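The reporting rule described above is what statisticians now call top-coding. A minimal sketch in Python, assuming whole-dollar amounts; the function name and the output formatting are illustrative and are not taken from the 1940 instructions:

```python
TOP_CODE = 5000  # dollar threshold given in the enumerator instructions


def record_wage(amount: int) -> str:
    """Return the value that would be written down for annual wages or salary.

    Amounts below the threshold are recorded as given; amounts at or above
    it are collapsed into a single "$5,000+" category.
    """
    if amount < 0:
        raise ValueError("wage amount cannot be negative")
    if amount >= TOP_CODE:
        return "$5,000+"
    return f"${amount:,}"


# Hypothetical respondents, for illustration only.
for wages in (1200, 4999, 5000, 12000):
    print(wages, "->", record_wage(wages))
```

Top-coding gives up detail at the high end of the distribution in exchange for more willing answers, which matched the bureau's stated priority of learning about low- and middle-income households.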

If people were reluctant to give their income infor­mation to the enumerator who knocked on their door, they could choose to fill out a card that would be sealed in an envelope and mailed to the Census Bureau. The question on income was toward the end of the ques­tion­naire, “because if the enumerator got kicked out of a household when that question was asked, the inter­viewer would have already obtained the answers to the previous ques­tions,” recalled Census Bureau official Edwin D. Gold­field, in an interview for a bureau oral history archive.

The housing ques­tions amounted to the first national inventory of housing, according to Anderson. They included ques­tions about the home’s water supply and toilet, if any. There also was a series of ques­tions about the home’s value and cost. The housing forms were destroyed, and are not part of the 1940 Census release of records.

Sample Survey

Since the first U.S. census, the same set of questions had been asked of everyone. But in 1940, the desire for additional data coincided with improvements in survey methodology and theory that allowed the Census Bureau to add more questions without burdening the entire population. Statisticians in the 1930s had made great advances in designing and implementing sample surveys, in which a randomly selected subset of respondents supplies data that are extrapolated to represent the entire population.

The impor­tance of drawing a random sample was made clear after Literary Digest, a well-known magazine, conducted a straw poll of respon­dents whose names were obtained mainly from auto­mobile regis­tration lists and tele­phone books. The poll, which had a gigantic response of two million post­cards (about 1,000 is considered an adequate sample today), indi­cated a land­slide victory for Repub­lican Alf Landon over Democrat Franklin D. Roosevelt in the 1936 pres­i­dential election. Although factors other than the flawed sample also played a role in that failed forecast, the Literary Digest debacle was a strong force in the rise of surveys based on scien­tific prob­a­bility samples, in which any American adult has a known chance of being asked for a response. Other polls taken in 1936 that were based on random samples were much more accurate than the Literary Digest poll.
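Why a small random sample can beat a huge unrepresentative one is easy to show with a toy simulation. The sketch below uses an entirely hypothetical electorate (the group sizes and support rates are invented for illustration and are not 1936 data): a large sample drawn only from people who appear on car and telephone lists is compared with 1,000 people drawn at random from everyone.

```python
import random


def simulate_polls(seed: int = 1936):
    """Compare a large list-based sample with a small random sample.

    All numbers are hypothetical. 1 = supports candidate A, 0 = candidate B.
    """
    rng = random.Random(seed)

    # 30,000 voters appear on car/phone lists; 35% of them support A.
    listed = [1] * 10_500 + [0] * 19_500
    # 70,000 voters do not; about 71% of them support A.
    unlisted = [1] * 49_500 + [0] * 20_500
    everyone = listed + unlisted

    true_share = sum(everyone) / len(everyone)  # 60% support A overall

    # Huge sample, but drawn only from the listed group.
    big_biased = rng.sample(listed, 20_000)
    # Small sample, but every voter has the same chance of selection.
    small_random = rng.sample(everyone, 1_000)

    biased_estimate = sum(big_biased) / len(big_biased)
    random_estimate = sum(small_random) / len(small_random)
    return true_share, biased_estimate, random_estimate


true_share, biased_estimate, random_estimate = simulate_polls()
print(f"true share for A:        {true_share:.3f}")
print(f"20,000 from lists only:  {biased_estimate:.3f}")
print(f"1,000 chosen at random:  {random_estimate:.3f}")
```

In this setup the list-based sample calls the race for the wrong candidate no matter how large it gets, because its error comes from who was left out rather than from how many were asked.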

Some long-time Census Bureau offi­cials had resisted incor­po­rating scien­tific sampling into the decennial census, believing it would “down­grade the validity of census infor­mation, because you had to say that this is based on a sample,” recalled Ross Eckler, a former Census Bureau director who was inter­viewed for a Census Bureau oral history project.

Armed with infor­mation from some smaller government sample surveys, a younger gener­ation of statis­ti­cians persuaded their skep­tical elders at the Census Bureau that such an approach should be part of the decennial census. The intro­duction of sampling did not cause any notable public contro­versy or chal­lenge from members of Congress, according to Eckler and other sources.

In the 1940 Census, the sampled popu­lation consisted of every 20th person inter­viewed by any given enumerator. In practice, enumer­ators filled out forms that had space for infor­mation from 40 people on each side, and were told to ask the supple­mental ques­tions of people whose names fell on certain desig­nated lines. According to the instruc­tions for enumer­ators, the ques­tions should be asked about anyone whose name was listed on a certain line, “whether this be the head, his wife, a son or daughter, an infant, a lodger, or any other member of the household.”
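The line-based selection can be sketched in a few lines of Python. The 40 lines per side come from the description above; the particular line numbers used here (14 and 29) are placeholders chosen for illustration, since the article does not say which lines were designated. Two designated lines out of 40 gives the one-in-20 rate, so each sampled person stands in for 20 people when results are extrapolated.

```python
LINES_PER_SIDE = 40                 # people recorded on each side of a form
SAMPLE_LINES = frozenset({14, 29})  # hypothetical designated lines (2 of 40)
WEIGHT = LINES_PER_SIDE // len(SAMPLE_LINES)  # each sampled person represents 20


def sampled_people(names):
    """Return the people whose names fall on a designated line.

    `names` is the list of people in the order the enumerator recorded them;
    line numbers restart at 1 on each new side of a form.
    """
    picked = []
    for index, name in enumerate(names):
        line = index % LINES_PER_SIDE + 1
        if line in SAMPLE_LINES:
            picked.append(name)
    return picked


def estimate_total(sample_count):
    """Extrapolate a count observed in the sample to the full population."""
    return sample_count * WEIGHT


# Two sides' worth of hypothetical people, numbered in the order recorded.
people = [f"person {i}" for i in range(1, 81)]
chosen = sampled_people(people)
print(chosen)
print(estimate_total(len(chosen)))
```

Because selection depends only on where a name lands on the page, the enumerator has no discretion over who gets the supplemental questions, which is what keeps the sample from being skewed toward cooperative or convenient respondents.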

The sample ques­tions included place of birth of the person’s mother and father; “mother tongue” in the household during early childhood; three ques­tions about military service and veteran or veteran-family status; three ques­tions about Social Security receipt; and ques­tions about occu­pation, industry and class of worker. Women who were or had been married were asked whether they had been married more than once, how old they were when first married and the number of children they had ever had.

Sampling Since 1940

Every decennial census from 1950 to 2000 included widened use of sampling to ask addi­tional ques­tions of part of the popu­lation, which even­tually were asked on a separate form known as the “long form.” Since 2000, the American Community Survey, a sample survey, has asked the same ques­tions on a continuous basis that the census had asked once a decade on the long form. The Census Bureau also uses sample surveys to evaluate the quality of the census and to test new ques­tions and survey methods.

The tech­niques for drawing a sample, and extrap­o­lating results, have become increas­ingly complex. Over the years, the Census Bureau moved from a person-based sample to a household-based sample. Instead of being drawn from people inter­viewed by one enumerator, samples were drawn from lists of addresses.

The reliability of sample surveys is now taken for granted within the Census Bureau and the wider research community, where sample surveys are the basis for measuring public opinion, political attitudes, employment levels and other information.

Some uses still can generate contro­versy, however. Since 1950, the Census Bureau has taken a post-enumeration sample survey to check the quality of the full count. In 2000, the Census Bureau planned to use a sample survey as the basis not only for checking the accuracy of its original count, but also for amending that official count if it was found that the post-census survey data would improve the quality of the enumer­ation. The agency planned to conduct a post-enumeration survey, match the results with actual census records and apply a statis­tical tech­nique known as dual-systems esti­mation to correct flaws in the original enumeration.
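The core idea of dual-systems estimation can be illustrated with its simplest textbook form, the capture-recapture (Lincoln-Petersen) estimator. The sketch below is only that simplified form, with invented numbers; it assumes the two counts are independent and that matching is error-free, and it is not a description of the bureau's actual procedure.

```python
def dual_system_estimate(census_count, survey_count, matched_count):
    """Estimate the true population of an area from two overlapping counts.

    census_count  -- people found by the census in the area
    survey_count  -- people found by the independent follow-up survey
    matched_count -- people found by both (survey records matched to census)

    The share of survey respondents who were also in the census estimates
    the census's coverage rate; dividing the census count by that rate
    gives the estimated total.
    """
    if matched_count <= 0:
        raise ValueError("need at least one matched person")
    if matched_count > min(census_count, survey_count):
        raise ValueError("matches cannot exceed either count")
    return census_count * survey_count / matched_count


# Hypothetical area: the census counted 9,500 people; a follow-up survey
# found 1,000 people, 950 of whom could be matched to census records.
estimate = dual_system_estimate(9_500, 1_000, 950)
print(estimate)           # estimated true population
print(estimate - 9_500)   # implied net undercount
```

In this example the survey suggests the census reached 95% of residents, so the 9,500 counted are scaled up to an estimated 10,000.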

However, the appli­cation of survey sampling to produce the official counts used to apportion congres­sional seats among the states met oppo­sition, mainly from Repub­licans who expressed concern that results could be manip­u­lated for partisan purposes. The use of survey sampling for appor­tionment purposes was success­fully chal­lenged in the U.S. Supreme Court before the 2000 Census was taken.

The Census Bureau did take a post-enumeration survey in 2000, and considered trying to use its results for other purposes, such as producing data for redis­tricting within states or allo­cating federal funds among states and local­ities. But problems were discovered with the survey results, so they were not used for other purposes.

The bureau did conduct a post-enumeration survey after the 2010 Census, called the Census Coverage Measurement program. It is intended to produce measure­ments of under­count, duplicate count and other error, but not to amend the count. Results are expected sometime this year.

