Descriptions of key issues in survey research and questionnaire design are highlighted in the following sections. Modes of data collection are described, together with their advantages and disadvantages. Commonly used sampling designs are described, and the primary sources of survey error are identified. Terms relating to the topics discussed here are defined in the Research Glossary.
Survey research is a commonly-used method of collecting information about a population of interest. The population may be composed of a group of individuals (e.g., children under age five, kindergarteners, parents of young children) or organizations (e.g., early care and education programs, k-12 public and private schools).
There are many different types of surveys, several ways to administer them, and different methods for selecting the sample of individuals or organizations that will be invited to participate. Some surveys collect information on all members of a population and others collect data on a subset of a population. Examples of the former are the National Center for Education Statistics' Common Core of Data and the Administration for Children and Families' Survey of Early Head Start Programs (PDF).
A survey may be administered to a sample of individuals (or to the entire population) at a single point in time (cross-sectional survey), or the same survey may be administered to different samples from the population at different time points (repeat cross-sectional). Other surveys may be administered to the same sample of individuals at different time points (longitudinal survey). The Survey of Early Head Start Programs is an example of a cross-sectional survey and the National Household Education Survey Program is an example of a repeat cross-sectional survey. Examples of longitudinal surveys include the Head Start Family and Child Experiences Survey and the Early Childhood Longitudinal Study, Birth and Kindergarten Cohorts.
Regardless of the type of survey, there are two key features of survey research:
- Questionnaires—a predefined series of questions used to collect information from individuals.
- Sampling—a technique in which a subgroup of the population is selected to answer the survey questions. Depending on the sampling method, the information collected may or may not be generalized to the entire population of interest.
The American Association for Public Opinion Research (AAPOR) offers recommendations on how to produce the best survey possible: Best Practices for Survey Research.
AAPOR also provides guidelines on how to assess the quality of a survey: Evaluating Survey Quality in Today's Complex Environment.
Advantages and Disadvantages of Survey Research
Advantages
- Surveys are a cost-effective and efficient means of gathering information about a population.
- Data can be collected from a large number of respondents. In general, the larger the number of respondents (i.e., the larger the sample size), the more accurate the information derived from the survey will be.
- Sampling using probability methods to select potential survey respondents makes it possible to estimate the characteristics (e.g., socio-demographics, attitudes, behaviors, opinions, skills, preferences and values) of a population without collecting data from all members of the population.
- Depending on the population and type of information sought, survey questionnaires can be administered in-person or remotely via telephone, mail, online and mobile devices.
Disadvantages
- Questions asked in surveys tend to be broad in scope.
- Surveys often do not allow researchers to develop an in-depth understanding of individual circumstances or the local culture that may be the root cause of respondent behavior.
- Respondents may be reluctant to share sensitive information about themselves and others.
- Respondents may provide socially desirable responses to the questions asked. That is, they may give answers that they believe the researcher wants to hear or answers that shed the best light on them and others. For example, they may over-report positive behaviors and under-report negative behaviors.
- A growing problem in survey research is the widespread decline in response rates, or the percentage of those selected to participate who choose to do so.
The two most common types of survey questions are closed-ended questions and open-ended questions.
Closed-Ended Questions
- The respondents are given a list of predetermined responses from which to choose their answer.
- The list of responses should be exhaustive (include every possible response), and the responses should be mutually exclusive (their meanings should not overlap).
- An example of a closed-ended survey question would be, "Please rate how strongly you agree or disagree with the following statement: 'I feel good about my work on the job.' Do you strongly agree, somewhat agree, neither agree nor disagree, somewhat disagree, or strongly disagree?"
- A Likert scale, which is used in the example above, is a commonly used set of responses for closed-ended questions.
- Closed-ended questions are usually preferred in survey research because of the ease of counting the frequency of each response.
Open-Ended Questions
- Survey respondents are asked to answer each question in their own words. An example would be, "In the last 12 months, what was the total income of all members of your household from all sources before taxes and other deductions?" Another would be, "Please tell me why you chose that child care provider."
- It is worth noting that a question can be either open-ended or closed-ended depending on how it is asked. In the previous example, if the question on household income asked respondents to choose from a given set of income ranges instead, it would be considered closed-ended.
- Responses are usually categorized into a smaller list of responses that can be counted for statistical analysis.
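Counting the frequency of each closed-ended response, as described above, is straightforward to automate. The sketch below uses made-up responses to the job-satisfaction example question; the data are purely illustrative.

```python
from collections import Counter

# Hypothetical Likert-scale responses to the example question above
responses = [
    "strongly agree", "somewhat agree", "somewhat agree",
    "neither agree nor disagree", "somewhat disagree",
    "strongly agree", "somewhat agree", "strongly disagree",
]

# Tally the frequency of each response option
counts = Counter(responses)
for option, n in counts.most_common():
    print(f"{option}: {n} ({n / len(responses):.0%})")
```

Open-ended responses, by contrast, would first have to be coded into categories before they could be tallied this way.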
A well designed questionnaire is more than a collection of questions on one or more topics. When designing a questionnaire, researchers must consider a number of factors that can affect participation and the responses given by survey participants. Some of the things researchers must consider to help ensure high rates of participation and accurate survey responses include:
- It is important to consider the order in which questions are presented.
- Sensitive questions, such as questions about income, drug use, or sexual activity, should generally be placed near the end of the survey. This allows a level of trust or psychological comfort to be established with the respondent before asking questions that might be embarrassing or more personal.
- Researchers also recommend putting routine questions, such as age, gender, and marital status, at the end of the questionnaire.
- Questions that are more central to the research topic or question and that may serve to engage the respondent should be asked early. For example, a survey on children's early development that is administered to parents should ask questions that are specific to their children in the beginning or near the beginning of the survey.
- Double-barreled questions, which ask two questions in one, should never be used in a survey. An example of a double-barreled question is, "Please rate how strongly you agree or disagree with the following statement: 'I feel good about my work on the job, and I get along well with others at work.'" This question is problematic because survey respondents are asked to give one response for their feelings about two conditions of their job.
- Researchers should avoid or limit the use of professional jargon or highly specialized terms, especially in surveys of the general population.
- Question and response option text should use words that are at the appropriate reading level for research participants.
- The use of complex sentence structures should be avoided.
- Researchers should avoid using emotionally loaded or biased words and phrases.
- The length of a questionnaire is always a consideration. There is a tendency to try to ask too many questions and cover too many topics. The questionnaire should be kept to a reasonable length and include only questions that are central to the research question(s). The length should also be appropriate to the mode of administration; for example, online surveys are generally shorter than surveys administered in-person.
Questionnaires and the procedures that will be used to administer them should be pretested (or field tested) before they are used in a main study. The goal of the pretest is to identify any problems with how questions are asked, whether they are understood by individuals similar to those who will participate in the main study, and whether the response options in closed-ended questions are adequate. For example, a parent questionnaire that will be used in a large study of preschool-age children may first be administered to a small (often nonrandom) sample of parents to identify such problems before the main study begins.
Based on the findings of the pretest, additions or modifications to questionnaire items and administration procedures are made prior to their use in the main study.
Surveys can be administered in four ways: through the mail, by telephone, in-person or online. When deciding which of these approaches to use, researchers consider the cost of contacting study participants and of data collection, the literacy level of participants, response rate requirements, respondent burden and convenience, the complexity of the information being sought, and the mix of open-ended and closed-ended questions.
Some of the main advantages and disadvantages of the different modes of administration are summarized below.
Mail Surveys
- Advantages: Low cost; respondents may be more willing to share information and to answer sensitive questions; respondent convenience, can respond on their own schedule
- Disadvantages: Generally lower response rates; only reaches potential respondents who are associated with a known address; not appropriate for low literacy audiences; no interviewer, so responses cannot be probed for more detail or clarification; participants' specific concerns and questions about the survey and its purpose cannot be addressed
Telephone Surveys
- Advantages: Higher response rates; responses can be gathered more quickly; responses can be probed; participants' concerns and questions can be addressed immediately
- Disadvantages: More expensive than mail surveys; depending on how telephone numbers are identified, some groups of potential respondents may not be reached; use of open-ended questions is limited given limits on survey length
In-Person Surveys
- Advantages: Highest response rates; better suited to collecting complex information; more opportunities to use open-ended questions and to probe respondent answers; interviewer can immediately address any concerns participant has about the survey and answer their questions
- Disadvantages: Very expensive; time-consuming; respondents may be reluctant to share personal or sensitive information when face-to-face with an interviewer
Online Surveys
- Advantages: Very low cost; responses can be gathered quickly; respondents may be more willing to share information and to answer sensitive questions; questionnaires are programmed, which allows for more complex surveys that follow skip patterns based on previous responses; respondent convenience, can respond on their own schedule
- Disadvantages: Potentially lower response rates; limited use of open-ended questions; not possible to probe respondents' answers or to address their concerns about participation
Increasingly, researchers are using a mix of these methods of administration. Mixed-mode or multi-mode surveys use two or more data collection modes in order to increase survey response. Participants are given the option of choosing the mode that they prefer, rather than this being dictated by the research team. For example, the Head Start Family and Child Experiences Survey (2014-2015) offers teachers the option of completing the study's teacher survey online or using a paper questionnaire. Parents can complete the parent survey online or by phone.
See the following for additional information about survey administration:
- Four Survey Methodologies: A Comparison of Pros and Cons (PDF)
- Collecting Survey Data
- Improving Response to Web and Mixed-Mode Surveys
In child care and early education research, as in other research areas, it is often not feasible to survey all members of the population of interest. Therefore, a sample of the members of the population is selected to represent the total population.
A primary strength of sampling is that estimates of a population's characteristics can be obtained by surveying a small proportion of the population. For example, it would not be feasible to interview all parents of preschool-age children in the U.S. in order to obtain information about their choices of child care and the reasons why they chose certain types of care as opposed to others. Thus, a sample of preschoolers' parents would be selected and interviewed, and the data they provide would be used to estimate the types of child care parents as a whole choose and their reasons for choosing these programs. There are two broad types of sampling:
- Nonprobability sampling: The selection of participants from a population is not determined by chance. Each member of the population does not have a known or given chance of being selected into the sample. Findings from nonprobability (nonrandom) samples cannot be generalized to the population of interest. Consequently, it is problematic to make inferences about the population. Common nonprobability sampling techniques include convenience sampling, snowball sampling, quota sampling and purposive sampling.
- Probability sampling: The selection of participants from the population is determined by chance and with each individual having a known, non-zero probability of selection. It provides accurate descriptions of the population and therefore good generalizability. In survey research, it is the preferred sampling method.
Three forms of probability sampling are described here:
Simple Random Sampling
This is the most basic form of sampling. Every member of the population has an equal chance of being selected. This sampling process is similar to a lottery: any member of the population of interest could be selected, but only a few are chosen at random. For example, researchers may use random-digit dialing to perform simple random sampling for telephone surveys. In this procedure, telephone numbers are generated by a computer at random and called to identify individuals to participate in the survey.
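As an illustration, drawing a simple random sample from a hypothetical frame of 1,000 member IDs might be sketched as follows; the frame and sample size are invented for the example.

```python
import random

# Hypothetical sampling frame: a list of population member IDs
population = list(range(1, 1001))  # 1,000 members

random.seed(42)  # fixed seed so the example is reproducible
# Simple random sample: every member has an equal chance of selection
sample = random.sample(population, k=50)

print(len(sample))       # 50
print(len(set(sample)))  # 50 -- sampling is without replacement
```

Because `random.sample` draws without replacement, no member can appear in the sample twice.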
Stratified Sampling
Stratified sampling is used when researchers want to ensure representation across groups, or strata, in the population. The researchers will first divide the population into groups based on characteristics such as race/ethnicity, and then draw a random sample from each group. The groups must be mutually exclusive and cover the population. Stratified sampling provides greater precision than a simple random sample of the same size.
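One way to sketch stratified sampling in code, assuming a hypothetical frame stratified by region and proportional allocation (the same sampling fraction in every stratum):

```python
import random

# Hypothetical frame with a stratification variable (here, region)
frame = [{"id": i, "region": region}
         for i, region in enumerate(
             ["Northeast"] * 200 + ["South"] * 500 + ["West"] * 300)]

def stratified_sample(frame, strata_key, fraction, seed=0):
    """Draw a simple random sample of the same fraction from each stratum."""
    rng = random.Random(seed)
    strata = {}
    for unit in frame:
        strata.setdefault(unit[strata_key], []).append(unit)
    sample = []
    for units in strata.values():
        k = round(len(units) * fraction)
        sample.extend(rng.sample(units, k))
    return sample

sample = stratified_sample(frame, "region", fraction=0.10)
# Each region is represented in exact proportion to its size in the frame
```

In practice researchers may also oversample small strata deliberately; that simply means using a different fraction per stratum (and adjusting weights at analysis time).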
Cluster Sampling
Cluster sampling is generally used to control costs and when it is geographically impractical to undertake a simple random sample. For example, in a household survey with face-to-face interviews, it is difficult and expensive to survey households across the nation using a simple random sample design. Instead, researchers will randomly select geographic areas (for example, counties), then randomly select households within these areas. This creates a cluster sample, in which respondents are clustered together geographically.
Multistage Sampling
Survey research studies often use a combination of these probability methods to select their samples. Multistage sampling is a probability sampling technique where sampling is carried out in several stages. It is often used to select samples when a single frame is not available to select members for a study sample. For example, there is no single list of all children enrolled in public school kindergartens across the U.S. Therefore, researchers who need a sample of kindergarten children will first select a sample of schools with kindergarten programs from a school frame (e.g., National Center for Education Statistics' Common Core of Data) (Stage 1). Lists of all kindergarten classrooms in selected schools are developed and a sample of classrooms is selected in each of the sampled schools (Stage 2). Finally, lists of children in the sampled classrooms are compiled and a sample of children is selected from each of the classroom lists (Stage 3). Many of the national surveys of child care and early education (e.g., the Head Start Family and Child Experiences Survey and the Early Childhood Longitudinal Study-Kindergarten Cohort) use a multistage approach.
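The three-stage school example can be sketched as follows. The school, classroom, and child identifiers and the stage sample sizes are all hypothetical.

```python
import random

rng = random.Random(1)

# Hypothetical hierarchy: schools -> classrooms -> children
schools = {
    f"school_{s}": {
        f"class_{s}_{c}": [f"child_{s}_{c}_{k}" for k in range(20)]
        for c in range(4)
    }
    for s in range(30)
}

# Stage 1: sample schools from the school frame
sampled_schools = rng.sample(sorted(schools), k=5)

children = []
for school in sampled_schools:
    # Stage 2: sample classrooms within each sampled school
    classrooms = rng.sample(sorted(schools[school]), k=2)
    for classroom in classrooms:
        # Stage 3: sample children within each sampled classroom
        children.extend(rng.sample(schools[school][classroom], k=5))

print(len(children))  # 5 schools x 2 classrooms x 5 children = 50
```

Note that the classroom and child frames are only compiled for the schools actually sampled at Stage 1, which is what makes the design practical when no national list of children exists.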
Multistage, cluster and stratified sampling require that certain adjustments be made during the statistical analysis. Sampling or analysis weights are often used to account for differences in the probability of selection into the sample as well as for other factors (e.g., sampling frame, undercoverage, and nonresponse). Standard errors are calculated using methodologies that are different from those used for a simple random sample. Information on these adjustments is provided by the National Center for Education Statistics through its Distance Learning Dataset Training System.
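A minimal illustration of the base-weight idea: each respondent's weight is the inverse of their probability of selection, so a weighted mean corrects for groups that were oversampled by design. The respondents and probabilities below are invented.

```python
# Hypothetical respondents with unequal selection probabilities
respondents = [
    {"income": 40_000, "p_selection": 0.10},
    {"income": 55_000, "p_selection": 0.10},
    {"income": 90_000, "p_selection": 0.50},  # oversampled group
    {"income": 85_000, "p_selection": 0.50},
]

# Base weight = inverse of the probability of selection
for r in respondents:
    r["weight"] = 1 / r["p_selection"]

# The weighted mean down-weights the oversampled, higher-income group
weighted_mean = (sum(r["income"] * r["weight"] for r in respondents)
                 / sum(r["weight"] for r in respondents))
unweighted_mean = sum(r["income"] for r in respondents) / len(respondents)
print(weighted_mean, unweighted_mean)
```

Production weights also incorporate nonresponse and coverage adjustments, and the matching standard-error corrections require the specialized methods noted above; this sketch covers only the selection-probability component.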
See the following for additional information about the different types of sampling approaches and their use:
- National Center for Education Statistics Distance Learning Dataset Training System: Analyzing NCES Complex Survey Data
- Sampling in Developmental Science: Situations, Shortcomings, Solutions, and Standards
- Nonprobability Sampling
- The Future of Survey Sampling
- Sampling Methods (StatPac)
Estimates of the characteristics of a population based on survey data are subject to two basic sources of error: sampling error and nonsampling error. These errors affect the extent to which estimates of the population mean, proportion, and other population values differ from their true values.
- Sampling error is the error that occurs because all members of the population are not sampled and measured. The value of a statistic (e.g., a mean or percentage) calculated from different samples drawn from the same population will not always be the same. For example, if several different samples of 5,000 people are drawn at random from the U.S. population, the average income of the 5,000 people in those samples will differ. (In one sample, Bill Gates may have been selected at random from the population, which would lead to a very high mean income for that sample.) Researchers use a statistic called the standard error to measure the extent to which estimated statistics (percentages, means, and coefficients) vary from what would be found in other samples. The smaller the standard error, the more precise the estimates from the sample. Generally, standard errors and sample size are inversely related; that is, larger samples have smaller standard errors.
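The relationship between sample size and the standard error can be illustrated by simulation. The sketch below draws a small and a large sample from an artificial, skewed "income" population and estimates the standard error of the mean as the sample standard deviation divided by the square root of the sample size; all numbers are made up.

```python
import random
import statistics

rng = random.Random(0)
# Artificial, right-skewed "income" population of 100,000 values
population = [rng.lognormvariate(10, 1) for _ in range(100_000)]

def estimated_se_of_mean(sample):
    # Standard error of the mean = sample standard deviation / sqrt(n)
    return statistics.stdev(sample) / len(sample) ** 0.5

small = rng.sample(population, 100)
large = rng.sample(population, 10_000)

# The larger sample yields a much smaller standard error,
# i.e., a more precise estimate of the population mean
print(estimated_se_of_mean(small), estimated_se_of_mean(large))
```

Because the standard error shrinks with the square root of n, quadrupling the sample size roughly halves it.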
- Nonsampling error includes all errors that can affect the accuracy of research findings other than errors associated with selecting the sample (sampling error). Nonsampling errors can occur in any phase of a research study (planning and design, data collection, or data processing). They include coverage error (when units in the target population are missing from the sampling frame), nonresponse to surveys (nonresponse error), measurement errors due to interviewer or respondent behavior, errors introduced by how survey questions were worded or by how data were collected (e.g., in-person interview, online survey), and processing error (e.g., errors made during data entry or when coding open-ended survey responses). While sampling error is limited to sample surveys, nonsampling error can occur in all surveys.
Measurement error is the difference between the value measured in a survey or on a test and the true value in the population. Some factors that contribute to measurement error include the environment in which a survey or test is administered (e.g., administering a math test in a noisy classroom could lead children to do poorly even though they understand the material), poor measurement tools (e.g., using a tape measure that is only marked in feet to measure children's height would lead to inaccurate measurement), and rater or interviewer effects (e.g., survey staff who deviate from the research protocol).
Measurement error falls into two broad categories: systematic error and random error. Systematic error is the more serious of the two.
- Systematic error
Occurs when the survey responses are systematically different from the target population responses. It is caused by factors that systematically affect the measurement of a variable across the sample.
For example, if a researcher only surveyed individuals who answered their phone between 9 and 5, Monday through Friday, the survey results would be biased toward individuals who are available to answer the phone during those hours (e.g., individuals who are not in the labor force or who work outside of the traditional Monday through Friday, 9 am to 5 pm schedule).
- It can include both nonobservational and observational error.
- Nonobservational error -- Error introduced when individuals in the target population are systematically excluded from the sample, such as in the example above.
- Observational error -- Error introduced when respondents systematically answer survey questions incorrectly. For example, surveys that ask respondents how much they weigh may underestimate the population's weight because some respondents are likely to report their weight as less than it actually is.
- Systematic errors tend to have an effect on responses and scores that is consistently in one direction (positive or negative). As a result, they contribute to bias in estimates.
- Random error
Random error is an expected part of survey research, and statistical techniques are designed to account for this sort of measurement error. It is caused by factors that randomly affect measurement of the variable across the sample.
Random error occurs because of natural and uncontrollable variations in the survey process, such as the mood of the respondent or a lack of precision in the measures and instruments used (e.g., inaccuracy in scales used to measure children's weight).
For example, a researcher may administer a survey about marital happiness. However, some respondents may have had a fight with their spouse the evening prior to the survey, while other respondents' spouses may have cooked the respondent's favorite meal. The survey responses will be affected by the random day on which the respondents were chosen to participate in the study. With random error, the positive and negative influences on the survey measures are expected to balance out.
- Unlike systematic errors, random errors do not have a consistent positive or negative effect on measurement. Instead, across the sample the effects are both positive and negative. Such errors are often considered noise and add variability, though not bias, to the data.
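The contrast between the two error types can be demonstrated by simulation. In the sketch below, both sets of hypothetical weight reports contain random noise, but one also contains a systematic 5-pound under-report; averaging over many responses removes the noise but not the bias. All values are invented for illustration.

```python
import random

rng = random.Random(7)
true_weight = 150.0  # hypothetical true value, in pounds

# Random error only: noise that is both positive and negative
noisy = [true_weight + rng.gauss(0, 5) for _ in range(10_000)]

# Systematic + random error: respondents under-report by ~5 pounds on average
biased = [true_weight - 5 + rng.gauss(0, 5) for _ in range(10_000)]

mean_noisy = sum(noisy) / len(noisy)
mean_biased = sum(biased) / len(biased)

# Averaging cancels the random error but leaves the systematic bias intact
print(abs(mean_noisy - true_weight))   # close to 0
print(abs(mean_biased - true_weight))  # close to 5
```

This is why increasing the sample size reduces the impact of random error but does nothing to correct systematic error.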
See the following for additional information about the different types and sources of errors: