Administrative data are an important source of information for social science research. For example, school records have been used to track trends in student academic performance. Administrative data generally refer to data collected as part of the management and operations of a publicly funded program or service. Today, use of administrative data is becoming increasingly common in research about child care and early education. These data often are a relatively cost-effective way to learn more about the individuals and families using a particular service or participating in a particular program, but they do have some important limitations.
The advantages and disadvantages of using administrative data are described here, along with issues of obtaining access to such data. Terms relating to administrative data and their use in research studies are defined in the Research Glossary.
Advantages of Administrative Data
- Administrative data make possible analyses at the state and local levels that are rarely possible using national survey data.
- Such data often contain detailed, accurate measures of participation in various social programs. They typically include large numbers of cases, making possible many different types of analyses.
- Data on the same individuals and/or same programs over a long period of time can be used for longitudinal and trend studies.
- Data from several programs can be linked to give a more complete picture of individuals and the services they received.
- At the state level, such data provide effective ways for assessing state-specific programs and can be useful for several forms of program evaluation.
- The large sample sizes allow small program effects to be more easily detected, and permit effects to be estimated for different groups.
- It is less expensive to obtain administrative data than to collect data directly on the same group.
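To illustrate why large sample sizes make small program effects easier to detect, the sketch below computes an approximate minimum detectable effect for a two-group comparison of means. It assumes a normal approximation, equal group sizes, and a common standard deviation. The function name, the default significance level and power, and the sample sizes are illustrative choices, not values from any particular study.

```python
from math import sqrt
from statistics import NormalDist


def minimum_detectable_effect(n_per_group, sd=1.0, alpha=0.05, power=0.80):
    """Approximate smallest difference in means detectable with the given power.

    Assumes a two-sided test, two equal-sized groups, and a shared standard
    deviation `sd` (all simplifying assumptions made for illustration).
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = standard_normal.inv_cdf(power)          # quantile for the target power
    return (z_alpha + z_power) * sd * sqrt(2.0 / n_per_group)


# Hypothetical sample sizes: a modest survey versus a large administrative file.
for n in (200, 2_000, 200_000):
    print(f"n per group = {n:>7}: detectable effect (in SD units) = "
          f"{minimum_detectable_effect(n):.4f}")
```

Because the detectable effect shrinks in proportion to the square root of the sample size, a file with 100 times as many cases can detect an effect roughly one tenth as large.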
Disadvantages of Administrative Data
Administrative data are collected to manage services and comply with government reporting regulations. Because the data were not originally collected for research, using them for research presents several challenges.
- The administrative data only describe the individuals or families using a service and provide no information about similar people who do not use the service.
- The potential observation period for any subject being studied (e.g., a person, a family, a child care program) is limited to the period of time that the subject is using the service for which the data are being collected.
- Generally, only those services that are publicly funded are included in the administrative data. For example, a researcher cannot rely on subsidy data to learn about all child care providers in the state or about non-subsidized forms of child care being used to augment child care that is subsidized.
- Many variables used in administrative data are not updated regularly, so it is important to learn how and when each variable is collected. For instance, an "earnings" variable in administrative data for subsidized child care generally is entered at the time that eligibility is determined and then updated when eligibility is redetermined. When this is the case, there is no way to know, using administrative data alone, what a family earns in the months between eligibility determination and redetermination.
- Important variables needed for a particular research study may not be collected in administrative data.
- Because the data are limited to data on program participants, information on those eligible for the program but who are not enrolled is often not available. Thus, administrative data may not be especially useful for estimating certain characteristics such as participation rates.
- Measurement error can pose a substantial challenge to analysts using administrative data. Factors affecting measurement error include:
  - Data that were improperly entered at the agency
  - Incomplete or inaccurate data items, particularly those items not required by the agency for management or reporting purposes
  - Missing values on variables that have been overwritten by updated versions when cases are reviewed
- Procedures for accessing the data for research purposes can be time-consuming and difficult. Protecting the privacy of program participants and the confidentiality of the data when they are used for research is a major concern for program officials.
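The point about variables that are updated only at eligibility determination and redetermination can be made concrete with a small sketch. It builds a monthly earnings series for one hypothetical family and labels each month as either recorded in that month or merely carried forward from the most recent determination. The record layout, function name, and dollar amounts are assumptions made for illustration and do not reflect any specific state data system.

```python
from datetime import date


def monthly_earnings(records, start, end):
    """Build a monthly earnings series from (date, earnings) determination records.

    Each month is labeled "observed" if a record was entered in that month,
    "carried_forward" if the value comes from an earlier record, and
    "unknown" if no record exists yet.
    """
    records = sorted(records)
    series = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        in_month = [r for r in records if (r[0].year, r[0].month) == (year, month)]
        up_to_month = [r for r in records if (r[0].year, r[0].month) <= (year, month)]
        if in_month:
            series.append((year, month, in_month[-1][1], "observed"))
        elif up_to_month:
            series.append((year, month, up_to_month[-1][1], "carried_forward"))
        else:
            series.append((year, month, None, "unknown"))
        # Advance to the next calendar month.
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return series


# Hypothetical family: earnings recorded at determination and at redetermination.
example_records = [(date(2020, 1, 15), 1800), (date(2020, 7, 10), 2100)]
for row in monthly_earnings(example_records, date(2020, 1, 1), date(2020, 8, 1)):
    print(row)
```

Keeping the label alongside each value lets an analyst restrict an analysis to months in which earnings were actually recorded, rather than treating carried-forward values as current.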
Researchers interested in using administrative data should expect to invest considerable time learning about the details of the administrative data system, the specific data elements being used, the data entry process and standards, and changes in the data system and data definitions over time. It also takes time to transform administrative data into research datasets that can be used in statistical analyses.
Obtaining and Learning About Administrative Data
Researchers who decide to use administrative records in their research usually face several important issues. Among the most important are:
- Obtaining Data Access and Ensuring Confidentiality. To obtain administrative data, a researcher and the agency responsible for the data must reach an agreement on how the data are to be used and processed, how confidentiality will be maintained, and how the research results will be disseminated.
- Documentation of Source Data
  - Once researchers have obtained the administrative records of interest, they must become familiar with the idiosyncrasies of the data.
  - When combining data from more than one administrative database, researchers must be careful to assess the comparability of the data elements and the effectiveness of record matching procedures.
  - Researchers must also learn about, understand, and document changes in the definition and meaning of data elements over time, as well as the procedures for updating data values.
  - Researchers should carefully document variable definitions, value codes, any recodes that the agency implemented, changes in definitions and their effective dates, and information on how the agency collected the data.
- Documentation of Program Parameters and Context. In addition to documenting the source data, investigators should take great care to document the important parameters of the program that collected the data and to describe the policy context at the time the data were collected. For example, they should note whether everyone who was eligible and applied for the program could be served, or whether caps on available funding limited service provision.
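One simple way to assess the effectiveness of record matching when combining two administrative files is to report the match rate and list the cases that failed to link. The sketch below assumes both files share an exact identifier, here called `client_id`, and that the identifier is unique within the second file. The field names and values are hypothetical, and real linkage projects often need probabilistic matching on names and birth dates instead.

```python
def link_records(primary, secondary, key="client_id"):
    """Link each primary record to a secondary record with the same key.

    Returns the linked records, the keys of primary records with no match,
    and the share of primary records that were matched.
    """
    # Index the secondary file by key (assumes keys are unique in that file).
    secondary_by_key = {row[key]: row for row in secondary}
    linked, unmatched = [], []
    for row in primary:
        match = secondary_by_key.get(row[key])
        if match is None:
            unmatched.append(row[key])
        else:
            # Fields from the primary record win if both files share a field name.
            linked.append({**match, **row})
    match_rate = len(linked) / len(primary) if primary else 0.0
    return linked, unmatched, match_rate


# Hypothetical subsidy and wage files.
subsidy_cases = [
    {"client_id": 1, "copay": 50},
    {"client_id": 2, "copay": 0},
    {"client_id": 3, "copay": 25},
]
wage_records = [
    {"client_id": 1, "quarterly_wages": 5400},
    {"client_id": 3, "quarterly_wages": 7200},
]

linked, unmatched, match_rate = link_records(subsidy_cases, wage_records)
print(f"match rate: {match_rate:.2f}; unmatched ids: {unmatched}")
```

Examining the unmatched cases, rather than silently dropping them, helps reveal whether linkage failures are concentrated in particular groups or time periods.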
Sampling is not often done when administrative data are used for research purposes, since information is available on the entire population of recipients. However, to help protect subject confidentiality, a subsample of the full population may be selected. Studies that combine survey research with administrative records may also select only a sample of the population in order to minimize data collection costs.
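Where a subsample is drawn from the full file, a reproducible simple random sample is one straightforward option. The sketch below assumes each case has an identifier and uses a fixed seed so the same subsample can be regenerated later. The function name, the seed value, and the 10 percent sampling fraction are illustrative assumptions, and the appropriate sampling design would depend on the study and on the agreement with the agency.

```python
import random


def draw_subsample(case_ids, fraction, seed=20240101):
    """Draw a simple random subsample of case identifiers without replacement.

    A dedicated random generator with a fixed seed makes the draw repeatable.
    """
    rng = random.Random(seed)
    case_ids = list(case_ids)
    sample_size = round(len(case_ids) * fraction)
    return sorted(rng.sample(case_ids, sample_size))


# Hypothetical population of 1,000 case identifiers, sampled at 10 percent.
population_ids = range(1, 1001)
subsample = draw_subsample(population_ids, fraction=0.10)
print(len(subsample), "cases selected")
```

Recording the seed and sampling fraction alongside the other data documentation allows the subsample to be reconstructed if the analysis needs to be repeated.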
The Joint Center for Poverty Research offers many recommendations on using administrative data (PDF). It recognizes the following centers as having successfully used administrative data in their research efforts:
- Child Welfare Research Center
- Ray Marshall Center for the Study of Human Resources at the University of Texas at Austin
- Chapin Hall Center for Children at the University of Chicago
- University of Maryland School of Social Work
More resources on administrative data integration, analyses, management, confidentiality, and security can be found here: Working with Administrative Data. Also, see Profiles in Success of Statistical Uses of Administrative Data (PDF) for more information on the use of administrative data.