Survey data, especially large federal statistical collections, have long been the foundation of social science research and a vital part of economic planning. Conducting these surveys and deriving useful statistical products from the information they provide, however, have become increasingly challenging due to declining response rates and rapidly rising costs.
These issues are further amplified by an escalating demand for timelier and more geographically detailed data. Federal agencies including the U.S. Census Bureau want to explore how external data sources can help to fill in some of the gaps left by traditional surveys, adding greater depth to their statistical and economic analyses.
This research initiative examines how local administrative data associated with housing values may provide new levels of analytic insight when compared with the American Community Survey (ACS) data traditionally used by federal agencies. Our initial trial used geographically detailed data from Arlington County to produce a particular statistical measure known as a hedonic index. Hedonic indexes are an especially robust measure of economic value, calculated using a regression analysis that relates the price or cost of a good to the specific combination of attributes that make it up.
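For readers unfamiliar with the technique, one common way to write a hedonic regression is the semi-log form below. It is offered only as an illustrative sketch: the functional form and every symbol are assumptions introduced here, not the specification used in this study. Here P_i is the price of property i, x_ik is its k-th attribute (for example, number of bedrooms or living area), beta_k is the implicit price of that attribute, and epsilon_i is an error term.

```latex
\ln P_i = \beta_0 + \sum_{k=1}^{K} \beta_k \, x_{ik} + \varepsilon_i
```

Under this form, each coefficient beta_k approximates the proportional change in price associated with a one-unit change in attribute k, holding the other attributes fixed.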
Our team tested a wide variety of administrative data sources designed to meet the particular analytical needs of Arlington County. These included Arlington County Real Estate Assessment data, commercially aggregated assessment data from CoreLogic and Black Knight Financial Services, multiple listing service (MLS) data from Metropolitan Regional Information Systems, and neighborhood indicators from Location, Inc.
Our findings demonstrate that researchers need not be limited to using federal survey data when investigating the value of housing stock. Information derived from administrative real estate assessment records and data about real estate transactions provide more precise hedonic estimates of housing value than typical survey sources do. These findings present a new possibility: that data quality can be improved by the use of local information sources, reducing the burdens of survey administration currently faced by federal agencies.
Sources: American Community Survey (ACS) Data, 2013; Arlington County Real Estate Assessment Data, 2013; Black Knight Financial Services Assessment Records, 2013; CoreLogic Assessment Records, 2013; Metropolitan Regional Information System Real Estate Sales Data, 2013.
The overall pattern of results, presented in the figure above, shows the incremental value of adding variables to the hedonic regressions beyond those available from the ACS. Using ACS data alone in the hedonic regression produces a goodness-of-fit of 0.62 for Arlington, VA. The goodness-of-fit increases when using local property data, whether obtained directly from the county or repackaged by commercial aggregators (CoreLogic), and increases even more when using Multiple Listing Service (MLS) housing sales data from the Metropolitan Regional Information System.
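The comparison described above, fitting the same hedonic regression with progressively richer sets of variables and tracking the goodness-of-fit, can be sketched as follows. This is a minimal illustration on synthetic data, assuming goodness-of-fit is measured as R-squared from an ordinary least squares fit. The variable names (`bedrooms`, `age`, `sqft`), the coefficients, and the helper `r_squared` are all hypothetical and do not reproduce the study's data or results.

```python
import numpy as np


def r_squared(X, y):
    """Fit OLS with an intercept and return the R^2 goodness-of-fit."""
    A = np.column_stack([np.ones(len(y)), X])  # prepend an intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


# Synthetic data only: every variable and coefficient here is invented.
rng = np.random.default_rng(0)
n = 500
bedrooms = rng.integers(1, 6, n).astype(float)       # stand-in for a survey-style attribute
age = rng.integers(0, 74, n).astype(float)           # stand-in for a survey-style attribute
sqft = 600 + 350 * bedrooms + rng.normal(0, 200, n)  # stand-in for an assessment-record attribute
log_price = (11.5 + 0.05 * bedrooms - 0.002 * age
             + 0.0004 * sqft + rng.normal(0, 0.15, n))

# Baseline model: survey-style attributes only.
survey_X = np.column_stack([bedrooms, age])
# Extended model: the same attributes plus one administrative attribute.
admin_X = np.column_stack([bedrooms, age, sqft])

r2_survey = r_squared(survey_X, log_price)
r2_admin = r_squared(admin_X, log_price)

print(f"R^2, survey-style variables only: {r2_survey:.3f}")
print(f"R^2, plus administrative variable: {r2_admin:.3f}")
```

Because the extended model nests the baseline, its in-sample R-squared cannot be lower. What is informative is the size of the gain, and an analysis on real data would typically also check adjusted or out-of-sample fit.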
The usefulness of local data is not limited to researchers. These findings are of potential use to local governments in validating their assessment practices. Comparing housing value estimates based on real estate assessment records (updated annually) with those based on MLS data (updated when a sale occurs) can help capture sudden increases in housing demand. This study also provides further evidence for the value of using local administrative data in federal statistical assessments.