Imputation Procedures and the Quality of Income Information in the ECHP

Publication type

Conference Paper

Series

Methodology and Statistics Conference

Authors

Publication date

September 16, 2002

Abstract:

The aim of the paper is to investigate the impact of the imputation procedures adopted in the European Community Household Panel (ECHP) on the quality of the information about income variables.

We evaluate the imputation methods adopted in the ECHP by looking for systematic differences in the distribution of income across different types of responding units. More precisely, we compare descriptive statistics of the imputed income variables (both in levels and in growth rates) for respondents and for different types of nonrespondents. While income levels do not seem to be much affected by imputation, income dynamics are. This occurs because the imputation procedure seems to alter the tails of the distribution of income growth. The effects of imputation are smaller for statistics that are robust to outliers, such as the median.
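The contrast between outlier-sensitive and robust statistics can be illustrated with a minimal Python sketch. The growth rates below are invented toy numbers, not ECHP data, and the names `respondent_growth`, `imputed_growth` and `summary_gap` are hypothetical; the only point is that distortion in the tails moves the mean far more than the median.

```python
from statistics import mean, median

# Toy income growth rates (invented for illustration, NOT ECHP data).
respondent_growth = [0.02, 0.03, 0.01, 0.04, 0.00]
# Same centre, but with two extreme values standing in for tail distortion.
imputed_growth = [0.02, 0.03, -0.60, 0.04, 0.95]


def summary_gap(group_a, group_b, statistic):
    """Absolute difference in a summary statistic between two groups."""
    return abs(statistic(group_a) - statistic(group_b))


mean_gap = summary_gap(respondent_growth, imputed_growth, mean)
median_gap = summary_gap(respondent_growth, imputed_growth, median)

print(f"gap in means:   {mean_gap:.3f}")
print(f"gap in medians: {median_gap:.3f}")
```

In this toy case the two groups differ much more in their means than in their medians, which is the pattern the abstract describes for income growth.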

A different approach to evaluating the impact of the missing data problem on indicators of poverty follows the work of Manski (1989) and Horowitz and Manski (1998). These papers show how to derive bounds for the cumulative distribution function of a variable of interest without imposing any assumption on the missing data mechanism.

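Net household income in the ECHP is affected by nonresponse in about 22.4% of cases. Because the width of these worst-case bounds at any income level equals the share of missing observations, such a high nonresponse rate implies that Manski's bounds tend to be wide.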
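As a rough illustration of why the bounds are wide, the sketch below computes worst-case bounds on the distribution function at a single income threshold. It assumes nothing about the missing incomes: the lower bound places every missing value above the threshold, the upper bound places every missing value at or below it. The function name `worst_case_bounds` and the sample values are hypothetical, and the sketch ignores survey weights.

```python
def worst_case_bounds(observed_incomes, n_missing, threshold):
    """Worst-case bounds on P(income <= threshold) with missing incomes.

    observed_incomes: incomes of units that reported income
    n_missing: number of units with missing income
    """
    n_total = len(observed_incomes) + n_missing
    n_below = sum(1 for y in observed_incomes if y <= threshold)
    # Lower bound: every missing income lies above the threshold.
    lower = n_below / n_total
    # Upper bound: every missing income lies at or below the threshold.
    upper = (n_below + n_missing) / n_total
    return lower, upper


# Toy sample (invented): 8 observed incomes, 2 missing.
observed = [5, 8, 12, 20, 30, 40, 50, 60]
lower, upper = worst_case_bounds(observed, n_missing=2, threshold=10)
print(lower, upper, upper - lower)
```

Whatever the threshold, the gap between the two bounds equals the share of missing observations, so a nonresponse rate of about 22.4% would give an interval about 22.4 percentage points wide.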

In many cases, however, the information on income is not completely absent, because income may be reported partially: we may know that total net household income is above a known threshold. This information may be sufficient to identify people who are not poor. If household income is known to be above the poverty line, then we can classify the members of the household as non-poor. Further, our ability to classify people as non-poor increases as the poverty line is lowered. This reduces the effective nonresponse rate substantially, narrowing Manski's bounds. Our aim is to evaluate whether, for a suitable choice of the poverty line, it is better to combine the information from full respondents and partial respondents and avoid using the imputed values.
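The narrowing effect of partial responses can be sketched as follows. This is a simplified illustration, not the procedure used in the paper: it works at household level without weights, treats a household as poor when income is strictly below the poverty line, and represents a partial response as a known income floor (`None` when nothing is known). The function name `poverty_rate_bounds` and all values are hypothetical.

```python
def poverty_rate_bounds(full_incomes, partial_floors, poverty_line):
    """Worst-case bounds on the poverty rate using partial income reports.

    full_incomes: incomes of full respondents
    partial_floors: for each remaining household, a known lower bound on
        income, or None if no information is available
    """
    n_total = len(full_incomes) + len(partial_floors)
    n_poor = sum(1 for y in full_incomes if y < poverty_line)
    # A household stays unclassified unless its known income floor
    # already reaches the poverty line.
    n_unclassified = sum(
        1 for floor in partial_floors if floor is None or floor < poverty_line
    )
    lower = n_poor / n_total
    upper = (n_poor + n_unclassified) / n_total
    return lower, upper


# Toy data (invented): 6 full respondents, 4 households without full income.
full = [4, 9, 15, 22, 30, 45]
floors = [None, 12, 25, 8]

for line in (20, 10):
    lower, upper = poverty_rate_bounds(full, floors, line)
    print(f"poverty line {line}: [{lower:.1f}, {upper:.1f}]")
```

In this toy example the interval shrinks from width 0.3 to width 0.2 when the poverty line falls from 20 to 10, because one more partially reporting household can be classified as non-poor.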

