British Household Panel Survey (BHPS)
Stata Intro
28-29 November 2011
Introduction to BHPS using Stata
Taught by Alita Nandi, Simonetta Longhi, Alexandra Skew and Gundi Knies
Registration
This course is now fully booked. To be added to the waiting list, please email .
Course objectives
This course is aimed at new users of the BHPS or those who have so far made use only of simpler aspects of the data. The underlying structure of the BHPS is complex, with many different kinds of data about individuals and the households in which they live across time. The BHPS team has tried to make this structure as transparent as possible through the way the data are organised. However, even the number of different data sets can appear daunting. This course aims to guide users through these apparent complexities and ensure that they can make effective use of as much of the data as they require for their own research projects.
The main focus is on the data reorganisation techniques required for different types of cross-sectional and longitudinal research rather than on the statistical techniques themselves, although it is informed by how the data must be organised for different statistical techniques.
Participants will learn how the BHPS is designed, which data are collected, how they are collected, and how the data are structured and stored. By the end of the two-day course, participants will have a thorough knowledge of the BHPS, from survey design to data-set structure, and will have the tools to make the most of a rich but complex data set.
A basic working knowledge of Stata is assumed. See course prerequisites below.
Course content and format
The course will be a combination of lectures and computer lab sessions, covering the following topics:
About BHPS
Participants will learn about the BHPS samples, the data collection methods, content of the survey, structure of the different data files, how to prepare the files for analysis, how to find variables and access data.
About Stata
Participants will learn how to write reproducible code using do-files; how to automate repetitive tasks using loops, macros and stored results; how to extract and combine information from different files and how to aggregate information about different units of analysis.
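As a rough illustration of what automating repetitive tasks with loops, macros and stored results can look like in a do-file, here is a minimal sketch. It uses a small made-up data set rather than BHPS data, so the variable names (pid, age, income) and all values are assumptions for illustration only.

```stata
* Toy data standing in for a survey extract (names and values are hypothetical)
clear
input pid age income
1 25 12000
2 40 18000
3 58 15000
end

* A local macro holds the list of variables to process
local vars age income

* Loop over the list; summarize leaves its results in r()
foreach v of local vars {
    quietly summarize `v'
    display "`v': mean = " r(mean) ", N = " r(N)
    * Keep the stored mean in a new variable for later use
    generate mean_`v' = r(mean)
}
```

Saving these lines in a do-file and re-running it reproduces the same results each time, which is the point of working from do-files rather than typing commands interactively.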
Computer lab sessions
Finding variables, accessing the data, merging and matching different data files and reorganising data for use with different types of analysis, including wide-to-long transformations.
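The following sketch suggests how merging two waves and a wide-to-long transformation might fit together. It builds two tiny made-up wave files instead of reading real BHPS files. The identifier pid and the wave-prefixed variables ajbstat and bjbstat are used here as illustrative assumptions, as are all values. The merge 1:1 syntax assumes Stata 11 or later; earlier releases such as Stata 9 use a different merge syntax.

```stata
* Hypothetical wave "a" extract
clear
input pid ajbstat
1 2
2 2
3 4
end
tempfile wavea
save `wavea'

* Hypothetical wave "b" extract
clear
input pid bjbstat
1 2
2 3
4 1
end

* Match the two waves on the person identifier (Stata 11+ syntax)
merge 1:1 pid using `wavea'
tabulate _merge

* Keep only people observed in both waves (a balanced two-wave panel)
keep if _merge == 3
drop _merge

* Move the wave marker from a prefix to a numeric suffix, then go long
rename ajbstat jbstat1
rename bjbstat jbstat2
reshape long jbstat, i(pid) j(wave)
list, sepby(pid)
```

Whether to keep only matched cases depends on the analysis; an unbalanced panel would retain the unmatched records as well.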
By the end of the two-day workshop participants will know how to:
- Use the documentation to find out what data are available;
- Define the appropriate units of analysis for a research project, and establish the basis for selecting cases and waves of data to use;
- Create merged longitudinal files with data from multiple waves;
- Make use of household level data, and data from other levels, including the annual histories;
- Match information from separate household members to each other, so that household effects can be analysed;
- Organise the data for special statistical techniques, such as panel regression and discrete-time event history models;
- Select the appropriate weights to use with an analysis.
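To give a flavour of one of the tasks listed above, matching information between members of the same household, here is a minimal sketch on made-up data. The names hid (household identifier), pno (person number), pid and income, and all values, are assumptions for illustration; this is one possible approach, not necessarily the one taught on the course.

```stata
* Hypothetical person-level file: one row per household member
clear
input hid pno pid income
1 1 101 12000
1 2 102 8000
2 1 201 20000
end

* Household total income, repeated on every member's row
bysort hid: egen hhincome = total(income)

* Number of members in each household
bysort hid: generate hhsize = _N

* Income of the other household members, from each person's point of view
generate othincome = hhincome - income

list, sepby(hid)
```

With a household-level summary attached to each person's record, individual outcomes can be analysed alongside characteristics of the rest of the household.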
The course will be based on a series of examples using Stata version 9.
Introduction to Understanding Society Seminar
On the second day of the course (29 November) there will be a lunchtime seminar, Introduction to Understanding Society: The UK Household Longitudinal Study, covering content, data and documentation. This seminar is free of charge and open to everyone attending the course.
Target audience
This course is aimed at new users of the BHPS or those who have so far made use only of simpler aspects of the data. A basic working knowledge of Stata is presumed (see Prerequisites).
Location and accommodation
The course will take place in the Social Science Research Centre on the University of Essex campus in Colchester. For further details, see the University's "how to get to the University" page. Accommodation is available on campus and in Colchester.
Course materials
Participants will be given handouts of the slides, exercises and Stata do-files during the course. Electronic versions will be made available to download afterwards.
Course Prerequisites
Participants are expected to have a basic working knowledge of Stata and should know how to inspect and describe data, how to recode and label variables, how to use the help functions, and how to work with do- and log-files.
Basic Stata commands with which participants should be familiar:
Getting started:
- starting Stata, Stata windows
- working directory, working memory: dir, cd, set memory
- using and saving data from disk: use, save, compress
- inspecting data: describe, list, inspect, summarize, count, tabulate
- subsetting data: in, if
- getting online help: help, search, hsearch
- exiting Stata
Audit trails:
- working with do- and log-files: doedit, log, cmdlog
- editing do-files: comments, line breaks
Data manipulation:
- creating and recoding variables: generate, replace, recode, egen, rename, drop, keep
- labelling variables: label define
- bygroup operations: bysort
- observation subscripts: _n, _N
- functions: sum()
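As a quick self-check on the last three items (by-group operations, observation subscripts and sum()), the short sketch below may help. It runs on a made-up person-by-wave data set; the variable names and values are assumptions for illustration only.

```stata
* Hypothetical long-format data: one row per person per wave
clear
input pid wave income
1 1 100
1 2 150
1 3 120
2 1 200
2 2 210
end

* Position of each observation within a person, ordered by wave
bysort pid (wave): generate obsno = _n

* Number of waves observed for each person
bysort pid: generate nwaves = _N

* Running (cumulative) income within each person
bysort pid (wave): generate cuminc = sum(income)

list, sepby(pid)
```

Note that sum() in generate produces a running total, whereas egen's total() would give the same group total on every row.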
Participants not familiar with these Stata commands are strongly encouraged to use the resources recommended below to familiarise themselves with these commands before the course.
Resources for learning Stata:
- Books: Kohler, U. and Kreuter, F. (2009) Data Analysis Using Stata, 2nd ed. College Station, Texas: Stata Press.
- Stata NetCourses (especially nc101)
Further reading
The following provide information about the BHPS and about the types of research for which the data have been used. The course does not presume that participants have read this material.
Lambert, Paul S. (2006) The British Household Panel Survey: Introduction to a Longitudinal Data Resource. University of Stirling.
Institute for Social and Economic Research (2008) In Praise of Panel Surveys. University of Essex.