***********************************************
** SC968 PANEL DATA METHODS for SOCIOLOGISTS
** DO-FILE FOR LECTURE 5
***********************************************

***********************************************
** 5.0 OBJECTIVES
***********************************************

***********************************************
** 5.1 GETTING STARTED
***********************************************

cd D:\Home\savram\SC968
cap log close
log using "SC968 Lecture 9 Worksheet log.log",replace
use survival_2014.dta,clear

** check the data
describe
summarize

***********************************************
** 5.2 SUMMARISING TIME TO EVENT DATA
***********************************************

** prepare data for survival analysis
stset wave, failure(mastat==1/2) id(pid) exit(mastat==1/2 .)
list pid wave mastat _st _d _t _t0 in 1/100,sepby(pid) noobs

** Question: In the previous setting, Stata considers the start of the risk period to be wave 1. How can you tell that?
** Answer: By looking at _t0 and _t. _t0 = 0 in the wave 1 record, meaning that the risk period starts in wave 1.

** Look at ages at the start of the study
sum age if wave==1, detail

** reset the data with age as the time scale (origin at age 16, entry at wave 1)
stset age, id(pid) failure(mastat==1/2) origin(time 16) entry(wave==1) exit(mastat==1/2 .)
list pid wave age mastat _st _d _t _t0 in 1/100,sepby(pid) noobs

stdes
** Question: Are there any gaps?
** Answer: No

stsum
** Question: What is the total person years of follow-up?
** Answer: 2539
** Question: What is the cohabiting relationship rate per year of follow up?
** Answer: 6.8%

stsum, by(sex)

** produce Kaplan-Meier graph
sts graph, by(sex)
graph save Graph "SC968 Lecture 9 Worksheet graph 1.gph",replace
** Question: What is the median time to cohabitation for men and women?
** Answer: Men 11 years, women 9 years

** log rank test of difference in survival
sts test sex
** Question: Is there a significant difference in time to cohabitation between men and women?
** Answer: Yes, the log rank test is significant at the 5% level

***********************************************
** 5.3 COX REGRESSION MODELS
***********************************************

xi:stcox i.sex
xi:stcox i.sex i.agegroup

** Table 1. Hazard ratios by gender for time to first cohabiting partnership
**                          Hazard ratio   95% C.I.    P value
** Unadjusted               1.66           1.22-2.24   0.001
** Adjusted for age group   1.64           1.21-2.23   0.001

** Question: What happens to the hazard ratio for gender when you adjust for age group?
** Answer: It remains more or less the same.
** Question: What does the hazard ratio for gender represent when you adjust for age?
** Answer: It represents the hazard ratio of women relative to men when age is equal to the omitted category, i.e. 15-24.
** Question: What is the hazard ratio of women finding a partner relative to men when they are aged 35 and over?
** Answer: It's 1.64*2.01=3.30. This is the hazard ratio of women aged 35+ relative to the base category, which is men aged 15-24.

** report whether variables vary over time
stvary nssec* hqual* income*

xi:stcox i.nssec_w1
xi:stcox i.nssec_w1 i.sex i.agegroup
xi:stcox i.income_w1
xi:stcox i.income_w1 i.sex i.agegroup
xi:stcox i.hqual_w1
xi:stcox i.hqual_w1 i.sex i.agegroup
xi:stcox i.sex i.agegroup i.nssec_w1 i.hqual_w1 i.income_w1

** Table 2. Hazard ratios for time to first cohabiting partnership
**                              Class   Education   Income
** Unadjusted                   1.36    1.43        1.48
** Adjusted for age/gender      1.25    1.35        1.33
** Adjusted for age/gender      1.04    1.36        1.25
**   and other SEP measures

** Question: Which measure of SEP has the highest hazard ratios? And which the lowest?
** Answer: Highest = education and income; lowest = social class
** Question: Are they all significant predictors of time to cohabitation?
** Answer: Only income is significant in the univariate models.
** Question: How would you interpret the differences between the unadjusted hazard ratios and the hazard ratios adjusted for age and sex?
** Answer: The hazard ratios decrease slightly, suggesting that the three variables are confounded by age and sex, i.e. there are gender and age differences between education/class/income groups
** that explain part of the differences in the time to first partnership
** Question: Does each SEP measure still predict time to cohabitation when you control for the other SEP measures? What do you conclude from this?
** Answer: None of the three predictors is significant at the conventional 5% level. However, education and income still have relatively high hazard ratios, so they may simply be too imprecisely estimated.
** Class appears to have no independent effect.

***********************************************
** 5.4 THE PROPORTIONAL HAZARDS ASSUMPTION
***********************************************

** Kaplan Meier plot by gender
stcoxkm, by(sex)
graph save Graph "SC968 Lecture 9 Worksheet graph 2.gph",replace
** Question: What do you notice? Are the observed lines close to the predicted lines?
** Answer: Yes, they appear to be reasonably close.

** plot of cumulative hazard by gender
sts graph, by(sex) cumhaz
graph save Graph "SC968 Lecture 9 Worksheet graph 3.gph",replace
** Question: Do the cumulative hazard curves cross?
** Answer: No

** log-log survival plot
stphplot, strata(sex) adjust(agegroup nssec_w1 hqual_w1 income_w1)
graph save Graph "SC968 Lecture 9 Worksheet graph 4.gph",replace
** Question: The lines should be approximately parallel. Are they?
** Answer: Yes, until the end of the analysis time

** interaction of gender with time
xi: stcox i.sex i.agegroup i.nssec_w1 i.hqual_w1 i.income_w1, tvc(i.sex) texp(log(_t))
** Question: Does the effect of gender vary by time?
** Answer: No. Look at the second part of the table, labelled tvc. It contains the interaction between log(analysis time) and sex.
** The p-value is 0.236, so the interaction term is not statistically significant.
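
** Optional sketch, not part of the original worksheet: what the tvc() interaction implies
** for the hazard ratio of sex at different analysis times. With texp(log(_t)) the log
** hazard ratio is b_main + b_tvc*log(_t), where _t is years since age 16 under the stset
** used above. This uses the estimates from the tvc() model fitted just above.
** ASSUMPTIONS: the xi-generated dummy for sex is named _Isex_2 (true only if sex is
** coded 1/2), and the time points 1, 5 and 10 are purely illustrative. Check the
** coefficient names in the matrix listing before running the loop.
matrix list e(b)
foreach t of numlist 1 5 10 {
    * hazard ratio for sex at analysis time t, combining the main and tvc coefficients
    display "Hazard ratio for sex at _t = `t': " %5.2f exp(_b[main:_Isex_2] + _b[tvc:_Isex_2]*log(`t'))
}
** At _t = 1, log(_t) = 0, so the displayed value equals the main-effect hazard ratio.
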
** Schoenfeld residuals
xi:stcox i.sex i.agegroup i.nssec_w1 i.hqual_w1 i.income_w1, schoenfeld(sch*) scaledsch(sca*)
estat phtest, rank detail
** Question: Is the test statistic significant for sex?
** Answer: No, chisq = 2.44, p = 0.18
** Question: Is there any evidence that hazards are non proportional for any of the other covariates?
** Answer: Yes, education.

** Kaplan Meier plot by education
stcoxkm, by(hqual_w1)
graph save Graph "SC968 Lecture 9 Worksheet graph 5.gph",replace

** plot of cumulative hazard by education
sts graph, by(hqual_w1) cumhaz
graph save Graph "SC968 Lecture 9 Worksheet graph 6.gph",replace

** log-log survival plot by education
stphplot, strata(hqual_w1) adjust(sex nssec_w1 agegroup income_w1)
graph save Graph "SC968 Lecture 9 Worksheet graph 7.gph",replace

** interaction of education with time
xi: stcox i.sex i.agegroup i.nssec_w1 i.hqual_w1 i.income_w1, tvc(i.hqual_w1) texp(log(_t))

** Cox model stratified by education
xi:stcox i.sex i.nssec_w1 i.agegroup i.income_w1, strata(hqual_w1)

** run model with time varying variables
xi: stcox i.sex i.agegroup i.nssec i.hqual i.income
** Question: Why do you think the estimates are different from table 2?
** Answer: Income and social class show significant variation over time; see the stvary output from the start of this session. Education does not change much.

*******************************************************
** 5.5 TRYING A NEW SURVIVAL ANALYSIS ON YOUR OWN!
*******************************************************

** analyse time to drop-out from survey
stset wave, id(pid) failure(wdrawn==1)
list pid wave wdrawn _st _d _t _t0 in 1/100,sepby(pid) noobs
stdes
stsum
xi:stcox i.sex i.agegroup i.nssec_w1 i.hqual_w1 i.income_w1
** Question: Is gender, age or SEP related to withdrawal from the survey?
** Answer: Gender and age only. But you should check whether the SEP measures have any univariate effect.

log close