Institute for Mathematics and Its Applications
Talk abstract:
The two-phase stratified sampling design, with selection into the second phase (validation) sample dependent on both outcome and covariate factors observed for everyone, offers an efficient and cost-effective alternative to designs in current use. In a motivating example from the National Wilms Tumor Study, two different strategies are used to sample "cases" (treatment failures) and "controls" for purposes of central pathology review: (i) standard case-control sampling, with controls drawn at random from the treatment successes; and (ii) balanced case-control sampling, with all patients of "unfavorable" histology according to the institutional pathologist included in the case-control sample. Three methods of analysis are considered: (i) Horvitz-Thompson (weighted likelihood) estimation; (ii) pseudo-likelihood; and (iii) nonparametric maximum likelihood. The performance of the six design-analysis strategies is investigated by comparing the estimated regression coefficients with estimates from a standard logistic regression analysis of the complete data. The comparison clearly shows the advantages of balanced over standard case-control sampling and of maximum likelihood over the other analysis methods.
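As a rough illustration of the first analysis method, the sketch below fits a Horvitz-Thompson (inverse-probability-weighted) logistic regression to a simulated two-phase sample and compares it with the complete-data fit. Everything in it is an assumption made for illustration: the data are synthetic (not from the National Wilms Tumor Study), the variable names (`x` for the expensive covariate, `z` for its cheap surrogate, `pi` for the selection probabilities), the 10% sampling fraction, and the helper `fit_weighted_logistic` are all hypothetical, and the design only loosely mimics the balanced scheme described above.

```python
import numpy as np

def fit_weighted_logistic(X, y, w, n_iter=50, tol=1e-10):
    """Solve the weighted logistic score equations by Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        score = X.T @ (w * (y - p))                      # weighted score vector
        info = X.T @ (X * (w * p * (1.0 - p))[:, None])  # weighted information matrix
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

rng = np.random.default_rng(0)
n = 20000

# Phase one (synthetic): outcome y and surrogate z are known for everyone.
x = rng.binomial(1, 0.1, size=n)              # "true" covariate, costly to measure
z = np.where(rng.random(n) < 0.9, x, 1 - x)   # surrogate agreeing with x 90% of the time
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(-2.5 + 1.5 * x))))

# Phase two (assumed design): keep all cases and all surrogate-positive
# subjects, and a 10% Bernoulli sample of the remaining controls.
pi = np.where((y == 0) & (z == 0), 0.1, 1.0)
in_phase2 = rng.random(n) < pi

X = np.column_stack([np.ones(n), x])

# Benchmark: ordinary logistic regression on the complete data.
beta_full = fit_weighted_logistic(X, y, np.ones(n))

# Horvitz-Thompson: phase-two subjects only, weighted by 1 / selection probability.
beta_ht = fit_weighted_logistic(X[in_phase2], y[in_phase2], 1.0 / pi[in_phase2])

print("complete data:   ", beta_full)
print("Horvitz-Thompson:", beta_ht)
```

The weighted fit uses `x` only for subjects selected into phase two, yet its coefficients should land near the complete-data values. The sketch does not cover the pseudo-likelihood or nonparametric maximum likelihood methods, which use the phase-one information differently.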