In Spring 2007 at NCSU, Marie Davidian presented work from her 2004 Statistics in Medicine paper; here are the presentation slides: Double Robustness in Estimation of Causal Treatment Effects.
The doubly robust estimator of a treatment effect combines inverse probability weighting, which adjusts for selection into treatment, with regression adjustment for the outcome. Davidian shows that as long as either the propensity score model or the outcome regression model is correctly specified, the doubly robust estimator is consistent for the true treatment effect.
A SAS macro that performs doubly robust estimation is available from the Harry Guess research community web site at UNC.
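For readers working outside SAS, here is a minimal sketch of the augmented inverse-probability-weighted (AIPW) form of the doubly robust estimator in Python. This is my own illustration, not the UNC macro; the scikit-learn models and all names are assumptions.

```python
# A minimal sketch of the AIPW doubly robust estimator of an average
# treatment effect. Illustrative only; this is not the UNC SAS macro.
import numpy as np
from sklearn.linear_model import LogisticRegression, LinearRegression

def aipw_ate(X, treat, y):
    """X: (n, p) covariates; treat: (n,) 0/1 indicator; y: (n,) outcome."""
    X, treat, y = np.asarray(X), np.asarray(treat), np.asarray(y)

    # Propensity model: P(treat = 1 | X), clipped to avoid extreme weights
    ps = LogisticRegression(max_iter=1000).fit(X, treat).predict_proba(X)[:, 1]
    ps = np.clip(ps, 0.01, 0.99)

    # Outcome regressions fit separately within each treatment arm
    m1 = LinearRegression().fit(X[treat == 1], y[treat == 1]).predict(X)
    m0 = LinearRegression().fit(X[treat == 0], y[treat == 0]).predict(X)

    # AIPW estimating equations: IPW term plus a regression augmentation.
    # Consistent if either the propensity or the outcome model is right.
    mu1 = np.mean(treat * y / ps - (treat - ps) / ps * m1)
    mu0 = np.mean((1 - treat) * y / (1 - ps) + (treat - ps) / (1 - ps) * m0)
    return mu1 - mu0
```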
Monday, December 29, 2008
AHRQ Symposium on Clinical and Comparative Effectiveness Research
From the invitation: "The Agency for Healthcare Research and Quality, through its Effective Health Care program, is sponsoring a second invitational symposium on clinical and comparative effectiveness research methods. This 2-day symposium will be held in the first week of June 2009 (dates to be confirmed) at the AHRQ Conference Center. The symposium is a direct followup to the 2006 AHRQ conference on Emerging Methods in Comparative Effectiveness and Safety; papers presented at that conference appeared in a 2007 Medical Care supplement."
There is interesting work on propensity score methodology, inappropriate prescribing, and more reported in this supplement, and it is a helpful indicator of the areas of interest for the symposium. At first glance, though, there does not seem to be much innovative methodology from a statistical viewpoint. I wonder... might there be an opportunity for the upcoming symposium to introduce boosted regression tree methods, either for propensity score estimation or for using longitudinal data creatively in disease management programs?
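On the propensity score side, here is a minimal sketch of what that might look like in Python, in the spirit of the boosted-model approach of McCaffrey and colleagues; the scikit-learn usage and all tuning values here are my assumptions, not anything from the supplement.

```python
# A minimal sketch of boosted regression trees for propensity score
# estimation. Tuning values below are illustrative assumptions.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def boosted_propensity(X, treat, n_estimators=1000, learning_rate=0.01):
    """Estimate P(treat = 1 | X) with gradient boosted trees.

    Shallow trees and a small learning rate let the ensemble capture
    nonlinearities and interactions among confounders without an
    explicit model specification.
    """
    gbm = GradientBoostingClassifier(
        n_estimators=n_estimators,
        learning_rate=learning_rate,
        max_depth=3,
    ).fit(X, treat)
    ps = gbm.predict_proba(X)[:, 1]
    # Clip to keep downstream inverse probability weights stable
    return np.clip(ps, 0.01, 0.99)
```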
Wednesday, December 24, 2008
TE Love Propensity Score Resources
Thomas E. Love teaches a great short course on Propensity Score Methods and has made resources, including his bibliography, a spreadsheet for sensitivity analysis, and presentation slides, available here.
Monday, December 22, 2008
Sample of titles from Gary King's works in progress
I was just looking up one of Gary King's papers on his preprint page, since I don't have access to the journal in which it was published. I found so many interesting and useful preprints there that I had a hard time just getting to what I was looking for. A sampling of the titles is the quickest way to indicate the range and relevance of his works:
- Measuring Total Health Inequality: Adding Individual Variation to Group-Level Differences
- The Essential Role of Pair Matching in Cluster-Randomized Experiments, with Application to the Mexican Universal Health Insurance Evaluation
- The Future of Death in America
- How Not to Lie Without Statistics
- Matching for Causal Inference Without Balance Checking
- A 'Politically Robust' Experimental Design for Public Policy Evaluation, with Application to the Mexican Universal Health Insurance Program
- What to do When Your Hessian is Not Invertible: Alternatives to Model Respecification in Nonlinear Estimation
- Armed Conflict as a Public Health Problem
This list also provides inspiration toward better titles for my future publications. Gary has a paper with advice on writing for publication, too; see his site.
Tuesday, December 16, 2008
Modeling Clustered and Longitudinal Data - resources from Don Hedeker
Don Hedeker taught a tremendously useful, informative, and enjoyable course on multilevel modeling for clustered and longitudinal data at the recent Deming Conference. He managed to cover underlying theory (with a light touch), difficulties with approaches that ignore the structure of the data, and the multilevel approach to modeling. He illustrated every concept with practical applications using real data. He told jokes. He included funny New Yorker comics. It was a blast!
Presentation slides are posted on his web site (see under "Links"). He has also posted datasets and SAS code that he used to demonstrate his points. This is such a valuable collection of resources that I wanted to include specific pointers:
- Introduction to Multilevel Modeling. The slides posted here are the ones he used at the short course.
- Longitudinal data - continuous, binary, ordinal. These are the materials that he used in the course. His inclusion of missing data issues and strategies is important and quite useful.
I don't know if he will be teaching this course anywhere soon, but I highly recommend it!
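In the meantime, here is a minimal sketch of a random-intercept model for clustered longitudinal data in Python with statsmodels, as a rough analogue of his SAS examples; the data file and column names below are hypothetical.

```python
# A minimal sketch of a random-intercept multilevel model; the file
# and column names are hypothetical placeholders.
import pandas as pd
import statsmodels.formula.api as smf

# Long-format data: one row per measurement occasion per subject
df = pd.read_csv("longitudinal.csv")

# Fixed effects for time, treatment, and their interaction; a random
# intercept per subject captures the within-cluster correlation
model = smf.mixedlm("outcome ~ time * treatment", df, groups=df["subject"])
result = model.fit()
print(result.summary())
```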
Monday, December 15, 2008
Case-control design for rare events modeling
In predictive modeling of rare events, such as hospitalization (or international conflict escalating to war), the rareness of the event of interest means that very large samples are needed to supply enough events for a model to learn which predictors are most informative.
One way around this is a more efficient sampling design. An obvious choice is the 'case-control' design: use all of the observations corresponding to events, together with a simple random sample of non-events. This provides a richer source of training data and should improve predictive performance.
Predicted probabilities from such a sample will be artificially high and must be adjusted to correct for the sampling design. In Logistic Regression in Rare Events Data, Gary King and Langche Zeng develop corrections for finite-sample and rare-events bias, and for the inconsistency induced by selecting on the outcome variable, as in a case-control study.
For the logit model, prior correction is shown to be consistent, fully efficient, and easy to apply; explicit expressions are given in Appendix B of the paper. Stata software implementing these methods is available from http://GKing.Harvard.Edu
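As a rough illustration of the prior-correction idea (not King and Zeng's own software), here is a hedged Python sketch; the function and variable names are my own, and it assumes the population event fraction tau is known.

```python
# A minimal sketch of prior correction for a logit fit on a
# case-control sample, following the idea in King & Zeng's paper.
# Function and variable names are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

def prior_corrected_probs(X_train, y_train, X_new, tau):
    """tau: known population event fraction (e.g., 0.002 for a rare
    hospitalization); y_train comes from a sample in which events
    were deliberately oversampled."""
    # Large C approximates an unpenalized maximum-likelihood fit
    fit = LogisticRegression(C=1e6, max_iter=1000).fit(X_train, y_train)
    ybar = y_train.mean()  # event fraction in the biased sample

    # Under outcome-based sampling the slopes remain consistent; only
    # the intercept needs the shift -ln[((1-tau)/tau) * (ybar/(1-ybar))]
    correction = np.log(((1 - tau) / tau) * (ybar / (1 - ybar)))
    eta = X_new @ fit.coef_.ravel() + fit.intercept_[0] - correction
    return 1.0 / (1.0 + np.exp(-eta))
```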
Statistical View of Boosting
Additive Logistic Regression: A Statistical View of Boosting, published in the Annals of Statistics in 2000, by none other than Jerome Friedman, Trevor Hastie, and Robert Tibshirani (71 pages in length). I will post a summary soon. It looks to be a valuable link between classical statistical concepts and new machine learning methodology.
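Until then, here is a minimal Python sketch of discrete AdaBoost with decision stumps, the algorithm the paper reinterprets as stagewise fitting of an additive model under exponential loss; the implementation details here are my own, not the paper's.

```python
# A minimal sketch of discrete AdaBoost with decision stumps; the
# paper shows this amounts to stagewise fitting of an additive model
# under exponential loss. Implementation details are assumptions.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost(X, y, n_rounds=100):
    """y must be coded in {-1, +1}; returns the additive score F."""
    X, y = np.asarray(X), np.asarray(y)
    n = len(y)
    w = np.full(n, 1.0 / n)  # observation weights, uniform at the start
    stumps, coefs = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        err = np.clip(w[pred != y].sum(), 1e-10, 1 - 1e-10)
        c = 0.5 * np.log((1 - err) / err)  # stagewise coefficient
        w *= np.exp(-c * y * pred)  # upweight the misclassified points
        w /= w.sum()
        stumps.append(stump)
        coefs.append(c)

    def F(X_new):
        # F estimates half the log-odds, so P(y=1|x) ~ 1/(1 + exp(-2F))
        return sum(c * s.predict(X_new) for c, s in zip(coefs, stumps))
    return F
```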