Wednesday, December 30, 2020

Panel Data and Experimental Design

 

How should researchers design panel data experiments? 

In "Panel Data and Experimental Design" (Fiona, 2020) the issue of up-front ex-ante consideration of realistic, non-constant serial correlations in the planning and design phase of panel data experiments is addressed.  After noting the standard simplifying assumptions for variance structures that that are widely used in the simplest panel settings, for example pre-test - post-test two-group comparisons, the authors  demonstrate how to relax the strict assumption of constant serial correlations and demonstrate their new, “serial-correlation-robust” power calculation approach, showing with simulations and an application to real data that it achieves correct power and thus provides  sample size estimates that support valid inference for this important class of experiments. 

In the authors' own words,

Highlights: 

Existing power calculations (McKenzie, 2012) fail with non-constant serial correlation.

We propose a new method for serial-correlation-robust power calculations.

Our method achieves correct power with arbitrary correlation in simulated & real data.

We introduce a new Stata package, pcpanel, which operationalizes this method.

See Burlig, Preonas, and Woerman (2020), "Panel Data and Experimental Design," Journal of Development Economics, Volume 144.

The updated working paper is available here as a free download.
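For intuition about what a serial-correlation-robust power calculation involves, here is a minimal Python sketch - my own illustration, not the authors' pcpanel code. It simulates a two-group pre/post design with AR(1) errors within each unit; the AR(1) structure and all parameter values are hypothetical stand-ins for whatever correlation structure a design anticipates.

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_power(n_units=100, n_pre=3, n_post=3, effect=0.25,
                   rho=0.5, sigma=1.0, n_sims=2000):
    """Estimate power by simulation for a two-group pre/post design
    with AR(1) serial correlation within each unit's error series."""
    T = n_pre + n_post
    # AR(1) covariance for one unit's T observations: sigma^2 * rho^|i-j|
    cov = sigma**2 * rho ** np.abs(np.subtract.outer(np.arange(T), np.arange(T)))
    chol = np.linalg.cholesky(cov)
    treated = np.repeat([0, 1], n_units // 2)        # half the units treated
    post = np.r_[np.zeros(n_pre), np.ones(n_post)]   # post-period indicator
    rejections = 0
    for _ in range(n_sims):
        errors = rng.standard_normal((n_units, T)) @ chol.T
        y = effect * treated[:, None] * post[None, :] + errors
        # Collapse each unit to its (post mean - pre mean) difference,
        # then compare the two groups with an approximate t-test.
        d = y[:, n_pre:].mean(axis=1) - y[:, :n_pre].mean(axis=1)
        d1, d0 = d[treated == 1], d[treated == 0]
        se = np.sqrt(d1.var(ddof=1) / len(d1) + d0.var(ddof=1) / len(d0))
        rejections += abs((d1.mean() - d0.mean()) / se) > 1.96
    return rejections / n_sims

print(simulate_power(rho=0.0))   # power if errors were serially independent
print(simulate_power(rho=0.7))   # power under strong serial correlation
```

Collapsing each unit to a single pre/post difference keeps the test valid under arbitrary within-unit serial correlation, and simulating power under the anticipated correlation structure, rather than assuming it constant, is the spirit of the paper's approach.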


Wednesday, January 21, 2015

Resources for Statistical Practice: Creating an Analysis Plan

The ASA recently wrote briefly about preparing a statistical analysis plan (SAP).  It was not difficult to find a few other resources with more detailed information on SAPs in biomedical research:

Creation of a Statistical Analysis Plan for reporting clinical trial results: a template produced by Pfizer

The CDC Field Epidemiology Training Program is very comprehensive - see the complete set of trainings, including the training module for Non-Communicable Disease workers.

In particular, these three modules are great for learning about statistical consulting:


Finally, a recent National Institute of Mental Health workshop addressed issues with integrating statistical and clinical research elements in grant submissions for intervention-focused research in mental and behavioral health.

Friday, November 28, 2014

Multiple Primary Outcomes and Analysis Strategy

Researchers commonly want to include more than one outcome measure as a primary outcome in studies evaluating patient-centered interventions.  A good discussion of this situation, summarizing both the investigator/clinician and statistical perspectives, can be found here - along with a look at reporting practices in the depression literature.

Briefly, the dilemma centers on the conservativeness of Bonferroni-type corrections when outcomes are correlated, and on the high false-positive rates and difficulty of interpretation when multiplicity is not addressed at all.  CONSORT guidelines recommend selecting just one primary outcome, but this offers no guidance when a single outcome is not deemed adequate.
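To make the dilemma concrete, here is a small simulation - my own sketch, not from the linked paper - of the familywise error rate for several positively correlated outcomes, tested with and without a Bonferroni correction; all the numbers are hypothetical:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

def familywise_error(k=5, corr=0.6, n=100, alpha=0.05, n_sims=5000):
    """Familywise error rate under the global null for k correlated
    outcomes: unadjusted vs. Bonferroni-adjusted testing."""
    cov = np.full((k, k), corr)
    np.fill_diagonal(cov, 1.0)                    # equicorrelated outcomes
    chol = np.linalg.cholesky(cov)
    any_raw = any_bonf = 0
    for _ in range(n_sims):
        x = rng.standard_normal((n, k)) @ chol.T  # n subjects, k outcomes
        t = x.mean(axis=0) / (x.std(axis=0, ddof=1) / np.sqrt(n))
        p = 2 * norm.sf(np.abs(t))                # approximate z-test p-values
        any_raw += (p < alpha).any()              # no multiplicity correction
        any_bonf += (p < alpha / k).any()         # Bonferroni correction
    return any_raw / n_sims, any_bonf / n_sims

# Unadjusted testing rejects at least one true null well above 5% of the
# time, while Bonferroni falls below 5%: conservative, because the k
# tests are positively correlated rather than independent.
print(familywise_error())
```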

Joint testing of multiple outcomes, using, for example, linear mixed models with multiple continuous outcomes and random subject effects to account for within-patient correlation, is a method worth considering.  Yoon et al. report here on a simulation study evaluating this approach in several scenarios.
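As a rough illustration of the joint-testing idea - a sketch with simulated data and statsmodels, not Yoon et al.'s code - one can stack the outcomes in long format, fit a mixed model with a random subject intercept, and test all treatment terms at once with a likelihood ratio test:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy.stats import chi2

# Simulated long-format data: two correlated continuous outcomes per subject.
rng = np.random.default_rng(1)
n = 120
treat = np.repeat([0, 1], n // 2)
subj = rng.standard_normal(n)                     # shared subject effect
y1 = 0.3 * treat + subj + rng.standard_normal(n)  # hypothetical effect sizes
y2 = 0.2 * treat + subj + rng.standard_normal(n)
df = pd.DataFrame({"subject": np.tile(np.arange(n), 2),
                   "outcome": np.repeat(["y1", "y2"], n),
                   "treat": np.tile(treat, 2),
                   "value": np.r_[y1, y2]})

# The random subject intercept models the within-patient correlation.
full = smf.mixedlm("value ~ outcome * treat", df,
                   groups=df["subject"]).fit(reml=False)
null = smf.mixedlm("value ~ outcome", df,
                   groups=df["subject"]).fit(reml=False)

# Likelihood ratio test of the joint null: no treatment effect on either
# outcome (2 treatment-related fixed effects dropped from the null model).
lr = 2 * (full.llf - null.llf)
print("LR stat:", lr, "p =", chi2.sf(lr, df=2))
```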

This reminds me of a situation where hypotheses for multiple important outcomes were kept separate, and a multiple testing procedure based on Rosenbaum's "testing hypotheses in order" was considered.  Following up to see how its performance (operating characteristics) looks would be worthwhile.
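The fixed-sequence idea itself is simple to state; here is a minimal sketch of "testing in order" (my own illustration, not Rosenbaum's code): hypotheses are prespecified in order of importance, each tested at the full level, and testing stops at the first non-rejection.

```python
def test_in_order(p_values, alpha=0.05):
    """Fixed-sequence testing: each hypothesis, in its prespecified
    order, is tested at full level alpha; testing stops at the first
    failure to reject. This controls the familywise error rate at
    alpha without a Bonferroni-style penalty."""
    rejected = []
    for i, p in enumerate(p_values):
        if p >= alpha:
            break              # stop: later hypotheses are not tested
        rejected.append(i)
    return rejected

# The first two ordered hypotheses are rejected; the third fails, so the
# fourth is never tested even though its p-value is small.
print(test_in_order([0.001, 0.03, 0.20, 0.004]))   # -> [0, 1]
```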

Either of these methods could be used to strengthen a proposal where multiple outcomes seem to be needed.

Friday, October 19, 2012

Princeton-Trenton ASA Seminar on Missing Data in Clinical Trials

This seminar included presentations on analysis methods and a panel discussion on issues from design through sensitivity analyses for missing data mechanism assumptions in clinical trials. The largely pharmaceutical audience included a few members with FDA experience, so very practical, patient-centered issues were raised. The recent NRC panel report, The Prevention and Treatment of Missing Data in Clinical Trials, was cited frequently as a valuable contribution to the field.

Statisticians often point out the limitations (bias in effect and standard error estimates) that accompany per-protocol analyses where patients who drop out are omitted from the study, relative to analyses of the full ITT population. However, a good point to consider is the purpose of the trial and of the treatment under evaluation: it is usually to improve outcomes for those patients who can tolerate that treatment. In practice, no one treats patients as if once assigned a treatment they must comply with that treatment. If a treatment produces an adverse reaction or does not help, a patient will switch.

Recognizing this, newer designs were discussed. For example, in one design patients first participate in a "run-in" phase with a low dose of drug, and only those who tolerate the low dose are randomized. This eliminates only one source of dropout; the NRC report includes other example designs.

There were some good suggestions for "sensitivity analyses" related to assumptions and methodological choices made in handling the missing data in the analysis of trial data. Slides should be posted soon. One suggestion was to relax the assumption of a single mechanism and to assume instead that there were two groups of patients, each having a different missing data mechanism. Then this analysis is conducted and compared to the primary analysis.

The simple idea of using as many potentially divergent estimators as possible in a sensitivity analysis was agreed to be sensible, but perhaps not feasible for a trial protocol where a primary and specific sensitivity analysis are required.

David Bristol gave his farewell lecture, complete with stories of ancient technologies (chalk and blackboard, for example). He described how different missing data mechanisms in the treated and placebo groups can lead to bias: the placebo group experiences a placebo effect for several visits, but then loses patients to loss of efficacy (LOE), so the completers in this group show definite improvement under an LOCF (last observation carried forward) analysis. The treatment group loses patients early when tolerability issues arise, before they have had time to improve. Dropout of this kind can bring the treated and control group means so close together that significance is lost, even when the drug "works" on those who can tolerate it.

I had never thought about that, and I now see why the design of these studies needs to change. I need to think about the non-inferiority (NI) trial and perhaps come up with an NI trial design for non-pharmacologic interventions that includes this kind of run-in. I also have a new respect for per-protocol analyses.
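Bristol's scenario is easy to reproduce in a stylized simulation. The trajectories and dropout rates below are entirely made up, but they show how differential dropout plus LOCF can shrink the treated-vs-placebo difference:

```python
import numpy as np

rng = np.random.default_rng(7)
n, visits = 500, 8

# Placebo arm: early placebo-effect improvement that plateaus, then LOE
# dropouts occur late, freezing patients' best (improved) scores.
placebo_traj = np.minimum(np.arange(visits), 3).astype(float)
placebo_drop = rng.integers(3, visits, n)   # drop after the placebo effect peaks

# Treated arm: steady true improvement, but tolerability dropouts occur
# early, before the drug has had time to work.
treated_traj = 0.8 * np.arange(visits)
treated_drop = np.where(rng.random(n) < 0.4,    # 40% cannot tolerate the drug
                        rng.integers(1, 3, n),  # they drop out early
                        visits - 1)             # the rest complete the trial

def locf_mean(traj, drop):
    """Mean endpoint under LOCF: each patient contributes the value at
    their last observed visit (plus noise)."""
    return np.mean(traj[drop] + rng.standard_normal(n))

print("placebo LOCF mean:", locf_mean(placebo_traj, placebo_drop))
print("treated LOCF mean:", locf_mean(treated_traj, treated_drop))
print("effect in tolerators:", treated_traj[-1] - placebo_traj[-1])
```

With these hypothetical numbers the LOCF group means differ by far less than the true end-of-study effect among patients who tolerate the drug, which is exactly the mechanism by which significance can be lost.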



Wednesday, October 10, 2012

Statistician Greg Ridgeway: New Deputy Director of National Institute of Justice

I have admired Greg's work for a long time. His career path was shaped by a field where little statistical work was being done, and he managed to address important questions using data in innovative ways. He made some great contributions to the field of criminal justice, and will now bring his insights and his unique perspective to the National Institute of Justice.  I remember seeing funding proposals on the NIJ web site back in 2005 when I was a graduate student in statistics. My research advisor was reluctant to submit for funding there, because of our lack of expertise in criminal justice. Greg's story seems to weigh against that inhibition - how interesting!

Here's the article: Statistician Greg Ridgeway: New Deputy Director of National Institute of Justice

Monday, April 12, 2010

Statistical Methods for Sample Surveys

The Johns Hopkins School of Public Health has provided an OpenCourseWare site to make materials from their public health courses available to everyone. The Sample Surveys course is at http://ocw.jhsph.edu/courses/StatMethodsForSampleSurveys/index.cfm
and the other courses can be found from the "Home" link on that page.

Wednesday, January 6, 2010

Data Mining presentations at SUGI

This is a handy collection - the SAS Data Mining group presentations from SUGI meetings are posted at http://www.lexjansen.com/cgi-bin/xsl_transform.php?x=sdm&s=sugi_s&c=sugi

These include summary papers for applications that would be good to share with consulting clients (for example, there was one on regression assumptions and problems with predictive modeling). There was also a paper on applying Google's PageRank algorithm to football teams! Definitely worth browsing. Best-paper awardees are clearly indicated as well.