Advanced Statistics for Health Research provides a rigorous geometric understanding of models used in the analysis of health data, including linear and non-linear regression models and supervised machine learning models. Models drawn from the health literature include ordinary least squares, two-stage least squares, probit, logit, Cox regression, duration models, quantile regression, and random forest regression. Causal inference techniques from the health literature are also presented, including randomization, matching and propensity score matching, difference-in-differences, instrumental variables, regression discontinuity, and fixed effects analysis. Code for each technique is provided in Stata, SAS, and R.
Contents:
- The First Day
- The Spreadsheet View of Data
- Multiple Regression — Beta Coefficients, Correlations, and Standard Deviations
- The Data-Generating Process and Scientific Inference
- Causal Inference Using Multiple Regressions
- Randomization and Friends
- Matching and Propensity Score Matching — ‘As if Randomized’
- Instrumental Variables
- Regression Discontinuity — A Sort of Instrumental Variable Technique
- Statistical Merging — Difference-in-Difference and Split-Sample Instrumental Variables
- Panel Dataset Analysis with Randomization
- Panel Dataset Analysis with Fixed Effects and Lags
- Panel Dataset Analysis with Generalized Methods of Moments (GMM), Possible Endogeneity with the Predictor Variables
- Logits, Probits, and Multinomial Logits
- Discrete Outcomes Continued: The Area Under the Curve Metrics, Count Models
- Cox Regression Models a.k.a. Proportional Hazards Modeling
- Structural Duration Models
- Quantile Regression
- Linear Model with Restrictions/Variable Selections–Supervised Machine Learning Including Random Forest Regression
- Random Forest Regression Residuals and the Regression Gini Index
Readership: Health professionals and advanced undergraduate and graduate students who study regression, causal inference, or supervised machine learning.
Key Features:
- Simple geometric figures make causal inference techniques more intuitive
- Easy-to-understand empirical examples are drawn from a broad collection of the health research literature
- SAS, Stata, and R code makes implementing regression models straightforward