Since it first appeared in 1996, the open-source programming language R has become increasingly popular as an environment for statistical analysis and graphical output. In addition to being freely available, R offers several advantages for biostatistics, including strong graphics capabilities, the ability to write customized functions, and its extensibility. This is the first textbook to present classical biostatistical analysis for epidemiology and related public health sciences to students using the R language. Based on the assumption that readers have minimal familiarity with statistical concepts, the author uses a step-by-step approach to building skills.
The text encompasses biostatistics from basic descriptive and quantitative statistics to survival analysis and missing data analysis in epidemiology. Illustrative examples, including real-life research problems and exercises drawn from such areas as nutrition, environmental health, and behavioral health, engage students and reinforce their understanding of R. These examples illustrate the application of R to biostatistical calculations and the graphical display of results. The text covers both essential and advanced techniques and applications in biostatistics that are relevant to epidemiology. It is supplemented with teaching resources, including an online student guide to solving the exercises and an instructor’s manual.
KEY FEATURES:
- First textbook to give an overview of biostatistics for epidemiology and public health using the open-source R program
- Covers essential and advanced techniques and applications in biostatistics as relevant to epidemiology
- Features abundant examples and exercises to illustrate the application of the R language to biostatistical calculations and graphical displays of results
- Includes online student solutions guide and instructor’s manual
Table of contents
Contents
Preface
1. INTRODUCTION
1.1 Medicine, Preventive Medicine, Public Health, and Epidemiology
Medicine
Preventive Medicine and Public Health
Public Health and Epidemiology
Review Questions for Section 1.1
1.2 Personal Health and Public Health
Personal Health Versus Public Health
Review Questions for Section 1.2
1.3 Research and Measurements in EPDM and PH
EPDM: The Basic Science of PH
Main Epidemiologic Functions
The Cause of Diseases
Exposure Measurement in Epidemiology
Additional Issues
Review Questions for Section 1.3
1.4 BIOS and EPDM
Review Questions for Section 1.4
References
2. RESEARCH AND DESIGN IN EPIDEMIOLOGY AND PUBLIC HEALTH
Introduction
2.1 Causation and Association in Epidemiology and Public Health
The Bradford-Hill Criteria for Causation and Association in Epidemiology
Legal Interpretation Using Epidemiology
Disease Occurrence
Review Questions for Section 2.1
2.2 Causation and Inference in Epidemiology and Public Health
Rothman’s Diagrams for Sufficient Causation of Diseases
Causal Inferences
Using the Causal Criteria
Judging Scientific Evidence
Review Questions for Section 2.2
2.3 Biostatistical Basis of Inference
Modes of Inference
Levels of Measurement
Frequentist BIOS in EPDM
Confidence Intervals in Epidemiology and Public Health
Bayesian Credible Interval
Review Questions for Section 2.3
2.4 BIOS in EPDM and PH
Applications of BIOS
BIOS in EPDM and PH
Processing and Analyzing Basic Epidemiologic Data
Analyzing Epidemiologic Data
Using R
Evaluating a Single Measure of Occurrence
Poisson Count (Incidence) and Rate Data
Binomial Risk and Prevalence Data
Evaluating Two Measures of Occurrence—Comparison of Risk: Risk Ratio and Attributable Risk
Comparing Two Rate Estimates: Rate Ratio rr
Comparing Two Risk Estimates: Risk Ratio RR and Disease (Morbidity) Odds Ratio DOR
Comparing Two Odds Estimates From Case–Control: The Salk Polio Vaccine Epidemiologic Study
Review Questions for Section 2.4
Exercises for Chapter 2
Using Probability Theory
Disease Symptoms in Clinical Drug Trials
Risks and Odds in Epidemiology
Case–Control Epidemiologic Study
Mortality, Morbidity, and Fertility Rates
Incidence Rates in Case-Cohort Survival Analysis
Prevalence
Mortality Rates
Estimating Sample Sizes
References
Appendix
3. DATA ANALYSIS USING R PROGRAMMING
Introduction
3.1 Data and Data Processing
Data Coding
Data Capture
Data Editing
Imputations
Data Quality
Producing Results
Review Questions for Section 3.1
3.2 Beginning R
R and Biostatistics
A First Session Using R
The R Environment
Review Questions for Section 3.2
3.3 R as a Calculator
Mathematical Operations Using R
Assignment of Values in R and Computations Using Vectors and Matrices
Computations in Vectors and Simple Graphics
Use of Factors in R Programming
Simple Graphics
x as Vectors and Matrices in Biostatistics
Some Special Functions That Create Vectors
Arrays and Matrices
Use of the Dimension Function dim in R
Use of the Matrix Function matrix in R
Some Useful Functions Operating on Matrices in R
NA: “Not Available” for Missing Values in Datasets
Special Functions That Create Vectors
Review Questions for Section 3.3
Exercises for Section 3.3
3.4 Using R in Data Analysis in BIOS
Entering Data at the R Command Prompt
The Function list() and the Making of data.frame() in R
Review Questions for Section 3.4
Exercises for Section 3.4
3.5 Univariate, Bivariate, and Multivariate Data Analysis
Univariate Data Analysis
Bivariate and Multivariate Data Analysis
Multivariate Data Analysis
Analysis of Variance (ANOVA)
Review Questions for Section 3.5
Exercises for Section 3.5
References
Appendix: Documentation for the plot function
Generic X–Y Plotting
4. GRAPHICS USING R
Introduction
Choice of System
Packages
4.1 Base (or Traditional) Graphics
High-Level Functions
Low-Level Plotting Functions
Interacting with Graphics
Using Graphics Parameters
Parameters List for Graphics
Device Drivers
Review Questions for Section 4.1
Exercises for Section 4.1
4.2 Grid Graphics
The lattice Package: Trellis Graphics
The Grid Model for R Graphics
Grid Graphics Objects
Applications to Biostatistical and Epidemiologic Investigations
Review Questions for Section 4.2
Exercises for Section 4.2
References
5. PROBABILITY AND STATISTICS IN BIOSTATISTICS
Introduction
5.1 Theories of Probability
What Is Probability?
Basic Properties of Probability
Probability Computations Using R
Applications of Probability Theory to Health Sciences
Typical Summary Statistics in Biostatistics: Confidence Intervals, Significance Tests, and Goodness of Fit
Review Questions for Section 5.1
Exercises for Section 5.1
5.2 Typical Statistical Inference in Biostatistics: Bayesian Biostatistics
What Is Bayesian Biostatistics?
Bayes’s Theorem in Probability Theory
Bayesian Methodology and Survival Analysis (Time-to-Event) Models for Biostatistics in Epidemiology and Preventive Medicine
The Inverse Bayes Formula
Modeling in Biostatistics
Review Questions for Section 5.2
Exercises for Section 5.2
References
6. CASE–CONTROL STUDIES AND COHORT STUDIES IN EPIDEMIOLOGY
Introduction
6.1 Theory and Analysis of Case–Control Studies
Advantages and Limitations of Case–Control Studies
Analysis of Case–Control Studies
Review Questions for Section 6.1
Exercises for Section 6.1
6.2 Theory and Analysis of Cohort Studies
An Important Application of Cohort Studies
Clinical Trials
Randomized Controlled Trials
Cohort Studies for Diseases of Choice and Noncommunicable Diseases
Cohort Studies and the Lexis Diagram in the Biostatistics of Demography
Review Questions for Section 6.2
Exercises for Section 6.2
References
7. RANDOMIZED TRIALS, PHASE DEVELOPMENT, CONFOUNDING IN SURVIVAL ANALYSIS, AND LOGISTIC REGRESSIONS
7.1 Randomized Trials
Classifications of RTs by Study Design
Randomization
Biostatistical Analysis of Data from RTs
Biostatistics for RTs in the R Environment
Review Questions for Section 7.1
Exercises for Section 7.1
7.2 Phase Development
Phase 0 or Preclinical Phase
Phase I
Phase II
Phase III
Pharmacoepidemiology: A Branch of Epidemiology
Some Basic Tests in Epidemiologic Phase Development
Review Questions for Section 7.2
Exercises for Section 7.2
7.3 Confounding in Survival Analysis
Biostatistical Approaches for Controlling Confounding
Using Regression Modeling for Controlling Confounding
Confounding and Collinearity
Review Questions for Section 7.3
Exercises for Section 7.3
7.4 Logistic Regressions
Inappropriateness of the Simple Linear Regression When y Is a Categorical Dependent Variable
The Logistic Regression Model
The Logit
Logistic Regression Analysis
Generalized Linear Models in R
Review Questions for Section 7.4
Exercises for Section 7.4
References
Index
About the Author
Bertram K. C. (Bert) Chan, PhD, is currently Consulting Biostatistician at the School of Medicine, Department of Preventive Medicine, at Loma Linda University.