A ‘how to’ guide for applying statistical methods to biomarker data analysis
Presenting a solid foundation for the statistical methods that are used to analyze biomarker data, Analysis of Biomarker Data: A Practical Guide features preferred techniques for biomarker validation. The authors provide descriptions of select elementary statistical methods that are traditionally used to analyze biomarker data with a focus on the proper application of each method, including necessary assumptions, software recommendations, and proper interpretation of computer output. In addition, the book discusses frequently encountered challenges in analyzing biomarker data and how to deal with them, methods for the quality assessment of biomarkers, and biomarker study designs.
Covering a broad range of statistical methods that have been used to analyze biomarker data in published research studies, Analysis of Biomarker Data: A Practical Guide also features:
* A greater emphasis on the application of methods as opposed to the underlying statistical and mathematical theory
* The use of SAS, R, and other software throughout to illustrate the presented calculations for each example
* Numerous exercises based on real-world data as well as solutions to the problems to aid in reader comprehension
* The principles of good research study design and the methods for assessing the quality of a newly proposed biomarker
* A companion website that includes a software appendix with multiple types of software and complete data sets from the book’s examples
Analysis of Biomarker Data: A Practical Guide is an ideal upper-undergraduate and graduate-level textbook for courses in the biological or environmental sciences. An excellent reference for statisticians who routinely analyze and interpret biomarker data, the book is also useful for researchers who wish to perform their own analyses of biomarker data, such as toxicologists, pharmacologists, epidemiologists, environmental and clinical laboratory scientists, and other professionals in the health and environmental sciences.
Table of Contents
Preface xiii
Acknowledgements xvii
1 Introduction 1
1.1 What is a Biomarker? 1
1.2 Biomarkers Versus Surrogate Endpoints 2
1.3 Organization of This Book 3
2 Designing Biomarker Studies 5
2.1 Introduction 5
2.2 Designing the Study 6
2.2.1 The Exposure-Disease Association 6
2.2.2 Cross-sectional Studies 7
2.2.3 Case-Control Studies 7
2.2.4 Retrospective Cohort Studies 9
2.2.5 Prospective Cohort Studies 9
2.2.6 Observational Studies 10
2.2.7 Randomized Controlled Trials 11
2.3 Designing the Analysis 13
2.3.1 Choosing the Appropriate Measure of Association 15
2.3.1.1 Odds Ratio versus Risk Ratio 15
2.3.1.2 Consequences of Not Choosing the Appropriate Measure of Association 16
2.3.2 Choosing the Appropriate Statistical Analysis 16
2.3.3 Choosing the Appropriate Sample Size 17
2.4 Presenting Statistical Results 18
Problems 20
3 Elementary Statistical Methods for Analyzing Biomarker Data 21
3.1 Introduction 21
3.2 Graphical and Tabular Summaries 21
3.3 Descriptive Statistics 26
3.4 Describing the Shape of Distributions 31
3.5 Sampling Distributions 33
3.6 Introduction to Statistical Inference 34
3.6.1 Point Estimation and Confidence Interval Estimation 34
3.6.2 Hypothesis Testing 38
3.7 Comparing Means Across Groups 43
3.7.1 Two-Group Comparisons 44
3.7.2 Multiple-Group Comparisons 45
3.8 Correlation Analysis 50
3.9 Regression Analysis 52
3.9.1 Simple Linear Regression 52
3.9.2 Multiple Regression 55
3.9.3 Analysis of Covariance 58
3.10 Analyzing Cross-Classified Data 61
3.10.1 Testing for Independence 61
3.10.2 Comparison of Proportions 65
Problems 69
4 Frequently Encountered Challenges in Analyzing Biomarker Data and How to Deal with Them 72
4.1 Introduction 72
4.2 Non-Normally Distributed Data 73
4.2.1 The Effects of Non-Normality 73
4.2.2 Testing Distributional Assumptions 74
4.2.2.1 Graphical Methods for Assessing Normality 74
4.2.2.2 Measures of Skewness and Kurtosis 81
4.2.2.3 Formal Hypothesis Tests of the Normality Assumption 83
4.2.3 Remedial Measures for Violation of a Distributional Assumption 86
4.2.3.1 Choosing a Transformation 86
4.2.3.2 Using a Robust Statistical Procedure 92
4.2.3.3 Distribution-Free Alternatives 93
4.3 Heterogeneity of Variance 113
4.3.1 The Effects of Heterogeneity 113
4.3.2 The Importance of Heterogeneity in the Comparison of Means 113
4.3.2.1 Comparisons of Two Groups 113
4.3.2.2 Comparisons of More Than Two Groups 116
4.3.2.3 Multiple Comparisons 118
4.4 Dependent Groups 122
4.4.1 The Consequences of Ignoring Dependence Among Groups 122
4.4.2 Comparing Two Dependent Means 124
4.4.2.1 Paired t-test 124
4.4.2.2 Wilcoxon Signed Ranks Test 127
4.4.2.3 Sign Test 128
4.4.3 Tests of Dependent Proportions 134
4.4.3.1 McNemar’s Test 134
4.4.3.2 Cochran’s Q Test 138
4.4.3.3 Sample Size and Power Considerations 142
4.5 Correlated Outcomes 144
4.5.1 Choosing the Appropriate Measure of Association 144
4.5.1.1 Spearman’s rho 144
4.5.1.2 Kendall’s tau-b 146
4.5.2 Recommended Methods of Statistical Analysis for Correlation Coefficients 148
4.5.3 Recommended Methods for Interpreting Correlation Coefficient Results 156
4.5.4 Sample Size Issues in Correlation Analysis 157
4.5.5 Comparison of Correlation Coefficients 171
4.5.5.1 Comparison of Independent Correlation Coefficients 172
4.5.5.2 Comparison of Dependent Correlation Coefficients 174
4.5.6 Sample Size Issues When Comparing Two Correlation Coefficients 181
4.5.6.1 Sample Size Issues When Comparing Independent Correlation Coefficients 181
4.5.6.2 Sample Size Issues When Comparing Dependent Correlation Coefficients 183
4.6 Clustered Data 184
4.7 Outliers 199
4.7.1 The Effects of Outliers 199
4.7.2 Detection of Outliers 199
4.7.3 Methods for Accommodating Outliers 207
4.8 Limits of Detection and Non-Detected Observations 208
4.8.1 Statistical Inference When NDs Are Present 210
4.8.2 Maximum Likelihood Estimation of a Correlation Coefficient When Both X and Y Are Subject to Non-Detects 210
4.8.3 Comparison of Confidence Interval Methods for Correlation Coefficients When Both Variables Are Subject to Limits of Detection 212
4.9 The Analysis of Cross-Classified Categorical Data 221
4.9.1 Choosing the Appropriate Measure of Association 221
4.9.1.1 The Odds Ratio 221
4.9.1.2 Risk Ratio 223
4.9.1.3 Risk Difference 224
4.9.1.4 Odds Ratio for Paired Data 225
4.9.2 Choosing the Appropriate Statistical Analysis 225
4.9.3 Choosing the Appropriate Sample Size 226
4.9.4 Choosing a Statistical Method When Both the Predictor and the Outcome Are Dichotomous 226
4.9.4.1 Comparing Two Independent Groups in Terms of a Binomial Proportion 226
4.9.4.2 Exact Test for Independence of Rows and Columns in a 2 × 2 Table 230
4.9.4.3 Exact Inference for Odds Ratios 232
4.9.4.4 Inference for the Odds Ratio for Paired Data 234
4.9.5 Choice of a Statistical Method When the Predictor Is Ordinal and the Outcome Is Dichotomous 237
4.9.5.1 Tests for a Significant Trend in Proportions 237
4.9.6 Choice of a Statistical Method When Both the Predictor and the Outcome Are Ordinal 240
4.9.6.1 Test for Linear-by-Linear Association 240
4.9.7 Choice of a Statistical Method When Both the Predictor and the Outcome Are Nominal 243
4.9.7.1 Fisher-Freeman-Halton Test 243
Problems 246
5 Validation of Biomarkers 255
5.1 Overview of Methods for Assessing Characteristics of Biomarkers 255
5.2 General Description of Measures of Agreement 257
5.2.1 Discrete Variables 257
5.2.1.1 Cohen’s Kappa 257
5.2.1.2 Extensions of Coefficient Kappa 265
5.2.1.3 Weighted Kappa 273
5.2.2 Continuous Variables 275
5.2.2.1 Pearson’s Correlation Coefficient 275
5.2.2.2 Alternatives to Pearson’s Correlation Coefficient 277
5.3 Assessing Reliability of a Biomarker 287
5.3.1 General Considerations 287
5.3.2 Assessing Reliability of a Dichotomous Biomarker 287
5.3.2.1 Dichotomous Biomarker, More Than Two Raters 289
5.3.3 Assessing Reliability of a Continuous Biomarker 291
5.3.4 Assessing Inter-Subject, Intra-Subject, and Analytical Measurement Variability 292
5.4 Assessing Validity 294
5.4.1 General Considerations 294
5.4.2 Assessing Validity When a Gold Standard Is Available 295
5.4.2.1 Dichotomous Biomarkers 295
5.4.2.2 Comparing Several Dichotomous Biomarkers 302
5.4.2.3 Continuous Biomarkers 304
5.4.3 Assessing Validity When a Gold Standard Is Not Available 314
5.4.3.1 Dichotomous Biomarkers 315
5.4.3.2 Continuous Biomarkers 319
5.4.4 Assessing Criterion Validity in Method Comparison Studies 328
5.4.5 Assessing Construct Validity in Method Comparison Studies 329
Problems 329
References 332
Solutions to Problems 348
Index 391
About the Authors
STEPHEN W. LOONEY, PHD, is Professor in the Department of Biostatistics and Epidemiology at Georgia Regents University, USA. He is a Fellow of the American Statistical Association and the Royal Statistical Society, an elected member of the International Statistical Institute, and a member of the International Biometric Society.
JOSEPH L. HAGAN, SCD, is Research Statistician at Texas Children’s Hospital and Assistant Professor at the Baylor College of Medicine, USA. He is a member of the American Statistical Association.