Presents statistical methodologies for analyzing common types of data from method comparison experiments and illustrates their applications through detailed case studies.
Measuring Agreement: Models, Methods, and Applications covers the statistical evaluation of agreement between two or more methods of measuring a variable, with a primary focus on continuous data. The authors treat the analysis of method comparison data as a two-step procedure: first an adequate model for the data is found, and then inferential techniques are applied to appropriate functions of the model parameters. The presentation is accessible to a wide audience and provides the necessary technical details and references. In addition, the authors present chapter-length explorations of data from paired measurements designs, repeated measurements designs, and multiple methods; data with covariates; and heteroscedastic, longitudinal, and categorical data. The book also:
• Strikes a balance between theory and applications
• Presents parametric as well as nonparametric methodologies
• Provides a concise introduction to Cohen’s kappa coefficient and other measures of agreement for binary and categorical data
• Discusses sample size determination for trials on measuring agreement
• Contains real-world case studies and exercises throughout
• Provides a supplemental website containing the related datasets and R code
Measuring Agreement: Models, Methods, and Applications is a resource for statisticians and biostatisticians engaged in data analysis, consultancy, and methodological research. It is a reference for clinical chemists, ecologists, and biomedical and other scientists who deal with development and validation of measurement methods. This book can also serve as a graduate-level text for students in statistics and biostatistics.
Contents
Preface xv
1 Introduction 1
1.1 Preview 1
1.2 Notational Conventions 1
1.3 Basic Characteristics of a Measurement Method 2
1.3.1 A Statistical Model for Measurements 3
1.3.2 Quality Characteristics 3
1.4 Method Comparison Studies 5
1.5 Meaning of Agreement 6
1.6 A Measurement Error Model 8
1.6.1 Identifiability Issues 9
1.6.2 Model-Based Moments 10
1.6.3 Conditions for Perfect Agreement 10
1.6.4 Link to Test Theory 11
1.7 Similarity versus Agreement 11
1.7.1 Evaluation of Similarity 11
1.7.2 Evaluation of Agreement 12
1.8 A Toy Example 13
1.9 Controversies and Our View 14
1.10 Concepts Related to Agreement 15
1.11 Role of Confidence Intervals and Hypotheses Testing 16
1.11.1 Formulating the Agreement Hypotheses 16
1.11.2 Testing Hypotheses Using Confidence Bounds 17
1.11.3 Evaluation of Agreement Using Confidence Bounds 17
1.11.4 Evaluation of Similarity Using Confidence Intervals 18
1.12 Common Models for Paired Measurements Data 18
1.12.1 A Measurement Error Model 19
1.12.2 A Mixed Effects Model 20
1.12.3 A Bivariate Normal Model 21
1.12.4 Limitations of the Paired Measurements Design 22
1.13 The Bland-Altman Plot 23
1.13.1 The Ideal Plot 23
1.13.2 A Linear Trend in the Bland-Altman Plot 25
1.13.3 Heteroscedasticity in the Bland-Altman Plot 26
1.13.4 Variations of the Bland-Altman Plot 27
1.14 Common Regression Approaches 29
1.14.1 Ordinary Linear Regression 29
1.14.2 Deming Regression 31
1.15 Inappropriate Use of Common Tests in Method Comparison Studies 34
1.15.1 Test of Zero Correlation 34
1.15.2 Paired t-test 36
1.15.3 Pitman-Morgan and Bradley-Blackwood Tests 36
1.15.4 Test of Zero Intercept and Unit Slope 38
1.16 Key Steps in the Analysis of Method Comparison Data 39
1.17 Chapter Summary 40
1.18 Bibliographic Note 41
Exercises 47
2 Common Approaches for Measuring Agreement 53
2.1 Preview 53
2.2 Introduction 53
2.3 Mean Squared Deviation 54
2.4 Concordance Correlation Coefficient 54
2.5 A Digression: Tolerance and Prediction Intervals 57
2.5.1 Definitions 57
2.5.2 Normally Distributed Data 58
2.6 Lin’s Probability Criterion and Bland-Altman Criterion 59
2.7 Limits of Agreement 60
2.7.1 The Approach 60
2.7.2 Why Ignore the Variability? 61
2.7.3 Limits of Agreement versus Prediction and Tolerance Intervals 62
2.8 Total Deviation Index and Coverage Probability 62
2.8.1 The Approaches 62
2.8.2 Normally Distributed Differences 63
2.9 Inference on Agreement Measures 64
2.10 Chapter Summary 64
2.11 Bibliographic Note 65
Exercises 66
3 A General Approach for Modeling and Inference 71
3.1 Preview 71
3.2 Mixed Effects Models 71
3.2.1 The Model 72
3.2.2 Prediction 73
3.2.3 Model Fitting 74
3.2.4 Model Diagnostics 75
3.3 A Large Sample Approach to Inference 76
3.3.1 Approximate Distributions 77
3.3.2 Confidence Intervals 78
3.3.3 Parameter Transformation 80
3.3.4 Bootstrap Confidence Intervals 81
3.3.5 Confidence Bands 83
3.3.6 Test of Homogeneity 83
3.3.7 Model Comparison 84
3.4 Modeling and Analysis of Method Comparison Data 85
3.5 Chapter Summary 88
3.6 Bibliographic Note 89
Exercises 89
4 Paired Measurements Data 95
4.1 Preview 95
4.2 Modeling of Data 95
4.2.1 Mixed Effects Model 95
4.2.2 Bivariate Normal Model 97
4.3 Evaluation of Similarity and Agreement 98
4.4 Case Studies 99
4.4.1 Oxygen Saturation Data 99
4.4.2 Plasma Volume Data 101
4.4.3 Vitamin D Data 103
4.5 Chapter Summary 106
4.6 Technical Details 106
4.6.1 Mixed Effects Model 106
4.6.2 Bivariate Normal Model 107
4.7 Bibliographic Note 108
Exercises 108
5 Repeated Measurements Data 111
5.1 Preview 111
5.2 Introduction 111
5.2.1 Types of Data 112
5.2.2 Individual versus Average Measurement 113
5.2.3 Example Datasets 113
5.3 Displaying Data 114
5.3.1 Basic Plots 114
5.3.2 Interaction Plots 116
5.4 Modeling of Data 117
5.4.1 Unlinked Data 118
5.4.2 Linked Data 121
5.4.3 Model Fitting and Evaluation 123
5.5 Evaluation of Similarity and Agreement 123
5.6 Evaluation of Repeatability 124
5.6.1 Unlinked Data 125
5.6.2 Linked Data 125
5.7 Case Studies 126
5.7.1 Kiwi Data 126
5.7.2 Oximetry Data 129
5.8 Chapter Summary 133
5.9 Technical Details 134
5.9.1 Unlinked Data 134
5.9.2 Linked Data 134
5.10 Bibliographic Note 135
Exercises 137
6 Heteroscedastic Data 141
6.1 Preview 141
6.2 Introduction 141
6.2.1 Diagnosing Heteroscedasticity 142
6.2.2 Example Datasets 143
6.3 Variance Function Models 144
6.4 Repeated Measurements Data 146
6.4.1 A Heteroscedastic Mixed Effects Model 147
6.4.2 Specifying the Variance Function 149
6.4.3 Model Fitting and Evaluation 150
6.4.4 Testing for Homoscedasticity 151
6.4.5 Evaluation of Similarity, Agreement, and Repeatability 151
6.4.6 Case Study: Cholesterol Data 152
6.5 Paired Measurements Data 162
6.5.1 A Heteroscedastic Bivariate Normal Model 162
6.5.2 Specifying the Variance Function 163
6.5.3 Model Fitting and Evaluation 164
6.5.4 Testing for Homoscedasticity 164
6.5.5 Evaluation of Similarity and Agreement 164
6.5.6 Case Study: Cyclosporin Data 165
6.6 Chapter Summary 171
6.7 Technical Details 171
6.7.1 Repeated Measurements Data 171
6.7.2 Paired Measurements Data 173
6.8 Bibliographic Note 174
Exercises 174
7 Data from Multiple Methods 177
7.1 Preview 177
7.2 Introduction 177
7.3 Displaying Data 179
7.4 Example Datasets 179
7.4.1 Systolic Blood Pressure Data 180
7.4.2 Tumor Size Data 180
7.5 Modeling Unreplicated Data 184
7.6 Modeling Repeated Measurements Data 186
7.6.1 Unlinked Data 186
7.6.2 Linked Data 187
7.7 Model Fitting and Evaluation 189
7.8 Evaluation of Similarity and Agreement 190
7.9 Evaluation of Repeatability 191
7.10 Case Studies 192
7.10.1 Systolic Blood Pressure Data 192
7.10.2 Tumor Size Data 195
7.11 Chapter Summary 198
7.12 Technical Details 198
7.13 Bibliographic Note 200
Exercises 200
8 Data with Covariates 205
8.1 Preview 205
8.2 Introduction 205
8.3 Modeling of Data 206
8.3.1 Modeling Means of Methods 206
8.3.2 Modeling Variances of Methods 207
8.3.3 Data Models 208
8.3.4 Model Fitting and Evaluation 211
8.4 Evaluation of Similarity, Agreement, and Repeatability 211
8.4.1 Measures of Agreement for Two Methods 212
8.4.2 Measures of Agreement for More Than Two Methods 213
8.4.3 Measures of Repeatability 213
8.4.4 Inference on Measures 214
8.5 Case Study 214
8.6 Chapter Summary 224
8.7 Technical Details 225
8.8 Bibliographic Note 226
Exercises 226
9 Longitudinal Data 229
9.1 Preview 229
9.2 Introduction 229
9.2.1 Displaying Data 231
9.2.2 Percentage Body Fat Data 231
9.3 Modeling of Data 234
9.3.1 The Longitudinal Data Model 236
9.3.2 Specifying the Mean Functions 237
9.3.3 Specifying the Correlation Function 237
9.3.4 Model Fitting and Evaluation 240
9.4 Evaluation of Similarity and Agreement 241
9.5 Case Study 242
9.6 Chapter Summary 247
9.7 Technical Details 247
9.8 Bibliographic Note 249
Exercises 250
10 A Nonparametric Approach 253
10.1 Preview 253
10.2 Introduction 253
10.3 The Statistical Functional Approach 255
10.3.1 A Weighted Empirical CDF 256
10.3.2 Distributions Induced by Empirical CDF 256
10.4 Evaluation of Similarity and Agreement 258
10.5 Case Studies 259
10.5.1 Unreplicated Blood Pressure Data 259
10.5.2 Replicated Blood Pressure Data 263
10.6 Chapter Summary 267
10.7 Technical Details 267
10.7.1 The Matrix 268
10.7.2 Estimation of 269
10.7.3 Influence Functions for the Measures 270
10.7.4 TDI Confidence Bounds 270
10.7.5 Summary of Steps 271
10.8 Bibliographic Note 271
Exercises 272
11 Sample Size Determination 279
11.1 Preview 279
11.2 Introduction 279
11.3 The Sample Size Methodology 281
11.3.1 Paired Measurements Design 281
11.3.2 Repeated Measurements Design 281
11.4 Case Study 282
11.5 Chapter Summary 286
11.6 Bibliographic Note 286
Exercises 287
12 Categorical Data 289
12.1 Preview 289
12.2 Introduction 289
12.3 Experimental Setups and Examples 290
12.3.1 Types of Data 290
12.3.2 Illustrative Examples 290
12.3.3 A Graphical Approach 292
12.4 Cohen’s Kappa Coefficient for Dichotomous Data 293
12.4.1 Definition and Basic Properties: Two Raters 293
12.4.2 Sample Kappa Coefficient 297
12.4.3 Agreement with a Gold Standard 298
12.4.4 Unbiased Raters: Intraclass Kappa 299
12.4.5 Multiple Raters 300
12.4.6 Combining and Comparing Kappa Coefficients 301
12.4.7 Sample Size Calculations 302
12.5 Kappa Type Measures for More Than Two Categories 303
12.5.1 Two Fixed Raters with Nominal Categories 303
12.5.2 Two Raters with Ordinal Categories: Weighted Kappa 303
12.5.3 Multiple Raters 304
12.6 Case Studies 305
12.6.1 Two Raters with Two Categories 305
12.6.2 Weighted Kappa: Multiple Categories 306
12.7 Models for Exploring Agreement 306
12.7.1 Conditional Logistic Regression Models 306
12.7.2 Log-Linear Models 307
12.7.3 A Generalized Linear Mixed Effects Model 308
12.8 Discussion 309
12.9 Chapter Summary 310
12.10 Bibliographic Note 311
Exercises 312
References 319
Dataset List 331
Index 333
About the Authors
P. K. CHOUDHARY, PhD, is Professor in the Department of Mathematical Sciences at the University of Texas at Dallas, where he is also currently the Associate Head of the department. His research interests include the development of statistical methodology for biostatistical applications, and he has published extensively in the field of method comparison studies.
H. N. NAGARAJA, PhD, is Professor Emeritus at The Ohio State University, where he has served in the Departments of Statistics and Internal Medicine and the Division of Biostatistics. He is a fellow of the American Statistical Association and the American Association for the Advancement of Science, and an elected member of the International Statistical Institute. His published works include Order Statistics, Third Edition (with H. A. David) and Records (with B. C. Arnold and N. Balakrishnan), both published by Wiley.