An Introduction to Categorical Data Analysis, Second Edition introduces the most important methods for analyzing categorical data. It summarizes methods that have long played a prominent role, such as chi-squared tests and measures of association. It places special emphasis, however, on logistic regression and loglinear modeling techniques for univariate and correlated multivariate categorical responses. This Second Edition presents new methods for clustered data, which are increasingly common, for example in longitudinal studies. Two new chapters discuss these methods along with improvements in major software. Chapter 9 deals with marginal models, including the generalized estimating equations (GEE) approach. Chapter 10 deals with random effects models through generalized linear mixed models. Earlier chapters and appendices are updated.
Table of Contents:
Preface to the Second Edition
1. Introduction
1.1 Categorical Response Data
1.2 Probability Distributions for Categorical Data
1.3 Statistical Inference for a Proportion
1.4 More on Statistical Inference for Discrete Data
Problems
2. Contingency Tables
2.1 Probability Structure for Contingency Tables
2.2 Comparing Proportions in Two-by-Two Tables
2.3 The Odds Ratio
2.4 Chi-Squared Tests of Independence
2.5 Testing Independence for Ordinal Data
2.6 Exact Inference for Small Samples
2.7 Association in Three-Way Tables
Problems
3. Generalized Linear Models
3.1 Components of a Generalized Linear Model
3.2 Generalized Linear Models for Binary Data
3.3 Generalized Linear Models for Count Data
3.4 Statistical Inference and Model Checking
3.5 Fitting Generalized Linear Models
Problems
4. Logistic Regression
4.1 Interpreting the Logistic Regression Model
4.2 Inference for Logistic Regression
4.3 Logistic Regression with Categorical Predictors
4.4 Multiple Logistic Regression
4.5 Summarizing Effects in Logistic Regression
Problems
5. Building and Applying Logistic Regression Models
5.1 Strategies in Model Selection
5.2 Model Checking
5.3 Effects of Sparse Data
5.4 Conditional Logistic Regression and Exact Inference
5.5 Sample Size and Power for Logistic Regression
Problems
6. Multicategory Logit Models
6.1 Logit Models for Nominal Responses
6.2 Cumulative Logit Models for Ordinal Responses
6.3 Paired-Category Ordinal Logits
6.4 Tests of Conditional Independence
Problems
7. Loglinear Models for Contingency Tables
7.1 Loglinear Models for Two-Way and Three-Way Tables
7.2 Inference for Loglinear Models
7.3 The Loglinear–Logistic Connection
7.4 Independence Graphs and Collapsibility
7.5 Modeling Ordinal Associations
Problems
8. Models for Matched Pairs
8.1 Comparing Dependent Proportions
8.2 Logistic Regression for Matched Pairs
8.3 Comparing Margins of Square Contingency Tables
8.4 Symmetry and Quasi-Symmetry Models for Square Tables
8.5 Analyzing Rater Agreement
8.6 Bradley–Terry Model for Paired Preferences
Problems
9. Modeling Correlated Clustered Responses
9.1 Marginal Models Versus Conditional Models
9.2 Marginal Modeling: The GEE Approach
9.3 Extending GEE: Multinomial Responses
9.4 Transitional Modeling Given the Past
Problems
10. Random Effects: Generalized Linear Mixed Models
10.1 Random Effects Modeling of Clustered Categorical Data
10.2 Examples of Random Effects Models for Binary Data
10.3 Extensions to Multinomial Responses or Multiple Random Effect Terms
10.4 Multilevel (Hierarchical) Models
10.5 Model Fitting and Inference for GLMMs
Problems
11. A Historical Tour of Categorical Data Analysis
11.1 The Pearson–Yule Association Controversy
11.2 R. A. Fisher’s Contributions
11.3 Logistic Regression
11.4 Multiway Contingency Tables and Loglinear Models
11.5 Final Comments
Appendix A: Software for Categorical Data Analysis
Appendix B: Chi-Squared Distribution Values
Bibliography
Index of Examples
Subject Index
Brief Solutions to Some Odd-Numbered Problems
About the Author:
ALAN AGRESTI, PhD, is Distinguished Professor Emeritus in the Department of Statistics at the University of Florida. He has presented short courses on categorical data methods in thirty countries. Dr. Agresti was named "Statistician of the Year" by the Chicago chapter of the American Statistical Association in 2003. He is the author of two advanced texts, including the bestselling Categorical Data Analysis (Wiley), and is also the coauthor of Statistics: The Art and Science of Learning from Data and Statistical Methods for the Social Sciences.
Review:
"Yes, I fully recommend the text as a basis for an introductory course, for students, as well as non-specialists in statistics. The wealth of examples provided in the text is, from my point of view, a rich source of motivating one's own studies and work." (Biometrical Journal, December 2008)
"This text does a good job of achieving its stated goal, and we enthusiastically recommend it." (Journal of the American Statistical Association, September 2008)
"This book is very well-written and it is obvious that the author knows the subject inside out." (Journal of Applied Statistics, April 2008)
"Provides an applied introduction to the most important methods for analyzing categorical data, such as chi-squared tests and logistic regression." (Statistica 2008)
"This is an introductory book and as such it is marvelous...essential for a novice..." (MAA Reviews, June 26, 2007)