This book covers recent advances in statistical methods for the analysis of longitudinal data. Real datasets from four large NIH-supported longitudinal clinical trials and epidemiological studies illustrate the practical applications of the statistical methods. The book focuses on nonparametric approaches, which have gained tremendous popularity in biomedical studies. These approaches have the flexibility to answer many scientific questions that cannot be properly addressed by existing parametric approaches, such as linear and nonlinear mixed-effects models.
- Introduction
Scientific Objectives of Longitudinal Studies
Data Structures and Examples
Structures of Longitudinal Data
Examples of Longitudinal Studies
Objectives of Longitudinal Analysis
Conditional-Mean Based Regression Models
Parametric Models
Semiparametric Models
Unstructured Nonparametric Models
Structured Nonparametric Models
Conditional-Distribution Based Models
Conditional Distribution Functions and Functionals
Parametric Distribution Models
Semiparametric Distribution Models
Unstructured Nonparametric Distribution Models
Structured Nonparametric Distribution Models
Review of Smoothing Methods
Local Smoothing Methods
Global Smoothing Methods
Introduction to R
Organization of the Book
- Parametric and Semiparametric Methods
Linear Marginal and Mixed-Effects Models
Marginal Linear Models
The Linear Mixed-Effects Models
Conditional Maximum Likelihood Estimation
Maximum Likelihood Estimation
Restricted Maximum Likelihood Estimation
Likelihood-Based Inferences
Nonlinear Marginal and Mixed-Effects Models
Model Formulation and Interpretation
Likelihood-based Estimation and Inferences
Estimation of Subject-Specific Parameters
Semiparametric Partially Linear Models
Marginal Partially Linear Models
Mixed-Effects Partially Linear Models
Iterative Estimation Procedure
Profile Kernel Estimators
Semiparametric Estimation by Splines
R Implementation
The BMACS CD Data
The ENRICHD BDI Data
Remarks and Literature Notes
Unstructured Nonparametric Models
- Kernel and Local Polynomial Methods
Least-Squares Kernel Estimators
Least-Squares Local Polynomial Estimators
Cross-Validation Bandwidths
The Leave-One-Subject-Out Cross-Validation
A Computation Procedure for Kernel Estimators
Heuristic Justification of Cross-Validation
Bootstrap Pointwise Confidence Intervals
Resampling-Subject Bootstrap Samples
Two Bootstrap Confidence Intervals
Simultaneous Confidence Bands
R Implementation
The HSCT Data
The BMACS CD Data
Asymptotic Properties of Kernel Estimators
Mean Squared Errors
Assumptions for Asymptotic Derivations
Asymptotic Risk Representations
Useful Special Cases
Remarks and Literature Notes
- Basis Approximation Smoothing Methods
Estimation Method
Basis Approximations and Least Squares
Selecting Smoothing Parameters
Bootstrap Inference Procedures
Pointwise Confidence Intervals
Simultaneous Confidence Bands
Hypothesis Testing
R Implementation
The HSCT Data
The BMACS CD Data
Asymptotic Properties
Conditional Biases and Variances
Consistency of Basis Approximation Estimators
Consistency of B-Spline Estimators
Convergence Rates
Consistency of Goodness-of-Fit Test
Remarks and Literature Notes
- Penalized Smoothing Spline Methods
Estimation Procedures
Penalized Least Squares Criteria
Penalized Smoothing Spline Estimator
Cross-Validation Smoothing Parameters
Bootstrap Pointwise Confidence Intervals
R Implementation
The HSCT Data
The NGHS BMI Data
Asymptotic Properties
Assumptions and Equivalent Kernel Function
Asymptotic Distributions, Risk and Inferences
Green’s Function for Uniform Density
Theoretical Derivations
Remarks and Literature Notes
Time-Varying Coefficient Models
- Smoothing with Time-Invariant Covariates
Data Structure and Model Formulation
Data Structure
The Time-Varying Coefficient Model
A Useful Componentwise Representation
Componentwise Kernel Estimators
Construction of Estimators through Least Squares
Cross-Validation Bandwidth Choices
Componentwise Penalized Smoothing Splines
Estimators by Componentwise Roughness Penalty
Estimators by Combined Roughness Penalty
Cross-Validation Smoothing Parameters
Bootstrap Confidence Intervals
R Implementation
The BMACS CD Data
A Simulation Study
Asymptotic Properties for Kernel Estimators
Mean Squared Errors
Asymptotic Assumptions
Asymptotic Risk Representations
Remarks and Implications
Useful Special Cases
Theoretical Derivations
Asymptotic Properties for Smoothing Splines
Assumptions and Equivalent Kernel Functions
Asymptotic Distributions and Mean Squared Errors
Theoretical Derivations
Remarks and Literature Notes
- The One-Step Local Smoothing Methods
Data Structure and Model Interpretations
Data Structure
Model Formulation
Model Interpretations
Remarks on Estimation Methods
Smoothing Based on Local Least Squares Criteria
General Formulation
Least Squares Kernel Estimators
Least Squares Local Linear Estimators
Smoothing with Centered Covariates
Cross-Validation Bandwidth Choice
Pointwise and Simultaneous Confidence Bands
Pointwise Confidence Intervals by Bootstrap
Simultaneous Confidence Bands
R Implementation
The NGHS BP Data
The BMACS CD Data
Asymptotic Properties for Kernel Estimators
Asymptotic Assumptions
Mean Squared Errors
Asymptotic Risk Representations
Asymptotic Distributions
Asymptotic Pointwise Confidence Intervals
Remarks and Literature Notes
- The Two-Step Local Smoothing Methods
Overview and Justifications
Raw Estimators
General Expression and Properties
Component Expressions and Properties
Variance and Covariance Estimators
Refining the Raw Estimates by Smoothing
Rationales for Refining by Smoothing
The Smoothing Estimation Step
Bandwidth Choices
Pointwise and Simultaneous Confidence Bands
Pointwise Confidence Intervals by Bootstrap
Simultaneous Confidence Bands
R Implementation
The NGHS BP Data
Remark on the Asymptotic Properties
Remarks and Literature Notes
- Global Smoothing Methods
Basis Approximation Model and Interpretations
Data Structure and Model Formulation
Basis Approximation
Remarks on Estimation Methods
Estimation Method
Approximate Least Squares
Remarks on Basis and Weight Choices
Least Squares B-Spline Estimators
Cross-Validation Smoothing Parameters
Conditional Biases and Variances
Estimation of Variance and Covariance Structures
Resampling-Subject Bootstrap Inferences
Pointwise Confidence Intervals
Simultaneous Confidence Bands
Hypothesis Testing for Constant Coefficients
R Implementation with the NGHS BP Data
Estimation by B-Splines
Testing Constant Coefficients
Asymptotic Properties
Integrated Squared Errors
Asymptotic Assumptions
Convergence Rates for Integrated Squared Errors
Theoretical Derivations
Consistent Hypothesis Tests
Remarks and Literature Notes
Shared-Parameter and Mixed-Effects Models
- Models for Concomitant Interventions
Concomitant Interventions
Motivation for Outcome-Adaptive Covariate
Two Modeling Approaches
Data Structure with a Single Intervention
Naive Mixed-Effects Change-Point Models
Justifications for Change-Point Models
Model Formulation and Interpretation
Biases of Naive Mixed-Effects Models
General Structure for Shared-Parameters
The Varying-Coefficient Mixed-Effects Models
Model Formulation and Interpretation
Special Cases of Conditional Mean Effects
Likelihood-Based Estimation
Least Squares Estimation
Estimation of the Covariances
The Shared-Parameter Change-Point Models
Model Formulation and Justifications
The Linear Shared-Parameter Change-Point Model
The Additive Shared-Parameter Change-Point Model
Likelihood-Based Estimation
Gaussian Shared-Parameter Change-Point Models
A Two-Stage Estimation Procedure
Confidence Intervals for Parameter Estimators
Asymptotic Confidence Intervals
Bootstrap Confidence Intervals
R Implementation to the ENRICHD Data
Varying-Coefficient Mixed-Effects Models
Shared-Parameter Change-Point Models
Asymptotic Consistency
The Varying-Coefficient Mixed-Effects Models
Maximum Likelihood Estimators
The Additive Shared-Parameter Models
Remarks and Literature Notes
- Nonparametric Mixed-Effects Models
Objectives of Nonparametric Mixed-Effects Models
Data Structure and Model Formulation
Data Structure
Mixed-Effects Models without Covariates
Mixed-Effects Models with a Single Covariate
Extensions to Multiple Covariates
Estimation and Prediction without Covariates
Estimation with Known Covariance Matrix
Estimation with Unknown Covariance Matrix
Individual Trajectories
Cross-Validation Smoothing Parameters
Functional Principal Components Analysis
The Reduced Rank Model
Estimation of Eigenfunctions and Eigenvalues
Model Selection of Reduced Ranks
Estimation and Prediction with Covariates
Models without Covariate Measurement Error
Models with Covariate Measurement Error
R Implementation
The BMACS CD Data
The NGHS BP Data
Remarks and Literature Notes
Nonparametric Models for Distributions
- Unstructured Models for Distributions
Objectives and General Setup
Objectives
Applications
Estimation of Conditional Distributions
Rank-Tracking Probability
Data Structure and Conditional Distributions
Data Structure
Conditional Distribution Functions
Conditional Quantiles
Rank-Tracking Probabilities
Rank-Tracking Probability Ratios
Continuous and Time-Varying Covariates
Estimation Methods
Conditional Distribution Functions
Conditional Cumulative Distribution Functions
Conditional Quantiles and Functionals
Rank-Tracking Probabilities
Cross-Validation Bandwidth Choices
Bootstrap Pointwise Confidence Intervals
R Implementation
The NGHS BMI Data
Asymptotic Properties
Asymptotic Assumptions
Asymptotic Mean Squared Errors
Theoretical Derivations
Remarks and Literature Notes
- Time-Varying Transformation Models - I
Overview and Motivation
Data Structure and Model Formulation
Data Structure
The Time-Varying Transformation Models
Two-Step Estimation Method
Raw Estimates of Coefficients
Bias, Variance and Covariance of Raw Estimates
Smoothing Estimators
Bandwidth Choices
Bootstrap Confidence Intervals
Implementation and Numerical Results
The NGHS Data
Asymptotic Properties
Conditional Mean Squared Errors
Asymptotic Assumptions
Asymptotic Risk Expressions
Theoretical Derivations
Remarks and Literature Notes
- Time-Varying Transformation Models - II
Overview and Motivation
Data Structure and Distribution Functionals
Data Structure
Conditional Distribution Functions
Conditional Quantiles
Rank-Tracking Probabilities
Rank-Tracking Probability Ratios
The Time-Varying Transformation Models
Two-Step Estimation and Prediction Methods
Raw Estimators of Distribution Functions
Smoothing Estimators for Conditional CDFs
Smoothing Estimators for Quantiles
Estimation of Rank-Tracking Probabilities
Estimation of Rank-Tracking Probability Ratios
Bandwidth Choices
R Implementation
Conditional CDF for the NGHS SBP Data
RTP and RTPR for the NGHS SBP Data
Asymptotic Properties
Asymptotic Assumptions
Raw Baseline and Distribution Function Estimators
Local Polynomial Smoothing Estimators
Theoretical Derivations
Remarks and Literature Notes
- Tracking with Mixed-Effects Models
Both authors are mathematical statisticians at the National Institutes of Health (NIH) and have published extensively in statistical and biomedical journals.
"This book will be a good reference book in the area of longitudinal data. It provides a self-contained treatment of structured nonparametric models for longitudinal data. Three commonly used methods (kernel, polynomial splines, penalization/smoothing splines) are all treated with enough depth. The book's coverage of time-varying coefficient models is comprehensive. Shared-parameter and mixed-effects models are very useful in longitudinal data analysis. Section V 'Nonparametric Models for Distribution' summarizes recent development on a new class of models. This section alone will make the book unique among all published books in longitudinal data analysis. Another unique feature of the book is the use of four actual longitudinal studies presented in Section 1.2. The book used data from these four studies when introducing every model/method. This approach motivates each method very well and also shows the usefulness of the method...The book will be an excellent addition to the literature."
Jianhua Huang, Texas A&M University
". . ., this book provides a comprehensive review of the structure of longitudinal data, as well as several applicable nonparametric models. It should prove useful for those working with biomedical data, including real world evidence. The focus is on theorems and proofs; references demonstrate the depth of the research. Real life examples are provided in the form of data descriptions, as well as R code and output. These examples will allow the reader to employ the models and gain expertise in the interpretation of the R output."
Journal of Biopharmaceutical Statistics, Darcy Hille, Merck Research Laboratories