Date of Award


Document Type


Degree Name

PhD in Business


Department of Mathematical Sciences: Business Analytics

First Advisor

Dominique Haughton

Second Advisor

Mingfei Li

Third Advisor

Jennifer Priestley


As an irreversible, progressive brain disorder, Alzheimer’s disease (AD) imposes a severe burden on patients, their caregivers, and the healthcare system. Of the ten leading causes of death in the United States, Alzheimer’s disease is the only one without a pharmacological intervention that has been proven to cure or delay the onset of the disease. Advancing age is the primary risk factor for Alzheimer’s disease. As the aging population continues to grow, the challenges that AD poses for the healthcare system become increasingly serious. My dissertation aims to contribute to a better understanding of this rising problem from a big data analytics point of view. The investigations draw on a large-scale national clinical and administrative data warehouse in the Veterans Affairs healthcare system.
Chapter 2 uses machine learning techniques to investigate whether combined therapy with four classes of low-cost, FDA-approved medications targeting modifiable risk factors for AD can extend patients’ lifespans. Because elderly patients often have multiple comorbidities at once, and treating them typically involves multiple medications, it is essential to investigate the comparative effects of commonly prescribed medications and their combinations on the development of Alzheimer’s disease. Chapter 2 specifically focuses on the concurrent use of medications that belong to up to three distinct categories that have not been previously explored. This knowledge can inform treatment plans that balance managing multiple chronic conditions against minimizing the risk of developing Alzheimer’s disease, and it offers a potentially significant opportunity to reduce the economic burden of the disease.
Chapter 3 studies the progression of AD and the effect of therapeutic interventions for modifiable risk factors at each stage of the disease. The progression of AD is a dynamic process, and precise prediction of its time course is difficult. The transition rates and probabilities between disease phases, from the early phase of mild cognitive impairment (MCI) to AD and then to death, are presented using Markov multi-state modeling. Acquiring this knowledge is a critical step toward preventing and diagnosing AD at an early stage and may present opportunities for improved clinical management. Risk factors that facilitate the progression from MCI to AD are identified using Cox regression with propensity score weights. Understanding the progression of AD will contribute to estimating AD-related costs and to assessing the cost-effectiveness of AD-related treatments and services.
Chapter 4 assesses the robustness of multi-state survival models when confronted with a noisy dependent variable, particularly in the context of complex diagnostic tasks. In healthcare systems, diagnostic challenges are commonplace, often leading to missed, delayed, or erroneous diagnoses. Diseases such as Alzheimer’s disease pose additional complexities: the intricacy of the disease and the limited use of advanced measurements in primary care practices lead to elevated rates of diagnostic error. Regrettably, the existing literature on algorithm performance in highly noisy data environments remains limited. In this study, I compare the classic Markov multi-state model with a deep-learning model, multi-state ODEs, using simulated data with a highly noisy dependent variable. The primary goal is to discern how sensitive the prediction outcomes of multi-state survival models are in disease progression analysis. The findings contribute to a better understanding of how these models perform under challenging conditions, shedding light on their reliability and applicability in noisy data scenarios.