Advanced Federated Learning Algorithms Leveraging Selective Forgetting for Data-Constrained Environments
Author(s)
Alotaibi, Abdulrahman
Download: Thesis PDF (7.370Mb)
Advisor
Kagal, Lalana
Abstract
Federated Learning (FL) enables collaborative training of machine learning models without centralizing raw data, offering a practical framework for privacy-preserving AI. Yet real-world deployments face challenges in data-constrained environments, where client datasets are both limited and heterogeneous, as well as from broader adversarial risks and complex regulatory requirements. This dissertation addresses these challenges by integrating Knowledge Evolution (KE) and Later-Layer Forgetting (LLF) into the FL paradigm, and by analyzing their combined impact on security and compliance. The proposed FL-KE and FL-LLF frameworks introduce selective forgetting mechanisms that prune less salient representations, reallocate model capacity, and enable iterative refinement over generations. Experimental evaluations on diverse image classification datasets, including Flower-102, CUB-200, MIT-67, and Stanford Dogs, demonstrate accelerated convergence, improved generalization, and robustness under data scarcity compared to baseline FL. Beyond performance, this work examines the security implications of FL-KE and FL-LLF through a comprehensive threat model covering poisoning, backdoor, inference, free-rider, Sybil, and Byzantine attacks. Analysis reveals that selective forgetting can reduce the persistence of malicious updates, mitigating certain attack vectors while coexisting with robust aggregation and secure aggregation protocols. Finally, this dissertation explores the intersection of FL and data privacy regulations through an empirical survey of stakeholders across the Gulf Cooperation Council (GCC) region. The findings reveal a gap between regulatory awareness and operational compliance, as well as opportunities for FL, especially in its KE- and LLF-enhanced forms, to align with legal principles of data minimization, purpose limitation, and user rights.
By combining methodological advances, defenses against adversarial threats, and attention to regulatory requirements, this work offers a framework for building the next generation of federated learning systems that are effective, secure, and compliant in varied settings. These contributions also support the broader goal of trusted and safe machine learning, where the demand for robust, data-sharing, and regulation-aware systems is central to preventing harmful outcomes, promoting fairness, and protecting the integrity of AI in sensitive fields such as healthcare, finance, and government. The findings presented here carry direct implications for deploying federated AI in high-stakes environments and highlight promising directions for future research at the intersection of machine learning, security, and policy.
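To make the abstract's core mechanism concrete, the following is a minimal sketch of how later-layer forgetting might be combined with federated averaging: clients train locally for several rounds, the server aggregates with FedAvg, and between "generations" the later parameter block is reinitialized while earlier representations are retained. All names (`init_model`, `local_update`, the `early`/`later` blocks, the synthetic-gradient client step) are illustrative assumptions, not the dissertation's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_model():
    # Toy two-block model: an "early" and a "later" parameter block.
    # (Block names are illustrative; a real model would have many layers.)
    return {"early": rng.normal(size=(4, 4)), "later": rng.normal(size=(4, 2))}

def local_update(model, lr=0.1):
    # Placeholder for client-side SGD: perturb weights with a synthetic
    # gradient instead of computing one from real local data.
    return {k: v - lr * rng.normal(size=v.shape) for k, v in model.items()}

def fed_avg(models):
    # Server-side FedAvg: element-wise mean of the client models.
    return {k: np.mean([m[k] for m in models], axis=0) for k in models[0]}

def later_layer_forgetting(model):
    # LLF-style step: reinitialize the later block between generations,
    # keeping the earlier block's learned representations intact.
    forgotten = dict(model)
    forgotten["later"] = rng.normal(size=model["later"].shape)
    return forgotten

def train(num_generations=3, rounds_per_gen=2, num_clients=5):
    global_model = init_model()
    for gen in range(num_generations):
        for _ in range(rounds_per_gen):
            client_models = [local_update(global_model) for _ in range(num_clients)]
            global_model = fed_avg(client_models)
        # Forget between generations, but not after the final one.
        if gen < num_generations - 1:
            global_model = later_layer_forgetting(global_model)
    return global_model

final_model = train()
print({k: v.shape for k, v in final_model.items()})
```

The design choice sketched here is that forgetting is applied only at generation boundaries, so each generation starts from retained early-layer knowledge plus a fresh later block, which is one plausible reading of how selective forgetting could "reallocate model capacity" across generations.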
Date issued
2025-09
Department
Program in Media Arts and Sciences (Massachusetts Institute of Technology)
Publisher
Massachusetts Institute of Technology