Privacy-Aware and Heterogeneity-Tolerant Learning for Visual Recognition under Distributed and Federated Settings

BMU Banner
Enquire Now Apply Now
close

    Admissions Open 2026







    Published date June 8, 2026 | By BMU
    Aditi Palit

    Aditi Palit

    Faculty Associate

    aditi.palit@bmu.edu.in


    Overview

    Modern vision systems (healthcare, surveillance, remote sensing) operate in distributed environments where data is siloed across institutions. Centralized learning is often infeasible due to privacy, ownership, and regulatory constraints. This research develops a unified federated learning framework that enables collaborative visual recognition while addressing key real-world challenges:

    • Non-IID data
    • Model and label heterogeneity
    • Privacy risks in visual data
    • Lack of semantic understanding

    Core Idea

    Instead of sharing raw images, clients train locally and share only compact knowledge (representations, statistics, or predictions). A central server intelligently aggregates this information to build a robust global model.

    Key Challenges Addressed

    • High-dimensional data → redundancy, unstable learning
    • Non-IID distributions → slow and divergent convergence
    • Heterogeneous clients → incompatible models and labels
    • Privacy leakage → sensitive visual content exposure
    • Semantic gap → purely visual features lack meaning

    Key Contributions

    1. Representation-Efficient Learning (AgriNet) Reduces redundancy in high-dimensional visual data (e.g., hyperspectral images) for stable learning.
    2. Distribution-Aware Federated Optimization (FedGMMinit) Uses statistical modeling (GMMs) to initialize global models, improving convergence under non-IID data.
    3. Server-Side Learning with Unlabeled Data (LFBC) Enables learning from heterogeneous clients using prediction fusion instead of parameter sharing.
    4. Task-Aware Visual Privacy Preservation Masks only sensitive regions while preserving task-relevant information, balancing privacy and performance.
    5. Multimodal Federated Learning (FLAMeST) Combines vision features with language embeddings to improve semantic understanding and generalization.

    Why This is Important

    This work moves beyond traditional federated learning by addressing all core challenges jointly, rather than in isolation.

    It shows that:

    • Efficient representations → better convergence
    • Semantic alignment → better generalization
    • Task-aware privacy → usable real-world systems

    Applications

    • Clinical decision support from distributed medical data
    • Privacy-preserving surveillance systems
    • Agricultural monitoring using hyperspectral imagery
    • Cross-institution AI systems without data sharing

    This research provides a complete, scalable, and privacy-aware framework for real-world
    distributed vision systems—bridging the gap between theory and deployable AI.