Prof. Ranjan Sarkar
Jadavpur University
Collaborative supervision on multimodal clinical AI and challenge design.
Corresponding paper: Cross-Modal Alignment for Clinical Report Generation (CBMS 2026)
Software Developer – ML · CNH Industrial, Gurgaon · Independent Researcher
I work at the intersection of multimodal learning, clinical AI, and language reasoning. My research focuses on systems that fuse vision, text, and structured knowledge, with an emphasis on reliability, hallucination mitigation, and deployment under limited supervision. I collaborate with researchers at Jadavpur University, the University of Liverpool, IIIT Bangalore, ISI Kolkata, and the University of South Carolina.
Prospective PhD Student · Looking for research opportunities in multimodal AI, clinical AI, and language reasoning.
Open for Collaboration · Happy to collaborate on high-impact projects, papers, and challenge benchmarks.
My research addresses a central challenge in AI deployment: building systems that are reliable, grounded, and interpretable when operating across heterogeneous data modalities. I am particularly interested in how vision, language, and structured knowledge can be jointly leveraged to solve high-stakes problems in clinical AI and document intelligence.
A recurring theme in my work is low-supervision learning: designing models that remain effective when labeled data is scarce or expensive. I explore this through semi-supervised encoders, retrieval-augmented generation, and knowledge-graph reasoning pipelines that reduce dependence on large annotated datasets.
Going forward, I aim to deepen my focus on hallucination mitigation in multi-agent LLM systems and extend my clinical AI work to longitudinal patient data, bridging the gap between AI research and real-world medical deployment.
Professors I worked under, with corresponding paper references.
Jadavpur University
Collaborative supervision on multimodal clinical AI and challenge design.
Corresponding paper: Cross-Modal Alignment for Clinical Report Generation (CBMS 2026)
IIIT Bangalore
Guidance on low-resource visual learning and efficient model adaptation.
Corresponding paper: Lightweight Visual Encoders for Low-Resource Medical Image Analysis (CVPRW 2026)
Mentored students and what they are doing now, with co-authored paper references.
Now: Research engineer focused on clinical report generation pipelines.
Co-authored reference: Cross-Modal Alignment for Clinical Report Generation (CBMS 2026)
Now: Working on hallucination robustness and multi-agent LLM evaluation.
Co-authored reference: RAG-Guided Knowledge Graph Reasoning for Document QA (ICPR 2026)
Now: Continuing research in multilingual document understanding benchmarks.
Co-authored reference: VISTAC 2: Vision and Language for Document Understanding (ICPR 2026)
Reviewer for journals and transactions in AI, medical imaging, and document intelligence.
Program committee and reviewer support for leading AI/ML conferences and workshops.
Co-organizing benchmark-driven competitions in clinical imaging and multimodal reasoning.
Mentoring UG/PG researchers on publication-ready projects and research communication.
Recognized for high-quality and timely reviewing contributions in AI venues.
Awarded for impactful contributions to multimodal and clinical AI research.
Honored for translating research ideas into deployable AI systems.
Recognized for early research contributions and collaborative publications.
Building consistency and endurance through regular runs and interval sessions.
Writing research notes, implementation breakdowns, and explainers for complex AI ideas.
Exploring books on scientists, innovators, and founders to learn decision patterns.
Playing for focus and reflex training, usually in doubles format on weekends.