PORTFOLIO

Chirag Jain

Hi! 👋 I am Chirag

I build software to solve problems I have personally encountered. My focus is on leveraging AI and SaaS technologies to develop applications that are accessible, insightful, and designed to make a tangible, positive impact on people's lives.

👨‍💻Technical & Scientific Interests: Research, Deep Learning, Machine Learning, Python Software Development, Data Science, Database Management, Astronomy


Work Experience

Apollo Tyres LTD Digital Innovation Hub

Data Science Intern

Jan. 2026 – Present

Location: Hyderabad - Hybrid

Starting January 5, 2026. Updates coming soon.

Research Experience

Indian National Science Academy (INSA) Summer Research Fellow

May 2025 – Jul. 2025

Technologies: PyTorch, Computer Vision

Supervisor: Dr. Ramesh Venkadamchalam (Department of Mathematics, Central University of Tamil Nadu)

  • Conducted research on AI-based disaster damage assessment using the xView2 dataset comprising 20,000 pre/post-disaster image pairs, focusing on building-level classification to support Humanitarian Assistance and Disaster Relief (HADR) efforts.
  • Designed a scalable preprocessing pipeline that extracted building patches, applied augmentation, and implemented majority-class undersampling, resulting in a 94.4% dataset expansion (from 304,370 to 591,583 image patches) and improved class balance for model training.
  • Developed and optimized a lightweight CNN with residual connections (3.7M parameters, 28.71 MB) that achieved 83.3% accuracy on 128x128 inputs, closely matching ResNet18's 83.5% (7M parameters, 63.33 MB) at a roughly 54.7% smaller model size, enabling faster inference for field deployment.
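
The percentages quoted in the bullets above follow from simple ratios. A minimal sketch of that arithmetic, using only the figures stated above (the helper name percent_change is illustrative, not taken from the project):

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0

# Dataset expansion: 304,370 -> 591,583 image patches
expansion = percent_change(304_370, 591_583)

# Model size reduction: ResNet18 at 63.33 MB -> lightweight CNN at 28.71 MB
# (negated so that a smaller model yields a positive reduction)
size_reduction = -percent_change(63.33, 28.71)

print(f"Dataset expansion: {expansion:.1f}%")
print(f"Model size reduction: {size_reduction:.2f}%")
```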

Comparative Analysis of DL Models for Brain MRI Tumour Detection

Publication

Technologies: PyTorch, Computer Vision

Supervisor: Dr. Deepasikha Mishra (School of Computer Science and Engineering, VIT-AP University)

  • Conducted a comparative analysis of Deep Learning models (VGG16, VGG19, Xception, Simple CNN, EfficientNet-Attn), standardizing training conditions (same optimizer, scheduler, and number of epochs) so that performance could be evaluated on equal terms.
  • Processed ~12,000 brain MRI scans from the MRI ND-5 dataset (sourced from IEEE Dataport) through transformation pipelines to train deep learning models and generate comparative performance visualizations.
  • EfficientNet achieved the highest performance with 99.82% accuracy on the external dataset and 97.45% on the internal dataset, validated using the Nemenyi post-hoc test for statistical significance and Cohen's d for effect size.
  • Research selected for presentation at an IEEE international conference, with subsequent publication slated for the Scopus-indexed IEEE Digital Library.
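
For readers unfamiliar with Cohen's d, it measures how far apart two sets of scores are in units of their pooled standard deviation. A minimal sketch of the standard pooled-variance formula follows; the accuracy lists are hypothetical values for illustration only and are not results from this study:

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(a, b):
    """Cohen's d using the pooled sample standard deviation."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)

# Hypothetical per-run accuracies for two models (illustration only)
model_a = [0.97, 0.98, 0.96, 0.97, 0.98]
model_b = [0.93, 0.94, 0.92, 0.95, 0.93]

d = cohens_d(model_a, model_b)
print(f"Cohen's d = {d:.2f}")
```

A positive d means the first model scored higher on average; by common convention, values above roughly 0.8 are read as a large effect.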

Education

Vellore Institute of Technology - Amaravati, Andhra Pradesh

2022-2026

B.Tech in Computer Science and Engineering (Core)

Current CGPA: 9.12

Chennai Public School - Chennai, Tamil Nadu

2020-2022

Central Board of Secondary Education (CBSE)

Senior Secondary

Percentage: 92.6%

Chennai Public School - Chennai, Tamil Nadu

2018-2020

Central Board of Secondary Education (CBSE)

Secondary

Percentage: 95.6%


Certifications


Projects

KOSH: Open Government Data MCP Server

University Capstone Project - Group

  • Developed Kosh, a conversational interface for India’s Open Government Data that allows users to query complex public datasets and generate instant, dynamic visualizations using natural language, reducing time-to-insight from minutes to seconds.
  • Engineered a specialized Model Context Protocol (MCP) server using Python and FastMCP to wrap 25+ government APIs, solving the N×M integration bottleneck and creating a standardized interoperability layer for AI agents.
  • Built a robust full-stack solution leveraging Google Gemini 2.5 Pro for advanced reasoning, Node.js/Express for the agent backend, and React.js for the frontend, featuring real-time response streaming and automated chart rendering.
  • Personally contributed to the Kosh React UI, including a custom in-chat visualisation feature, and engineered specialised backend tools to integrate 8+ government APIs, enabling the system to fetch, filter, and render complex public datasets dynamically.

GenAI Legal Assistant (Recently Revamped)

  • Designed and developed a responsive, full-stack SaaS application to analyze and summarize complex legal documents, serving as a "GenAI Legal Assistant."
  • Built an intelligent document processing pipeline that ingests multiple file formats (.pdf, .docx, .txt), extracts text, and performs automated section identification using legal keyword recognition.
  • Engineered a resilient, dual-mode architecture for analysis, featuring a primary mode powered by the Gemini API and a fallback "Lite" mode using a fine-tuned Legal Pegasus model and KeyBERT for continued operation.
  • Automated the end-to-end software delivery lifecycle by establishing a CI/CD pipeline, deploying a scalable and monitored solution to a production environment on the Railway platform.
  • Delivered a rich user experience with features including drag-and-drop file uploads, custom summary length controls, and the ability to export the structured analysis as a professionally formatted PDF.
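
The dual-mode design described above is a primary-with-fallback pattern. A minimal sketch of that pattern, assuming any failure of the primary path triggers the fallback; the function names and stand-in summarizers below are hypothetical and do not call the Gemini API or the Pegasus model:

```python
from typing import Callable

def summarize_with_fallback(
    text: str,
    primary: Callable[[str], str],
    lite: Callable[[str], str],
) -> tuple[str, str]:
    """Try the primary summarizer; on any failure, use the lite one.

    Returns (mode, summary) so the caller can show which mode was used.
    """
    try:
        return "primary", primary(text)
    except Exception:
        # Assumption: any error (quota, timeout, network) triggers the fallback.
        return "lite", lite(text)

# Hypothetical stand-ins for the hosted API call and the local model
def failing_primary(text: str) -> str:
    raise TimeoutError("primary service unavailable")

def local_lite(text: str) -> str:
    return text[:40]  # placeholder for a real local summarizer

mode, summary = summarize_with_fallback(
    "This Agreement is made between the parties.", failing_primary, local_lite
)
print(mode, "->", summary)
```

Returning the mode alongside the summary lets the interface tell the user when the "Lite" path produced the result.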

GitDone: A GitHub-Integrated Deadline Tracker

  • Deployed an open-source tool that uses GitHub OAuth2 for secure user sign-in, allowing developers to create and manage deadline countdowns for their repositories.
  • Constructed a 4-endpoint REST API to provide users with a unique embed link for a real-time countdown widget, enabling seamless integration into applications like Notion by configuring appropriate CORS headers.
  • Automated the software delivery lifecycle by engineering a CI/CD pipeline with AWS CodePipeline for deployments to AWS Elastic Beanstalk. Fortified security and performance by implementing Amazon CloudFront as a CDN to handle custom domain routing, SSL certificate termination, and cached asset delivery, with 99.5% uptime.
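
The core of a countdown widget is splitting the time left into display units. A minimal sketch under stated assumptions: the function name, the example dates, and the CORS header values are hypothetical, since the headers GitDone actually sets are not given here:

```python
from datetime import datetime, timezone

def time_remaining(deadline: datetime, now: datetime) -> dict:
    """Split the time left until the deadline into days/hours/minutes/seconds."""
    total = max(int((deadline - now).total_seconds()), 0)  # clamp past deadlines to zero
    days, rem = divmod(total, 86_400)
    hours, rem = divmod(rem, 3_600)
    minutes, seconds = divmod(rem, 60)
    return {"days": days, "hours": hours, "minutes": minutes, "seconds": seconds}

# Hypothetical response headers that allow cross-origin reads of the widget data
CORS_HEADERS = {
    "Access-Control-Allow-Origin": "*",
    "Access-Control-Allow-Methods": "GET",
}

now = datetime(2026, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
deadline = datetime(2026, 1, 3, 15, 30, 45, tzinfo=timezone.utc)
print(time_remaining(deadline, now))
```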

Interactive Portfolio Analytics & Risk Assessment Dashboard

  • Architected a quantitative finance platform to analyse 10 blue-chip equities, implementing a suite of risk metrics including Sharpe Ratio optimization, 95% Value-at-Risk, correlation matrices, max drawdown, and beta coefficients.
  • Deployed a mobile-responsive analytics dashboard using a containerized architecture (Render.com) and Plotly/Dash, featuring real-time data visualization and an automated processing pipeline for 500 trading days of market data via yfinance API.
  • Implemented an advanced statistical modelling system featuring dynamic risk-free rate integration (10Y Treasury), Monte Carlo simulation for risk analysis, and performance attribution across technology, financial, and healthcare sectors.
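
Three of the risk metrics named above can be sketched in a few lines of standard-library Python. This is an illustration under assumptions, not the dashboard's code: the VaR shown is the simple historical (quantile) variant, the price series is hypothetical, and 252 trading periods per year is assumed for annualization:

```python
from math import sqrt
from statistics import mean, stdev

def sharpe_ratio(returns, risk_free_annual=0.0, periods=252):
    """Annualized Sharpe ratio from periodic simple returns."""
    rf_per_period = risk_free_annual / periods
    excess = [r - rf_per_period for r in returns]
    return mean(excess) / stdev(excess) * sqrt(periods)

def historical_var(returns, confidence=0.95):
    """Historical VaR: the loss at the (1 - confidence) quantile, as a positive number."""
    ordered = sorted(returns)
    index = int((1 - confidence) * len(ordered))
    return -ordered[index]

def max_drawdown(prices):
    """Largest peak-to-trough decline, as a fraction of the peak."""
    peak, worst = prices[0], 0.0
    for p in prices:
        peak = max(peak, p)
        worst = max(worst, (peak - p) / peak)
    return worst

# Hypothetical closing prices (illustration only)
prices = [100, 102, 101, 105, 98, 99, 104, 110, 107, 112]
returns = [b / a - 1 for a, b in zip(prices, prices[1:])]

print(f"Sharpe ratio: {sharpe_ratio(returns):.2f}")
print(f"95% VaR:      {historical_var(returns):.4f}")
print(f"Max drawdown: {max_drawdown(prices):.4f}")
```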

Stacked Ensemble Learning Model to Classify Potentially Hazardous Near-Earth Asteroids

  • Developed a novel stacked ensemble model to classify Near-Earth asteroids as Potentially Hazardous Asteroids (PHAs) using physical and orbital attributes, achieving a recall of 99.29% and accuracy of 99.53%, critical for asteroid impact analysis.
  • Acquired the dataset from NASA’s Jet Propulsion Laboratory Solar System Dynamics open datasets; it consists of approximately 1.3 million records, which were pre-processed before model building.
  • Built a stacked ensemble with Random Forest and XGBoost as base models and Logistic Regression as the meta-model, optimized using GridSearchCV, RFECV, and 15-fold cross-validation.
  • Demonstrated the stacked model’s superior recall performance compared to individual base and meta models, underscoring its robustness in asteroid classification; results are currently under review for journal publication.

Early Prediction of Chronic Kidney Disease using Machine Learning

  • Designed and implemented a predictive machine learning model that analysed medical records to identify chronic kidney disease, achieving an accuracy of 93.33% and a recall of 94.44%.
  • Compared four models: Random Forest, Decision Tree, Logistic Regression, and XGBoost. XGBoost delivered the best overall performance of the four.
  • Trained on a CKD dataset of 400 records acquired from the UC Irvine Machine Learning Repository, plus 200 synthetic records generated using the Copulas library.
  • Deployed the model locally via Flask with a user-friendly web interface scalable for public deployment using PythonAnywhere, enhancing accessibility and potential for broader user engagement.

Chymes: A Spotify Playlist Curator

  • Designed and developed a playlist curator in Python that creates a Spotify playlist based on real-time weather conditions.
  • The application uses the OpenWeatherMap API and the Spotify API to gather information and generate a playlist of 30 songs.
  • Deployed the web application with Flask; a mobile application for the Play Store is planned. The web application is currently in beta testing with 5+ users.

X (formerly Twitter) Sentiment Analysis: COVID-19 Tweets

  • Built a sentiment analysis model using Python that categorised 2021 COVID-19 pandemic tweets into positive, negative, and neutral sentiments.
  • The datasets were acquired from Kaggle and merged to form a single dataset with 200,000+ records. Sentiment scoring was built on VADER, a lexicon- and rule-based Python sentiment analyser by C.J. Hutto, which was tuned for this dataset.
  • The tuned model achieved an accuracy of 88%, and the daily and monthly results were visualised in a window built with the Tkinter library.

Contact & Socials

Gmail:

chiragajay.jain@gmail.com

LinkedIn:

linkedin.com/in/chiragajain

GitHub:

github.com/ChiragAJain

Kaggle:

kaggle.com/chiragajain

© Copyright 2026 | Made by Chirag Jain