PORTFOLIO

Chirag Jain
Hi! 👋 I am Chirag
I build software to solve problems I have personally encountered. My focus is on leveraging AI and SaaS technologies to develop applications that are accessible, insightful, and designed to make a tangible, positive impact on people's lives.
👨‍💻 Technical & Scientific Interests: Research, Deep Learning, Machine Learning, Python Software Development, Data Science, Database Management, Astronomy
Experience
Indian National Science Academy (INSA) Summer Research Fellow
May 2025 – Jul. 2025
Technologies: PyTorch, Computer Vision
Supervisor: Dr. Ramesh Venkadamchalam (Department of Mathematics, Central University of Tamil Nadu)
- Conducted research on AI-based disaster damage assessment using the xView2 dataset comprising 20,000 pre/post-disaster image pairs, focusing on building-level classification to support Humanitarian Assistance and Disaster Relief (HADR) efforts.
- Designed a scalable preprocessing pipeline that extracted building patches, applied augmentation, and implemented majority-class undersampling, resulting in a 94.4% dataset expansion (from 304,370 to 591,583 image patches) and improved class balance for model training.
- Developed and optimized a lightweight CNN with residual connections (3.7M parameters, 28.71 MB) achieving 83.3% accuracy on 128x128 inputs, closely matching ResNet18's 83.5% (7M parameters, 63.33 MB) at a roughly 54.7% smaller model size, enabling faster inference for field deployment.
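The majority-class undersampling step mentioned above can be illustrated with a minimal sketch. It assumes the simplest policy (every class is cut down to the size of the smallest one); the actual target ratio used in the fellowship pipeline is not stated here, and the patch IDs and labels below are invented for illustration.

```python
import random
from collections import Counter, defaultdict

def undersample_majority(samples, labels, seed=0):
    """Randomly drop samples from larger classes until every class
    matches the size of the smallest class (an assumed policy)."""
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    target = min(len(items) for items in by_class.values())
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible

    balanced = []
    for label, items in by_class.items():
        kept = rng.sample(items, target)
        balanced.extend((sample, label) for sample in kept)
    rng.shuffle(balanced)
    return balanced

# Hypothetical patch IDs and damage labels, for illustration only.
patch_ids = list(range(10))
damage_labels = ["no-damage"] * 7 + ["destroyed"] * 3
balanced = undersample_majority(patch_ids, damage_labels)
class_counts = Counter(label for _, label in balanced)
```

After the call, both classes hold the same number of patches, so the classifier no longer sees the majority class far more often than the minority one.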
Education
Vellore Institute of Technology - Amaravati, Andhra Pradesh
2022-2026
B.Tech in Computer Science and Engineering (Core)
Current CGPA: 9.12
Chennai Public School - Chennai, Tamil Nadu
2020-2022
Central Board of Secondary Education (CBSE)
Senior Secondary
Percentage: 92.6%
Chennai Public School - Chennai, Tamil Nadu
2018-2020
Central Board of Secondary Education (CBSE)
Secondary
Percentage: 95.6%
Certifications
- AWS Solution Architect Associate (SAA-C03)
- IBM Professional Data Science Certification
- R Programming Certification
- The Complete Python Developer Course - ZerotoMastery Academy
- Associate Data Engineer in Snowflake
- Data Analyst with R Certification
- What is Data Science
- Tools for Data Science
- Data Science Methodology
- Python for Data Science, AI and Development
- Python Project for Data Science
- Database and SQL for Data Science with Python
- Data Analysis with Python
- Data Visualisation with Python
- Machine Learning with Python
- Generative AI: Elevate Your Data Science Career
- Data Scientist Career Guide and Interview Preparation
- ISRO IIRS DLP - Exploring Earth's Moon through Chandrayaan
- ISRO IIRS DLP - Aditya L1: India's first space based observatory
- ISRO IIRS - Space Science and Technology Awareness Training (START)
Professional Certifications
Fundamental Certifications
Data Science and Analysis Certifications
Astronomy Certifications
Projects
GenAI Legal Assistant (Recently Revamped)
- Designed and developed a responsive, full-stack SaaS application that analyzes and summarizes complex legal documents.
- Built an intelligent document processing pipeline that ingests multiple file formats (.pdf, .docx, .txt), extracts text, and performs automated section identification using legal keyword recognition.
- Engineered a resilient, dual-mode architecture for analysis, featuring a primary mode powered by the Gemini API and a fallback "Lite" mode using a fine-tuned Legal Pegasus model and KeyBERT for continued operation.
- Automated the end-to-end software delivery lifecycle by establishing a CI/CD pipeline, deploying a scalable and monitored solution to a production environment on the Railway platform.
- Delivered a rich user experience with features including drag-and-drop file uploads, custom summary length controls, and the ability to export the structured analysis as a professionally formatted PDF.
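The dual-mode idea (a primary analyzer with a "Lite" fallback) can be sketched as follows. This is only an illustration of the pattern, not the project's code: `analyze_with_fallback`, `hosted_summarizer`, and `local_summarizer` are hypothetical names, and the two summarizers are stand-ins for a hosted API call and a local model.

```python
from typing import Callable

def analyze_with_fallback(
    text: str,
    primary: Callable[[str], str],
    fallback: Callable[[str], str],
) -> dict:
    """Try the primary analyzer; on any failure, use the fallback instead."""
    try:
        return {"mode": "primary", "summary": primary(text)}
    except Exception as error:  # e.g. network error or exhausted quota
        return {
            "mode": "lite",
            "summary": fallback(text),
            "primary_error": str(error),
        }

# Hypothetical stand-ins: a real system would call a hosted LLM API in
# `hosted_summarizer` and a local summarization model in `local_summarizer`.
def hosted_summarizer(text: str) -> str:
    raise ConnectionError("hosted API unreachable")

def local_summarizer(text: str) -> str:
    # Toy behaviour: return the first sentence as the "summary".
    return text.split(".")[0].strip() + "."

result = analyze_with_fallback(
    "The lessee shall pay rent monthly. Late fees apply after ten days.",
    hosted_summarizer,
    local_summarizer,
)
```

Returning the mode alongside the summary lets the interface tell the user which path produced the result.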
GitDone: A GitHub-Integrated Deadline Tracker
- Deployed an open-source tool that uses GitHub OAuth2 for secure user sign-in, allowing developers to create and manage deadline countdowns for their repositories.
- Constructed a 4-endpoint REST API that gives each user a unique embed link for a real-time countdown widget, enabling seamless integration into applications like Notion by configuring the appropriate CORS headers.
- Automated the software delivery lifecycle by engineering a CI/CD pipeline with AWS CodePipeline for deployments to AWS Elastic Beanstalk. Fortified security and performance by implementing Amazon CloudFront as a CDN to handle custom domain routing, SSL certificate termination, and cached asset delivery, achieving 99.5% uptime.
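A countdown endpoint of this kind boils down to two small pieces: computing the time left and attaching CORS headers for permitted origins. The sketch below is an assumption-laden illustration, not GitDone's implementation; `countdown_payload`, `cors_headers`, and the allow-listed origin are hypothetical.

```python
from datetime import datetime, timezone

def countdown_payload(deadline: datetime, now: datetime) -> dict:
    """Split the time left until `deadline` into days/hours/minutes/seconds."""
    remaining = max(int((deadline - now).total_seconds()), 0)
    days, rest = divmod(remaining, 86_400)
    hours, rest = divmod(rest, 3_600)
    minutes, seconds = divmod(rest, 60)
    return {"days": days, "hours": hours, "minutes": minutes,
            "seconds": seconds, "expired": remaining == 0}

def cors_headers(origin: str, allowed_origins: set) -> dict:
    """Return CORS headers only when the requesting origin is allow-listed."""
    if origin in allowed_origins:
        return {"Access-Control-Allow-Origin": origin, "Vary": "Origin"}
    return {}

# Example values are invented; the origin is an assumed allow-list entry.
deadline = datetime(2025, 1, 10, 12, 0, tzinfo=timezone.utc)
now = datetime(2025, 1, 8, 9, 30, tzinfo=timezone.utc)
payload = countdown_payload(deadline, now)
headers = cors_headers("https://www.notion.so", {"https://www.notion.so"})
```

Echoing back only allow-listed origins, rather than a wildcard, keeps the widget embeddable where intended without opening the API to every site.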
Interactive Portfolio Analytics & Risk Assessment Dashboard
- Architected a quantitative finance platform to analyse 10 blue-chip equities, implementing a suite of risk metrics including Sharpe Ratio optimization, 95% Value-at-Risk, correlation matrices, max drawdown, and beta coefficients.
- Deployed a mobile-responsive analytics dashboard using a containerized architecture (Render.com) and Plotly/Dash, featuring real-time data visualization and an automated processing pipeline for 500 trading days of market data via yfinance API.
- Implemented an advanced statistical modelling system featuring dynamic risk-free rate integration (10Y Treasury), Monte Carlo simulation for risk analysis, and performance attribution across technology, financial, and healthcare sectors.
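For readers unfamiliar with the risk metrics listed above, here is a minimal standard-library sketch of four of them computed from daily returns. The formulas are the textbook versions (historical VaR, 252-day annualisation), which is an assumption about conventions rather than a description of the dashboard's code; all function names and sample returns are illustrative.

```python
import math
import statistics

TRADING_DAYS = 252  # common annualisation convention (assumption)

def sharpe_ratio(returns, annual_risk_free=0.0):
    """Annualised mean excess return divided by its volatility."""
    daily_rf = annual_risk_free / TRADING_DAYS
    excess = [r - daily_rf for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess) * math.sqrt(TRADING_DAYS)

def historical_var(returns, confidence=0.95):
    """Historical VaR: the loss at the (1 - confidence) quantile of returns."""
    ordered = sorted(returns)
    index = int((1 - confidence) * len(ordered))
    return -ordered[index]

def max_drawdown(returns):
    """Largest peak-to-trough fall of the compounded value, as a fraction."""
    peak = value = 1.0
    worst = 0.0
    for r in returns:
        value *= 1 + r
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst

def beta(asset_returns, market_returns):
    """Sample covariance with the market divided by market variance."""
    mean_a = statistics.mean(asset_returns)
    mean_m = statistics.mean(market_returns)
    cov = sum((a - mean_a) * (m - mean_m)
              for a, m in zip(asset_returns, market_returns)) / (len(market_returns) - 1)
    return cov / statistics.variance(market_returns)

# Invented returns: the asset moves exactly twice as much as the market.
market = [0.01, -0.02, 0.03, 0.00]
asset = [2 * m for m in market]
asset_beta = beta(asset, market)
```

In practice the return series would come from price history (for example, daily closes), and a parametric or Monte Carlo VaR could replace the historical estimate.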
Comparative Analysis of DL Models for Brain MRI Tumour Detection
Technologies: PyTorch, Computer Vision
Supervisor: Dr. Deepasikha Mishra (School of Computer Science and Engineering, VIT-AP University)
- Conducted a comparative analysis of Deep Learning models (VGG16, VGG19, Xception, Simple CNN, EfficientNet-Attn), standardizing training conditions (same optimizer, scheduler, and number of epochs) so that performance could be evaluated on equal terms.
- Processed ~12,000 brain MRI scans from the MRI ND-5 dataset (sourced from IEEE Dataport) through transformation pipelines to train deep learning models and generate comparative performance visualizations.
- EfficientNet achieved the highest performance with 99.82% accuracy on the external dataset and 97.45% on the internal dataset, validated using the Nemenyi post-hoc test and Cohen’s d effect size.
- Research selected for presentation at an IEEE international conference, with subsequent publication slated for the Scopus-indexed IEEE Digital Library.
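Cohen's d, mentioned above, expresses the gap between two groups of scores in units of their pooled standard deviation. The sketch below uses the standard pooled-variance formula; the per-fold accuracies are invented for illustration and are not results from the study.

```python
import math
import statistics

def cohens_d(group_a, group_b):
    """Cohen's d using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(
        ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    )
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd

# Hypothetical per-fold accuracies for two models (not real results).
model_a_acc = [0.97, 0.98, 0.99]
model_b_acc = [0.94, 0.95, 0.96]
effect = cohens_d(model_a_acc, model_b_acc)
```

A value near 0.2 is conventionally read as a small effect and 0.8 or more as a large one, so the statistic complements a significance test rather than replacing it.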
Stacked Ensemble Learning Model to Classify Potentially Hazardous Near-Earth Asteroids
- Developed a novel stacked ensemble model to classify Near-Earth asteroids as Potentially Hazardous Asteroids (PHAs) using physical and orbital attributes, achieving a recall of 99.29% and accuracy of 99.53%, critical for asteroid impact analysis.
- The dataset was acquired from NASA’s Jet Propulsion Laboratory Solar System Dynamics open datasets and consists of approximately 1.3 million records, which were pre-processed before model building.
- Built a stacked ensemble with Random Forest and XGBoost as base models and Logistic Regression as the meta-model, optimized using GridSearchCV, RFECV, and 15-fold cross-validation.
- Demonstrated the stacked model’s superior recall performance compared to individual base and meta models, underscoring its robustness in asteroid classification; results are currently under review for journal publication.
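Recall is emphasised above because hazardous asteroids are rare, so a model can score high accuracy while still missing them. A minimal sketch of both metrics follows; the labels are invented for illustration (1 = potentially hazardous, 0 = not hazardous) and do not come from the project's data.

```python
def recall_and_accuracy(y_true, y_pred, positive=1):
    """Return (recall for the positive class, overall accuracy)."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(t == positive and p == positive for t, p in pairs)
    false_neg = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    actual_pos = true_pos + false_neg
    recall = true_pos / actual_pos if actual_pos else 0.0
    return recall, correct / len(pairs)

# Invented labels: two hazardous objects among ten, one of them missed.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
recall, accuracy = recall_and_accuracy(y_true, y_pred)
```

Here accuracy looks strong while half of the hazardous objects go undetected, which is why recall is the more informative number for this task.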
Early Prediction of Chronic Kidney Disease using Machine Learning
- Designed and implemented a predictive machine learning model that analysed medical records to identify chronic kidney disease, achieving an accuracy of 93.33% and a recall score of 94.44%.
- Compared four models: Random Forest, Decision Tree, Logistic Regression, and XGBoost. Of the four, XGBoost performed best overall.
- Trained on a CKD dataset of 400 records acquired from the UC Irvine Machine Learning Repository, plus 200 synthetic records generated using the Copulas library.
- Deployed the model locally via Flask with a user-friendly web interface that can be scaled to public deployment on PythonAnywhere, enhancing accessibility and potential for broader user engagement.
Chymes: A Spotify Playlist Curator
- Designed and developed a Python playlist curator that creates a Spotify playlist based on real-time weather conditions.
- Used the OpenWeatherMap API and the Spotify API to gather information and generate a playlist of 30 songs.
- Deployed the web page with Flask; a mobile application for the Play Store is planned. The web application is currently in beta testing with 5+ users.
X (formerly Twitter) Sentiment Analysis: COVID-19 Tweets
- Built a sentiment analysis model using Python that categorised 2021 COVID-19 pandemic tweets into positive, negative, and neutral sentiments.
- The datasets were acquired from Kaggle and merged to form a single dataset with 200,000+ records. Transfer learning was implemented on a pre-trained model, VADER, built in Python by C.J. Hutto.
- The tuned model achieved an accuracy of 88%, and the results were visualised on a daily and monthly basis in a window created using the Tkinter library.
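VADER produces a compound score between -1 and 1 for each text, and a three-way label is usually derived by thresholding it. The sketch below uses the conventional cut-offs of ±0.05 from the VADER documentation; whether the project kept or tuned these thresholds is not stated here, and the scores are invented.

```python
from collections import Counter

def label_from_compound(compound: float, threshold: float = 0.05) -> str:
    """Map a VADER-style compound score to a three-way sentiment label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"

# Invented compound scores standing in for analysed tweets.
scores = [0.62, -0.48, 0.01, 0.05, -0.05]
labels = [label_from_compound(score) for score in scores]
counts = Counter(labels)
```

Counting labels per day or per month then gives the aggregates that a daily and monthly visualisation would plot.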
Contact & Socials
Gmail: chiragajay.jain@gmail.com
LinkedIn: linkedin.com/in/chiragajain
GitHub: github.com/Chirag-65-Jain
Kaggle: kaggle.com/chiragajain
© Copyright 2025 | Made by Chirag Jain