Projects


AI Coding Agents: Evaluation & Benchmarking Framework

Designed and delivered an evaluation framework to benchmark AI coding agents (e.g., GPT-5.x, Claude Sonnet) on complex, real-world engineering tasks. Built containerized Python workflows that stress-tested model reasoning and systematically identified failure modes. Established oracle solutions and automated verification tests to measure accuracy, multi-step reasoning, and instruction adherence, combining production-grade engineering with AI evaluation to surface actionable insights into the reasoning capabilities and robustness of modern AI agents.
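The oracle-plus-verification pattern can be illustrated with a minimal sketch. The toy task (`slugify`), the function names, and the test inputs below are hypothetical stand-ins chosen for illustration; they are not the framework's actual tasks or API.

```python
from typing import Callable, Iterable


def oracle_slugify(text: str) -> str:
    """Hypothetical oracle: the reference solution for a toy task."""
    return "-".join(text.lower().split())


def candidate_slugify(text: str) -> str:
    """Hypothetical agent output with a planted bug: it never lowercases."""
    return "-".join(text.split())


def pass_rate(candidate: Callable[[str], str],
              oracle: Callable[[str], str],
              inputs: Iterable[str]) -> float:
    """Fraction of inputs on which the candidate matches the oracle."""
    cases = list(inputs)
    passed = 0
    for case in cases:
        try:
            if candidate(case) == oracle(case):
                passed += 1
        except Exception:
            # A crash counts as a failed case rather than aborting the run.
            pass
    return passed / len(cases)


# Hypothetical verification inputs for the toy task.
TEST_INPUTS = ["hello world", "Mixed Case Title", "  extra   spaces  ", "single"]

score = pass_rate(candidate_slugify, oracle_slugify, TEST_INPUTS)
print(f"pass rate: {score:.2f}")
```

Comparing against an oracle rather than hard-coded expected strings keeps the verification tests valid when the input set is extended; a real harness would also isolate each run (for example in a container) and enforce timeouts.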


Lead Scoring: End-to-End Machine Learning Pipeline

Developed an end-to-end machine learning pipeline in Python for a customer propensity model that predicts each lead's likelihood to convert from enriched demographic and behavioral data, so that high-value leads can be prioritized. Engineered a modular, production-grade system covering data ingestion, feature engineering, automated model development, and API deployment. Used scikit-learn, pandas, and AutoML frameworks to train and evaluate a suite of models (including Logistic Regression, Random Forest, and Gradient Boosting) with systematic hyperparameter tuning and cross-validation. Exposed the final tuned model through a Flask REST API for real-time lead scoring, supporting data-driven marketing strategy and resource allocation.


Customer Segmentation Using K-Means Clustering

Conducted a customer segmentation analysis using K-Means clustering and Principal Component Analysis (PCA). Built predictive models, including Logistic Regression, Decision Forest, and Neural Networks, to forecast customer responses, achieving up to a 99% F1 score and a 0.99 AUC. Summarized actionable business insights to inform personalized and cross-channel marketing strategies. Used Python and pandas for data preprocessing, exploratory data analysis, and model implementation.
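A minimal sketch of the PCA-then-K-Means step, assuming NumPy is available. It uses a hand-written SVD projection and Lloyd's algorithm in place of a library implementation, and the data are synthetic; the feature count, cluster count, and function names are assumptions for illustration, not the project's actual setup.

```python
import numpy as np


def pca_project(X: np.ndarray, n_components: int) -> np.ndarray:
    """Project rows of X onto the top principal components via SVD."""
    centered = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def kmeans(X: np.ndarray, k: int, n_iter: int = 50, seed: int = 0) -> np.ndarray:
    """Lloyd's algorithm; returns a cluster label for each row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each center; keep the old one if a cluster is empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Synthetic data: two well-separated groups of 50 "customers", 5 features each.
rng = np.random.default_rng(42)
group_a = rng.normal(loc=0.0, scale=0.5, size=(50, 5))
group_b = rng.normal(loc=5.0, scale=0.5, size=(50, 5))
X = np.vstack([group_a, group_b])

X_2d = pca_project(X, n_components=2)   # reduce to 2 dimensions first
labels = kmeans(X_2d, k=2)              # then cluster in the reduced space
print(np.bincount(labels))              # cluster sizes
```

Reducing dimensionality before clustering is one common ordering; clustering on the original features and using PCA only for visualization is an equally reasonable alternative.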


Market Research: Electric Vehicles and the Barriers to EV Adoption

Conducted market research to identify barriers to electric vehicle (EV) adoption. Designed a consumer research survey in Qualtrics to collect quantitative and qualitative data from over 100 participants, focusing on EV charging times, efficiency, and accessibility. Performed correlation, regression, and ANOVA analyses in IBM SPSS to measure the impact of charging times on EV purchase decisions among car buyers.


Data Analytics & Engineering: Post-Pandemic Food Delivery Trends

Conducted an exploratory data analysis of food delivery trends and U.S. food price inflation data from 1968 to 2024 using Python and pandas. Prepared categorical variables with one-hot encoding, then fit multiple OLS regression models to analyze the impact of post-pandemic inflation rates on the frequency of food delivery orders and to examine other trends and factors influencing food delivery services.
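A minimal sketch of one-hot encoding followed by an OLS fit, assuming NumPy is available. It uses `np.linalg.lstsq` rather than a statistics package, and the variables (`inflation`, `region`, `orders`) and their values are synthetic and hypothetical, not the project's data.

```python
from typing import List, Sequence, Tuple

import numpy as np


def one_hot(values: Sequence[str], drop_first: bool = True) -> Tuple[np.ndarray, List[str]]:
    """Encode a categorical column as 0/1 indicator columns.

    Dropping the first level avoids perfect collinearity with the intercept.
    """
    levels = sorted(set(values))
    kept = levels[1:] if drop_first else levels
    matrix = np.array([[1.0 if v == lvl else 0.0 for lvl in kept] for v in values])
    return matrix, kept


def fit_ols(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Least-squares coefficients, with the intercept in position 0."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


# Synthetic, noise-free data for illustration only.
inflation = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
region = ["urban", "rural", "urban", "rural", "urban", "rural"]

# "rural" is the dropped baseline level, so one "urban" indicator remains.
region_dummies, kept_levels = one_hot(region)

# Orders generated as 10 - 0.5 * inflation + 2 * is_urban.
orders = 10.0 - 0.5 * inflation + 2.0 * region_dummies[:, 0]

X = np.column_stack([inflation, region_dummies])
intercept, b_inflation, b_urban = fit_ols(X, orders)
print(round(intercept, 3), round(b_inflation, 3), round(b_urban, 3))
```

Because the synthetic target is an exact linear function of the inputs, the fit recovers the generating coefficients; with real data the estimates would carry noise, and standard errors would be needed to interpret them.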


Artificial Intelligence Case Study: J.P. Morgan Chase

Conducted an in-depth analysis of OmniAI, J.P. Morgan’s internally built intelligent information system, examining its development, strategic adoption, and impact on operational efficiency and customer engagement. Examined the role of Amazon SageMaker in building and deploying machine learning models at J.P. Morgan. Analyzed the impact of AI on reducing operational costs and increasing the speed and accuracy of data analysis.