About Me
I'm a Data Scientist and Machine Learning Engineer pursuing my Master's in Data Science at the University of Texas at Arlington.
With expertise in AI/ML, Deep Learning, and Cloud Technologies, I specialize in building scalable solutions that transform raw data into actionable insights.
I also work as an AI and ML Research Assistant at The University of Texas at Arlington under Dr. Eric Jones Jr., collaborating on applied machine learning and AI research.
My work spans GenAI, healthcare, financial analytics, NLP, computer vision, and MLOps, with a passion for deploying production-ready machine learning systems that make a real-world impact.
Featured Projects
QuantTrade AI – GenAI Equity Research Copilot
Hybrid GenAI fintech platform that fuses predictive models, financial document retrieval, sentiment analysis, and LLM-based reasoning into a unified decision-support system for equity research and portfolio intelligence.
NegotiationArena – OpenEnv Salary Negotiation RL
Reinforcement learning environment for salary, equity, and start-date negotiation built on OpenEnv, with a rule-based Challenger for training, a Qwen2.5-1.5B (4-bit) GRPO agent via Unsloth, and a Gradio-powered HF Spaces demo for interactive negotiations.
NASA Turbofan Engine Degradation – RUL Prediction
Interactive Streamlit application for predicting the Remaining Useful Life (RUL) of turbofan engines using the NASA C-MAPSS FD001 dataset. Features comprehensive EDA, feature engineering, regression and classification models (XGBoost, Random Forest, SVR, SVC), model comparisons, and real-time predictions with pre-trained models.
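As a rough illustration of the labeling step behind RUL prediction: in run-to-failure data such as the C-MAPSS training trajectories, the RUL at a given cycle is commonly taken as the engine's last observed cycle minus the current cycle. The sketch below assumes that convention. The function names, the tuple layout, and the 30-cycle threshold for a binary "near failure" class are hypothetical choices for illustration, not the project's actual code.

```python
from collections import defaultdict


def add_rul_labels(rows):
    """Attach a RUL label to each (unit_id, cycle) record.

    Assumes every engine in `rows` was run to failure, so the last
    observed cycle of a unit is its failure point (RUL = 0 there).
    """
    last_cycle = defaultdict(int)
    for unit_id, cycle in rows:
        last_cycle[unit_id] = max(last_cycle[unit_id], cycle)
    return [(unit_id, cycle, last_cycle[unit_id] - cycle) for unit_id, cycle in rows]


def rul_to_class(rul, threshold=30):
    """Binary label for a classifier: 1 if failure is near, else 0.

    The threshold of 30 cycles is a hypothetical value for illustration.
    """
    return 1 if rul <= threshold else 0


if __name__ == "__main__":
    sample = [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2)]
    for unit_id, cycle, rul in add_rul_labels(sample):
        print(unit_id, cycle, rul, rul_to_class(rul))
```

Regression models such as XGBoost or SVR would then be fit on the numeric RUL, while classifiers such as SVC would use the thresholded label.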
Sepsis Prediction System
Advanced ML model using XGBoost and ensemble methods to predict sepsis onset early, trained on a comprehensive synthetic dataset. Achieves high accuracy in early detection, which is critical for ICU patient care and treatment planning.
EduKrishnaa - Career Guidance System
AI-powered career guidance system for students (10th, 12th, UG) using psychometric and aptitude tests with multiclass classification to recommend career paths, job roles, roadmaps, and higher study options. The accompanying research paper was published by Springer Nature.
CI/CD Pipeline Dashboard
Real-time analytics dashboard for monitoring CI/CD pipelines with Docker containerization, AWS cloud infrastructure integration, automated testing workflows, deployment metrics, and performance analytics for DevOps teams.
Application Logs ETL Pipeline
End-to-end ETL pipeline processing structured and semi-structured application log data with timestamp normalization, service-level metric computation, and ML-based latency degradation prediction. Delivers transformed datasets, metric reports, and predictive models with performance analytics.
KSAT Quest - Regression Runoff
Predicts soil's saturated hydraulic conductivity using UKSAT data via preprocessing, feature selection, and modeling, evaluated with RMSLE/R² to aid sustainable hydrological modeling.
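For reference, the two evaluation metrics named above can be written out in a few lines. This is a minimal standard-library sketch of the usual definitions of RMSLE and R², not the project's evaluation code; the function names are chosen here for illustration.

```python
import math


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error; inputs must be >= 0."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(math.log1p(p) - math.log1p(t)) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1 - ss_res / ss_tot
```

RMSLE penalizes relative rather than absolute error, which tends to suit targets that span several orders of magnitude, as hydraulic conductivity often does.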
NYC Taxi Trip Duration Prediction
Predictive modeling project forecasting taxi trip durations using historical NYC taxi data. Includes comprehensive data preprocessing, exploratory data analysis (EDA), feature engineering, and an XGBoost-based predictive model to explore urban taxi trip dynamics.
Disaster Impact Prediction Model
Comprehensive ML model predicting reconstruction costs, injury risks, regional impact, first responder requirements, resource availability, evacuation plans, and shelter needs during disasters.
E-Voting dApp - Blockchain Voting
Fully decentralized e-voting system built on the Ethereum blockchain, ensuring transparency, security, and immutability of voting records using smart contracts.
Diabetic Retinopathy Diagnosis
ML-based medical diagnosis system for detecting diabetic retinopathy from retinal images, aiding early detection and treatment of this vision-threatening condition.
Breast Cancer Cell Classification
Machine learning model for classifying breast cancer cells, helping in early diagnosis and treatment planning through automated cell analysis.
An Eye for Sightless
Computer vision system for visually impaired individuals that uses a Kinect sensor to identify known persons and provide audio guidance for navigating corridors in familiar or unfamiliar spaces.
Research & Publications
EduKrishnaa: A Career Guidance Web Application Based on Multi-intelligence Using Multiclass Classification Algorithm
2023
Yash Joshi, Shreyas Ajgaonkar, Pravin Tale, Pranav Jore, Mrunmayee Jakate, Snehal Lavangare, Deepali Kadam
Multi-disciplinary Trends in Artificial Intelligence (MIWAI 2023), Lecture Notes in Computer Science, Springer Nature
Web-based career guidance system that analyzes user profiles with personality and technical tests using multiclass classification algorithms. Random Forest achieved 95.45% accuracy. Provides diverse career options, job opportunities, skill-building courses, projects, internships, and startup suggestions for students.
Early Sepsis Prediction Using Machine Learning in ICU Patients Using XGBoost and LSTM
2024
Yash Joshi et al.
IEEE Conference on Healthcare Informatics 2024
Novel approach combining XGBoost and LSTM for early sepsis detection with 98.7% accuracy...
My Journey
AI & ML Research Assistant
The University of Texas at Arlington (with Dr. Eric Jones Jr.)
Supporting applied AI and machine learning research, contributing to model development, experimentation, and data-driven insights across academic and real-world problem domains.
Master of Science in Data Science
The University of Texas at Arlington
CGPA: 4.0/4.0. Coursework: Big Data Management, Data Science, Data Visualization, Machine Learning, R Programming, Statistics.
Hackathon UTA - 1st Place
Team Lead
Led the team to first place in the University of Texas at Arlington hackathon.
Data Engineer
Larsen and Toubro Private LTD.
Configured data ingestion pipelines connecting SAP ERP, SQL Server, and flat-file feeds. Engineered ETL workflows in Python (Pandas, SQLAlchemy) and Airflow, reducing manual reporting time by 35%. Optimized pipeline performance, cutting data refresh latency from 2 hours to 25 minutes.
Bachelor's in Computer Science
Mumbai University
GPA: 8.5/9.0. Comprehensive foundation in computer science principles including algorithms, data structures, software engineering, database systems, operating systems, and computer networks. Developed strong problem-solving skills and technical expertise.
Smart India Hackathon - Finalist (2nd Place)
Team Leader
Led the team to a 2nd-place finish as finalists in the national-level Smart India Hackathon.
AI & Machine Learning Engineer
Universal Recycle Solutions
Designed and deployed Power BI dashboards connected to SQL Server. Developed an ML pipeline (Random Forest, Gradient Boosting) to forecast waste collection volume 7-14 days ahead. Cut overtime costs by an estimated 12% per quarter through predictive analytics.
Hack Overflow - 1st Place
Team Lead
Led the team to first place in the Hack Overflow hackathon.
State Level Paper Presentation - 1st Place
Presenter
Achieved first place in state-level paper presentation competition.
Technical Diploma in Computer Science
Diploma Program
Percentile: 92.00%. Intensive technical training in computer science fundamentals including programming languages, software development, web technologies, and system administration. Built a strong technical foundation through hands-on projects and practical applications.