Bengaluru, Karnataka, India
I'm a Data Science postgraduate at Great Lakes Institute of Management, with a background in Electronics & Communication Engineering. I specialize in turning raw, complex data into clear insights that drive real business decisions.
What makes my profile different: I spent 2 years as an Associate Consultant at Great Learning, working directly with CRM data, tracking 500+ client records, and applying data-driven segmentation to improve outreach efficiency by 8–10% and enrollment conversion by 6–8%. That hands-on, client-facing background means I can not only build models but also communicate findings clearly to non-technical stakeholders.
Recent highlights:
- 📦 PostBot — Processed 1.6L+ India Post records, engineered BO/PO ratio features, and deployed a CatBoost model with R² = 0.92 as a Streamlit decision-support system for district delivery efficiency
- 📊 SaaS Analytics Pipeline — Generated 677K+ rows of synthetic SaaS data, built a full MySQL pipeline, and uncovered key insights: trial churn at 40.5%, Enterprise CLV 17.6× higher than Starter, and an anomaly flagged at Z-Score 4.51
- 🏏 IPL Win Probability — Ball-by-ball XGBoost predictor trained on 1,169 matches, deployed as a live Tableau dashboard with real-time simulation
- 🤖 AI-Powered Apps — Built multiple Streamlit apps using Gemini AI and Groq for resume analysis, defect detection, and a full RAG pipeline for LLM-powered document Q&A
- 🎭 Face Mask Detection — Built during Data Science internship at Zephyr Technologies using Scikit-learn on 2,000+ image samples, reducing manual reporting by 30%
🎓 Currently: Data Science & Generative AI — Great Lakes Institute of Management (ML · SQL · Business Analytics · LLMs · GenAI)
💼 Open to: Data Analyst · Business Analyst · Data Scientist — Bengaluru or Remote
Python · CatBoost · XGBoost · Random Forest · Streamlit · EDA · Feature Engineering
- Diagnosed inefficiencies in India Post's postal network where districts with high office counts still showed poor delivery performance
- Processed 1.6L+ records, performed deep EDA, and engineered domain features like BO/PO ratios to capture structural imbalances
- Trained and compared Random Forest, XGBoost, and CatBoost models; CatBoost performed best with R² = 0.92
- Deployed PostBot, a Streamlit-based decision support system that generates district-level recommendations to improve delivery efficiency
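
The feature engineering and evaluation steps above can be illustrated with a minimal sketch. It assumes "BO/PO ratio" means branch offices per post office, which the project notes do not spell out; the function names, the zero-division fallback, and the district rows are hypothetical and not taken from the project's code or from India Post data.

```python
def bo_po_ratio(branch_offices, post_offices):
    """Branch offices per post office (assumed meaning of the BO/PO ratio).

    Returns 0.0 when a district has no post offices, to avoid dividing by zero.
    """
    if post_offices == 0:
        return 0.0
    return branch_offices / post_offices


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical district rows (made-up numbers, for illustration only).
districts = [
    {"district": "A", "bo": 120, "po": 30},
    {"district": "B", "bo": 45, "po": 45},
    {"district": "C", "bo": 10, "po": 0},
]

# Add the engineered feature to each row.
for row in districts:
    row["bo_po_ratio"] = bo_po_ratio(row["bo"], row["po"])
```

A ratio like this puts districts of very different sizes on a comparable scale, which is one plausible reason such a feature helps a tree-based regressor.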
Python · MySQL · SQL · Tableau · EDA · Excel · RFM Segmentation · Cohort Analysis
- Designed an end-to-end analytics solution to monitor customer churn, revenue trends, and user behavior for a SaaS business
- Generated 677,000+ rows of realistic synthetic SaaS data across 5 MySQL tables covering billing, usage, and customer data
- Wrote advanced SQL queries for KPI analysis, cohort retention, RFM segmentation, and anomaly detection
- Key findings: trial plan churn at 40.5%, Enterprise CLV 17.6× higher than Starter (₹85,066 vs ₹4,830), and a payment anomaly flagged at Z-Score 4.51 in August 2023
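
As a rough illustration of the Z-score anomaly check and the CLV comparison above, here is a minimal pure-Python sketch. The project itself ran this analysis in SQL; the function names, the threshold of 3.0, the use of the population standard deviation, and the monthly payment values are all assumptions made for the example. Only the two CLV figures come from the findings.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Z-score of each value, using the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def flag_anomalies(values, threshold=3.0):
    """Indices of values whose absolute Z-score exceeds the threshold."""
    return [i for i, z in enumerate(z_scores(values)) if abs(z) > threshold]


# Hypothetical monthly payment totals; the last month is an obvious spike.
monthly_payments = [100, 102, 98, 101, 99, 100, 103, 97, 100, 101, 99, 100, 250]
anomalous_months = flag_anomalies(monthly_payments)

# CLV ratio from the reported figures: Enterprise vs Starter (about 17.6).
clv_ratio = 85_066 / 4_830
```

With a population standard deviation, a single point's Z-score cannot exceed the square root of (n - 1), so short series need a lower threshold or more history before a large spike can register.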
Python · XGBoost · Scikit-learn · Tableau · EDA · Pandas
- Ball-by-ball win probability predictor trained on 1,169 IPL matches using XGBoost
- Built a full ETL → feature engineering → model tuning → deployment pipeline
- Deployed as a live Tableau dashboard for real-time win-probability simulation during a match
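
To make the ball-by-ball idea concrete, the sketch below computes the kind of match-state features a win-probability model for a run chase might consume. The feature set, the function name, and the 120-ball innings default are assumptions for illustration; the project's actual features are not listed here.

```python
def chase_features(target, runs_scored, balls_bowled, wickets_lost, total_balls=120):
    """Match-state features for the chasing side after a given ball.

    total_balls defaults to 120 (20 overs of 6 balls), as in a T20 innings.
    """
    runs_left = target - runs_scored
    balls_left = total_balls - balls_bowled
    # Runs per over so far; 0.0 before the first ball is bowled.
    current_rr = runs_scored * 6 / balls_bowled if balls_bowled else 0.0
    # Runs per over still needed; infinite if runs remain but no balls do.
    if balls_left > 0:
        required_rr = max(runs_left, 0) * 6 / balls_left
    else:
        required_rr = 0.0 if runs_left <= 0 else float("inf")
    return {
        "runs_left": runs_left,
        "balls_left": balls_left,
        "wickets_in_hand": 10 - wickets_lost,
        "current_run_rate": current_rr,
        "required_run_rate": required_rr,
    }


# Hypothetical state: chasing 180, 90 runs scored after 10 overs, 3 wickets down.
features = chase_features(target=180, runs_scored=90, balls_bowled=60, wickets_lost=3)
```

One such row per delivery, labeled with the eventual match result, is a common way to frame win probability as a binary classification problem for a model like XGBoost.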
Python · Streamlit · Gemini AI · GenAI · Computer Vision · Multimodal LLM
- Built a Streamlit web app where users upload images of building structures to detect and analyze structural defects
- Powered by Gemini's multimodal vision model for real-time defect identification and safety recommendations
Python · Streamlit · Gemini AI · GenAI · NLP · LLM · Prompt Engineering
- AI-powered resume scoring tool that evaluates a candidate's CV against a given job description and designation
- Provides detailed, actionable feedback and improvement suggestions using Gemini AI
Python · RAG · LLM · GenAI · Gemini · Groq · Vector DB · Prompt Engineering
- Implemented a full Retrieval-Augmented Generation (RAG) pipeline that enhances LLM responses with real-time document retrieval
- Combines Groq (LLaMA) and Gemini with vector search to ground answers in uploaded documents rather than pre-trained knowledge alone
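
The retrieval step of a RAG pipeline can be sketched without any external services. The example below is a toy stand-in: it swaps the vector database and learned embeddings for bag-of-words counts and cosine similarity, and it stops at building the prompt instead of calling Groq or Gemini. All function names and sample chunks are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (a real pipeline would use a model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(query, chunks, k=2):
    """Ground the question in retrieved context before sending it to an LLM."""
    context = "\n".join(f"- {c}" for c in retrieve(query, chunks, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


# Hypothetical document chunks standing in for an uploaded file.
chunks = [
    "Refunds are processed within 7 days.",
    "The office is closed on Sundays.",
    "Shipping takes 3 to 5 business days.",
]
top_chunk = retrieve("How long do refunds take?", chunks, k=1)
prompt = build_prompt("How long do refunds take?", chunks, k=1)
```

Because the prompt carries the retrieved text, the model can answer from the uploaded document even when the fact is absent from its training data.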
| Role | Company | Period |
|---|---|---|
| Associate Consultant | Great Learning | Jul 2023 – Jun 2025 |
| Data Science Intern | Zephyr Technologies & Solutions | Jul 2022 – Aug 2022 |
| Degree | Institution | Year |
|---|---|---|
| PG Program — Data Science & Generative AI | Great Lakes Institute of Management, Bangalore | 2025 – Present |
| B.E. — Electronics & Communication Engineering | NMAM Institute of Technology, Nitte | 2019 – 2023 |
"Data is not just numbers — it's the story behind every business decision."
📂 Portfolio: murali-manohara.github.io | 📧 muralimanohara661@gmail.com