B.Tech in Computer Science & Engineering | IIITDM Kancheepuram

Aspiring AI Research Engineer | Focus: Computer Vision, Multimodal ML, Natural Language Processing
I am a 5th-semester Computer Science student at IIITDM Kancheepuram (CGPA: 8.67) dedicated to advancing the state of the art in Artificial Intelligence. I am actively building a research portfolio aimed at top-tier MS programs and global R&D roles.
- 🔭 Current Focus: Multimodal Visual Question Answering for sports video understanding — fine-tuning LLaVA-1.5-7B with LoRA on a custom-annotated Cricket VQA dataset.
Building a domain-specific multimodal VQA system for T20 broadcast footage. The work involves constructing a custom-annotated dataset of 500+ frames across five question types (counting, spatial, formation, context, and tactical delivery prediction) and fine-tuning LLaVA-1.5-7B with LoRA adapters.
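To give a sense of why LoRA keeps fine-tuning a 7B model affordable, the sketch below counts trainable parameters for a single adapted weight matrix. LoRA freezes a weight of shape `d_out × d_in` and learns two low-rank factors of shapes `d_out × r` and `r × d_in`. The 4096 × 4096 projection size and the rank of 8 are assumed values for illustration, not this project's configuration, and `lora_param_counts` is a hypothetical helper.

```python
def lora_param_counts(d_out: int, d_in: int, rank: int) -> tuple:
    """Return (frozen full-matrix params, trainable LoRA params) for one layer."""
    full = d_out * d_in            # parameters in the frozen weight W
    lora = rank * (d_out + d_in)   # parameters in the factors B (d_out x r) and A (r x d_in)
    return full, lora


# Assumed sizes for illustration only.
full, lora = lora_param_counts(4096, 4096, rank=8)
print(f"full: {full:,}  lora: {lora:,}  trainable fraction: {lora / full:.2%}")
```

Under these assumed sizes the adapter trains well under 1% of the parameters in the layer it wraps.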
End-to-end LightGBM win-prediction system trained on 703 Cricsheet matches. It uses 63 engineered features spanning Elo ratings, head-to-head (H2H) history, venue familiarity, phase-by-phase batting/bowling ratings, and toss dynamics. Achieved 93.5% accuracy on held-out 2022 & 2024 World Cup data. Deployed as a GitHub Pages site.
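As an illustration of one feature family mentioned above, a team rating update in the standard Elo scheme can be sketched as follows. The K-factor of 32, the 1500 starting rating, and the function names are assumptions for this sketch, not values taken from the project.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that team A beats team B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0):
    """Return the updated (rating_a, rating_b) after one match."""
    score_a = 1.0 if a_won else 0.0
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # The update is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


# Two evenly rated teams (assumed starting rating); team A wins.
team_a, team_b = update_elo(1500.0, 1500.0, a_won=True)
print(team_a, team_b)  # 1516.0 1484.0
```

When ratings are used as model features, they would typically be computed from matches played before the one being predicted, so that the outcome does not leak into the inputs.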
Fine-tuning IndicBART for low-resource Telugu→English NMT using RAMP data augmentation on the Samanantar corpus. Exploring back-translation and transfer learning to improve BLEU scores.
Schema-aware Text-to-SQL system built by fine-tuning T5-small on the Spider dataset. The focus is on domain adaptation and on improving generalization to complex multi-table queries.
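One common way to make a sequence-to-sequence model schema-aware is to serialize the database schema into the input text alongside the question. The sketch below assumes a simple `question | table: col, col | ...` layout. The format, the function name `serialize_input`, and the toy schema are illustrative assumptions, not this project's actual preprocessing.

```python
from typing import Dict, List


def serialize_input(question: str, schema: Dict[str, List[str]]) -> str:
    """Flatten a question and a {table: [columns]} schema into one input string."""
    tables = [f"{table}: {', '.join(columns)}" for table, columns in schema.items()]
    return " | ".join([question.strip()] + tables)


# Toy schema for illustration only.
schema = {
    "singer": ["singer_id", "name", "country"],
    "concert": ["concert_id", "singer_id", "year"],
}
print(serialize_input("How many singers are from France?", schema))
```

The resulting string would then be tokenized and passed to the model as its source sequence, with the SQL query as the target.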
- Languages: Python, C++, C, Verilog, SQL
- AI/ML Frameworks: PyTorch, Hugging Face Transformers, Scikit-learn, LightGBM
- Libraries: OpenCV, NumPy, Pandas, Matplotlib, Seaborn
- Research Interests: Multimodal Machine Learning, Computer Vision, Natural Language Processing, Reinforcement Learning, Time Series Forecasting
- LinkedIn: linkedin.com/in/tarak-ram-alahari-a49226333
- Email: alaharitarak@gmail.com
"Precision in code, innovation in research."