
Alim Igilik — Mindcore

Probabilistic modeling · Stochastic systems · Computational mathematics

I work at the intersection of applied probability, statistical learning, and computational methods — with a focus on building mathematically grounded models of complex, uncertainty-driven systems.


Research Interests

| Area | Focus |
| --- | --- |
| Stochastic Systems | Probability theory, mathematical statistics, discrete/continuous-time stochastic processes |
| Probabilistic Forecasting | Count process modeling, calibrated uncertainty quantification, extreme-event prediction |
| Statistical Learning | Inference under model misspecification, hybrid neural-statistical architectures |
| Computational Mathematics | Numerically stable, reproducible implementations of probabilistic models |

Long-term direction: quantitative modeling at the intersection of stochastic processes, probabilistic inference, and high-dimensional statistical learning — with applications in quantitative finance and risk-driven systems.


Projects

Neural Negative Binomial Regression for Weekly Seismicity Forecasting

arXiv · GitHub

End-to-end probabilistic pipeline for spatial-temporal earthquake occurrence modeling.

Core contributions:

  • Demonstrated via a boundary-corrected likelihood-ratio test ($p < 10^{-179}$) that the Poisson assumption is systematically violated in Central Asian seismicity data (2010–2024)
  • Designed EarthquakeNet: per-cell overdispersion estimation via spatial embeddings + MLP, replacing the standard global-α negative binomial assumption
  • Walk-forward evaluation (2018–2023): 8.6% reduction in mean pinball deviation vs. NB-GLM baseline; 12.5% lower CRPS in the tail regime (Y ≥ 5)
  • Full reproducible pipeline: data download → feature engineering → training → reporting, one-command rerun
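
The overdispersion test in the first bullet can be illustrated with a short sketch. This is not the repository's code. It assumes an intercept-only NB2 model (variance μ + αμ²), a simple grid search over α, and the usual 50:50 mixture of χ²₀ and χ²₁ as the null distribution, since α = 0 sits on the boundary of the parameter space. All function names are hypothetical.

```python
import math


def poisson_loglik(counts, mu):
    """Poisson log-likelihood at a common mean mu (assumes mu > 0)."""
    return sum(y * math.log(mu) - mu - math.lgamma(y + 1) for y in counts)


def nb_loglik(counts, mu, alpha):
    """NB2 log-likelihood with mean mu and overdispersion alpha (r = 1/alpha)."""
    r = 1.0 / alpha
    log_q = math.log(r / (r + mu))   # log of r / (r + mu)
    log_p = math.log(mu / (r + mu))  # log of mu / (r + mu)
    return sum(
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * log_q + y * log_p
        for y in counts
    )


def boundary_lrt(counts):
    """Likelihood-ratio test of Poisson (alpha = 0) against NB2 (alpha > 0).

    Returns (statistic, p_value, alpha_hat). Assumes at least one nonzero count.
    """
    # For an intercept-only model the MLE of the mean is the sample mean
    # under both the Poisson and the NB2 likelihood.
    mu = sum(counts) / len(counts)
    ll_pois = poisson_loglik(counts, mu)

    # Illustrative choice: log-spaced grid for alpha between 1e-6 and 1e2.
    grid = [10 ** (-6 + 8 * i / 400) for i in range(401)]
    ll_nb, alpha_hat = max((nb_loglik(counts, mu, a), a) for a in grid)

    stat = max(0.0, 2.0 * (ll_nb - ll_pois))
    if stat == 0.0:
        return stat, 1.0, alpha_hat
    # Boundary correction: halve the chi-square(1) tail probability.
    # P(chi2_1 > x) = erfc(sqrt(x / 2)).
    p_value = 0.5 * math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value, alpha_hat
```

Calling `boundary_lrt` on weekly counts for a single grid cell returns the test statistic, the corrected p-value, and the grid estimate of α. A model with covariates would need a proper optimizer in place of the grid.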

Python · PyTorch · NumPy · SciPy · Stochastic Processes · Count Models · Spatial-Temporal


IELTS Automated Essay Scoring — Hybrid DeBERTa Regressor

GitHub

Research pipeline for predicting IELTS-style essay band scores (0–9).

Core contributions:

  • Engineered 21 linguistic features (lexical diversity, syntactic complexity, coherence proxies) as a structured tabular representation of essay quality
  • Designed a hybrid DeBERTa regressor jointly processing transformer embeddings and tabular features
  • Topic-grouped cross-validation with out-of-fold evaluation to prevent topic-level data leakage — a common failure mode in AES benchmarks
  • Benchmarked classical ML baselines (XGBoost, LightGBM, Ridge) against the neural model; full reproducible pipeline via single-command scripts
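
To make the topic-grouped cross-validation concrete, here is a minimal sketch of how such folds could be built. It is an illustration under assumptions, not the project's implementation: the function names are hypothetical, and the greedy size-balancing rule is one simple choice among several. The key property is that every essay on a given topic lands in the same fold, so no topic appears in both the training and the held-out split.

```python
from collections import defaultdict


def topic_grouped_folds(topics, n_folds=5):
    """Assign essay indices to folds so that each topic stays in one fold."""
    by_topic = defaultdict(list)
    for idx, topic in enumerate(topics):
        by_topic[topic].append(idx)

    # Greedy balancing: place the largest topics first, each into the
    # fold that currently holds the fewest essays.
    folds = [[] for _ in range(n_folds)]
    ordered = sorted(by_topic, key=lambda t: (-len(by_topic[t]), str(t)))
    for topic in ordered:
        smallest = min(range(n_folds), key=lambda k: len(folds[k]))
        folds[smallest].extend(by_topic[topic])
    return folds


def out_of_fold_splits(topics, n_folds=5):
    """Yield (train_indices, test_indices) pairs, one per fold."""
    folds = topic_grouped_folds(topics, n_folds)
    all_indices = set(range(len(topics)))
    for held_out in folds:
        test = sorted(held_out)
        train = sorted(all_indices - set(held_out))
        yield train, test
```

Each essay appears in exactly one held-out fold, so collecting predictions across folds gives one out-of-fold prediction per essay from a model that never saw that essay's topic.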

Python · PyTorch · HuggingFace · DeBERTa · XGBoost · NLP · Feature Engineering


LLM Long-Form Video Generation System

GitHub

Modular asynchronous pipeline for automated long-form content generation and multimodal orchestration.

  • Async data ingestion, LLM-driven narrative structuring, TTS synthesis, image generation, and video rendering
  • Cost-controlled LLM orchestration (Claude / Gemini APIs) with config-driven reproducibility
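
One way to picture cost-controlled asynchronous orchestration is a shared budget tracker combined with a concurrency limit. The sketch below is hypothetical throughout: `CostTracker`, `fake_llm_call`, and the flat per-call cost are stand-ins, and no real Claude or Gemini API is called. It only shows the control flow, where each call reserves its cost before running and calls that would exceed the budget fail without stopping the rest.

```python
import asyncio


class BudgetExceeded(RuntimeError):
    """Raised when a call would push spending past the configured budget."""


class CostTracker:
    def __init__(self, budget_usd):
        self.budget_usd = budget_usd
        self.spent_usd = 0.0

    def charge(self, amount_usd):
        # Reserve the cost up front so an over-budget call never starts.
        if self.spent_usd + amount_usd > self.budget_usd:
            raise BudgetExceeded(f"budget of {self.budget_usd} USD exhausted")
        self.spent_usd += amount_usd


async def fake_llm_call(prompt, tracker, cost_usd=0.01):
    """Stand-in for a provider call; the cost per call is an assumption."""
    tracker.charge(cost_usd)
    await asyncio.sleep(0)  # placeholder for network latency
    return f"outline for: {prompt}"


async def run_stage(prompts, budget_usd, max_concurrency=2):
    """Run one pipeline stage with bounded concurrency and a spending cap."""
    tracker = CostTracker(budget_usd)
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(prompt):
        async with semaphore:
            return await fake_llm_call(prompt, tracker)

    # return_exceptions=True keeps successful results when some calls fail.
    results = await asyncio.gather(
        *(guarded(p) for p in prompts), return_exceptions=True
    )
    return results, tracker.spent_usd
```

Running `asyncio.run(run_stage(["intro", "body", "ending"], budget_usd=0.025))` completes the calls that fit within the budget and returns a `BudgetExceeded` instance in place of the one that does not.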

Python · Asyncio · Claude API · Gemini API · Systems Design · Multimodal AI


Technical Foundations

Mathematics:
Measure Theory · Probability Theory · Stochastic Processes · Statistics · Linear Algebra · Functional Analysis · Optimization · Numerical Methods

Engineering:

Languages:
Python · C++ · SQL · Bash · Java

ML / DL:
PyTorch · NumPy · SciPy · Pandas · HuggingFace Transformers · Scikit-learn · XGBoost · LightGBM · CatBoost

Tools:
Git · Docker · Linux · LaTeX


Philosophy

My work is grounded in a probability-first view of complex systems — where uncertainty is not treated as noise to be eliminated, but as a fundamental object of study.

Rather than relying on black-box or purely deterministic approaches, I aim to understand and model the underlying generative mechanisms of data through rigorous probabilistic frameworks.

This connects:

  • stochastic processes and time-evolving systems
  • calibrated probabilistic inference and statistical learning
  • numerically stable computational implementations

The goal is a consistent bridge between rigorous probability theory, computational mathematics, and modern data-driven modeling — applied to systems where getting the uncertainty right is as important as getting the point estimate right.


Email · LinkedIn · arXiv

Pinned

  1. seismic-probabilistic-modeling (Public)

    Code for EarthquakeNet - negative-binomial deep learning for overdispersed seismic count data. End-to-end USGS pipeline (Central Asia, 2010–2024): spatiotemporal grid, NB GLM, Hybrid DL NB, Neural …

    Jupyter Notebook

  2. llm-longform-video-research (Public)

    AI engineering toolkit for reproducible long-form narrative generation and multimodal assembly (LLM, calibration, TTS, browser automation, video).

    Python

  3. nlp-essay-scoring (Public)

    Research-oriented Automated Essay Scoring (AES) framework leveraging transformer architectures and semantic feature extraction for intelligent essay evaluation.

    Jupyter Notebook