KonNik88/README.md

Hi there

I’m Konstantin Nikiforov — MD, Molecular Geneticist (Russia) transitioning into Data Science & Machine Learning.
I like end-to-end work: data → features → models → APIs → simple UIs → orchestration.

Focus: applied ML for business (analytics & decision support), reproducible pipelines, model calibration & evaluation.
Interests: neural networks & AI, biomedical ML, causal inference, interpretable models, CV, NLP/linguistics, music, time series.


Stack

Core: Python · SQL · Pandas · NumPy · scikit-learn · CatBoost · XGBoost · LightGBM · Optuna
DL/NLP: PyTorch · Hugging Face · Sentence-BERT · (SimCLR/BYOL, Diffusion — exploring)
Recsys: ALS (implicit) · Hybrid SBERT+ALS+CatBoost · Qdrant (vector DB)
Time Series: Prophet · TBATS · ETNA/AutoTS · backtesting (rolling/holdout)
MLOps/App: FastAPI · Streamlit · Airflow · Docker/Compose · MLflow
Data & Scale: Spark (PySpark) · PostgreSQL · MySQL
XAI/Visualization: SHAP · LIME · Plotly/Dash · Matplotlib · ydata-profiling
Domains: Tabular ML · Recommenders · NLP · Time Series · BioML · CV

Exploring: Ray/Dask · GNNs · SSL · ESM · AudioML


Selected Projects

  • Hybrid Book Recommender System — CatBoost + ALS + SBERT · FastAPI + Streamlit · Docker · Qdrant
  • BlendCAL — Conversion Prediction — CatBoost/XGBoost/LightGBM ensemble · FastAPI · Streamlit · Airflow DAGs · Docker Compose
  • Model Drift Monitoring — Evidently + SHAP + PSI/JS checks · alert policy demo
  • Panel Time-Series Forecasting — ARIMA, TBATS, Prophet, Darts · Optuna-tuned baselines
  • Omics Survival Analysis — RNA-seq PCA + embeddings for bioinformatics
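
For readers unfamiliar with the PSI check named in the Model Drift Monitoring project above, here is a minimal sketch of a Population Stability Index computation. It is not taken from the repository: the function name `psi`, the equal-width binning over the reference range, and the `eps` floor for empty bins are all assumptions made for illustration.

```python
import math


def psi(expected, actual, n_bins=10, eps=1e-6):
    """Population Stability Index between a reference and a current sample.

    Hypothetical helper: bins are equal-width over the reference range,
    and empty bins are floored at `eps` to keep the logarithm finite.
    """
    lo, hi = min(expected), max(expected)
    # Fall back to a unit width if the reference sample is constant.
    width = (hi - lo) / n_bins or 1.0

    def shares(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            # Out-of-range values are clipped into the first or last bin.
            counts[min(max(idx, 0), n_bins - 1)] += 1
        total = len(values)
        return [max(c / total, eps) for c in counts]

    e, a = shares(expected), shares(actual)
    # PSI = sum over bins of (actual - expected) * ln(actual / expected)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


reference = [i / 100 for i in range(100)]
shifted = [v + 0.3 for v in reference]

print(psi(reference, reference))  # identical samples give zero drift
print(psi(reference, shifted))    # a shifted sample gives a clearly positive value
```

A commonly cited rule of thumb treats PSI above roughly 0.25 as a significant shift, but the thresholds used in any given alert policy are a project-level choice.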

Languages

English (B2) · French (B2)

Contact

Email: konnik1000@gmail.com · Telegram: @Konnik1988 · GitHub: https://github.com/KonNik88


TL;DR

MD molecular geneticist building practical DS/ML pipelines (Python, FastAPI, Streamlit, Airflow, Docker).

Popular repositories

  1. omics-survival-embeddings (Public)

    Benchmarking embedding methods (UMAP, VAE, PCA, FA, ICA, etc.) for survival prediction on omics data with TabNet, CatBoost and ridge models.

    Jupyter Notebook 3

  2. blendcal-conversion-prediction (Public)

    End-to-end ML pipeline for predicting conversion in web sessions: feature engineering, CatBoost+XGBoost+LightGBM ensemble with calibration, FastAPI service, Streamlit UI, Airflow DAG orchestration,…

    Jupyter Notebook 1

  3. heart-disease-ml-practice (Public)

    Practice notebook on heart-disease risk with a small/noisy dataset: EDA → preprocessing → classic ML baselines (scikit-learn). Not for clinical use.

    Jupyter Notebook 1

  4. pca-genotypes-africa (Public)

    Genome-wide PCA analysis and clustering of African populations (Namib/Angola project, Oliveira et al. 2023)

    Jupyter Notebook 1

  5. pca-rnaseq-analysis (Public)

    PCA and UMAP analysis of bulk RNA-seq (HL-60, GSE184891) and scRNA-seq (AML, GSE116256). Includes QC, visualization, clustering, and biological interpretation.

    Jupyter Notebook 1

  6. pancreatic-disease-prediction-ml (Public)

    Pancreatic disease prediction from biomarker tabular data (Debernardi et al., 2020): EDA, classical ML (CatBoost/LightGBM/XGBoost), PyTorch MLP, LightAutoML, Optuna HPO, and rigorous evaluation.

    Jupyter Notebook 1