Available for AI/ML Opportunities

AYUSH CHHOKER


I design and build intelligent AI systems that deliver real impact — from RAG architectures to LLM pipelines and scalable data solutions.

RAG Specialization

Hi there! I'm Ayush.

I'm a Data Science and Analytics graduate student at SUNY Polytechnic Institute, specializing in building intelligent AI systems that bridge research and production. My core focus areas are Retrieval-Augmented Generation (RAG), Graph RAG architectures, LLM fine-tuning, and scalable ML pipelines.

With hands-on expertise in Python, TensorFlow, PyTorch, LangChain, and modern AI frameworks, I build production-grade systems that go beyond prototypes — from real-time data pipelines to multi-hop reasoning engines powered by knowledge graphs.

My approach combines rigorous technical depth with practical engineering — whether architecting a cardiovascular risk prediction model with explainability or deploying a Graph RAG system for multi-document reasoning.

RAG Systems LLM Fine-tuning Graph RAG ML Pipelines Data Engineering Vector Search NLP Deep Learning

AI Research & RAG

Building cutting-edge RAG systems, Graph RAG with Neo4j, LLM fine-tuning, and intelligent information retrieval architectures.

Data Science & ML

End-to-end ML pipelines, predictive modeling, statistical analysis, and production-grade deployments of models with 90%+ accuracy.

Software Engineering

Scalable backend systems with FastAPI, real-time streaming pipelines, vector databases, and cloud-native deployments.

Tech Arsenal

Core Proficiency

Python 95%
RAG Systems 93%
LLM Engineering 90%
LangChain / LlamaIndex 88%
NLP & Transformers 86%
TensorFlow / PyTorch 85%
LLM Fine-tuning 84%
ML Deployment & APIs 83%

Tech Stack

AI / ML
PyTorch TensorFlow Scikit-learn Hugging Face OpenAI API Anthropic API LangChain LlamaIndex
Vector / Graph DB
FAISS Pinecone ChromaDB Neo4j MongoDB PostgreSQL
Data & Analytics
Pandas Spark Power BI Tableau IBM SPSS MLflow
Cloud / DevOps
FastAPI Docker AWS Streamlit Firebase Git

Professional Journey

Graduate Research Assistant

SUNY Polytechnic Institute
Oct 2024 – Present Utica, NY Active

Developing and implementing RAG systems and ML infrastructure to support research and applied AI projects. Focus on Graph RAG architectures, LLM fine-tuning, and synthetic data generation pipelines.

  • Building advanced RAG systems for AI-driven information retrieval
  • Developing Graph RAG architectures using Neo4j for multi-hop reasoning
  • Designing synthetic dataset generation pipelines for model training
  • Creating robust data pipelines integrating diverse sources
  • Researching NLP, embeddings, and semantic search
Python LangChain Neo4j Transformers Vector DBs PyTorch

Data Analyst

Stop-not Service
Aug 2022 – Aug 2024 Remote Full-time

Led data analytics initiatives including statistical analysis, dashboard generation, and data-driven solutions to support operational and business decisions.

  • Statistical analysis and dashboard generation via IBM SPSS and Python
  • Data cleaning, processing and validation using Python scripting
  • Built interactive dashboards in Tableau and Power BI
  • Identified market trends and opportunities for business growth
  • Automated recurring reporting workflows
Python IBM SPSS Tableau Power BI SQL Pandas

M.S. Data Science & Analytics

SUNY Polytechnic Institute
Aug 2024 – May 2026 Graduate

Research & coursework centered on RAG systems, embeddings, LLM architectures, and scalable ML pipelines.

Deep Learning NLP Statistical Modeling Big Data

Master of Computer Applications (MCA)

Galgotias University
Aug 2020 – Sep 2022 Postgraduate

Software engineering, algorithms, and applied computing. Strong foundation in programming and analytical thinking.

Software Engineering Algorithms Data Structures

Featured Work

Production-grade AI systems, ML platforms & data solutions

CardioFusion

ML platform for cardiovascular disease risk prediction using hybrid ensemble models, deep learning, and SHAP-powered explainability.

Python TensorFlow SHAP Scikit-learn Streamlit
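As a rough illustration of how an ensemble can combine several models into one risk score, here is a minimal soft-voting sketch. The description above does not say which combination method CardioFusion uses, so this is an assumption for illustration only; the function name `soft_vote` and its inputs are hypothetical.

```python
def soft_vote(probabilities, weights=None):
    """Combine per-model positive-class probabilities by weighted average.

    Hypothetical helper: `probabilities` holds one predicted risk
    probability per model, `weights` an optional weight per model.
    """
    if not probabilities:
        raise ValueError("at least one model probability is required")
    if weights is None:
        # Default to an unweighted (simple) average.
        weights = [1.0] * len(probabilities)
    if len(weights) != len(probabilities):
        raise ValueError("weights and probabilities must have equal length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(p * w for p, w in zip(probabilities, weights)) / total
```

Giving a higher weight to the model that validates best is one common design choice; equal weights are a reasonable default when no model is clearly stronger.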

LRAG — LightRAG

Graph RAG implementation using LightRAG for multi-hop reasoning. Uses Neo4j knowledge graphs to improve contextual understanding across documents.

Python Neo4j LangChain FAISS Graph RAG
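To show what multi-hop reasoning over a knowledge graph means in practice, here is a minimal sketch. It is not the project's code: it uses a plain in-memory dictionary in place of Neo4j, and the `GRAPH` data, the entity names, and the `multi_hop` function are all hypothetical.

```python
from collections import deque

# Hypothetical toy knowledge graph: entity -> list of (relation, entity).
GRAPH = {
    "Aspirin": [("treats", "Headache"), ("inhibits", "COX-1")],
    "COX-1": [("produces", "Thromboxane")],
    "Thromboxane": [("promotes", "Clotting")],
}


def multi_hop(graph, start, max_hops):
    """Breadth-first search returning every relation path of up to max_hops edges."""
    paths = []
    queue = deque([(start, [])])
    while queue:
        node, path = queue.popleft()
        if len(path) == max_hops:
            continue  # Hop budget used up for this path.
        for relation, neighbor in graph.get(node, []):
            # Skip entities already on this path to avoid cycles.
            if neighbor == start or any(neighbor == step[2] for step in path):
                continue
            new_path = path + [(node, relation, neighbor)]
            paths.append(new_path)
            queue.append((neighbor, new_path))
    return paths
```

Calling `multi_hop(GRAPH, "Aspirin", 3)` returns four paths, the longest of which links Aspirin to Clotting through COX-1 and Thromboxane. In a Graph RAG setup, paths like these would be passed to the LLM as retrieved context.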

DataFlow Intelligence

Unified Streamlit dashboard integrating 7 analytical projects, from COVID-19 tracking to restaurant analytics, using live APIs and rich datasets.

Python Streamlit Pandas REST APIs Plotly

Logistics Streaming

Real-time logistics data streaming and analytics pipeline with live monitoring dashboard. Event-driven architecture with real-time insights.

Python Kafka Spark Streamlit Real-time

Data Analysis Suite

Multi-domain analytical dashboard covering world university rankings, COVID-19 tracking, restaurant analytics, and more with interactive visualizations.

Python Pandas Matplotlib Streamlit APIs

Real Estate Intelligence

ML-powered real estate pricing prediction dashboard with geospatial clustering, market trend analysis, and interactive property insights.

Python Scikit-learn Plotly Jupyter ML

Credentials & Certifications

TensorFlow Developer Certificate

Google
2023

Machine Learning Specialization

Stanford University (Coursera)
2022

Data Science Professional Certificate

IBM
2022

Power BI Data Analyst

Microsoft
2021

Let's Connect

Have a project in mind, a research idea, or want to collaborate? I'd love to hear from you.