Structured attribution framework for training small LLMs to generate step-by-step reasoning with explicit citations. Uses a two-phase SFT + GRPO pipeline with composite rewards for faithfulness and interpretability.
Technologies: LLMs, Reinforcement Learning (GRPO), NLP, Attribution
Full-stack institutional research repository inspired by CERN’s InvenioRDM. Supports record management, full-text search, versioning, and DOI-style IDs with scalable backend and search infrastructure.
Technologies: Flask, React, PostgreSQL, OpenSearch, Docker
Benchmark for evaluating attribution stability under prediction-invariant perturbations. Introduces a three-axis stability framework for robust comparison of XAI methods.
Technologies: Explainable AI, Computer Vision, Evaluation Metrics
Reinforcement learning-based retrieval framework that learns when and how much to retrieve by modeling RAG as a Markov Decision Process. Improves accuracy while reducing retrieval cost.
Technologies: LLMs, RAG, Deep Q-Learning, RL, NLP
Unified detection, validation, and remediation across Java, Python, and C++ using uAST normalization and hybrid GraphSAGE + LLM fusion. Achieves 89.84–92.02% accuracy and 69.74% end-to-end resolution.
Technologies: Python, PyTorch, Tree-sitter, GraphSAGE, Qwen2.5-Coder, Docker
HyperComplEx is a novel hybrid embedding framework that adaptively combines hyperbolic, complex, and Euclidean spaces via learned attention mechanisms for knowledge graph completion. Achieves up to 18% relative gain in MRR while maintaining near-linear scalability.
Technologies: Python, PyTorch, Geometric Deep Learning, Knowledge Graph Embeddings
A large-scale, unified dataset of parsed source code across 10 major programming languages under a universal AST schema. Contains over 7 million parsed code files with a 99.9999% conversion rate. Published on Hugging Face.
Technologies: Python, Tree-sitter, Apache Parquet, Hugging Face Datasets
Integrates graph neural networks (GNNs) with large language models (LLMs) for vulnerability detection in Java code. Uses PROGEX for AST/CFG extraction. Achieves 93.57% accuracy, 17.81% above LLM baselines.
Technologies: Python, Java, PROGEX, GNNs, PyTorch Geometric, LLMs
A web-based multi-agent assistant built with the Phi framework and Groq-powered LLaMA3 for real-time financial queries. Integrates live stock data via yfinance with web search citations.
Technologies: Python, Flask, Phi Data Agents, Groq, yfinance
Hybrid agent augmenting Bandit static analysis with LLMs to detect and repair Python vulnerabilities. Reduces false positives by 10.8%, improves fix accuracy by 13.51%. Developer rating: 4.5/5.
Technologies: Python, LLMs, Hugging Face, Apple MLX, Bandit
A modular, low-latency order matching engine in C++ for financial trading simulation. Supports limit, market, and cancel orders. Exposes Python bindings via pybind11.
Technologies: C++17, pybind11, CMake, Google Test, Python
Dual-stage pipeline leveraging LLMs to detect, exploit, and remediate software vulnerabilities across 14 programming languages. Integrates static analysis and exploit simulation. Usefulness: 8.06/10.
Technologies: Python, LLMs, Hugging Face, Apple MLX, Static Code Analysis
Transformer-based sentiment analysis on the CMU-MOSEI dataset with early fusion of text, audio, and visual modalities. Achieves 97.87% accuracy and 0.9682 F1-score.
Technologies: Python, PyTorch, Hugging Face, Multimodal Deep Learning
PDF-aware scientific chatbot combining LangChain, Pinecone vector search, and Mistral 7B for interactive question answering over research papers.
Technologies: Python, LangChain, Pinecone, LLMs, NLP
AI-powered smart shopping cart using computer vision for automatic product identification and billing. Includes a patent published in India. Uses object detection and edge computing.
Technologies: Computer Vision, Python, OpenCV, PyTorch, RoboFlow
Secure Android messaging app integrating AES-256 encryption with steganography for covert communication. Hidden data is embedded within media files.
Technologies: Android, Java, AES-256, Steganography, Firebase