About
Experience

ProPeers
Founding Engineer & AI Architect
July 2025 – Present · Delhi, India · Remote
- Architected the full AI ecosystem powering RoadmapAI, CodeLLM, AskAI, Global AI Search, and the AI Code Editor, building Agentic AI pipelines, RAG systems, MCP server architecture, and LLM orchestration that now drive 80%+ of total platform traffic.
- Engineered RoadmapAI end-to-end with a self-learning RAG pipeline (text-embedding-ada-002, ChromaDB, semantic filtering, adaptive difficulty) and MCP-layered prompts, achieving sub-second inference and large-scale personalization.
- Delivered ~99% personalized roadmap accuracy using Agentic flows, structured prompt masks, multi-model routing, and RAG optimization, directly improving RoadmapAI user ratings from the early 12% baseline.
- Built CodeLLM, an AI judge with multi-language detection, dual-layer JSON parsing, context-aware error classification (COMPILATION/RUNTIME/VALIDATION), semantic retrieval and deterministic verdict synthesis.
- Developed AskAI, an agentic programming assistant using MCP-based prompt pipelines, resource-aware context analysis, dynamic O3Mini/O1 routing, token metering, and automated formatting, boosting engagement 3× and answer resolution speed 2×.
- Shipped the AI Code Editor with real-time AI review (<40ms), inline reasoning, multi-language execution, and deep RoadmapAI/CodeLLM integration, raising editor retention by 40%.
- Scaled Roadmap features to 120K+ organic users and improved MAU by 46% through rapid iteration, tight user-feedback loops and stable AI feature launches.
- Delivered Individual Roadmap Communities enabling peer-matching, shared progress tracking and roadmap-level micro-communities.
- Optimized CI/CD and deployment systems, cutting deployment time by 34%, automating multi-service rollouts, and enabling safer high-frequency releases.
- Reduced platform downtime by 90% (4 hrs to 45 mins/month) via infra hardening, progressive fallbacks, cache-first routing, real-time health checks and load-aware autoscaling.
- Implemented complete analytics & aggregation pipelines for 100K+ users with Redis caching, chunked batch aggregation, API acceleration and advanced rate-limit enforcement.
- Developed full search-validation engines (Roadmaps + RoadmapAI), ensuring context-safe retrieval, hallucination resistance, and consistent multi-node semantic validation.
- Performed Azure cost and infra optimization: right-sized VMs, eliminated Bastion, stabilized Redis/Entra costs, contained Cognitive Services spikes, and resolved large bandwidth egress surges.
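The RAG retrieval behind RoadmapAI (ada-002 embeddings, ChromaDB, semantic filtering) can be sketched as similarity-threshold retrieval. This is an illustrative, self-contained sketch: the function names and toy vectors are assumptions, and only the 0.25-style low-cutoff filtering pattern is taken from the description; production embeddings come from text-embedding-ada-002 and live in ChromaDB.

```python
import math

SIMILARITY_THRESHOLD = 0.25  # illustrative low cutoff, as described above


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def retrieve(query_embedding, corpus, top_k=3):
    """Return the top_k chunks whose similarity clears the threshold.

    `corpus` is a list of (chunk_text, embedding) pairs; in production the
    embeddings would be generated by text-embedding-ada-002 and queried
    through ChromaDB rather than scanned in memory.
    """
    scored = [
        (cosine_similarity(query_embedding, emb), text)
        for text, emb in corpus
    ]
    passing = [(s, t) for s, t in scored if s >= SIMILARITY_THRESHOLD]
    passing.sort(reverse=True)
    return [t for _, t in passing[:top_k]]
```

The threshold acts as a noise gate: off-topic chunks score near zero and never reach the prompt, which is one simple way to keep generation grounded.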
SDE - 1
July 2024 – July 2025 · Delhi, India · Remote
- Built and scaled the flagship "Roadmaps" feature, delivering 100+ curated learning paths across DSA, Development, and System Design used by 100K+ users. Improved personalization and relevance, while reducing API response time from 2.1s to < 300ms, resulting in a 7x faster experience and 40% higher user engagement.
- Reworked complex APIs to reduce processing time and improve the tab-switching experience for smoother navigation.
- Developed and integrated the "AskAI + Discussion Forum", an intelligent peer-programming assistant where users interact with AI to solve DSA/Dev doubts and collaborate with peers, enabling on-demand doubt resolution and community learning.
- Engineered a Session Recording Bot using Python, Selenium, and headless Azure VMs with deep-link automation that handles session joining and recording, eliminating 100% of the manual effort and improving reliability.
- Optimized 150+ APIs by implementing advanced caching layers, async processing, and API pipelines, reducing backend latency by up to 70% and improving system throughput.
- Reduced core web vitals TBT, LCP, and FCP from 4.4s to 990ms through advanced frontend optimizations (SSR, dynamic imports, lazy-loading APIs), significantly boosting UX for 15K+ monthly active users.
- Led the end-to-end performance overhaul of the platform, focusing on smoother tab-switching experiences, minimal downtime, and blazing-fast navigation across the app.
- Migrated MongoDB from Atlas to self-hosted replica sets, wrote automated backup & recovery scripts, set up VMs, and integrated cron-based backups to Azure Blob, ensuring data durability and cost-efficiency.
- Set up real-time monitoring and alerting with Prometheus and Grafana, ensuring system health, proactive issue resolution, and enhanced DevOps visibility.
- Deployed scalable CI/CD pipelines using Azure, GitLab, and Vercel, ensuring zero-downtime deployments and faster iteration cycles across teams.
- Handled end-to-end production deployment and scaling for a system serving 15K+ users, maintaining high availability, fault tolerance, and robust performance at scale.
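The caching layers credited above with cutting backend latency follow the cache-aside pattern: check the cache, and only hit the slow backend on a miss. A minimal sketch, assuming an in-process dict with per-key expiry stands in for Redis; the class and method names are illustrative, not the actual implementation.

```python
import time


class TTLCache:
    """Cache-aside helper with per-key time-to-live expiry."""

    def __init__(self, ttl_seconds=60):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (value, expires_at)

    def get_or_compute(self, key, compute):
        """Return the cached value for `key`, recomputing it once expired."""
        entry = self._store.get(key)
        now = time.monotonic()
        if entry is not None and entry[1] > now:
            return entry[0]          # cache hit: skip the slow path entirely
        value = compute()            # cache miss: pay the backend cost once
        self._store[key] = (value, now + self.ttl)
        return value
```

With a hot cache, repeated requests for the same roadmap payload never touch the database, which is how a 2.1s endpoint can drop under 300ms for the common case.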
Cloud Conduction
Junior Software Engineer
Jan 2024 – June 2024 · USA · Remote
- Built an AI-powered chat application from the ground up using React and .NET, improving frontend efficiency by 60% and backend performance by 30%, delivering a highly responsive user experience.
- Integrated and optimized AI model responses, reducing latency from 1.86s to 1.2s (35% faster) through strategic API design, caching, and performance tuning.
- Designed scalable cloud architecture on Microsoft Azure for AI workloads, improving system throughput by 10% while significantly reducing infrastructure costs via autoscaling and resource optimization.
- Developed modern, responsive UI components in React that improved user engagement metrics by 25%, including better retention and interaction rates.
- Implemented secure, scalable API gateways in .NET Core, capable of handling 500+ concurrent requests with 99.9% uptime, supporting production-level reliability.
- Led the implementation of new features using the MERN stack, cutting down development time by 40%, and accelerating product iteration cycles.
- Established CI/CD pipelines (Azure DevOps & GitHub Actions), reducing deployment failures by 75% and enabling faster, automated releases.
- Conducted in-depth code reviews and optimization, reducing technical debt by 30%, standardizing best practices across teams, and improving maintainability.
- Owned and managed the complete project lifecycle, from initial system design and dev planning to production deployment, server setup, and post-launch support.
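Sustaining 500+ concurrent requests through a gateway usually involves rate limiting; a token bucket is one common mechanism. This is a hypothetical Python sketch of the idea only; the actual gateway described above is .NET Core, and the capacity and refill numbers are illustrative.

```python
import time


class TokenBucket:
    """Token-bucket rate limiter: requests spend tokens, time refills them."""

    def __init__(self, capacity, refill_per_second):
        self.capacity = capacity
        self.tokens = float(capacity)
        self.refill_rate = refill_per_second
        self.last = time.monotonic()

    def allow(self):
        """Consume one token if available; refill based on elapsed time."""
        now = time.monotonic()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Bursts up to `capacity` pass immediately, while sustained load is capped at `refill_per_second`, protecting backend services from overload.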
Impactful Work as an Individual Contributor
- Architected an end-to-end RAG-powered AI learning platform serving 100K+ users with sub-second inference latency, leveraging Azure OpenAI embeddings (text-embedding-ada-002), ChromaDB vector indexing, and semantic retrieval with dynamic topic-aware filtering at a 0.25 similarity threshold.
- Engineered a self-evolving knowledge graph where every AI-generated artifact (roadmaps, articles, practice questions) is automatically embedded, vectorized, and reintegrated into ChromaDB, creating a continuously learning retrieval layer that improves semantic accuracy with each user interaction.
- Built an intelligent RAG pipeline with multi-stage context optimization combining semantic vector similarity search, domain-specific keyword enforcement, exclusion-based noise filtering, and quality-threshold gating (0.25 cutoff) to deliver hallucination-resistant contextual augmentation
- Designed a production-grade MCP-compliant prompt orchestration system with structured message arrays (system/user roles), dynamic context injection based on user proficiency levels (1-5 scale), adaptive difficulty mapping (Beginner/Intermediate/Advanced), and goal-oriented content generation across 3 formats
- Implemented a real-time intent classification engine with confidence-weighted pattern matching across 4 transformation operators (NEW_SUBROADMAP, ADD_TOPICS, PROJECT_CREATION, REGENERATE_PIPELINE) using 20+ keyword signatures per intent and hierarchical fallback resolution for ambiguous requests
- Developed a conflict-safe progress-preserving merge algorithm that maintains atomic user state (isDone flags, bookmarks, annotations, code links) during AI-driven content expansions through differential patching, duplicate detection, and rollback-capable database transactions
- Created a multi-layer security validation framework with lexical abuse detection (violent/illegal/inappropriate patterns), technical relevance scoring across 15+ engineering domains, injection-attack guards, and AI-powered verification with a 0.6 confidence threshold for edge cases
- Architected a scalable token-governance system with tiered allocation models (8 free tokens + purchased pools), operation-based cost accounting (Creation: 2 tokens, Customization: 4 tokens), atomic transaction handling via MongoDB optimistic locking, and graceful quota degradation
- Optimized database performance through strategic indexing with compound indices on (userId, sessionId, isDeleted), aggregation pipeline optimization for history queries, session-based data isolation, soft-delete mechanisms, and pagination limiting to 50 records per fetch
- Implemented a multi-model AI orchestration layer supporting dynamic routing between o3-mini (8K context window) for complex generation and gpt-3.5-turbo (4K context) for standard operations, with consistent MCP interface abstraction and model-specific parameter tuning
- Built a resilient fallback architecture ensuring 100% availability with RAG-miss graceful degradation, sparse-query fallback prompts, cache-bypass recovery paths, multi-tier error handling, structured security-event logging, and health-check monitoring across all AI subsystems
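The tiered token-governance model above (8 free tokens, Creation = 2 tokens, Customization = 4 tokens, graceful quota degradation) can be sketched as a small wallet abstraction. The figures come from the description; the class and method names are illustrative, and production uses MongoDB optimistic locking rather than in-memory state.

```python
# Operation costs and free allocation as described above.
OPERATION_COSTS = {"creation": 2, "customization": 4}
FREE_ALLOCATION = 8


class TokenWallet:
    """Tiered token balance: free allocation drains before purchased pool."""

    def __init__(self, purchased=0):
        self.free = FREE_ALLOCATION
        self.purchased = purchased

    def charge(self, operation):
        """Deduct the operation cost, free tokens first.

        Returns False (graceful degradation) instead of raising when the
        combined balance cannot cover the operation.
        """
        cost = OPERATION_COSTS[operation]
        if self.free + self.purchased < cost:
            return False
        take_free = min(self.free, cost)
        self.free -= take_free
        self.purchased -= cost - take_free
        return True
```

Returning a boolean rather than raising lets the caller degrade gracefully, e.g. by offering a purchase prompt instead of failing the request.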
System Architecture & Details
- Architected an end-to-end AI-powered code evaluation system replacing traditional compilers with RAG-enhanced logical judgment, leveraging semantic retrieval, model-context engineering, and multi-model orchestration to achieve 99% evaluation accuracy across Python, Java, C++, and JavaScript.
- Built a multi-stage language detection engine using regex patterns, anti-pattern suppression, syntax heuristics, and confidence-based classification to prevent cross-language submissions and ensure evaluation integrity for every code block.
- Implemented a production-grade MCP-compliant prompt pipeline generating strictly structured system/user message arrays, including judge instructions, evaluation rules, test-case schemas, complexity requirements, and JSON-first verdict formatting.
- Designed a dual-layer response parsing system with JSON block extraction, Markdown fallback resolution, regex-based error isolation, and verdict normalization to guarantee consistent outputs even with noisy AI responses.
- Engineered a multi-model AI orchestration layer dynamically routing requests between o3-mini (accuracy), o1 (reasoning), and gpt-35-turbo (performance) with token-window optimization and context-aware selection.
- Integrated a RAG pipeline with ChromaDB using text-embedding-ada-002 to retrieve reference solutions, constraints, edge cases, and complexity hints, enabling AI to perform context-enriched evaluation rather than plain code matching.
- Created a modular progress-tracking engine mapping submissions to TodoItems, Topics, and Subroadmaps, automatically updating isDone status and learning milestones through real-time backend sync and user completion logic.
- Developed a robust validation and error-classification layer with strict checks for payload integrity, language mismatches, test-case correctness, sanitized code inspection, and COMPILATION_ERROR / RUNTIME_ERROR / VALIDATION_ERROR generation.
- Implemented a structured verdict generator delivering human-like educational feedback including passed/failed test-case breakdowns, root-cause explanations, error localization, corrected code suggestions, and time/space complexity analysis.
- Optimized backend infrastructure using MongoDB submission architecture with collections for Submission, TodoItem, Topic, UserTodoItemMapping, ensuring analytics-ready storage, high-throughput writes, and environment-aware routing for dev/prod deployments.
- Achieved scalable, real-time evaluation flows combining JWT-secured endpoints, load-balanced AI calls, semantic retrieval augmentation, multi-model fail-safes, and a high-availability fallback pipeline for uninterrupted code judging.
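The dual-layer response parsing described above (strict JSON first, Markdown fallback second) can be sketched in a few lines. The field names in the example payload are assumptions; only the two-layer extraction strategy is taken from the description.

```python
import json
import re


def parse_verdict(raw):
    """Return the verdict dict, or None if no JSON can be recovered."""
    # Layer 1: the response is already clean JSON.
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        pass
    # Layer 2: JSON hidden inside a fenced Markdown block in a noisy reply.
    match = re.search(r"```(?:json)?\s*(\{.*?\})\s*```", raw, re.DOTALL)
    if match:
        try:
            return json.loads(match.group(1))
        except json.JSONDecodeError:
            return None
    return None
```

The fallback layer is what guarantees consistent verdicts even when the model wraps its answer in conversational prose.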
System Architecture & Details
- Architected and developed a production-grade AI programming assistant handling 100+ RPS with 99.9% uptime across learning platform resources.
- Engineered sophisticated multi-model AI orchestration routing questions between O3Mini, O1, GPT-3.5 Turbo, and Llama 3.3 based on question complexity and resource type.
- Built comprehensive token management system with dual-token architecture (9 free + purchased), atomic MongoDB operations, and fair usage enforcement preventing system abuse.
- Implemented MCP (Model Context Protocol) prompt engineering with three specialized generators eliminating RAG infrastructure while maintaining response quality.
- Designed intelligent model selection algorithm routing Practice Questions to O1, complex DSA to O1, articles to GPT-3.5, and general questions to O3Mini for optimal performance.
- Developed advanced response processing pipeline with autoWrapCode (10+ language detection), formatAIResponse (markdown fixing), and removeConversationalEndings (AI fluff removal).
- Created scalable session management with three MongoDB schemas (generic, roadmap-specific, content creation), soft deletion, voting system, and optimized query patterns.
- Built complete API security layer with JWT authentication, rate limiting, input sanitization, HTTPS enforcement, and comprehensive error handling across 6+ endpoints.
- Implemented production monitoring system with response time tracking, token usage analytics, structured logging, and health checks for continuous optimization.
- Achieved 3x user engagement and 2x resolution speed through intelligent model selection, clean response formatting, and context-aware interactions.
- Engineered no-RAG architecture using sophisticated prompt engineering instead of vector databases, reducing infrastructure costs by 60%.
- Added content caching optimization with RoadmapAskAIContentCreation schema and duplicate request prevention for article improvements.
- Implemented question classification system using GPT-3.5 Turbo to categorize questions into 7 types (DSA, System Design, Development, etc.) for better routing.
- Designed circuit breaker pattern and fallback chains (O1 → O3Mini → GPT-3.5 → Llama) for API failure resilience and graceful degradation.
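The fallback chain above (O1 → O3Mini → GPT-3.5 → Llama) can be sketched as an ordered retry over providers. `call_model` is a hypothetical stand-in for the real provider client; the chain order is from the description, everything else is illustrative.

```python
# Model order taken from the fallback chain described above.
FALLBACK_CHAIN = ["o1", "o3-mini", "gpt-3.5-turbo", "llama-3.3"]


def ask_with_fallback(question, call_model):
    """Try each model in order; return (model_used, answer) on first success."""
    last_error = None
    for model in FALLBACK_CHAIN:
        try:
            return model, call_model(model, question)
        except Exception as exc:  # real code would catch provider errors only
            last_error = exc
    raise RuntimeError("all models failed") from last_error
```

A production version would add the circuit-breaker part: skip a model for a cooldown window after repeated failures instead of probing it on every request.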
System Architecture & Details
- Architected and deployed a production-grade AI-powered global search engine across ProPeers platform serving Roadmaps, Digital Products, Mentors, Webinars, and Bootcamps from a single query.
- Engineered dual-model AI orchestration using Kimi-K2.5 for intent extraction and GPT-5.2 for response generation via Microsoft Azure Foundry, enabling sub-second intelligent query understanding.
- Built a 3-collection ChromaDB vector search system with separate collections for Roadmaps (roadmapsearchdata-v0), Digital Products (dpsearchdata-v0), and Mentors (mentorsearchdata-v0), using Azure OpenAI text-embedding-ada-002.
- Implemented a hybrid search pipeline combining semantic vector search (ChromaDB) for Roadmaps/DPs with tag-overlap scoring for Mentors, applying a 0.62 score threshold for relevance filtering.
- Designed intent extraction engine using Kimi-K2.5 that parses user queries including Hinglish, short-form, and vague inputs into structured JSON with intent, goal, topic_tags, and audience classification.
- Built full data ingestion pipeline with MongoDB fetch scripts, Azure OpenAI embedding generation, batch processing (5 docs/batch with rate limiting), and ChromaDB storage across 3 collections covering 100+ roadmaps, 18+ DPs, and 20 top mentors.
- Implemented deduplication logic for Digital Products and parallel search across all 3 ChromaDB collections using Promise.all, reducing search latency significantly.
- Integrated live MongoDB fallback layer for Webinars and Bootcamps — fetching isActive records directly from DB in parallel with ChromaDB search, enriching LLM context without RAG overhead.
- Developed score-based relevance filtering with a 0.62 threshold: cooking/gibberish queries correctly trigger the fallback, while specific queries like 'Goldman Sachs ECHP' achieve 0.78+ similarity scores.
- Created 19 smart suggestion chips with intent-optimized queries covering DSA, System Design, Full Stack, AI/ML, DevOps, Cloud, and Career Switch — sorted by user search frequency for maximum discoverability.
- Built complete Next.js frontend with Tailwind CSS featuring search bar, smart chips, result cards for all 5 content types (roadmaps, DPs, mentors, webinars, bootcamps), AI response display, and fallback handling.
- Validated system across 6 edge case categories — Easy (React query: 0.69 avg), Medium (Hinglish career switch: 0.74), Hard (out-of-scope cooking: fallback triggered), Gibberish (fallback triggered), Single-word (DSA: 0.74), and Niche (Goldman Sachs: 0.786).
- Designed 3-step LLM response generation: Kimi extracts structured intent → ChromaDB+MongoDB fetch results → GPT-5.2 generates 60-word motivational summary with structured result presentation.
- Implemented Microsoft Azure Foundry model integration for both GPT-5.2 (max_completion_tokens) and Kimi-K2.5 (system role normalization, think-tag stripping, 3000 token allocation for reasoning).
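The tag-overlap scoring used for Mentor results, paired with the 0.62 relevance threshold, can be sketched as set overlap. Jaccard overlap is an assumption here; the production scoring function may differ, and the mentor names are illustrative.

```python
SCORE_THRESHOLD = 0.62  # relevance cutoff described above


def tag_overlap(query_tags, mentor_tags):
    """Jaccard overlap between the query's topic_tags and a mentor's tags."""
    q, m = set(query_tags), set(mentor_tags)
    if not q or not m:
        return 0.0
    return len(q & m) / len(q | m)


def rank_mentors(query_tags, mentors):
    """Keep mentors whose overlap clears the threshold, best first.

    `mentors` is a list of (name, tags) pairs.
    """
    scored = [(tag_overlap(query_tags, tags), name) for name, tags in mentors]
    passing = sorted((s, n) for s, n in scored if s >= SCORE_THRESHOLD)
    passing.reverse()
    return [(name, score) for score, name in passing]
```

Unrelated mentors score near zero and fall below the cutoff, so a gibberish query returns an empty list and triggers the fallback path instead of surfacing noise.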
System Architecture & Details
- Engineered an AI-integrated code editor using Monaco, seamlessly tied into CodeLLM and AskAI pipelines.
- Supported live verdicts, multi-language (C++, Java, Python) switching, and dynamic prompts based on user activity.
- Embedded AI-based feedback inline within the editor via backend event sync and code stream capture.
- Delivered interactive IDE-like experience with <40ms event lag, boosting engagement and retention by 40%.
- Tight integration with RoadmapAI and CodeLLM for contextual assistance
- Real-time code validation and suggestions during typing
System Architecture & Details
- Refactored and optimized over 150 core APIs (Editor, Roadmap, AskAI, Profile) for high-throughput performance.
- Reduced average response latency from 2.2s → 300ms through async queues, parallel batches, and Redis caching.
- Introduced pagination layers, ElasticSearch indexing, and horizontal load balancing to maintain SLA under scale.
- Achieved 70% backend performance boost and improved Core Web Vitals (TTFB, LCP, FCP) across all pages.
- Load-tested to 10K RPM with 99.95% uptime sustained and zero cold starts using warmed cloud functions.
- Implemented advanced caching strategies and async processing
- Enhanced frontend performance through SSR, dynamic imports, and lazy-loading
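The pagination layers mentioned above keep high-traffic endpoints within SLA by bounding each response. A minimal cursor-pagination sketch, assuming records sorted by ascending id; the record shape and page size are illustrative, and production pages against MongoDB/ElasticSearch rather than an in-memory list.

```python
def paginate(records, cursor=None, page_size=50):
    """Return (page, next_cursor); `records` must be sorted by ascending id.

    `cursor` is the id of the last record the client has already seen, so
    each request scans forward from there instead of re-sending everything.
    """
    if cursor is None:
        start = 0
    else:
        start = next(
            (i for i, r in enumerate(records) if r["id"] > cursor),
            len(records),  # cursor past the end: empty page
        )
    page = records[start:start + page_size]
    next_cursor = page[-1]["id"] if len(page) == page_size else None
    return page, next_cursor
```

A `None` next_cursor signals the final page, so clients stop requesting without a separate count query.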
System Architecture & Details
Problem Solving & DSA
Key Highlights
- 5000+ Problems Solved Across 10+ Platforms
- 1500+ Day Unbreakable Coding Streak
- Knight Badge @LeetCode (Top 5% Worldwide)
- InterviewBit Global Rank 13 (6⭐ Problem Solving)
- Institute Rank 1 & Global Rank 98 @GeeksForGeeks
LeetCode
1879+ (Top 5% Worldwide)
1400+ solved
4⭐ Problem Solving
GeeksForGeeks
Institute Rank 1 & Global Rank 98
1300+ solved
6⭐ Problem Solving
InterviewBit
1854+ (Master)
560+ solved
Global Rank 13
CodeStudio
1854+ (Specialist)
2000+ solved
Global Rank 130
6⭐ Problem Solving
HackerRank
6⭐ Problem Solving
300+ solved
Rank 52
HackerEarth
1260+ Top 10%
200+ solved
Rank 101
5⭐ Python/Java
Technical Skills
AI / ML
Frontend Development
Backend Development
Cloud & DevOps
Databases
Programming Languages
Tools
Education
Sage University Indore
B.Tech in Computer Science
2020 – 2024 · MP, India
CGPA: 8.5/10

