Python backend engineer obsessed with AI-driven backend systems, data infrastructure, async systems, and building things that actually work at scale. 2025 AI/ML graduate from Hyderabad, India.
I'm a 2025 B.Tech graduate in AI/ML with a focus on Python backend engineering and data infrastructure. I like understanding how things actually work: not just the surface, but the tradeoffs, the failure modes, and why certain decisions were made.
The intersection of AI and backend systems is where I want to be. Multi-agent pipelines, async infrastructure, systems where intelligence meets scale. Looking for roles where there's always something deeper to figure out.
Production-grade async BFS web crawler with dual fetch strategy — aiohttp for static sites, automatic Playwright fallback for JS-heavy sites. robots.txt compliance with async per-domain caching. 3-table PostgreSQL schema with background job tracking and duplicate detection. Containerized and published on Docker Hub.
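A minimal sketch of the BFS-with-fallback idea, not the project's actual code. The fetchers are injected as hypothetical async callables (`static_fetch` standing in for an aiohttp-based fetch, `browser_fetch` for a Playwright-based one), and a static fetch signals "needs JS rendering" by returning `None`; both conventions are assumptions made for illustration. Pages are fetched one at a time here to keep the traversal order easy to follow.

```python
import asyncio
from collections import deque
from typing import Awaitable, Callable, Optional

# Hypothetical fetcher contract: return (html, outgoing_links),
# or None when this strategy cannot handle the page.
Fetcher = Callable[[str], Awaitable[Optional[tuple[str, list[str]]]]]


async def crawl_bfs(start_url: str, static_fetch: Fetcher,
                    browser_fetch: Fetcher, max_pages: int = 100) -> list[str]:
    """Visit pages breadth-first, trying the cheap fetcher before the browser."""
    seen = {start_url}            # duplicate detection
    queue = deque([start_url])    # FIFO queue gives breadth-first order
    visited: list[str] = []

    while queue and len(visited) < max_pages:
        url = queue.popleft()
        result = await static_fetch(url)
        if result is None:
            # Fall back to the heavier, browser-based strategy.
            result = await browser_fetch(url)
        if result is None:
            continue              # neither strategy worked; skip the page
        _html, links = result
        visited.append(url)
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

A real crawler would also check robots.txt before each fetch and run several fetches concurrently; both are left out of this sketch.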
Multi-agent LangGraph pipeline processing insurance PDFs end-to-end — classifies pages into 9 document types via Gemini vision, routes to specialized extraction agents. Exponential backoff retry logic, graceful JSON fallback, keyword-based mock layer for testing without API quota consumption.
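The retry and fallback pattern can be sketched roughly as below. This is an illustration under assumptions, not the pipeline's code: `call_with_backoff` and `parse_json_or_fallback` are hypothetical helper names, the delay doubles per attempt with a little random jitter, and any exception is treated as retryable (real code would likely narrow that to transient API errors).

```python
import json
import random
import time


def call_with_backoff(fn, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying with exponentially growing delays on failure."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:  # assumption: every error is worth retrying
            if attempt == max_attempts - 1:
                raise      # out of attempts; surface the last error
            # Delay doubles each attempt: base, 2*base, 4*base, ... plus jitter.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)


def parse_json_or_fallback(text, fallback=None):
    """Parse model output as JSON, returning a fallback instead of raising."""
    try:
        return json.loads(text)
    except (json.JSONDecodeError, TypeError):
        return {} if fallback is None else fallback
```

Passing `sleep` as a parameter keeps the helper testable without real waiting.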
Role-based finance dashboard backend with 3-tier RBAC (Admin, Analyst, Viewer) across 15+ REST endpoints. JWT authentication, bcrypt hashing, soft delete, dynamic filtering, and pagination. 33 passing pytest tests with dependency injection overrides and isolated test database.
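One way to picture the three roles is as an ordered hierarchy, sketched below in plain Python. The ordering (Viewer below Analyst below Admin), the `require_role` decorator, and the three example functions are all assumptions for illustration; the actual backend may model permissions differently, for example through framework dependencies.

```python
import functools
from enum import IntEnum


class Role(IntEnum):
    # Assumed ordering: a higher value implies every lower role's permissions.
    VIEWER = 1
    ANALYST = 2
    ADMIN = 3


def require_role(minimum: Role):
    """Reject callers whose role ranks below `minimum`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(role: Role, *args, **kwargs):
            if role < minimum:
                raise PermissionError(f"{role.name} cannot call {func.__name__}")
            return func(role, *args, **kwargs)
        return wrapper
    return decorator


@require_role(Role.VIEWER)
def list_records(role):
    return "records"


@require_role(Role.ANALYST)
def view_summary(role):
    return "summary"


@require_role(Role.ADMIN)
def delete_record(role, record_id):
    # Stand-in for a soft delete; no real storage is touched here.
    return f"soft-deleted {record_id}"
```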
RAG pipeline over 119K+ hotel booking records — chunked and embedded domain documents with sentence-transformers (MiniLM-L6-v2), indexed via FAISS cosine similarity, Mistral-7B for natural language Q&A. Hybrid retrieval combining vector search with deterministic analytics for revenue, cancellation rate, and lead-time queries. 4 FastAPI endpoints with background embedding on startup; analytical reports returned as base64 charts.
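The hybrid idea can be sketched in pure Python as below, under stated assumptions. Cosine similarity is computed as a dot product of L2-normalized vectors, which is the usual way to get cosine scores out of an inner-product index; here a plain list scan stands in for FAISS. The keyword list and the `route` function are hypothetical, shown only to illustrate sending analytic questions to deterministic code instead of the vector search.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def top_k(query_vec, doc_vecs, k=2):
    """Return indices of the k documents most similar to the query."""
    q = normalize(query_vec)
    scored = []
    for index, doc in enumerate(doc_vecs):
        d = normalize(doc)
        # Dot product of unit vectors equals cosine similarity.
        scored.append((sum(a * b for a, b in zip(q, d)), index))
    scored.sort(reverse=True)
    return [index for _score, index in scored[:k]]


# Hypothetical trigger phrases for the deterministic analytics path.
ANALYTIC_KEYWORDS = ("revenue", "cancellation rate", "lead time")


def route(question):
    """Pick 'analytics' for metric questions, 'vector' for everything else."""
    text = question.lower()
    if any(keyword in text for keyword in ANALYTIC_KEYWORDS):
        return "analytics"
    return "vector"
```

Routing numeric questions to exact computations avoids asking a language model to do arithmetic over retrieved text.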
Horizontally scalable file sharing platform with NGINX load balancing across 3 FastAPI instances. Supports 2GB chunked uploads. Automated self-cleanup via asyncio background tasks. One-command deployment with Docker Compose.
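The self-cleanup piece might look roughly like the sketch below. It assumes expiry is decided by file modification time and that uploads live in a single directory; both are assumptions, as are the names `remove_expired` and `cleanup_loop`. The blocking filesystem work is pushed to a thread with `asyncio.to_thread` (Python 3.9+) so the event loop stays responsive.

```python
import asyncio
import time
from pathlib import Path


def remove_expired(directory: Path, max_age_seconds: float, now=None) -> list[str]:
    """Delete files older than max_age_seconds and return their names."""
    now = time.time() if now is None else now
    removed = []
    for path in directory.iterdir():
        if path.is_file() and now - path.stat().st_mtime > max_age_seconds:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)


async def cleanup_loop(directory: Path, max_age_seconds: float,
                       interval_seconds: float) -> None:
    """Run the cleanup forever, sleeping between passes."""
    while True:
        # Filesystem calls block, so run them off the event loop.
        await asyncio.to_thread(remove_expired, directory, max_age_seconds)
        await asyncio.sleep(interval_seconds)
```

Such a loop would typically be started once at application startup with `asyncio.create_task(cleanup_loop(...))` and cancelled on shutdown.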
Multi-criteria task prioritization system using Weighted Sum Model from MCDM theory with 4 prioritization strategies, dependency management with forward/backward handling and penalty multipliers. RESTful API with 100% pytest coverage.
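For reference, a bare-bones Weighted Sum Model can be sketched as follows. It assumes every criterion is a "higher is better" score, uses min-max normalization per criterion, and rescales the weights to sum to 1; the function name, task names, and criteria are made up for illustration, and dependency handling and penalty multipliers are left out.

```python
def weighted_sum_scores(tasks, weights):
    """Score each task as the weighted sum of its normalized criteria.

    tasks:   {task_name: {criterion: raw_value}}
    weights: {criterion: weight}; weights are rescaled to sum to 1.
    """
    total_weight = sum(weights.values())
    # Per-criterion (min, max) across all tasks, used for min-max normalization.
    ranges = {
        c: (min(t[c] for t in tasks.values()), max(t[c] for t in tasks.values()))
        for c in weights
    }
    scores = {}
    for name, values in tasks.items():
        score = 0.0
        for criterion, weight in weights.items():
            lo, hi = ranges[criterion]
            # When all tasks tie on a criterion, it contributes nothing.
            normalized = (values[criterion] - lo) / (hi - lo) if hi > lo else 0.0
            score += (weight / total_weight) * normalized
        scores[name] = score
    return scores


example_tasks = {
    "deploy": {"urgency": 10, "importance": 8},
    "docs": {"urgency": 2, "importance": 4},
    "refactor": {"urgency": 6, "importance": 8},
}
example_scores = weighted_sum_scores(example_tasks, {"urgency": 0.5, "importance": 0.5})
ranking = sorted(example_scores, key=example_scores.get, reverse=True)
```

With equal weights, `ranking` puts "deploy" first, then "refactor", then "docs". Different prioritization strategies can then be expressed as different weight dictionaries over the same criteria.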
First exploration of multimodal API integration: Cohere for NLP, Imagga for image recognition. Built async middleware and a responsive JS frontend. The hosted version may be unavailable on free-tier hosting.
First attempt at RAG — LangChain, vector database for semantic search, LLM APIs with response streaming. Where I learned what retrieval-augmented generation actually means in practice.