Agentic AI Engineer

I build the infrastructure
AI products run on.

Multi-agent systems, LLM inference pipelines, and the unglamorous production work that separates demos from products.

Abhinandan

Work

Things I've built

Multi-Agent System

Parallel Research Orchestrator

Designed and shipped a multi-agent pipeline using Claude's API that decomposes complex research queries into parallel sub-agents, reconciles conflicting outputs, and synthesizes a grounded response. Reduced end-to-end latency by [X]% vs. sequential chains.

TypeScript · Claude API · Node.js
Live Demo · GitHub
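The fan-out/gather pattern behind a parallel orchestrator can be sketched in a few lines of Python. This is an illustrative sketch, not the production code: `run_sub_agent` stands in for a real Claude API call, and the join step is a placeholder for the actual reconciliation logic.

```python
import asyncio

# Hypothetical stand-in for a sub-agent backed by a model call.
async def run_sub_agent(sub_query: str) -> str:
    await asyncio.sleep(0.01)  # simulate network latency
    return f"findings for: {sub_query}"

async def orchestrate(query: str, sub_queries: list[str]) -> str:
    # Fan out: each sub-query runs as an independent, concurrent sub-agent.
    results = await asyncio.gather(
        *(run_sub_agent(q) for q in sub_queries),
        return_exceptions=True,  # one failed agent must not sink the batch
    )
    # Keep only successful outputs; a real synthesis step would reconcile conflicts.
    grounded = [r for r in results if isinstance(r, str)]
    return " | ".join(grounded)

answers = asyncio.run(
    orchestrate("compare X and Y", ["history of X", "history of Y"])
)
```

Because the sub-agents run concurrently, end-to-end latency tracks the slowest sub-query rather than the sum of all of them, which is where the speedup over sequential chains comes from.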

LLM Observability

Multi-LLM Monitoring Dashboard

Built production observability for a system routing prompts across GPT-4, Claude, and Gemini. Tracks latency p50/p95, cost-per-token, error rates, and semantic drift — powers real-time routing decisions.

Python · Flask · GCP · PostgreSQL
Live Demo · GitHub

Inference Engineering

Low-Latency Inference Layer

Architected a caching and batching layer in front of foundation model endpoints. Achieved [X]ms median cold-start and [X]% cache hit rate in production, cutting inference costs by ~[X]%.

Python · Docker · Redis · AWS
Live Demo · GitHub

Full-Stack AI Product

NewTools

Built a free, privacy-focused browser utility platform from scratch. Features AI-powered PDF-to-CSV extraction (Claude API with SSE streaming), a shared daily credit system, Supabase auth, and a suite of client-side tools — all with zero data sent to the server.

Next.js · TypeScript · FastAPI · Claude API · Supabase
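A shared daily credit pool boils down to a counter that resets at day rollover and refuses spends past the limit. A minimal sketch, assuming a single in-process counter (the real system persists this state in Supabase; the limit is illustrative):

```python
from datetime import date

DAILY_LIMIT = 100
_state = {"day": date.today(), "used": 0}

def try_spend(credits: int = 1) -> bool:
    today = date.today()
    if _state["day"] != today:
        # Reset the shared pool at day rollover.
        _state["day"], _state["used"] = today, 0
    if _state["used"] + credits > DAILY_LIMIT:
        return False  # pool exhausted for everyone until tomorrow
    _state["used"] += credits
    return True
```

The check-then-increment must be atomic once the counter lives in a database; in Postgres/Supabase that typically means a single conditional UPDATE rather than a read followed by a write.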

Experience

Where I've worked

Browzer · Current · Remote
Oct 2025 – Present

Founding Software Engineer

  • Built a CDP-based browser automation engine using Electron.js and Chrome DevTools Protocol, enabling reliable semantic action capture and replay across complex web workflows.
  • Designed and shipped a WXT Chrome extension (MV3) with a FastAPI backend, orchestrating real-time browser ↔ server communication via SSE with Heroku-tuned heartbeat infrastructure.
  • Integrated Claude as the LLM backbone to plan and execute automation tasks from recorded browser actions, powering an agentic execution pipeline end-to-end.
  • Architected multi-tenant auth using Supabase with JWT custom claims, role-based access control, and 30-day session persistence across web and extension clients.
  • Implemented Stripe subscription billing with webhook handling for edge cases including cancellation flows and payment recovery.
  • Set up GitHub Actions CI/CD with automated Claude-powered code review and changelog generation, reducing manual release overhead significantly.
  • Built a process graph system using Neo4j + Supabase to model and persist workflow relationships extracted from recorded browser sessions.
Cynos Nexus · Noida, India
Jan 2025 – Jul 2025

Software Engineer (Part-time)

  • Developed features that helped acquire 20 clients in 2 months, contributing to ₹100K in new MRR.
  • Built the AI knowledge base service using FastAPI & LangChain.
  • Automated deployment infrastructure using Docker, GitHub Actions for CI/CD, and Nginx for reverse proxy/load balancing.
  • Engineered secure AWS S3 multi-part file uploads handling files up to 5GB, ensuring data integrity.
  • Integrated Razorpay for automated payment processing, achieving a >95% transaction success rate.
Caresept · Remote, Türkiye
Sept 2024 – Dec 2024

Contract Software Engineer

  • Engineered a high-performance PDF generator using WeasyPrint, converting dynamic JSON reports to high-quality PDFs.
  • Optimised bulk CSV data processing using Celery workers, reducing processing time by 40%.
  • Built a LangChain and pgvector knowledge base, improving ML model query accuracy by 15%.
  • Established a CI/CD pipeline with GitHub Actions and Docker on AWS EC2.
NextUI · Y Combinator S24
Jun 2024 – Aug 2024

Open-source Contributor

  • Received a personal offer from the CEO after making impactful open-source contributions.
  • Resolved 10+ bugs in core components including Calendar, Table, and Pagination.
  • Delivered 7+ feature enhancements improving component flexibility and extensibility.
SkilledUp · Noida, India
Feb 2024 – May 2024

SWE Intern

  • Built the backend for an enterprise Learning Management System serving 400+ users.
  • Implemented JWT and OAuth 2.0 authentication securing 400+ user accounts with near-zero breaches.
  • Developed a MySQL + Express.js service handling ~4,000 daily queries.

Stack

What I work with

AI / ML Systems

Claude API · Multi-Agent Orchestration · LLM Monitoring · Inference Engineering · RAG Pipelines · Multi-LLM Routing · Prompt Engineering

Languages & Frameworks

TypeScript · Python · React · Next.js · Flask · Node.js

Infrastructure

Docker · GCP · AWS · REST APIs · PostgreSQL · Redis

About

~2 years building at the edge
of what's possible with AI.

I joined [Company/Project Placeholder] early — before the team had process, before the architecture was decided, before anyone was sure it would work. That meant writing production code and making calls that stuck.

Most of my work lives at the intersection of language models and real software: figuring out where a model's reasoning breaks down, designing systems that degrade gracefully when it does, and shipping things that work on a Tuesday at 3am.

I care about the unsexy parts — latency budgets, error surfaces, cost models, observability. The parts that don't make it into the demo but determine whether the product survives contact with users.

Currently open to select projects · [CITY_PLACEHOLDER]

View Resume →

2 yrs

founding engineer experience

3+

AI products shipped to production

[N]

agents in a single pipeline

[N]M+

LLM calls monitored in production

Depth

Under the hood

Designed for fault tolerance, not just happy paths

Every agent pipeline I've shipped has explicit fallback routing — if a model returns a malformed response or exceeds the latency budget, the system recovers without user impact. Built this after watching a competitor's demo break live.
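The shape of that recovery path is simple to sketch: bound the primary call by the latency budget, and route to a fallback when it times out. Model names and delays here are illustrative, not the production configuration:

```python
import asyncio

# Hypothetical stand-in for a model call with a given latency.
async def call_model(name: str, prompt: str, delay: float) -> str:
    await asyncio.sleep(delay)
    return f"{name}: ok"

async def complete_with_fallback(prompt: str, budget_s: float = 0.05) -> str:
    try:
        # Primary model, bounded by the latency budget.
        return await asyncio.wait_for(
            call_model("primary", prompt, delay=0.2), timeout=budget_s
        )
    except asyncio.TimeoutError:
        # Primary blew the budget; recover on a faster fallback.
        return await call_model("fallback", prompt, delay=0.0)

result = asyncio.run(complete_with_fallback("summarize this"))
```

The same try/except shape handles malformed responses: validate the primary's output, and treat a failed parse the same way as a timeout.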

Costs are a first-class concern

I track token spend per feature, not just per deployment. On [Project X], optimizing prompt structure alone cut monthly inference cost by ~[X]%. Cost decisions happen at design time, not after the bill arrives.
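Per-feature token accounting is mostly bookkeeping: attribute every call's input and output tokens to a feature and price them at design time. A minimal sketch, with illustrative per-1K-token prices rather than any real provider's rates:

```python
from collections import defaultdict

# Illustrative prices per 1K tokens, not real provider rates.
PRICE_PER_1K = {"input": 0.003, "output": 0.015}
spend: dict[str, float] = defaultdict(float)

def record_call(feature: str, input_tokens: int, output_tokens: int) -> None:
    # Attribute this call's cost to the feature that triggered it.
    spend[feature] += (
        input_tokens / 1000 * PRICE_PER_1K["input"]
        + output_tokens / 1000 * PRICE_PER_1K["output"]
    )

record_call("pdf_extract", input_tokens=2000, output_tokens=500)
record_call("pdf_extract", input_tokens=1000, output_tokens=250)
record_call("chat", input_tokens=500, output_tokens=500)
```

Once spend is broken down per feature, it becomes obvious which prompt to shorten first, which is exactly the lever prompt-structure optimization pulls.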

Observability before features

I instrument before I ship. Every production LLM call I've written emits latency, token counts, model version, and a trace ID. Debugging without this is guesswork — I've seen teams spend days on issues I can isolate in minutes.
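That instrumentation fits in a thin wrapper around the model call. A sketch of the idea, where the model call is a placeholder and whitespace splitting stands in for the token counts a real API returns in its usage field:

```python
import time
import uuid

events: list[dict] = []

def instrumented_call(model: str, prompt: str) -> str:
    trace_id = str(uuid.uuid4())
    start = time.perf_counter()
    response = f"echo: {prompt}"  # placeholder for the real model call
    events.append({
        "trace_id": trace_id,
        "model": model,
        "latency_ms": (time.perf_counter() - start) * 1000,
        # Rough proxy; production code reads token counts from the API response.
        "input_tokens": len(prompt.split()),
        "output_tokens": len(response.split()),
    })
    return response
```

In production the `events` list would be a structured log sink or metrics pipeline, and the trace ID propagates through every downstream hop so one bad request can be reconstructed end to end.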

[Architecture decision placeholder]

[One specific non-obvious architectural call you made and why — e.g., chose not to use a framework because X, or chose model Y over Z because of Z's behavior at high concurrency.]

The demo is not the product

I've rebuilt things that worked in demos and failed at [N]x load. The diff between a prototype and a production system is usually invisible until it isn't — I've learned to design for that gap from the start.

Writing

Latest posts

All posts →