Go LLM Gateway
Multi-provider routing, cost attribution, OpenTelemetry tracing, and a streaming proxy — engineered for predictable tail latency under high concurrency. Benchmarked head-to-head against LiteLLM.
Read the build →
AI agent engineering
Reliable under load, fully observable, and security-first — built for production, not demos.
The Lab
Open, benchmarked, production-minded agent systems. Each one ships with a writeup and the numbers behind it.
Multi-provider routing, cost attribution, OpenTelemetry tracing, and a streaming proxy — engineered for predictable tail latency under high concurrency. Benchmarked head-to-head against LiteLLM.
Read the build →
A security-first autonomous Kubernetes remediation agent. Human-in-the-loop approval gates, a strict action allowlist, and prompt-injection defense baked in. A Go operator paired with a planning agent over gRPC.
Read the build →
Tail latency, backpressure, and graceful degradation treated as first-class requirements, not afterthoughts.
Every request traced and attributed. If you can't see it, you can't run it in production.
Allowlists, approval gates, and injection defenses so an agent can act without acting recklessly.
About
AgenticCore Labs is led by Harpreet Singh — a software engineer with a decade spent building and operating distributed systems at scale.
Track record
Currently leading a team building scalable, secure Go and Java microservices on GCP for banking-scale payment infrastructure — distributed systems that process 150M+ transactions a day at ~1,700 TPS, with p95 latency held under 250ms.
Built distributed Go microservices, a Kubernetes monitoring operator, and an mTLS security framework for enterprise products.
Built a distributed, event-driven notification system — email, SMS, and push — on Apache Kafka for UMANG, India's government super-app. It powers citizen alerts across a platform that today serves 80M+ users and 2,000+ government services.
The platforms agents actually run on — proven, not just claimed.
The lab
Most AI agents are demos. The hard part is making them production-grade: distributed, scalable, and secure enough to run on real systems; predictable under load, fully observable, and safe to let act. That is the same engineering discipline behind everything I've built, now applied to AI agents, and it is what I bring to client work.
Writing
Deep dives on agent engineering — benchmarks, architecture, and the trade-offs behind each decision.
Articles
What actually happens to p95/p99 under concurrent load — and the engineering choices that keep it flat.
Coming soon →
Allowlists, human-in-the-loop checkpoints, and defending an autonomous agent against prompt injection.
Coming soon →
Same gateway, two languages. A head-to-head on throughput and tail latency once both projects ship.
Coming soon →
Work with me
Building an AI agent, hiring for a role, or asking anything else about the work? Send a message. I read every one.