Engineering Manager · AI Architect · Writer

    Building enterprise AI
    & teaching how.

    Engineering Manager and AI Architect by day. Here I share articles, courses, and hands-on guides on shipping cloud-native AI — distilled from 12+ years building at enterprise scale.

    Open to select roles & collaborations — work with me
    Avaneesh Yadav — Engineering Manager & AI Architect
    Currently
    Engineering Manager @ HashedIn
    Build & Architect

    I ship enterprise AI in production

    Agentic systems, LLM platforms, and cloud-native architecture — 12+ years turning hard problems into systems that scale.

    Write & Teach

    I share what I learn building it

    Articles, free courses, and hands-on guides on LLMs, RAG, MCP, and shipping AI — so you can build it too.

    Featured Course

    LLMOps for Engineering Managers

    Master the operational side of LLM systems — from inference optimization to production observability, cost control, and quality evaluation.

    Intermediate · 5 lessons · ~5 hr
    Start the course
    What you'll master
    • RAG systems & retrieval architecture
    • LLM inference & cost optimization
    • Production observability & evals
    Experience

    12+ years shipping at enterprise scale.

    Jun 2022 — Present

    Engineering Manager

    HashedIn by Deloitte

    Leading a multi-team engineering organization that builds AI-powered platforms and cloud-native products for global enterprise clients, including Herbalife, Kroger, HCA Healthcare, and Al Rajhi Takaful. Driving architecture, hiring, and delivery for 30+ engineers.

    Nov 2021 — May 2022

    Lead Application Engineer

    HexaView Technologies

    Managed a 10-member team delivering an investment portfolio accounting and rebalancing application. Introduced TDD, increasing delivery speed by ~40%.

    Jul 2019 — Oct 2021

    Senior Software Engineer

    Xoriant (Client: Finastra – Global PAYplus)

    Built components for a high-performance global payments hub used by large global and domestic banks. Conducted code reviews, performance tuning, and defect resolution.

    Feb 2018 — Jul 2019

    Senior Software Engineer

    Larsen & Toubro Infotech (LTI)

    Built domain and integration logic for an Annuity Operational Data Store consolidating data from multiple enterprise sources. Integrated Spring with IBM MQ via Spring Integration.

    Apr 2016 — Feb 2018

    Associate Consultant

    Syntel (Clients: FedEx, Mercedes-Benz)

    Developed backend systems and REST services for FedEx shipment processing and a Mercedes-Benz parts reporting platform. Implemented a microservices architecture and standardised exception handling.

    Aug 2013 — Feb 2016

    Software Engineer

    NextGenVision Technology Pvt. Ltd.

    Designed and developed RESTful APIs using Spring Boot and JAX-RS, integrating JSON and XML data models for enterprise banking systems. Built an automated test suite with JUnit and Mockito.

    Technical Skills
    56 skills across 8 areas
    AI & ML: OpenAI / GPT, LangChain, RAG, Vector DBs, LLMOps, Prompt Engineering, Agentic AI
    Languages & Frameworks: Java (8/11), Python, TypeScript, Spring Boot, Spring Batch, Spring Cloud, Hibernate, MyBatis
    Enterprise Integration: Apigee-X, IBM MQ, Apache Kafka, REST (JAX-RS), SOAP (JAX-WS), API Gateway, Spring Integration
    Cloud & DevOps: AWS, GCP, Azure, Kubernetes, Docker, Terraform, CI/CD, Jenkins
    Databases: PostgreSQL, MySQL, Oracle, DB2, MongoDB, Redis, BigQuery
    Observability & Tools: ELK / Elastic Stack, Grafana, Dynatrace, OpenTelemetry, SonarQube, JIRA, Confluence
    Frontend: React, Angular, Next.js, Node.js, Tailwind CSS, JavaScript
    Leadership: Engineering Management, Agile / Scrum, Hiring, Mentoring, HLD / LLD, Stakeholder Mgmt
    Zero-to-One Case Study

    Enterprise AI Knowledge Platform

    Taking RAG from prototype to 10,000+ daily users

    Read Full Case Study

    The Challenge

    A Fortune 500 client needed semantic search across 2M+ enterprise documents with sub-500ms response time, strict data residency requirements, and zero tolerance for hallucinated outputs.

    The Architecture

    Multi-tenant RAG pipeline with hybrid search (vector + BM25 fusion), domain-fine-tuned embeddings, and pgvector on PostgreSQL — eliminating the need for a dedicated vector database and its operational overhead.
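To make "vector + BM25 fusion" concrete, here is a minimal sketch of one common way to merge the two result lists: reciprocal rank fusion (RRF). The case study does not say which fusion method the platform used, so RRF, the function name, the document IDs, and the constant k=60 are all illustrative assumptions, not the production implementation.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Assumption: RRF is used only as an example fusion method here.
    Each document scores 1 / (k + rank) per list it appears in, and
    the per-list scores are summed.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists: one from a vector similarity query
# (for example against pgvector), one from a BM25 keyword query.
vector_hits = ["doc_a", "doc_b", "doc_c"]
bm25_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, while documents found by only one retriever still appear lower down. Because RRF works on ranks rather than raw scores, it avoids having to normalise cosine distances against BM25 scores.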

    10,000+ Daily Active Users
    94% Search Relevance Score
    <400ms p99 Latency
    $0 Additional Infra Cost
    Tech stack: OpenAI, LangChain, pgvector, Kubernetes, Spring AI, Redis
    Live · Powered by Llama 3.1

    Try the AI demo

    Ask an enterprise-AI question and see a live, RAG-style answer — a small demo of the patterns I write about.


    Production-grade answers in seconds.

    Powered by Groq · Llama 3.1-8B-Instant · A live demo, not professional advice

    Free Weekly Newsletter

    Scaled AI Weekly

    Join engineering leaders receiving my weekly insights on AI architecture — blueprints, post-mortems, and lessons from 12+ years in production.

    • Real architecture decisions — not tutorials
    • LLMOps patterns that survive production
    • Team leadership for engineering managers

    No spam. Unsubscribe any time.
    Join engineers from enterprise teams.

    Get in Touch

    Let's build something remarkable.

    Open to engineering leadership roles, advisory engagements, and conversations about AI, cloud architecture, and team building.

    Let's build something together

    Whether you're scaling an engineering org, launching an AI product, or modernising your cloud platform — I'd love to hear from you.

    Your message goes directly to my inbox.

    Abuildingai.in

    Enterprise AI architecture, engineering leadership, and production patterns — from 12+ years of shipping at scale.

    Scaled AI Weekly

    Enterprise AI insights, every Monday. Free.

    © 2026 Avaneesh Kumar Yadav · Engineered with intent.