archit15singh/README.md

Archit Singh

Senior Backend Engineer | Infra & System Design @ Scale

Kafka · Redis · Python · Postgres · GenAI · Distributed Systems · Observability


I design backend systems that stay reliable at scale, adapt fast to product needs, and fail predictably.
8+ years across infra-heavy teams building telemetry pipelines, orchestrators, and LLM-backed systems under concurrency, latency, and audit constraints.


🔩 What I Build

  • Distributed Cloud Applications → Microservices with predictable scale & recoverability
  • Stream Processing Pipelines → Kafka + Postgres + Redis at 10M+ events/month
  • Telemetry + Observability Systems → Tracing, metrics, SLA diagnostics (Prometheus, OTel)
  • LLM Agent Infrastructure → Memory-backed, tool-using multi-agent execution engines
  • Control Plane & Coordination → Consensus-safe orchestration, retries, failover resilience

🧠 Core Expertise

  • Distributed Systems: queues, state machines, eventual consistency
  • Infra Design: ingestion, orchestration, API contracts, failure budgets
  • Stream Processing: Kafka, Redis, Celery, Prefect
  • Observability: OpenTelemetry, Prometheus, Grafana, Sentry
  • GenAI Integration: agent memory, structured planning, tool use
  • Cloud & Ops: Docker, Kubernetes, AWS (ECS, CloudWatch), Terraform (basic)

🚀 Key Outcomes

  • Built streaming ingestion pipelines handling 10M+ events/month
  • Cut P95 latency by 45% and ETL time by 30% in clinical telemetry
  • Reduced cross-region failures by 35% through retry-safe orchestration
  • Logged full agent memory + tool usage telemetry for enterprise GenAI workflows
  • Redis-based observability platform acquired by Redis Inc (folded into RedisInsight)

🛠️ Featured Projects

  • 🧠 memoria: Long-term memory infra for agents — Redis + Neo4j + vector search with temporal + semantic context
  • 🧪 infrasim: Chaos simulation + fault injection platform with trace replay, SLO dashboards, and distributed failure visualizer
  • 🛰️ synapse: Modular agent framework with controller-worker pattern, task routing, tool policies, and memory-integrated planning
  • 🧾 cognify: Rule+LLM hybrid engine with YAML DSL, audit trails, retry-safe pipelines, and deterministic + generative reasoning fusion
  • 🧩 spectra: Observability-as-code for microservices and agents — OpenTelemetry auto-instrumentation with latency maps and SLA views

🌱 Side Projects & Explorations

  • 📦 Designing a Chrome DevTools-style UI for real-time Kafka + Redis pipeline debugging
  • 🔁 Building a trace-aware feedback loop for agent retries and subgoal recovery
  • 📊 Benchmarking multi-agent planning across QA, RAG, and vision-grounded reasoning
  • 🧬 Early research on “Project Episteme”: decentralized agents discovering novel scientific hypotheses
  • 🔍 Prototyping a GPT + tool + memory chain visualizer for auditing AI reasoning in real time

🌍 Connect with Me

Currently exploring Senior/Staff roles in distributed systems, observability, or cloud-native infra teams (e.g. telemetry, ingestion, real-time processing).
DMs open — let’s build resilient systems.

Pinned

  1. AgentLite (Public)

    Forked from SalesforceAIResearch/AgentLite

    Jupyter Notebook

  2. k3s (Public)

    Forked from k3s-io/k3s

    Lightweight Kubernetes

    Go

  3. robusta (Public)

    Forked from robusta-dev/robusta

    Better Prometheus alerts for Kubernetes - smart grouping, AI enrichment, and automatic remediation

    Python

  4. signoz (Public)

    Forked from SigNoz/signoz

    SigNoz is an open-source observability platform native to OpenTelemetry with logs, traces and metrics in a single application. An open-source alternative to DataDog, NewRelic, etc. 🔥 🖥. 👉 Open sour…

    TypeScript

  5. prettymaps (Public)

    Forked from marceloprates/prettymaps

    Draw pretty maps from OpenStreetMap data! Built with osmnx + matplotlib + shapely

    Jupyter Notebook

  6. mem0 (Public)

    Forked from mem0ai/mem0

    Memory for AI Agents; SOTA in AI Agent Memory; Announcing OpenMemory MCP - local and secure memory management.

    Python