About
I am a pragmatic programmer with a passion for solving problems using technology. An accomplished software and data engineer, I have over a decade of experience architecting and building robust, large-scale distributed systems, data pipelines, and web applications using technologies such as Airflow, Spark, Kafka, AWS, Python Flask, Go, Node.js, and React. My solutions have powered products and platforms that handle massive traffic and data volumes.
Work Experience
Skills
Check out my latest work
I've worked on a variety of projects, from simple websites to complex web applications. Here are a few of my favorites.
Latest Articles
I write about software engineering, databases, distributed systems, and other technical topics. Here are my latest articles.
- 📝
Everyone Should Know This About WAL: The Foundation of Database Durability
15 min read · A deep dive into write-ahead logs (WAL), the fundamental technique that ensures data durability in distributed systems, databases, and streaming platforms.
- 📝
Epoll, Kqueue, and Event Loops: Scaling C Network Servers
19 min read · How readiness-based I/O (epoll/kqueue) lets C servers scale: level vs. edge triggering, drain-until-EAGAIN, fair scheduling, timers, and backpressure, without 3 a.m. incidents.
- 📝
Page Cache, mmap, and When to Bypass It
22 min read · How the Linux page cache actually works, what mmap buys you over read/write, where readahead and writeback help (or hurt), and when O_DIRECT is the right tool rather than the default.
- 📝
Building Lock-Free Structures in C: Hazard Pointers vs. Epoch GC
22 min read · Implement lock-free stacks/queues in C and reclaim memory safely. Covers ABA, tagged pointers, hazard pointers, and epoch-based reclamation: what they are, when they win, and how to use them without footguns.
- 📝
High-performance DataFrames: Polars, pandas 2, and Arrow Interop
13 min read · Pick the right engine; exploit Apache Arrow and expression pipelines to get faster, more memory-efficient DataFrames in Python, without painting yourself into a corner.
- 📝
Memory: Refcounting, Generational GC, and Finding Leaks Without Guessing
17 min read · A production-first tour of CPython’s memory model: what refcounts really guarantee, how the cyclic GC works, why RSS doesn’t always go down, and how to reason about growth without guesswork. Part 1 lays out the mental model and the truths about refcounting.
- 📝
Profiling That Matters: py-spy, eBPF, perf, and Interpreter-level Counters
14 min read · A practical, low-overhead toolbox for Python performance: what to use (and when), how to keep overhead in single digits, and how to read profiles you can trust.
- 📝
Types that Pay for Themselves: Pydantic v2, mypy/pyright, and Runtime Contracts
14 min read · Make Python types carry their weight: combine static checking (mypy/pyright) with fast runtime validation (Pydantic v2) to turn annotations into contracts that prevent bugs, speed up onboarding, and keep hot paths fast.
- 📝
Building Fast Native Extensions: Cython, cffi, HPy, and a Tiny C-extension by Hand
17 min read · The shortest safe path from Python to C-speed: a practical roadmap for choosing Cython, cffi, HPy, or the raw C API, with a minimal wheelable extension, packaging/ABI mental models, and performance guardrails you can apply today.
- 📝
AsyncIO at Scale: Backpressure, Structured Concurrency, and Cancellation Semantics
14 min read · Build async services that stay responsive under load: apply backpressure with bounded queues, adopt structured concurrency, and make cancellation a contract with deadlines.
Get in Touch
Want to chat? Just shoot me a DM with a direct question on Twitter and I'll respond whenever I can. I will ignore all solicitations.