How I thought through the log package design before writing any code
Here's what clicked.
Understanding the structure of a log is one part of the learning curve. Actually sitting down to build it in Go is a whole different trajectory! Here's how the progression looked.
Step 0: Correlations; why logs matter
Logs are not optional in production. Obviously.
If something breaks at scale, especially in a distributed system, logs are often the only source of truth. No logs means no debugging, no replay, no visibility, and the most nerve-wracking on-calls of your life. How do you even begin to understand a system you can't observe?
That naturally led to a fundamental question:
What is a log under the hood?
At first glance, a log looks like “just writing to a file”. That's what I thought when I was building log rotation at earlier gigs, and technically it's not wrong. You do write to a file. But how do these things need to work?
Step 1: Understand what the system actually needs to do
How do these things need to work? Answer that at the business-logic level first, then move on to the code level.
A log needs to be append-only. Why? Because you can never edit the past (if only it were possible :P). Immutability is what makes replication safe, recovery simple, and debugging possible. The moment you allow mutation, you lose guarantees like: offset 42 always means the same thing.
Why does a log need an offset? It's a non-negative position index. Why non-negative? Because an offset represents a position in a sequence that only grows forward. You can never have a record at position -1. The constraint reflects the reality of how the data moves.
These are business requirements expressed as code constraints and design decisions. Getting clear on the why before the how means the code structure almost suggests itself. (Fun fact: read up on uint64; it's the natural type for an offset for exactly this reason.)
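To make that concrete, here's a minimal sketch of how those constraints could turn into Go types. The names (Record, Log, Append, Read) and the in-memory slice are illustrative assumptions, not the real package:

```go
package log

import "errors"

// Offsets are uint64 because a position in an append-only sequence
// can never be negative; the type itself encodes the constraint.
type Record struct {
	Value  []byte
	Offset uint64
}

var ErrOffsetNotFound = errors.New("offset not found")

type Log struct {
	records []Record
}

// Append only ever grows the sequence; nothing is ever rewritten,
// so offset 42 will always mean the same record.
func (l *Log) Append(rec Record) uint64 {
	rec.Offset = uint64(len(l.records))
	l.records = append(l.records, rec)
	return rec.Offset
}

// Read never mutates state.
func (l *Log) Read(off uint64) (Record, error) {
	if off >= uint64(len(l.records)) {
		return Record{}, ErrOffsetNotFound
	}
	return l.records[off], nil
}
```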
Curious: why not just keep writing to a single file? I use tail -f and grep on log files all the time while debugging. Why do I do that? Because the content is there, sequential and readable. But writing everything into a single file is a disaster:
- File grows indefinitely and is hard to manage
- Reads become slower, scanning from the start every time
- Compaction and deletion become painful
- Recovery becomes expensive
Makes so much sense! Go through the image on the structure of a log, or the tweet:
From the image it's quite clear that a log is a series of segments, and the index is what makes reads fast. Without it, finding offset X (say 1500) means scanning from byte 0. With it, you jump directly to the right byte position; this is the same principle as a database index.
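Here's a rough sketch of that jump. The entry widths (4 bytes for a relative offset, 8 bytes for a byte position) are assumptions for illustration; the point is that a fixed-width entry turns "find offset X" into a multiplication instead of a scan:

```go
package index

import "encoding/binary"

const (
	offWidth = 4                   // relative offset stored as uint32
	posWidth = 8                   // byte position in the store file, uint64
	entWidth = offWidth + posWidth // one fixed-width index entry
)

// Lookup jumps straight to entry rel*entWidth and decodes the byte
// position of that record in the store file. No scanning from byte 0.
func Lookup(idx []byte, rel uint32) (pos uint64, ok bool) {
	start := uint64(rel) * entWidth
	if start+entWidth > uint64(len(idx)) {
		return 0, false
	}
	return binary.BigEndian.Uint64(idx[start+offWidth : start+entWidth]), true
}
```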
Step 2: Understand the flow before picking packages
Once I knew what the system needed to do, I started thinking about the flow.
Write, append, read, open, close: these are the operations on a file that matter for the system being built here. And every one of them involves I/O, which means every one of them crosses from my Go program into the kernel.
That's when the os package started to look like what it actually is: a thin wrapper around syscalls. os.OpenFile, File.Write, File.Read are all thin veneers over the underlying system calls, and every call crosses the boundary from user space to kernel space.
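As a tiny illustration (the file name and flags below are just the usual append-only combination, nothing specific to this project), every call here is one trip across that boundary:

```go
package main

import (
	"log"
	"os"
)

func main() {
	// openat(2): O_APPEND means every write lands at the end of the file.
	f, err := os.OpenFile("app.log", os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0644)
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close() // close(2)

	// write(2): one syscall per call unless you buffer in user space.
	if _, err := f.Write([]byte("hello\n")); err != nil {
		log.Fatal(err)
	}
}
```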
This is the moment the flow became concrete for me:
my Go code → os package → syscall → kernel → disk
Each hop has a cost. And that cost compounds at scale. So the question shifted from "how do I write to a file" to "how do I design around the cost of writing to a file."
Syscalls are not free; they involve a context switch from user space to kernel space. What is the price we pay?
Step 3: Ask the core design question
Who is touching this file, when, and how many of them at once?
This question unlocks the concurrency design.
Single writer, no readers: trivial. Open, append, close. No coordination needed.
Multiple readers, no writer: safe. Reads don't mutate state. Everyone can read simultaneously without stepping on each other.
One writer + multiple readers: now there's a problem. If a goroutine is mid-write while another reads, the reader might see partial data. Torn reads. Corruption in perception, even if not on disk.
Multiple writers: chaos. Two goroutines appending simultaneously means interleaved bytes. The file becomes garbage.
Further reading: when to use a mutex (for concurrency).
Step 4: Design layers with single responsibilities
Once I understood the concurrency question, the layering defined in the book became obvious.
The log package has to handle: durable writes, fast reads, concurrent access, and eventually compaction. That's too much for one type. You split it.
Each layer gets one job:
- store: owns raw byte I/O. Knows nothing about offsets.
- index: owns offset → position mapping. Knows nothing about bytes.
- segment: composes store + index. Knows about one chunk of the log.
- log: owns all segments. Handles concurrency at the top.
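Here's how I picture those responsibilities as types. A sketch only, with field names and shapes that are my own assumptions; the direction of knowledge is the point, each layer knows about the one below it and nothing above:

```go
package log

import (
	"os"
	"sync"
)

// store: raw byte I/O only. It has no idea what an offset is.
type store struct {
	file *os.File
}

// index: offset -> byte position entries. It never sees record bytes.
type index struct {
	file *os.File
}

// segment: one chunk of the log, composed of a store and its index.
type segment struct {
	store      *store
	index      *index
	baseOffset uint64 // first offset this segment covers
	nextOffset uint64 // next offset it will assign
}

// Log: owns every segment and the lock; concurrency stops here.
type Log struct {
	mu            sync.RWMutex
	segments      []*segment
	activeSegment *segment
}
```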
Single responsibility makes each piece independently testable, independently replaceable, and simple enough to reason about, which is what makes it reliable. When something breaks in production, you need to know which layer broke... in any system.
Further reading: understand the log package layers with an example.
Step 5: Push concurrency to the top layer
This one is subtle, and it took a while to connect the dots.
Concurrency is managed at the log level. The log owns the Mutex. Every layer below it is single-threaded by contract.
Why? Because if concurrency is scattered across layers, you get lock ordering bugs, deadlocks, and races that are nearly impossible to reason about. When it lives in one place, you reason about it in one place.
The log guarantees: only one writer or multiple readers reach the layers below at any given time. The layers below trust that guarantee and don't think about it.
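A minimal sketch of what that contract could look like in code. A plain slice stands in for everything below the Log so the example is self-contained (in the real layering it would be the segments), and the names are assumptions; the RWMutex shows up again in step 6:

```go
package log

import (
	"errors"
	"sync"
)

var ErrOffsetNotFound = errors.New("offset not found")

type Log struct {
	mu      sync.RWMutex
	records [][]byte // stand-in for the segment/index/store layers below
}

// Append takes the write lock, so exactly one writer reaches the inner layer.
func (l *Log) Append(value []byte) uint64 {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.records = append(l.records, value)
	return uint64(len(l.records) - 1)
}

// Read takes the read lock, so many readers proceed in parallel,
// but never alongside a writer.
func (l *Log) Read(off uint64) ([]byte, error) {
	l.mu.RLock()
	defer l.mu.RUnlock()
	if off >= uint64(len(l.records)) {
		return nil, ErrOffsetNotFound
	}
	return l.records[off], nil
}
```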
Step 6: Minimize what will hurt at scale
Two specific things to minimize:
Further reading: understand the cost of a syscall.
Syscalls: batch writes with bufio.Writer. Eliminate index read syscalls entirely with mmap. One extra syscall per record is invisible at 100 records. At 10 million records per second it's the difference between a system that holds and one that falls over.
Lock contention: RWMutex instead of plain Mutex precisely because reads vastly outnumber writes in most log workloads. Multiple readers proceed in parallel. Only writes block.
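A rough sketch of the bufio.Writer half of that, assuming a store type and an 8-byte length prefix per record (both assumptions, not the actual format): small appends land in a user-space buffer, and only the occasional flush becomes a write syscall.

```go
package store

import (
	"bufio"
	"encoding/binary"
	"os"
)

const lenWidth = 8 // assumed length-prefix width per record

type store struct {
	file *os.File
	buf  *bufio.Writer
	size uint64
}

func newStore(f *os.File) (*store, error) {
	fi, err := f.Stat()
	if err != nil {
		return nil, err
	}
	return &store{file: f, buf: bufio.NewWriter(f), size: uint64(fi.Size())}, nil
}

// Append writes the length prefix and the record into the buffer, not the
// kernel. No lock here: the Log above guarantees one caller at a time.
func (s *store) Append(p []byte) (pos uint64, err error) {
	pos = s.size
	if err := binary.Write(s.buf, binary.BigEndian, uint64(len(p))); err != nil {
		return 0, err
	}
	n, err := s.buf.Write(p)
	if err != nil {
		return 0, err
	}
	s.size += uint64(n) + lenWidth
	return pos, nil
}

// Flush pays the syscall once for many buffered appends.
func (s *store) Flush() error {
	return s.buf.Flush()
}
```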
Step 7: Test persistence explicitly
Writing test cases is an art in itself, and one I'm still learning.
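The one test that matters most here is easy to state, though: whatever you appended must still be there after a close and a reopen. Here's a sketch at the rawest level, using the standard library directly rather than the package's own API:

```go
package log

import (
	"os"
	"path/filepath"
	"testing"
)

func TestAppendSurvivesReopen(t *testing.T) {
	path := filepath.Join(t.TempDir(), "segment.store")

	// First lifetime: append a record and close cleanly.
	f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0644)
	if err != nil {
		t.Fatal(err)
	}
	if _, err := f.Write([]byte("hello")); err != nil {
		t.Fatal(err)
	}
	if err := f.Close(); err != nil {
		t.Fatal(err)
	}

	// Second lifetime: reopen the path and confirm the bytes survived.
	got, err := os.ReadFile(path)
	if err != nil {
		t.Fatal(err)
	}
	if string(got) != "hello" {
		t.Fatalf("got %q, want %q", got, "hello")
	}
}
```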
The mental map
Before writing any system that touches files under concurrency:
→ understand the business constraints first (why append-only? why non-negative offsets?)
→ trace the full flow from your code to disk
→ ask who touches the data and when
→ let the access pattern determine the synchronization primitive
→ design layers with single responsibilities
→ push concurrency ownership to the top layer
→ minimize syscalls through buffering and memory mapping
→ test persistence explicitly
This isn't specific to a log package. The same pattern applies to every system that touches shared mutable state.