What makes Protobuf faster?

πŸ“–
Chapter 2 of Distributed Services with Go swaps JSON for Protocol Buffers. Before writing a single .proto file, I wanted to understand why: what actually makes Protobuf faster?

Understanding serialization

Example:

curl -X GET localhost:8080/consume -d '{"offset":0}'
{"record":{"value":"TGV0J3MgR28K"}}

Right now Record travels over HTTP as JSON:

{"record": {"value": "TGV0J3MgR28K"}}

That's a string of characters, and every character is a byte. Count them: that's ~38 bytes on the wire, most of it wrapper rather than actual data.

Serialization means taking your Go struct in memory and converting it to bytes you can send over the wire or store on disk.

JSON does this. But it's wasteful:

JSON:     {"id": 1234, "text": "buy milk", "done": false}
           47 bytes, human readable, lots of punctuation overhead

Protobuf: [binary bytes]
           ~15 bytes, not human readable, just the data

picture this:

YOUR GO STRUCT (in memory)
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ ID:   1234          β”‚
β”‚ Text: "buy milk"    β”‚         SERIALIZE
β”‚ Done: false         β”‚  ──────────────────►  bytes on the wire
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜


bytes on the wire  ──────────────────►  GO STRUCT (in memory)
                        DESERIALIZE

JSON serializes to text bytes (human readable, large). Protobuf serializes to binary bytes (not readable, compact).

Binary means instead of writing the characters "1234" (4 bytes), you write the actual number 1234 in binary (2 bytes). No quotes, no field names, no curly braces. Just the data!

ℹ️
Protocol Buffers (Protobuf) is Google's format for serializing structured data, an alternative to JSON for services talking to each other.

What makes Protobuf faster?

You will often read that Protobuf enables faster serialization, but what makes it fast? What exactly are we saving?

JSON has to do extra work that Protobuf skips entirely.

1. write opening {
2. write the field NAME "text" as characters  <- extra work
3. write :
4. write the VALUE "buy milk" as characters
5. write closing }

= {"text":"buy milk"} = 19 bytes

Protobuf encoding "buy milk" at field 2:

1. write field NUMBER 2 as a single byte   <- just a number, tiny
2. write the length
3. write the VALUE as raw bytes

= [binary] β‰ˆ 10 bytes

The field name is never sent. Only the field number. The receiver already knows from the .proto schema that field 2 is text. So you never pay the cost of sending "text" every single time.
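Those three steps can be sketched by hand in Go. The tag byte packs the field number and wire type together ((2<<3)|2 = 0x12), which is exactly how Protobuf frames a length-delimited field; this simplified encoder assumes the field number and length each fit in one varint byte (i.e. are below 128):

```go
package main

import "fmt"

// encodeStringField encodes a string as a Protobuf length-delimited
// field: tag byte, length byte, then the raw bytes. Simplified sketch:
// assumes field number and length both fit in a single varint byte.
func encodeStringField(fieldNum int, s string) []byte {
	tag := byte(fieldNum<<3 | 2) // wire type 2 = length-delimited
	out := []byte{tag, byte(len(s))}
	return append(out, s...)
}

func main() {
	b := encodeStringField(2, "buy milk")
	fmt.Printf("% x\n", b)       // 12 08 62 75 79 20 6d 69 6c 6b
	fmt.Println(len(b), "bytes") // 10 bytes vs 19 for {"text":"buy milk"}
}
```

No "text" anywhere in those bytes, only the number 2 packed into the tag.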

Imagine a log passing 1 million records per second. Each record:

JSON:     {"value":"TGV0J3MgR28K","offset":42}  =  36 bytes
Protobuf: [binary]                               ≈  14 bytes

1. Payload: less data on the wire

1 million records/sec × 36 bytes = 36 MB/sec   (JSON)
1 million records/sec × 14 bytes = 14 MB/sec   (Protobuf)

saving: ~22 MB/sec of network bandwidth

In cloud infrastructure bandwidth costs money. Less payload = lower bill.

2. Time: faster encode/decode

JSON:     parse character by character
          find field names, match strings
          handle whitespace, quotes, colons

Protobuf: read field number (1 byte)
          read length (1 byte)
          copy bytes directly

No string matching, no character scanning. Direct memory operations.
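The decode side is just as direct. A sketch of decoding the 10-byte field from earlier; real generated code does the same kind of work, with full varint handling and bounds checks:

```go
package main

import "fmt"

// decodeStringField reads a single length-delimited field: tag byte,
// length byte, then a direct slice of the payload. No scanning,
// no string matching. (Simplified: single-byte tag and length.)
func decodeStringField(wire []byte) (fieldNum int, value string) {
	fieldNum = int(wire[0] >> 3) // upper bits: field number
	length := int(wire[1])       // payload length
	value = string(wire[2 : 2+length])
	return fieldNum, value
}

func main() {
	// Wire bytes for "buy milk" at field 2.
	wire := []byte{0x12, 0x08, 'b', 'u', 'y', ' ', 'm', 'i', 'l', 'k'}
	n, v := decodeStringField(wire)
	fmt.Println(n, v) // 2 buy milk
}
```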

Ballpark figures (exact numbers vary by payload and library):

JSON encode/decode:     ~500ns per record
Protobuf encode/decode: ~100ns per record

at 1M records/sec -> saves ~400ms of CPU every second

3. Memory: smaller structs in flight

JSON in memory:     string with all those characters allocated
Protobuf in memory: tight byte slice, no overhead

Less memory allocated = less garbage collection = more consistent latency.

Where each saving matters

PAYLOAD    ->  matters on the wire (network/transport layers)
               services talking to each other
               mobile clients on slow networks
               cross-region traffic (very expensive)

TIME       ->  matters at high throughput
               Kafka-like systems, millions of msgs/sec
               real-time pipelines

MEMORY     ->  matters at scale
               many concurrent connections
               services with tight memory limits (k8s pods)

Net-net: Protobuf is fast because it never sends field names, never scans characters, and never allocates string overhead. It works directly with numbers and bytes.

JSON was designed for humans to read. Protobuf was designed for machines to process. That single design difference ripples into savings across payload, time, and memory simultaneously.