What makes Protobuf faster?
Understand serialization
Example:

curl -X GET localhost:8080/consume -d '{"offset":0}'
{"record":{"value":"TGV0J3MgR28K"}}

Right now Record travels over HTTP as JSON:

{"record": {"value": "TGV0J3MgR28K"}}

That's a string of characters. Every character is a byte. Count them: that's ~40 bytes just for the wrapper.
Serialization means taking your Go struct in memory and converting it to bytes you can send over the wire or store on disk.
JSON does this. But it's wasteful:
JSON:     {"id": 1234, "text": "buy milk", "done": false}
          47 bytes, human readable, lots of punctuation overhead

Protobuf: [binary bytes]
          ~15 bytes, not human readable, just the data

Picture this:
YOUR GO STRUCT (in memory)
┌─────────────────────┐
│ ID:   1234          │
│ Text: "buy milk"    │      SERIALIZE
│ Done: false         │  ───────────────►  bytes on the wire
└─────────────────────┘

                             DESERIALIZE
bytes on the wire        ───────────────►  GO STRUCT (in memory)

JSON serializes to text bytes (human readable, large). Protobuf serializes to binary bytes (not readable, compact).
Binary means instead of writing the characters "1234" (4 bytes), you write the actual number 1234 in binary (2 bytes). No quotes, no field names, no curly braces. Just the data!
What makes Protobuf faster?
You will often read that Protobuf enables faster serialization, but what actually makes it fast? What is it buying us? What are we saving?
JSON has to do extra work that Protobuf skips entirely.

JSON encoding "buy milk" at field "text":
1. write opening {
2. write the field NAME "text" as characters  <- extra work
3. write :
4. write the VALUE "buy milk" as characters
5. write closing }
= {"text": "buy milk"} = 20 bytes

Protobuf encoding "buy milk" at field 2:
1. write field NUMBER 2 as a single byte  <- just a number, tiny
2. write the length
3. write the VALUE as raw bytes
= [binary] ≈ 10 bytes

The field name is never sent. Only the field number. The receiver already knows from the .proto schema that field 2 is text. So you never pay the cost of sending "text" every single time.
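Those three steps can be written out by hand. This sketch assumes a field number and length that each fit in a single varint byte (i.e. under 16 and under 128 respectively), which holds for this example:

```go
package main

import "fmt"

// encodeStringField hand-encodes a Protobuf length-delimited field:
// one tag byte (field number << 3 | wire type 2), the length,
// then the raw value bytes.
func encodeStringField(fieldNum int, s string) []byte {
	tag := byte(fieldNum<<3 | 2) // wire type 2 = length-delimited
	out := []byte{tag, byte(len(s))}
	return append(out, s...)
}

func main() {
	b := encodeStringField(2, "buy milk")
	fmt.Printf("%x\n", b)        // 120862757920206d696c6b-style hex dump
	fmt.Println(len(b), "bytes") // 10 bytes
}
```

One tag byte (0x12), one length byte (0x08), eight bytes of payload: 10 bytes total, versus 20 for the JSON version.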
Imagine a log passing 1 million records per second. Each record:

JSON:     {"value":"TGV0J3MgR28K","offset":42} ≈ 42 bytes
Protobuf: [binary] ≈ 14 bytes

1. Payload: less data on the wire

1 million records/sec × 42 bytes = 42 MB/sec (JSON)
1 million records/sec × 14 bytes = 14 MB/sec (Protobuf)
saving: 28 MB/sec of network bandwidth

In cloud infrastructure, bandwidth costs money. Less payload = lower bill.
2. Time: faster encode/decode

JSON:     parse character by character
          find field names, match strings
          handle whitespace, quotes, colons

Protobuf: read field number (1 byte)
          read length (1 byte)
          copy bytes directly

No string matching, no character scanning. Direct memory operations.
JSON encode/decode:     ~500ns per record
Protobuf encode/decode: ~100ns per record
at 1M records/sec -> saves 400ms of CPU per second

3. Memory: smaller structs in flight

JSON in memory:     a string with all those characters allocated
Protobuf in memory: a tight byte slice, no overhead

Less memory allocated = less garbage collection = more consistent latency.
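You can measure this gap yourself. The sketch below benchmarks json.Marshal against a hand-rolled binary encoder for the same hypothetical Record; the ~500ns/~100ns figures above are illustrative, and your absolute numbers will vary by machine:

```go
package main

import (
	"encoding/json"
	"fmt"
	"testing"
)

// Record is a hypothetical struct standing in for the one in the service.
type Record struct {
	Value  []byte `json:"value"`
	Offset uint64 `json:"offset"`
}

// binaryEncode hand-rolls a compact tag/length/value encoding,
// close to what Protobuf emits for a message like this.
// It assumes len(Value) and Offset each fit in one varint byte.
func binaryEncode(r Record) []byte {
	out := []byte{0x0A, byte(len(r.Value))} // field 1, length-delimited
	out = append(out, r.Value...)
	out = append(out, 0x10, byte(r.Offset)) // field 2, varint
	return out
}

func main() {
	rec := Record{Value: []byte("Let's Go\n"), Offset: 42}

	jsonRes := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			json.Marshal(rec)
		}
	})
	binRes := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			binaryEncode(rec)
		}
	})
	fmt.Println("json:  ", jsonRes) // ns/op varies by machine
	fmt.Println("binary:", binRes)
}
```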
Where each saving matters

PAYLOAD -> matters at Layer 3 (transport)
    services talking to each other
    mobile clients on slow networks
    cross-region traffic (very expensive)

TIME -> matters at high throughput
    Kafka-like systems, millions of msgs/sec
    real-time pipelines

MEMORY -> matters at scale
    many concurrent connections
    services with tight memory limits (k8s pods)

Net-net: Protobuf is fast because it never sends field names, never scans characters, and never allocates string overhead. It works directly with numbers and bytes.
JSON was designed for humans to read. Protobuf was designed for machines to process. That single design difference ripples into savings across payload, time, and memory simultaneously.