The Journey of Exploring Scripting Language for gokakashi

We’ve been building gokakashi in Go for a while now, and we had this wild thought: “What if users could define their own evaluation conditions on top of the scan reports?” Something like this in the config file:

check:
  condition: |
    sev.critical > 0 || sev.high > 0 && fixedversion.exist
  notify:
    - team-linear

Sounds cool, right? But here’s the thing—I had absolutely no clue what kind of "scripting language" I’d need for this. I mean, sure, I know bash, but how do I even begin to enable something as simple and dynamic as this? Why would we need this in the first place? And what do we even call this functionality?

The Vision: Why Do We Need This?

(After a lot of scribbling in my notebook and eating peanuts...)
I wanted users to dynamically define conditions like sev.critical > 0 || sev.high > 0 && fixedversion.exist in a way that’s simple, scalable, and doesn’t involve me writing code for every possible permutation and combination of conditions. The whole idea was to:

  1. Make it easier for users: Let them define conditions in a structured format they already know.
  2. Avoid hardcoding and redundancy: Instead of hardcoding every condition in Go, we needed a way to evaluate user-defined rules dynamically.
  3. Scalability: The tool should grow with user needs, without bloating the codebase with edge cases or repetitive logic.
  4. Call me lazy: I didn’t want to whack my keyboard all weekend. I had a jazz evening planned—Anthony Lazaro was in town, and Italians + jazz are a combo I can’t resist!

The First Attempt: Go Hardcoded Logic

Like every self-taught explorer, I dove in headfirst. I directly implemented the desired conditions in Go:

sev.critical > 0 || sev.high > 0 && fixedversion.exist

This meant defining keywords like sev, critical, and fixedversion. It sounded straightforward, so I went ahead, curious to find out what silly surprises the task was hiding.
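In spirit, the hardcoded version looked something like the sketch below. The Report and SeverityCounts types and the shouldNotify function are hypothetical names made up for illustration, not gokakashi’s actual code.

```go
package main

import "fmt"

// SeverityCounts and Report are hypothetical types for this sketch;
// they are not gokakashi's real data model.
type SeverityCounts struct {
	Critical, High, Medium, Low int
}

type Report struct {
	Sev                SeverityCounts
	FixedVersionExists bool
}

// shouldNotify hardcodes exactly one condition:
//   sev.critical > 0 || sev.high > 0 && fixedversion.exist
func shouldNotify(r Report) bool {
	return r.Sev.Critical > 0 || (r.Sev.High > 0 && r.FixedVersionExists)
}

func main() {
	r := Report{Sev: SeverityCounts{High: 2}, FixedVersionExists: true}
	fmt.Println(shouldNotify(r))
}
```

Note that && binds tighter than ||, so the parentheses only make the existing grouping explicit.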

And guess what? It worked!

But as I sat there feeling smug, reality hit me like a freight train:

  • What if users wanted different conditions? Say, sev.medium > 5 && sev.low > 3. Do I write custom code for every combination? The thought was exhausting.
  • What keywords should I define? How do I make them intuitive?
  • And most importantly: how could I finish this before my jazz night?

So, the problems I faced:

  • Rewriting the code again and again for every new/custom condition.
  • Handling the endless combinations of user requirements.
  • Code getting cluttered, repetitive, and just plain ugly.
  • So many unknown factors. The more I thought about it, the less scalable it felt.

Some of these were still speculation at that point, but it was clear: hardcoding logic wasn’t the solution. I needed a way for users to define any condition they wanted and evaluate it without adding complexity to my code. Otherwise, this would mean introducing our own language. And then I stumbled on DSLs.

Enter Domain-Specific Languages (DSLs)

While Googling, I stumbled on DSLs—specialized "mini-languages" for specific tasks. One article by Rudderstack on creating a DSL for JSON transformations helped me understand the concept here.

But their solution wasn’t what I needed. I didn’t want to reinvent the wheel. I wanted something simple, lightweight, and widely adopted, so that gokakashi users could pick it up quickly.

Our goal was to keep gokakashi fast and intuitive for users.

Finding CEL (Common Expression Language)

Then it clicked: what does Kubernetes, our favorite monster, use for rule evaluation? CEL, through the cel-go library.
CEL is lightweight, expressive, and designed for evaluating conditions on structured data like JSON and Protocol Buffers. Exactly what I needed.

Why CEL?

  1. Dynamic Logic: Users can write conditions like sev.critical > 0 && fixedversion.exist without hardcoding anything.
  2. Built-in Features: CEL comes with operators (&&, ||), functions and macros (size(), exists()), and type safety baked in.
  3. Structured Data: Works seamlessly with JSON or Protocol Buffers, which is what we use for reports.

I found the official docs and dived in. It made sense: CEL lets you evaluate conditions dynamically at runtime without bloating your code. But what about keywords like sev and critical? So the next task was to understand how CEL actually works, and I simply followed the docs, working through each concept they mention.

What Are Protocol Buffers (Protobuf)?

Before diving into how CEL works, I touched on Protocol Buffers (Protobuf) because they seem to go hand-in-hand with CEL.

Protobuf is a way to serialize structured data into a compact binary format. Conceptually it plays the same role as JSON or XML, but it is generally smaller and faster to parse. Here’s why:

  1. Language and Platform Neutral: Works across languages (Go, Python, Java) and platforms. Think about it: if you have three microservices, each written in a different programming language and running on a different platform, how do you exchange data between them? Would you write separate code in each service to produce the format every other service expects? Or would you use a neutral format like Protobuf, where every microservice can interpret the serialized data on its own and none of them has to do homework for the others?
  2. Binary Format: Smaller size, faster serialization/deserialization, and minimal bandwidth usage, making it ideal for scenarios like IoT, mobile apps, or any system with limited network resources.
  3. Schema-Based: Protobuf keeps the schema separate from the data, so field names are not repeated in every message the way they are in JSON’s self-describing model.

Example schema:
message User {
  string name = 1;
  int32 age = 2;
  string role = 3;
}

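This is serialized into a super-compact binary format. While experimenting, the output I printed looked something like the line below (strictly speaking, this is how Go prints a struct holding two pointers, here labelled ast, rather than the raw Protobuf bytes):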

ast:&{0x1400027f7c0 0x140003d84e0}

CEL: Evaluating Conditions Dynamically

CEL works on structured data like JSON or Protobuf. It’s designed to evaluate conditions like this:

age > 18 && role == "admin"

// Remember: age and role are the fields of the structure you hand to CEL.

So, why is this powerful? You don’t have to write custom logic for every condition in your application. CEL handles:

  • Logical Operators: &&, ||, !, etc.
  • Collection Functions: exists(), all(), filter().
  • Type Safety: Ensures the condition matches the data type (e.g., age is an integer).

So it finally clicked: since our reports are JSON, I honestly don’t have to define keywords like sev or critical at all. CEL can work with the JSON structure itself, so users can write conditions on their own with the knowledge they already have. Something like this:

report.Results[0].Vulnerabilities.exists(v, v.Severity == 'CRITICAL')

To sum up in points:

  • Users define conditions directly on JSON data.
  • No hardcoding required.
  • It’s fast, efficient, scalable, and extensible.

So, I thought:

  • Load and send the raw JSON data.
  • Define CEL expressions to operate on the fields in the structured data.

The condition expression in this example is written against the JSON report that the Trivy scanner generates.

Parse the JSON directly into a []map[string]interface{} and pass the parsed data to CEL. - directJSONParser.go

So, did it solve what I need?

Yes, it did. But with this new understanding, our question had to be rephrased: is this what we want in gokakashi?

  • We are going to support multiple scanners. Does this mean we pass every scanner’s raw report to CEL and let users adapt their conditions to each scanner’s JSON structure?
  • Or should we standardize all scanner reports into a common format?
  • Would standardization speed things up and simplify them, or would it add bloat and degrade the user experience?
  • How do we balance flexibility and ease of use?
  • How do we execute actions after the conditions have been evaluated?

Stay tuned for the next post, where we dive into this decision-making process! The best part so far is discussing such decisions and how we arrive at them. But but but... if you are curious... Yes, I’m glad I didn’t miss Anthony Lazaro’s jazz night. I enjoyed my evening. Hehehe... He even spoke to me amongst the crowd. ☺️

Lessons Learned

  • Start Simple: I first tried hardcoding the logic to uncover gaps in my knowledge that I wasn’t even aware of. It helped me figure out what I needed.
  • Iterate and Explore: I got stuck, Googled, and asked myself “what’s missing?” until I landed on the right tools, which required me to rephrase my question again and again.
  • Think: Reiterate the need and justify it. We wanted gokakashi to be faster. What did it need to be faster in this case? How could it be faster? By using something the masses of users are already familiar with, which makes it easy to adopt and use quickly: something they already know, without having to learn a domain-specific language. That’s one form of scalable thinking, I would say.

Resources