Buf + Pants = :heart:

What is Protocol Buffers?

I’m glad you asked! Protocol Buffers, or protobufs for short, let you structure and serialize data in a forward-compatible and backward-compatible way. Think of it like JSON, but smaller and faster. Your protobufs (stored in .proto files) are compiled into generated code for your language of choice, allowing you to interact with your data with the same ease as any other object in your codebase, whether that’s Python, Go or something else.

Imagine we want to create a protobuf to represent a user in our system. Here’s what a simple project/user.proto would look like, ignoring the rather horrible formatting for now:

syntax = "proto3";



message User
{
  int32 id = 1;
  string username = 2;
    string email = 3;
  bool active = 4;

  enum Rank {
    ADMIN = 0;
      MOD = 1;
    REGULAR = 2;
  }

  Rank rank = 5;
}

Normally, at this stage, we’d have to invoke protoc and turn this protobuf into code for our language(s). Luckily for us, Pants handles that automatically, so we can jump straight into coding! Here are two Python examples, one for writing and one for reading:

from project.user_pb2 import User

user = User(
    id=1,
    username="foobar",
    email="[email protected]",
    active=True,
    rank=User.Rank.REGULAR,
)

print(user.SerializeToString())  # b'\x08\x01\x12\x06foobar\x1a\[email protected] \x01(\x02'

import sys
from project.user_pb2 import User

user = User()

with open(sys.argv[1], "rb") as f:
    user.ParseFromString(f.read())

print(user.username)  # foobar

Pretty neat, huh? Since protobufs are forward-compatible and backward-compatible, we could extend our protobuf with a string password_hash = 6; field and the code samples above would continue to work, even if we end up deploying an updated version of only one of them. If only the writer is updated, the reader will simply ignore the unknown password_hash field, and if only the reader is updated, it will simply use the default value for password_hash’s type (an empty string in this case).
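To make the "ignore unknown fields, default the missing ones" behaviour a bit more concrete, here is a rough, stdlib-only sketch of the wire format. It is not the real protobuf library: the helper names (encode_field, decode_known), the schema dictionaries and the "s3cr3t-hash" value are all made up for illustration, and it only handles the two wire types our User fields need. It also simply drops unknown fields, which is a simplification of what the real library does.

```python
# Stdlib-only sketch of the protobuf wire format; NOT the real protobuf library.
# Only two wire types are handled: 0 (varint) and 2 (length-delimited).
# All helper names and sample values are hypothetical.


def encode_varint(value):
    """Encode a non-negative int as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data, pos):
    """Return (value, next_position) for the varint starting at pos."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_field(number, value):
    """Encode one field: str as length-delimited, int/bool as varint."""
    if isinstance(value, str):
        payload = value.encode("utf-8")
        return encode_varint(number << 3 | 2) + encode_varint(len(payload)) + payload
    return encode_varint(number << 3 | 0) + encode_varint(int(value))


def decode_known(data, known_fields):
    """Decode data, keeping only the field numbers listed in known_fields."""
    result, pos = {}, 0
    while pos < len(data):
        tag, pos = decode_varint(data, pos)
        number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:
            value, pos = decode_varint(data, pos)
        elif wire_type == 2:
            length, pos = decode_varint(data, pos)
            value = data[pos:pos + length].decode("utf-8")
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} is not handled in this sketch")
        if number in known_fields:  # unknown fields are read past and dropped
            result[known_fields[number]] = value
    return result


OLD_SCHEMA = {1: "id", 2: "username"}
NEW_SCHEMA = {1: "id", 2: "username", 6: "password_hash"}

# Updated writer, old reader: field 6 is skipped over.
new_bytes = encode_field(1, 1) + encode_field(2, "foobar") + encode_field(6, "s3cr3t-hash")
old_reader_view = decode_known(new_bytes, OLD_SCHEMA)

# Old writer, updated reader: field 6 is absent, so fall back to the default.
old_bytes = encode_field(1, 1) + encode_field(2, "foobar")
new_reader_view = decode_known(old_bytes, NEW_SCHEMA)
new_reader_view.setdefault("password_hash", "")

print(old_reader_view)  # {'id': 1, 'username': 'foobar'}
print(new_reader_view)  # {'id': 1, 'username': 'foobar', 'password_hash': ''}
```

The trick is that every field carries its own number and wire type, so a reader can always tell how many bytes to skip for a field it has never heard of.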

All good so far? Sweet, but let’s buf it up a notch!

Introducing Buf

As of 2.11, Pants supports buf, a CLI tool for working with Protocol Buffers. More specifically, Pants supports buf’s formatting and linting capabilities. To enable them, simply add "pants.backend.codegen.protobuf.lint.buf" to backend_packages in your pants.toml.
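As a rough sketch, a pants.toml with the buf backend enabled could look something like the fragment below. Only the buf line comes from this post; the other two backends are an assumption about a Python repo that generates code from its protobufs, so adjust the list to whatever your repo already uses.

```toml
[GLOBAL]
backend_packages = [
  # Assumed: a Python repo with protobuf code generation enabled.
  "pants.backend.python",
  "pants.backend.codegen.protobuf.python",
  # The backend from this post: buf formatting and linting.
  "pants.backend.codegen.protobuf.lint.buf",
]
```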

Remember the not-so-pretty protobuf we defined above? After running ./pants fmt project/user.proto, it looks like this:

syntax = "proto3";

message User {
  int32 id = 1;
  string username = 2;
  string email = 3;
  bool active = 4;

  enum Rank {
    ADMIN = 0;
    MOD = 1;
    REGULAR = 2;
  }

  Rank rank = 5;
}

Much better! With automated formatting we don’t have to waste time and energy deciding how .proto files should be formatted. Did I mention there’s support for linting as well? Let’s run ./pants lint project/user.proto!

project/user.proto:1:1:Files must have a package defined.
project/user.proto:10:5:Enum value name "ADMIN" should be prefixed with "RANK_".
project/user.proto:10:5:Enum zero value name "ADMIN" should be suffixed with "_UNSPECIFIED".
project/user.proto:11:5:Enum value name "MOD" should be prefixed with "RANK_".
project/user.proto:12:5:Enum value name "REGULAR" should be prefixed with "RANK_".

Yikes! Buf’s linting capabilities enforce consistency and keep our protobufs in line with best practices. One of the things it wants us to do in the example above is to prefix our enum value names with the name of the enum (e.g. RANK_ADMIN instead of just ADMIN). This is because protobuf enums use C++ scoping rules: enum values are scoped alongside their enum rather than inside it, making it impossible for two enums in the same scope to share enum value names. There are a couple of other errors in there as well. If you’re interested in the explanation for them (and others), you can check out Buf’s well-written documentation on the subject.
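If you are curious what the two enum rules from the output above boil down to, here is a small Python sketch that mimics them. It is only an approximation written for this post, not Buf’s actual implementation: check_enum_values is a made-up helper, and the "fixed" value names and numbers further down are just one hypothetical way to satisfy the rules.

```python
import re


def check_enum_values(enum_name, values):
    """Mimic two of Buf's enum naming checks (approximation, not Buf's code).

    values maps enum value names to their numbers.
    """
    # Rank -> RANK_, UserRank -> USER_RANK_
    prefix = re.sub(r"(?<!^)(?=[A-Z])", "_", enum_name).upper() + "_"
    problems = []
    for name, number in values.items():
        if not name.startswith(prefix):
            problems.append(
                f'Enum value name "{name}" should be prefixed with "{prefix}".'
            )
        if number == 0 and not name.endswith("_UNSPECIFIED"):
            problems.append(
                f'Enum zero value name "{name}" should be suffixed with "_UNSPECIFIED".'
            )
    return problems


# The Rank enum from user.proto, as written above.
original = check_enum_values("Rank", {"ADMIN": 0, "MOD": 1, "REGULAR": 2})

# A hypothetical renaming that satisfies both rules.
fixed = check_enum_values(
    "Rank",
    {"RANK_UNSPECIFIED": 0, "RANK_ADMIN": 1, "RANK_MOD": 2, "RANK_REGULAR": 3},
)

for problem in original:
    print(problem)
print(fixed)  # []
```

Keep in mind that renumbering enum values like this is only safe before anything has been serialized with the old numbers, otherwise it is exactly the kind of breaking change we are trying to avoid.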

As mentioned before, protobufs are forward-compatible and backward-compatible - but it’s up to the developer to avoid making breaking changes. Using Buf’s linter will make sure that your protobufs are structured and written in a consistent and well-thought-out way, making them much easier to extend and work with, now as well as in the future.

My Pants experience

During my almost ten years at Snowfall, one of my proudest achievements has been to take the company’s CI/CD from barely existing to what others have called state of the art. About a year ago we improved that setup further by throwing out a bunch of custom tools and scripts that required both energy and time to maintain and replacing them with Pants.

There were some things we had and did that Pants wasn’t capable of yet though, and one of them was buf’s linting. Now, if Pants were proprietary, closed-source software, we’d send in a feature request and hope for the best, but luckily it’s not! Fine. Okay, you got me. I did open a feature request at first anyway, but you have to start somewhere.

I eventually took the time to look into it further though. Writing a new linter for Pants, how hard can it be? Turns out, not hard at all! Since I had already been dabbling with Pants for a while, I knew the core concepts, and reading through the plugin documentation as well as looking at existing linters gave me enough knowledge to get a proof of concept working. I had some issues with transitive dependencies as well as source roots, but the Pants maintainers and the rest of the community were very welcoming and willing to help out (extra shout out to Eric Arellano!), so it didn’t take long before those issues were sorted.

All in all, support for buf lint took a few days to complete, while support for buf format took an evening. It was a very pleasant experience and whetted my appetite for contributing more code to Pants, so who knows what the future will bring!