yumh

writing about things, sometimes.

The state of sha256 in got

Written by Omar Polo on 18 July 2024 while listening to “Il Vuoto” by Franco Battiato.

TL;DR: during the c2k24 OpenBSD hackathon I worked on Game of Trees, in particular on getting SHA256 working. It works fine, except for the network protocol (and gotd), where more work is still needed.

For several years now, git has been working on making the hash function customizable per repository. It's not a trivial task since git revolves around SHA1 for almost everything, so this required thinking about bare repository extensions, network protocol changes, bundle format changes and so on.

Recap: git is basically a key-value content-addressed database where every object inside it (blob, tree, commit or tag) is stored based on its own hash.
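To make the recap concrete, here is a minimal sketch in Python (not got's code, which is C) of how an object's key is derived: git hashes a small header, "<type> <size>" followed by a NUL byte, and then the content. The function name git_object_id is made up for this illustration.

```python
import hashlib

def git_object_id(obj_type: str, data: bytes, algo: str = "sha1") -> str:
    """Return the hex id of a git object: hash of "<type> <size>\\0" + content."""
    header = f"{obj_type} {len(data)}\0".encode()
    return hashlib.new(algo, header + data).hexdigest()

content = b"hello\n"
# Same content, two different keys depending on the repository's hash.
print(git_object_id("blob", content, "sha1"))    # 40 hex characters
print(git_object_id("blob", content, "sha256"))  # 64 hex characters
```

The same blob gets a completely different (and longer) id in a SHA256 repository, which is why every place that stores, parses or prints an id has to know which algorithm is in use.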

My plans for c2k24 were to clean up and complete what I started about a year ago: read-only support for SHA256 repositories in got. Hacking went so smoothly that it exceeded my own expectations: I also got the worktree, object and packfile creation, and even dumping and loading of git bundles working before leaving Prague. (I wrote the bundle v3 support at the airport while waiting for my flight, so I was technically still there!)

Most of the work has been in building some internal interfaces to abstract over the hashing algorithm, finding decent ways to allow the same code to parse both SHA1 and SHA256 objects and packfiles, and fixing every regression introduced by these changes.
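As a rough illustration of what "abstracting over the hashing algorithm" can look like, here is a small Python sketch. It is not how got does it (got is written in C, and its actual interfaces are not shown in this post); the names HashAlgo, ALGOS and parse_object_id are invented for the example. The idea is that code asks a descriptor for digest sizes instead of hard-coding 20 bytes or 40 hex characters.

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class HashAlgo:
    name: str        # name understood by hashlib
    digest_len: int  # raw digest size in bytes

    @property
    def hex_len(self) -> int:
        # Length of the id when written as hexadecimal text.
        return self.digest_len * 2

    def digest(self, data: bytes) -> bytes:
        return hashlib.new(self.name, data).digest()

# Adding a new algorithm means adding one entry here (hypothetical table).
ALGOS = {
    "sha1": HashAlgo("sha1", 20),
    "sha256": HashAlgo("sha256", 32),
}

def parse_object_id(hex_id: str, algo: HashAlgo) -> bytes:
    """Parse a hex object id, rejecting ids whose length doesn't match algo."""
    if len(hex_id) != algo.hex_len:
        raise ValueError(f"expected {algo.hex_len} hex digits for {algo.name}")
    return bytes.fromhex(hex_id)
```

With something along these lines, parsers take the algorithm as a parameter, so the same code path can handle ids from both kinds of repositories.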

I’d like to stress that it’s not “just” SHA256: these changes are about making the hashing algorithm customizable. If SHA256 is considered broken tomorrow, adding support for a new hashing scheme should be a matter of a couple of minutes, or at least that’s the goal.

Once got was able to read SHA256 repositories, enabling it to write objects and packfiles was actually fairly easy. Support for git bundle v3, which introduces a way to tell which hashing algorithm was used, was also very straightforward.
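For the curious, this is roughly what the v3 header adds, as I understand git's bundle format documentation: after the "# v3 git bundle" signature come optional capability lines starting with "@", such as "@object-format=sha256", then prerequisites and references, then a blank line and the packfile. The Python sketch below parses just that header; parse_bundle_header is a made-up name and this is not got's implementation.

```python
import hashlib
import io

def parse_bundle_header(stream):
    """Return (version, object_format, prerequisites, refs) from a bundle header."""
    signature = stream.readline()
    if signature == b"# v2 git bundle\n":
        version = 2
    elif signature == b"# v3 git bundle\n":
        version = 3
    else:
        raise ValueError("not a git bundle")

    object_format = "sha1"  # assumed when no capability says otherwise
    prerequisites, refs = [], {}
    while True:
        line = stream.readline()
        if line in (b"\n", b""):  # a blank line ends the header
            break
        text = line.rstrip(b"\n").decode()
        if version == 3 and text.startswith("@"):
            key, _, value = text[1:].partition("=")
            if key == "object-format":
                object_format = value
            # A real reader should reject capabilities it doesn't understand.
        elif text.startswith("-"):
            prerequisites.append(text[1:].split(" ", 1)[0])
        else:
            oid, _, refname = text.partition(" ")
            refs[refname] = oid
    return version, object_format, prerequisites, refs

# Hand-built sample header; the id is just a placeholder of the right length.
oid = hashlib.sha256(b"example").hexdigest()
sample = (b"# v3 git bundle\n@object-format=sha256\n"
          + oid.encode() + b" refs/heads/main\n\n")
print(parse_bundle_header(io.BytesIO(sample)))
```

In a real bundle the packfile data follows right after the blank line, so the stream is left positioned at the start of the pack.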

The next things to do are merging my pending changes, enabling more tests to run in “SHA256” mode, writing a few more, and adjusting some documentation.

The big thing still missing is the network support. As of now got can't clone, fetch or send from/to SHA256 repos; the only way to move data around is to transfer bundles with ‘gotadmin dump’ and ‘gotadmin load’. Fixing this is less straightforward, as it will require implementing the git protocol v2 in both gotd and the got cli.

(A nice side-effect of implementing the protocol v2 in gotd is that then it becomes feasible to write a gotsh-like program to allow clones over HTTP. While I still prefer anonssh by a large margin, lots of people are more used to HTTP so this is a nice feature to have.)
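To give an idea of where the hash algorithm shows up on the wire: git's protocol v2 documentation describes an initial capability advertisement sent as pkt-lines (a four-digit hex length followed by the payload), and "object-format" is one of the documented capabilities. The Python sketch below builds and reads such an advertisement; the helper names and the sample agent string are invented, and it only covers this tiny corner of the protocol.

```python
FLUSH_PKT = b"0000"

def pkt_line(payload: bytes) -> bytes:
    """Frame payload as a pkt-line: 4 hex digits of total length, then data."""
    length = len(payload) + 4  # the length counts its own 4 bytes
    if length > 65520:
        raise ValueError("payload too large for a pkt-line")
    return b"%04x" % length + payload

def read_pkt_lines(data: bytes) -> list:
    """Return the payloads found before the first flush-pkt."""
    lines, pos = [], 0
    while pos < len(data):
        length = int(data[pos:pos + 4], 16)
        if length == 0:      # flush-pkt ends the section
            break
        if length < 4:       # other special packets (e.g. delim-pkt)
            pos += 4
            continue
        lines.append(data[pos + 4:pos + length])
        pos += length
    return lines

def advertised_object_format(data: bytes) -> str:
    """Pick the object-format capability out of an advertisement."""
    for payload in read_pkt_lines(data):
        text = payload.rstrip(b"\n").decode()
        if text.startswith("object-format="):
            return text.split("=", 1)[1]
    return "sha1"  # assumed when the capability is absent

# A made-up advertisement such as a SHA256-capable server might send.
advert = b"".join(pkt_line(line) for line in [
    b"version 2\n",
    b"agent=example/0.1\n",
    b"ls-refs\n",
    b"fetch\n",
    b"object-format=sha256\n",
]) + FLUSH_PKT
print(advertised_object_format(advert))
```

The framing is the easy part; the actual work is in implementing the v2 commands on both the client and the server side.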

I think I’ll want to work on some other things before attempting to implement the v2 protocol. gotwebd needs some more love and there is some personal stuff I’d like to get done too!

Summing up, I’m very pleased with the quick progress on this. c2k24 was the first OpenBSD hackathon I attended, and it was amazing. Prague is also such a beautiful city. Thanks to stsp@ and tobhe@ for reviewing my diffs and helping me along the way, and to sashan@ and the OpenBSD Foundation for organizing and sponsoring the event.