Re: make dist using git archive - Mailing list pgsql-hackers
From | Eli Schwartz |
---|---|
Subject | Re: make dist using git archive |
Date | |
Msg-id | 39a8ff30-a046-4dd2-847a-5c33e9d24031@gmail.com |
In response to | Re: make dist using git archive (Peter Eisentraut <peter@eisentraut.org>) |
Responses | Re: make dist using git archive |
List | pgsql-hackers |
On 1/31/24 3:03 AM, Peter Eisentraut wrote:
>> What do you use this for? IMO a more robust way to track the commit used
>> is to use gitattributes export-subst to write a `.git_archival.txt` file
>> containing the commit sha1 and other info -- this can be read even after
>> the file is extracted, which means it can also be used to bake the ID
>> into the built binaries e.g. as part of --version output.
>
> It's a marginal use case, for sure. But it is something that git
> provides tooling for that is universally available. Any alternative
> would be an ad-hoc solution that is specific to our project and would be
> different for the next project.

mercurial has the "archivemeta" config setting that exports similar
information, but forces the filename ".hg_archival.txt".

The setuptools-scm project follows this pattern by requiring the git
file to be called ".git_archival.txt" with a set pattern mimicking the
hg one:

https://setuptools-scm.readthedocs.io/en/latest/usage/#git-archives

So, I guess you could use this and then it would not be specific to your
project. :)

>> Overall I feel like much of this is about requiring dist tarballs to be
>> byte-identical to other dist tarballs, although reproducible builds is
>> mainly about artifacts, not sources, and for sources it doesn't
>> generally matter unless the sources are ephemeral and generated
>> on-demand (in which case it is indeed very important to produce the same
>> tarball each time).
>
> The source tarball is, in a way, also an artifact.
>
> I think it's useful that others can easily independently verify that the
> produced tarball matches what they have locally. It's not an absolute
> requirement, but given that it is possible, it seems useful to take
> advantage of it.
>
> In a way, this also avoids the need for signing the tarball, which we
> don't do. So maybe that contributes to a different perspective.

Since you mention signing and not as a simple "aside"...

That's a fascinating perspective. I wonder how people independently
verify that what they have locally (I assume from git clones) matches
what the postgres committers have authorized.

I'm a bit skeptical that you can avoid the need to perform code-signing
at some stage, somewhere, somehow, by suggesting that people can simply
git clone, run some commands and compare the tarball. The point of
signing is to verify that no one has acquired an untraceable API token
they should not have and gotten write access to the authoritative server
then uploaded malicious code under various forged identities, possibly
overwriting previous versions, either in git or out of git.

Ideally git commits should be signed, but that requires large numbers of
people to have security-minded git commit habits. From a quick check of
the postgres commit logs, only one person seems to be regularly signing
commits, which does provide a certain measure of protection -- an
attacker cannot attack via `git push --force` across that boundary, and
those commits serve as verifiable states that multiple people have seen.

The tags aren't signed either, which is a big issue for verifiably
identifying the release artifacts published by the release manager. Even
if not every commit is signed, having signed tags provides a known
coordination point of code that has been broadly tested and code-signed
for mass use.

...

In summary, my opinion is that using git-get-tar-commit-id provides zero
security guarantees, and if that's not something you are worried about
then that's one thing, but if you were expecting it to *replace* signing
the tarball, then that's.... very much another thing entirely, and not
one I can agree at all with.

-- 
Eli Schwartz
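
To illustrate the summary above: the ID that git-get-tar-commit-id
reports is just the "comment" field of the pax global header that git
archive writes at the start of the tarball. Below is a rough Python
sketch, not anything from the postgres tree, showing that the field can
be read with the standard tarfile module and, equally, written by anyone
with any value. The helper names read_commit_id and forge_archive are
made up for this illustration.

```python
import io
import tarfile
from typing import BinaryIO, Optional


def read_commit_id(fileobj: BinaryIO) -> Optional[str]:
    """Return the "comment" pax global header of a tar stream, if present.

    For tarballs made by `git archive` from a commit, this is the commit ID.
    """
    with tarfile.open(fileobj=fileobj, mode="r") as archive:
        # Opening the archive reads the leading global header.
        return archive.pax_headers.get("comment")


def forge_archive(claimed_commit: str, payload: bytes) -> io.BytesIO:
    """Hypothetical helper: build a tarball that claims an arbitrary commit ID.

    Nothing ties the header to the contents, so any value is accepted.
    """
    buf = io.BytesIO()
    with tarfile.open(
        fileobj=buf,
        mode="w",
        format=tarfile.PAX_FORMAT,
        pax_headers={"comment": claimed_commit},  # written as a global header
    ) as archive:
        info = tarfile.TarInfo("README")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    fake = forge_archive("0123456789abcdef0123456789abcdef01234567", b"anything")
    print(read_commit_id(fake))
```

The header is plain metadata with no signature behind it, so reading it
back only tells you what the archive claims, not who produced it.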