Reproducibility, in Debian, is:
With free software, anyone can inspect the source code for malicious flaws. But Debian provides binary packages to its users. The idea of “deterministic” or “reproducible” builds is to empower anyone to verify that no flaws have been introduced during the build process by reproducing byte-for-byte identical binary packages from a given source.
So, to provide reproducible binaries for Vala projects, you need to:
- Make sure you distribute generated C source code
- If you are a library, make sure to distribute VAPI and GIR files
This lets the build process skip calling valac to generate the C source code, VAPI and GIR files from your Vala sources.
Because the C source is distributed with a release’s tarball, any Vala project could be binary reproducible from sources.
To produce development packages, you should distribute VAPI and GIR files along with the .h files. They should be included in your tarball, so that valac does not have to produce them.
GXml distributes all its C sources, but not GIR and VAPI files. This should be fixed in the next release.
GNOME Clocks distributes just Vala sources; this is why bug #772717 has been filed against Clocks.
libgee also distributes only Vala sources, but no Debian bug exists against it. Maybe its Vala source annotations help, but it may be a good idea to distribute C, VAPI and GIR files in future versions.
My patches to GNOME Builder produce Makefiles that generate C sources from Vala ones. They need to be updated in order to distribute VAPI and GIR files with your Vala project.
Distributing generated sources instead of requiring valac isn’t always (or even usually) considered a best practice. The most obvious problem is the preprocessor; different conditions at compile time result in different C code being generated, so for a lot of software distributing the generated C isn’t possible. Libfolks hit this issue pretty hard a few years ago, IIRC.
AFAIK autotools is really the only build system which supports distributing the generated sources, and even there it tends to create a lot more problems than it solves (especially revolving around srcdir != builddir builds). Autotools basically copied what Vala itself does, but the main reason valac distributes generated sources in release tarballs is to avoid depending on itself.
It’s also worth noting that GIRs are not intended to be portable. Subtly different GIRs can be (and are) generated on different architectures and platforms. For example, think about the gio-unix-2.0.vapi, which is really just part of the Gio-2.0.gir which isn’t available on Windows. VAPIs have this problem, too, though it’s much less prevalent than with GIR since we don’t include values for things like constants and enum values.
If you need reproducible builds you can accomplish that by using the same version of valac (with the same options) to generate the sources just like you have to use the same version of the same C compiler (with the same options).
Then maybe we need to add “Reproducibility Tests” to Vala. These tests should detect whether valac produces different source code under different conditions; if it does, that should be considered a bug.
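A rough idea of what such a test could look like, as a minimal Python sketch rather than anything that exists in Vala today. It runs a generator command once per environment and compares a digest of everything written to the output directory. The names `tree_digest`, `is_reproducible`, `stable_generator`, `env_leaking_generator` and the `DEMO_VAR` variable are all made up for illustration, and the two generators are stand-ins written in Python so the sketch does not assume any particular valac options.

```python
import hashlib
import os
import subprocess
import sys
import tempfile


def tree_digest(root):
    """Return a SHA-256 digest over every relative path and file body under root."""
    digest = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a fixed order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            digest.update(os.path.relpath(path, root).encode())
            digest.update(b"\0")
            with open(path, "rb") as handle:
                digest.update(handle.read())
            digest.update(b"\0")
    return digest.hexdigest()


def is_reproducible(make_command, environments):
    """Run the generator once per environment and compare the output trees.

    make_command(outdir) returns the argv list that writes into outdir.
    """
    digests = set()
    for extra_env in environments:
        with tempfile.TemporaryDirectory() as outdir:
            env = dict(os.environ, **extra_env)
            subprocess.run(make_command(outdir), env=env, check=True)
            digests.add(tree_digest(outdir))
    return len(digests) == 1


def stable_generator(outdir):
    """Hypothetical stand-in generator: always writes the same file."""
    code = ("import sys, pathlib; "
            "pathlib.Path(sys.argv[1], 'out.c').write_text('int x;')")
    return [sys.executable, "-c", code, outdir]


def env_leaking_generator(outdir):
    """Hypothetical stand-in generator: leaks an environment variable."""
    code = ("import os, sys, pathlib; "
            "pathlib.Path(sys.argv[1], 'out.c')"
            ".write_text(os.environ.get('DEMO_VAR', ''))")
    return [sys.executable, "-c", code, outdir]
```

Calling `is_reproducible(stable_generator, [{"DEMO_VAR": "a"}, {"DEMO_VAR": "b"}])` should report the first stand-in as reproducible, while the same call with `env_leaking_generator` should not. A real test would swap in the actual compiler invocation and vary things like locale, timezone and build path.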
Can’t speak to Vala specifically, but in general, non-reproducibility issues usually come down to one of two things:
1. different versions of the tool produce different output.
2. some date/time information is included in the output, so that output is never the same.
The second is easy to fix, simply by removing any timestamped output. The first, well, that’s hard to do much about – if the user has a slightly different tool version from upstream, you’re stuck. About all you can do is try to ensure that distro binaries are built with the same versions as users will have installed.
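Where a tool really wants to keep a date in its output, an alternative to removing it is the `SOURCE_DATE_EPOCH` convention from the Reproducible Builds project: if that environment variable is set, use it instead of the current time. A small Python sketch of the idea; `build_timestamp` and `banner` are names invented here, and nothing below claims that valac does or does not honour the variable.

```python
import os
import time
from datetime import datetime, timezone


def build_timestamp(environ=os.environ):
    """Return the timestamp to embed in generated output, as a UTC datetime."""
    epoch = environ.get("SOURCE_DATE_EPOCH")
    if epoch is not None:
        seconds = int(epoch)  # pinned by whoever is rebuilding the package
    else:
        seconds = int(time.time())  # fallback: output will differ per build
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def banner(tool_name, environ=os.environ):
    """Format a header comment carrying the (possibly pinned) build date."""
    stamp = build_timestamp(environ).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"/* generated by {tool_name} on {stamp} */"
```

With the variable pinned, two builds at different times produce the same banner, which is all a rebuilder needs.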
Yes, I think the real issue is making valac deterministic. AFAICT it’s had bugs filed for nondeterministic order of generated declarations, as well as other mess like producing duplicate declarations, etc.
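To make the ordering point concrete, here is a generic Python illustration, not a description of valac’s internals. A generator that collects declarations in an unordered container can emit them in a different order from run to run; deduplicating and then imposing a fixed order removes both the ordering and the duplicate-declaration problem. `emit_declarations` is a hypothetical name, and alphabetical order is just the simplest stable choice; a real C generator would more likely keep source order so that types are declared before use.

```python
def emit_declarations(declarations):
    """Emit one declaration per line, without duplicates, in a stable order."""
    unique = set(declarations)  # a set alone has no guaranteed iteration order
    return "\n".join(sorted(unique)) + "\n"
```

For example, `emit_declarations(["void b (void);", "void a (void);", "void b (void);"])` yields the two declarations once each, `a` before `b`, no matter how the input was ordered.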
Disclaimer: I’m not a Vala user, proponent, or opponent – just someone who reads blogs and Bugzilla. 😀
Well, most code is deterministic. As long as Vala isn’t embedding something from the environment in the build or ordering things non-deterministically, you’re probably good. There are some examples here:
https://wiki.debian.org/ReproducibleBuilds/Howto#Introduction
Distributing generated files is a huge pain for distributions (I work on Ubuntu desktop and have done a lot of Vala development). In addition to what Evan has highlighted, generated files are either impractically hard to patch or require the distributor to regenerate the files anyway. In modern autotools Debian packages, even the autotools-generated files are regenerated every build (using dh_autoreconf).
If you want 100% reproducibility you need to distribute binaries, which is what systems like Snap and Flatpak are solving. If you are distributing source, then distribute just the files that are written by the developer and the instructions for building them. It’s not hard for distributors to provide the dependencies if they are well defined (this idea that this is hard seems to be from ancient times when processors were slow and bandwidth was expensive). Anything in between doesn’t actually result in guaranteed reproducible builds and just makes life difficult for everyone…
Best practice (even policy according to some interpretations) in Debian is for packages to be built from actual source (the .vala files), not generated source (the not-really-human-readable .c files).
The Debian GNOME team builds its vala-using packages by first regenerating the .c sources so the distributed .c sources aren’t really even used. The distributed .c sources are annoying when trying to review diffs to see what changed in a new stable release since they make the diff much more cluttered.
I’ve also noticed several packages have artifacts from the build environment in the generated C code like this:
#line 38 "/home/mcatanzaro/src/jhbuild/checkout/gnome-sudoku/src/number-picker.vala"
https://sources.debian.net/src/gnome-sudoku/1:3.22.0-1/src/number-picker.c/#L104
Maybe Michael’s valac has some debug option enabled, since this doesn’t show up in packages built by other maintainers?
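Whatever the cause turns out to be, absolute paths like that one make the generated C depend on where it was built. As an illustration only (this is a post-processing sketch in Python, not a valac feature), the directives could be rewritten relative to the source tree. `relativize_line_directives` is a made-up name, and the sketch assumes POSIX-style paths and the simple `#line N "path"` form shown above.

```python
import posixpath
import re

# Matches directives of the form: #line 38 "/some/path/file.vala"
LINE_DIRECTIVE = re.compile(r'^(#line \d+ ")([^"]*)(")\s*$')


def relativize_line_directives(c_source, source_root):
    """Rewrite absolute paths in #line directives relative to source_root."""
    out = []
    for line in c_source.splitlines():
        match = LINE_DIRECTIVE.match(line)
        if match and posixpath.isabs(match.group(2)):
            relative = posixpath.relpath(match.group(2), source_root)
            line = f"{match.group(1)}{relative}{match.group(3)}"
        out.append(line)  # every other line passes through unchanged
    return "\n".join(out) + "\n"
```

Given a checkout at a hypothetical `/home/user/checkout/gnome-sudoku`, a directive pointing at `/home/user/checkout/gnome-sudoku/src/number-picker.vala` would come out as `#line 38 "src/number-picker.vala"`, which no longer depends on the builder’s home directory.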
Distributing generated C files (from Vala) won’t help, as Debian requires that all packages be built from source. Source is defined, as in the GPL, as the programmer’s preferred form for editing, which the generated C is not. So Debian will first purge those files if you distribute them.
Would generated C source code even be considered as source code in the context of reproducible builds? AFAIK it’s not considered as source code by licenses like the GPL (because it’s not the “preferred form of the work for making changes in it”).
Wasn’t there a Debian policy of compiling everything from the “original source code”?
Adding to the “disting generated files is bad, don’t do it” crowd: you should not distribute GIR files from introspection either.
GIR files are machine dependent, and they will still need to be generated on the target machine that builds the code. Distributing the GIR files in a tarball is perfectly pointless, as they will be overwritten anyway.
Good to know. Thanks.