title:Generating static and portable executables with OCaml
authors:Louis Gesbert
date:2021-09-02
category:tooling
tags:static,portable,binaries
Generating static and portable executables with OCaml
Distributing OCaml software on opam is great (if I dare say so myself), but sometimes you need to provide your tools to an audience outside of the OCaml community, or simply want to spare your users a recompilation and give them something easier to install.
However, distributing locally generated binaries requires that the users have all the needed shared libraries installed, along with a compatible libc. You can’t assume that in general: even if you don’t need any C shared library, or are confident enough that it will be installed everywhere, the libc issue will arise for anyone using a distribution based on a different libc, or one a little older than the one you used to build.
The OCaml compiler has no built-in support for generating static executables, and doing so may seem a bit tricky, but it is in fact not too complex to do by hand, which is an effort you may be willing to make for a published release. So here are a few tricks, recipes and pieces of advice that should enable you to generate truly portable executables with no external dependency whatsoever. Both Linux and macOS will be covered, but the examples are based on Linux unless otherwise specified.
Example
I will take as an example a trivial HTTP file server based on Dream.
_build/default/fserv.exe: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=1991bb9f1d67807411c93f6fb6ec46b4a0ee8ed5, for GNU/Linux 3.2.0, with debug_info, not stripped
(on macOS, replace ldd with otool -L; dune output is obtained with (display short) in ~/.config/dune/config)
So let’s see how to change this result. Basically, here, libev, libssl and libcrypto are required shared libraries that may not be installed on every system, while all the others are part of the core system:
linux-vdso, libdl and ld-linux are concerned with the dynamic loading of shared objects;
libm and libpthread are extensions of the core libc that are tightly bound to it, and always installed.
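To make that split mechanical, here is a small sketch of a filter that reads ldd output and keeps only the libraries outside the core set. The function name `extra_libs` is hypothetical, and the pattern list is an assumption taken from this example (plus libc itself), not an exhaustive rule.

```shell
# Print the libraries from `ldd` output (read on stdin) that are NOT in the
# core system set discussed above. The pattern list is an assumption based on
# this article's example; extend it for your own targets.
extra_libs() {
  while read -r name _; do
    case "$name" in
      # dynamic loading machinery, libc and its tightly bound extensions
      linux-vdso.so*|libdl.so*|*/ld-linux*|libc.so*|libm.so*|libpthread.so*) ;;
      "") ;;
      *) echo "$name" ;;
    esac
  done
}
```

Typical use would be `ldd _build/default/fserv.exe | extra_libs`; anything it prints is a candidate for static linking.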
Statically linking the libraries
In simple cases, static linking can be turned on as easily as passing the -static flag to the C compiler: through OCaml you will need to pass -cclib -static. We can add that to our dune file:
(executable
(public_name fserv)
(flags (:standard -cclib -static))
(libraries dream))
/usr/bin/ld: /usr/lib/gcc/x86_64-linux-gnu/10/../../../x86_64-linux-gnu/libcrypto.a(dso_dlfcn.o): in function `dlfcn_globallookup':
(.text+0x13): warning: Using 'dlopen' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/bin/ld: ~/.opam/4.11.0/lib/ocaml/libunix.a(initgroups.o): in function `unix_initgroups':
initgroups.c:(.text.unix_initgroups+0x1f): warning: Using 'initgroups' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
[...]
$ file _build/default/fserv.exe
_build/default/fserv.exe: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), statically linked, BuildID[sha1]=9ee3ae1c24fbc291d1f580bc7aaecba2777ee6c2, for GNU/Linux 3.2.0, with debug_info, not stripped
$ ldd _build/default/fserv.exe
not a dynamic executable
The executable was generated… and the result seems OK, but we shouldn’t skip all these ld warnings. Basically, what ld is telling us is that you shouldn’t statically link glibc (it internally uses dynlinking, to libraries that also need glibc functions, and will therefore still need to dynlink a second version from the system 🤯).
Indeed here, we have been statically linking a dynamic linking engine, among other things. Don’t do it.
Linux solution: linking with musl instead of glibc
The easiest workaround at this point, on Linux, is to compile with musl, which is basically a glibc replacement that can be statically linked. There are some OCaml and gcc variants to automatically use musl (comments welcome if you have been successful with them!), but I have found the simplest option is to use a tiny Alpine Linux image through a Docker container. Here we’ll use OCamlPro’s minimal Docker images but anything based on musl should do.
_build/default/fserv.exe: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, with debug_info, not stripped
~/fserv $ ldd _build/default/fserv.exe
/lib/ld-musl-x86_64.so.1 (0x7ff41353f000)
Almost there! We see that we had to install extra packages with apk add: the static libraries might not be already installed, and in this case they are in a separate package (otherwise you would get bin/ld: cannot find -lssl). The last remaining dynamic loader in the output of ldd is there because static PIE executables were not supported until recently. To get rid of it, we need to either replace -static with -static-pie if the version of gcc is recent enough, or add -cclib -no-pie:
(executable
(public_name fserv)
(flags (:standard -cclib -static-pie))
(libraries dream))
And we are good!
~/fserv $ file _build/default/fserv.exe
_build/default/fserv.exe: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, with debug_info, not stripped
~/fserv $ ldd _build/default/fserv.exe
/lib/ld-musl-x86_64.so.1: _build/default/fserv.exe: Not a valid dynamic program
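The choice between the two variants can be scripted. The sketch below assumes that -static-pie is available from gcc 8 onwards; that threshold is an assumption you should check against your own toolchain, and the function name `static_cclib_flags` is hypothetical.

```shell
# Print the -cclib flags for a fully static build, given gcc's major version.
# Assumption: -static-pie is supported from gcc 8 onwards; adjust the
# threshold if your toolchain differs.
static_cclib_flags() {
  major=$1
  if [ "$major" -ge 8 ]; then
    echo "-cclib -static-pie"
  else
    # older toolchains: plain static, non-PIE executable
    echo "-cclib -static -cclib -no-pie"
  fi
}
```

It could be called as `static_cclib_flags "$(gcc -dumpversion | cut -d. -f1)"` and the result pasted into the flags field of the dune stanza.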
Trick: short script to compile through a Docker container
Passing the context to a Docker container and getting the artefacts back can be bothersome and often causes file ownership issues, so I use the following snippet to pipe them to/from it using tar:
} >&2 && tar c -hC _build/install/default/bin .' | \
tar vx
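The shape of that pipe can be tried without Docker. In the sketch below, `sh -c` stands in for `docker run -i`, and the "build" is a placeholder that copies a hypothetical `hello.txt` into `bin/artefact`; both are assumptions made only so that the flow is runnable anywhere. As in the snippet above, build output is redirected to stderr so that stdout carries nothing but the tar stream.

```shell
# build_via_pipe SRC_DIR OUT_DIR
# Ships SRC_DIR through tar to the "remote" side, runs a placeholder build
# there, and unpacks the resulting artefacts into OUT_DIR.
# `sh -c` stands in for `docker run -i <image> sh -c` in this sketch.
build_via_pipe() {
  src=$1 out=$2
  mkdir -p "$out"
  tar cf - -C "$src" . |
    sh -c '
      work=$(mktemp -d) && cd "$work" &&
      tar xf - &&
      { mkdir bin && cp hello.txt bin/artefact; } >&2 &&
      tar cf - -C bin .' |
    tar xf - -C "$out"
}
```

Because the files only ever travel as a tar stream, nothing is bind-mounted and the extracted artefacts belong to the user who ran the command.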
The other cases: turning to manual linking
Sometimes you can’t use the above: the automatic linking options may need to be tweaked for static libraries, your app may still need dynlinking support at some point, or you may not have the musl option. On macOS, for example, the libc doesn’t have a static version at all (and the -static option of ld is explicitly “only used building the kernel”). Let’s get our hands dirty and see how to use a mixed static/dynamic linking scheme. First, we examine how OCaml usually does the linking:
The linking options are passed automatically by OCaml, using information that is embedded in the cm(x)a files, for example:
$ ocamlobjinfo $(opam var lwt:lib)/unix/lwt_unix.cma |head
File ~/.opam/4.11.0/lib/lwt/unix/lwt_unix.cma
Force custom: no
Extra C object files: -llwt_unix_stubs -lev -lpthread
Extra C options:
Extra dynamically-loaded libraries: -llwt_unix_stubs
Now, the linking flags (here -llwt_unix_stubs -lev -lpthread) let the C compiler choose the best way to link: the stubs will be linked statically (using the .a files, unless you make a special effort to use dynamic ones), but -lev will let the system linker select the shared library, because it is generally preferred. Gathering these flags by hand would be tedious: my preferred trick is to just add the -verbose flag to OCaml (for the lazy, you can temporarily set OCAMLPARAM=_,verbose=1):
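Once the verbose output is on screen, the -l flags still have to be picked out of a long linker command line. A minimal sketch of such a filter follows; the name `link_libs` is hypothetical, and it assumes the command line is whitespace-separated with arguments possibly wrapped in single quotes.

```shell
# Read a linker command line on stdin and print its -l flags, one per line.
# Assumes whitespace-separated arguments, possibly wrapped in single quotes.
link_libs() {
  tr -d "'" | tr ' ' '\n' | grep -- '^-l'
}
```

For instance, `OCAMLPARAM=_,verbose=1 dune build fserv.exe 2>&1 | link_libs` should list the libraries in link order, provided the build actually re-runs the link step.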
Note that -lpthread and -lm are tightly bound to the libc and can’t be static in this case, so we moved -lpthread to the end, outside of the static section. The part between the -Bstatic and the -Bdynamic is what will be statically linked, leaving the defaults and the libc dynamic. Result:
_build/default/fserv.exe: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=31c93085284da5d74002218b1d6b61c0efbdefe4, for GNU/Linux 3.2.0, with debug_info, not stripped
The remaining ones are the base of the dynamic linking / shared object system, but we got rid of libssl, libcrypto and libev, which were the ones possibly absent from target systems. The resulting executable should work on any glibc-based Linux distribution that is recent enough; on older ones you will likely get missing GLIBC symbols.
If you need to distribute that way, it’s a good idea to compile on an old release (like Debian ‘oldstable’ or ‘oldoldstable’) for maximum portability.
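The -Bstatic / -Bdynamic bracketing described above can be generated from a list of -l flags. This sketch assumes the flags go through the C compiler driver (hence the `-Wl,` prefix) and keeps -lpthread and -lm on the dynamic side; the function name `mixed_link_flags` is hypothetical. You would presumably also pass -noautolink so that the flags embedded in the cm(x)a files are not added a second time, which this sketch leaves to the caller.

```shell
# Given -l flags as arguments, print a -cclib sequence in which everything
# except -lpthread and -lm sits between -Wl,-Bstatic and -Wl,-Bdynamic.
mixed_link_flags() {
  static= dynamic=
  for f in "$@"; do
    case "$f" in
      -lpthread|-lm) dynamic="$dynamic -cclib $f" ;;  # tightly bound to libc
      *)             static="$static -cclib $f" ;;
    esac
  done
  echo "-cclib -Wl,-Bstatic$static -cclib -Wl,-Bdynamic$dynamic"
}
```

The relative order of the static libraries is preserved, which matters because the linker resolves symbols from left to right.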
Manually linking on macOS
Unfortunately, the linker on macOS doesn’t seem to have options to select the static versions of the libraries; the only solution is to get our hands even dirtier, and link directly to the .a files, instead of using -l arguments.
Most of the flags just link with stubs; we can keep them as is: -lssl_stubs -lcamlstr -loverlap_stubs_stubs -ldigestif_c_stubs -lmtime_clock_stubs -lmirage_crypto_rng_unix_stubs -lmirage_crypto_stubs -lcstruct_stubs -llwt_unix_stubs -lthreadsnat -lunix -lbigstringaf_stubs
That leaves us with: -lssl -lcrypto -lev -lpthread
-lpthread is built-in, so we can ignore it
for the others, we need to look up the .a file: I use e.g.
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1292.60.1)
This is as good as it will get!
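For the lookup of the .a files mentioned above, one possible approach is to ask pkg-config for the library directory and append the archive name. This is a sketch under assumptions: that the library ships a .pc file defining `libdir`, and that the archive is named after the pkg-config module (for example libssl.a for libssl). The helper name `archive_in` is hypothetical.

```shell
# archive_in LIBDIR NAME: print the expected path of NAME's static archive.
# Assumes the archive is called NAME.a and lives directly in LIBDIR.
archive_in() {
  echo "$1/$2.a"
}
```

A call could look like `archive_in "$(pkg-config --variable=libdir libssl)" libssl`, and the printed path is then passed with -cclib in place of -lssl. It is worth checking that the file exists before linking, since not every package installs its static archive.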
Cleaning up the build system
Until now, we have been adding the linking flags manually in the dune file; you probably don’t want to do that and be restricted to static builds only, not to mention the non-portable link options we have been using.
The quick&dirty way
Don’t use this in your build system! But for quick testing you can conveniently pass flags to the OCaml compilers using the OCAMLPARAM variable. Combined with the tar/docker snippet above, we get a very simple static-binary generating command:
The linking flags will depend on the chosen linking mode and on the OS. For the OS, it’s easiest to generate them through a script; for the linking mode, I use an environment variable to optionally turn static linking on.
This will use the following gen-linking-flags.sh script to generate the file, passing it the value of $LINKING_MODE and defaulting to dynamic. Doing it this way also ensures that dune will properly recompile when the value of the environment variable changes.
echo "No known static compilation flags for '$OS'" >&2
exit 1
esac;;
*)
echo "Invalid linking mode '$LINKING_MODE'" >&2
exit 2
esac
echo '('
for f in $FLAGS; do echo " $f"; done
for f in $CCLIB; do echo " -cclib $f"; done
echo ')'
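To see the whole shape in one piece, here is a minimal sketch of such a generator written as a shell function. It is not the script from this post: the name `link_flags` is hypothetical, only a Linux branch is shown, and that branch assumes a musl toolchain such as Alpine with the -static and -no-pie options discussed earlier.

```shell
# link_flags MODE OS: print an s-expression of link flags that dune can read.
# Sketch only: the linux branch assumes a musl toolchain (e.g. Alpine);
# other systems are deliberately left out.
link_flags() {
  mode=$1 os=$2 cclib=
  case "$mode" in
    dynamic) ;;  # nothing to add
    static)
      case "$os" in
        linux) cclib="-static -no-pie" ;;
        *) echo "No known static compilation flags for '$os'" >&2
           return 1 ;;
      esac ;;
    *) echo "Invalid linking mode '$mode'" >&2
       return 2 ;;
  esac
  echo '('
  for f in $cclib; do echo "  -cclib $f"; done
  echo ')'
}
```

Calling `link_flags "${LINKING_MODE:-dynamic}" linux` prints an empty list by default, so a regular build is unaffected unless static linking is explicitly requested.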
Then you’ll only have to run LINKING_MODE=static dune build fserv.exe to generate the static executable (wrapped in the Docker script above, in the case of Alpine), and can include that in your CI as well.
For real-world examples, you can check learn-ocaml or opam.