Welcome to the NGI Zero podcast where we talk to the people who are building the Next Generation
Internet.
I am Ronny Lam.
And I am Tessel Renzenbrink.
We're both from NLnet, a foundation which supports people who are working on a free
and open internet.
Our guest today is Janneke.
He is a physicist, the co-founder of GNU LilyPond, co-founder of Doe040, the democratic
school in Eindhoven, a Guix developer and founder of GNU Mes.
The GNU Mes project received several NGI Zero grants and that's what we'll be talking
about today.
Hi Janneke, nice to have you here.
Yeah, great to be here.
Thanks for having me.
Okay, we have devised three short questions that would let our listeners know everything
about you if you have answered them.
So everything, everything.
So here we go.
Emacs or Vim?
Emacs.
Star Trek or Star Wars?
Wow, that's a difficult one.
In my youth, Star Trek, now definitely Star Wars.
And the final one.
There is life outside our solar system.
We are alone.
There's life outside our solar system.
Agreed.
So you work on operating systems.
What key issues do you see in this field?
Yeah, that's a good question.
When you say I work on operating systems, I was going, am I doing that?
So in my mind, I'm working on problems that I see in the world, especially when I think
there's an elegant solution for the thing.
So I got involved in GNU Guix, yeah, sort of by chance, because I love Scheme.
I think the world would be a better place if people started using functional programming
more, because that's a real help.
So I got into Guix because I started to use Guile more in my programming.
And I love the idea of, yeah, say the idea behind Guix.
Guix is of course implementing the Nix thesis, which for me boils down to the fact of the
observation that a traditional package management is actually a broken system.
If you're lucky, it works.
And if you put in a lot of hard work, you can get it to almost always work.
But in essence, what you want for a package manager is to describe all the dependencies
that a program has.
That is what Nix does and that is what Guix does.
So I love an elegant solution for a problem that people are ignoring.
OK, so that's you describing what you think is important.
And then how does GNU Mes add to that?
How does it contribute?
Yeah, so when I started working on Guix, I started reading the manual. Ludovic
Courtès did a great job of writing a beautiful manual.
Somewhere in the manual, he explains how the packages in Guix and their dependencies form
a directed acyclic graph, and that in Guix, and the same goes for Nix, by the way, if
you install or build a package, what you do is philosophically, you build it from source
together with all its dependencies.
And that goes for every package, except for the bootstrap binaries.
And in the manual, I read the text of Ludovic, where he said, well, that last thing, the bootstrap
binaries, we can rebuild them, of course, but they are not built from source.
And that is a problem because they are opaque, they're pretty large
and they cannot be inspected.
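The dependency graph described here can be sketched in a few lines of Python. This is only an illustration: the package names and edges below are made up for the example and are not taken from Guix's actual package definitions. The sketch uses the standard library's `graphlib.TopologicalSorter` (Python 3.9 or later) to compute a build order in which every package comes after its dependencies, and it lists the nodes that have no dependencies of their own, which is where the bootstrap binaries sit.

```python
from graphlib import TopologicalSorter

# Hypothetical, heavily simplified dependency graph: package -> its dependencies.
# The names are illustrative only, not real Guix package definitions.
deps = {
    "hello": {"gcc", "glibc"},
    "gcc": {"binutils", "glibc", "bootstrap-binaries"},
    "glibc": {"bootstrap-binaries"},
    "binutils": {"bootstrap-binaries"},
    "bootstrap-binaries": set(),  # not built from source: the trusted seed
}

# A valid build order lists every package after all of its dependencies.
order = list(TopologicalSorter(deps).static_order())

# Nodes without dependencies are the roots everything else has to trust.
roots = [name for name, needed in deps.items() if not needed]

print("build order:", order)
print("trusted roots:", roots)
```

In this toy graph the only root is the seed, so shrinking or replacing that one node changes what every other package ultimately rests on.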
At the same time, when I joined Guix around 2016, there was another project by
Jeremiah Orians, the stage0 project, where he built a self-hosting hex assembler program
in a couple of hundred bytes, I think three hundred and fifty seven at the time.
And I combined those two things:
the problem we have in Guix, that we trust binaries that we don't build from source, and the
inspiration of a, well, 200 or 300 byte program that can build itself.
And my idea was that Mes can bridge the gap, starting from a simple assembly language and bringing
it up to Scheme, to a high-level programming language.
And from there, we can probably bootstrap the whole system.
So that's how I started with that.
So what Mes tries to do is to make software bootstrappable, or the Guix system bootstrappable.
So you said you had a binary blob of two hundred fifty, three hundred megabytes or something.
And you put it down to?
Ah, no. So the program by Jeremiah Orians was three hundred and fifty seven bytes.
So really small, less than a K, one K.
And that program can build itself.
So that's really on the small kind of range of the spectrum.
And on the other range of the spectrum was the bootstrap binaries that we have in Guix.
Those were two hundred and fifty megabytes.
The traditional distributions use a much larger binary seed, often twice as large.
So half a gigabyte of opaque binaries.
And using Guix, yes, we brought that down from two hundred and fifty megabytes to.
Well, depending on how you're counting, the upper limit is twenty megabytes.
So a factor of ten, an order of magnitude smaller, because we still depend on stuff, of course.
But if you install Guix today, we build everything from source, starting from this
couple of hundred bytes.
Wow, that's amazing.
I think so, too. I never thought we would actually achieve this in so little time.
And these three hundred fifty seven bytes, they are assembly?
Yeah, Jeremiah calls it a hex assembler, so it's a custom assembler.
And actually, that is written in machine code.
So it's architecture dependent.
So for every architecture, there's another binary.
And it's a really simple language.
You simply enter the bytes, but in ASCII.
So you're actually writing machine code.
So you write the instructions, the opcodes and operands simply in numbers.
But there's also a comment character, which is a pound sign.
And after that, you manually write
what the instruction actually is.
So you write machine code.
But the hex0 program can interpret
its own ASCII variant and produce a running executable of itself.
So it's self hosting.
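To make the format concrete, here is a minimal sketch of what such a hex assembler does, written in Python purely for illustration. The real hex0 is itself written in machine code. This version only mirrors the behaviour described above (bytes entered as ASCII hex digits, with `#` starting a comment) and simply ignores every other character. The function name `hex0_assemble` and the sample input are assumptions for this example, and the sample bytes are 32-bit x86 instructions picked for illustration, not an excerpt from stage0.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"

def hex0_assemble(source: str) -> bytes:
    """Turn ASCII hex digits into raw bytes, skipping '#' comments."""
    out = bytearray()
    high = None  # first hex digit of the byte currently being read
    for line in source.splitlines():
        code = line.split("#", 1)[0]  # drop everything after the comment character
        for ch in code:
            if ch not in HEX_DIGITS:
                continue  # whitespace and anything else is ignored
            nibble = int(ch, 16)
            if high is None:
                high = nibble
            else:
                out.append(high * 16 + nibble)
                high = None
    if high is not None:
        raise ValueError("odd number of hex digits")
    return bytes(out)

# Illustrative input: opcodes and operands as numbers, meaning in the comments.
example = """
# sample 32-bit x86 bytes, for illustration only
B8 01 00 00 00   # mov eax, 1
CD 80            # int 0x80
"""

program = hex0_assemble(example)
print(program.hex(" "))
```

The comments carry all the human-readable meaning. The tool itself only needs to recognise hex digits and the comment character, which is part of why it can stay so small.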
Can you take us along a bit? You look at the very beginning, you look at 250 megabytes.
What kind of steps did you take to reduce it like this?
Yes.
So the 250 megabytes is...
I first started looking at what's in there, of course.
So it's the usual suspects: it's gcc,
it's glibc,
it's binutils, and stuff like awk and sed and coreutils and tar and gzip.
So I figured what we really need to remove there is gcc, glibc and binutils.
So I started focusing on that.
So after I got Mes a bit started and it was able to compile a trivial C program,
I asked around on the Guile user mailing list.
Can somebody help me with this?
I have this idea.
I'd like to build a C compiler in Scheme so that... The idea was that we could build gcc,
glibc and binutils without using gcc, glibc and binutils.
So that was the initial idea.
So we replaced gcc by a smaller C compiler binary.
That was step one.
That was step one.
So Mes is a Scheme interpreter and I just replaced the gcc binary in our bootstrap
binaries by the Mes binary.
But it's a lot smaller, it's two orders of magnitude smaller than gcc.
That's the first thing that I did and I wrote a C compiler in Scheme.
I was wondering if I could compile gcc maybe with that C compiler,
but it's terribly slow and the gcc source code,
even if you look at gcc 1.0, uses pretty funky C.
So we found another smaller C compiler project, which is TinyCC.
TinyCC has an explicit goal, which is to be able to compile gcc.
And TinyCC is four times smaller than gcc 1.0, so only 20 to 25,000 lines of C code.
And it doesn't use many funky C constructs.
It's relatively simple C.
So I focused on that.
And yeah, so the first real milestone was when I was able to compile a working TinyCC using Mes.
And does GNU Mes only work for GNU Guix?
At the moment, sadly, that is the case.
So there's no inherent limitation.
As I write in my blob or blurb of Mes, I'd like Mes to help bootstrapping all
free Linuxes or free Unix distributions even.
But so far, only Guix has adopted the Mes bootstrap path.
Could NixOS, for example, be the next one?
Yeah, I'm hoping.
So about a year ago, I think, there was an effort by Emily, and I forgot her last name,
Emily Trau, who actually got, I think, one or two PRs into Nix
to build Mes, so to bootstrap Mes using stage0.
So it's already a much more mature path than what I just described.
But as far as I know, the effort of working on a full bootstrap into Nix is currently stalled.
So we talked about this also at FOSDEM this year to see if we can somehow
nudge some Nix developers to help Emily get an NLnet grant and bring this to Nix,
because this really should be the next step, I believe.
Yeah, great.
And maybe for people who may not understand the importance of this,
what is the importance of being able to bootstrap the whole operating system?
Yeah, I think it's very easy to underestimate the importance of it.
We all love free software, and
if you ask the question, why is it important to use free software?
Well, of course, I want to have control over my machine, privacy, being autonomous,
that kind of thing, being able to create a community and build things together.
But can you really speak of a program being free software if you cannot bootstrap it?
That would be my question.
I would argue, but I know that I have plenty of discussions and people disagree with me,
but there are also people who agree.
I would say that if you have a free software program that cannot be bootstrapped,
that it's not free software.
So for me, it would be essential to free software.
So if you cannot inspect the code from the ground up, you cannot trust the operating system?
Yeah, so it boils down to Ken Thompson's Trusting Trust paper, where he shows that
if something in your stack, for example your C compiler, is tainted or compromised,
you cannot trust anything in your whole system.
You have to trust initial binaries.
So you always have to trust something.
There's the hardware and there's other stuff and you cannot inspect all source code by yourself.
But I think it makes sense to reduce the amount of trust that you need as much as possible.
Yeah, yeah, indeed.
Yes.
And this is a great step towards that.
So I would, yeah, thank you.
So I've been playing with the idea,
I haven't announced it yet.
And maybe it's just to stir things up.
But I was thinking about the idea that we have the four software freedoms as published by the FSF,
that we might need a fifth freedom, Freedom Four.
Freedom four, the freedom to bootstrap the program and recreate it bit for bit.
So if a binary is available and you can't build it totally from source,
how would you ever exercise freedom zero to run it as you wish?
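One way to picture "recreate it bit for bit" is a plain hash comparison between a published binary and an independent rebuild. The sketch below is an assumption-laden illustration, not how Guix or any particular tool implements it: the helper names and the byte strings standing in for build outputs are hypothetical, and only Python's standard `hashlib` is used.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of some build output as a hex string."""
    return hashlib.sha256(data).hexdigest()

def is_bit_for_bit_identical(published: bytes, rebuilt: bytes) -> bool:
    """Two builds match only if every single bit is the same."""
    return sha256_hex(published) == sha256_hex(rebuilt)

# Stand-ins for real build outputs (hypothetical contents).
published_binary = b"\x7fELF example payload"
own_rebuild = b"\x7fELF example payload"
tampered_binary = b"\x7fELF example payl0ad"

print(is_bit_for_bit_identical(published_binary, own_rebuild))      # True
print(is_bit_for_bit_identical(published_binary, tampered_binary))  # False
```

A check like this only means something if the rebuild really starts from source, which is the point being made about bootstrapping.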
Yeah, and I think adding to that, another important thing is if you want to have
true freedom, you want to run this on open hardware.
Yeah.
And so that the hardware can also be inspected at least during build time.
That's why I'm so happy with the effort of Ekaitz Zárraga and Andrius Štikonas,
who've been working very hard for a couple of years now, I think, thankfully sponsored by
NLnet too, on bringing this bootstrap to RISC-V, which could be a big boost for the bootstrap too.
Yeah, yeah, indeed.
Yeah.
So what are the next steps for the project?
Yeah, I've been thinking about that.
And in a way, I've been doing this for eight years now.
And it's always been pretty obvious what the next logical step would be.
And for one and a half years now, we have had this bootstrap in Guix.
My idea and my hope was that we would show that it is possible to do, because,
big sigh, I've talked to a lot of naysayers the past decade.
This cannot be done and if you can do it, why bother, that kind of thing.
I think it's fun and it's essential and it can be done.
So I figured when more people learn about this, they would join
and most work would be taken off of my hands, so to speak.
And of course, we have a technical roadmap.
So currently, the bootstrap uses an ancient version of gcc, gcc 2.95,
which isn't a big problem because it works, but it only works for 32-bit x86.
So that's where the RISC-V effort comes in.
So we need to get rid of gcc or ancient software, but especially gcc 2.
And there are other things in the bootstrap that we want to clean up.
Currently, we still use some implementation of coreutils in the bootstrap.
We have a Scheme implementation, which is called Gash with its Gash utils,
which we run on the Guile driver of Guix.
That path cannot be used by Nix, of course, because they have no Guile binary.
So we have to get rid of that thing.
We are pretty far there, but that has to be done.
Those are the biggest two technical challenges.
But far more important is to grow the adoption, like we talked about before.
It would be amazing if they got into Nix because that community is,
I think it's 10 times as large, if not bigger than Guix.
It would be great to have this in Debian.
To summarize, the next step for me is not so much technical, but
much more inspirational, getting other communities involved in the importance of bootstrapping.
For example, the people who build glibc and gcc, or people who build a kernel and decide to add
Rust to it. I would love for them to learn about bootstrapping,
to realize how important it is, and to take responsibility for their software to
remain or become fully bootstrappable.
To be honest, we showed that it can be done for a current system that depends on C,
but the world goes very fast and everything is Rustified.
And the bootstrapping community is just too small and will probably
always be too small to carry this burden of making
software that has become non-bootstrappable bootstrappable again.
It would be great if it would become a no-brainer for free software.
So if anyone has ideas how to best do that, I'm all ears.
Not off the top of my head, if I'm honest.
Can you tell us something about the community behind GNU Mes and the bootstrappable people?
Yeah, so it's a very friendly community.
I started this effort in 2016, just mailing to the Guile mailing list and the Guile user
mailing list, and Ludovic Courtès has been one of my first supporters, which maybe isn't
so surprising because he wrote in his manual about Guix how important bootstrapping is.
But that support has been very important to have, also from other people in the Guile community.
They didn't really help with the technical work, but they cheered me on in the beginning
and said, yay, go for this, this is great. So that was really great.
And Ricardo Wurmus, also a Guix developer, took it upon himself to create a bootstrapping
IRC channel. And initially, I think the first year, there were five or ten of us on
there and it's grown to, yeah, 100 plus, 120 maybe, people who are working on bootstrap-related
things or cheering us on. So that's the online community, how it started. So Jeremiah
Orians was in the beginning a big help. We helped each other and coached each other
through our problems. And what was also really great was the reproducible builds community.
I went to the reproducible builds summits in 2017 and 18 and 19, the first two were in Berlin.
And yeah, we got a podium there within the reproducible builds community to start a
bootstrappable subgroup, and a lot of great ideas came from there. One of the first things I
actually used was Matt Wette's NYACC C parser. So he wrote
a parser for C for Guile Scheme.
And he helped a bit to make it fit for Mes. So Mes is using Matt Wette's NYACC to parse
C programs such as TinyCC. There wasn't a lot of cooperation in the sense that
I wrote a few patches and he wrote some things, I think, for Mes, but mostly we
worked on our own projects. Same as Jeremiah Orians, but we made it fit together. And
yeah, that was really great. And of course, after some time, Jeremiah proposed to write
M2-Planet, which is a bootstrappable, yeah, sub C compiler. So it is a subset of C.
Some people say I cannot call it C because it breaks some C rules, but it is, when Jeremiah did
that, it allowed me to dumb down the Mes C code base even further so that we could align
or yeah, align our two projects to work together. So I've worked a lot with Jeremiah,
which was a big help. And yeah, the past years, more people joined. There's the live-bootstrap
project, which implements a bootstrap outside of any distribution, just as a reference implementation.
But yeah, the community has been really great, especially the last year with
Ekaitz and Andrius, where we talk on IRC about all our problems and people just listen
and rubber duck you and help you with the problems, sometimes even debug it. So I guess it helps that
it was a small community and that we achieved what we set out to do. So yeah, it's very friendly.
And without that, I surely would have given up. Yeah.
And just curious, like what do you like most about the project? I mean,
there's got to be something about you to look at a big blob and go like, I'm going to spend years
to just pull it apart.
I think it's a very easy project to love from my perspective. So
it's a glaring problem that we have known about since the beginning of the 80s, '83 or '84, I think,
that practically everybody in computing knows about and that everybody has been ignoring.
And that most people agree who look into it, that it really is a big problem and it hasn't been
solved. So, and then especially when we started, there were a lot of people who said it can't be
done. So if you have a big problem and people say it can't be done and someone famous wrote an
article about it 40 years ago, yeah, and it involves writing your software without
many dependencies like we did in the old days. So when I started Mes,
it depended on nothing except for the Linux kernel, so to say. So no frameworks, no full
stack developer nonsense, no JavaScript stuff, just small programs that you can understand,
write yourself and don't depend on much. Yeah, it's an amazing mix, I would say.
Going back to the Trusting Trust problem like Thompson described, I mean,
shouldn't everything be bootstrapped in your vision?
Yes, I think everything should be bootstrapped. Yeah, but why do you ask?
What are you thinking about? Do you doubt that for certain things?
Well, it's not doubt. We are building a trustworthy internet. Yeah. In my perception,
we can only trust things if we can trust what is at the base of those things. And I think, well,
answering my question myself a little bit maybe, if we can bootstrap certain things,
then we can build on those, from those, the rest of course. Yes. So if we can build a
safe C compiler, then we can build trustworthy programs on that safe C compiler and the same
goes for the rest of course. Yes, but I would say if you have an application, let's say,
Emacs or Vim, if you built that in Guix with the trusted C compiler that you bootstrapped,
I would say those programs you built, Emacs or Vim, are also bootstrapped. So
in some cases, if you have a program
that's written in C and it only depends on libraries that are written in C and that can be
bootstrapped themselves, then the software is also bootstrapped. So for me, one of the problems
arises when developers start adding dependency cycles to their software;
then it becomes very hard, if not impossible at that time, to bootstrap it.
And yeah, what do you do? Then you have to go inject a binary. So my idea is that
there's no technical or sound reason to inject untrusted binaries in your software stack.
We've shown that you can bootstrap a system. Let's get rid of all the binary blobs
and all the generated code that is still used here and there. Yeah, couldn't agree more.
And of course, it may have downsides if you want to build everything from source,
it can take a lot of time. So I could imagine that someone would create a trusted base system
where everything is bootstrapped and that system is in some way,
marked as trusted and hashed and everyone can build on that and you don't build it yourself.
But yeah, that's just using a binary substitute for something that is bootstrapped. So
I think doing something like that just makes sense. That's essentially what Guix and Nix do
when you install a package anyway. Yeah. You mentioned NGI Zero a couple of times already
in our discussion. How does NGI Zero funding help your project?
Well, that's very easy. I've had funding for at least three or four periods and if I hadn't had
that, I wouldn't have had the time to work on this. So it would really be a hobby project and
it's hard to tell, but it's been in my eyes essential to get this to work. Because as you
asked before, where do you get the inspiration, why do you work on this, why is
this important or fun? It is a big project,
and if it drags out too long, yeah, it's easy to lose interest maybe. So
at the perfect time when I thought it was really early stages, I was really surprised
that when I asked for funding and explained why this was important, that NLnet, NGI Zero
immediately understood why it is important and wanted to fund. Also that support, not only
financially, which was very important, but also the support and the understanding and the
acknowledgement that this is important, that went with it, that did really wonders for my project.
So I'm really grateful for this. You're doing amazing and important work.
Well, and I want to return that because you are doing the amazing work and with a
ground laying project, I think. Yeah, thanks. So it's one of the most lovely examples of
how we help each other and bring the world forward. Yeah.
If you had to give advice to people who are now considering to apply for NGI Zero funding,
what would you say to them? I would say do it. I heard from NGI Zero, from Pjotr Prins,
at FOSDEM, I told him about my project or maybe he already knew about it and he said you have to
apply for NGI funding and I said, oh no, I'm never going to do that again. For GNU LilyPond,
Han-Wen and I went through an EU funding application twice, I think.
It wasted months of our lives and it didn't produce anything and yeah, it was all spare time
at that time. So your project stands still for a couple of months, which is terrible and it's not
inspiring work. So I said, I'm never going to do that again and then he said, Pjotr said,
it's really easy but we have to do it now because the deadline is today at 12.
And I can help you write the application. So yeah, we wrote the application and I got funded. So
my advice, we just put in between the two of us, maybe two or three hours of work to make a sound
proposal because it really helps if you know what you're doing and you can explain why your
project is important. But if you do, chances are pretty good that you get funded. So it's a really
lightweight process and it can help your project a lot. So if you're doing something important
for the internet, for freedom or security, go check it out.
I really like the fact that between deciding to apply and applying,
it was two hours or something. That's great.
Great. We're reaching a bit the end of the questions, but I was wondering
if the people who are listening now and they want to contribute to GNU Mes or they maybe want to
help get things more bootstrappable, how can they join? How can they help? What can they do?
Yeah, so what they can do: we have an IRC channel, which is #bootstrappable on Libera.Chat.
That would be the best thing to join. There's also the bootstrappable mailing list.
I would just say check out the GNU Mes website for that information. And there's also a couple of
pretty nice blog posts on the Guix website about bootstrapping and reproducible builds.
So you get some background in how that all works. So yeah, if you're interested, go check it out and
say hi. Come say hi, send an email or join IRC. I think there's even a Matrix bridge for it.
So yeah, don't hesitate and see if there's anything that you would like to do and can pick up.
There's a lot of work that we still have to do. That's a nice invitation for anybody who's
interested to have a peek. So those were the questions from our side. Is there anything you would
still like to say? Yeah, maybe what I would like to say is
I've been thinking, how did I, you asked this, how did you get involved in this? And I told you I
found Guix. But of course in a past life, I worked on GNU LilyPond. And at some time,
Han-Wen and I decided to work less on that and step back a bit to give other people a chance
and the community to take up. And that's been lovely. But then for a couple of years,
three or four years, I didn't really have a project. And that was really great because
I had a lot of free time. But yeah, there was also a kind of nagging or guilty kind of feeling,
shouldn't I be programming something? But a music typesetter is really great. People enjoy that
software. But I was really going, well, I was looking for something important to do, something
that I would enjoy, but also something, I call it a hole in the world. When you look at the world
from some distance and you say, hmm, why is there a big white space here? Why doesn't someone fill it
in? So in that period, I started together with others, I took the initiative for this democratic
school, which I think is really important. Yeah, if you are into education at
all, or have children, look up democratic education and sociocracy. But yeah, I didn't have the time
to really go work on that full time. So that was a fun project, but others took over. And I was still
going, what shall I do? And then I just happened on Guix and found this remark from Ludovic
Courtès in the manual saying, well, I told that before, this is important to reduce this size.
And then there was at the same time, Jeremiah with his stage0 project. And I was going, well,
that's a gap and that's a big white space or gap or hole in the world. So yeah, I guess what I'm
saying is, it really helped me to stop doing what I've been doing for a couple of years, get out of
the treadmill, look back and reflect on what could give you energy and what you think is
important. And some things may just work out. But yeah, you can let that go again. So I did that
with the school, but it's still a very nice thing. And yeah, sometimes it, in my case, it took three
or four years to find something to work on. And I'm very happy that I didn't jump on another project,
but had so much time on my hands to be bored enough to start this. So yeah.
And then you were completely swallowed by it. So it was good.
Yeah. So I was going, I'm never going to work so hard again as I did on LilyPond. And then,
oh well, here I am doing it again. So, but yeah, it's been great fun. Yeah.
That's great, Janneke. Thank you so much for building all of GNU Mes and also for talking
to us and explaining why and how and all the things it took and who helped along the way.
So thank you very much for talking to us. Yeah, I'm sure I forgot some important names.
The Reproducible Builds community was essential to get bootstrapping going. And yeah,
I want to thank that whole community too, people, everyone who helped.