Linux and Tiny C Compiler in the browser, part one
2022-05-22
Introduction
Current C compilers running in the browser are experimental, though
[2]Clang In Browser is pretty impressive. Instead of porting a compiler
to WASM, I'm going to take a different approach and use my favourite
method for a lot of things: virtual machines. It's slower, especially
since I'm using a JavaScript cpu emulator, but decent performance is
possible with a fast compiler like Tiny C Compiler and a custom Linux.
Demo
Try cat /opt/test.c and tcc -run /opt/test.c
Motivation
Back in the day I could sit for hours tweaking the Linux kernel on my
Pentium-something in an attempt to make the system boot faster. Most
of the time I just broke things and had to recompile Gentoo. Today
there's rarely a need to compile Linux; if you need something
bare-bones, you probably use Docker with Alpine Linux. Compiling Linux
is still useful in the embedded space, and with a C compiler in the mix
you get to learn the basics of how programs work.
In the meantime, unikernels such as MirageOS and Unikraft have
surfaced as a supplement or even an alternative to Docker. One of the
ways they differ is that your code is compiled into an operating system
instead of running on top of Linux. Imagine you could compile Linux
into your code, with dead code elimination on every feature you don't
use! The sales pitch is this: reduced attack surface, fast boot times
and better performance. Building a custom Linux then becomes even more
exciting because unikernels borrow many concepts from Linux, e.g.
Unikraft is configured in the same TUI as the Linux kernel (and
Buildroot), make and gcc are used to a great extent, and you can choose
between multiple libc implementations. But what exactly is a libc?
So...
What to expect
This tutorial teaches you how to compile a small Linux image that runs
in the browser via v86, a 32-bit x86 CPU emulator written in
JavaScript. You get insights into cross compilation with a modern
implementation of the C standard library, and into C internals when we
add a fast compiler to the image. Remote debugging with gdb is
described at the end, using gdbserver, virtual serial ports and qemu.
Prerequisites
A Linux system that is not WSL, at least an hour to spare for
compilation, and the following packages needed by Buildroot:
sudo apt install make gcc g++ libncurses-dev libssl-dev
I've built this on Ubuntu 20.04 and 22.04 using bash, but most modern
distros should be fine.
Before you start, create a directory for the project, perhaps
~/my-v86-linux, then cd into that and run all commands from there. All
commands will assume you are in that directory. Name it whatever you
like, and it doesn't have to be in ~/.
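Before starting a long build it can be worth checking that the basic
tools are on your PATH. The following is a minimal sketch, not part of
the original setup: missing_tools is a made-up helper, and the tool
list only covers commands used in this tutorial (the -dev packages are
libraries, so this check does not cover them).

```shell
# Hypothetical helper: print every command from the argument list that
# is not found on PATH. Prints nothing when all of them are available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Commands used throughout this tutorial.
missing=$(missing_tools make gcc g++ wget tar)
if [ -n "$missing" ]; then
    echo "Missing tools:" $missing
else
    echo "All tools found"
fi
```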
The v86 CPU emulator
v86 runs in the browser and emulates an x86-compatible CPU and
hardware. Machine code is translated to WebAssembly modules at runtime.
The list of emulated hardware is impressive:
* x86 instruction set similar to Pentium III
* Keyboard and mouse support
* VGA card
* IDE disk controller
* Network card
* virtio filesystem
* Sound card
View the full list of emulated hardware in [3]v86's readme.
You're not limited to Linux on this emulator. It runs Windows (1.01,
3.1, 95, 98, 2000), ReactOS, FreeBSD, OpenBSD and various hobby
operating systems.
v86 is a hobby project written by an anonymous developer under the
pseudonym "copy". Previous work according to [4]copy's webpage includes
an impossible game, Game of Life and a brainfuck interpreter written in
javascript.
Buildroot
Buildroot is a tool for generating embedded Linux systems through cross
compilation. It's a huge collection of cross compilation scripts and
configuration files put together in a nice terminal UI, and you can
tweak just about anything. It also acts as a customizable toolchain
that provides us with all the tools necessary to cross compile
applications that don't come as Buildroot packages. Read more on
[5]https://buildroot.org.
Let's get started.
cd into your project directory, then download and extract Buildroot:
Hint: Tab through commands and copy instead of using your mouse.
mkdir buildroot
wget https://github.com/buildroot/buildroot/archive/refs/tag
    --output-document - \
    | tar -xz --strip-components 1 --directory buildroot
Instead of building Linux from the default Buildroot configuration, we
use a template that sets the right cpu and architecture among other
things:
wget https://github.com/humphd/browser-vm/archive/refs/tags/
    --output-document - \
    | tar -xz --strip-components 1 browser-vm-1.0.2/buildroot-v8
Remove commands that compress licenses. We'll get to that later.
echo "" > buildroot-v86/board/v86/post-image.sh
Tell Buildroot to create a new .config file with preloaded settings
from the template:
make --directory buildroot BR2_EXTERNAL=../buildroot-v86 v86
You're almost ready to build the initial image. Execute:
make --directory buildroot menuconfig
Go to Toolchain -> C library and pick musl, then exit and save. Then
build everything:
make --directory buildroot
This is going to take a while, but the good thing is that caching is
enabled, so next time will be substantially faster.
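If you want to double check that the musl choice was saved, you can
look for it in buildroot/.config. This is a sketch under one
assumption: that BR2_TOOLCHAIN_BUILDROOT_MUSL is the symbol behind the
musl entry in the menu. uses_musl is a made-up helper name.

```shell
# Hypothetical helper: succeed if the given Buildroot .config selects
# musl. BR2_TOOLCHAIN_BUILDROOT_MUSL is assumed to be the symbol behind
# the "musl" entry in Toolchain -> C library.
uses_musl() {
    grep -q '^BR2_TOOLCHAIN_BUILDROOT_MUSL=y$' "$1"
}

config=buildroot/.config
if [ -f "$config" ]; then
    if uses_musl "$config"; then
        echo "musl is selected"
    else
        echo "musl is NOT selected; re-run menuconfig"
    fi
fi
```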
About musl: it's an implementation of the C standard library, like
uclibc and glibc. Your distro is probably using glibc, the GNU C
Library, which is big and not well suited for embedded Linux where size
matters. uclibc is better suited here, and so is musl, which seems to
be the clear winner in [6]this (biased) comparison. I prefer musl's MIT
license over (L)GPL, which makes it interesting for proprietary
applications running in unikernels. It's developed and maintained by
Rich Felker, with a long list of contributors, and its source code is
described in [7]this podcast (at 01:01:17) as reference code worth
reading for systems programming.
Preparing the website
While waiting for Buildroot to compile, let's create the website that
will host the emulator and run Buildroot Linux:
mkdir web
wget https://github.com/copy/v86/releases/download/latest/li
    --directory-prefix web
wget https://github.com/copy/v86/releases/download/latest/v8
    --directory-prefix web
wget https://github.com/copy/v86/releases/download/latest/v8
    --directory-prefix web
wget https://github.com/copy/v86/archive/refs/tags/latest.ta
    --output-document - \
    | tar -xz --strip-components 2 --directory web \
    v86-latest/bios/seabios.bin \
    v86-latest/bios/vgabios.bin

cat >web/index.html <<EOF

<meta charset="utf8">
<title>Emulator</title>
<body bgcolor="#101010">

<div id="screen_container">
    <div style="white-space: pre; font: 14px monospace; line
    <canvas hidden></canvas>
</div>

<script src="/libv86.js"></script>
<script>
var emulator = new V86Starter({
    wasm_path        : "/v86.wasm",
    memory_size      : 64 * 1024 * 1024, // 64 MB memory ou
    vga_memory_size  : 2 * 1024 * 1024,
    screen_container : screen_container,
    bios             : {url: "/seabios.bin"},
    vga_bios         : {url: "/vgabios.bin"},
    cdrom            : {url: "/linux.iso"},
    filesystem       : {},
    autostart        : true
})
</script>
EOF
When Buildroot is done compiling, run
cp buildroot/output/images/rootfs.iso9660 web/linux.iso
Then open a new terminal and start a simple webserver pointing to the
web directory, e.g.
python3 -m http.server 8000 --directory web
and open [8]http://localhost:8000 to see v86 in action. Log in as root,
no password needed.
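If the screen stays black, a likely cause is a file that index.html
refers to but that never made it into the web directory. Here is a
small sketch that lists what's missing; missing_web_files is a made-up
helper, and the file names are the ones referenced in index.html above.

```shell
# Hypothetical helper: list files that index.html loads but that are
# missing from the given web directory.
missing_web_files() {
    dir=$1
    for f in index.html libv86.js v86.wasm seabios.bin vgabios.bin linux.iso; do
        [ -f "$dir/$f" ] || printf '%s\n' "$f"
    done
}

# Prints nothing when everything is in place.
missing_web_files web
```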
Customize your image
Buildroot is all about customization. Try the following commands:
make --directory buildroot menuconfig
make --directory buildroot busybox-menuconfig
make --directory buildroot linux-menuconfig
There's a lot to explore.
menuconfig
menuconfig is where you configure Buildroot with things such as the
Linux kernel version, which bootloader to use (grub2, syslinux etc.),
the libc implementation to use (as when you chose musl), and which
architecture to compile for. There are multiple packages to choose
from, ranging from small libraries and utilities to X11 and Qt.
busybox-menuconfig
Busybox combines hundreds of Linux utilities into one binary and is
also highly configurable with busybox-menuconfig. It provides you with
ls, grep, diff and many other utilities you're used to on Linux, and
I'd encourage you to remove all the tools you don't use to create a
smaller image. Ideally Busybox would come with the bare minimum instead
of having to manually remove unnecessary things. This is where
unikernels shine, because they take the opposite approach, where you
start with almost nothing and add what you need.
linux-menuconfig
linux-menuconfig is where you configure the Linux kernel. There's a
million things to configure, and you can easily break something unless
you know what you're doing. In one of the following tutorials for this
series, I'll show you how to tweak the kernel by trial and error, since
that's how I do it: Remove one feature, test the system, rinse and
repeat.
Resist the temptation to make changes for now.
rootfs_overlay
Located in buildroot-v86/board/v86/rootfs_overlay, this is where you
place files that you want to add to the image. Our template includes
two files: etc/fstab and etc/inittab.
Disable kernel messages after login
Some things are not critical for booting the system, but are run as
part of the boot process anyway. They can be slow to start and clutter
the terminal after login, potentially adding log messages in the middle
of writing a command. To disable kernel messages after login, create
the following file:
mkdir buildroot-v86/board/v86/rootfs_overlay/etc/profile.d
echo "echo 0 >/proc/sys/kernel/printk" \
    >buildroot-v86/board/v86/rootfs_overlay/etc/profile.d/no
All .sh files in etc/profile.d are run on login.
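To see how that works, here is a sketch of the kind of loop a login
shell's /etc/profile typically uses to pick up those scripts. I'm
assuming the image's /etc/profile contains something along these lines;
the temporary directory, the file names and source_profile_d are made
up so that you can try it safely on the host.

```shell
# Sketch of a profile.d loop, run against a temporary directory.
demo=$(mktemp -d)
mkdir -p "$demo/etc/profile.d"
echo 'echo "hello from profile.d"' >"$demo/etc/profile.d/demo.sh"
echo 'echo "ignored: no .sh suffix"' >"$demo/etc/profile.d/demo.txt"

source_profile_d() {
    for script in "$1"/*.sh; do
        # Skips the unexpanded pattern when there are no .sh files
        [ -r "$script" ] && . "$script"
    done
}

source_profile_d "$demo/etc/profile.d"
```

Only demo.sh is sourced; demo.txt is skipped because of its suffix.
Remove the temporary directory when you're done.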
Auto login
etc/inittab prepares the file system and mounts etc/fstab, runs init
scripts and "spawns" applications after boot. One of the commands for
spawning ends with the comment "# GENERIC_SERIAL", and that line needs
to be changed to not prompt for login and just start /bin/sh.
(F=buildroot-v86/board/v86/rootfs_overlay/etc/inittab && cp
&& sed --in-place "28d" $F \
&& sed --in-place "s/.*# GENERIC_SERIAL/console::respawn:-\
&& diff /tmp/oldf $F)
Notice that the command starts with console::respawn. Respawn means
that Busybox init starts the shell again whenever it exits, for example
after a crash. getty is replaced here because it's the application that
prompts for login. It also prevents us from sending messages between
ttys, which only makes sense in a multi-user system: if user A is
logged into tty1 and user B is in tty2, then A shouldn't be able to
bother B with `echo "Hi B!" >/dev/tty2`. Instead we spawn -/bin/sh,
where the hyphen instructs Busybox to treat the shell as a login shell.
Without it, /etc/profile and scripts in /etc/profile.d are ignored.
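The substitution is easier to follow on a tiny example. This is a
sketch only: the inittab excerpt is made up for illustration, autologin
is a hypothetical helper, and the replacement line is the one described
above.

```shell
# Hypothetical inittab excerpt; the getty line is illustrative only.
sample='::sysinit:/bin/mount -a
console::respawn:/sbin/getty -L console 0 vt100 # GENERIC_SERIAL
::shutdown:/bin/umount -a -r'

# Replace the whole line carrying the marker with a login shell on the
# console. Fields are id:runlevels:action:process.
autologin() {
    sed 's|.*# GENERIC_SERIAL$|console::respawn:-/bin/sh|'
}

printf '%s\n' "$sample" | autologin
```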
To add the new files to your image, simply build again:
make --directory buildroot
cp buildroot/output/images/rootfs.iso9660 web/linux.iso
Add Tiny C Compiler
Tiny C Compiler, or tcc, is:
* ANSI C compliant, with most [9]C99 extensions.
* Small, roughly 300 KB.
* Fast: 9 times faster than gcc according to [10]the homepage.
I've used tcc to compile win32 applications with opengl and gdi+, and a
pdf library that we'll use later to benchmark performance. There are
limitations to what can be compiled (I haven't managed to compile
libpng, for instance), but you can use gcc to provide a shared library
that tcc can link with.
The compiler is written by Fabrice Bellard, author of qemu, ffmpeg,
quickjs, jslinux, and the list goes on. You've likely used his software
in one way or another. I will use the last version he released before
abandoning tcc, but it's alive and well in [11]this fork.
To get tcc working we have to compile it twice. The first build exists
only to produce libtcc1.a: according to the Makefile, gcc compiles tcc,
and then tcc builds and outputs libtcc1.a. If we started by compiling
against musl, the resulting tcc would not run on the host, and thus
libtcc1.a could not be built. So the first step is to configure the
build with --enable-cross, which builds a cross compiler that compiles
the right libtcc1.a. After that, we can compile for a single
architecture and libc: x86 musl.
mkdir tcc
wget http://download.savannah.gnu.org/releases/tinycc/tcc-0.
    --output-document - \
    | tar -xj --strip-components 1 --directory tcc \
    --exclude tests --exclude examples

mkdir libtcc
cp --recursive tcc/* libtcc
Configure the tcc cross compilers for the current CPU architecture to
get the i386 version of libtcc1.a:
(cd libtcc && ./configure --prefix=./output --enable-cross)
Malloc hooks were removed in glibc 2.34, and Ubuntu 22.04 ships with
glibc 2.35. The next two commands are unnecessary on Ubuntu 20.04, but
harmless.
(F=libtcc/lib/bcheck.c && cp $F /tmp/oldf \
&& sed --in-place "s/#define CONFIG_TCC_MALLOC_HOOKS//" $F
&& sed --in-place "s/#define HAVE_MEMALIGN//" $F \
&& diff /tmp/oldf $F)
Then build libtcc on the host and copy it to the file system overlay.
make --directory libtcc
make --directory libtcc install
mkdir -p buildroot-v86/board/v86/rootfs_overlay/lib/tcc
cp libtcc/output/lib/tcc/i386-libtcc1.a \
    buildroot-v86/board/v86/rootfs_overlay/lib/tcc/libtcc1.a
The next step is to configure and build the compiler for x86 musl.
(cd tcc && ./configure \
    --cpu=x86 \
    --config-musl \
    --cross-prefix=${PWD}/../buildroot/output/host/bin/i686-
    --elfinterp=/lib/ld-musl-i386.so.1 \
    --crtprefix=/lib \
    --libdir=/lib \
    --tccdir=/lib/tcc \
    --bindir=/bin \
    --includedir=/include \
    --sysincludepaths=/lib/tcc/include:/include \
    --sharedir=-unused \
    `# We need debug symbols for later, but uncomment this i
    `# The difference is ~70% file size reduction.` \
    `# --strip-binaries`)
make --directory tcc \
    --assume-old libtcc1.a \
    --assume-old tcc-doc.html \
    --assume-old tcc-doc.info
DESTDIR=$PWD/tcc/output make --directory tcc install
cp --recursive tcc/output/* buildroot-v86/board/v86/rootfs_o
--assume-old makes make skip libtcc1.a. It also skips the steps that
require makeinfo; documentation ends up in the directory
"output-unused", as specified in a somewhat hacky way by
--sharedir=-unused. DESTDIR is set when installing because configuring
with --prefix=./output would compile tcc with search paths beginning
with that prefix.
--elfinterp points to the dynamic linker in the image, which is
responsible for locating the shared libraries needed by an application,
preparing it to run and then executing it. Because we use musl, this
file is called ld-musl-i386.so.1, but on your glibc-based distro it's
(likely) ld-linux-x86-64.so.2. Without it, the system won't know how to
start applications and you'll get `/bin/sh: {your command}: not found`.
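You can check which dynamic linker a binary asks for with readelf from
binutils. The sketch below only parses readelf's output; elf_interp is
a made-up helper, and I'm assuming the usual "Requesting program
interpreter" line that `readelf -l` prints for dynamically linked
executables.

```shell
# Hypothetical helper: pull the interpreter path out of `readelf -l`
# output read from stdin.
elf_interp() {
    sed -n 's/.*\[Requesting program interpreter: \(.*\)]/\1/p'
}

# On the host, for example (requires binutils):
#   readelf -l /bin/sh | elf_interp
# For the cross-compiled compiler from this tutorial:
#   readelf -l tcc/output/bin/tcc | elf_interp
```

If the second command doesn't print /lib/ld-musl-i386.so.1, the
--elfinterp setting didn't take effect.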
For tcc to create executables, it needs startup routines that are
linked into the executable. Those files start with crt, short for c
runtime, and we have configured tcc to search for them in /lib. Since
tcc supports running c without creating an executable via `tcc -run
file.c`, you only need these files if you want to build executables
(and if you plan on continuing this tutorial). Here's a quick summary
of crt files from [12]https://dev.gentoo.org/~vapier/crt.txt:
crt1.o
    Contains the _start symbol which sets up the env with
    argc/argv/libc _init/libc _fini before jumping to the libc main.
crti.o
    Defines the function prolog; _init in the .init section and
    _fini in the .fini section.
crtn.o
    Defines the function epilog.
cp buildroot/output/host/i686-buildroot-linux-musl/sysroot/l
    buildroot-v86/board/v86/rootfs_overlay/lib
That is what's needed for running tcc in v86, but it doesn't do much
without musl's standard C headers. We pick only the bare minimum,
because all headers together are ~5 MB uncompressed.
printf "buildroot/output/host/i686-buildroot-linux-musl/sysr
bits alloca.h assert.h complex.h ctype.h errno.h fenv.h
inttypes.h iso646.h limits.h locale.h math.h memory.h ma
signal.h stdalign.h stdarg.h stdbool.h stddef.h stdint.h
stdnoreturn.h string.h strings.h tgmath.h threads.h time
wchar.h wctype.h \
| xargs -0 cp --recursive --target buildroot-v86/board/v86/r
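The printf and xargs combination may look cryptic, so here is the same
technique in isolation. It's a sketch in a throwaway directory with
made-up file names, and it assumes GNU cp for --target-directory.

```shell
# Demonstration in a temporary directory; all file names are made up.
work=$(mktemp -d)
mkdir -p "$work/src/bits" "$work/dest"
touch "$work/src/stdio.h" "$work/src/string.h" "$work/src/bits/alltypes.h"

# printf reuses its format for every argument, so each name gets the
# same prefix and a terminating NUL byte; xargs -0 splits on those NUL
# bytes and hands all paths to a single cp invocation.
printf "$work/src/%s\0" bits stdio.h string.h \
    | xargs -0 cp --recursive --target-directory "$work/dest"

ls "$work/dest"
```

The point of the NUL separator is that it works with any file name,
including names with spaces.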
Hello world
With tcc compiled and installed into our image, it's time to prepare
some code to test if the compiler works.
mkdir buildroot-v86/board/v86/rootfs_overlay/opt
cat >buildroot-v86/board/v86/rootfs_overlay/opt/test.c <<EOF
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    char *name = "stranger";
    if (argc > 1 && strlen(argv[1]) > 0)
        name = argv[1];
    printf("Hello, %s\n", name);
    return 0;
}
EOF
Rebuild image with the new files:
make --directory buildroot
cp buildroot/output/images/rootfs.iso9660 web/linux.iso
If you've closed your server, open a new terminal and run
python3 -m http.server 8000 --directory web
Go to [13]http://localhost:8000 and try this in the emulator:
# Compile and run without producing a binary
tcc -run /opt/test.c

# Create binary
tcc /opt/test.c -o hello
./hello world
Benchmarking
Time for a quick benchmark to see what performance we can expect. We'll
use the excellent pdf writer library, libharu.
mkdir libharu
wget https://github.com/libharu/libharu/archive/refs/tags/RE
    --output-document - \
    | tar -xz --strip-components 1 --wildcards --directory libha
    "libharu-RELEASE_2_3_0/include/*.h" \
    "libharu-RELEASE_2_3_0/src/*.c" \
    libharu-RELEASE_2_3_0/src/t4.h \
    libharu-RELEASE_2_3_0/demo/line_demo.c

cat >libharu/include/hpdf_config.h <<EOF
#define LIBHPDF_HAVE_NOPNGLIB
#define HPDF_NOPNGLIB
#define LIBHPDF_HAVE_NOZLIB
EOF
Doing `sudo apt install sloccount` and then `sloccount libharu` tells
us that the library consists of 128394 physical source lines of code.
That's because of surprisingly big files with arrays containing
encoding data, but let's see how long it'll take to compile that by
creating a quick and dirty benchmark that works for both gcc and tcc.
cat >libharu/benchmark <<EOF
LIBHARUDIR=\$(dirname \$(readlink -f "\$0"))
CC=\$1
[[ \$CC = gcc ]] && LIBMATH=-lm
# The glob stays outside the quotes so the shell expands it
time \$CC -I\$LIBHARUDIR/include "\$LIBHARUDIR"/src/*.c \\
    \$LIBHARUDIR/demo/line_demo.c \$LIBMATH -o /dev/null
EOF
chmod +x libharu/benchmark

# Build a shared library for another benchmark
buildroot/output/host/bin/i686-buildroot-linux-musl-gcc -sha
    -Ilibharu/include libharu/src/*.c -lm \
    -o buildroot-v86/board/v86/rootfs_overlay/lib/libharu.so

# Make it easy to run a benchmark where tcc links with libha
# compiling from scratch.
cat >libharu/benchmark-link <<EOF
time tcc -Ilibharu/include -lharu libharu/demo/line_demo.c -
EOF
chmod +x libharu/benchmark-link

cp --recursive libharu buildroot-v86/board/v86/rootfs_overla

make --directory buildroot
cp buildroot/output/images/rootfs.iso9660 web/linux.iso
Run the benchmarks
Run this locally
libharu/benchmark gcc
Run this in the emulator
libharu/benchmark tcc # Patience required
libharu/benchmark-link
As the benchmark unsurprisingly shows, linking to a precompiled shared
library is faster than compiling from scratch. On my machine,
benchmark-link takes 60 ms in v86. Not bad! Take a look at
libharu/demo/line_demo.c; it's not the tiniest C file out there.
I didn't show you how to compile a shared library with tcc on purpose
(only how to link with one). There's a bug somewhere, and we'll
investigate that in the next section.
Debugging
If you've followed the steps so far, you can open your emulator and
execute
tcc -shared -fPIC -Ilibharu/include libharu/src/*.c
This command tells tcc to compile a shared library instead of an
executable and will take approximately 30 seconds, then it'll exit with
a segmentation fault.
I won't tell you how to fix this problem, because I have no need to
compile shared libraries with tcc on a custom x86 system, nor do I have
the intellect to fix the bug. But I didn't know (the latter) at the
time, so I wanted to figure out what was wrong, which required...
Remote debugging
The gnu debugger, gdb, supports remote debugging via gdbserver, which
is a small application you run on the target and connect to from gdb.
Running gdbserver inside v86, inside a browser, and connecting to that
from gdb would be cool, but since gdb doesn't work in v86 (you'll find
out why later), gdbserver is not going to either. So to debug
something, we need to reproduce the bug in qemu, and use socat to
create a virtual serial port for gdb/gdbserver communication. And to
compile gdb we need musl-cross-make via git.
sudo apt install qemu-system-i386 socat git
With qemu installed, it's easy to boot your image
qemu-system-i386 -serial stdio -cdrom web/linux.iso -cpu Wes
And you even get a nice serial console for copy pasting! That was the
good news, now for the bad...
Buildroot, gdb and musl don't go well together, and selecting the gdb
package results in configure errors. So we have to compile gdb on our
own, using a different toolchain. This could have been avoided with
uclibc instead of musl, but in the name of MIT licenses, here we are.
Hopefully you won't mind another huge compilation step.
The following will clone musl-cross-make, configure and compile it.
git clone https://github.com/richfelker/musl-cross-make.git

cat >musl-cross-make/config.mak <<EOF
TARGET=i686-linux-musl
MUSL_VER=git-v1.2.2
GCC_VER=10.3.0
# Not needed libs
COMMON_CONFIG += --disable-nls
EOF

make --directory musl-cross-make -j$(nproc)
make --directory musl-cross-make install
Now is the time to grab a coffee.
Welcome back. We're now ready to build gdb/gdbserver with the toolchain
installed into musl-cross-make/output/bin. Compiling gdb 10.2 is ideal
here because it doesn't require gmp (GNU Multiple Precision Arithmetic
Library), which later versions do.
mkdir gdb
wget https://ftp.gnu.org/gnu/gdb/gdb-10.2.tar.gz --output-do
    | tar -xz --strip-components 1 --directory gdb
(cd gdb && \
    PATH=$PATH:$PWD/../musl-cross-make/output/bin \
    ./configure \
    --prefix=$PWD/output \
    --host=i686-linux-musl \
    --disable-nls \
    --with-curses)
PATH=$PATH:$PWD/musl-cross-make/output/bin make --directory
PATH=$PATH:$PWD/musl-cross-make/output/bin make --directory
The new toolchain in `musl-cross-make/output/bin` follows a naming
convention for cross compilers, so every program starts with
i686-linux-musl as specified in musl-cross-make/config.mak by TARGET.
gdb follows the same convention, and by specifying i686-linux-musl in
`--host` and adding the toolchain to PATH, gdb is able to locate the
right tools without having to install them on your system. We also
--disable-nls (localization) and compile --with-curses instead of a
default ancient alternative that we'd have to compile separately.
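To make the naming convention concrete, here is a sketch that looks up
prefixed tools the way a configure script would find them on PATH.
find_cross_tools is a made-up helper, and the tool list is just a
sample of what I'd expect the toolchain to contain.

```shell
# Hypothetical helper: report which <triplet>-<tool> programs are on
# PATH.
find_cross_tools() {
    triplet=$1
    shift
    for tool in "$@"; do
        if path=$(command -v "$triplet-$tool"); then
            printf '%s -> %s\n' "$triplet-$tool" "$path"
        else
            printf '%s -> not found\n' "$triplet-$tool"
        fi
    done
}

# Same PATH extension as in the commands above, limited to a subshell
# so that your own PATH is left alone.
(PATH=$PATH:$PWD/musl-cross-make/output/bin
 find_cross_tools i686-linux-musl gcc g++ ar strip)
```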
Clean gdbserver by stripping it of debug symbols and non-essential
data, and copy it to the target. This reduces the gdbserver file size
from 8 MB to 500 kB. For gdbserver to run, the C++ standard library is
required as well.
musl-cross-make/output/bin/i686-linux-musl-strip gdb/output/
cp gdb/output/bin/gdbserver buildroot-v86/board/v86/rootfs_o
musl-cross-make/output/bin/i686-linux-musl-strip \
    musl-cross-make/output/i686-linux-musl/lib/libstdc++.so.
cp musl-cross-make/output/i686-linux-musl/lib/libstdc++.so.6
    buildroot-v86/board/v86/rootfs_overlay/lib
These files are ~2500 kb in total, so you want to remove them again
after debugging.
gdb must then be compiled for the host with i686 target support, which
is easy in Buildroot:
make --directory buildroot menuconfig
Then select Toolchain -> Build cross gdb for the host and compile:
make --directory buildroot
cp buildroot/output/images/rootfs.iso9660 web/linux.iso
Qemu and virtual serial ports
While compiling, we create a pseudo terminal (pty) acting as a virtual
serial port. Since socat gets terminals with unpredictable ids like
/dev/pts/2 and /dev/pts/18, we tell socat to create symbolic links to
them with names we know in advance.
Open a new terminal and run the following:
socat pty,rawer,link=/tmp/vserial-host pty,rawer,link=/tmp/v
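If you want to check which pty a link ended up pointing to, or wait for
socat to create it from a script, a small helper like the one below
works. `wait_for_pty_link` is a hypothetical name, and polling once per
second is just a simple choice for this sketch.

```shell
#!/bin/sh
# Hypothetical helper: wait until a symlink exists, then print its target.
wait_for_pty_link() {
    link=$1
    tries=${2:-10}           # number of one-second polls before giving up
    while [ "$tries" -gt 0 ]; do
        if [ -L "$link" ]; then
            readlink "$link" # prints the pty that socat allocated
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    echo "timed out waiting for $link" >&2
    return 1
}

# Example calls:
# wait_for_pty_link /tmp/vserial-host
# wait_for_pty_link /tmp/vserial-target
```

Both links disappear again when socat exits, so keep that terminal open
for the whole debugging session.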
When compilation is done, start qemu in a new terminal and connect it
to the virtual serial port on the host:
qemu-system-i386 -serial stdio -cdrom web/linux.iso -cpu Wes
    -chardev serial,id=gdbserial,path=/tmp/vserial-host \
    -device isa-serial,chardev=gdbserial
If you run `dmesg | grep tty` in the serial console, you'll see two
connected ports: ttyS0, which is connected to your terminal via
`-serial stdio`, and ttyS1, which is connected to the virtual socat
serial port.
Start gdbserver in your qemu serial console to debug tcc:
gdbserver /dev/ttyS1 tcc -shared -fPIC -Ilibharu/include lib
Then start gdb on the host, pointing it to the cross-compiled version
of tcc:
buildroot/output/host/bin/i686-buildroot-linux-musl-gdb \
    -ix buildroot/output/staging/usr/share/buildroot/gdbinit \
    tcc/output/bin/tcc
-ix (short for -init-command) means: execute the commands in the file
buildroot/.../gdbinit before loading the "inferior", which is gdb's
name for the process being debugged (simply put). `gdbinit` is
provided by Buildroot and contains the following:
add-auto-load-safe-path {...}/buildroot/output/host/i686-buildroot-linux-musl/sysroot
set sysroot {...}/buildroot/output/host/i686-buildroot-linux-musl/sysroot
The first line marks the sysroot as a safe location for auto-loaded
files. The second tells gdb which directory contains copies of the
libraries on the target, in corresponding subdirectories.
Let's connect to qemu and run tcc:
(gdb) target remote /tmp/vserial-target
(gdb) continue
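If you repeat this often, the same session can be scripted. The sketch
below wraps the host gdb in a hypothetical `run_gdb_batch` function
that uses gdb's `-batch` and `-ex` options to connect, continue, and
print a backtrace once the program stops. The `GDB` and `GDBINIT`
variable names are assumptions of this sketch; the paths are the ones
used above.

```shell
#!/bin/sh
# Paths from the tutorial; GDB can be overridden from the environment.
GDB=${GDB:-buildroot/output/host/bin/i686-buildroot-linux-musl-gdb}
GDBINIT=buildroot/output/staging/usr/share/buildroot/gdbinit

# Hypothetical wrapper: connect to the serial link, run until the program
# stops, print a backtrace, and exit.
#   $1 = serial link created by socat, $2 = binary with debug symbols
run_gdb_batch() {
    "$GDB" -batch -ix "$GDBINIT" \
        -ex "target remote $1" \
        -ex continue \
        -ex backtrace \
        "$2"
}

# Example call (gdbserver must already be waiting in qemu):
# run_gdb_batch /tmp/vserial-target tcc/output/bin/tcc
```

Because of `-batch`, gdb exits after the last command, so gdbserver has
to be started again in qemu before each run.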
You'll get a few warnings, which I believe are due to shared libraries
being stripped of debugging symbols by Buildroot. Then the following
error appears:
0x004f9c1f in fill_local_got_entries (s1=0xb7e99020) at tccelf.c:1362
1362 for_each_elem(s1->got->reloc, 0, rel, ElfW_Rel) {
Looking into tcc's source, we see that this code only runs when
compiling shared libraries. Perhaps recompiling for uclibc makes a
difference, or upgrading to the tcc fork (which requires additional
work to get it compiled). Let me know if you fix the error and I'll add
it to the tutorial.
We could have added gdb to rootfs_overlay and run that in qemu instead,
but then we would lose the code snippets shown with the error, because
the source files are missing on the target. Feel free to use gdb on the
target if you're okay with just line numbers.
Debugging in v86
I've not been able to get gdb working in v86. Everything segfaults
whenever I attempt to debug. Changing the toolchain to uclibc will make
Buildroot compile gdb, but it doesn't fix the issue, and downgrading
gdb from 11.2 to 10.2 or 8 makes no difference. gdb works when running
in qemu, so it must have something to do with v86. It would have been
great to have gdb tell us what crashed at runtime, but a C compiler
will have to do for now.
Licenses
To collect all licenses from Buildroot, run
make --directory buildroot legal-info
They're then found in buildroot/output/legal-info. Getting a complete
list of licenses for everything used here is left as an exercise for
the reader.
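As a starting point for that exercise, here is a rough sketch that
counts the distinct licenses in a Buildroot manifest. It assumes that
legal-info wrote a `manifest.csv` whose fields are all double-quoted
and whose third column is LICENSE; check the header line of your own
file before trusting the output. `list_licenses` is a hypothetical
helper name.

```shell
#!/bin/sh
# Hypothetical helper: print each distinct license with the number of packages.
# Assumptions: every field is wrapped in double quotes, and LICENSE is column 3.
list_licenses() {
    # Splitting on the three characters "," keeps commas inside a field intact.
    awk -F'","' 'NR > 1 && NF >= 3 { count[$3]++ }
                 END { for (l in count) printf "%s\t%d\n", l, count[l] }' "$1" |
        sort
}

# Example call:
# list_licenses buildroot/output/legal-info/manifest.csv
```

This only covers what Buildroot built. tcc, musl-cross-make, and the
other pieces compiled by hand in this tutorial still have to be checked
separately.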
What's next
In the next tutorial(s) I'll show you how to:
  * Interact with v86 from JavaScript via serial and 9P.
  * Create a simple interface for dmesg diffing to better optimize the
    image.
  * Compile and run C applications in the browser with a small UI.
  * Build a streaming parser for Linux kernel calls to create a basic
    but highly stylable console with Unicode support that displays
    stdout (printf/puts/putchar/...) and asks for input on stdin
    (scanf/gets/getchar/...).
If you got this far, perhaps you want to subscribe to new tutorials?
Then [14]subscribetoj@nsommer.dk and I'll add you to the list. The mail
can be empty, but if not I promise I'll read it. You can always
[15]unsubscribetoj@nsommer.dk.
Tipping: I'm writing tutorials for as long as there's money in the
bank. Help me write more by tipping via bank transfer (IBAN) to DK81
2000 6277 7121 54. Any amount is highly appreciated!
References
1. https://ja.nsommer.dk/
2. https://tbfleming.github.io/cib
3. https://github.com/copy/v86/blob/master/Readme.md#readme
4. https://copy.sh/
5. https://buildroot.org/
6. http://www.etalabs.net/compare_libcs.html
7. https://www.se-radio.net/2020/06/episode-414-jens-gustedt-on-modern-c
8. http://localhost:8000/
9. https://bellard.org/tcc/tcc-doc.html#ISOC99-extensions
10. https://bellard.org/tcc/
11. https://repo.or.cz/tinycc.git
12. https://dev.gentoo.org/~vapier/crt.txt
13. http://localhost:8000/
14. mailto:subscribetoj@nsommer.dk
15. mailto:unsubscribetoj@nsommer.dk