
               Linux and Tiny C Compiler in the browser, part one

   2022-05-22

Introduction

   Current C compilers running in the browser are experimental, though
   [2]Clang In Browser is pretty impressive. Instead of porting a compiler
   to WASM, I'm going to take a different approach and use my favourite
   method for a lot of things: virtual machines. It's slower, especially
   since I'm using a JavaScript CPU emulator, but decent performance is
   possible with a fast compiler like Tiny C Compiler and a custom Linux.
   Demo

   Try cat /opt/test.c and tcc -run /opt/test.c

Motivation

   Back in the day I could sit for hours tweaking the Linux kernel on my
   Pentium-something in an attempt to make the system boot faster. Most
   of the time I just broke things and had to recompile Gentoo. Today
   there's rarely a need to compile Linux; if you need something
   bare-bones, you probably use Docker with Alpine Linux. Compiling Linux
   is still useful in the embedded space, and with a C compiler in the mix
   you get to learn the basics of how programs work.

   In the meantime, unikernels such as MirageOS and Unikraft have
   surfaced as a supplement or even an alternative to Docker. One of the
   ways they differ is that your code is compiled into an operating system
   instead of running on top of Linux. Imagine you could compile Linux
   into your code, having dead code elimination on every feature you don't
   use! The sales pitch is this: reduced attack surface, fast boot times
   and better performance. Building a custom Linux then becomes even more
   exciting because unikernels borrow many concepts from Linux: e.g.
   Unikraft is configured in the same TUI as the Linux kernel (and
   Buildroot), make and gcc are used to a great extent, and you can choose
   between multiple libc implementations. But what exactly is a libc?

   So...

What to expect

   This tutorial teaches you how to compile a small Linux image that runs
   in the browser via v86, a 32-bit x86 CPU emulator written in
   JavaScript. You get insights into cross compilation with a modern
   implementation of the C standard library, and into C internals when we
   add a fast compiler to the image. Remote debugging with gdb is
   described at the end, using gdbserver, virtual serial ports and qemu.

Prerequisites

   A Linux system that is not WSL, at least an hour to spare for
   compilation, and the following packages needed by Buildroot:

   sudo apt install make gcc g++ libncurses-dev libssl-dev

   I've built this on Ubuntu 20.04 and 22.04 using bash, but most modern
   distros should be fine.

   Before you start, create a directory for the project, perhaps
   ~/my-v86-linux, then cd into that and run all commands from there. All
   commands will assume you are in that directory. Name it whatever you
   like, and it doesn't have to be in ~/.

The v86 CPU emulator

   v86 runs in the browser and emulates an x86-compatible CPU and its
   surrounding hardware. Machine code is translated to WebAssembly modules
   at runtime. The list of emulated hardware is impressive:
     * x86 instruction set similar to Pentium III
     * Keyboard and mouse support
     * VGA card
     * IDE disk controller
     * Network card
     * virtio filesystem
     * Sound card

   View the full list of emulated hardware in [3]v86's readme.

   You're not limited to Linux on this emulator. It runs Windows (1.01,
   3.1, 95, 98, 2000), ReactOS, FreeBSD, OpenBSD and various hobby
   operating systems.

   v86 is a hobby project written by an anonymous developer under the
   pseudonym "copy". According to [4]copy's webpage, previous work
   includes an impossible game, Game of Life and a brainfuck interpreter
   written in JavaScript.

Buildroot

   Buildroot is a tool to generate embedded Linux systems through cross
   compilation. It's a huge collection of cross compilation scripts and
   configuration files put together in a nice terminal UI, and you can
   tweak just about anything. It also acts as a customizable toolchain
   that provides us with all the necessary tools to cross compile
   applications that don't come as Buildroot packages. Read more on
   [5]https://buildroot.org.

   Let's get started.

   cd into your project directory, then download and extract Buildroot:

   Hint: Tab through commands and copy instead of using your mouse.

   mkdir buildroot
   wget https://github.com/buildroot/buildroot/archive/refs/tag
       --output-document -  \
   | tar -xz --strip-components 1 --directory buildroot

   Instead of building Linux from the default Buildroot configuration, we
   use a template that sets the right cpu and architecture among other
   things:

   wget https://github.com/humphd/browser-vm/archive/refs/tags/
       --output-document -  \
   | tar -xz --strip-components 1 browser-vm-1.0.2/buildroot-v8

   Remove commands that compress licenses. We'll get to that later.

   echo "" > buildroot-v86/board/v86/post-image.sh

   Tell Buildroot to create a new .config file with preloaded settings
   from the template:

   make --directory buildroot BR2_EXTERNAL=../buildroot-v86 v86

   You're almost ready to build the initial image. Execute:

   make --directory buildroot menuconfig

   Go to Toolchain -> C library and pick musl, exit and save. Then build
   everything:

   make --directory buildroot

   This is going to take a while, but the good thing is that caching is
   enabled, so next time will be substantially faster.

   About musl... It's an implementation of the C standard library, like
   uclibc and glibc. Your distro is probably using glibc, the GNU C
   Library, which is big and not well suited for embedded Linux where size
   matters. uclibc is better suited here, and so is musl, which seems to
   be the clear winner in [6]this (biased) comparison. I prefer musl's MIT
   license over (L)GPL, since it makes musl interesting for proprietary
   applications running in unikernels. It's developed and maintained by
   Rich Felker, with a long list of contributors, and in [7]this podcast
   (at 01:01:17) its source code is recommended as reference code to study
   for systems programming.

Preparing the website

   While waiting for Buildroot to compile, let's create the website that
   will host the emulator and run Buildroot Linux:

   mkdir web
   wget https://github.com/copy/v86/releases/download/latest/li
       --directory-prefix web
   wget https://github.com/copy/v86/releases/download/latest/v8
       --directory-prefix web
   wget https://github.com/copy/v86/releases/download/latest/v8
       --directory-prefix web
   wget https://github.com/copy/v86/archive/refs/tags/latest.ta
       --output-document -  \
   | tar -xz --strip-components 2 --directory web  \
       v86-latest/bios/seabios.bin  \
       v86-latest/bios/vgabios.bin

   cat >web/index.html <<EOF

   <meta charset="utf8">
   <title>Emulator</title>
   <body bgcolor="#101010">

   <div id="screen_container">
       <div style="white-space: pre; font: 14px monospace; line
       <canvas hidden></canvas>
   </div>

   <script src="/libv86.js"></script>
   <script>
   var emulator = new V86Starter({
       wasm_path        : "/v86.wasm",
       memory_size      : 64 * 1024 * 1024,  // 64 MB memory ou
       vga_memory_size  : 2 * 1024 * 1024,
       screen_container : screen_container,
       bios             : {url: "/seabios.bin"},
       vga_bios         : {url: "/vgabios.bin"},
       cdrom            : {url: "/linux.iso"},
       filesystem       : {},
       autostart        : true
   })
   </script>
   EOF

   When Buildroot is done compiling, run

   cp buildroot/output/images/rootfs.iso9660 web/linux.iso

   Then open a new terminal and start a simple webserver pointing to the
   web directory, e.g.

   python3 -m http.server 8000 --directory web

   and open [8]http://localhost:8000 to see v86 in action. Log in as root,
   no password needed.

Customize your image

   Buildroot is all about customization. Try the following commands:

   make --directory buildroot menuconfig

   make --directory buildroot busybox-menuconfig

   make --directory buildroot linux-menuconfig

   There's a lot to explore.

  menuconfig

   menuconfig is where you configure Buildroot itself: the Linux kernel
   version, which bootloader to use (grub2, syslinux etc.), which libc to
   use (this is where you chose musl) and which architecture to compile
   for. There are also many packages to choose from, ranging from small
   libraries and utilities to X11 and Qt.

  busybox-menuconfig

   Busybox combines hundreds of Linux utilities into one binary and is
   also highly configurable with busybox-menuconfig. It provides you with
   ls, grep, diff and many other utilities you're used to on Linux, and
   I'd encourage you to remove all the tools you don't use to create a
   smaller image. Ideally Busybox would come with the bare minimum instead
   of having to manually remove unnecessary things. This is where
   unikernels shine, because they take the opposite approach, where you
   start with almost nothing and add what you need.

  linux-menuconfig

   linux-menuconfig is where you configure the Linux kernel. There are a
   million things to configure, and you can easily break something unless
   you know what you're doing. In a later part of this series, I'll show
   you how to tweak the kernel by trial and error, since that's how I do
   it: remove one feature, test the system, rinse and repeat.

   Resist the temptation to make changes for now.

  rootfs_overlay

   Located in buildroot-v86/board/v86/rootfs_overlay, this is where you
   place files that you want to add to the image. Our template includes
   two files: etc/fstab and etc/inittab.

  Disable kernel messages after login

   Some things are not critical for booting the system, but are run as
   part of the boot process anyway. They can be slow to start and clutter
   the terminal after login, potentially adding log messages in the middle
   of writing a command. To disable kernel messages after login, create
   the following file:

   mkdir buildroot-v86/board/v86/rootfs_overlay/etc/profile.d
   echo "echo 0 >/proc/sys/kernel/printk"  \
       >buildroot-v86/board/v86/rootfs_overlay/etc/profile.d/no

   All .sh files in etc/profile.d are run on login.
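
   To make that last sentence concrete, here is a minimal sketch of the
   kind of loop an /etc/profile commonly uses to pick up those scripts. I'm
   assuming the image's /etc/profile follows this convention;
   run_profile_d and the throwaway directory are names made up for the
   illustration, so it can be tried on the host without touching /etc.

```shell
# Hypothetical helper: source every readable *.sh file in a directory,
# the way a typical /etc/profile treats /etc/profile.d.
run_profile_d() {
    dir=$1
    for script in "$dir"/*.sh; do
        # Skip the unexpanded pattern when no script matches.
        [ -r "$script" ] || continue
        . "$script"
    done
}

# Try it against a throwaway directory instead of the real /etc/profile.d.
demo_dir=$(mktemp -d)
echo 'GREETING="set by profile.d"' >"$demo_dir/greeting.sh"
run_profile_d "$demo_dir"
echo "$GREETING"
rm -r "$demo_dir"
```

   Because the scripts are sourced rather than executed, anything they set
   stays in the login shell's environment.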

  Auto login

   etc/inittab prepares the file system, mounts what is listed in
   etc/fstab, runs init scripts and spawns applications after boot. One of
   the commands for spawning ends with the comment "# GENERIC_SERIAL", and
   that line needs to be changed to not prompt for login and just start
   /bin/sh.

   (F=buildroot-v86/board/v86/rootfs_overlay/etc/inittab && cp 
    && sed --in-place "28d" $F  \
    && sed --in-place "s/.*# GENERIC_SERIAL/console::respawn:-\
    && diff /tmp/oldf $F)

   Notice that the command starts with console::respawn. Respawn means
   that whenever sh exits or crashes, Busybox init starts it again.
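
   As a rough mental model, respawn behaves like the loop below. This is
   only a sketch of the idea, not Busybox's implementation: respawn_sketch
   and max_runs are hypothetical, and the run limit exists purely to keep
   the demo finite (init itself loops for as long as the system is up).

```shell
# Hypothetical stand-in for a "respawn" entry: start the command again
# every time it exits, no matter what its exit status was.
respawn_sketch() {
    max_runs=$1
    shift
    runs=0
    while [ "$runs" -lt "$max_runs" ]; do
        # The exit status is ignored on purpose; the command is
        # restarted either way.
        "$@" || :
        runs=$((runs + 1))
    done
    echo "$runs"
}

# "false" exits immediately with an error, so it gets started 3 times.
respawn_sketch 3 false
```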

   getty is replaced here because it's the application that prompts for
   login. It also prevents us from sending messages between ttys, which
   only makes sense in a multi-user system: if user A is logged into tty1
   and user B is in tty2, then A shouldn't be able to bother B with `echo
   "Hi B!" >/dev/tty2`. Instead we spawn -/bin/sh, where the hyphen
   instructs Busybox to treat the shell as a login shell. Without it,
   /etc/profile and scripts in /etc/profile.d are ignored.
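
   If you want to see what the substitution does before running it on the
   real file, you can try it on a single line. The sample line below is my
   assumption of what a getty entry marked with "# GENERIC_SERIAL" looks
   like, and the replacement text is reconstructed from the description
   above, so compare both against your own etc/inittab.

```shell
# Assumed sample entry; check rootfs_overlay/etc/inittab for the real one.
old_line='console::respawn:/sbin/getty -L console 0 vt100 # GENERIC_SERIAL'

# Replace everything up to and including the marker comment with a
# respawning login shell. "|" is used as the sed delimiter so the slashes
# in /bin/sh need no escaping.
new_line=$(printf '%s\n' "$old_line" \
    | sed 's|.*# GENERIC_SERIAL|console::respawn:-/bin/sh|')

echo "$new_line"
```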

   To add the new files to your image, you simply compile again:

   make --directory buildroot
   cp buildroot/output/images/rootfs.iso9660 web/linux.iso

Add Tiny C Compiler

   Tiny C Compiler, or tcc, is:
     * ANSI C compliant, with most [9]C99 extensions.
     * Small, roughly 300 KB.
     * Fast according to [10]the homepage, specifically 9 times faster
       than gcc.

   I've used tcc to compile win32 applications with opengl and gdi+, and a
   pdf library that we'll use later to benchmark performance. There are
   limitations to what can be compiled (I haven't managed to compile
   libpng, for instance), but you can use gcc to provide a shared library
   that tcc can link with.

   The compiler is written by Fabrice Bellard, author of qemu, ffmpeg,
   quickjs, jslinux and the list goes on. You've likely used his software
   in one way or another. I will use the last version he released before
   abandoning tcc, but it's alive and well in [11]this fork.

   To get tcc working we have to compile it twice. The first pass exists
   only to build libtcc1.a: according to the Makefile, gcc compiles tcc,
   and then tcc builds and outputs libtcc1.a. If we start by compiling
   against musl, the resulting tcc is not going to run on the host, and
   thus libtcc1.a cannot be built. So the first step is to configure the
   build with --enable-cross, which builds a cross compiler that compiles
   the right libtcc1.a. After that, we can compile for a single
   architecture and libc: x86 musl.

   mkdir tcc
   wget http://download.savannah.gnu.org/releases/tinycc/tcc-0.
       --output-document -  \
   | tar -xj --strip-components 1 --directory tcc  \
       --exclude tests --exclude examples

   mkdir libtcc
   cp --recursive tcc/* libtcc

   Configure the tcc cross compilers for the current CPU architecture to
   get the i386 version of libtcc1.a:

   (cd libtcc && ./configure --prefix=./output --enable-cross)

   Malloc hooks have been removed in glibc 2.34 and Ubuntu 22.04 ships
   with glibc 2.35. The next two commands are unnecessary on Ubuntu 20.04,
   but harmless.

   (F=libtcc/lib/bcheck.c && cp $F /tmp/oldf  \
    && sed --in-place "s/#define CONFIG_TCC_MALLOC_HOOKS//" $F 
    && sed --in-place "s/#define HAVE_MEMALIGN//" $F  \
    && diff /tmp/oldf $F)

   Then build libtcc on the host and copy to the file system overlay.

   make --directory libtcc
   make --directory libtcc install
   mkdir -p buildroot-v86/board/v86/rootfs_overlay/lib/tcc
   cp libtcc/output/lib/tcc/i386-libtcc1.a  \
       buildroot-v86/board/v86/rootfs_overlay/lib/tcc/libtcc1.a

   The next step is to configure and build the compiler for x86 musl.

   (cd tcc && ./configure  \
       --cpu=x86  \
       --config-musl  \
       --cross-prefix=${PWD}/../buildroot/output/host/bin/i686-
       --elfinterp=/lib/ld-musl-i386.so.1  \
       --crtprefix=/lib  \
       --libdir=/lib  \
       --tccdir=/lib/tcc  \
       --bindir=/bin  \
       --includedir=/include  \
       --sysincludepaths=/lib/tcc/include:/include  \
       --sharedir=-unused  \
       `# We need debug symbols for later, but uncomment this i
       `# The difference is ~70% file size reduction.`  \
       `# --strip-binaries`)
   make --directory tcc  \
       --assume-old libtcc1.a  \
       --assume-old tcc-doc.html  \
       --assume-old tcc-doc.info
   DESTDIR=$PWD/tcc/output make --directory tcc install
   cp --recursive tcc/output/* buildroot-v86/board/v86/rootfs_o

   --assume-old makes make skip libtcc1.a. It also skips the steps that
   require makeinfo, since the documentation would only end up in the
   directory "output-unused" anyway, as specified in a slightly hacky way
   with --sharedir=-unused. DESTDIR is set when installing because
   configuring with --prefix=./output would compile tcc with search paths
   beginning with that prefix.

   --elfinterp points to the dynamic linker in the image, which locates
   the shared libraries an application needs, prepares the application to
   run and then executes it. Because we use musl, this file is called
   ld-musl-i386.so.1, but on your glibc-based distro it's (likely)
   ld-linux-x86-64.so.2. Without it, the system won't know how to start
   applications and you'll get `/bin/sh: {your command}: not found`.
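
   You can check which dynamic linker a binary asks for with readelf from
   binutils. The helper names below (parse_interp, elf_interp) are
   hypothetical; the sketch only assumes readelf is installed and that it
   prints its usual "Requesting program interpreter" line.

```shell
# Hypothetical helper: pull the interpreter path out of readelf's
# program header listing read from stdin.
parse_interp() {
    sed -n 's/.*Requesting program interpreter: \(.*\)\]$/\1/p'
}

# Hypothetical helper: print the interpreter recorded in an ELF file.
# Prints nothing for static binaries or when readelf is missing.
elf_interp() {
    readelf --program-headers "$1" 2>/dev/null | parse_interp
}

# On a glibc host this usually prints a glibc loader path.
elf_interp /bin/sh
```

   Run against an executable produced by the tcc we just configured (copy
   it out of the image or test in qemu), the same helper should report
   /lib/ld-musl-i386.so.1, since that is what --elfinterp was set to.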

   For tcc to create executables, it needs startup routines that are
   linked into the executable. Those files start with crt, short for C
   runtime, and we have configured tcc to search for them in /lib. Since
   tcc supports running C without creating an executable via `tcc -run
   file.c`, you only need these files if you want to build executables
   (and if you plan on continuing this tutorial). Here's a quick summary
   of crt files from [12]https://dev.gentoo.org/~vapier/crt.txt:

   crt1.o
          Contains the _start symbol which sets up the env with
          argc/argv/libc _init/libc _fini before jumping to the libc main.

   crti.o
          Defines the function prolog; _init in the .init section and
          _fini in the .fini section.

   crtn.o
          Defines the function epilog.


   cp buildroot/output/host/i686-buildroot-linux-musl/sysroot/l
       buildroot-v86/board/v86/rootfs_overlay/lib

   That is what's needed for running tcc in v86, but it doesn't do much
   without musl's standard C headers. We pick only the bare minimum,
   because all headers are ~5 MB uncompressed.

   printf "buildroot/output/host/i686-buildroot-linux-musl/sysr
       bits alloca.h assert.h complex.h ctype.h errno.h fenv.h 
       inttypes.h iso646.h limits.h locale.h math.h memory.h ma
       signal.h stdalign.h stdarg.h stdbool.h stddef.h stdint.h
       stdnoreturn.h string.h strings.h tgmath.h threads.h time
       wchar.h wctype.h  \
   | xargs -0 cp --recursive --target buildroot-v86/board/v86/r
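
   The printf/xargs combination may look cryptic, so here is the same
   pattern on a toy scale. The directories and header names are throwaway
   stand-ins created with mktemp, not the real sysroot; the sketch assumes
   GNU cp and xargs, which the command above relies on as well.

```shell
# Throwaway stand-ins for the sysroot include directory and the overlay.
src=$(mktemp -d)
dst=$(mktemp -d)
mkdir "$src/bits"
touch "$src/bits/alltypes.h" "$src/stdio.h" "$src/string.h" "$src/unused.h"

# printf repeats its format for every argument, so each name becomes one
# NUL-terminated path. xargs -0 splits on NUL and passes all paths to a
# single cp call; --recursive is needed because "bits" is a directory.
printf "$src/%s\0" bits stdio.h string.h \
    | xargs -0 cp --recursive --target-directory "$dst"

# Only the selected entries were copied; unused.h stays behind.
ls "$dst"
```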

  Hello world

   With tcc compiled and installed into our image, it's time to prepare
   some code to test if the compiler works.

   mkdir buildroot-v86/board/v86/rootfs_overlay/opt
   cat >buildroot-v86/board/v86/rootfs_overlay/opt/test.c <<EOF
   #include <stdio.h>
   #include <string.h>

   int main(int argc, char **argv)
   {
       char *name = "stranger";
       if (argc > 1 && strlen(argv[1]) > 0)
           name = argv[1];
       printf("Hello, %s\n", name);
       return 0;
   }
   EOF

   Rebuild image with the new files:

   make --directory buildroot
   cp buildroot/output/images/rootfs.iso9660 web/linux.iso

   If you've closed your server, open a new terminal and run

   python3 -m http.server 8000 --directory web

   Go to [13]http://localhost:8000 and try this in the emulator:

   # Compile and run without producing a binary
   tcc -run /opt/test.c

   # Create binary
   tcc /opt/test.c -o hello
   ./hello world

Benchmarking

   Time for a quick benchmark to see what performance we can expect. We'll
   use the excellent pdf writer library, libharu.

   mkdir libharu
   wget https://github.com/libharu/libharu/archive/refs/tags/RE
       --output-document -  \
   | tar -xz --strip-components 1 --wildcards --directory libha
         "libharu-RELEASE_2_3_0/include/*.h"  \
         "libharu-RELEASE_2_3_0/src/*.c"  \
         libharu-RELEASE_2_3_0/src/t4.h  \
         libharu-RELEASE_2_3_0/demo/line_demo.c

   cat >libharu/include/hpdf_config.h <<EOF
   #define LIBHPDF_HAVE_NOPNGLIB
   #define HPDF_NOPNGLIB
   #define LIBHPDF_HAVE_NOZLIB
   EOF

   Doing `sudo apt install sloccount` and then `sloccount libharu` tells
   us that the library consists of 128394 physical source lines of code.
   That's because of surprisingly big files with arrays containing
   encoding data, but let's see how long it'll take to compile that by
   creating a quick and dirty benchmark that works for both gcc and tcc.

   cat >libharu/benchmark <<EOF
   LIBHARUDIR=\$(dirname \$(readlink -f "\$0"))
   CC=\$1
   [[ \$CC = gcc ]] && LIBMATH=-lm
   time \$CC -I\$LIBHARUDIR/include "\$LIBHARUDIR"/src/*.c  \\
       \$LIBHARUDIR/demo/line_demo.c \$LIBMATH -o /dev/null
   EOF
   chmod +x libharu/benchmark

   # Build a shared library for another benchmark
   buildroot/output/host/bin/i686-buildroot-linux-musl-gcc -sha
       -Ilibharu/include libharu/src/*.c -lm  \
       -o buildroot-v86/board/v86/rootfs_overlay/lib/libharu.so

   # Make it easy to run a benchmark where tcc links with libha
   # compiling from scratch.
   cat >libharu/benchmark-link <<EOF
   time tcc -Ilibharu/include -lharu libharu/demo/line_demo.c -
   EOF
   chmod +x libharu/benchmark-link

   cp --recursive libharu buildroot-v86/board/v86/rootfs_overla

   make --directory buildroot
   cp buildroot/output/images/rootfs.iso9660 web/linux.iso

  Run the benchmarks

   Run this locally

   libharu/benchmark gcc

   Run this in the emulator

   libharu/benchmark tcc  # Patience required

   libharu/benchmark-link

   As the benchmark unsurprisingly shows, linking to a precompiled
   shared library is faster than compiling from scratch. On my machine,
   benchmark-link takes 60 ms in v86. Not bad! Take a look at
   libharu/demo/line_demo.c; it's not the tiniest C file out there.

   I didn't show you how to compile a shared library with tcc on purpose
   (only how to link with one). There's a bug somewhere, and we'll
   investigate that in the next section.

Debugging

   If you've followed the steps so far, you can open your emulator and
   execute

   tcc -shared -fPIC -Ilibharu/include libharu/src/*.c

   This command tells tcc to compile a shared library instead of an
   executable and will take approximately 30 seconds, then it'll exit with
   a segmentation fault.

   I won't tell you how to fix this problem, because I have no need to
   compile shared libraries with tcc on a custom x86 system, nor do I have
   the intellect to fix the bug. But I didn't know (the latter) at the
   time, so I wanted to figure out what was wrong, which required...

  Remote debugging

   The GNU debugger, gdb, supports remote debugging via gdbserver, which
   is a small application you run on the target and connect to from gdb.
   Running gdbserver inside v86, inside a browser, and connecting to that
   from gdb would be cool, but since gdb doesn't work in v86 (you'll find
   out why later), gdbserver is not going to either. So to debug
   something, we need to reproduce the bug in qemu, and use socat to
   create a virtual serial port for gdb/gdbserver communication. And to
   compile gdb we need musl-cross-make via git.

   sudo apt install qemu-system-i386 socat git

   With qemu installed, it's easy to boot your image:

   qemu-system-i386 -serial stdio -cdrom web/linux.iso -cpu Wes

   And you even get a nice serial console for copy pasting! That was the
   good news, now for the bad...

   Buildroot, gdb and musl don't go well together and result in
   configure errors if you select the gdb package. So we have to compile
   gdb on our own, using a different toolchain. This could have been
   avoided with uclibc instead of musl, but in the name of MIT licenses,
   here we are. Hopefully you won't mind another huge compilation step.

   The following will clone musl-cross-make, configure and compile it.

   git clone https://github.com/richfelker/musl-cross-make.git

   cat >musl-cross-make/config.mak <<EOF
   TARGET=i686-linux-musl
   MUSL_VER=git-v1.2.2
   GCC_VER=10.3.0
   # Not needed libs
   COMMON_CONFIG += --disable-nls
   EOF

   make --directory musl-cross-make -j$(nproc)
   make --directory musl-cross-make install

   Now is the time to grab a coffee.

   Welcome back, we're now ready to build gdb/gdbserver with the toolchain
   installed into musl-cross-make/output/bin. Compiling gdb 10.2 is ideal
   here because it doesn't require gmp (GNU Multiple Precision Arithmetic
   Library), which later versions do.

   mkdir gdb
   wget https://ftp.gnu.org/gnu/gdb/gdb-10.2.tar.gz --output-do
   | tar -xz --strip-components 1 --directory gdb
   (cd gdb &&  \
    PATH=$PATH:$PWD/../musl-cross-make/output/bin   \
    ./configure  \
        --prefix=$PWD/output  \
        --host=i686-linux-musl  \
        --disable-nls  \
        --with-curses)
   PATH=$PATH:$PWD/musl-cross-make/output/bin make --directory 
   PATH=$PATH:$PWD/musl-cross-make/output/bin make --directory 

   The new toolchain in `musl-cross-make/output/bin` follows the naming
   convention for cross compilers: every program name starts with
   i686-linux-musl, as specified by TARGET in musl-cross-make/config.mak.
   gdb's configure script follows the same convention, so by specifying
   i686-linux-musl in `--host` and adding the toolchain to PATH, it is
   able to locate the right tools without you having to install them on
   your system. We also pass --disable-nls to leave out localization, and
   --with-curses to build against curses instead of the ancient default
   alternative, which we'd have to compile separately.
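
   To make the naming convention concrete, here is a minimal sketch of
   how a tool name can be resolved from the target prefix. The function
   name cross_tool is my own invention and not part of musl-cross-make
   or gdb; it only assumes that the tools are named <target>-<tool> and
   live in one directory.

```shell
#!/bin/sh
# Hypothetical helper: look up a cross tool as <target>-<tool> in PATH,
# with the toolchain directory appended to PATH for this lookup only.
# usage: cross_tool <toolchain-bin-dir> <target> <tool>
cross_tool() {
    bindir=$1 target=$2 tool=$3
    # The subshell keeps the caller's PATH untouched.
    (PATH=$PATH:$bindir; command -v "$target-$tool")
}

# Example: print the full path of the cross strip, or report it missing.
cross_tool "$PWD/musl-cross-make/output/bin" i686-linux-musl strip ||
    echo "i686-linux-musl-strip not found, has the toolchain been built?"
```

   The same lookup works for gcc, ld, ar and so on, which is why adding
   a single directory to PATH is enough for the whole build.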

   Strip gdbserver of debug symbols and non-essential data, then copy it
   to the target. This reduces the gdbserver file size from 8 MB to
   500 kB. For gdbserver to run, the C++ standard library is required as
   well.

   musl-cross-make/output/bin/i686-linux-musl-strip gdb/output/
   cp gdb/output/bin/gdbserver buildroot-v86/board/v86/rootfs_o
   musl-cross-make/output/bin/i686-linux-musl-strip  \
       musl-cross-make/output/i686-linux-musl/lib/libstdc++.so.
   cp musl-cross-make/output/i686-linux-musl/lib/libstdc++.so.6
       buildroot-v86/board/v86/rootfs_overlay/lib

   These files are about 2500 kB in total, so you'll want to remove them
   again after debugging.
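
   Removing the files again could look something like the sketch below.
   The function name remove_debug_files is hypothetical, and it matches
   the files by name anywhere below the directory you pass in, which is
   an assumption on my part rather than something Buildroot provides.

```shell
#!/bin/sh
# Hypothetical cleanup helper: delete gdbserver and the C++ standard
# library from an overlay directory. Files are matched by name, so the
# exact subdirectories they were copied to do not need to be known.
# usage: remove_debug_files <overlay-dir>
remove_debug_files() {
    overlay=$1
    [ -d "$overlay" ] || { echo "no such directory: $overlay" >&2; return 1; }
    # -print lists each file as it is removed.
    find "$overlay" -type f \( -name gdbserver -o -name 'libstdc++.so*' \) \
        -print -delete
}

# Example, using the overlay directory from this tutorial:
# remove_debug_files buildroot-v86/board/v86/rootfs_overlay
```

   Remember to rebuild the image afterwards, otherwise the files are
   still on the iso.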

   gdb must then be compiled for the host with i686 target support, which
   is easy in Buildroot:

   make --directory buildroot menuconfig

   then select Toolchain -> Build cross gdb for the host and compile

   make --directory buildroot
   cp buildroot/output/images/rootfs.iso9660 web/linux.iso

  Qemu and virtual serial ports

   While compiling, we create a pair of connected pseudo terminals (ptys)
   acting as a virtual serial port. Since the terminals socat creates get
   unpredictable names like /dev/pts/2 and /dev/pts/18, we tell socat to
   create symbolic links to them with names we know in advance.

   Open a new terminal and run the following:

   socat pty,rawer,link=/tmp/vserial-host pty,rawer,link=/tmp/v
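
   Once socat is running, you can check that a link exists and see which
   pty it points to. This is a small sketch; show_vserial is a name I
   made up, and it only relies on readlink.

```shell
#!/bin/sh
# Hypothetical check: report which pty device a socat link points to.
# usage: show_vserial <link>
show_vserial() {
    link=$1
    if [ -L "$link" ]; then
        # readlink prints the target of the symbolic link.
        echo "$link -> $(readlink "$link")"
    else
        echo "$link is missing, is socat still running?" >&2
        return 1
    fi
}

# Example: check the host side of the virtual serial port.
show_vserial /tmp/vserial-host || true
```

   socat removes the links when it exits, so a missing link usually
   means the socat terminal was closed.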

   When compilation is done, start qemu in a new terminal and connect it
   to the virtual serial port on the host:

   qemu-system-i386 -serial stdio -cdrom web/linux.iso -cpu Wes
       -chardev serial,id=gdbserial,path=/tmp/vserial-host  \
       -device isa-serial,chardev=gdbserial

   If you run `dmesg | grep tty` in the serial console, you'll see two
   connected ports: ttyS0, which is connected to your terminal via
   `-serial stdio`, and ttyS1, which is connected to the virtual socat
   serial port.

   Start gdbserver in your qemu serial console for tcc debugging

   gdbserver /dev/ttyS1 tcc -shared -fPIC -Ilibharu/include lib

   then start gdb on the host, pointing to the cross compiled version of
   tcc

   buildroot/output/host/bin/i686-buildroot-linux-musl-gdb  \
       -ix buildroot/output/staging/usr/share/buildroot/gdbinit
       tcc/output/bin/tcc

   -ix is short for -init-command: gdb executes the commands in the file
   buildroot/.../gdbinit before loading the "inferior", which is gdb's
   name for the process being debugged (simply put). `gdbinit` is
   provided by Buildroot and contains the following:
add-auto-load-safe-path {...}/buildroot/output/host/i686-buildroot-linux-musl/sysroot
set sysroot {...}/buildroot/output/host/i686-buildroot-linux-musl/sysroot

   Here, sysroot is the directory that contains copies of the libraries on
   the target, in corresponding subdirectories.
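
   If you ever need the same setup for another sysroot, the two commands
   above are easy to generate yourself. A minimal sketch, where
   write_gdbinit is a hypothetical helper of mine and not something
   Buildroot ships:

```shell
#!/bin/sh
# Hypothetical helper: write a gdbinit containing the same two commands
# as Buildroot's, for a sysroot directory of your choosing.
# usage: write_gdbinit <sysroot-dir> <output-file>
write_gdbinit() {
    sysroot=$1 out=$2
    {
        # Allow gdb to auto-load scripts found below the sysroot.
        echo "add-auto-load-safe-path $sysroot"
        # Look up copies of the target's libraries below this directory.
        echo "set sysroot $sysroot"
    } > "$out"
}

# Example (paths are placeholders):
# write_gdbinit /path/to/sysroot my-gdbinit
```

   The resulting file can be passed to gdb with -ix, just like the one
   from Buildroot.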

   Let's connect to qemu and run tcc:

   (gdb) target remote /tmp/vserial-target
   (gdb) continue

   You'll get a few warnings that I believe are due to shared libraries
   being stripped of debugging symbols by Buildroot. Then the following
   error appears:
0x004f9c1f in fill_local_got_entries (s1=0xb7e99020) at tccelf.c:1362
1362        for_each_elem(s1->got->reloc, 0, rel, ElfW_Rel) {

   Looking into tcc's source, we see that this code is only run when
   compiling shared libraries. Perhaps recompiling for uclibc makes a
   difference, or upgrading to the tcc fork (which requires additional
   work with regard to compilation). Let me know if you fix the error and
   I'll add it to the tutorial.

   We could have added gdb to rootfs_overlay and run that in qemu instead,
   but then we lose code snippets of the error due to missing source
   files. Feel free to use gdb on the target if you're okay with just line
   numbers.

  Debugging in v86

   I've not been able to get gdb working in v86. Everything segfaults
   whenever I attempt to debug. Changing the toolchain to uclibc will make
   Buildroot compile gdb, but it doesn't fix the issue, and downgrading
   gdb from 11.2 to 10.2 or 8 makes no difference. gdb works when running
   in qemu, so it must have something to do with v86. It would have been
   great to have gdb tell us what crashed at runtime, but a C compiler
   will have to do for now.

Licenses

   To get all licenses from Buildroot, you write

   make --directory buildroot legal-info

   They're then found in buildroot/output/legal-info. Getting a complete
   list of licenses for everything used here is left as an exercise for
   the reader.

What's next

   In the next tutorial(s) I'll show you how to:
     * Interact with v86 from JavaScript via serial and 9P.
     * Create a simple interface for dmesg diffing to better optimize the
       image.
     * Compile and run C applications in the browser with a small UI.
     * Build a streaming parser for Linux kernel calls to create a basic
       but highly stylable console with Unicode support, to display stdout
       (printf/puts/putchar/...) and ask for input on stdin
       (scanf/gets/getchar/...).

   If you got this far, perhaps you want to subscribe to new tutorials?
   Then [14]subscribetoj@nsommer.dk and I'll add you to the list. The mail
   can be empty, but if not I promise I'll read it. You can always
   [15]unsubscribetoj@nsommer.dk.

   Tipping: I'm writing tutorials for as long as there's money in the
   bank. Help me write more by tipping via bank transfer (IBAN) to DK81
   2000 6277 7121 54. Any amount is highly appreciated!

References

   1. https://ja.nsommer.dk/
   2. https://tbfleming.github.io/cib
   3. https://github.com/copy/v86/blob/master/Readme.md#readme
   4. https://copy.sh/
   5. https://buildroot.org/
   6. http://www.etalabs.net/compare_libcs.html
   7. https://www.se-radio.net/2020/06/episode-414-jens-gustedt-on-modern-c
   8. http://localhost:8000/
   9. https://bellard.org/tcc/tcc-doc.html#ISOC99-extensions
  10. https://bellard.org/tcc/
  11. https://repo.or.cz/tinycc.git
  12. https://dev.gentoo.org/~vapier/crt.txt
  13. http://localhost:8000/
  14. mailto:subscribetoj@nsommer.dk
  15. mailto:unsubscribetoj@nsommer.dk