Diffstat (limited to 'docs/ja.nsommer.dk_articles_linux-and-tiny-c-compiler-in-the-browser-part-one.txt')
-rw-r--r-- | docs/ja.nsommer.dk_articles_linux-and-tiny-c-compiler-in-the-browser-part-one.txt | 777 |
1 files changed, 777 insertions, 0 deletions
diff --git a/docs/ja.nsommer.dk_articles_linux-and-tiny-c-compiler-in-the-browser-part-one.txt b/docs/ja.nsommer.dk_articles_linux-and-tiny-c-compiler-in-the-browser-part-one.txt new file mode 100644 index 0000000..d7eb9e7 --- /dev/null +++ b/docs/ja.nsommer.dk_articles_linux-and-tiny-c-compiler-in-the-browser-part-one.txt @@ -0,0 +1,777 @@ + [1]< Table of contents + + Linux and Tiny C Compiler in the browser, part one + + 2022-05-22 + +Introduction + + Current C compilers running in the browser are experimental, though + [2]Clang In Browser is pretty impressive. Instead of porting a compiler + to WASM, I'm going to take a different approach and use my favourite + method for a lot of things: virtual machines. It's slower, especially + since I'm using a JavaScript cpu emulator, but decent performance is + possible with a fast compiler like Tiny C Compiler and a custom Linux. + Demo + + Try cat /opt/test.c and tcc -run /opt/test.c + +Motivation + + I could sit for hours back in the days and tweak the Linux kernel on my + Pentium-something, in an attempt to make the system boot faster. Most + of the time I just broke things and had to recompile Gentoo. But + there's rarely a need today to compile Linux; if you need something + barebone, you probably use Docker with Alpine Linux. Compiling Linux is + still useful in the embedded space, and with a c compiler in the mix + you get to learn the basics of how programs work. + + In the mean time, unikernels such as MirageOS and Unikraft have + surfaced as a supplement or even alternative to Docker. One of the ways + they differ is that your code is compiled into an operating system + instead of running on top of Linux. Imagine you could compile Linux + into your code, having dead code elimination on every feature you don't + use! The sales pitch is this: reduced attack surface, fast boot times + and better performance. 
Building a custom Linux then becomes even more + exciting because unikernels borrow many concepts from Linux, e.g. + Unikraft is configured in the same tui as the Linux kernel (and + Buildroot), make and gcc are used to a great extent, and you can choose + between multiple libc implementations, but what exactly is that? + + So... + +What to expect + + This tutorial teaches you how to compile a small Linux image for + running in the browser via v86, a 32-bit x86 cpu emulator in + JavaScript. You get insights into cross compilation with a modern + implementation of the c standard library, and c internals when we add a + fast compiler to the image. Remote debugging via gdb is described at + the end using gdbserver, virtual serial ports and qemu. + +Prerequisites + + Linux that is not WSL, at least an hour to spare for compilation and + the following packages needed by Buildroot: + + sudo apt install make gcc g++ libncurses-dev libssl-dev_____ + + I've built this on Ubuntu 20.04 and 22.04 using bash, but most modern + distros should be fine. + + Before you start, create a directory for the project, perhaps + ~/my-v86-linux, then cd into that and run all commands from there. All + commands will assume you are in that directory. Name it whatever you + like, and it doesn't have to be in ~/. + +The v86 CPU emulator + + v86 runs in the browser and emulates an x86-compatible cpu and hardware + where machine code is translated to WebAssembly modules at runtime. The + list of emulated hardware is impressive: + * x86 instruction set similar to Pentium III + * Keyboard and mouse support + * VGA card + * IDE disk controller + * Network card + * virtio filesystem + * Sound card + + View the full list of emulated hardware in [3]v86's readme. + + You're not limited to Linux on this emulator. It runs Windows (1.01, + 3.1, 95, 98, 2000), ReactOS, FreeBSD, OpenBSD and various hobby + operating systems.
+ + v86 is a hobby project written by an anonymous developer under the + pseudonym "copy". Previous work according to [4]copy's webpage includes + an impossible game, Game of Life and a brainfuck interpreter written in + JavaScript. + +Buildroot + + Buildroot is a tool to generate embedded Linux systems through cross + compilation. It's a huge effort of cross compilation scripts and + configuration files put together in a nice terminal ui, and you can + tweak just about anything. It also acts as a customizable toolchain that + provides us with all the necessary tools to cross compile applications + that don't come in Buildroot packages. Read more at + [5]https://buildroot.org. + + Let's get started. + + cd into your project directory, then download and extract Buildroot: + + Hint: Tab through commands and copy instead of using your mouse. + + mkdir buildroot_____________________________________________ + wget https://github.com/buildroot/buildroot/archive/refs/tag + --output-document - \__________________________________ + | tar -xz --strip-components 1 --directory buildroot________ + + Instead of building Linux from the default Buildroot configuration, we + use a template that sets the right cpu and architecture among other + things: + + wget https://github.com/humphd/browser-vm/archive/refs/tags/ + --output-document - \__________________________________ + | tar -xz --strip-components 1 browser-vm-1.0.2/buildroot-v8 + + Remove commands that compress licenses. We'll get to that later. + + echo "" > buildroot-v86/board/v86/post-image.sh_____________ + + Tell Buildroot to create a new .config file with preloaded settings + from the template + + make --directory buildroot BR2_EXTERNAL=../buildroot-v86 v86 + + You're almost ready to build the initial image. Execute: + + make --directory buildroot menuconfig_______________________ + + Go to Toolchain -> C library and pick musl, exit and save.
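If you want to double check that the choice was actually saved before kicking off a long build, a small sketch like this can look it up in the generated .config. It assumes Buildroot records the selection as BR2_TOOLCHAIN_BUILDROOT_MUSL=y, which is the option name in the Buildroot releases I'm aware of; uses_musl is a hypothetical helper name, not something Buildroot provides.

```shell
#!/bin/sh
# Succeeds if the given Buildroot .config selects musl as the C library.
# Assumption: the option is called BR2_TOOLCHAIN_BUILDROOT_MUSL in your
# Buildroot version. uses_musl is a made-up helper for this tutorial.
uses_musl() {
    grep -q '^BR2_TOOLCHAIN_BUILDROOT_MUSL=y$' "$1"
}

# Only report something when the config file exists, so the snippet is
# harmless to run from any directory.
config=buildroot/.config
if [ -f "$config" ]; then
    if uses_musl "$config"; then
        echo "musl is selected"
    else
        echo "musl is NOT selected, run menuconfig again"
    fi
fi
```

Run it from the project directory. If it reports that musl is not selected, go back into menuconfig before building.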
Then build + everything + + make --directory buildroot__________________________________ + + This is going to take a while, but the good thing is that caching is + enabled, so next time will be substantially faster. + + About musl... It's an implementation of the c standard library, like + uclibc and glibc. Your distro is probably using glibc, the GNU C + Library, which is big in size and not well suited for embedded Linux + where size matters. uclibc is better suited here, and so is musl which + seems to be the clear winner in [6]this (biased) comparison. I prefer + musl's MIT license over (L)GPL, which makes it interesting for + proprietary applications running in unikernels. It's developed and + maintained by Rich Felker, with a long list of contributors, and the + source code is said to be a reference code to look into for systems + programming in [7]this podcast (at 01:01:17). + +Preparing the website + + While waiting for Buildroot to compile, let's create the website that + will host the emulator and run Buildroot Linux: + + mkdir web___________________________________________________ + wget https://github.com/copy/v86/releases/download/latest/li + --directory-prefix web__________________________________ + wget https://github.com/copy/v86/releases/download/latest/v8 + --directory-prefix web__________________________________ + wget https://github.com/copy/v86/releases/download/latest/v8 + --directory-prefix web__________________________________ + wget https://github.com/copy/v86/archive/refs/tags/latest.ta + --output-document - \__________________________________ + | tar -xz --strip-components 2 --directory web \___________ + v86-latest/bios/seabios.bin \__________________________ + v86-latest/bios/vgabios.bin_____________________________ + ____________________________________________________________ + cat >web/index.html <<EOF___________________________________ + ____________________________________________________________ + <meta 
charset="utf8">_______________________________________ + <title>Emulator</title>_____________________________________ + <body bgcolor="#101010">____________________________________ + ____________________________________________________________ + <div id="screen_container">_________________________________ + <div style="white-space: pre; font: 14px monospace; line + <canvas hidden></canvas>________________________________ + </div>______________________________________________________ + ____________________________________________________________ + <script src="/libv86.js"></script>__________________________ + <script>____________________________________________________ + var emulator = new V86Starter({_____________________________ + wasm_path : "/v86.wasm",_________________________ + memory_size : 64 * 1024 * 1024, // 64 MB memory ou + vga_memory_size : 2 * 1024 * 1024,_____________________ + screen_container : screen_container,____________________ + bios : {url: "/seabios.bin"},_______________ + vga_bios : {url: "/vgabios.bin"},_______________ + cdrom : {url: "/linux.iso"},_________________ + filesystem : {},__________________________________ + autostart : true_________________________________ + })__________________________________________________________ + </script>___________________________________________________ + EOF_________________________________________________________ + + When Buildroot is done compiling, run + + cp buildroot/output/images/rootfs.iso9660 web/linux.iso_____ + + Then open a new terminal and start a simple webserver pointing to the + web directory, eg. + + python3 -m http.server 8000 --directory web_________________ + + and open [8]http://localhost:8000 to see v86 in action. Log in as root, + no password needed. + +Customize your image + + Buildroot is all about customization. 
Try the following commands: + + make --directory buildroot menuconfig_______________________ + + make --directory buildroot busybox-menuconfig_______________ + + make --directory buildroot linux-menuconfig_________________ + + There's a lot to explore. + + menuconfig + + menuconfig is where you configure Buildroot with things such as Linux + kernel version, what bootloader to use (grub2, syslinux etc.), the libc + implementation you want to use (as when you chose musl) and which architecture + to compile for. There are multiple packages to choose from, ranging + from small libraries and utilities to X11 and Qt. + + busybox-menuconfig + + Busybox combines hundreds of Linux utilities into one binary and is + also highly configurable with busybox-menuconfig. It provides you with + ls, grep, diff and many other utilities you're used to on Linux, and + I'd encourage you to remove all the tools you don't use to create a + smaller image. Ideally Busybox would come with the bare minimum instead + of having to manually remove unnecessary things. This is where + unikernels shine, because they take the opposite approach, where you + start with almost nothing and add what you need. + + linux-menuconfig + + linux-menuconfig is where you configure the Linux kernel. There are a + million things to configure, and you can easily break something unless + you know what you're doing. In one of the following tutorials in this + series, I'll show you how to tweak the kernel by trial and error, since + that's how I do it: Remove one feature, test the system, rinse and + repeat. + + Resist the temptation to make changes for now. + + rootfs_overlay + + Located in buildroot-v86/board/v86/rootfs_overlay, this is where you + place files that you want to add to the image. Our template includes + two files: etc/fstab and etc/inittab. + + Disable kernel messages after login + + Some things are not critical for booting the system, but are run as part + of the boot process anyway.
They can be slow to start and clutter the + terminal after login, potentially adding log messages in the middle of + writing a command. To disable kernel messages after login, create the + following file + + mkdir buildroot-v86/board/v86/rootfs_overlay/etc/profile.d__ + echo "echo 0 >/proc/sys/kernel/printk" \___________________ + >buildroot-v86/board/v86/rootfs_overlay/etc/profile.d/no + + All .sh files in etc/profile.d are run on login. + + Auto login + + etc/inittab prepares the file system and mounts etc/fstab, runs init + scripts and "spawn" applications after boot. One of the commands for + spawning ends with the comment "# GENERIC_SERIAL" and that line needs + to be changed to not prompt for login and just start /bin/sh. + + (F=buildroot-v86/board/v86/rootfs_overlay/etc/inittab && cp + && sed --in-place "28d" $F \______________________________ + && sed --in-place "s/.*# GENERIC_SERIAL/console::respawn:-\ + && diff /tmp/oldf $F)______________________________________ + + Notice that the command starts with console::respawn. Respawn means + that if sh crashes, Busybox will keep restarting it until it succeeds. + + getty is replaced here because it's the application that prompts for + login. It also prevents us from sending messages between tty's, which + only makes sense in a multi user system: If user A is logged into tty1 + and user B is in tty2, then A shouldn't be able to bother B with `echo + "Hi B!" >/dev/tty2`. Instead we spawn -/bin/sh, where the hyphen + instructs Busybox to treat the shell as a login shell. Without it, + /etc/profile and scripts in /etc/profile.d are ignored. + + To add the new files to your image, you simply compile again + + make --directory buildroot__________________________________ + cp buildroot/output/images/rootfs.iso9660 web/linux.iso_____ + +Add Tiny C Compiler + + Tiny C Compiler, or tcc, is: + * ANSI C compliant, with most [9]C99 extensions. + * Small, roughly ~300 KB. 
+ * Fast according to [10]the homepage, specifically 9 times faster + than gcc. + + I've used tcc to compile win32 applications with opengl and gdi+, and a + pdf library that we'll use later to benchmark performance. There are + limitations to what can be compiled; I haven't managed to compile + libpng for instance, but you can use gcc to provide a shared library + that tcc can link with. + + The compiler is written by Fabrice Bellard, author of qemu, ffmpeg, + quickjs, jslinux and the list goes on. You've likely used his software + in one way or another. I will use the last version he released before + abandoning tcc, but it's alive and well in [11]this fork. + + To get tcc working we have to compile it twice: The first time is to + compile libtcc1.a. The way this happens according to the Makefile is + that gcc is used to compile tcc, and then tcc builds and outputs + libtcc1.a. If we start by compiling with musl, it's not going to run on + the host, and thus libtcc1.a cannot be built. So the first step is to + configure the build with --enable-cross, which builds a cross compiler + that compiles the right libtcc1.a. After that, we can compile for a + single architecture and libc: x86 musl. + + mkdir tcc___________________________________________________ + wget http://download.savannah.gnu.org/releases/tinycc/tcc-0. + --output-document - \__________________________________ + | tar -xj --strip-components 1 --directory tcc \___________ + --exclude tests --exclude examples______________________ + ____________________________________________________________ + mkdir libtcc________________________________________________ + cp --recursive tcc/* libtcc_________________________________ + + Configure tcc cross compilers for the current cpu architecture to get + the i386 version of libtcc1.a + + (cd libtcc && ./configure --prefix=./output --enable-cross)_ + + Malloc hooks were removed in glibc 2.34, and Ubuntu 22.04 ships + with glibc 2.35.
The next two commands are unnecessary on Ubuntu 20.04, + but harmless. + + (F=libtcc/lib/bcheck.c && cp $F /tmp/oldf \________________ + && sed --in-place "s/#define CONFIG_TCC_MALLOC_HOOKS//" $F + && sed --in-place "s/#define HAVE_MEMALIGN//" $F \________ + && diff /tmp/oldf $F)______________________________________ + + Then build libtcc on the host and copy to the file system overlay. + + make --directory libtcc_____________________________________ + make --directory libtcc install_____________________________ + mkdir -p buildroot-v86/board/v86/rootfs_overlay/lib/tcc_____ + cp libtcc/output/lib/tcc/i386-libtcc1.a \__________________ + buildroot-v86/board/v86/rootfs_overlay/lib/tcc/libtcc1.a + + Next step is to configure and build the compiler for x86 musl. + + (cd tcc && ./configure \___________________________________ + --cpu=x86 \____________________________________________ + --config-musl \________________________________________ + --cross-prefix=${PWD}/../buildroot/output/host/bin/i686- + --elfinterp=/lib/ld-musl-i386.so.1 \___________________ + --crtprefix=/lib \_____________________________________ + --libdir=/lib \________________________________________ + --tccdir=/lib/tcc \____________________________________ + --bindir=/bin \________________________________________ + --includedir=/include \________________________________ + --sysincludepaths=/lib/tcc/include:/include \__________ + --sharedir=-unused \___________________________________ + `# We need debug symbols for later, but uncomment this i + `# The difference is ~70% file size reduction.` \______ + `# --strip-binaries`)___________________________________ + make --directory tcc \_____________________________________ + --assume-old libtcc1.a \_______________________________ + --assume-old tcc-doc.html \____________________________ + --assume-old tcc-doc.info_______________________________ + DESTDIR=$PWD/tcc/output make --directory tcc install________ + cp --recursive tcc/output/* 
buildroot-v86/board/v86/rootfs_o + + --assume-old makes make skip libtcc1.a. It also skips the steps that require + makeinfo, since documentation would end up in the directory + "output-unused", as specified in a somewhat hacky way with --sharedir=-unused. + DESTDIR is set when installing because configuring with + --prefix=./output would compile tcc with search paths beginning with that + prefix. + + --elfinterp points to the dynamic linker in the image, responsible for + locating shared libraries needed by an application, preparing it to run + and then executing it. Because we use musl, this file is called + ld-musl-i386.so.1, but on your glibc-based distro it's (likely) + ld-linux-x86-64.so.2. Without it, the system won't know how to start + applications and you'll get `/bin/sh: {your command}: not found` + + For tcc to create executables, it needs startup routines that are + linked into the executable. Those files start with crt, short for c + runtime, and we have configured tcc to search for them in /lib. Since + tcc supports running c without creating an executable via `tcc -run + file.c`, you only need these files if you want to build executables + (and if you plan on continuing this tutorial). Here's a quick summary + of crt files from [12]https://dev.gentoo.org/~vapier/crt.txt: + + crt1.o + Contains the _start symbol which sets up the env with + argc/argv/libc _init/libc _fini before jumping to the libc main. + + crti.o + Defines the function prolog; _init in the .init section and + _fini in the .fini section. + + crtn.o + Defines the function epilog. + + + cp buildroot/output/host/i686-buildroot-linux-musl/sysroot/l + buildroot-v86/board/v86/rootfs_overlay/lib______________ + + That is what's needed for running tcc in v86, but it doesn't do much + without musl's standard c headers. We pick only the bare minimum, + because all headers are ~5 MB uncompressed.
+ + printf "buildroot/output/host/i686-buildroot-linux-musl/sysr + bits alloca.h assert.h complex.h ctype.h errno.h fenv.h + inttypes.h iso646.h limits.h locale.h math.h memory.h ma + signal.h stdalign.h stdarg.h stdbool.h stddef.h stdint.h + stdnoreturn.h string.h strings.h tgmath.h threads.h time + wchar.h wctype.h \_____________________________________ + | xargs -0 cp --recursive --target buildroot-v86/board/v86/r + + Hello world + + With tcc compiled and installed into our image, it's time to prepare + some code to test if the compiler works. + + mkdir buildroot-v86/board/v86/rootfs_overlay/opt____________ + cat >buildroot-v86/board/v86/rootfs_overlay/opt/test.c <<EOF + #include <stdio.h>__________________________________________ + #include <string.h>_________________________________________ + ____________________________________________________________ + int main(int argc, char **argv)_____________________________ + {___________________________________________________________ + char *name = "stranger";________________________________ + if (argc > 1 && strlen(argv[1]) > 0)____________________ + name = argv[1];_____________________________________ + printf("Hello, %s\n", name);____________________________ + return 0;_______________________________________________ + }___________________________________________________________ + EOF_________________________________________________________ + + Rebuild image with the new files: + + make --directory buildroot__________________________________ + cp buildroot/output/images/rootfs.iso9660 web/linux.iso_____ + + If you've closed your server, open a new terminal and run + + python3 -m http.server 8000 --directory web_________________ + + Go to [13]http://localhost:8000 and try this in the emulator: + + # Compile and run without producing a binary________________ + tcc -run /opt/test.c________________________________________ + ____________________________________________________________ + # Create 
binary_____________________________________________ + tcc /opt/test.c -o hello____________________________________ + ./hello world_______________________________________________ + +Benchmarking + + Time for a quick benchmark to see what performance we can expect. We'll + use the excellent pdf writer library, libharu. + + mkdir libharu_______________________________________________ + wget https://github.com/libharu/libharu/archive/refs/tags/RE + --output-document - \__________________________________ + | tar -xz --strip-components 1 --wildcards --directory libha + "libharu-RELEASE_2_3_0/include/*.h" \________________ + "libharu-RELEASE_2_3_0/src/*.c" \____________________ + libharu-RELEASE_2_3_0/src/t4.h \_____________________ + libharu-RELEASE_2_3_0/demo/line_demo.c________________ + ____________________________________________________________ + cat >libharu/include/hpdf_config.h <<EOF____________________ + #define LIBHPDF_HAVE_NOPNGLIB_______________________________ + #define HPDF_NOPNGLIB_______________________________________ + #define LIBHPDF_HAVE_NOZLIB_________________________________ + EOF_________________________________________________________ + + Doing `sudo apt install sloccount` and then `sloccount libharu` tells + us that the library consists of 128394 physical source lines of code. + That's because of surprisingly big files with arrays containing + encoding data, but let's see how long it'll take to compile that by + creating a quick and dirty benchmark that works for both gcc and tcc. 
+ + cat >libharu/benchmark <<EOF________________________________ + LIBHARUDIR=\$(dirname \$(readlink -f "\$0"))________________ + CC=\$1______________________________________________________ + [[ \$CC = gcc ]] && LIBMATH=-lm_____________________________ + time \$CC -I\$LIBHARUDIR/include "\$LIBHARUDIR/src/*.c" \\_ + \$LIBHARUDIR/demo/line_demo.c \$LIBMATH -o /dev/null____ + EOF_________________________________________________________ + chmod +x libharu/benchmark__________________________________ + ____________________________________________________________ + # Build a shared library for another benchmark______________ + buildroot/output/host/bin/i686-buildroot-linux-musl-gcc -sha + -Ilibharu/include libharu/src/*.c -lm \________________ + -o buildroot-v86/board/v86/rootfs_overlay/lib/libharu.so + ____________________________________________________________ + # Make it easy to run a benchmark where tcc links with libha + # compiling from scratch.___________________________________ + cat >libharu/benchmark-link <<EOF___________________________ + time tcc -Ilibharu/include -lharu libharu/demo/line_demo.c - + EOF_________________________________________________________ + chmod +x libharu/benchmark-link_____________________________ + ____________________________________________________________ + cp --recursive libharu buildroot-v86/board/v86/rootfs_overla + ____________________________________________________________ + make --directory buildroot__________________________________ + cp buildroot/output/images/rootfs.iso9660 web/linux.iso_____ + + Run the benchmarks + + Run this locally + + libharu/benchmark gcc_______________________________________ + + Run this in the emulator + + libharu/benchmark tcc # Patience required__________________ + + libharu/benchmark-link______________________________________ + + As the benchmark unsurprisingly shows us, linking to a precompiled + shared library is faster than compiling from scratch. 
On my machine, + benchmark-link is 60 ms in v86. Not bad! Take a look at + libharu/demo/line_demo.c; it's not the tiniest c file out there. + + I purposely didn't show you how to compile a shared library with tcc + (only how to link with one). There's a bug somewhere, and we'll + investigate that in the next section. + +Debugging + + If you've followed the steps so far, you can open your emulator and + execute + + tcc -shared -fPIC -Ilibharu/include libharu/src/*.c_________ + + This command tells tcc to compile a shared library instead of an + executable. It runs for approximately 30 seconds and then exits with + a segmentation fault. + + I won't tell you how to fix this problem, because I have no need to + compile shared libraries with tcc on a custom x86 system, nor do I have + the intellect to fix the bug. But I didn't know (the latter) at the + time, so I wanted to figure out what was wrong, which required... + + Remote debugging + + The GNU debugger, gdb, supports remote debugging via gdbserver, which + is a small application you run on the target and connect to from gdb. + Running gdbserver inside v86, inside a browser, and connecting to that + from gdb would be cool, but since gdb doesn't work in v86 (you'll find + out why later), gdbserver is not going to either. So to debug + something, we need to reproduce the bug in qemu, and use socat to + create a virtual serial port for gdb/gdbserver communication. And to + compile gdb we need musl-cross-make via git. + + sudo apt install qemu-system-i386 socat git_________________ + + With qemu installed, it's easy to boot your image + + qemu-system-i386 -serial stdio -cdrom web/linux.iso -cpu Wes + + And you even get a nice serial console for copy pasting! That was the + good news, now for the bad... + + Buildroot, gdb and musl don't go well together and result in + configure errors if you select the gdb package. So we have to compile + gdb on our own, using a different toolchain.
This could have been + avoided with uclibc instead of musl, but in the name of MIT licenses, + here we are. Hopefully you won't mind another huge compilation step. + + The following clones musl-cross-make, then configures and compiles it. + + git clone https://github.com/richfelker/musl-cross-make.git_ + ____________________________________________________________ + cat >musl-cross-make/config.mak <<EOF_______________________ + TARGET=i686-linux-musl______________________________________ + MUSL_VER=git-v1.2.2_________________________________________ + GCC_VER=10.3.0______________________________________________ + # Not needed libs___________________________________________ + COMMON_CONFIG += --disable-nls______________________________ + EOF_________________________________________________________ + ____________________________________________________________ + make --directory musl-cross-make -j$(nproc)_________________ + make --directory musl-cross-make install____________________ + + Now is the time to grab a coffee. + + Welcome back. We're now ready to build gdb/gdbserver with the toolchain + installed into musl-cross-make/output/bin. Compiling gdb 10.2 is ideal + here because it doesn't require gmp (GNU Multiple Precision Arithmetic + Library), which later versions do.
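Before starting yet another long build, it can be worth checking that the toolchain really landed in musl-cross-make/output/bin. The sketch below is one way to do that. missing_tools is a hypothetical helper, and the list of tools (gcc, g++, ar, strip) is my assumption about what a gdb build is likely to need, not something taken from gdb's documentation. The i686-linux-musl prefix is the TARGET from config.mak.

```shell
#!/bin/sh
# Print the name of every tool that is missing (or not executable) in a
# cross toolchain directory. Prints nothing when all tools are present.
# Usage: missing_tools DIR PREFIX TOOL...
# missing_tools is a made-up helper for this tutorial.
missing_tools() {
    mt_dir=$1
    mt_prefix=$2
    shift 2
    for mt_tool in "$@"; do
        [ -x "$mt_dir/$mt_prefix-$mt_tool" ] || echo "$mt_prefix-$mt_tool"
    done
}

# Assumed tool list; adjust it to whatever your build complains about.
missing_tools musl-cross-make/output/bin i686-linux-musl gcc g++ ar strip
```

No output means everything in the list was found. Any name that is printed points to a musl-cross-make build or install step that didn't finish.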
+ + mkdir gdb___________________________________________________ + wget https://ftp.gnu.org/gnu/gdb/gdb-10.2.tar.gz --output-do + | tar -xz --strip-components 1 --directory gdb______________ + (cd gdb && \_______________________________________________ + PATH=$PATH:$PWD/../musl-cross-make/output/bin \__________ + ./configure \_____________________________________________ + --prefix=$PWD/output \________________________________ + --host=i686-linux-musl \______________________________ + --disable-nls \_______________________________________ + --with-curses)_________________________________________ + PATH=$PATH:$PWD/musl-cross-make/output/bin make --directory + PATH=$PATH:$PWD/musl-cross-make/output/bin make --directory + + The new toolchain in `musl-cross-make/output/bin` follows a naming + convention for cross compilers, so every program starts with + i686-linux-musl as specified in musl-cross-make/config.mak by TARGET. + gdb follows the same convention, and by specifying i686-linux-musl in + `--host` and adding the toolchain to PATH, gdb is able to locate the + right tools without having to install them on your system. We also + --disable-nls (localization) and compile --with-curses instead of a + default ancient alternative that we'd have to compile separately. + + Clean gdbserver by strip'ing it of debug symbols and non-essential + data, and copy to the target. This reduces gdbserver file size from 8 + mb to 500 kb. For gdbserver to run, the c++ standard library is + required as well. + + musl-cross-make/output/bin/i686-linux-musl-strip gdb/output/ + cp gdb/output/bin/gdbserver buildroot-v86/board/v86/rootfs_o + musl-cross-make/output/bin/i686-linux-musl-strip \_________ + musl-cross-make/output/i686-linux-musl/lib/libstdc++.so. + cp musl-cross-make/output/i686-linux-musl/lib/libstdc++.so.6 + buildroot-v86/board/v86/rootfs_overlay/lib______________ + + These files are ~2500 kb in total, so you want to remove them again + after debugging. 
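To see how much the debugging files add to the image, and to confirm the space is back once you've removed them, you can sum up their sizes. A minimal sketch: total_kb is a hypothetical helper, and the paths in the usage comment are placeholders for wherever you copied gdbserver and libstdc++ inside rootfs_overlay.

```shell
#!/bin/sh
# Print the combined size of the given files in whole kb (1 kb = 1024 bytes).
# Returns non-zero if one of the files cannot be read.
# total_kb is a made-up helper for this tutorial.
total_kb() {
    tk_total=0
    for tk_file in "$@"; do
        tk_size=$(wc -c < "$tk_file") || return 1
        tk_total=$((tk_total + tk_size))
    done
    echo $((tk_total / 1024))
}

# Usage (placeholder paths, replace with the files you copied):
#   total_kb path/to/rootfs_overlay/gdbserver path/to/rootfs_overlay/lib/libstdc++.so.6
```

Deleting the same files from rootfs_overlay and rebuilding the image removes them again.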
+ + gdb must then be compiled for the host with i686 target support, which + is easy in Buildroot: + + make --directory buildroot menuconfig_______________________ + + then select Toolchain -> Build cross gdb for the host and compile + + make --directory buildroot__________________________________ + cp buildroot/output/images/rootfs.iso9660 web/linux.iso_____ + + Qemu and virtual serial ports + + While compiling, we create a pseudo terminal (pty) acting as a virtual + serial port. Since socat uses random id's for the terminals like + /dev/pty/2 and /dev/pty/18, we tell socat to create symbolic links for + the random id's with id's we know in advance. + + Open a new terminal and run the following: + + socat pty,rawer,link=/tmp/vserial-host pty,rawer,link=/tmp/v + + When compilation is done, start qemu in a new terminal and connect with + the virtual serial port on the host + + qemu-system-i386 -serial stdio -cdrom web/linux.iso -cpu Wes + -chardev serial,id=gdbserial,path=/tmp/vserial-host \__ + -device isa-serial,chardev=gdbserial____________________ + + if you write `dmesg | grep tty` in the serial console you'll see two + connected ports: ttyS0 which is connected to your terminal via `-serial + stdio` and ttyS1 is connected to the virtual socat serial port. + + Start gdbserver in your qemu serial console for tcc debugging + + gdbserver /dev/ttyS1 tcc -shared -fPIC -Ilibharu/include lib + + then start gdb on the host, pointing to the cross compiled version of + tcc + + buildroot/output/host/bin/i686-buildroot-linux-musl-gdb \__ + -ix buildroot/output/staging/usr/share/buildroot/gdbinit + tcc/output/bin/tcc______________________________________ + + -ix means: Before the "inferior", which is gdb's name for a process + (simply put), execute the file buildroot/.../gdbinit. 
`gdbinit` is + provided by Buildroot and contains the following: +add-auto-load-safe-path {...}/buildroot/output/host/i686-buildroot-linux-musl/sysroot +set sysroot {...}/buildroot/output/host/i686-buildroot-linux-musl/sysroot + + which specifies the directory that contains copies of libraries on the + target, in corresponding subdirectories. + + Let's connect to qemu and run tcc: + + (gdb) + target remote /tmp/vserial-target___________________________ + (gdb) + continue____________________________________________________ + + You'll get a few warnings that I believe are due to shared libraries + being stripped of debugging symbols by Buildroot. Then the following + error appears: +0x004f9c1f in fill_local_got_entries (s1=0xb7e99020) at tccelf.c:1362 +1362 for_each_elem(s1->got->reloc, 0, rel, ElfW_Rel) { + + Looking into tcc's source, we see that this code is only run when + compiling shared libraries. Perhaps recompiling for uclibc makes a + difference, or upgrading to the tcc fork (which requires additional + work with regard to compilation). Let me know if you fix the error and + I'll add it to the tutorial. + + We could have added gdb to rootfs_overlay and run that in qemu instead, + but then we'd lose code snippets of the error due to missing source + files. Feel free to use gdb on the target if you're okay with just line + numbers. + + Debugging in v86 + + I've not been able to get gdb working in v86. Everything segfaults + whenever I attempt to debug. Changing the toolchain to uclibc will make + Buildroot compile gdb, but it doesn't fix the issue, and downgrading + gdb from 11.2 to 10.2 or 8 makes no difference. gdb works when running + in qemu, so it must have something to do with v86. It would have been + great to have gdb tell us what crashed at runtime, but a c compiler will + have to do for now.
+ +Licenses + + To get all licenses from Buildroot, you write + + make --directory buildroot legal-info_______________________ + + They're then found in buildroot/output/legal-info. Getting a complete + list of licenses for everything used here is left as an exercise for + the reader. + +What's next + + In the next tutorial(s) I'll show you how to: + * Interact with v86 from JavaScript via serial and 9P. + * Create a simple interface for dmesg diffing to better optimize the + image. + * Compile and run c applications in the browser with a small ui. + * Build a streaming parser for Linux kernel calls to create a basic + but highly stylable console with unicode support; to display stdout + (printf/puts/putchar/...) and ask for input on stdin + (scanf/gets/getchar/...). + + If you got this far, perhaps you want to subscribe to new tutorials? + Then [14]subscribetoj@nsommer.dk and I'll add you to the list. The mail + can be empty, but if not I promise I'll read it. You can always + [15]unsubscribetoj@nsommer.dk. + + Tipping: I'm writing tutorials for as long as there's money in the + bank. Help me write more by tipping via bank transfer (IBAN) to DK81 + 2000 6277 7121 54. Any amount is highly appreciated! + +References + + 1. https://ja.nsommer.dk/ + 2. https://tbfleming.github.io/cib + 3. https://github.com/copy/v86/blob/master/Readme.md#readme + 4. https://copy.sh/ + 5. https://buildroot.org/ + 6. http://www.etalabs.net/compare_libcs.html + 7. https://www.se-radio.net/2020/06/episode-414-jens-gustedt-on-modern-c + 8. http://localhost:8000/ + 9. https://bellard.org/tcc/tcc-doc.html#ISOC99-extensions + 10. https://bellard.org/tcc/ + 11. https://repo.or.cz/tinycc.git + 12. https://dev.gentoo.org/~vapier/crt.txt + 13. http://localhost:8000/ + 14. mailto:subscribetoj@nsommer.dk + 15. mailto:unsubscribetoj@nsommer.dk |