mission
-------

Minimal C to program the kernel.

design decisions
----------------

Use enum constants rather than preprocessor constants.

Do we allow structs, functions etc.? We could go with global variables
only and basic types. That makes the compiler simpler, but the code may
not be so well-structured.

The C dialect we use is a simplified version, so we rather write
i = i + 1 than i++ or ++i.

Constructs we need
------------------

From CPP
--------

In principle we would like to avoid having to implement CPP
functionality, but..

header file includes
--------------------

    #include "filename.h"

We could go without a preprocessor and make it a special command in the
language itself. We don't want and need preprocessor tricks (we think).
On the other hand we want to use only things known to standard compilers,
so that we can bootstrap from a host with a standard compiler.

=> #include "filename.h"

include guards
--------------

    #import "a.h"

is deprecated in gcc; it was a Microsoft extension and Objective-C.
OK, no go then.

    #pragma once

in a.h is very simple to implement and avoids clashes of names of guard
macros, which is very good.

instead of

    #ifndef XXXX
    #define XXXX
    #include "a.h"
    #endif

We could implement a simple preprocessor functionality which only works
with defined symbols (names and no values).

#pragma once is not portable, but maybe quite well supported?

include_next
------------

To extend a standard header. I see this only in a hosted environment,
for instance to shim standard headers if a compiler doesn't provide
some things (see stdlib.h in abaos/libc).

platform switches
-----------------

    #ifdef HOSTED
    #define print(X) puts(X)
    #else
    /* our own print function for our kernel */
    #endif

Alternatives?

Linking to specific implementation files: io-host.c vs io-abaos.c

Pimpls and casts for OS-specific structs are not type-safe. On the other
hand, #ifdef's in structs are also quite dangerous (ABI mismatches).
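
The link-time alternative to the HOSTED switch could look roughly like the
sketch below. It is only an illustration: the file name io.h, the int
return type of print() and the caller greet() are assumptions made for
this example. Three files are shown in one listing. Only the hosted
implementation is spelled out, since the kernel side depends on code not
described here.

```c
/* Sketch only: three source files shown in one listing.
   io.h, the int return type and greet() are assumptions. */

#include <stdio.h>

/* --- io.h: the interface every caller sees, no #ifdef needed --- */
int print(const char *s);

/* --- io-host.c: linked into the hosted build only --- */
int print(const char *s)
{
    /* puts() appends a newline and returns a non-negative
       value on success */
    return puts(s);
}

/* --- io-abaos.c (not shown): would define the same print() on
   top of the kernel's own console code and be linked into the
   kernel build instead of io-host.c --- */

/* --- caller: identical source text on both platforms --- */
int greet(void)
{
    return print("hello from minic");
}
```

The choice between the two implementations moves from the preprocessor
into the list of files handed to the linker, so the compiler itself never
has to evaluate a platform #ifdef.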

debug code
----------

    #ifdef DEBUG
    #endif

constants
---------

For array dimensions mainly.

    #define ACONSTANT 5
    int b[ACONSTANT];

Constants can be done with 'enum { ACONSTANT = 5 };'

No, this is really a problem when parts of standard C require a macro
processor just to define things like NULL:

    #if !defined(NULL)
    #define NULL ((void*)0)
    #endif

runtime
-------

We need a minimalistic runtime (basically the functions needed to write
the self-compiling first-stage compiler).

crt0 type entry points:

    raise
    _start
    _exit

testing
-------

From the simplest test program on, make sure we generate some output and
verify it! Even if this means we must write special bootstrap test code..

Make sure we can output all phases of the compiler: lexing/parsing,
semantics, code generation, etc.

Have a pseudo code generator like "cucu", or maybe better something which
even runs (Jasmin, Java byte-code). cucu has a Python interpreter for the
pseudo assembler.

We should postpone the real target, as it involves linking, ELF, GPT,
Intel assembly code and other things which complicate issues in the
beginning.

Also having a small virtual CPU as in Oberon (where the virtual CPU
actually can get real :-) ) or as in
http://schweigi.github.io/assembler-simulator/instruction-set.html
is an idea. The idea is even usable for a running system like
erlm.github.io/OberonEmulator/

header files
------------

Generate them, later make the compiler self-aware of source files in
Oberon-style.

For bootstrapping via another C compiler we need simple header files
(#pragma once only, #include only in the C file). For including things
which are prerequisites either:
- the program can define a global series of includes
- the library or component (like minilib) defines a facade header file

generating header files: http://www.hwaci.com/sw/mkhdr/

But we would like the minic compiler to do this, not yet another tool in
the chain. makeheaders plays too many tricks with preprocessors.

Another possibility is to have a modified C with other constructs, which
then first gets converted into plain C89 with full preprocessor support.

io.c:

    typedef struct Io {
    } Io;

implicit -> #include "io.h"
the declaration gets moved into io.h

user of a module uses:

    import io;

shadowing
---------

Traditional C has it, but is it a good idea? Why has it been invented?
How much is the complexity of symbol management increased?

TODO: Finding a counter example.

CoffeeScript doesn't have explicit shadowing, so you always have to
choose proper names.

We have to avoid confusion between assignment and declaration:

    x = a;
    int x = a;

One symbol space (Pascal), so you cannot have a type 'X' and a variable
'X' or a function 'X' in the same scope. Also here C deviates and has
separate namespaces.

floating point arithmetic
-------------------------

For kernel programming more a nuisance than helpful, so it's second
priority.

unicode
-------

Traditionally a mess, so maybe having it in userland as user library
only?

compiler and linker for CPU features
------------------------------------

Imagine a linker which can handle f_sse2( ), f_i486( ) as function f( )
at runtime. This would also eliminate the need for #ifdef i386 and stuff
like that.

    SSE2 {
    }

    i486 {
    }

could be like namespaces with special meaning. But this would require
inline assembly, as optimization of C inside the namespace might not be
enough.

Approaches
----------

C4
--

C4 is self-hosting and has the minimum features we need. It lacks some
things:
- create object files for running (not just in-memory execution)
- too many OS dependencies
- functions are part of the parser

This shows that the compiler is indeed self-hosting:

    ./c4 c4.c c4.c hello.c
    hello, world
    exit(0) cycle = 9
    exit(0) cycle = 26015
    exit(0) cycle = 10059669

Minimalistic, usable with modifications for bootstrapping a compiler.

lcc
---

Book. Very good to read. Shows practical issues.
Sadly the coding style is not to our liking.
The distinction between front end and back end is very good.

picoc looks like a bootstrapping interpreter.

qbe
---

Interesting project, with a bootstrapping minic. Sort of an intermediate
LLVM-like language, but much simpler.

links
-----

https://github.com/alexfru/SmallerC
https://github.com/rswier/c4
http://c9x.me/compile/doc/il.html (QBE)

Building
--------

    gcc -I../minilib -g -O0 -m32 -march=i386 -ffreestanding -Werror -Wall -Wno-return-type -pedantic -std=c89 -o minic *.c ../minilib/*.c
    clang -I../minilib -g -O0 -march=i386 -fno-builtin -std=c89 -Werror -Wall -Wno-return-type -o minic *.c ../minilib/*.c
    tcc -I../minilib -g -O0 -march=i386 -fno-builtin -std=c89 -Werror -Wall -Wno-return-type -o minic *.c ../minilib/*.c
    pcc -I../minilib -g -O0 -march=i386 -fno-builtin -std=c89 -Wall -Wno-return-type -o minic *.c ../minilib/*.c