author Andreas Baumann <mail@andreasbaumann.cc> 2021-09-30 11:21:37 +0200
committer Andreas Baumann <mail@andreasbaumann.cc> 2021-09-30 11:21:37 +0200
commit f5c8ee1f889824eaa4bb89a01711a538e295dba0
tree 1782d5aba010d3dda89e37e1b4334fc319649c56
parent b055c157e04ae30f6ffb9eef1fa97d512ce4e65a
added a Makefile for wordgrinder export and markdown to HTML
-rw-r--r-- miniany/Makefile 10
-rw-r--r-- miniany/README 113
-rw-r--r-- miniany/README.html 79
-rw-r--r-- miniany/cc.c 43
4 files changed, 132 insertions, 113 deletions
diff --git a/miniany/Makefile b/miniany/Makefile
new file mode 100644
index 0000000..24c535e
--- /dev/null
+++ b/miniany/Makefile
@@ -0,0 +1,10 @@
+.PHONY: doc
+
+doc: README.html
+
+README.html: cc.md
+	md2html --fpermissive-url-autolinks < cc.md > README.html
+
+cc.md: cc.wg
+	wordgrinder -c cc.wg cc.md
+ \ No newline at end of file
diff --git a/miniany/README b/miniany/README
deleted file mode 100644
index 814f6b3..0000000
--- a/miniany/README
+++ /dev/null
@@ -1,113 +0,0 @@
-
-# CC - a self-hosting, bootstrappable, minimal C compiler
-
-## Introduction
-
-On the never-ending quest of a minimal system I found Swieros and C4 (the C compiler in 4 functions). Inspired and intrigued I started to implement my own.
-
-For abaos (a small operating system of mine, also in C) I cloned the minimal C library, so we can build a freestanding version of C4.
-
-C4 serves as a test whether my own CC is minimal enough and doesn't use silly functions. Additionally C4 as well as CC are compiled both in a (on Linux) hosted version and a freestanding version. We use a series of compilers like gcc, clang, tcc and pcc to make sure that we are not using silly C constructs.
-
-In order to be able to port easily we make almost no use of system calls, the ones we need are:
-
-
-- brk: for malloc/free, change the start address of the heap segment of the process, if the OS only assigns a single static space, then brk results in a NOP.
-- exit: terminate the process; return does not work in all combinations (for instance with pcc on Linux). Can be a NOP: we don't require any trickery such as <i>atexit</i> and we don't use buffering anywhere (so there is, for instance, no need to flush stdout on exit).
-- read/write: read from stdin linearly, write to stdout linearly, this is essentially a model using an input and an output tape. Those two functions must really exist. This basically eliminates the need for a file system which we might not have during early bootstrapping.
-
-Similarly we simplify the C language to not use certain features which can cause trouble when bootstrapping:
-
-
-- variable arguments: though simple in principle (just some pointers into the stack if you use a stack for function parameters), it is not typesafe. And the only example in practice it's really heavily used for is in printf-like functions.
-- preprocessor: it needs a filesystem, we take this outside of the compiler by feeding it an (eventually) concatenated list of \*.c files.
-- two types: int and char, so we can interpret memory as words or as bytes.
-
-## Local version of C4
-
-The local version of C4 has the following adaptations and extensions:
-
-
-- switch statement from the <i>switch-and-structs</i> branch, adapted c4 itself to use switch statements instead of if's (as in the <i>switch-and-structs</i> branch)
-- struct support from <i>switch-and-structs</i>
-- constants like <i>EOF</i>, <i>EXIT\_SUCCESS</i>, <i>NULL</i>
-- standard C block comments alongside C++ end-of-line ones
-- negative enum initializers
-- do/while loops
-- more C functions like <i>isspace</i>, <i>getc</i>, <i>strcmp</i>
-- some simplified functions for printing like <i>putstring</i>, <i>putint</i>, <i>putnl</i>
-- strict C89 conformance, mainly use standard comment blocks, also removed some warnings
-- some casts around malloc and memset to fit to non-void freestanding-libc
-- converted printf to putstring/putint/putnl and some helper functions for error reporting like error()
-- removed all memory leaks
-- de-POSIX-ified, no open/read/close, use getchar from stdin only (don't assume the existence of a file system), this also means we had to create sort of an old style tape-file with FS markers to separate the files piped to c4.
-
-<i>Note: </i>only too late I discovered that there was a C5 version of the same compiler, which would maybe have served better as a basis.
-
-## Examples
-
-### Running on the host system using the hosts C compiler
-
-Compiled in either hosted (host libc) or freestanding (our own libc, currently IA-32 Linux kernel only syscalls):
-
-`./build.sh cc hostcc hosted d
-./build.sh cc hostcc freestanding d
-./cc < test1.c > test1.asm`
-Create a plain binary from the assembly code:
-
-`fasm test1.asm test1.bin`
-Disassemble it to verify its correctness:
-
-`ndisasm -b32 -o1000000h -a test1.bin`
-You can choose <i>gcc</i>, <i>clang</i>, <i>tcc </i>or <i>pcc </i>as host compiler (<i>hostcc</i>).
-
-### Running on the host in the C4 interpreter
-
-Running in C4 interpreter, again, the C4 program can be compiled in hosted or freestanding mode:
-
-`./build.sh c4 hostcc hosted d
-./build.sh c4 hostcc freestanding d`
-Here again you can choose the host compiler for compiling C4.
-
-Then we have to create the standard input for C4 using:
-
-`echo -n -e "\034" > EOF
-cat cc.c EOF hello.c | ./c4
-cat c4.c EOF cc.c EOF hello.c | ./c4
-cat c4.c4 EOF c4.c EOF cc.c EOF hello.c | ./c4`
-EOF contains the traditional FS (file separator) character in the ASCII character set. Every time c4/c4.c is invoked it reads exactly one input file up to the first FS character (or stops at the end of stdin).
-
-We can also use <i>-s</i>, or <i>-d </i>on every level as follows:
-
-`cat cc.c EOF hello.c | ./c4 -d`
-## References
-
-Compiler construction in general:
-
-
-- <i>"Compiler Construction"</i>, Niklaus Wirth
-- https://github.com/DoctorWkt/acwj: a nice series on building a C compiler, step by step with lots of good explanations
-- https://www.engr.mun.ca/~theo/Misc/exp\_parsing.htm\#climbing, https://en.wikipedia.org/wiki/Operator-precedence\_parser\#Precedence\_climbing\_method
-- https://github.com/lotabout/write-a-C-interpreter/blob/master/tutorial/en/, tutorial based on C4 how to build a C interpreter, explains nicely details in C4.
-
-C4:
-
-
-- https://github.com/rswier/c4.git, <i>C4 - C in four functions</i>, Robert Swierczek, minimalistic C compiler running on an emulator on the IR, inspiration for this project
-- https://github.com/rswier/c4/blob/switch-and-structs/c4.c, c4 adaptations to provide switch and structs
-- https://github.com/EarlGray/c4: a X86 JIT version of c4
-- https://github.com/jserv/amacc: based on C4, JIT or native code, for ARM, quite well documented, also very nice list of compiler resources on Github page
-
-Other minimal compilers and systems:
-
-
-- http://selfie.cs.uni-salzburg.at/: C\* self-hosting C compiler (also emulator, hypervisor) for RISCV, inspiration for what makes up a minimal C language
-- http://www.iro.umontreal.ca/~felipe/IFT2030-Automne2002/Complements/tinyc.c, Marc Feeley, really easy and much more readable, meant as educational compiler
-- https://github.com/rswier/swieros.git: c.c in swieros, Robert Swierczek
-
-Assembly:
-
-
-- https://github.com/felipensp/assembly/blob/master/x86/itoa.s, for putint (early debugging keyword)
-- https://baptiste-wicht.com/posts/2011/11/print-strings-integers-intel-assembly.htm (early debugging keyword)
-
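Note on the FS-separated input described in the README: each invocation of c4/c4.c consumes exactly one source file from stdin, stopping at the first FS character (octal 034) or at end of input. A minimal sketch of that read loop, not the code from c4.c itself; the function name `read_one_file`, the `FS` macro and the buffer handling are assumptions made for illustration:

```c
#include <stdio.h>
#include <stddef.h>

#define FS 0x1C /* ASCII file separator, octal 034 (name is an assumption) */

/* Hypothetical helper: read characters from `in` until an FS character or
 * end of input is seen. At most cap-1 characters are stored in `buf`, which
 * is always NUL-terminated; excess characters are consumed but dropped.
 * Returns the number of characters stored. */
size_t read_one_file(FILE *in, char *buf, size_t cap)
{
    size_t n = 0;
    int c;

    if (cap == 0) {
        return 0;
    }
    while ((c = getc(in)) != EOF && c != FS) {
        if (n + 1 < cap) {
            buf[n++] = (char)c;
        }
    }
    buf[n] = '\0';
    return n;
}
```

Calling `read_one_file(stdin, buf, sizeof buf)` repeatedly would hand out `cc.c`, then `hello.c`, from a stream built with `cat cc.c EOF hello.c`, which is how each interpreter level can pass the rest of the tape on to the next one.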
diff --git a/miniany/README.html b/miniany/README.html
new file mode 100644
index 0000000..411c667
--- /dev/null
+++ b/miniany/README.html
@@ -0,0 +1,79 @@
+<h1>CC - a self-hosting, bootstrappable, minimal C compiler</h1>
+<h2>Introduction</h2>
+<p>On the never-ending quest of a minimal system I found Swieros and C4 (the C compiler in 4 functions). Inspired and intrigued I started to implement my own.</p>
+<p>For abaos (a small operating system of mine, also in C) I cloned the minimal C library, so we can build a freestanding version of C4.</p>
+<p>C4 serves as a test whether my own CC is minimal enough and doesn't use silly functions. Additionally C4 as well as CC are compiled both in a (on Linux) hosted version and a freestanding version. We use a series of compilers like gcc, clang, tcc and pcc to make sure that we are not using silly C constructs.</p>
+<p>In order to be able to port easily we make almost no use of system calls, the ones we need are:</p>
+<ul>
+<li>brk: for malloc/free, change the start address of the heap segment of the process, if the OS only assigns a single static space, then brk results in a NOP.</li>
+<li>exit: terminate the process; return does not work in all combinations (for instance with pcc on Linux). Can be a NOP: we don't require any trickery such as <i>atexit</i> and we don't use buffering anywhere (so there is, for instance, no need to flush stdout on exit).</li>
+<li>read/write: read from stdin linearly, write to stdout linearly, this is essentially a model using an input and an output tape. Those two functions must really exist. This basically eliminates the need for a file system which we might not have during early bootstrapping.</li>
+</ul>
+<p>Similarly we simplify the C language to not use certain features which can cause trouble when bootstrapping:</p>
+<ul>
+<li>variable arguments: though simple in principle (just some pointers into the stack if you use a stack for function parameters), it is not typesafe. And the only example in practice it's really heavily used for is in printf-like functions.</li>
+<li>preprocessor: it needs a filesystem, we take this outside of the compiler by feeding it an (eventually) concatenated list of *.c files.</li>
+<li>two types: int and char, so we can interpret memory as words or as bytes.</li>
+</ul>
+<h2>Local version of C4</h2>
+<p>The local version of C4 has the following adaptations and extensions:</p>
+<ul>
+<li>switch statement from the <i>switch-and-structs</i> branch, adapted c4 itself to use switch statements instead of if's (as in the <i>switch-and-structs</i> branch)</li>
+<li>struct support from <i>switch-and-structs</i></li>
+<li>constants like <i>EOF</i>, <i>EXIT_SUCCESS</i>, <i>NULL</i></li>
+<li>standard C block comments alongside C++ end-of-line ones</li>
+<li>negative enum initializers</li>
+<li>do/while loops</li>
+<li>more C functions like <i>isspace</i>, <i>getc</i>, <i>strcmp</i></li>
+<li>some simplified functions for printing like <i>putstring</i>, <i>putint</i>, <i>putnl</i></li>
+<li>strict C89 conformance, mainly use standard comment blocks, also removed some warnings</li>
+<li>some casts around malloc and memset to fit to non-void freestanding-libc</li>
+<li>converted printf to putstring/putint/putnl and some helper functions for error reporting like error()</li>
+<li>removed all memory leaks</li>
+<li>de-POSIX-ified, no open/read/close, use getchar from stdin only (don't assume the existence of a file system), this also means we had to create sort of an old style tape-file with FS markers to separate the files piped to c4.</li>
+</ul>
+<p><i>Note: </i>only too late I discovered that there was a C5 version of the same compiler, which would maybe have served better as a basis.</p>
+<h2>Examples</h2>
+<h3>Running on the host system using the hosts C compiler</h3>
+<p>Compiled in either hosted (host libc) or freestanding (our own libc, currently IA-32 Linux kernel only syscalls):</p>
+<p><code>./build.sh cc hostcc hosted d<br>./build.sh cc hostcc freestanding d<br>./cc &lt; test1.c &gt; test1.asm</code>
+Create a plain binary from the assembly code:</p>
+<p><code>fasm test1.asm test1.bin</code>
+Disassemble it to verify its correctness:</p>
+<p><code>ndisasm -b32 -o1000000h -a test1.bin</code>
+You can choose <i>gcc</i>, <i>clang</i>, <i>tcc </i>or <i>pcc </i>as host compiler (<i>hostcc</i>).</p>
+<h3>Running on the host in the C4 interpreter</h3>
+<p>Running in C4 interpreter, again, the C4 program can be compiled in hosted or freestanding mode:</p>
+<p><code>./build.sh c4 hostcc hosted d<br>./build.sh c4 hostcc freestanding d</code>
+Here again you can choose the host compiler for compiling C4.</p>
+<p>Then we have to create the standard input for C4 using:</p>
+<p><code>echo -n -e &quot;\034&quot; &gt; EOF<br>cat cc.c EOF hello.c | ./c4<br>cat c4.c EOF cc.c EOF hello.c | ./c4<br>cat c4.c4 EOF c4.c EOF cc.c EOF hello.c | ./c4</code>
+EOF contains the traditional FS (file separator) character in the ASCII character set. Every time c4/c4.c is invoked it reads exactly one input file up to the first FS character (or stops at the end of stdin).</p>
+<p>We can also use <i>-s</i>, or <i>-d </i>on every level as follows:</p>
+<p><code>cat cc.c EOF hello.c | ./c4 -d</code></p>
+<h2>References</h2>
+<p>Compiler construction in general:</p>
+<ul>
+<li><i>&quot;Compiler Construction&quot;</i>, Niklaus Wirth</li>
+<li><a href="https://github.com/DoctorWkt/acwj">https://github.com/DoctorWkt/acwj</a>: a nice series on building a C compiler, step by step with lots of good explanations</li>
+<li><a href="https://www.engr.mun.ca/%7Etheo/Misc/exp_parsing.htm#climbing">https://www.engr.mun.ca/~theo/Misc/exp_parsing.htm#climbing</a>, <a href="https://en.wikipedia.org/wiki/Operator-precedence_parser#Precedence_climbing_method">https://en.wikipedia.org/wiki/Operator-precedence_parser#Precedence_climbing_method</a></li>
+<li><a href="https://github.com/lotabout/write-a-C-interpreter/blob/master/tutorial/en/">https://github.com/lotabout/write-a-C-interpreter/blob/master/tutorial/en/</a>, tutorial based on C4 how to build a C interpreter, explains nicely details in C4.</li>
+</ul>
+<p>C4:</p>
+<ul>
+<li><a href="https://github.com/rswier/c4.git">https://github.com/rswier/c4.git</a>, <i>C4 - C in four functions</i>, Robert Swierczek, minimalistic C compiler running on an emulator on the IR, inspiration for this project</li>
+<li><a href="https://github.com/rswier/c4/blob/switch-and-structs/c4.c">https://github.com/rswier/c4/blob/switch-and-structs/c4.c</a>, c4 adaptations to provide switch and structs</li>
+<li><a href="https://github.com/EarlGray/c4">https://github.com/EarlGray/c4</a>: a X86 JIT version of c4</li>
+<li><a href="https://github.com/jserv/amacc">https://github.com/jserv/amacc</a>: based on C4, JIT or native code, for ARM, quite well documented, also very nice list of compiler resources on Github page</li>
+</ul>
+<p>Other minimal compilers and systems:</p>
+<ul>
+<li><a href="http://selfie.cs.uni-salzburg.at/">http://selfie.cs.uni-salzburg.at/</a>: C* self-hosting C compiler (also emulator, hypervisor) for RISCV, inspiration for what makes up a minimal C language</li>
+<li><a href="http://www.iro.umontreal.ca/%7Efelipe/IFT2030-Automne2002/Complements/tinyc.c">http://www.iro.umontreal.ca/~felipe/IFT2030-Automne2002/Complements/tinyc.c</a>, Marc Feeley, really easy and much more readable, meant as educational compiler</li>
+<li><a href="https://github.com/rswier/swieros.git">https://github.com/rswier/swieros.git</a>: c.c in swieros, Robert Swierczek</li>
+</ul>
+<p>Assembly:</p>
+<ul>
+<li><a href="https://github.com/felipensp/assembly/blob/master/x86/itoa.s">https://github.com/felipensp/assembly/blob/master/x86/itoa.s</a>, for putint (early debugging keyword)</li>
+<li><a href="https://baptiste-wicht.com/posts/2011/11/print-strings-integers-intel-assembly.htm">https://baptiste-wicht.com/posts/2011/11/print-strings-integers-intel-assembly.htm</a> (early debugging keyword)</li>
+</ul>
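Note on the cc.c hunk that follows: the commented-out copy of parseIf emits a compare against zero, a conditional jump over the then-block, and, when an else branch exists, an unconditional jump over the else-block. The label layout can be sketched in isolation as below. This is an illustration under assumptions, not the compiler's code: `emit_if`, the `L<n>` label format and the pre-rendered body strings are hypothetical, and both labels are allocated up front, whereas parseIf requests the second label only after the then-block has been parsed.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical label counter; cc.c obtains its labels via genGetLabel(). */
static int next_label = 0;

/* Write the assembly skeleton of an if or if/else statement into `out`.
 * `then_body` and `else_body` are already-generated assembly text, each
 * ending in a newline; pass NULL as `else_body` when there is no else.
 * Returns the value of snprintf (length of the full text). */
int emit_if(char *out, size_t cap, const char *then_body, const char *else_body)
{
    int l1 = next_label++; /* target when the condition is false */

    if (else_body != NULL) {
        int l2 = next_label++; /* end of the whole statement */
        return snprintf(out, cap,
            "cmp al, 0\nje L%d\n%sjmp L%d\nL%d:\n%sL%d:\n",
            l1, then_body, l2, l1, else_body, l2);
    }
    return snprintf(out, cap,
        "cmp al, 0\nje L%d\n%sL%d:\n",
        l1, then_body, l1);
}
```

Without an else branch only one label is needed, which matches the `else` arm of the original function where `label1` is placed directly after the then-block.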
diff --git a/miniany/cc.c b/miniany/cc.c
index c814f3d..28d9f1c 100644
--- a/miniany/cc.c
+++ b/miniany/cc.c
@@ -1295,6 +1295,49 @@ void parseIf( struct Compiler *compiler )
free( label1 );
}
+/*
+ void parseIf( struct Compiler *compiler )
+{
+	struct Parser *parser;
+	struct ASTnode *node;
+	char *label1, *label2;
+
+	parser = compiler->parser;
+	parserExpect( parser, S_IF, "if" );
+	parserExpect( parser, S_LPAREN, "(" );
+	node = parseExpression( parser, 0 );
+	if( compiler->generator->debug ) {
+		putstring( "; if <cond> then" ); putnl( );
+	}
+	generateFromAST( compiler->generator, node, NOREG );
+	putstring( "cmp al, 0" ); putnl( );
+	label1 = genGetLabel( compiler, compiler->parser->global_scope );
+	putstring( "je " ); putstring( label1 ); putnl( );
+	genFreeAllRegs( compiler->generator );
+	parserExpect( parser, S_RPAREN, ")" );
+	parseStatementBlock( compiler );
+	if( parser->token == S_ELSE ) {
+		label2 = genGetLabel( compiler, compiler->parser->global_scope );
+		putstring( "jmp " ); putstring( label2 ); putnl( );
+		if( compiler->generator->debug ) {
+			putstring( "; else" ); putnl( );
+		}
+		putstring( label1 ); putchar( ':' ); putnl( );
+		compiler->parser->token = getToken( compiler->parser->scanner );
+		parseStatementBlock( compiler );
+		putstring( label2 ); putchar( ':' ); putnl( );
+		free( label2 );
+	} else {
+		putstring( label1 ); putchar( ':' ); putnl( );
+	}
+
+	if( compiler->generator->debug ) {
+		putstring( "; fi" ); putnl( );
+	}
+	free( label1 );
+}
+
+ */
void parseStatement( struct Compiler *compiler )
{
struct Parser *parser;