Premises:

We want to build a simple compiler for a simple language.

In the end we want to be self-hosting.

Most languages stop maintaining the first compiler, written in
another language, because they concentrate on the compiler written
in their own language.

Starting with a C compiler is too hard; C has too many quirks.

We want to minimize the code we duplicate in more than one language.

We don't want to maintain too much code in the old language.

Every tool should, if possible, be written in the new language.

Options:

- Choose an existing language or a subset of it, e.g. a mini C
  or mini Pascal. Use the existing compiler for the initial builds,
  then use our compiler to build itself and produce the
  self-hosting binary.

- Choose a new language, create a converter to an existing
  language (for instance C). Something like p2c.

- Bootstrap the new language through ever more complex compilers
  written in the new language itself (as gcc does).
  - problem: we have to maintain 2, 3, 4 compilers
  
- There is a two-language or a three-language approach. We can use O
  as destination language for generated code, N for the new language,
  and write the first tools in a third language X.
  
- Language O is just a special backend for the code generator.
  So it's the first one we implement.
  
Glossary:

- O: old language, well-established, can be ported, can build native code
- O', O'': subset languages of language O with reduced features; the more
  primes, the fewer features in common with O (O' has the most, O''''' the fewest)
- N: new language we want to have a compiler for
- N', N'': subset languages of language N with reduced features
- S: a system able to run all tools of O and N. This is an operating
  system with some tools for building software.
- H: a system S used as host system
- H', H'': reduced host system of H
- T: a system S used as target system
- T', T'': reduced target system
- E(S): emulator able to act as system S

Step 1: Build a translator (a transpiler) from N'' -> O'' written in O, O' or O''

We also use the O toolchain for building all artifacts (compiler, assembler, linker).
Try to build minimal subsets of N and use as few features of O as possible for
the generated code. This first compiler allows porting N to new platforms,
provided they have a compiler for O. The fewer features of O we use, the
better.
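
As an illustration of Step 1, here is a minimal sketch of such a translator.
Everything in it is an assumption made for the example: the N'' statement forms
("let NAME = EXPR" and "print EXPR"), the function names, the choice of a tiny
C subset as O'', and the use of Python as a stand-in for the implementation
language. It only shows how few features of O the generated code needs.

```python
import re

# Hypothetical N'' statements: "let NAME = EXPR" and "print EXPR", where
# EXPR is built from integer literals, names, and the operators + - * / ( ).
_TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*|\d+|[-+*/()]")


def check_expr(expr):
    """Return expr with normalized spacing; reject unknown characters."""
    tokens = _TOKEN.findall(expr)
    if not tokens or "".join(tokens) != expr.replace(" ", ""):
        raise ValueError("unsupported expression: " + repr(expr))
    return " ".join(tokens)


def translate(source):
    """Translate hypothetical N'' source text into a small C subset (O'')."""
    body = []
    declared = set()
    for raw in source.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("let "):
            name, sep, expr = line[4:].partition("=")
            name = name.strip()
            if not sep or not name.isidentifier():
                raise ValueError("malformed let: " + line)
            # Declare a variable on first use, assign to it afterwards.
            prefix = "" if name in declared else "int "
            declared.add(name)
            body.append("    " + prefix + name + " = " + check_expr(expr.strip()) + ";")
        elif line.startswith("print "):
            body.append('    printf("%d\\n", ' + check_expr(line[6:].strip()) + ");")
        else:
            raise ValueError("unsupported statement: " + line)
    lines = ["#include <stdio.h>", "", "int main(void)", "{"]
    lines += body + ["    return 0;", "}"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(translate("let x = 1 + 2\nprint x * 3\n"))
```

The generated code uses only int variables, assignment and printf, so any
compiler for O should accept it. The sketch does not validate expression
syntax beyond the character set; that is left to the O compiler.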

Step 2: Write compiler in N, with a backend for O''

Do all the steps of step 1 again in language N, as nicely as possible.
Introduce new elements to N (and thus to the N to O translator) as needed.
The question remains whether this compiler handles the full language N or
only a subset N'. We produce code for a small subset of O, so this code can
also be built on as many systems as possible. The goal is that N is
self-hosting using the toolchain of O.
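
The chain of Steps 1 and 2 can be written down as tombstone-style records.
This is only a sketch under assumptions: the record fields, the helper names
and the label "native" for machine code are made up for the example, and it
assumes that the Step 2 compiler is written in the subset N'' that the Step 1
translator accepts.

```python
from collections import namedtuple

# Tombstone-style record: a program translating `source` to `target`,
# itself written in the language `written_in`.
Translator = namedtuple("Translator", "source target written_in")


def is_subset(lang, of):
    """True if lang is `of` or a reduced form of it (same letter, more primes)."""
    return lang[0] == of[0] and lang.count("'") >= of.count("'")


def run(tool, program):
    """Run `tool` over `program`; the result is `program` in tool.target."""
    if tool.written_in != "native":
        raise ValueError("tool is not runnable yet: %r" % (tool,))
    if not is_subset(program.written_in, tool.source):
        raise ValueError("tool cannot read " + program.written_in)
    return program._replace(written_in=tool.target)


# The existing O toolchain, treated as one step from O to machine code.
o_toolchain = Translator("O", "native", "native")

# Step 1: translator N'' -> O'', written in O, built with the O toolchain.
step1 = Translator("N''", "O''", "O")
step1_bin = run(o_toolchain, step1)

# Step 2: compiler N -> O'', written in N'' (assumption, see above).
step2 = Translator("N", "O''", "N''")
step2_in_o = run(step1_bin, step2)        # translated by the Step 1 tool
step2_bin = run(o_toolchain, step2_in_o)  # built with the O toolchain

# Self-hosting: the Step 2 compiler translates its own source.
step2_self = run(step2_bin, step2)

if __name__ == "__main__":
    print(step2_self == step2_in_o)
```

In this model, self-hosting means that the compiler built by the Step 1
translator and the compiler built by itself end up as the same record. A real
check would compare the generated O'' code of both builds.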

Step 3: Write a native backend for N targeting system T'' or an emulator E(T'')

TODO: Tombstone notation, can also be applied to more than just a compiler