implementing:

- userland
  - argument passing to main function (argc, argv)
- libc
  - print_char
    - requires a 3-parameter syscall via int 80h (Linux)
      - requires
        - inline assembly
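
The print_char requirement above could be sketched roughly as follows.
This is only an illustration, assuming GCC-style extended inline assembly
on Linux/x86; the helper name sys_write is hypothetical. The 32-bit
branch uses int 80h as described above, the 64-bit branch (an addition,
so the sketch also builds on x86-64 hosts) uses the syscall instruction.

```c
/* Sketch: print_char on top of a raw 3-parameter write syscall.
 * Assumptions: Linux, x86, GCC-style inline assembly.
 * sys_write is a hypothetical helper name. */
static int sys_write(int fd, const char *buf, int len)
{
    long ret;
#if defined(__i386__)
    /* int 80h ABI: eax = syscall number (4 = write),
     * ebx, ecx, edx = the three parameters */
    __asm__ volatile ("int $0x80"
                      : "=a"(ret)
                      : "0"(4L), "b"(fd), "c"(buf), "d"(len)
                      : "memory");
#elif defined(__x86_64__)
    /* 64-bit ABI: rax = syscall number (1 = write),
     * rdi, rsi, rdx = the three parameters; rcx and r11 are clobbered */
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "0"(1L), "D"((long)fd), "S"(buf), "d"((long)len)
                      : "rcx", "r11", "memory");
#else
#error "this sketch only covers Linux on x86"
#endif
    return (int)ret;
}

/* Writes one character to stdout (fd 1).
 * Returns the syscall result: 1 on success, a negative errno on failure. */
int print_char(int c)
{
    char ch = (char)c;
    return sys_write(1, &ch, 1);
}
```

Calling print_char('A') should put a single 'A' on stdout without
needing a FILE structure or any other part of libc.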

not implementing:
- libc
  - variadic functions are not type-safe, do we need them?
    - printf -> putint, putchar, etc.
      - format string only, as replacement for puts
      - vararg required in compiler
      - not type-safe
    - snprintf is no option; strcat, strstr etc. not really either
    - newer formatting functions and logging: strfmon, error, warn, syslog
    - syscall
  - puts
    - requires stdout, which is a FILE structure
  - print_char
    - requires a 3-parameter syscall via int 80h (Linux)
      - requires
        - either inline assembly
        - or a linker and a calling convention
- preprocessor
  - have a cat building up the required modules instead
  - needs file operations (at least open, close, read)
  - needs a file system on the host and the destination
    (alternative: have a tape-like file system)
- linker
  - having compilation units needs a linker to build
    an executable
- symname[t], printing the symbol and not the number,
  requires static initializers for an array of char*
- ASTs are basically only useful when you start to optimize,
  till then you can use an intermediate format (as C4 does)
  and a stack machine. They also make the code easier to read.
  For us they force the introduction of pointers, references and structs.
  In expression parsing we see that constant folding already needs
  an AST, because we should not emit code while still reading
  a constant expression. It also separates syntactical stuff like '['
  from logical stuff like 'declaration of array size' and 'dereferencing
  a pointer'.
- void *, which allows omitting (char *) casts from and to, for instance,
  structs in dynamic memory management
- typedefs are just syntactic sugar, I use them mostly for 'struct T' -> 'T'
- initializers of globals and locals, not that important as we use C89 anyway,
  forcing us to separate declaration and usage of variables per scope
- unions, useful to save space in AST, but not strictly necessary
- bool, useful, but not strictly necessary
- enums as constant replacement (instead of preprocessor); real enum types
  are not really useful.
- forward struct definitions or typedefs (handy for Compiler structure), but..
- for loop: unless we start optimizing (SIMD) there is no real benefit
  in a generic 'for'; a strict for i=0 to N, i++ is easier to optimize when
  you have a grammatical construct that helps to recognize it.
- register number for register allocation
  https://en.wikipedia.org/wiki/Strahler_number
- volatile: we are not doing any optimizations for now, so volatile (like const)
  can just be an ignored keyword.
- c4 freestanding
  - uses some casts, the malloc ones are actually good for clarification,
    the ones in memset are not so useful (this is all because we don't
    have 'void *')
  - open/read/close is POSIX, we would prefer either C style file handling
    (we have it in libc-freestanding.c) or some stdin, stdout thingy
  - again printf and varargs, either use libc-freestanding.c or revert to
    putint, putstring, putnl..
    - if (tk == '(') next(); else { printf("%d: open paren expected\n", line); exit(-1); }
      =>
      if (tk == '(') next(); else { error("open paren expected"); }
    - printf("%d: compiler error tk=%d\n", line, tk); exit(-1);
    - printf("could not malloc(%d) symbol area\n") => remove size, also map to error
    - printf("read() returned %d\n", i); => ditto
    - we also print a nonsensical line, but we don't really care about this
    - printf("%d: bad enum identifier %d\n", line, tk); the number 'tk' looks like
      debug output here, so we drop it.
      error1int is the other option (also chosen in other places)
    - other cases translate by hand:
      - case EXIT: /* putstring("exit("); putint(*sp); putstring(") cycle = "); putint(cycle); putnl(); */ return *sp;
      - default: putstring("unknown instruction = "); putint(i); putstring("! cycle = "); putint(cycle); putnl(); return -1;
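
The putint/putstring/putnl replacements for printf mentioned above could
look roughly like this. It is a minimal sketch in C89 style, not the
actual libc-freestanding.c code: print_char here appends to a buffer
(out_buf and out_len are hypothetical names) so the result can be
inspected, where a freestanding libc would issue the write syscall
instead. report_unknown is likewise a hypothetical wrapper around the
hand-translated 'default:' case.

```c
/* Assumption: out_buf stands in for the real output device. */
static char out_buf[256];
static int out_len = 0;

/* Appends one character to out_buf, silently dropping it when full. */
void print_char(int c)
{
    if (out_len < (int)sizeof(out_buf) - 1) {
        out_buf[out_len++] = (char)c;
        out_buf[out_len] = '\0';
    }
}

/* Prints a NUL-terminated string, no format string, no varargs. */
void putstring(const char *s)
{
    while (*s) print_char(*s++);
}

void putnl(void)
{
    print_char('\n');
}

/* Prints a signed decimal integer. */
void putint(int n)
{
    char digits[12];
    int i = 0;
    unsigned int u;

    if (n < 0) {
        print_char('-');
        u = 0u - (unsigned int)n;   /* also correct for INT_MIN */
    } else {
        u = (unsigned int)n;
    }
    /* collect digits least significant first, then emit them reversed */
    do {
        digits[i++] = (char)('0' + u % 10);
        u /= 10;
    } while (u != 0);
    while (i > 0) print_char(digits[--i]);
}

/* Hand translation of
 * printf("unknown instruction = %d! cycle = %d\n", i, cycle); return -1; */
int report_unknown(int i, int cycle)
{
    putstring("unknown instruction = "); putint(i);
    putstring("! cycle = "); putint(cycle); putnl();
    return -1;
}
```

Every call takes exactly one argument of a fixed type, so the compiler
needs no vararg support and the calls stay type-safe, at the price of
more verbose call sites.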