expression can also be the result of a function (system.readline).
The assignment depends on the type of the left-hand-side string
(s := system.readline), or we introduce VAR parameters:

  s := system.writeline( );
  system.readline( var s : array of char );

The problem here is: we cannot return a structure of arbitrary size via
the stack. So the VAR version is closer to what would be written in a real
system module in language e. The embedded function version of the pseudo
system module looks more like a special assignment for string arrays.

types:

  boolean: and, or, not operators
  integer: + - * / mod, max(int), min(int)
  char: 'A' or char(13)
  byte: 8-bit unsigned value
  word: 32-bit addressable value
  enum colors (red, blue, green, yellow)
    we allow int(red)=0
    what about assignments of explicit values as in C?
  <= <> = (assignment :=)
  const types?
  no floats for now

structured types: record, array, set

  array a[1:25]
  strings are arrays of variable len?
  do we need ranges for arrays?
  a : array[20] of integer; is clearer than in Edison: array a[0..20] (int)
  array 10, 10 of integer, that's Oberon syntactic sugar

Another way is to represent the length in the last bytes of the array and
also to zero-terminate the string (Bron-Dijkstra string):
https://github.com/norayr/Bron-Dijkstra-Strings/blob/master/bdStrings.Mod
Dijkstra\ -\ Efficient\ String.pdf

---

edison-es drops modules. I actually find system.writeln, system.readln
quite appealing.

writeinteger writeboolean writechar writeln/writeline/writestring, but
there is no basic type for a sequence of chars. Is this an array[20] of
char?

explicit skip

strings as types: "Abc" is a string constant and can be represented as
array[3] of char, but then how can this be assigned to an array[4] of
char? Types can be assigned if they are compatible, so we can say that
assigning an array[3] of char (also 'Abc') to an array[4] of char is
possible, but not the other way round, as it would violate the
boundaries!
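The compatibility rule for char arrays (shorter into longer is fine, longer into shorter is not) can be sketched in C. This is only an illustration: the helper name assign_chars, the return codes and the padding of the unused tail with char(0) are assumptions, not something these notes have decided.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: assigns an array[src_len] of char to an
   array[dst_len] of char. Returns 0 on success and -1 if the source
   does not fit, in which case dst is left untouched. */
static int assign_chars(char *dst, size_t dst_len,
                        const char *src, size_t src_len)
{
    assert(dst != NULL && src != NULL);
    if (src_len > dst_len) {
        return -1;                      /* would violate the boundaries */
    }
    memcpy(dst, src, src_len);
    /* Assumption: the unused tail is filled with char(0). */
    memset(dst + src_len, 0, dst_len - src_len);
    return 0;
}
```

Called with an array[3] source and an array[4] destination it returns 0; with the roles swapped it returns -1 and does not write to the destination. In the compiler both lengths are known statically, so the same comparison could be done at compile time instead of at run time.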
array of char is only possible with dynamic memory management, which is
something we might not want at all.

0-terminated vs. length. But not the 255 limit as in Pascal; have an RLE
scheme for the first N bytes.

char can also be Unicode; conversion to integer is possible, but not to
byte. Use array[128] of byte for buffers.

Certain functions might have to work on arrays of arbitrary size, like a
'StrCopy' function taking an 'array of char' of unknown size. They need a
relaxed type check and delegate the checking of boundaries, if needed, to
the runtime.

Built-in functions like LEN or system.length: length and sizeof sound more
like a compiler thing, system.length more like a library thing. Actually,
we don't want to say len is platform-dependent, so code using length
can be very portable.

So we have an internal set of functions related to compiler things:
- domains of data types
- conversion of data types
- len, size, addr of variables/arrays

The system module, on the other hand, contains things which relate somehow
to the environment, e.g. backend, operating system, and which might have to
be ported heavily. They are still most likely called inside the compiler
when generating code.

expressions

  var b : boolean;
  b := s[i] <> char( 0 );
  if b do x end

is the same as:

  if s[i] <> char( 0 ) do x end

The '<>' operator must return a boolean type. So we just call expression
inside 'if' as we do for the assignment (later also in the 'while'
condition).

return expression:
- only at the end, after statementBlock, or as
  "begin" statementlist [ "return" expression ] "end"
- or as a semantic thing: allow "return" everywhere, but know whether the
  context is a procedure or a function context
- or as in C: allow it everywhere, because everything is a function.

system.readline( s ) fits more to fgets, but s := system.readline; is more
what I want.
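To make the two readline shapes concrete, here is a minimal C sketch of what a backend could provide. The names sys_readline, sys_readline_value, Line and LINE_CAP are hypothetical, and the fixed capacity of 80 chars is an assumption. The first function mirrors the VAR-parameter form and is a thin wrapper around fgets; the second mirrors s := system.readline by returning a fixed-size record by value, which only works because the size is known at compile time.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical counterpart of system.readline( var s : array of char ).
   The caller owns the buffer; cap is its size in chars. Returns the
   number of chars stored (without the terminator) or -1 on end of input. */
static int sys_readline(FILE *in, char *s, size_t cap)
{
    assert(in != NULL && s != NULL && cap > 0);
    if (fgets(s, (int)cap, in) == NULL) {
        s[0] = '\0';
        return -1;
    }
    size_t n = strcspn(s, "\n");   /* position of newline, or string length */
    s[n] = '\0';                   /* strip the newline if there is one */
    return (int)n;
}

/* Assumption: array[80] of char wrapped in a struct so it can be
   returned by value, i.e. the "special assignment" form. */
#define LINE_CAP 80
typedef struct { char c[LINE_CAP]; } Line;

static Line sys_readline_value(FILE *in)
{
    Line l = { { 0 } };
    (void)sys_readline(in, l.c, sizeof l.c);
    return l;
}
```

A line longer than the buffer is cut off and the rest stays in the input for the next call, which is the fgets behaviour. The by-value form copies the whole 80 chars on every return, so it is convenient but not free.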
memory management
-----------------

options:
- static allocation
- stack-based
- explicit: C malloc/free
- region-based
- thread-local heap
- implicit:
  - garbage collection
  - ARC: reference counting and weak pointers

Decoupling memory management from polymorphism seems to be a big design
problem in most programming languages.

dangers in real-time programming:
- priority inversion on locks
- fragmentation of memory: the program fails because there is not enough
  unfragmented space. A copying garbage collector might help, or compacting
  and rewriting pointers, but this is again a real-time issue if not done
  incrementally.

How to decouple read-only and read-write parts of the statically allocated
memory?

Stack-only allocation if possible. This also means temporary structures
cannot be trees with pointers. This means a transpiler must emit code
(in our case C source code) while parsing, which might be challenging.

Even better are statically sized local buffers and global statically
allocated structures (e.g. a symbol table with at most 50 types). These
limits have to be adapted and the compiler has to be recompiled. But the
benefit is that you are not using any dynamic memory allocation, which can
go wrong in several ways.

procedure/function declarations:

Pascalish:
  procedure f : char;
  procedure f;

C-ish:
  procedure f( ) : char;
  procedure f( );

=> matter of definition of ParameterList in ProcedureDeclaration

  function getChar : char;
  procedure getChar : char;

=> doesn't add anything to help parsing
=> Pascal/Oberon calls them function procedures
=> had some weird discussions telling me functions are not procedures..
  procedure f( var a : integer );
  function sin( x : float ) : float;

=> function seems more mathematical, but otherwise we don't gain anything
   from having one more keyword to detect something we couldn't detect
   already

using/calling:

procedure call:
  init;
  proc( a, b );

function call in expression:
  a+sin( b )
  a+rand( )

=> even here it might feel more logical, and actually the syntax element
   '(' helps us to detect that it is a function. OTOH we can get the same
   information from the symbol table.

enums:

Oberon has none. You can always use constants or sets, but then the switch
statement cannot be protected against wrong use of constants. C and Java
went the way from constants to proper enums.

=> subtyping problem: to be defined in a sane way, extending an enum would
   mean removing states. But removing states from an enum makes it hard
   for code relying on more states to behave in a consistent way.
=> subtype explosion: enums are just a fancy way of defining integer
   constants. The only practical applications I have are avoiding implicit
   type coercion to ints and handling the ranges in a state machine switch.
=> enums used in array subscripts lead to the sub-range problem of
   Pascal/Edison, unless I force enums to always start from 0,1,2,... as
   internal representation
=> OOP has no need for enums, as I can discriminate and extend a basic
   type, e.g. KEYWORD extended to KEYWORD_MODULE, KEYWORD_IF, etc.
=> enum constants have no const value, so they cannot be used to define an
   array (or at least, this needs a special cast again)

Compared to functional languages, the C version of enums is quite limited;
see tagged unions (for instance in Rust).

underscores:

Started when trying to add an S_module constant, so de facto a workaround
for a missing namespace/module called 'Scanner' with constant 'module'. Do
we forbid _ altogether, as it is a sign of bad modularization or namespace
emulation? On the other hand we will have longer identifiers, so _ is
needed to separate words.
AST:
https://stackoverflow.com/questions/21150454/representing-an-abstract-syntax-tree-in-c

design

Scanner class or struct vs. OPS module containing all variables. All
modules in the Oberon compiler act as singletons.

nested procedures

Intermediate formats can be in memory or on external media. The latter is
the older design, from when memory was scarce. It also avoids using
complicated data structures in memory.

symbols

All allocation on the stack might not be the best idea..

..we allocate all symbols with their type, so we must split symbols and
types. Variables point to their defined type, parameters in a function
point to local variables (or to variables of the upper scope in case of a
VAR parameter), names in a record type definition point to symbols again,
which point to types. This also means we usually get a global namespace
this way, so a variable 'a', a type 'a' and a procedure 'a' cannot
co-exist.

procedure types

  type Func = procedure( x : integer, y : integer ) : char;
  var f : Func;
  begin
    f := nil;

This is a pointer to a function, so is 'pointer to procedure' better, as
in a pointer to a procedure implementing that interface? Also, the 'x' and
'y' don't really have a semantic meaning in the type declaration, as I can
also assign a matching function 'f2':

  procedure f2( a : integer, b : integer ) : char;
  begin
    ...
  end

  f := f2;

Calling nil should most likely end in an exception. But this means we have
to check a special condition on every function call.

e2c or C-transpiler questions

Symbol for the scanner and Symbol for the symbol table clash; we should use
modules and different names. And why are we insisting on having no
preprocessor and only one e2c.c?
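For the procedure types above, one possible C output of the transpiler is a function-pointer typedef plus a checked call. This is a sketch under assumptions: the wrapper name call_func, the error message and the use of abort() in place of an exception are hypothetical, and the body of f2 is invented only so there is something to call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* type Func = procedure( x : integer, y : integer ) : char;
   The parameter names are not part of the C type, which matches the
   observation that 'x' and 'y' carry no semantic meaning. */
typedef char (*Func)(int, int);

/* procedure f2( a : integer, b : integer ) : char;
   Placeholder body for illustration only. */
static char f2(int a, int b)
{
    return (char)('A' + a + b);
}

/* Hypothetical check emitted before every call through a procedure
   variable: the "special condition on every function call". */
static char call_func(Func f, int x, int y)
{
    if (f == NULL) {
        fprintf(stderr, "runtime error: call through nil procedure\n");
        abort();   /* stand-in for raising an exception */
    }
    return f(x, y);
}
```

With f := f2, call_func( f, 1, 2 ) calls f2 and yields 'D'; with f := nil the program stops with the error message instead of jumping to address 0. Direct calls to a named procedure do not need the check, only calls through a variable of a procedure type.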
set/bitset/enum

set: mutually exclusive
bitset: usable as switches, flags

Have a look at System/360 and how the grandfathers and fathers did it:
XPL/XCOM: strings of variable size with garbage collection

links
-----

https://hackernoon.com/considerations-for-programming-language-design-a-rebuttal-5fb7ef2fd4ba
https://en.wikibooks.org/wiki/Oberon/A2/Oberon.Strings.Mod
https://en.wikipedia.org/wiki/Tombstone_diagram