SHOW:
|
|
- or go back to the newest paste.
1 | /* The top level of a --C program a series of 1 or more function definitions. | |
2 | * Not having a function named "main" should be considered an error, though | |
3 | * "main" may appear anywhere. | |
4 | * Comments can also appear on the top level, but they should be tokenised and | |
5 | * then dropped (https://www.youtube.com/watch?v=YLdCKDyxQeI) during the | |
6 | * lexical analysis. | |
7 | */ | |
8 | ||
9 | // I'm going to use regular yacc syntax in this document because. | |
10 | ||
11 | /* function_definition : identifier identifier LBRACKET param_list RBRACKET block | |
12 | * ; | |
13 | ||
14 | type_identifier identifier | |
15 | ||
16 | * (names for stuff to be decided) | |
17 | * | |
18 | - | * say that a function doesn't return anything. We may want to just use an |
18 | + | |
19 | * call.) | |
20 | * There are only three types (int, function, void) and one of them is only to | |
21 | * say that a function doesn't return anything | |
22 | ||
23 | I'm thinking no void and default to returning 0 in int functions and raising an error in function returning functions. | |
24 | ||
25 | . We may want to just use an | |
26 | * IDENTIFIER token for both of them, but an identifier rule would be useful so | |
27 | * we can have it do t[0] = t[1].value for all the other times we need to use | |
28 | * such identifiers. | |
29 | * def p_identifier(t): | |
30 | * """identifier : IDENTIFIER""" | |
31 | * t[0] = t[1].value | |
32 | * (side note: yacc case sensitivity?) | |
33 | * (of course we may want it to output something like ("ID", t[1].value) so the | |
34 | * execution part can easily tell what kind of eldrich horror it has on its | |
35 | * virtual hands) | |
36 | * | |
37 | * We might also want to separate type identifiers and variable identifiers so | |
38 | * that we can get yacc to deal with the fact that there are only three possible | |
39 | * types at parse time. | |
40 | * type : TYPE_INT | |
41 | * | TYPE_FUNCTION | |
42 | * ; | |
43 | * If we were feeling particularly enterprising/insane we could even make a | |
44 | * separate function_type rule for dealing with functions. | |
45 | * type_void : TYPE_VOID | |
46 | * ; | |
47 | * function_type : type | |
48 | * | type_void | |
49 | * ; | |
50 | * (using "type_void" instead of "TYPE_VOID" works better) | |
51 | */ | |
52 | function negative_double_multiplier(int m) { | |
53 | int m2 = -2 * m; | |
54 | int f(int n) { | |
55 | return n * m2; | |
56 | } | |
57 | return f; | |
58 | } | |
59 | ||
60 | /* block : LCURLY declare_list statement_list RCURLY | |
61 | * ; | |
62 | ||
63 | We'll want block : { statement_list } | { declare_list statement_list } | |
64 | So we don't have to include an empty statement_list | |
65 | ||
66 | * A block is the body of a function, an if statement, a while loop etc. | |
67 | * I'm not sure about this, but would a function definition such as maybe_this | |
68 | * below be valid? | |
69 | */ | |
70 | int maybe_this(int a, int b) | |
71 | return a + b; | |
72 | /* If so, then the "block" in function_definition above could be replaced with | |
73 | * "statement". | |
74 | * (side note: statements are executed, expressions are evaluated(?)) | |
75 | * statement : block | |
76 | * | expression SEMICOLON | |
77 | * | assign_statement SEMICOLON | |
78 | * | function_definition | |
79 | * | while_statement | |
80 | * | if_statement | |
81 | * | return_statement SEMICOLON | |
82 | * | iter_control SEMICOLON | |
83 | * ; | |
84 | ||
85 | Your side note doesn't make sense as evaluation should be the same as execution. Possibly you can split them based on execution affecting environment. | |
86 | ||
87 | Either way, I'd think: | |
88 | statement : block | expression | control_statement | |
89 | ||
90 | control_statement them subsumes while, if, return, iter (we don't need iteration as far as I know, just while) | |
91 | expression subsumes assignment as (a = b) should evaluate to b in general C and it's easy enough to do | |
92 | function_definition should only be permitted in declare_list | |
93 | ||
94 | * The only difference between a function's block and an if statement's one is | |
95 | * that the function also has the calling arguments in the local environment. | |
96 | ||
97 | Not a difference at all really, since the calling arguments should behave like a parent environment or be initially set for the current environment | |
98 | ||
99 | * (also it can't contain iter_control statements (break, continue) as top-level | |
100 | * statements, but when not inside a while loop neither can if statements, so | |
101 | * we will also have to some logic for that somewhere (while possible, | |
102 | * making an alternate statement definition, if_statement definition etc might | |
103 | * not be the best (but it would allow for yacc to detect wrongly placed | |
104 | * iter_control statements))). | |
105 | ||
106 | I don't think you can elegantly do this in the grammar, however you can maybe set a loop_counter that gets incremented when inside a loop, decerementing when leaving, raising an error when trying to leave l_c=0. | |
107 | ||
108 | In the interpreter, detecting this is trivial since control statements should just bubble up the call chain until reaching a loop object or function object, raising an error if it reaches a function object (break, continue) or the "parent of root" (outside of function return) | |
109 | ||
110 | * | |
111 | * Declaration statements (while they are statements) aren't here because they | |
112 | * aren't allowed in statement_list, only in declare_list. | |
113 | * | |
114 | * One other point: do embedded function definitions have to be at the top of | |
115 | * the block (that is to say in the declare_list)? If so then obviously | |
116 | * function_definition shouldn't be here. | |
117 | * | |
118 | * declare_start : type identifier | |
119 | * ; | |
120 | * I separate this from declare because it can also be used in param_list. | |
121 | * (unrelated: (lambda x: x(x))(lambda y: y(y))) | |
122 | * declare : declare_start SEMICOLON | |
123 | * | declare_start ASSIGN expression SEMICOLON | |
124 | * ; | |
125 | * ASSIGN is "=", I used the name because it is less ambiguous. Equality is | |
126 | * "EQ". | |
127 | * | |
128 | * expression : function_call | |
129 | * | unary_expression | |
130 | * | infix_expression | |
131 | * | bracket_expression | |
132 | * | number | |
133 | * | identifier | |
134 | * ; | |
135 | ||
136 | I massively suspect reading about expression grammars is important here to make sure stuff like precedence is inherent in the grammar where feasible. | |
137 | ||
138 | * I may have forgotten one or two things here. | |
139 | */ | |
140 | ||
141 | int main() { | |
142 | int x; | |
143 | function multi = negative_double_multiplier(7); | |
144 | x = multi(multi(5)); | |
145 | return maybe_this(2, x); | |
146 | } |