- /* The top level of a --C program is a series of one or more function definitions.
- * Not having a function named "main" should be considered an error, though
- * "main" may appear anywhere.
- * Comments can also appear on the top level, but they should be tokenised and
- * then dropped (https://www.youtube.com/watch?v=YLdCKDyxQeI) during the
- * lexical analysis.
- */
- // I'm going to use regular yacc syntax in this document because.
- /* function_definition : identifier identifier LBRACKET param_list RBRACKET block
- * ;
- type_identifier identifier
- * (names for stuff to be decided)
- *
- * (side note: Params - in the function definition. Args - in the function
- * call.)
- * There are only three types (int, function, void) and one of them is only to
- * say that a function doesn't return anything.
- I'm thinking no void: default to returning 0 in int functions, and raise an error in function-returning functions.
- * We may want to just use an
- * IDENTIFIER token for both of them, but an identifier rule would be useful so
- * we can have it do t[0] = t[1].value for all the other times we need to use
- * such identifiers.
- * def p_identifier(t):
- * """identifier : IDENTIFIER"""
- * t[0] = t[1].value
- * (side note: yacc case sensitivity?)
- * (of course we may want it to output something like ("ID", t[1].value) so the
- * execution part can easily tell what kind of eldritch horror it has on its
- * virtual hands)
- *
- * We might also want to separate type identifiers and variable identifiers so
- * that we can get yacc to deal with the fact that there are only three possible
- * types at parse time.
- * type : TYPE_INT
- * | TYPE_FUNCTION
- * ;
- * If we were feeling particularly enterprising/insane we could even make a
- * separate function_type rule for dealing with functions.
- * type_void : TYPE_VOID
- * ;
- * function_type : type
- * | type_void
- * ;
- * (using "type_void" instead of "TYPE_VOID" works better)
- */
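The lexing ideas in the notes above (comments tokenised and then dropped, type keywords kept apart from ordinary identifiers) can be sketched without committing to a parser generator. This is only a rough illustration using Python's standard `re` module. The token names `TYPE_INT`, `TYPE_FUNCTION`, `TYPE_VOID` and `IDENTIFIER` come from the notes, while `RESERVED`, `TOKEN_RE`, `tokenise`, and the `NUMBER`/`SYMBOL` token kinds are assumptions made up for this sketch.

```python
import re

# Hypothetical reserved-word table: the spellings come from the notes
# (int, function, void); the table itself is an assumption.
RESERVED = {"int": "TYPE_INT", "function": "TYPE_FUNCTION", "void": "TYPE_VOID"}

# COMMENT is listed first so that "/*" and "//" are never read as division.
TOKEN_RE = re.compile(r"""
    (?P<COMMENT>/\*.*?\*/|//[^\n]*)
  | (?P<NUMBER>\d+)
  | (?P<WORD>[A-Za-z_]\w*)
  | (?P<SYMBOL>[(){};,=*/+-])
  | (?P<SPACE>\s+)
""", re.VERBOSE | re.DOTALL)


def tokenise(source):
    """Return a list of (kind, text) pairs, with comments and whitespace dropped."""
    tokens = []
    pos = 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        pos = m.end()
        kind, text = m.lastgroup, m.group()
        if kind in ("COMMENT", "SPACE"):
            continue  # recognised as a token, then thrown away
        if kind == "WORD":
            # Type keywords get their own token kinds; everything else is an identifier.
            kind = RESERVED.get(text, "IDENTIFIER")
        tokens.append((kind, text))
    return tokens
```

With the types split out at this stage, a grammar rule such as `type : TYPE_INT | TYPE_FUNCTION` never has to inspect an identifier's spelling.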
- function negative_double_multiplier(int m) {
- int m2 = -2 * m;
- int f(int n) {
- return n * m2;
- }
- return f;
- }
- /* block : LCURLY declare_list statement_list RCURLY
- * ;
- We'll want block : { statement_list } | { declare_list statement_list }
- So we don't have to allow an empty declare_list
- * A block is the body of a function, an if statement, a while loop etc.
- * I'm not sure about this, but would a function definition such as maybe_this
- * below be valid?
- */
- int maybe_this(int a, int b)
- return a + b;
- /* If so, then the "block" in function_definition above could be replaced with
- * "statement".
- * (side note: statements are executed, expressions are evaluated(?))
- * statement : block
- * | expression SEMICOLON
- * | assign_statement SEMICOLON
- * | function_definition
- * | while_statement
- * | if_statement
- * | return_statement SEMICOLON
- * | iter_control SEMICOLON
- * ;
- Your side note doesn't make sense as evaluation should be the same as execution. Possibly you can split them based on execution affecting environment.
- Either way, I'd think:
- statement : block | expression | control_statement
- control_statement then subsumes while, if, return, iter (we don't need iteration as far as I know, just while)
- expression subsumes assignment as (a = b) should evaluate to b in general C and it's easy enough to do
- function_definition should only be permitted in declare_list
- * The only difference between a function's block and an if statement's one is
- * that the function also has the calling arguments in the local environment.
- Not a difference at all really, since the calling arguments should behave like a parent environment or be initially set for the current environment
- * (also it can't contain iter_control statements (break, continue) as top-level
- * statements, but when not inside a while loop neither can if statements, so
- * we will also have to add some logic for that somewhere (while possible,
- * making an alternate statement definition, if_statement definition etc might
- * not be the best (but it would allow for yacc to detect wrongly placed
- * iter_control statements))).
- I don't think you can do this elegantly in the grammar, but you could keep a loop_counter that is incremented on entering a loop and decremented on leaving it, raising an error if a break or continue is seen while loop_counter is 0.
- In the interpreter, detecting this is trivial since control statements should just bubble up the call chain until reaching a loop object or function object, raising an error if it reaches a function object (break, continue) or the "parent of root" (outside of function return)
- *
- * Declaration statements (while they are statements) aren't here because they
- * aren't allowed in statement_list, only in declare_list.
- *
- * One other point: do embedded function definitions have to be at the top of
- * the block (that is to say in the declare_list)? If so then obviously
- * function_definition shouldn't be here.
- *
- * declare_start : type identifier
- * ;
- * I separate this from declare because it can also be used in param_list.
- * (unrelated: (lambda x: x(x))(lambda y: y(y)))
- * declare : declare_start SEMICOLON
- * | declare_start ASSIGN expression SEMICOLON
- * ;
- * ASSIGN is "=", I used the name because it is less ambiguous. Equality is
- * "EQ".
- *
- * expression : function_call
- * | unary_expression
- * | infix_expression
- * | bracket_expression
- * | number
- * | identifier
- * ;
- I massively suspect reading about expression grammars is important here to make sure stuff like precedence is inherent in the grammar where feasible.
- * I may have forgotten one or two things here.
- */
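The loop-counter idea from the notes above (reject break/continue when no while loop encloses them) can be sketched as a pass over the parse tree, done after parsing rather than in the grammar. Everything about the tree shape here is an assumption for illustration: nodes are tuples such as `("while", cond, body)`, `("if", cond, then, else)`, `("block", [stmts])`, `("function", name, body)`, `("break",)` and `("continue",)`, and `check_iter_control` is a hypothetical name. The sketch also assumes that a nested function definition resets the depth to 0, which the notes do not settle.

```python
def check_iter_control(node, loop_depth=0):
    """Raise SyntaxError if break/continue appears outside any while loop."""
    kind = node[0]
    if kind in ("break", "continue"):
        if loop_depth == 0:
            raise SyntaxError(f"'{kind}' outside of a while loop")
    elif kind == "while":
        # ("while", cond, body): the body is one loop level deeper.
        check_iter_control(node[2], loop_depth + 1)
    elif kind == "if":
        # ("if", cond, then[, else]): branches stay at the current depth.
        for branch in node[2:]:
            check_iter_control(branch, loop_depth)
    elif kind == "block":
        for stmt in node[1]:
            check_iter_control(stmt, loop_depth)
    elif kind == "function":
        # Assumption: a function body starts a fresh context, so a break
        # inside it cannot target a loop around the definition.
        check_iter_control(node[2], 0)
    # Any other node kind is treated as a leaf with nothing to check.
```

Passing the depth as an argument does the incrementing and decrementing implicitly: the depth is back to its old value once the recursive call for a loop body returns.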
- int main() {
- int x;
- function multi = negative_double_multiplier(7);
- x = multi(multi(5));
- return maybe_this(2, x);
- }
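As a sanity check for an eventual interpreter, the example program can be transliterated into Python, whose closures capture `m2` in the way the --C version presumably intends. This is only a sketch of the expected behaviour, assuming ordinary integer arithmetic and that `maybe_this` is accepted as a valid definition. It says nothing about how --C itself should be implemented.

```python
def negative_double_multiplier(m):
    m2 = -2 * m

    def f(n):
        # f closes over m2 from the enclosing call.
        return n * m2

    return f


def maybe_this(a, b):
    return a + b


def main():
    multi = negative_double_multiplier(7)  # m2 is -14
    x = multi(multi(5))                    # 5 * -14 = -70, then -70 * -14 = 980
    return maybe_this(2, x)
```

Under those assumptions `main()` returns 982, which gives a concrete value to test an interpreter against.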