Files in the MUDC compiler and what they do:

lex.l          - Processed by lex (the lexical analyzer generator) to
                 produce lex.yy.c (C source code).

lex.yy.c       - The product of lex and lex.l. It is included in
                 distributions because the lexical analyzer rarely
                 needs changing.

yacc.y         - The context-free grammar and stubs for the MUDC
                 language. Processed by yacc to produce yacc.tab.c,
                 which is C source code for the LALR parser. Embedded
                 in yacc.y is the logic for constructing the parse
                 tree for the input stream. The parse tree is then
                 passed to the code generator (emit.c), which uses
                 syntax-directed translation to output MUDASM. After
                 MUDASM, you are in Artur's territory. :)

sym.h, sym.c   - C source for symbol table utilities.

parse.h,
parse.c        - C source for parse tree utilities.

emit.c         - MUDASM generator. Traverses the parse tree built by
                 the LALR parser and emits MUDASM based on syntax
                 rules, symbol table types, etc.

[ LEXICAL ANALYZER ] <- GET_TOKEN() -> [ LALR PARSER ]
                                              |
                                       [ EMIT_ASM() ]

Don't modify lex.yy.c or yacc.tab.c directly, unless there is some bug
which keeps your C compiler from compiling them. To work on the
compiler, get flex and bison (lex and yacc are probably already
installed) and work from lex.l and yacc.y.
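
The hand-off from parse tree to code generator described for emit.c can be illustrated with a minimal sketch. It is not taken from the MUDC sources: the node layout, the helper names (mk_num, mk_bin, free_tree), the emit_asm signature, and the PUSH/ADD/MUL mnemonics are all hypothetical stand-ins, and real MUDASM may look nothing like this. It only shows the general shape of syntax-directed translation: walk the tree post-order, emitting code for the operands before the operator.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node kinds; the real parse.h may define something else. */
enum node_kind { N_NUM, N_ADD, N_MUL };

struct node {
    enum node_kind kind;
    int value;                  /* used by N_NUM only */
    struct node *left, *right;  /* used by N_ADD and N_MUL only */
};

/* Allocate one tree node (the kind of thing a yacc action might call). */
struct node *mk_node(enum node_kind kind, int value,
                     struct node *left, struct node *right)
{
    struct node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->kind = kind;
    n->value = value;
    n->left = left;
    n->right = right;
    return n;
}

struct node *mk_num(int value)
{
    return mk_node(N_NUM, value, NULL, NULL);
}

struct node *mk_bin(enum node_kind kind, struct node *l, struct node *r)
{
    return mk_node(kind, 0, l, r);
}

/*
 * Post-order walk: emit both operands, then the operator.
 * Output is appended to the NUL-terminated string in out, which has
 * room for cap bytes; snprintf truncates rather than overflowing.
 * The mnemonics are invented for this sketch, not real MUDASM.
 */
void emit_asm(const struct node *n, char *out, size_t cap)
{
    size_t used = strlen(out);

    if (n->kind == N_NUM) {
        snprintf(out + used, cap - used, "PUSH %d\n", n->value);
        return;
    }
    emit_asm(n->left, out, cap);
    emit_asm(n->right, out, cap);
    used = strlen(out);
    snprintf(out + used, cap - used, "%s\n",
             n->kind == N_ADD ? "ADD" : "MUL");
}

/* Release a whole tree. */
void free_tree(struct node *n)
{
    if (n == NULL)
        return;
    free_tree(n->left);
    free_tree(n->right);
    free(n);
}
```

Building the tree for 1 + 2 * 3 with mk_bin(N_ADD, mk_num(1), mk_bin(N_MUL, mk_num(2), mk_num(3))) and passing it to emit_asm with an empty buffer yields the lines PUSH 1, PUSH 2, PUSH 3, MUL, ADD, in that order. Writing into a caller-supplied buffer instead of straight to a file is only a choice made here to keep the sketch easy to check.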