Half-way through this Abstract Syntax Tree stuff...
Whilst not required at this stage, I would like to have some kind of source caching in place at a later date. Opening and reading files on Windows is quite slow, and for a large code base with 100+ source files this adds a lot of up-front time. The 100 file reads are a roadblock in themselves, but the bulk of the time then goes on parsing the text into words (tokens).
My plan is that once a project is assembled, the token-streams for all files are written out to a binary file. As long as a source file's modification date and file size have not changed, the tokenised version of that file can be retrieved from the cache without having to do any string parsing, which should be a major speed boost.
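As a rough illustration of the date / file size check, here is a minimal sketch in Python (not the assembler's own code). The names `SourceStamp`, `stamp_of` and `cache_is_fresh` are made up for this example, and it assumes the stamp recorded at cache time is stored alongside the cached token-stream.

```python
import os
from typing import NamedTuple, Optional


class SourceStamp(NamedTuple):
    """Hypothetical record stored next to each cached token-stream."""
    mtime_ns: int  # last-modified time in nanoseconds
    size: int      # file size in bytes


def stamp_of(path: str) -> SourceStamp:
    """Read the current modification time and size of a source file."""
    st = os.stat(path)
    return SourceStamp(st.st_mtime_ns, st.st_size)


def cache_is_fresh(path: str, cached: Optional[SourceStamp]) -> bool:
    """True only if the file still matches the stamp taken at cache time."""
    if cached is None:
        return False  # nothing cached for this file yet
    try:
        return stamp_of(path) == cached
    except OSError:
        return False  # file missing or unreadable: treat the cache as stale
```

A date and size comparison is cheap but not watertight: an edit that keeps the same size and restores the old timestamp would slip through. A content hash would catch that, at the cost of reading the file again.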
Now one could cache the Abstract Syntax Tree instead, which would be even faster (as the token-streams would not have to be built into the syntax tree), but this is far, far too dangerous and unreliable. The whole point of the Abstract Syntax Tree is to do away with validation during assembly, having moved that responsibility to the source parsing. If one were to accept a binary file of an Abstract Syntax Tree, the nodes could be entirely wrong, crashing the assembler with little hope of recovery. Also, should the OZ80 language change, loading an outdated AST would have the same effect.
Instead, if we cache the token-stream (a simple, compact, word-for-word binary form of the source code), we can re-run it through the same validation that builds the AST from freshly parsed source. Poisoned and out-of-date caches would therefore be parsed safely, with full error handling.
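To make the idea concrete, here is a small Python sketch of what a token-stream cache might look like. The layout is entirely an assumption for illustration: a made-up `OZTK` signature, a version number, a token count, then each token as a kind byte plus length-prefixed UTF-8 text. The real OZ80 token format may look nothing like this.

```python
import struct
from typing import List, Tuple

MAGIC = b"OZTK"  # hypothetical file signature
VERSION = 1      # hypothetical format version

Token = Tuple[int, str]  # (kind, text); kind must fit in one byte


def encode_tokens(tokens: List[Token]) -> bytes:
    """Pack tokens as: magic, version, count, then (kind, length, text) each."""
    out = bytearray(MAGIC)
    out += struct.pack("<HI", VERSION, len(tokens))
    for kind, text in tokens:
        raw = text.encode("utf-8")
        out += struct.pack("<BH", kind, len(raw))  # text limited to 65535 bytes
        out += raw
    return bytes(out)


def decode_tokens(data: bytes) -> List[Token]:
    """Unpack a token cache, raising ValueError on anything malformed."""
    header_size = len(MAGIC) + struct.calcsize("<HI")
    if len(data) < header_size or data[:len(MAGIC)] != MAGIC:
        raise ValueError("not a token cache")
    version, count = struct.unpack_from("<HI", data, len(MAGIC))
    if version != VERSION:
        raise ValueError("token cache version mismatch")
    pos = header_size
    tokens: List[Token] = []
    for _ in range(count):
        if pos + 3 > len(data):
            raise ValueError("truncated token header")
        kind, length = struct.unpack_from("<BH", data, pos)
        pos += 3
        if pos + length > len(data):
            raise ValueError("truncated token text")
        try:
            text = data[pos:pos + length].decode("utf-8")
        except UnicodeDecodeError as exc:
            raise ValueError("token text is not valid UTF-8") from exc
        tokens.append((kind, text))
        pos += length
    if pos != len(data):
        raise ValueError("trailing bytes after last token")
    return tokens
```

The decoder only checks that the bytes form a well-shaped list of tokens. Whether those tokens make sense as OZ80 source is still decided by the normal AST-building step, so a `ValueError` here simply means "ignore the cache and tokenise the source file again".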