Forums for the FreeOrion project
All times are UTC

 Post subject: The New FreeOrion Parser
Posted: Sun Nov 06, 2011 3:39 pm
Programming Lead Emeritus
Joined: Thu Jun 26, 2003 1:33 pm
Posts: 1092
What Happened

First off, for those of you who don't know, a "parser" is a bit of code that reads plain text (AKA a "string") and does something with it. FreeOrion has a parser that reads the FO config files (e.g. techs.txt) and builds the appropriate data structures to represent the contents of each file.

I've just rewritten it from scratch, using the old one as a guide. This was a fairly big job, but I feel that the benefits outweigh the development time.

The benefits are:
  • It's easier for the programmers to extend and modify the parser. The old design had some warts that are now removed.
  • It's easier for the scripters to write the FO config files, because there are really good error messages now.

That might not sound like much if you have never tried to add something to the parser or change one of the config files. ;)

Now that we have several people writing content and asking for new kinds of content to be available for their use (i.e. asking for changes to the parser), this seemed like a good time to make this change. As the config files get larger, being able to figure out where errors are will become more and more important.

Here is an example of an actual error message produced by the new parser:
/home/tzlaine/FO_parser/default/techs.txt:1990:46: Parse error.  Expected Condition here:
                  OwnedBy TheEmpire Source.Owner

The first line starts with file:line:column: for easy access to the exact location of the error. After that comes the thing the parser expected to find (above, it expected a Condition). Next, the parser quotes the line in which the error occurred. Finally, the caret ("^") on the last line indicates the position of the problem in the quoted line.
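To make the layout concrete, here is a minimal Python sketch that assembles a message in the same three-line shape. It is not the FreeOrion code (the post does not show that); the function name, its parameters, and the assumption that line and column are 1-based are all mine.

```python
def format_parse_error(filename, line, column, expected, source_line):
    """Build a three-line message: location header, quoted line, caret line.

    Hypothetical helper. Assumes `line` and `column` are 1-based, which
    the post does not state.
    """
    header = f"{filename}:{line}:{column}: Parse error.  Expected {expected} here:"
    quoted = source_line.rstrip("\n")
    # Copy tabs from the quoted line so the caret still lines up when the
    # source uses tabs; every other character becomes a space.
    prefix = quoted[:column - 1]
    padding = "".join(ch if ch == "\t" else " " for ch in prefix)
    # If the column lies past the end of the line, pad out with spaces.
    padding = padding.ljust(column - 1)
    return "\n".join([header, quoted, padding + "^"])


message = format_parse_error(
    "techs.txt", 3, 9, "Condition", "OwnedBy TheEmpire Source.Owner"
)
print(message)
```

With these made-up inputs the caret lands under the "T" of "TheEmpire", which is column 9 of the quoted line.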

The new parser follows the old one nearly exactly. However, there is one significant change in what you can put into the config files: non-numeric values can no longer include arithmetic or numbers. For instance, "System.StarType + 1" is no longer allowed. See this thread for an example of why this was a bad idea in the first place. Take special note of this post, which underscores how crazy the config files would have to get if the scripter were responsible for keeping track of all the math.

There are a couple of quirks to the new parser. First, you must have at least one space before a C-style/multiline comment ("/* some comment */"). "Some stuff/* some comment*/" will produce an error, but "Some stuff /* some comment*/" will not.

Second, error locations are a bit wonky. The location the parser reports is wherever it last successfully made progress in the parse, which is not necessarily where the mistake is. In the example error message above, there is a typo in the name of the Condition that follows the quoted line: it should read "ProductionCenter", but I mistyped it as "PoductionCenter". It would be better if the parser pointed to "PoductionCenter" itself, and I'm currently working to make the error messages better in this regard. The tl;dr on error messages is that you should start looking at the indicated location to find the error, since the error might not be exactly where the caret is pointing.
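The second quirk is easier to see in a toy model. The Python sketch below is not how the FreeOrion parser works internally; it only imitates the "report the last point of progress" behaviour on a whitespace-separated word list. The vocabulary and both function names are hypothetical.

```python
import re

# Hypothetical toy vocabulary; the real grammar is far richer.
KNOWN_WORDS = {"OwnedBy", "TheEmpire", "Source.Owner", "ProductionCenter"}


def to_line_col(text, offset):
    """Convert a 0-based character offset into a 1-based (line, column)."""
    line = text.count("\n", 0, offset) + 1
    line_start = text.rfind("\n", 0, offset) + 1
    return line, offset - line_start + 1


def first_error(text):
    """Return (reported, actual) locations of the first unknown word.

    `reported` is where the toy parser last made progress, i.e. just past
    the last word it accepted. `actual` is where the bad word starts.
    Returns None if every word is known.
    """
    last_good = 0
    for match in re.finditer(r"\S+", text):
        if match.group() not in KNOWN_WORDS:
            return to_line_col(text, last_good), to_line_col(text, match.start())
        last_good = match.end()
    return None


script = "OwnedBy TheEmpire Source.Owner\nPoductionCenter\n"
reported, actual = first_error(script)
print("reported:", reported, "actual:", actual)
```

Here the reported location is the end of the first line, while the misspelled word sits at the start of the second line. That is the same pattern as in the example message: look at the indicated spot first, then just after it.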

Future Work

I'd like to make the parser a bit more user-friendly. For instance, I find the parameters passed to GenerateSitRepMessage to be a bit clunky. It would be better IMO if they were passed as Tag = Data instead of Tag = "Tag" Data = "Data". I've kept the old way of doing things for the first pass, so we can just evaluate the reimplemented parser for correctness before making any significant changes. I'm soliciting comments now from scripters who find particular parts of the config file grammar to be hard to use or unclear. So, comments welcome.

Also, I tried my best to make the error message output as user-friendly as possible, but as a programmer I find it hard not to use terms like int, double, and string. I can't tell how clear the error messages will be to non-programmers, so again, comments welcome!
