I was wondering why the game turned out so slow with just a few dozen colonized systems and only 500 in total (game attached). Just one step took several minutes to compute (Freeorion r4279). So I ran callgrind on freeoriond, it took more than three hours to complete one game step, but it was worth it.
I would like to throw in at this time that Freeorion is one of the best-written code examples I have ever read, but at the same time (unfortunately) it features some of the poorest algorithms I have ever seen.
The real evildoer is the WithinStarlaneJumps condition implementation.Edit 2011-09-22 13:40: updated OP to my current knowledge of the code
WithinStarlaneJumps::Eval(context, matches, non_matches, search_domain)
WithinStarlaneJumps has two relevant members, m_subcondition and m_jumps, that represent what we expect to find in proximity and what maximum distance we allow. If search_domain is MATCHES, it is supposed to scan through matches and move all objects to non_matches that are not within m_jumps of an object matching subcondition. If search_domain is NON_MATCHES, instead it is supposed to scan through non_matches and move all objects to matches that are
within m_jumps of an object matching subcondition.
For simplicity, I will assume that both all objects are star systems and seach_domain is MATCHES. Then, the current implementation results to this:
subcondition_matches = eval(m_subcondition)
for candidate in matches:
for it in subcondition_matches:
distance = Universe::LeastJumpsPath(it, candidate)
if distance <= m_jumps:
match = true
move candidate from matches to non_matches
The complexity of this implementation obviously is #matches * #subcondition_matches * #starlanes_in_universe, the latter because Universe::LeastJumpsPath does a BFS to find the shortest path, scanning a significant portion of the universe when candidate and it are further apart. The callgrind file (gzipped 3.1MB)
reveals 200 million node visits for just one game step in the test game attached (you can view the callgrind file after decompression with e.g. KCachegrind or WinCachegrind).
I claim that it is very easy to code a variation of Universe::LeastJumpsPath() so that:
- it respects a maximum distance and does not search beyond that (remember the first queue entry of a level to recognize the next level)
- it receives a set of origin systems instead of one system (just initialize the BFS queue with the whole set)
- it receives a set of targets instead of just one (just test for membership instead of identity compare the target)
- it returns the set of targets that respect the maximum distance (just move objects from targets set to result set when found)
Using this function instead of the two loops, we result in a complexity of #starlanes_in_range_of_subcondition_matches (for search) * log(#matches) (for membership test); for short ranges the membership test should dominate, for longer ranges we'd still have at most #starlanes_in_universe * log(#matches), which should be only a few thousand node visits in the test game.
Clearly, some adjustments are necessary to make this work for fleets. One approach I can think of is return a set of triples (origin, target, distance) and check the fleets in WithinStarlaneJumps::Eval() using these triples. Another approach might be allowing positions on starlanes as origins and targets as well, but I think that would result in more or less the same algorithm.
Personally, I would rather let planets emit (and withdraw) supply objects
within range than checking distances, but the above suggestion should already bring huge speedups even for medium distances.
P.S. callgrind file (gzipped 3.1MB)
is too large for upload, so I had to create a link to my server. Sorry!