[PATCH] WithinStarlaneJumps implementation

Programmers discuss here anything related to FreeOrion programming. Primarily for the developers to discuss.

Moderator: Committer

Message
Author
User avatar
cami
Space Dragon
Posts: 411
Joined: Tue Sep 20, 2011 10:32 pm
Location: Halle (Saale)

Re: [PATCH] WithinStarlaneJumps implementation

#16 Post by cami »

I was able to get rid of all these costs by rather simple means

- add_vertex<> and Universe::WithinJumpsImpl() now use integer vertex descriptors and std::vector when possible
- EffectsGroup::GetTargetSet now directly passes the TargetSet to Condition::Eval using pointer type pun when this is safe
(conditions checked are: TargetSet and ObjectSet are of type std::set<T1> and std::set<T2>, T1 is convertible to T2, T1 and T2 have
the same size, and T1() == T2() returns true)
- Universe::GetEffectsAndTargets(EffectsTargetsCausesMap&, const std::vector<int>&) constructs an Effect::TargetSet once and passes this down instead of letting Universe::StoreTargetsAndCausesOfEffectsGroups reconstruct this set on every call
- Universe::StoreTargetsAndCausesOfEffectsGroups restores all_potential_targets by moving pointers back from targets instead of creating a new copy of it on every iteration

Im currently testing and bugfixing these modifications

Edit: I know that the current implementations don't use std::set's efficient membership tests much, if at all, but personally I dont think it's a bad idea to have them. However, I agree that boost::unordered_set might be a better choice, if it is taken the pointer type pun precautions above would have to be modified to check for boost::unordered_set instead. Unfortunately, there is no way to generally check whether two types are the same container template applied to different value types without naming all potential container templates individually.
Yesterday, we were still on the brink. Fortunately, today we have come one step further.

User avatar
Geoff the Medio
Programming, Design, Admin
Posts: 13587
Joined: Wed Oct 08, 2003 1:33 am
Location: Munich

Re: [PATCH] WithinStarlaneJumps implementation

#17 Post by Geoff the Medio »

cami wrote:I know that the current implementations don't use std::set's efficient membership tests much, if at all, but personally I dont think it's a bad idea to have them.
Having the efficient searching isn't a problem, but needing to maintain a sorted set or dealing with hashing overhead in order to get them might be...
However, I agree that boost::unordered_set might be a better choice...
Even when std::set was used, there was no need to search through the ObjectSet or TargetSet efficiently, so I don't think that's a relevant reason to prefer unordered_set over vector.

Using vector would be more cumbersome for randomly removing objects from the container, though. The need to do this efficiently would require swapping with the end, and that would have the effect of partly randomizing the order of the contents, removing a potential benefit of using vectors (which would have been to maintain a consistent ordering of objects, which might be useful in some situations).

However, any searching in containers for objects would best be done by int object ID rather than pointer value (for safety reasons like how most / all UI classes store object ids instead of pointers, and because of issues with how ObjectMap works which mean there could be multiple instances of UniverseObject around with the same ID and representing the same object, but at different memory addresses). So, you'd probably want to use unordered_map instead unordered_set in order to index and sort by object ID instead of pointer value.

User avatar
cami
Space Dragon
Posts: 411
Joined: Tue Sep 20, 2011 10:32 pm
Location: Halle (Saale)

Re: [PATCH] WithinStarlaneJumps implementation

#18 Post by cami »

Well, the object stores the ID doesnt it? so you can easily sort by ID using a comparison functor that uses the ID stored with the object data.
If it doesn't store the object ID, of course, a map is the only choice.

However, the problem isn't really the type of container chosen, but the fact that containers are copied, reconstructed or converted tens of thousands of times, often when containing the whole universe. Avoiding these operations will tremendously outweigh any container change.

Feel free to switch the container type, but Ill focus on avoiding copying, conversions and reconstructions.
Yesterday, we were still on the brink. Fortunately, today we have come one step further.

User avatar
cami
Space Dragon
Posts: 411
Joined: Tue Sep 20, 2011 10:32 pm
Location: Halle (Saale)

Re: [PATCH] WithinStarlaneJumps implementation

#19 Post by cami »

I hope you are still with me =)
I risked some trouble and spent more time on the patch this weekend.
Unfortunately, I had two very subtle bugs (a missing & in a parameter list, and a member initialization race) which cost valuable time.
Fortunately, I was able to cut down the turn computation time by another factor of three (10 seconds now, several minutes initially). Still, I believe this could be cut down to less than a second, though I may begin to hit the wall where small modifications (that I dont want to go beyond) can't improve much anymore.

Current patch state attached, valgrind running, and back to the work I get paid for now.

Regards, Carsten

Edit: Thanks for accepting my previous work! I updated the patch to r4295.
Attachments

[The extension patch has been deactivated and can no longer be displayed.]

camis-speedups.patch.gz
various speedups for freeorion r4280
(12.04 KiB) Downloaded 55 times
Yesterday, we were still on the brink. Fortunately, today we have come one step further.

User avatar
cami
Space Dragon
Posts: 411
Joined: Tue Sep 20, 2011 10:32 pm
Location: Halle (Saale)

Re: [PATCH] WithinStarlaneJumps implementation

#20 Post by cami »

Great! As the latest callgrind (gzipped, 3.0 MB) reveals, we're almost done. The only remaining cycle-eater is SystemPathing::SystemGraphIndex, which loops over the whole graph to find the node with a given system id. Ill see if I can solve this using a caching strategy (stale entries can easily be identified by checking whether the node really has that system id before returning) when I get to it

Edit: Implemented the cache and rebuilt with function inlining re-enabled (had disabled this for more precise callgrind results). I can turn-step the example game in ~5 seconds now on my machine, which is still not as good as I hoped for, but there are no "easy" optimizations left as far as I can see. So here goes the final patch. Thanks and enjoy =)

Hint: I noticed being unable to save any ship designs. Design window has always had its glitches (for instance using backspace in text fields rendered them immutable), but I have been able to save new designs last week. This problem also occurs in vanilla r4296, so I started a bisect at r4279.

OK 4279 ... 4288 ! 4289 4290 ... 4292 ... 4296 BROKEN

broken since r4289. Seems to be an incomplete refactoring, unrelated to this patch. Applied patch to r4288 and successfully checked that it still works.
Attachments

[The extension patch has been deactivated and can no longer be displayed.]

Yesterday, we were still on the brink. Fortunately, today we have come one step further.

tzlaine
Programming Lead Emeritus
Posts: 1092
Joined: Thu Jun 26, 2003 1:33 pm

Re: [PATCH] WithinStarlaneJumps implementation

#21 Post by tzlaine »

I have so far been unable to reproduce the several-minutes-per-turn-update in the OP by playing through the game myself. Also, the save attached to the OP does not work for me (I just gunzip'd it -- should I have done something else?). Without an A-B timing comparison, it's hard to evaluate these patches against much smaller changes that I would like to try instead. Can either of you tell me how to get into this minutes-long-update mode?

User avatar
Geoff the Medio
Programming, Design, Admin
Posts: 13587
Joined: Wed Oct 08, 2003 1:33 am
Location: Munich

Re: [PATCH] WithinStarlaneJumps implementation

#22 Post by Geoff the Medio »

tzlaine wrote:Also, the save attached to the OP does not work for me (I just gunzip'd it -- should I have done something else?).
How does the save not work... Just an error message about unable to load the save? What version of boost are you building with? If it's older than was used to make the save, there might be problems loading saves... I don't think there have been any save format changes since SVN 4280 so I don't think a save from around would be otherwise incompatible with the SVN version.

User avatar
cami
Space Dragon
Posts: 411
Joined: Tue Sep 20, 2011 10:32 pm
Location: Halle (Saale)

Re: [PATCH] WithinStarlaneJumps implementation

#23 Post by cami »

I'm building against boost 1.46.1.
Some of my changes have already been committed, so if you want a full comparison you need to checkout revision 4279 or older.
The provided game is a 500 star galaxy with four species, ~40 colonized systems and most of the tech tree researched and in use.

Note that my changes are relatively small. The patches are only a bit large because boost doesn't support BFS starting at multiple origins, so I had to implement an adaptor to the system graph that allows adding a virtual starting vertex. Boost didn't help too much here either so the adaptor contains quite some boilerplate. The rest is just a generic multi-source multi-target version of LeastumpsPath() and hooking it into WithinStarlaneJumps and some subtle modifications to reduce copying.

The effects can also very easily compared using the callgrind files ... we started at 200 million calls to operator new() for instance, and in the last file its about 4 million IIRC. The final patch probably even has a little less. Or you can compare the total cycles, as all callgrinds load the exact same game file and execute it for one turn without modifications, on the same machine with the same compiler flags used.
Yesterday, we were still on the brink. Fortunately, today we have come one step further.

tzlaine
Programming Lead Emeritus
Posts: 1092
Joined: Thu Jun 26, 2003 1:33 pm

Re: [PATCH] WithinStarlaneJumps implementation

#24 Post by tzlaine »

It just says "Error reading save file." I'll rebuild with boost 1.47 and see if that fixes things. Geoff, does this save file work for you with a Boost 1.44 build?

tzlaine
Programming Lead Emeritus
Posts: 1092
Joined: Thu Jun 26, 2003 1:33 pm

Re: [PATCH] WithinStarlaneJumps implementation

#25 Post by tzlaine »

cami wrote:I'm building against boost 1.46.1.
Note that my changes are relatively small. The patches are ...
I understand you changes, I'm sure you'll understand that reproducability is pretty important before we complete an evaluation of a patch.

Also, I think that while some parts of your patch are quite clever, they are also a bit too complicated. I think some/most of the gains can be realized via some much simpler and more maintainable changes. But to test whether my hypothesis is accurate or not, I need to reproduce your original failure mode, and compare our two approaches as possible fixes.

User avatar
Geoff the Medio
Programming, Design, Admin
Posts: 13587
Joined: Wed Oct 08, 2003 1:33 am
Location: Munich

Re: [PATCH] WithinStarlaneJumps implementation

#26 Post by Geoff the Medio »

tzlaine wrote:It just says "Error reading save file." I'll rebuild with boost 1.47 and see if that fixes things. Geoff, does this save file work for you with a Boost 1.44 build?
I don't have a boost 1.44 build, but I've tried loading it with v0.3.17 and the latest SVN built with 1.46.1 and booth have the same error. In the server logs, it says there is an exception thrown when trying to deserialize: "invalid signature".

After a bit of googling, it appears that boost::serialization binary archives aren't portable between OSes. I switched to binary serialization a while ago to reduce the file sizes, but apparently that has broken cross-platform serialization...

I assume you both were running on Linux, though... I don't know if the portability issue would arise between different distros, although it seems plausible.

tzlaine
Programming Lead Emeritus
Posts: 1092
Joined: Thu Jun 26, 2003 1:33 pm

Re: [PATCH] WithinStarlaneJumps implementation

#27 Post by tzlaine »

Geoff the Medio wrote:
tzlaine wrote:It just says "Error reading save file." I'll rebuild with boost 1.47 and see if that fixes things. Geoff, does this save file work for you with a Boost 1.44 build?
I don't have a boost 1.44 build, but I've tried loading it with v0.3.17 and the latest SVN built with 1.46.1 and booth have the same error. In the server logs, it says there is an exception thrown when trying to deserialize: "invalid signature".
I rebuilt using Boost 1.47, and I can load the attached save now. But doesn't FO use Boost 1.44 currently?
After a bit of googling, it appears that boost::serialization binary archives aren't portable between OSes. I switched to binary serialization a while ago to reduce the file sizes, but apparently that has broken cross-platform serialization...

I assume you both were running on Linux, though... I don't know if the portability issue would arise between different distros, although it seems plausible.
I wouldn't worry too much about this. For platforms with the same type-sizes (sizeof(some_type) is the same for both) and the same endiannesses, the save files should be portable. And they should even be portable across 32- vs. 64-bit systems, as long as we don't try to serialize longs.

tzlaine
Programming Lead Emeritus
Posts: 1092
Joined: Thu Jun 26, 2003 1:33 pm

Re: [PATCH] WithinStarlaneJumps implementation

#28 Post by tzlaine »

Some timings. This was done on a 4-core machine. Each processor is reported as "Intel(R) Core(TM) i7 CPU Q 740 @ 1.73GHz". I'm running 64-bit Linux. I built using GCC 4.5 with an optimized build. I ran FO like this:

time ./freeorion --load ~/.freeorion/save/FreeOrion_Eoerg_Arghjoh_0407.sav --auto-advance-first-turn

then hit Ctrl-c to kill it after the end-turn button was enabled. Not the most scientific approach, but for the gross numbers below it works out alright I think. I recorded only the "user" time. IOW, I/O is left out; this should just reflect CPU time.

I did 2 times for each, to make sure I didn't record an outlier.

r4282: 21.0 21.2
r4299: 19.3 19.6

r4299 + Zach: 18.9 18.7
r4299 + Cami: 5.4 5.3

The first two lines are for those revisions of FO before and after Carsten's first patch.

The second pair is for my proposed very simple changes (one of which I found Carsten was already doing after a second look at his code ;)), which did not make a significant dent, and Carsten's, which really did. I still think one of my changes could replace some of Carsten's patch for an even faster result. More on that later.

Carsten, as you can see above, I have been unable to reproduce all your results. The optimized version runs in ~5s for me as for you, but I'm getting ~20s for the vanilla FO you first started investigating -- not minutes. I'm really curious what might be happening here, which is why I posted such detailed info on my test system above. Any idea why we're seeing such radically different things out of FO rev. 4282?

Again though, nice work! I have some comments on the patch, those will come after I've tried integrating some of my ideas into your existing work (those efforts may or may not moot my comments).

User avatar
Geoff the Medio
Programming, Design, Admin
Posts: 13587
Joined: Wed Oct 08, 2003 1:33 am
Location: Munich

Re: [PATCH] WithinStarlaneJumps implementation

#29 Post by Geoff the Medio »

tzlaine wrote:But doesn't FO use Boost 1.44 currently?
v0.3.17 uses Boost 1.46.1 on Windows. I don't know what's used on OSX.
I wouldn't worry too much about this. For platforms with the same type-sizes (sizeof(some_type) is the same for both) and the same endiannesses, the save files should be portable. And they should even be portable across 32- vs. 64-bit systems, as long as we don't try to serialize longs.
Is that true starting with 1.47? Because testing by trying to load cami's save - created on Linux with 1.46.1 and loaded with 1.46.1 on Windows - seems to indicate portability is not a sure thing...

tzlaine
Programming Lead Emeritus
Posts: 1092
Joined: Thu Jun 26, 2003 1:33 pm

Re: [PATCH] WithinStarlaneJumps implementation

#30 Post by tzlaine »

Geoff the Medio wrote:
tzlaine wrote:But doesn't FO use Boost 1.44 currently?
v0.3.17 uses Boost 1.46.1 on Windows. I don't know what's used on OSX.
I wouldn't worry too much about this. For platforms with the same type-sizes (sizeof(some_type) is the same for both) and the same endiannesses, the save files should be portable. And they should even be portable across 32- vs. 64-bit systems, as long as we don't try to serialize longs.
Is that true starting with 1.47? Because testing by trying to load cami's save - created on Linux with 1.46.1 and loaded with 1.46.1 on Windows - seems to indicate portability is not a sure thing...
When is it ever? ;) If that doesn't work, maybe we should fix it.

Post Reply