[RESOLVED] Condition::Contains implementation

Programmers discuss here anything related to FreeOrion programming. Primarily for the developers to discuss.

Moderator: Committer

Message
Author
User avatar
cami
Space Dragon
Posts: 411
Joined: Tue Sep 20, 2011 10:32 pm
Location: Halle (Saale)

Re: Condition::Contains implementation

#16 Post by cami »

Update: I thought about it for a while and in fact you're wrong. The condition code CAN know which set(s) is/are wanted. In the pointer version I posted, it can tell so because the "wanted" set is not NULL. In the templated version, the code wouldn't specifically check which set is wanted, but it would implicitly depend on which parameters are null_container<> and which are ObjectSet, propagating this via template instantiation. However, the templated version is problematic here because virtual methods cannot be templates, thus for the virtual method, we would have to use the pointer signature (but we can resort to templated calls class-internally). Note that all non-boolean condition operators only use the matches result from subcondition evaluations and thus can also use the simple signature returning only one result set. "And" can do equally well with only one result set as it can with both available. The only remaining operator is "Not", which in fact only propagates its arguments further down, switching matches and non-matches (in both the destructive and the non-destructive version), so it doesn't matter whether one of them is NULL to "Not".

The effect of this is that your preprocession step is done "on the fly" on the call stack. Each caller propagates the "wanted set" down to its callee simply by passing the correct parameters. So in fact, there is no situation where we ever really need the destructive version.
Yesterday, we were still on the brink. Fortunately, today we have come one step further.

tzlaine
Programming Lead Emeritus
Posts: 1092
Joined: Thu Jun 26, 2003 1:33 pm

Re: Condition::Contains implementation

#17 Post by tzlaine »

Hmm, well I'm definitely wrong about Not. I still feel like we need both sets to be kept in sync. I don't remember the exact reason why, but I looked into doing something really similar to what you've proposed a few weeks ago, and I concluded at the time that it was not currently possible. I may have been mistaken then as well. I'd really want to see us add some regression tests before making such a change though, so we can verify that we haven't borken something in doing so.

User avatar
Geoff the Medio
Programming, Design, Admin
Posts: 13603
Joined: Wed Oct 08, 2003 1:33 am
Location: Munich

Re: Condition::Contains implementation

#18 Post by Geoff the Medio »

tzlaine wrote:Good point! We can probably use std::list after all.
We could also use a vector without filling mostly with null, by treating it like a list, and swapping the element to be erased with the last, and then just popping the last element off the list. This would make copying fast, while keeping fast erase and insert (by push_back).

tzlaine
Programming Lead Emeritus
Posts: 1092
Joined: Thu Jun 26, 2003 1:33 pm

Re: Condition::Contains implementation

#19 Post by tzlaine »

Even better.

tzlaine
Programming Lead Emeritus
Posts: 1092
Joined: Thu Jun 26, 2003 1:33 pm

Re: Condition::Contains implementation

#20 Post by tzlaine »

Carsten, now that we've talked this through a bit, would you care to work on a patch?

User avatar
cami
Space Dragon
Posts: 411
Joined: Tue Sep 20, 2011 10:32 pm
Location: Halle (Saale)

Re: Condition::Contains implementation

#21 Post by cami »

I've been experimenting simultaneously with the discussion.
I'm currently using an adaptor template that maps an underlying Sequence or Unique associative container to a subset of the unique associative container interface that we're actually using, providing fast insertion/deletion/iteration but no lookup.
If you don't want the adapter code, I can adjust the code to directly use a specific container like std::list or std::vector.
Yesterday, we were still on the brink. Fortunately, today we have come one step further.

tzlaine
Programming Lead Emeritus
Posts: 1092
Joined: Thu Jun 26, 2003 1:33 pm

Re: Condition::Contains implementation

#22 Post by tzlaine »

KISS. Let's use vector everywhere.

User avatar
cami
Space Dragon
Posts: 411
Joined: Tue Sep 20, 2011 10:32 pm
Location: Halle (Saale)

Re: Condition::Contains implementation

#23 Post by cami »

Ok, I assume this means you want std::vector hard-coded. This will require a lot more work. In the meantime, here's my experimental results (gzipped, 3.2MB) using an adaptor on std::list:

-I loaded my test game (without Geoffs effect group) and processed two turns, like in the last vanilla SVN callgrind (gzipped, 3.1MB).
-overall running time was reduced by ~40%
-insertion running time was reduced by ~85%
-erasing running time was reduced by ~75%
-although the adaptor used linear-time erase(key) and find(key) implementations, no new bottlenecks were introduced
-implementing the adaptor showed that we actually use some of std::set functionality, but do not benefit from it in overall performance
Attachments

[The extension patch has been deactivated and can no longer be displayed.]

Yesterday, we were still on the brink. Fortunately, today we have come one step further.

tzlaine
Programming Lead Emeritus
Posts: 1092
Joined: Thu Jun 26, 2003 1:33 pm

Re: Condition::Contains implementation

#24 Post by tzlaine »

cami wrote:Ok, I assume this means you want std::vector hard-coded.
Yes.
This will require a lot more work. In the meantime, here's my experimental results (gzipped, 3.2MB) using an adaptor on std::list:
Can you time it as below with std::vector instead of std::list, using the swap-with-back()-before-pop_back() erase trick? This will tell us whether we should be using std::list or std::vector.
-I loaded my test game (without Geoffs effect group) and processed two turns, like in the last vanilla SVN callgrind (gzipped, 3.1MB).
-overall running time was reduced by ~40%
-insertion running time was reduced by ~85%
-erasing running time was reduced by ~75%
-although the adaptor used linear-time erase(key) and find(key) implementations, no new bottlenecks were introduced
-implementing the adaptor showed that we actually use some of std::set functionality, but do not benefit from it in overall performance
Very nice!

User avatar
cami
Space Dragon
Posts: 411
Joined: Tue Sep 20, 2011 10:32 pm
Location: Halle (Saale)

Re: Condition::Contains implementation

#25 Post by cami »

tzlaine wrote:Can you time it as below with std::vector instead of std::list, using the swap-with-back()-before-pop_back() erase trick? This will tell us whether we should be using std::list or std::vector.
Yep, but that needs a little more code specialized for random access containers. Coming soon.
Yesterday, we were still on the brink. Fortunately, today we have come one step further.

User avatar
cami
Space Dragon
Posts: 411
Joined: Tue Sep 20, 2011 10:32 pm
Location: Halle (Saale)

Re: Condition::Contains implementation

#26 Post by cami »

Warning: Since ObectSet no longer will be an associative container, the algorithm of ConditionBase::Eval(), and many similar algorithms, have to be changed:

Code: Select all

 {
     ObjectSet& from_set = search_domain == MATCHES ? matches : non_matches;
     ObjectSet& to_set = search_domain == MATCHES ? non_matches : matches;
-    ObjectSet::iterator it = from_set.begin();
-    ObjectSet::iterator end_it = from_set.end();
-    for ( ; it != end_it; ) {
-        ObjectSet::iterator temp = it++;
-        bool match = Match(ScriptingContext(parent_context, *temp));
+    for (ObjectSet::iterator it = from_set.begin(); it != from_set.end(); ) {
+        bool match = Match(ScriptingContext(parent_context, *it));
         if ((search_domain == MATCHES && !match) || (search_domain == NON_MATCHES && match)) {
-            to_set.insert(*temp);
-            from_set.erase(temp);
+            to_set.insert(*it);
+            it = from_set.erase(it);
+        } else {
+            ++it;
         }
     }
 }
  1. Especially in case of std::vector the next item after the erased one will be at "it", not "++it" (or even something completely different when the vector is moved)
  2. likewise, from_set.end() will change when erasing elements
Last edited by cami on Thu Oct 06, 2011 8:16 pm, edited 1 time in total.
Yesterday, we were still on the brink. Fortunately, today we have come one step further.

User avatar
cami
Space Dragon
Posts: 411
Joined: Tue Sep 20, 2011 10:32 pm
Location: Halle (Saale)

Re: Condition::Contains implementation

#27 Post by cami »

Surprisingly, std::vector turned out a little more expensive than std::list, at least with my implementation. Here are the results (percent)

Code: Select all

_container_|_total_|_insert_|_erase_
       set | 100.0 |  43.4  |  4.1
    vector |  64.2 |   9.2  |  3.2
      list |  60.0 |   7.2  |  1.0
100% means full running time of loading the test game and processing two turns with vanilla SVN code. All other values are relative to this time. Erase means erase by iterator. I omitted find(key) and erase(key) as they practically don't contribute to running time at all. std::list is slightly but clearly superior in the test.

callgrind std::set (gzipped, 3.1MB)
callgrind std::list (gzipped, 3.2MB)
callgrind std::vector (gzipped, 3.2MB)

I suppose std::vector suffers from the fact that its buffer isn't preallocated, so it has to copy its entire contents when it gets full.
Attachments

[The extension patch has been deactivated and can no longer be displayed.]

Yesterday, we were still on the brink. Fortunately, today we have come one step further.

tzlaine
Programming Lead Emeritus
Posts: 1092
Joined: Thu Jun 26, 2003 1:33 pm

Re: Condition::Contains implementation

#28 Post by tzlaine »

That's really interesting that list is faster than vector. I wonder if a few calls to vector::reserve() might fix that? Do you have any idea what the average size of these vectors is?

If that's too much trouble to investigate, I'd settle for a std::list implementation without the set adapter. :)

tzlaine
Programming Lead Emeritus
Posts: 1092
Joined: Thu Jun 26, 2003 1:33 pm

Re: Condition::Contains implementation

#29 Post by tzlaine »

Looking at those numbers again. it seems that all of the difference between list and vector is coming from insert and erase, not copying. I also notice that your adapter is doing a lot more than "push_back()" and "*it = back(); pop_back()", respectively. This leads me to believe that an adapter-free implementation using vector might be faster than both the adapter-based ones.

User avatar
Geoff the Medio
Programming, Design, Admin
Posts: 13603
Joined: Wed Oct 08, 2003 1:33 am
Location: Munich

Re: Condition::Contains implementation

#30 Post by Geoff the Medio »

cami wrote:I suppose std::vector suffers from the fact that its buffer isn't preallocated, so it has to copy its entire contents when it gets full.
In addition to tzlaine's comment about optimal use of vector without the adaptor, I'd like to know if using vector::reserve to allocate sufficient memory to hold all potential pointers that might be pushed into the vector would help. In particular, whenever making a vector containing all the UniverseObjects in the Universe, and an empty vector into which Eval might move some of those UniverseObjects, immediately reserve space in the initially-empty vector to be able to hold all the objects with reallocation.

Post Reply