Geoff the Medio wrote:
I wonder if there's a workaround using threads with variable priority within the AI clients?
Good question. The first problem I see is how thread priority and process priority relate. If thread priority only determines how much CPU time a thread gets relative to other threads of the same process, then this won't help.
I googled a bit, but the info I found provided no conclusive answer to that. There is a standard on Unix, POSIX threads, that should ensure sufficient portability, but even that isn't 100%: there seem to be implementation differences even between different Linux distros. Boost offers a thread library we could use, but I didn't find anything in it for manipulating thread priority.
This whole scheduler/process priority thing seems to be implementation dependent on Unix, at least to a certain extent. Apparently manipulating the "nice value" of processes can produce (slightly?) different results on OSX and Linux, or even on different Linux distros (maybe depending on the kernel version used?). I have to admit, reading through all that info was a bit confusing and not as enlightening as I had hoped.
Besides all that, I wonder whether the current implementation of the AI clients is very effective in terms of performance versus usage of system resources. Launching 12 AI client processes on a dual-core machine isn't very effective performance-wise, yet it still uses ~23 MB per client and forces the human client to compete for CPU time with 12 concurrently running processes. Having all the AI clients run as threads in one process would be far more resource-efficient and give us more flexibility in managing their CPU time consumption (for example, you could query the number of CPU cores present on the system and only allow a corresponding number of clients to run simultaneously, etc.). This would of course require a major rewrite of the AI client code and more complex AI client management. The question is whether the expected gains justify the costs in programming effort...
Let's see how the current solution (AI process priority lowered all the time on OSX and Linux) works. If the AI is sufficiently responsive despite the lowered priority, we could just leave it at that, at least for the time being.
Have you observed any client UI hangs / nonresponsiveness on Linux or OSX systems with lots of busy AIs processing their turns right away? If the problem only occurs on Windows, then it probably only needs to be fixed on Windows...
The problem here is that the Windows machines at my disposal and my Macs aren't comparable. I can't say anything in regard to Linux; I haven't even tried to build FO on Linux.
On my Macs I have to start games with 12 AIs in order to notice reduced responsiveness of the UI when all of them are busy (which I achieved with my special test PythonAI). It's not very dramatic, but it's definitely noticeable. My Macs are Core i5/i7 machines. My Windows boxes are somewhat older dual-core machines (the better one a six-year-old Core Duo). On these machines the effect of several busy AI processes is much more evident. Unfortunately I'm not able to run FO in the Win7 VM on my Mac due to some incompatibilities between Ogre and the OpenGL drivers provided by VirtualBox.
So, as far as I can tell from my own observations, the severity depends more on the hardware than on the OS you're using.
I'm not clear on what the problem is here... What is the consequence of a submit orders message hanging until all clients are done? The turn can't proceed until all orders are ready... Is this a UI hang issue? I don't see that on Windows even with busy-delaying AIs and autosave on...
This "hang" isn't obvious when playing the game. I only noticed something was amiss when I did some test runs to check whether my priority switching was working as intended. As it turned out, it wasn't. I monitored the process priorities of the AI clients, and they didn't change after I hit "End turn". So I placed several debug log messages in the code to see what was going on and discovered that the condition for raising the AIs' process priority was never met.
Originally I had placed the check for raising the AIs' process priority in the WaitingForTurnEnd.CheckTurnEndConditions event handler in the server FSM. What happened? After I hit "End turn", WaitingForTurnEnd.TurnOrders was triggered (as it should be), which calls post_event(CheckTurnEndConditions()) before it returns. You'd expect that event to be handled immediately afterwards, but that didn't happen. The event "hung" until all AIs had submitted their orders (and of course a CheckTurnEndConditions event was posted for each AI); only then was the first CheckTurnEndConditions event processed. By that time all AIs had of course finished their turns, and although their process priority was finally raised, it got lowered again immediately because server-side turn processing started. The remaining CheckTurnEndConditions events weren't handled properly and ended up triggering the "unconsumed_event" handler of the server FSM (resulting in the respective log messages, one for each AI process).
Checking the logs, I noticed what looked like debug log entries from the save game process appearing before the order submission log entries of the AIs, so I started to suspect the autosave feature might be to blame. I turned off autosave, tried again, and everything worked just fine.
Now this is where things start to get very odd, because after taking a look at the code this shouldn't be possible. Autosave is triggered during the TurnUpdate event, and that event can't possibly occur before all clients have submitted their orders and server-side turn processing has started. Maybe I misread the log files? What I'm going to do is try to reproduce this behaviour and examine the logs more carefully, then post a more comprehensive bug report and attach the logs. EDIT: Done, see here
Finally, a question: who will remove the entry on the programming work page? Should I do that myself, or leave that to someone in charge of that list?