On Sat, 30 Jan 1999 19:00:11 -0800 (PST)
diablo <diablo@best.com> wrote:
> J C Lawrence wrote:
> <many undoubtedly erudite questions snipped>
> I snipped those questions above because I wouldn't know how to
> answer them.
While Linux (from my semi-brief checking) seems to be rather weak in
this area, most commercial Unixes have extensive system statistics
reporting tools. Sometimes they are grouped under some sort of
menuing or shell program; equally often they are standalone tools:
ioperf, sar, dio, tprof, etc.
> I am guessing the problem is processor-related because I was able to
> solve most of the problem by reducing the number of calls per second
> our main mobile ai routine gets.
I do presume that you are only invoking your AI routines when there is
direct cause to, and not every XXX time units? You can usually fix a
lot here by going for a blocking model where the majority of the time
the "event" is blocked (cf process schedulers and the concept of
"blocked" or "waiting" processes).
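A minimal sketch of that blocking model in C. The mobile struct, the
wake/run split, and the AI hook are all hypothetical names invented
for illustration, not anything from your actual server: the point is
that a mobile with no stimulus sits on no list at all and costs
nothing per tick.

```c
#include <assert.h>
#include <stddef.h>

struct mobile {
    int id;
    int runnable;          /* 0 = blocked (no stimulus), 1 = needs AI */
    struct mobile *next;
};

static struct mobile *runnable_head = NULL;

/* Called only when something happens near the mobile (player enters
 * the room, combat starts, ...).  Blocked mobiles are never visited. */
void wake_mobile(struct mobile *m)
{
    if (!m->runnable) {
        m->runnable = 1;
        m->next = runnable_head;
        runnable_head = m;
    }
}

/* Per-tick AI pass: walks only the mobiles with a pending stimulus. */
int run_ai_pass(void (*ai)(struct mobile *))
{
    int ran = 0;
    while (runnable_head) {
        struct mobile *m = runnable_head;
        runnable_head = m->next;
        m->runnable = 0;
        ai(m);
        ran++;
    }
    return ran;
}

/* Demo AI hook (hypothetical): just counts invocations. */
static int ai_calls = 0;
static void demo_ai(struct mobile *m) { (void)m; ai_calls++; }
```

With most of the mob population blocked, the per-tick cost tracks the
number of active mobiles rather than the size of the world.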
> This fixed the problem we were having, which was that our tasks
> (timed events) were taking WAY too long to go off.
I presume you are handling your own event scheduling, and that you are
not using cooperative multi-tasking under your scheduler (ie an event,
prior to completion, cannot "yield" the processor to other events
awaiting execution, potentially after their own "yields")? ColdX, and
some motions in the MOO world (I never tracked if they actually
happened) benefit significantly from this latter approach.
> A 3 second task would take 20 seconds, but this only happened with a
> heavy player load. Our tasks are held in a table indexed to some
> number (some sort of ticker in linux? I don't know the proper names
> for anything). That table is polled every time the game searches for
> player input (it cycles through all our player lines constantly),
> and if a task is found to be either overdue or ready to go off, it
> goes off.
Ouch.
Cheaper:
1) Keep the table, but sort it by due time and process the table
only up to the first record that has a due date in the future. You
need to take care in your table and sort code to make it minimally
expensive (choice of sorting algorithm), and in your storage format to
ensure minimal traversal and ordering expense (no buffer copies, only
pointer copies if possible).
2) Forget about polling your players. Use blocking IO and only hit
them if there's something there waiting.
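Suggestion #1 can be sketched in a few lines of C. This is an
illustrative sorted linked list of events, not your table format; the
names are invented. Scheduling relinks pointers only (no buffer
copies), and the per-cycle scan stops dead at the first event that
isn't due yet:

```c
#include <assert.h>
#include <stddef.h>

struct event {
    long due;                     /* absolute tick when it should fire */
    void (*fire)(struct event *);
    struct event *next;
};

static struct event *queue = NULL;

/* Sorted insert by due time; pointer relinking only. */
void schedule(struct event *ev)
{
    struct event **p = &queue;
    while (*p && (*p)->due <= ev->due)
        p = &(*p)->next;
    ev->next = *p;
    *p = ev;
}

/* Fire everything due at or before `now`, then stop: the rest of
 * the queue is never traversed. */
int run_due(long now)
{
    int fired = 0;
    while (queue && queue->due <= now) {
        struct event *ev = queue;
        queue = ev->next;
        ev->fire(ev);
        fired++;
    }
    return fired;
}

/* Demo handler: records firing order by due time. */
static int fired_ids[8], fired_n = 0;
static void demo_fire(struct event *ev)
{
    fired_ids[fired_n++] = (int)ev->due;
}
```

For large event counts you'd swap the O(n) insert for a binary heap,
but the stop-at-first-future-event property is the real win either
way.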
> The problem comes when those tasks start taking too long. This of
> course causes a cascading effect as more and more tasks build up.
Aye.
> Generally, the way the author of our language/engine explained it to
> us is that the engine and language are not Object Oriented in order
> to be faster...
Eeek. OO is not inherently slow. It *can* be slow, but then so can
standard top-down structured code. Good OO designs can be quite
performant.
> ...and in order to allow for more proactivity (as opposed to
> reactivity).
This doesn't relate to the first part of your sentence. It's either a
red herring or a misunderstanding. OO has nothing particularly to do
with pro-active or reactive design considerations, or how well such
perform.
> We will try reduce the drain that things like mobile ai put on the
> processor, but it's a pretty heavy task, complete with string
> manipulation and interpreting a mini-programming language we have
> inside the game.
Other possibilities:
1) Move, where possible, string processing to reference counted
strings. Buffer copies get expensive, quickly.
2) Watch your malloc/sbrk rate (profiler). Minimise it.
3) Look into moving your language to a mini VM and then byte-coding
it. Your server can then execute the bytecode rather than
re-interpreting the scripting language every time (ie saving the
parse and interpret expenses).
4) Look into the table sorting and blocking models mentioned above
and see about minimising the number of useless calls.
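To make #3 concrete, here is a toy stack VM in C. The opcode set is
invented for the sketch; a real mini-language would compile its parse
tree to something like this once, and every later execution is a
tight dispatch loop instead of a fresh parse of the script text:

```c
#include <assert.h>

/* Hypothetical opcode set for the sketch. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

/* Execute a byte-coded program: one switch per opcode, no parsing. */
int vm_run(const int *code)
{
    int stack[32];
    int sp = 0;
    for (;;) {
        switch (*code++) {
        case OP_PUSH: stack[sp++] = *code++;            break;
        case OP_ADD:  sp--; stack[sp - 1] += stack[sp]; break;
        case OP_MUL:  sp--; stack[sp - 1] *= stack[sp]; break;
        case OP_HALT: return stack[sp - 1];
        }
    }
}
```

The compile step runs once per script change; the dispatch loop runs
per invocation, which is where your mobile AI is spending its time
now.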
> Could someone like yourself, or someone similarly erudite about such
> matters speed things up by recoding? Probably, but there's a lot of
> code, and probably a _lot_ that needs doing. We'll certainly try to
> do it, but we will also be buying the fastest processor we can
afford. I know it goes against the programmer's ethic to dismiss a
> problem by saying "buy more memory" or "buy a new processor" but
> I've committed worse sins I suppose (though my new coder starts
foaming at the mouth every time I suggest it. I'm a little frightened
> that upgrading instead of optimizing will kill him.)
There's a balance in there. Given a competent programmer (for
suitable values of "competent"), and a reasonable choice of algorithm
and implementation (nothing wonderful required, just "good enough") of
that algorithm, then yes, faster hardware is almost always the cheaper
and more rewarding answer. Programmer time is expensive. Hardware is
cheap. The counter side of course is that lazy, or pseudo-competent
programmers come to rely on this, get sloppy with algorithm and
implementation choices, and then try and answer everything with faster
hardware.
It's a fuzzy line. As I make my living writing code, and in
particular from cleaning up others' code, I tend to be a little
biased here,
despite my best efforts not to be. My general rules of thumb:
1) Ask the programmer what the performance constraints on the system
are, why they are what they are, and why that is justified. He should
know. He might not know with any great accuracy, or even have checked
into it much, but he should at least have some sort of idea as to what
the limiting factors are going to be etc.
2) Check into those answers if you're a programmer, or have your
programmer check into them. Usually no more than a few hours or
perhaps a day is needed. You'll either (hopefully) find that your
programmer's ideas were correct, or you'll find the real answer.
3) Get with the programmers in question and the people who know the
basic design and reasons for the design for the system (in your case
this is probably just the programmer; in corporation-land, you can
end up with a dozen people). Find out if the performance constraints
as found are "reasonable" or are something reasonably easily
corrected/changed.
4) Take their answer, and if it is reasonable, find out what the
next performance constraint is going to be.
5) Now evaluate whether it is worth doing what was found in #3.
>> Unfortunately this whole area is a bit of an art. Very very tiny
>> changes made to systems can have massive performance returns (or
>> penalties).
> In that case, consider me the Thomas Kincaide of programming.
<kof>
I mentioned above changing the stripe size on a RAID array. The
server in question was a standard news server (inn). We found, thru
simple happy accident, that raising the stripe size from less than
1Meg to 1Meg or more, suddenly more than doubled system thru-put for
news feeds. We knew that disk IO was a bottleneck, but had never
imagined that the response/latency characteristics of the RAID array
were driving the limitation. As we raised the stripe size up past 1Meg
(working on the basis of making the basic stripe size larger than the
majority of single articles, and thus "encouraging" any single article
to only reside on one stripe on one disk in the array), performance
continued to grow (diminishing returns of course).
Tiny change. Huge reaction.
If the story is correct (passed on by a mate at SGI), in an earlier
version of IRIX they killed some "unnecessary" buffer copies in the
TCP/IP stack (as I understand it, just a single extra copy per
packet). Stack performance gained by over 35%.
Tiny change. Huge reaction.
> Do profilers have to be written for specific languages?
Typically they come with compilers.
> My code is written in a language written and maintained by a friend
> of mine. Not marketed.
And the compiler base?
--
J C Lawrence Internet: claw@kanga.nu
----------(*) Internet: coder@kanga.nu
...Honorary Member of Clan McFud -- Teamer's Avenging Monolith...