inlining
Here is a list of usenet articles about inlining that just appeared in
comp.compilers.
--
Bruce Momjian | 830 Blythe Avenue
maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026
+ If your life is a hard drive, | (610) 353-9879(w)
+ Christ can be your backup. | (610) 853-3000(h)
> Here is a list of usenet articles about inlining that just appeared in
> comp.compilers.
Good discussion and I am happy to see you post it. I follow comp.arch
regularly and there are often very interesting hints there too amid
the dross. Actually it is not a high traffic group except for the
occasional "sunspot cycle".
> optimizing compiler. The code placement tool (a la Pettis & Hansen)
> needs to be inlining-aware. Code growth is not that big of a problem
> in many codes. Many very large codes have relatively small dynamic hot
> spots. Database codes are a notable exception. Another big downside
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Database codes are the mother's heartbreak of both the compiler design and
hardware architecture communities. They blow up caches, there are never
5 instructions in a row before a branch, they whack at the whole working
set (which blows up the TLB and bus), and they have poor locality, so when they
miss cache you can't fix it with bandwidth. Everything depends on everything,
so you can't parallelize at small scales. Hopeless, really.
Btw, I sure wish someone would comment on the S_LOCK analysis even if only
to tell me not to make such long posts as it wastes bandwidth. Or was it just
too long to read?
-dg
David Gould dg@illustra.com 510.628.3783 or 510.305.9468
Informix Software (No, really) 300 Lakeside Drive Oakland, CA 94612
"Don't worry about people stealing your ideas. If your ideas are any
good, you'll have to ram them down people's throats." -- Howard Aiken
> Btw, I sure wish someone would comment on the S_LOCK analysis even if only
> to tell me not to make such long posts as it wastes bandwidth. Or
> was it just too long to read?
I read it all! Great analysis of the situation and not a waste, IMHO.
One comment...when you ran the tests in succession, could the cache be
responsible for the timing groupings in the same test? Should a
little program be run in between to "flush" the cache full of garbage
so each real run will miss? Seem to recall a little program, in CUJ,
I think, that set up a big array and then iterated over it to trash
the cache.
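A minimal sketch of such a cache-trashing helper, assuming an 8 MB buffer and a 64-byte line stride (both sizes and all names here are my guesses, not taken from the CUJ program):

```c
#include <stddef.h>

/* Sketch of the cache trasher described above.  Pick TRASH_SIZE
 * comfortably larger than the biggest data cache on the test machine;
 * 8 MB is an assumption, not a measured value. */
#define TRASH_SIZE (8 * 1024 * 1024)

static char trash_buf[TRASH_SIZE];
volatile long cache_trash_sink;   /* keeps the compiler from eliding the loop */

void trash_data_cache(void)
{
    long sum = 0;
    for (size_t i = 0; i < TRASH_SIZE; i += 64)   /* touch one byte per line */
        sum += trash_buf[i];
    cache_trash_sink = sum;
}
```

Calling this between timed runs should leave the timed code starting from a cold data cache. Note it only evicts the data cache, not the instruction cache.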
Darren aka stuporg@erols.com
>> Btw, I sure wish someone would comment on the S_LOCK analysis even if only
>> to tell me not to make such long posts as it wastes bandwidth. Or
>> was it just too long to read?
> I read it all! Great analysis of the situation and not a waste, IMHO.
> One comment...when you ran the tests in succession, could the cache be
> responsible for the timing groupings in the same test? Should a
> little program be run in between to "flush" the cache full of garbage
> so each real run will miss? Seem to recall a little program, in CUJ,
> I think, that set up a big array and then iterated over it to trash
> the cache.
Yes, that is a good point. When testing in a loop, the function is in
the cache, while in normal use, the function may not be in the cache
because of intervening instructions.
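This hot-versus-cold effect can be illustrated with a toy harness that times the same function back to back (hot) and with a cache-evicting walk between calls (cold). All names, sizes, and the workload here are invented for illustration; this is not the actual S_LOCK test:

```c
#include <time.h>

#define EVICT_SIZE (8 * 1024 * 1024)      /* assumed larger than the D-cache */
static char evict_buf[EVICT_SIZE];
static volatile long harness_sink;        /* defeats dead-code elimination */

static long toy_work(void)                /* stand-in for the timed function */
{
    long s = 0;
    for (int i = 0; i < 1000; i++)
        s += i * i;
    return s;
}

static void evict(void)                   /* walk a big buffer between calls */
{
    long s = 0;
    for (long i = 0; i < EVICT_SIZE; i += 64)
        s += evict_buf[i];
    harness_sink = s;
}

/* Ticks for n hot calls: the function's code and data stay cached. */
long time_hot(int n)
{
    long r = 0;
    clock_t t0 = clock();
    for (int i = 0; i < n; i++)
        r += toy_work();
    harness_sink = r;
    return (long)(clock() - t0);
}

/* Ticks for n cold-ish calls: the data cache is trashed before each one.
 * The eviction itself sits inside the timed region, so this overstates
 * the gap; a finer harness would time only the toy_work() calls. */
long time_cold(int n)
{
    long r = 0;
    clock_t t0 = clock();
    for (int i = 0; i < n; i++) {
        evict();
        r += toy_work();
    }
    harness_sink = r;
    return (long)(clock() - t0);
}
```

On most machines time_cold reports noticeably more ticks than time_hot for the same call count, which is exactly the gap between loop benchmarks and normal use.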
At 4:50 AM -0700 6/12/98, Stupor Genius wrote:
> One comment...when you ran the tests in succession, could the cache be
> responsible for the timing groupings in the same test? Should a
> little program be run in between to "flush" the cache full of garbage
> so each real run will miss? Seem to recall a little program, in CUJ,
> I think, that set up a big array and then iterated over it to trash
> the cache.
Obviously I'm commenting at second hand, and perhaps this problem is
handled properly, but:
Many CPUs have independent data and instruction caches. Setting up a big
array and moving through it will flush the data cache, but most benchmark
anomalies are likely to be due to the instruction cache, aren't they?
Also, if a process (program) stops and then restarts, is the OS smart
enough to reconnect the VM state in such a way that the cache isn't flushed
anyway? Can it even preserve cache coherence through a fork (when the VM
state is mostly preserved)? I doubt it.
That said, if you are testing multiple SQL statements within a single
connection (so the backend doesn't fork a new process), then I could see
some anomalies. Otherwise I doubt it.
Anyone know better?
Signature failed Preliminary Design Review.
Feasibility of a new signature is currently being evaluated.
h.b.hotz@jpl.nasa.gov, or hbhotz@oxy.edu