Slow count(*) again...

Started by Neil Whelchel, over 15 years ago. 275 messages. List: hackers
#1Neil Whelchel
neil.whelchel@gmail.com

I know that there have been many discussions on the slowness of count(*) even
when an index is involved because the visibility of the rows has to be
checked. In the past I have seen many suggestions about using triggers and
tables to keep track of counts and while this works fine in a situation where
you know what the report is going to be ahead of time, this is simply not an
option when an unknown WHERE clause is to be used (dynamically generated).
I ran into a fine example of this when I was searching this mailing list,
"Searching in 856,646 pages took 13.48202 seconds. Site search powered by
PostgreSQL 8.3." Obviously at some point count(*) came into play here because
the site made a list of pages (1 2 3 4 5 6 > next). I very commonly make a
list of pages from search results, and the biggest time killer here is the
count(*) portion; worse yet, I sometimes have to hit the database with
two SELECT statements: one with OFFSET and LIMIT to get the page of results I
need, and another to get the total number of rows so I can estimate how many
pages of results are available. The point I am driving at here is that since
building a list of pages of results is such a common thing to do, there needs
to be some specific high-speed way to do this in one query. Maybe an
estimate(*) that works like count but gives an answer from the index without
checking visibility? I am sure that this would be good enough to make a page
list; it is really no big deal if it errs on the positive side, maybe the
list of pages has an extra page off the end. I can live with that. What I
can't live with is taking 13 seconds to get a page of results from 850,000
rows in a table.
-Neil-

#2Scott Marlowe
scott.marlowe@gmail.com
In reply to: Neil Whelchel (#1)
Re: Slow count(*) again...

On Sat, Oct 9, 2010 at 5:26 PM, Neil Whelchel <neil.whelchel@gmail.com> wrote:

I know that there have been many discussions on the slowness of count(*) even
when an index is involved because the visibility of the rows has to be
checked. [...] What I can't live with is taking 13 seconds to get a page of
results from 850,000 rows in a table.

99% of the time in these situations you don't need an exact measure, and
assuming analyze has run recently, select reltuples from pg_class for
a given table is more than close enough. I'm sure wrapping that in a
simple estimated_rows() function would be easy enough to do.
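A minimal sketch of such a helper, assuming the statistics are reasonably fresh (the function name and shape here are illustrative, not an existing built-in; reltuples is only as accurate as the last ANALYZE or VACUUM):

```sql
-- Estimate a table's row count from planner statistics instead of scanning.
CREATE OR REPLACE FUNCTION estimated_rows(tablename regclass)
RETURNS bigint AS $$
    SELECT reltuples::bigint FROM pg_class WHERE oid = $1;
$$ LANGUAGE sql STABLE;

-- Usage:
-- SELECT estimated_rows('my_table');
```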

#3Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: Neil Whelchel (#1)
Re: Slow count(*) again...

Neil Whelchel wrote:

I know that there have been many discussions on the slowness of count(*) even
when an index is involved because the visibility of the rows has to be
checked. [...] What I can't live with is taking 13 seconds to get a page of
results from 850,000 rows in a table.
-Neil-

Unfortunately, the problem is in the rather primitive way PostgreSQL
does I/O. It didn't change in 9.0 so there is nothing you could gain by
upgrading. If you execute strace -o /tmp/pg.out -e read -p <PID of the
sequential scan process> and inspect the file /tmp/pg.out when the query
finishes, you will notice a gazillion read requests, all of them 8192
bytes in size. That means that PostgreSQL is reading the table block by
block, without any merging of the requests. You can alleviate the pain
by using OS tricks, like specifying the deadline I/O scheduler in
grub.conf and setting read-ahead on the FS block devices by using
blockdev, but there is nothing special that can be done, short of
rewriting the way PostgreSQL does I/O. There were rumors about
version 9.0 and asynchronous I/O, but that didn't materialize. That is
really strange to me, because PostgreSQL tables are files (or groups of
files, if the table size exceeds 1GB). It wouldn't be very hard to try
reading 1MB at a time, and that would speed up the full table scan
significantly.
The problem with single-block I/O is that there is a context switch for each
request and the I/O scheduler has to work hard to merge requests
appropriately, and there is really no need for that: tables are files, and
navigating through files is not a problem, even with much larger blocks.
In another database, whose name I will not mention, there is a parameter
db_file_multiblock_read_count which specifies how many blocks will be
read by a single read when doing a full table scan. PostgreSQL is in
dire need of something similar and it wouldn't even be that hard to
implement.

--
Mladen Gogala
Sr. Oracle DBA
1500 Broadway
New York, NY 10036
(212) 329-5251
www.vmsinfo.com

#4Joe Conway
mail@joeconway.com
In reply to: Mladen Gogala (#3)
Re: Slow count(*) again...

On 10/09/2010 06:54 PM, Mladen Gogala wrote:

In another database, whose name I will not mention, there is a parameter
db_file_multiblock_read_count which specifies how many blocks will be
read by a single read when doing a full table scan. PostgreSQL is in
dire need of something similar and it wouldn't even be that hard to
implement.

You're correct in that it isn't particularly difficult to implement for
sequential scans. But I have done some testing with aggressive read
ahead, and although it is clearly a big win with a single client, the
benefit was less clear as concurrency was increased.

Joe

--
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support

#5Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: Joe Conway (#4)
Re: Slow count(*) again...

Joe Conway wrote:

On 10/09/2010 06:54 PM, Mladen Gogala wrote:

In another database, whose name I will not mention, there is a parameter
db_file_multiblock_read_count which specifies how many blocks will be
read by a single read when doing a full table scan. PostgreSQL is in
dire need of something similar and it wouldn't even be that hard to
implement.

You're correct in that it isn't particularly difficult to implement for
sequential scans. But I have done some testing with aggressive read
ahead, and although it is clearly a big win with a single client, the
benefit was less clear as concurrency was increased.

Joe

Well, in my opinion that should be left to the DBA, the same as in the
"other database". The mythical DBA, the creature that mighty Larry
Ellison himself is on a crusade against, usually can figure out the
right value for the database he or she is in charge of. I humbly
confess to being an Oracle DBA for more than 2 decades and now branching
into Postgres because my employer is less than enthusiastic about
Oracle, with a special accent on their pricing.

Modern databases, Postgres included, are quite complex and companies
need DBA personnel to help fine-tune the applications. I know that good
DBA personnel are quite expensive, but without a competent DBA who knows
the database software well enough, companies can and will suffer from
blunders with performance, downtime, lost data and the like. In the world
where almost every application is written for the web, performance,
uptime and user experience are of critical importance. The
architects of Postgres database would be well advised to operate under
the assumption that every production database has a competent DBA
keeping an eye on the database.

Every application has its own mix of sequential and index scans; you
cannot possibly test all possible applications. Aggressive read-ahead
or "multi-block reads" can be a performance problem and will
complicate the optimizer, because the optimizer now has a new variable
to account for: the block size, potentially making seq_page_cost even
cheaper and random_page_cost even more expensive, depending on the
blocking. However, slow sequential scans are, in my humble opinion, the
single biggest performance problem of PostgreSQL databases and
should be improved, the sooner the better. You should, however, count
on the DBA personnel to help with the tuning.
We're the Tinkerbells of the database world. I am 6'4", 240 LBS, no wings.

--
Mladen Gogala
Sr. Oracle DBA
1500 Broadway
New York, NY 10036
(212) 329-5251
www.vmsinfo.com

#6Neil Whelchel
neil.whelchel@gmail.com
In reply to: Scott Marlowe (#2)
Re: Slow count(*) again...

On Saturday 09 October 2010 18:47:34 Scott Marlowe wrote:

On Sat, Oct 9, 2010 at 5:26 PM, Neil Whelchel <neil.whelchel@gmail.com>

wrote:

I know that there have been many discussions on the slowness of count(*)
even when an index is involved because the visibility of the rows has to
be checked. [...] What I can't live with is taking 13 seconds to get a
page of results from 850,000 rows in a table.

99% of the time in these situations you don't need an exact measure, and
assuming analyze has run recently, select reltuples from pg_class for
a given table is more than close enough. I'm sure wrapping that in a
simple estimated_rows() function would be easy enough to do.

This is a very good approach and it works very well when you are counting the
entire table, but when you have no control over the WHERE clause, it doesn't
help. IE: someone puts in a word to look for in a web form.

From my perspective, this issue is the biggest problem there is when using
Postgres to create web pages, and it is so commonly used, I think that there
should be a specific way to deal with it so that you don't have to run the
same WHERE clause twice.
IE: SELECT count(*) FROM <table> WHERE <clause>; to get the total number of
items to make page navigation links, then:
SELECT <columns> FROM <table> WHERE <clause> LIMIT <items_per_page> OFFSET
<(page_no-1)*items_per_page>; to get the actual page contents.
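With concrete (purely illustrative) table and column names, that two-query pattern looks like this; note both statements repeat the same WHERE clause:

```sql
-- Query 1: total match count, used to build the page navigation links.
SELECT count(*) FROM articles WHERE title ILIKE '%postgres%';

-- Query 2: one page of results (items_per_page = 25, page_no = 3).
SELECT id, title, posted_at
FROM articles
WHERE title ILIKE '%postgres%'
ORDER BY posted_at DESC
LIMIT 25 OFFSET (3 - 1) * 25;
```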

It's bad enough that count(*) is slow, then you have to do it all over again
to get the results you need! I have not dug into this much yet, but would it
be possible to return the number of rows that a WHERE clause would actually
return if the LIMIT and OFFSET were not applied? IE: When a normal query is
executed, the server returns the number of rows aside from the actual row
data. Would it be a big deal to modify this to allow it to return the number
of rows before the LIMIT and OFFSET are applied as well? This would sure cut
down on the time it takes to do the same WHERE clause twice... I have considered
using a cursor to do this, however this requires a transfer of all of the rows
to the client to get a total count, then setting the cursor to get the rows
that you are interested in. Or is there a way around this that I am not aware
of?
-Neil-

#7Sam Gendler
sgendler@ideasculptor.com
In reply to: Mladen Gogala (#5)
Re: Slow count(*) again...

On Sat, Oct 9, 2010 at 7:44 PM, Mladen Gogala <mladen.gogala@vmsinfo.com> wrote:

The architects of Postgres database would be well advised to operate under
the assumption that every production database has a competent DBA keeping an
eye on the database.

I'd actually go so far as to say that they have already made this
assumption. The out of the box config needs modification for all but the
most low-volume applications and postgres really benefits from having some
attention paid to performance. Not only does tuning the db provide enormous
gains, but it is often possible to dramatically improve query responsiveness
by simply restructuring a query (assuming an aggregating query over a fairly
large table with a few joins thrown in). My team does not have a competent
DBA (though I've got 15+ years of experience developing on top of various
dbs and certainly don't make overly naive assumptions about how things work)
and the gains that we made, when I finally just sat down and read everything
I could get my hands on about postgres and started reading this list, were
really quite impressive. I intend to take some of the courses offered by
some of the companies that are active on this list when my schedule allows
in order to expand my knowledge even further, as a DBA is a luxury we cannot
really afford at the moment.

#8Craig Ringer
craig@2ndquadrant.com
In reply to: Neil Whelchel (#6)
Re: Slow count(*) again...

On 10/10/2010 11:02 AM, Neil Whelchel wrote:

On Saturday 09 October 2010 18:47:34 Scott Marlowe wrote:

99% of the time in these situations you don't need an exact measure, and
assuming analyze has run recently, select reltuples from pg_class for
a given table is more than close enough. I'm sure wrapping that in a
simple estimated_rows() function would be easy enough to do.

This is a very good approach and it works very well when you are counting the
entire table, but when you have no control over the WHERE clause, it doesn't
help. IE: someone puts in a word to look for in a web form.

For that sort of thing, there isn't much that'll help you except
visibility-aware indexes, covering indexes, etc if/when they're
implemented. Even then, they'd only help when it was a simple
index-driven query with no need to hit the table to recheck any test
conditions, etc.

I guess there could be *some* way to expose the query planner's cost
estimates in a manner useful for result count estimation ... but given
how coarse its stats are and how wildly out the estimates can be, I kind
of doubt it. It's really intended for query planning decisions and more
interested in orders of magnitude, "0, 1, or more than that" measures,
etc, and seems to consider 30% here or there to be pretty insignificant
most of the time.
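For what it's worth, one way people sometimes get at those planner estimates from SQL is to parse EXPLAIN output; a rough PL/pgSQL sketch follows (the function name is illustrative, and all the caveats above about how coarse and wildly off the estimates can be apply in full):

```sql
-- Pull the planner's estimated row count out of the top EXPLAIN plan node,
-- e.g. "Seq Scan on t  (cost=0.00..123.45 rows=6789 width=42)".
-- Use only where a rough page count is acceptable.
CREATE OR REPLACE FUNCTION estimated_count(query text)
RETURNS bigint AS $$
DECLARE
    plan_line text;
BEGIN
    -- EXECUTE ... INTO keeps only the first output row, which is the top node.
    EXECUTE 'EXPLAIN ' || query INTO plan_line;
    RETURN substring(plan_line FROM 'rows=([0-9]+)')::bigint;
END;
$$ LANGUAGE plpgsql;

-- Usage:
-- SELECT estimated_count('SELECT * FROM articles WHERE title ILIKE ''%postgres%''');
```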

It's bad enough that count(*) is slow, then you have to do it all over again
to get the results you need! I have not dug into this much yet, but would it
be possible to return the number of rows that a WHERE clause would actually
return if the LIMIT and OFFSET were not applied? IE: When a normal query is
executed, the server returns the number of rows aside from the actual row
data. Would it be a big deal to modify this to allow it to return the number
of rows before the LIMIT and OFFSET are applied as well?

It'd force the server to fully execute the query. Then again, it sounds
like you're doing that anyway.

--
Craig Ringer

Tech-related writing at http://soapyfrogs.blogspot.com/

#9Craig Ringer
craig@2ndquadrant.com
In reply to: Mladen Gogala (#3)
Re: Slow count(*) again...

On 10/10/2010 9:54 AM, Mladen Gogala wrote:

Unfortunately, the problem is in the rather primitive way PostgreSQL
does I/O. It didn't change in 9.0 so there is nothing you could gain by
upgrading. If you execute strace -o /tmp/pg.out -e read -p <PID of the
sequential scan process> and inspect the file /tmp/pg.out when the query
finishes, you will notice a gazillion read requests, all of them 8192
bytes in size. That means that PostgreSQL is reading the table block by
block, without any merging of the requests.

I'd be really interested in any measurements you've done to determine
the cost of this over doing reads in larger chunks. If they're properly
detailed and thought out, the -hackers list is likely to be interested
as well.

The Linux kernel, at least, does request merging (and splitting, and
merging, and more splitting) along the request path, and I'd personally
expect that most of the cost of 8k requests would be in the increased
number of system calls, buffer copies, etc required. Measurements
demonstrating or contradicting this would be good to see.

It's worth being aware that there are memory costs to doing larger
reads, especially when you have many backends each of which want to
allocate a larger buffer for reading. If you can use a chunk of
shared_buffers as the direct destination for the read that's OK, but
otherwise you're up for (1mb-8kb)*num_backends extra memory use on I/O
buffers that could otherwise be used as shared_buffers or OS cache.

Async I/O, too, has costs.

PostgreSQL is in
dire need of something similar and it wouldn't even be that hard to
implement.

I'd really like to see both those assertions backed with data or patches ;-)

Personally, I know just enough about how PG's I/O path works to suspect
that "not that hard to implement" is probably a little ...
over-optimistic. Sure, it's not that hard to implement in a new program
with no wired-in architectural and design choices; that doesn't mean
it's easy to retrofit onto existing code, especially a bunch of
co-operating processes with their own buffer management.

--
Craig Ringer

Tech-related writing at http://soapyfrogs.blogspot.com/

#10Neil Whelchel
neil.whelchel@gmail.com
In reply to: Craig Ringer (#8)
Re: Slow count(*) again...

On Saturday 09 October 2010 23:56:15 Craig Ringer wrote:

On 10/10/2010 11:02 AM, Neil Whelchel wrote:

This is a very good approach and it works very well when you are counting
the entire table, but when you have no control over the WHERE clause, it
doesn't help. IE: someone puts in a word to look for in a web form.

For that sort of thing, there isn't much that'll help you except
visibility-aware indexes, covering indexes, etc if/when they're
implemented. Even then, they'd only help when it was a simple
index-driven query with no need to hit the table to recheck any test
conditions, etc.

Good point, maybe this is turning more into a discussion of how to generate a
list of pages of results and one page of results with one query so we don't
have to do the same painfully slow query twice to do a very common task.

On the other hand, I copied a table out of one of my production servers that
has about 60,000 rows with 6 columns (numeric, numeric, bool, bool, timestamp,
text). The first numeric column has numbers evenly spread between 0 and 100
and it is indexed. I put the table in a pair of database servers both running
on the same physical hardware. One server is Postgres, the other is a popular
server (I am not mentioning names here). On Postgres, SELECT count(*) FROM
table WHERE column > 50; takes about 8 seconds to run. The other database
server took less than one second (about 25 ms), as it is using the index (I
assume) to come up with the results. It is true that this is not a fair test
because both servers were tested with their default settings, and the
defaults for Postgres are much more conservative; however, I don't think that
any amount of settings tweaking will bring them even into the same ball park.

There has been discussion about the other server returning an incorrect count
because all of the indexed rows may not be live at the time. This is not a
problem for the intended use; that is why I suggested another function like
estimate(*). Its name suggests that the result will be close, not 100%
correct, which is plenty good enough for generating a list of results pages
in most cases.

I am faced with a very serious problem here. If the query to make a list of
pages takes say 6 seconds and it takes another 6 seconds to generate a page
of results, the customer is waiting 12 seconds. This is not going to work. If
count made a quick estimate, say less than a second, and it took 6 seconds to
come up with the actual results, I could live with that. Or if the query that
comes up with the window of results (via OFFSET and LIMIT) also returned the
total number of rows that would have matched, then I would still have
everything I need to render the page in a reasonable time. I really think
that this needs to be addressed somewhere. It's not like I am the only one
that does this. You see it nearly everywhere a long list of results is
(expected to be) returned on a web site. Among the people I work with, this
seems to be the most mentioned reason that they claim they don't use Postgres
for their projects.

It would be nice to see how the server comes up with the search results and
list of links to pages of results for this mailing list.
(http://search.postgresql.org/search?q=slow+count%28%29&m=1&l=&d=365&s=r) I am
guessing that it probably uses the count and query method I am talking about.


#11Vitalii Tymchyshyn
tivv00@gmail.com
In reply to: Neil Whelchel (#6)
Re: Slow count(*) again...

2010/10/10 Neil Whelchel <neil.whelchel@gmail.com>

This is a very good approach and it works very well when you are counting the
entire table, but when you have no control over the WHERE clause, it doesn't
help. IE: someone puts in a word to look for in a web form.

From my perspective, this issue is the biggest problem there is when using
Postgres to create web pages, and it is so commonly used, I think that there
should be a specific way to deal with it so that you don't have to run the
same WHERE clause twice.
IE: SELECT count(*) FROM <table> WHERE <clause>; to get the total number of
items to make page navigation links, then:
SELECT <columns> FROM <table> WHERE <clause> LIMIT <items_per_page> OFFSET
<(page_no-1)*items_per_page>; to get the actual page contents.

How about

select * from (select *, count(*) over () as total_count from <table> where
<clause>) a LIMIT <items_per_page> OFFSET <(page_no-1)*items_per_page>

It will return a total_count column with the same value in each row. You may
have problems if no rows are returned (e.g. the page number is too high).
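Spelled out with concrete (hypothetical) table and column names, and noting that count(*) OVER () needs the window functions introduced in 8.4, the suggestion looks like:

```sql
-- Hypothetical names; one round trip returns both the page and the total.
SELECT *
FROM (SELECT t.*, count(*) OVER () AS total_count
      FROM my_table t
      WHERE my_column > 50) AS a
ORDER BY my_column
LIMIT 25 OFFSET 0;
-- If zero rows come back (page number past the end), fall back to a
-- plain SELECT count(*) with the same WHERE clause to distinguish
-- "no matches" from "page out of range".
```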
--
Best regards,
Vitalii Tymchyshyn

#12Reid Thompson
Reid.Thompson@ateb.com
In reply to: Neil Whelchel (#10)
Re: Slow count(*) again...

On 10/10/2010 6:29 AM, Neil Whelchel wrote:

On Saturday 09 October 2010 23:56:15 Craig Ringer wrote:

On 10/10/2010 11:02 AM, Neil Whelchel wrote:

On Saturday 09 October 2010 18:47:34 Scott Marlowe wrote:

On Sat, Oct 9, 2010 at 5:26 PM, Neil Whelchel<neil.whelchel@gmail.com>

wrote:

I know that there haven been many discussions on the slowness of
count(*) even when an index is involved because the visibility of the
rows has to be checked. In the past I have seen many suggestions about
using triggers and tables to keep track of counts and while this works
fine in a situation where you know what the report is going to be
ahead of time, this is simply not an option when an unknown WHERE
clause is to be used (dynamically generated). I ran into a fine
example of this when I was searching this mailing list, "Searching in
856,646 pages took 13.48202 seconds. Site search powered by PostgreSQL
8.3." Obviously at some point count(*) came into play here because the
site made a list of pages (1 2 3 4 5 6> next). I very commonly make a
list of pages from search results, and the biggest time killer here is
the count(*) portion, even worse yet, I sometimes have to hit the
database with two SELECT statements, one with OFFSET and LIMIT to get
the page of results I need and another to get the amount of total rows
so I can estimate how many pages of results are available. The point I
am driving at here is that since building a list of pages of results
is such a common thing to do, there need to be some specific high
speed ways to do this in one query. Maybe an estimate(*) that works
like count but gives an answer from the index without checking
visibility? I am sure that this would be good enough to make a page
list, it is really no big deal if it errors on the positive side,
maybe the list of pages has an extra page off the end. I can live with
that. What I can't live with is taking 13 seconds to get a page of
results from 850,000 rows in a table.

Good point, maybe this is turning more into a discussion of how to generate a
list of pages of results and one page of results with one query so we don't
have to do the same painfully slow query twice to do a very common task.

On the other hand, I copied a table out of one of my production servers that
has about 60,000 rows with 6 columns (numeric, numeric, bool, bool, timestamp,
text). The first numeric column has numbers evenly spread between 0 and 100
and it is indexed. I put the table in a pair of database servers both running
on the same physical hardware. One server is Postgres, the other is a popular
server (I am not mentioning names here). on Postgres: SELECT count(*) FROM
table where column>50; takes about 8 seconds to run. The other database server
took less than one second (about 25 ms) as it is using the index (I assume) to
come up with the results. It is true that this is not a fair test because both
servers were tested with their default settings, and the defaults for Postgres
are much more conservative, however, I don't think that any amount of settings
tweaking will bring them even in the same ball park. There has been discussion
about the other server returning an incorrect count because all of the indexed
rows may not be live at the time. This is not a problem for the intended use,
that is why I suggested another function like estimate(*). Its name suggests
that the result will be close, not 100% correct, which is plenty good enough
for generating a list of results pages in most cases. I am faced with a very
serious problem here. If the query to make a list of pages takes say 6 seconds
and it takes another 6 seconds to generate a page of results, the customer is
waiting 12 seconds. This is not going to work. If count made a quick estimate,
say less than a second, and it took 6 seconds to come up with the actual
results, I could live with that. Or if coming up with the window of results
via (OFFSET and LIMIT) and returned the total number of rows that would have
matched the query, then I would still have everything I need to render the
page in a reasonable time. I really think that this needs to be addressed
somewhere. It's not like I am the only one that does this. You see it nearly
everywhere a long list of results is (expected to be) returned in a web site.
Among the people I work with, this seems to be the most mentioned reason that
they claim that they don't use Postgres for their projects.

It would be nice to see how the server comes up with the search results and
list of links to pages of results for this mailing list.
(http://search.postgresql.org/search?q=slow+count%28%29&m=1&l=&d=365&s=r) I am
guessing that it probably uses the count and query method I am talking about.

I guess there could be *some* way to expose the query planner's cost
estimates in a manner useful for result count estimation ... but given
how coarse its stats are and how wildly out the estimates can be, I kind
of doubt it. It's really intended for query planning decisions and more
interested in orders of magnitude, "0, 1, or more than that" measures,
etc, and seems to consider 30% here or there to be pretty insignificant
most of the time.
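For what it's worth, the planner's guess is already reachable from SQL by parsing EXPLAIN output. A commonly circulated PL/pgSQL sketch (its accuracy entirely at the mercy of the coarse statistics described above):

```sql
-- Sketch: pull the top-level "rows=" figure out of EXPLAIN.
-- The result is the planner's estimate and can be wildly off.
CREATE OR REPLACE FUNCTION count_estimate(query text)
RETURNS integer AS $$
DECLARE
    rec record;
    n   integer;
BEGIN
    FOR rec IN EXECUTE 'EXPLAIN ' || query LOOP
        n := substring(rec."QUERY PLAN" FROM ' rows=([[:digit:]]+)');
        EXIT WHEN n IS NOT NULL;
    END LOOP;
    RETURN n;
END;
$$ LANGUAGE plpgsql;

-- SELECT count_estimate('SELECT * FROM my_table WHERE my_column > 50');
```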

It's bad enough that count(*) is slow, then you have to do it all over
again to get the results you need! I have not dug into this much yet,
but would it be possible to return the amount of rows that a WHERE
clause would actually return if the LIMIT and OFFSET were not applied.
IE: When a normal query is executed, the server returns the number of
rows aside from the actual row data. Would it be a big deal to modify
this to allow it to return the amount of rows before the LIMIT and
OFFSET is applied as well?

It'd force the server to fully execute the query. Then again, it sounds
like you're doing that anyway.

How big is your DB?
How fast is your disk access?
Any chance disks/RAM can be addressed?

My disk access is pitiful...
first run, 2.3 million rows.. 0m35.38s, subsequent runs.. real 0m2.55s

rthompso@hw-prod-repdb1> time psql -c "select count(*) from my_production_table" reporting
count
---------
2340704
(1 row)

real 0m35.38s
user 0m0.25s
sys 0m0.03s

subsequent runs.... (count changes due to inserts.)

rthompso@hw-prod-repdb1> time psql -c "select count(*) from my_production_table" reporting
count
---------
2363707
(1 row)

real 0m2.70s
user 0m0.27s
sys 0m0.02s
rthompso@hw-prod-repdb1> time psql -c "select count(*) from my_production_table" reporting
count
---------
2363707
(1 row)

real 0m2.55s
user 0m0.26s
sys 0m0.02s
rthompso@hw-prod-repdb1> time psql -c "select count(*) from my_production_table" reporting
count
---------
2363707
(1 row)

real 0m2.50s
user 0m0.26s
sys 0m0.02s

reporting=# SELECT pg_size_pretty(pg_total_relation_size('my_production_table'));
pg_size_pretty
----------------
1890 MB
(1 row)

#13Craig Ringer
craig@2ndquadrant.com
In reply to: Neil Whelchel (#10)
Re: Slow count(*) again...

On 10/10/2010 6:29 PM, Neil Whelchel wrote:

On the other hand, I copied a table out of one of my production servers that
has about 60,000 rows with 6 columns (numeric, numeric, bool, bool, timestamp,
text). The first numeric column has numbers evenly spread between 0 and 100
and it is indexed. I put the table in a pair of database servers both running
on the same physical hardware. One server is Postgres, the other is a popular
server (I am not mentioning names here).

Please do. Your comment is pretty meaningless otherwise.

If you're talking about MySQL: Were you using InnoDB or MyISAM table
storage? Of course it's fast with MyISAM, it relies on locks to do
updates and has bugger all capability for write concurrency, or to
permit readers while writing is going on.

If you're using InnoDB, then I'd like to know how they've managed that.

If you're talking about some *other* database, please name it and
provide any useful details, because the hand waving is not helpful.

I don't think that any amount of settings
tweaking will bring them even in the same ball park.

If you are, in fact, comparing MySQL+MyISAM and PostgreSQL, then you're
quite right. Pg will never have such a fast count() as MyISAM does or
the same insanely fast read performance, and MyISAM will never be as
reliable, robust or concurrency-friendly as Pg is. Take your pick, you
can't have both.

There has been discussion
about the other server returning an incorrect count because all of the indexed
rows may not be live at the time. This is not a problem for the intended use,
that is why I suggested another function like estimate(*). Its name suggests
that the result will be close, not 100% correct, which is plenty good enough
for generating a list of results pages in most cases.

Do you have any practical suggestions for generating such an estimate,
though? I find it hard to think of any way the server can do that
doesn't involve executing the query. The table stats are WAY too general
and a bit hit-and-miss, and there isn't really any other way to do it.

If all you want is a way to retrieve both a subset of results AND a
count of how many results would've been generated, it sounds like all
you really need is a way to get the total number of results returned by
a cursor query, which isn't a big engineering challenge. I expect that
in current Pg versions a trivial PL/PgSQL function could be used to
slurp and discard unwanted results, but a better in-server option to
count the results from a cursor query would certainly be nice.
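One way to get that today, without any server changes, is a cursor at the SQL level: FETCH the page, then MOVE past the rest and read the remaining-row count off the MOVE command tag (table and cursor names are illustrative):

```sql
BEGIN;
DECLARE page_cur CURSOR FOR
    SELECT * FROM my_table WHERE my_column > 50;
FETCH 25 FROM page_cur;        -- one page of results
MOVE FORWARD ALL IN page_cur;  -- command tag "MOVE n": n rows remained
CLOSE page_cur;
COMMIT;
-- total matches = rows fetched + n; the server still walks the whole
-- result set, but only one page of data crosses the wire.
```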

--
Craig Ringer

Tech-related writing at http://soapyfrogs.blogspot.com/

#14Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: Craig Ringer (#9)
Re: Slow count(*) again...

Craig Ringer wrote:

On 10/10/2010 9:54 AM, Mladen Gogala wrote:

Unfortunately, the problem is in the rather primitive way PostgreSQL
does I/O. It didn't change in 9.0 so there is nothing you could gain by
upgrading. If you execute strace -o /tmp/pg.out -e read <PID of the
sequential scan process> and inspect the file /tmp/pg.out when the query
finishes, you will notice a gazillion of read requests, all of them 8192
bytes in size. That means that PostgreSQL is reading the table block by
block, without any merging of the requests.

I'd be really interested in any measurements you've done to determine
the cost of this over doing reads in larger chunks. If they're properly
detailed and thought out, the -hackers list is likely to be interested
as well.

I can provide measurements, but from Oracle RDBMS. Postgres doesn't
allow tuning of that aspect, so no measurement can be done. Would the
numbers from Oracle RDBMS be acceptable?

The Linux kernel, at least, does request merging (and splitting, and
merging, and more splitting) along the request path, and I'd personally
expect that most of the cost of 8k requests would be in the increased
number of system calls, buffer copies, etc required. Measurements
demonstrating or contradicting this would be good to see.

Even the cost of hundreds of thousands of context switches is far from
negligible. What kind of measurements do you expect me to do with the
database which doesn't support tweaking of that aspect of its operation?

It's worth being aware that there are memory costs to doing larger
reads, especially when you have many backends each of which want to
allocate a larger buffer for reading.

Oh, it's not only larger memory, the buffer management would have to be
changed too, to prevent a process doing a sequential scan from inundating
the shared buffers. Alternatively, the blocks would have to be written
into the private memory and immediately thrown away after that. However,
the experience with Oracle tells me that this is well worth it. Here are
the numbers:

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options

SQL> show parameter db_file_multi

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_file_multiblock_read_count        integer     16
SQL> alter session set db_file_multiblock_read_count=1;

Session altered.
SQL> select count(*) from ni_occurrence;

COUNT(*)
----------
402062638

Elapsed: 00:08:20.88
SQL> alter session set db_file_multiblock_read_count=128;

Session altered.

Elapsed: 00:00:00.50
SQL> select count(*) from ni_occurrence;

COUNT(*)
----------
402062638

Elapsed: 00:02:17.58

In other words, when I batched the sequential scan to do 128 blocks I/O,
it was 4 times faster than when I did the single block I/O.
Does that provide enough of an evidence and, if not, why not?

If you can use a chunk of
shared_buffers as the direct destination for the read that's OK, but
otherwise you're up for (1mb-8kb)*num_backends extra memory use on I/O
buffers that could otherwise be used as shared_buffers or OS cache.

Async I/O, too, has costs.

There is a common platitude that says that there is no such thing as
free lunch. However, both Oracle RDBMS and IBM DB2 use asynchronous I/O,
probably because they're unaware of the danger. Let me now give you a
full table scan of a much smaller table located in a Postgres database:

news=> select count(*) from internet_web_sites;
count
---------
1290133
(1 row)

Time: 12838.958 ms

Oracle counts 400 million records in 2 minutes and Postgres 9.0.1 takes
12.8 seconds to count 1.2 million records? Do you see the disparity?

Both databases, Oracle and Postgres, are utilizing the same 3Par SAN
device, the machines housing both databases are comparable HP 64 bit
Linux machines, both running 64 bit version of Red Hat 5.5. Respective
table sizes are here:

SQL> select bytes/1048576 as MB from user_segments
2 where segment_name='NI_OCCURRENCE';

MB
----------
35329

news=> select pg_size_pretty(pg_table_size('moreover.internet_web_sites'));
pg_size_pretty
----------------
216 MB
(1 row)

So, I really pushed Oracle much harder than I pushed Postgres.

PostgreSQL is in dire need of something similar and it wouldn't even be
that hard to implement.

I'd really like to see both those assertions backed with data or patches ;-)

With a database that doesn't allow tuning of that aspect, it's a
self-defeating proposition. However, I did my best to give you the numbers.

Personally, I know just enough about how PG's I/O path works to suspect
that "not that hard to implement" is probably a little ...
over-optimistic. Sure, it's not that hard to implement in a new program
with no wired-in architectural and design choices; that doesn't mean
it's easy to retrofit onto existing code, especially a bunch of
co-operating processes with their own buffer management.

That may be so, but slow sequential scan is still the largest single
performance problem of PostgreSQL. The frequency with which that topic
appears on the mailing lists should serve as good evidence for that. I
did my best to prove my case. Again, requiring "hard numbers" when
using a database which doesn't allow tweaking of the I/O size is a
self-defeating proposition. Other databases, like DB2 and Oracle, both
allow tweaking of that aspect of their operation, Oracle even on a
per-session basis. If you still claim that it wouldn't make a difference,
the onus to prove it is on you.

--
Mladen Gogala
Sr. Oracle DBA
1500 Broadway
New York, NY 10036
(212) 329-5251
www.vmsinfo.com

#15Reid Thompson
Reid.Thompson@ateb.com
In reply to: Reid Thompson (#12)
Re: Slow count(*) again...

On 10/10/2010 11:02 AM, Reid Thompson wrote:

On Sat, Oct 9, 2010 at 5:26 PM, Neil Whelchel<neil.whelchel@gmail.com>

On the other hand, I copied a table out of one of my production servers that
has about 60,000 rows with 6 columns (numeric, numeric, bool, bool, timestamp,
text). The first numeric column has numbers evenly spread between 0 and 100
and it is indexed. I put the table in a pair of database servers both running
on the same physical hardware. One server is Postgres, the other is a popular
server (I am not mentioning names here). on Postgres: SELECT count(*) FROM
table where column>50; takes about 8 seconds to run. The other database server
took less than one second (about 25 ms) as it is using the index (I assume) to
come up with the results. It is true that this is not a fair test because both
servers were tested with their default settings, and the defaults for Postgres
are much more conservative, however, I don't think that any amount of settings
tweaking will bring them even in the same ball park. There has been discussion
about the other server returning an incorrect count because all of the indexed
rows may not be live at the time. This is not a problem for the intended use,
that is why I suggested another function like estimate(*). Its name suggests
that the result will be close, not 100% correct, which is plenty good enough
for generating a list of results pages in most cases. I am faced with a very
serious problem here. If the query to make a list of pages takes say 6 seconds
and it takes another 6 seconds to generate a page of results, the customer is
waiting 12 seconds. This is not going to work. If count made a quick estimate,
say less than a second, and it took 6 seconds to come up with the actual
results, I could live with that. Or if coming up with the window of results
via (OFFSET and LIMIT) and returned the total number of rows that would have
matched the query, then I would still have everything I need to render the
page in a reasonable time. I really think that this needs to be addressed
somewhere. It's not like I am the only one that does this. You see it nearly
everywhere a long list of results is (expected to be) returned in a web site.
Among the people I work with, this seems to be the most mentioned reason that
they claim that they don't use Postgres for their projects.

How big is your DB?
How fast is your disk access?
Any chance disks/RAM can be addressed?

My disk access is pitiful...
first run, 2.3 million rows.. 0m35.38s, subsequent runs.. real 0m2.55s

rthompso@hw-prod-repdb1> time psql -c "select count(*) from my_production_table" reporting
count
---------
2340704
(1 row)

real 0m35.38s
user 0m0.25s
sys 0m0.03s

subsequent runs.... (count changes due to inserts.)

rthompso@hw-prod-repdb1> time psql -c "select count(*) from my_production_table" reporting
count
---------
2363707
(1 row)

real 0m2.70s
user 0m0.27s
sys 0m0.02s
rthompso@hw-prod-repdb1> time psql -c "select count(*) from my_production_table" reporting
count
---------
2363707
(1 row)

real 0m2.55s
user 0m0.26s
sys 0m0.02s
rthompso@hw-prod-repdb1> time psql -c "select count(*) from my_production_table" reporting
count
---------
2363707
(1 row)

real 0m2.50s
user 0m0.26s
sys 0m0.02s

reporting=# SELECT pg_size_pretty(pg_total_relation_size('my_production_table'));
pg_size_pretty
----------------
1890 MB
(1 row)

forgot to note, my table schema is significantly larger.

rthompso@hw-prod-repdb1> time psql -c "\d my_production_table_201010" reporting
Table "public.my_production_table_201010"
Column | Type | Modifiers
-----------------------------+-----------------------------+----------------------------------------------------------------
| integer | not null default
nextval('my_production_table_parent_id_seq'::regclass)
| character varying(20) |
| character(1) |
| character varying(32) |
| character varying(32) |
| character varying(20) |
| character varying(5) |
| character varying(5) |
| date |
| character(1) |
| character varying(32) |
| character varying(32) |
| character varying(32) |
| character varying(2) |
| character varying(10) |
| character varying(10) |
| character varying(32) |
| character varying(7) |
| character varying(10) |
| character varying(2) |
| character varying(9) |
| character varying(9) |
| character varying(9) |
| character varying(10) |
| character varying(32) |
| character varying(32) |
| character varying(20) |
| character varying(5) |
| character varying(5) |
| character varying(32) |
| character varying(32) |
| character varying(32) |
| character varying(2) |
| character varying(10) |
| character varying(10) |
| character varying(10) |
| character varying(10) |
| integer |
| character varying(2) |
| character varying(32) |
| character varying(32) |
| integer |
| integer |
| text |
| character varying(3) |
| date |
| date |
| date |
| integer |
| integer |
| integer |
| integer |
| character varying(6) |
| character varying(10) |
| character varying(32) |
| character varying(32) |
| character varying(32) |
| character varying(10) |
| character varying(6) |
| character varying(8) |
| boolean |
| character(1) |
| date |
| integer |
| date |
| character varying(11) |
| character varying(4) |
| character(1) |
| date |
| character varying(5) |
| character varying(20) |
| date |
| character(1) |
| character(1) |
| character varying(2) |
| text |
| integer |
| integer |
| timestamp without time zone | default now()
| timestamp without time zone |
| character varying(64) |
| character varying(64) |
| character varying(64) |
Indexes:
"my_production_table_201010_pkey" PRIMARY KEY, btree (id)
"my_production_table_201010_date_idx" btree (xxxxdate), tablespace "indexspace"
"my_production_table_201010_epatient_idx" btree (storeid, xxxxxxxxxxxxx), tablespace "indexspace"
"my_production_table_201010_medicationname_idx" btree (xxxxxxxxxxxxxx), tablespace "indexspace"
"my_production_table_201010_ndc_idx" btree (xxx), tablespace "indexspace"
Check constraints:
"my_production_table_201010_filldate_check" CHECK (xxxxdate >= '2010-10-01'::date AND xxxxdate <
'2010-11-01'::date)
Foreign-key constraints:
"my_production_table_201010_pkgfileid_fkey" FOREIGN KEY (pkgfileid) REFERENCES my_production_tablefiles(id)
Inherits: my_production_table_parent

#16Neil Whelchel
neil.whelchel@gmail.com
In reply to: Vitalii Tymchyshyn (#11)
Re: Slow count(*) again...

On Sunday 10 October 2010 05:02:03 Віталій Тимчишин wrote:

2010/10/10 Neil Whelchel <neil.whelchel@gmail.com>

On Saturday 09 October 2010 18:47:34 Scott Marlowe wrote:

On Sat, Oct 9, 2010 at 5:26 PM, Neil Whelchel <neil.whelchel@gmail.com>

wrote:

I know that there have been many discussions on the slowness of count(*)
even when an index is involved because the visibility of the rows has to
be checked. In the past I have seen many suggestions about using
triggers and tables to keep track of counts and while this works fine in
a situation where you know what the report is going to be ahead of time,
this is simply not an option when an unknown WHERE clause is to be
used (dynamically generated). I ran into a fine example of this when
I was searching this mailing list, "Searching in 856,646 pages took
13.48202 seconds. Site search powered by PostgreSQL 8.3." Obviously
at some point count(*) came into play here because the site made a
list of pages (1 2 3 4 5 6 > next). I very commonly make a list of
pages from search results, and the biggest time killer here is the
count(*) portion, even worse yet, I sometimes have to hit the database
with two SELECT statements, one with OFFSET and LIMIT to get the page
of results I need and another to get the amount of total rows so I can
estimate how many pages of results are available. The point I am
driving at here is that since building a list of pages of results is
such a common thing to do, there need to be some specific high speed
ways to do this in one query. Maybe an estimate(*) that works like
count but gives an answer from the index without checking visibility?
I am sure that this would be good enough to make a page list, it is
really no big deal if it errors on the positive side, maybe the list
of pages has an extra page off the end. I can live with that. What I
can't live with is taking 13 seconds to get a page of results from
850,000 rows in a table.

99% of the time, in these situations, you don't need an exact measure, and
assuming analyze has run recently, select reltuples from pg_class for
a given table is more than close enough. I'm sure wrapping that in a
simple estimated_rows() function would be easy enough to do.

This is a very good approach and it works very well when you are counting the
entire table, but when you have no control over the WHERE clause, it doesn't
help. IE: someone puts in a word to look for in a web form.

From my perspective, this issue is the biggest problem there is when using
Postgres to create web pages, and it is so commonly used, I think that there
should be a specific way to deal with it so that you don't have to run the
same WHERE clause twice.
IE: SELECT count(*) FROM <table> WHERE <clause>; to get the total amount
of items to make page navigation links, then:
SELECT <columns> FROM table WHERE <clause> LIMIT <items_per_page> OFFSET
<(page_no-1)*items_per_page>; to get the actual page contents.

How about

select * from (select *, count(*) over () as total_count from <table> where
<clause>) a LIMIT <items_per_page> OFFSET <(page_no-1)*items_per_page>

It will return a total_count column with the same value in each row. You may
have problems if no rows are returned (e.g. the page number is too high).

I have done this before, but the speedup from the two hits to the database
that I mentioned above is tiny, just a few ms. It seems to end up doing about
the same thing on the database end. The reason that I don't commonly do this
is what you said about not getting a count result if you run off the end.
-Neil-

#17Craig Ringer
craig@2ndquadrant.com
In reply to: Mladen Gogala (#14)
Re: Slow count(*) again...

On 10/11/2010 01:14 AM, Mladen Gogala wrote:

I can provide measurements, but from Oracle RDBMS. Postgres doesn't
allow tuning of that aspect, so no measurement can be done. Would the
numbers from Oracle RDBMS be acceptable?

Well, they'd tell me a lot about Oracle's performance as I/O chunk size
scales, but almost nothing about the cost of small I/O operations vs
larger ones in general.

Typically dedicated test programs that simulate the database read
patterns would be used for this sort of thing. I'd be surprised if
nobody on -hackers has already done suitable testing; I was mostly
asking because I was interested in how you were backing your assertions.

PostgreSQL isn't Oracle; their design is in many ways very different.
Most importantly, Oracle uses a redo log, where PostgreSQL stores old
rows with visibility information directly in the tables. It is possible
that a larger proportion of Oracle's I/O costs are fixed per-block
overheads rather than per-byte costs, so it seeks to batch requests into
larger chunks. Of course, it's also possible that 8k chunk I/O is just
universally expensive and is something Pg should avoid, too, but we
can't know that without
dedicated testing, which I at least haven't done. I don't follow
-hackers closely, and wouldn't have seen discussion about testing done
there. The archives are likely to contain useful discussions.

Then again, IIRC Pg's page size is also its I/O size, so you could
actually get larger I/O chunking by rebuilding Pg with larger pages.
Having never had the need, I haven't examined the effect of page size
changes on I/O performance.

The Linux kernel, at least, does request merging (and splitting, and
merging, and more splitting) along the request path, and I'd
personally expect that most of the cost of 8k requests would be in the
increased number of system calls, buffer copies, etc required.
Measurements demonstrating or contradicting this would be good to see.

Even the cost of hundreds of thousands of context switches is far from
negligible. What kind of measurements do you expect me to do with the
database which doesn't support tweaking of that aspect of its operation?

Test programs, or references to testing done by others that demonstrates
these costs in isolation. Of course, they still wouldn't show what gain
Pg might obtain (nothing except hacking on Pg's sources really will) but
they'd help measure the costs of doing I/O that way.

I suspect you're right that large I/O chunks would be desirable and a
potential performance improvement. What I'd like to know is *how*
*much*, or at least how much the current approach costs in pure
overheads like context switches and scheduler delays.

Does that provide enough of an evidence and, if not, why not?

It shows that it helps Oracle a lot ;-)

Without isolating how much of that is raw costs of the block I/O and how
much is costs internal to Oracle, it's still hard to have much idea how
much it'd benefit Pg to take a similar approach.

I'm sure the folks on -hackers have been over this and know a whole lot
more about it than I do, though.

Oracle counts 400 million records in 2 minutes and Postgres 9.0.1 takes
12.8 seconds to count 1.2 million records? Do you see the disparity?

Sure. What I don't know is how much of that is due to block sizes. There
are all sorts of areas where Oracle could be gaining.

That may be so, but slow sequential scan is still the largest single
performance problem of PostgreSQL. The frequency with which that topic
appears on the mailing lists should serve as good evidence for that.

I'm certainly not arguing that it could use improvement; it's clearly
hurting some users. I just don't know if I/O chunking is the answer - I
suspect that if it were, then it would've become a priority for one or
more people working on Pg much sooner.

It's quite likely that it's one of those things where it makes a huge
difference for Oracle because Oracle has managed to optimize out most of
the other bigger costs. If Pg still has other areas that make I/O more
expensive per-byte (say, visibility checks) and low fixed per-block
costs, then there'd be little point in chunking I/O. My understanding is
that that's pretty much how things stand at the moment, but I'd love
verification from someone who's done the testing.

If you still claim that it wouldn't make a difference,
the onus to prove it is on you.

I didn't mean to claim that it would make no difference. If I sounded
like it, sorry.

I just want to know how _much_ , or more accurately how great the
overheads of the current approach in Pg are vs doing much larger reads.

--
Craig Ringer

#18Jon Nelson
jnelson+pgsql@jamponi.net
In reply to: Mladen Gogala (#14)
Re: Slow count(*) again...

On Sun, Oct 10, 2010 at 12:14 PM, Mladen Gogala
<mladen.gogala@vmsinfo.com> wrote:

In other words, when I batched the sequential scan to do 128 blocks I/O, it
was 4 times faster than when I did the single block I/O.
Does that provide enough of an evidence and, if not, why not?

These numbers tell us nothing because, unless you dropped the caches
between runs, at least part of some runs was very probably cached.
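On Linux the page cache can be cleared between runs, which is the usual way to make such timings comparable (root required; this is a system-configuration step, not something to run on a busy production box):

```shell
sync                               # flush dirty pages to disk first
echo 3 > /proc/sys/vm/drop_caches  # drop page cache, dentries and inodes
```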

--
Jon

#19Joshua Tolley
eggyknap@gmail.com
In reply to: Craig Ringer (#17)
Re: Slow count(*) again...

On Mon, Oct 11, 2010 at 06:41:16AM +0800, Craig Ringer wrote:

On 10/11/2010 01:14 AM, Mladen Gogala wrote:

I can provide measurements, but from Oracle RDBMS. Postgres doesn't
allow tuning of that aspect, so no measurement can be done. Would the
numbers from Oracle RDBMS be acceptable?

Well, they'd tell me a lot about Oracle's performance as I/O chunk size
scales, but almost nothing about the cost of small I/O operations vs
larger ones in general.

Typically dedicated test programs that simulate the database read
patterns would be used for this sort of thing. I'd be surprised if
nobody on -hackers has already done suitable testing; I was mostly
asking because I was interested in how you were backing your assertions.

One thing a test program would have to take into account is multiple
concurrent users. What speeds up the single user case may well hurt the
multi user case, and the behaviors that hurt single user cases may have been
put in place on purpose to allow decent multi-user performance. Of course, all
of that is "might" and "maybe", and I can't prove any assertions about block
size either. But the fact of multiple users needs to be kept in mind.

It was asserted that reading bigger chunks would help performance; a response
suggested that, at least in Linux, setting readahead on a device would
essentially do the same thing. Or that's what I got from the thread, anyway.
I'm interested to know how similar performance might be between the large
block size case and the large readahead case. Comments, anyone?
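For anyone wanting to experiment along those lines, Linux exposes the per-device readahead window under /sys. A sketch of my own (the helper is hypothetical, not from this thread):

```python
def get_readahead_kb(dev):
    """Return the readahead window, in kilobytes, for block device `dev`
    (e.g. 'sda').

    Linux exposes this in sysfs; `blockdev --setra N /dev/sda` (N in
    512-byte sectors) or writing to this file changes it.
    """
    with open(f"/sys/block/{dev}/queue/read_ahead_kb") as f:
        return int(f.read())
```

A readahead of 256 sectors, as reported by `blockdev --getra`, corresponds to 128 here, since this file is in KB rather than 512-byte sectors.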

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com

#20Craig Ringer
craig@2ndquadrant.com
In reply to: Joshua Tolley (#19)
Re: Slow count(*) again...

On 10/11/2010 08:27 AM, Joshua Tolley wrote:

One thing a test program would have to take into account is multiple
concurrent users. What speeds up the single user case may well hurt the
multi user case, and the behaviors that hurt single user cases may have been
put in place on purpose to allow decent multi-user performance. Of course, all
of that is "might" and "maybe", and I can't prove any assertions about block
size either. But the fact of multiple users needs to be kept in mind.

Agreed. I've put together a simple test program to test I/O chunk sizes.
It only tests single-user performance, but it'd be pretty trivial to
adapt it to spawn a couple of worker children or run several threads,
each with a suitable delay, since it's rather uncommon to have a bunch
of seqscans all fire off at once.

From this test it's pretty clear that with buffered I/O of an uncached
700MB file under Linux, the I/O chunk size makes very little difference,
with all chunk sizes taking 9.8s to read the test file, with
near-identical CPU utilization. Caches were dropped between each test run.

For direct I/O (by ORing the O_DIRECT flag into the open() flags), chunk
size is *hugely* significant, with 4k chunk reads of the test file
taking 38s, 8k 22s, 16k 14s, 32k 10.8s, and 64k through 1024k 9.8s, then
rising a little again above 1024k.

Apparently Oracle is almost always configured to use direct I/O, so it
would benefit massively from large chunk sizes. PostgreSQL is almost
never used with direct I/O, and at least in terms of the low-level costs
of syscalls and file system activity, shouldn't care at all about read
chunk sizes.
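A minimal sketch of that kind of chunk-size test, in Python for brevity (buffered reads only; this is my own simplification of the idea, not the attached testio.c):

```python
import time

def time_read(path, chunk_size):
    """Sequentially read `path` in `chunk_size`-byte chunks.

    buffering=0 yields one read() syscall per chunk, so the difference
    between chunk sizes mostly measures per-syscall overhead.
    Returns (elapsed_seconds, bytes_read).
    """
    total = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return time.perf_counter() - start, total

# Drop the OS cache between runs (as root): sync; echo 3 > /proc/sys/vm/drop_caches
# An O_DIRECT variant would OR os.O_DIRECT into the os.open() flags and read into
# a page-aligned buffer (e.g. an mmap region); a plain read() buffer isn't aligned.
```

Looping this over chunk sizes from 4k to 1M on the same uncached file gives a comparison like the one above.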

Bumping readahead from 256 to 8192 made no significant difference for
either case. Of course, I'm on a crappy laptop disk...

I'm guessing this is the origin of the OP's focus on I/O chunk sizes.

Anyway, for the single-seqscan case, I see little evidence here that
using a bigger read chunk size would help PostgreSQL reduce overheads or
improve performance.

OP: Is your Oracle instance using direct I/O?

--
Craig Ringer

Attachments:

testio.c (text/plain)
results_buffered_ra256 (text/plain)
results_buffered_ra4096 (text/plain)
results_odirect_ra256 (text/plain)
results_odirect_ra4096 (text/plain)
#21Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: Joshua Tolley (#19)
#22Joshua Tolley
eggyknap@gmail.com
In reply to: Mladen Gogala (#21)
#23Craig Ringer
craig@2ndquadrant.com
In reply to: Mladen Gogala (#21)
#24Neil Whelchel
neil.whelchel@gmail.com
In reply to: Craig Ringer (#17)
#25PFC
lists@peufeu.com
In reply to: Neil Whelchel (#1)
#26Craig James
craig_james@emolecules.com
In reply to: Scott Marlowe (#2)
#27Neil Whelchel
neil.whelchel@gmail.com
In reply to: Craig James (#26)
#28Mladen Gogala
mgogala@vmsinfo.com
In reply to: Neil Whelchel (#27)
#29Sam Gendler
sgendler@ideasculptor.com
In reply to: Neil Whelchel (#27)
#30Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: Neil Whelchel (#27)
#31Scott Carey
scott@richrelevance.com
In reply to: Mladen Gogala (#14)
#32Greg Smith
gsmith@gregsmith.com
In reply to: Joshua Tolley (#19)
#33Scott Carey
scott@richrelevance.com
In reply to: Scott Carey (#31)
#34Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: Scott Carey (#31)
#35Neil Whelchel
neil.whelchel@gmail.com
In reply to: Mladen Gogala (#30)
#36Neil Whelchel
neil.whelchel@gmail.com
In reply to: Mladen Gogala (#34)
#37Sam Gendler
sgendler@ideasculptor.com
In reply to: Greg Smith (#32)
#38Scott Carey
scott@richrelevance.com
In reply to: Sam Gendler (#37)
#39Sam Gendler
sgendler@ideasculptor.com
In reply to: Scott Carey (#38)
#40david@lang.hm
david@lang.hm
In reply to: Sam Gendler (#39)
#41Greg Smith
gsmith@gregsmith.com
In reply to: Mladen Gogala (#34)
#42Greg Smith
gsmith@gregsmith.com
In reply to: Sam Gendler (#37)
#43Vitalii Tymchyshyn
tivv00@gmail.com
In reply to: Craig James (#26)
#44Craig Ringer
craig@2ndquadrant.com
In reply to: Vitalii Tymchyshyn (#43)
#45david@lang.hm
david@lang.hm
In reply to: Craig Ringer (#44)
#46Vitalii Tymchyshyn
tivv00@gmail.com
In reply to: Craig Ringer (#44)
#47Craig Ringer
craig@2ndquadrant.com
In reply to: david@lang.hm (#45)
#48Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: david@lang.hm (#45)
#49Jon Nelson
jnelson+pgsql@jamponi.net
In reply to: Mladen Gogala (#48)
#50Greg Smith
gsmith@gregsmith.com
In reply to: Jon Nelson (#49)
#51Luca Tettamanti
kronos.it@gmail.com
In reply to: Jon Nelson (#49)
#52Jon Nelson
jnelson+pgsql@jamponi.net
In reply to: Greg Smith (#50)
#53Mladen Gogala
mgogala@vmsinfo.com
In reply to: Jon Nelson (#49)
#54Tom Lane
tgl@sss.pgh.pa.us
In reply to: Mladen Gogala (#48)
#55Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Neil Whelchel (#35)
#56Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: Tom Lane (#54)
#57Joe Uhl
joeuhl@gmail.com
In reply to: Neil Whelchel (#35)
#58bricklen
bricklen@gmail.com
In reply to: Neil Whelchel (#1)
#59Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Jon Nelson (#52)
#60Dan Harris
fbsd@drivefaster.net
In reply to: Scott Carey (#31)
#61david@lang.hm
david@lang.hm
In reply to: Joe Uhl (#57)
#62david@lang.hm
david@lang.hm
In reply to: Mladen Gogala (#48)
#63david@lang.hm
david@lang.hm
In reply to: Craig Ringer (#47)
#64Scott Carey
scott@richrelevance.com
In reply to: Sam Gendler (#39)
#65Chris Browne
cbbrowne@acm.org
In reply to: Neil Whelchel (#1)
#66Sam Gendler
sgendler@ideasculptor.com
In reply to: Scott Carey (#64)
#67Scott Carey
scott@richrelevance.com
In reply to: Craig Ringer (#44)
#68Jesper Krogh
jesper@krogh.cc
In reply to: Scott Carey (#64)
#69Scott Carey
scott@richrelevance.com
In reply to: Dan Harris (#60)
#70Scott Carey
scott@richrelevance.com
In reply to: david@lang.hm (#63)
#71Scott Carey
scott@richrelevance.com
In reply to: Scott Carey (#70)
#72Dan Harris
fbsd@drivefaster.net
In reply to: Scott Carey (#69)
#73Tom Lane
tgl@sss.pgh.pa.us
In reply to: Mladen Gogala (#56)
#74Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: david@lang.hm (#62)
#75Jesper Krogh
jesper@krogh.cc
In reply to: Tom Lane (#73)
#76Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jesper Krogh (#75)
#77PFC
lists@peufeu.com
In reply to: Neil Whelchel (#35)
#78Neil Whelchel
neil.whelchel@gmail.com
In reply to: Joe Uhl (#57)
#79Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: PFC (#77)
#80Neil Whelchel
neil.whelchel@gmail.com
In reply to: Dan Harris (#60)
#81Neil Whelchel
neil.whelchel@gmail.com
In reply to: PFC (#77)
#82Dan Harris
fbsd@drivefaster.net
In reply to: Neil Whelchel (#80)
#83Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#73)
#84Neil Whelchel
neil.whelchel@gmail.com
In reply to: Neil Whelchel (#24)
#85Mark Kirkwood
mark.kirkwood@catalyst.net.nz
In reply to: Neil Whelchel (#84)
#86PFC
lists@peufeu.com
In reply to: Neil Whelchel (#81)
#87Neil Whelchel
neil.whelchel@gmail.com
In reply to: Mark Kirkwood (#85)
#88Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: Neil Whelchel (#84)
#89Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: Mark Kirkwood (#85)
#90Mark Kirkwood
mark.kirkwood@catalyst.net.nz
In reply to: Neil Whelchel (#87)
#91Neil Whelchel
neil.whelchel@gmail.com
In reply to: Mark Kirkwood (#90)
#92Vitalii Tymchyshyn
tivv00@gmail.com
In reply to: Craig Ringer (#47)
#93Vitalii Tymchyshyn
tivv00@gmail.com
In reply to: Tom Lane (#76)
#94Craig Ringer
craig@2ndquadrant.com
In reply to: Jesper Krogh (#68)
#95Greg Smith
gsmith@gregsmith.com
In reply to: Dan Harris (#72)
#96Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: Greg Smith (#95)
#97Greg Smith
gsmith@gregsmith.com
In reply to: Mladen Gogala (#96)
#98Robert Haas
robertmhaas@gmail.com
In reply to: Neil Whelchel (#87)
#99Robert Haas
robertmhaas@gmail.com
In reply to: Neil Whelchel (#91)
#100Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Neil Whelchel (#84)
#101Tom Lane
tgl@sss.pgh.pa.us
In reply to: Neil Whelchel (#84)
#102Jesper Krogh
jesper@krogh.cc
In reply to: Robert Haas (#99)
#103Alex Hunsaker
badalex@gmail.com
In reply to: Tom Lane (#101)
#104Alex Hunsaker
badalex@gmail.com
In reply to: Neil Whelchel (#87)
#105david@lang.hm
david@lang.hm
In reply to: Tom Lane (#101)
#106Neil Whelchel
neil.whelchel@gmail.com
In reply to: Mladen Gogala (#96)
#107Neil Whelchel
neil.whelchel@gmail.com
In reply to: Robert Haas (#98)
#108Mark Kirkwood
mark.kirkwood@catalyst.net.nz
In reply to: Neil Whelchel (#91)
#109Mark Kirkwood
mark.kirkwood@catalyst.net.nz
In reply to: Mladen Gogala (#89)
#110Robert Haas
robertmhaas@gmail.com
In reply to: Jesper Krogh (#102)
#111mark
dvlhntr@gmail.com
In reply to: Robert Haas (#99)
#112Jesper Krogh
jesper@krogh.cc
In reply to: mark (#111)
#113Robert Haas
robertmhaas@gmail.com
In reply to: mark (#111)
#114Jesper Krogh
jesper@krogh.cc
In reply to: Robert Haas (#113)
#115Devrim GÜNDÜZ
devrim@gunduz.org
In reply to: Greg Smith (#97)
#116Greg Smith
gsmith@gregsmith.com
In reply to: Jesper Krogh (#114)
#117Bruce Momjian
bruce@momjian.us
In reply to: bricklen (#58)
#118Scott Carey
scott@richrelevance.com
In reply to: Tom Lane (#76)
#119Jesper Krogh
jesper@krogh.cc
In reply to: Scott Carey (#118)
#120Scott Carey
scott@richrelevance.com
In reply to: Jesper Krogh (#119)
#121Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#73)
#122Andrew Dunstan
andrew@dunslane.net
In reply to: Bruce Momjian (#121)
#123Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#122)
#124Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: Bruce Momjian (#121)
#125Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: Andrew Dunstan (#122)
#126Sam Gendler
sgendler@ideasculptor.com
In reply to: Mladen Gogala (#125)
#127Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: Sam Gendler (#126)
#128Sam Gendler
sgendler@ideasculptor.com
In reply to: Mladen Gogala (#127)
#129Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#123)
#130Robert Haas
robertmhaas@gmail.com
In reply to: Mladen Gogala (#125)
#131Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: Robert Haas (#130)
#132Jon Nelson
jnelson+pgsql@jamponi.net
In reply to: Mladen Gogala (#131)
#133Robert Haas
robertmhaas@gmail.com
In reply to: Mladen Gogala (#131)
#134Robert Haas
robertmhaas@gmail.com
In reply to: Jon Nelson (#132)
#135Greg Smith
gsmith@gregsmith.com
In reply to: Mladen Gogala (#131)
#136Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: Greg Smith (#135)
#137Bruce Momjian
bruce@momjian.us
In reply to: Mladen Gogala (#136)
#138Kenneth Marshall
In reply to: Mladen Gogala (#136)
#139Justin Pitts
justinpitts@gmail.com
In reply to: Mladen Gogala (#136)
#140Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: Kenneth Marshall (#138)
#141Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: Justin Pitts (#139)
#142Greg Smith
gsmith@gregsmith.com
In reply to: Mladen Gogala (#136)
#143Bruce Momjian
bruce@momjian.us
In reply to: Greg Smith (#142)
#144Justin Pitts
justinpitts@gmail.com
In reply to: Mladen Gogala (#141)
#145Robert Haas
robertmhaas@gmail.com
In reply to: Greg Smith (#142)
#146Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: Greg Smith (#142)
#147Greg Smith
gsmith@gregsmith.com
In reply to: Mladen Gogala (#146)
#148Vitalii Tymchyshyn
tivv00@gmail.com
In reply to: Robert Haas (#134)
#149david@lang.hm
david@lang.hm
In reply to: Vitalii Tymchyshyn (#148)
#150Kenneth Marshall
In reply to: david@lang.hm (#149)
#151Jon Nelson
jnelson+pgsql@jamponi.net
In reply to: Kenneth Marshall (#150)
#152Robert Haas
robertmhaas@gmail.com
In reply to: Vitalii Tymchyshyn (#148)
#153Robert Haas
robertmhaas@gmail.com
In reply to: david@lang.hm (#149)
#154Vitalii Tymchyshyn
tivv00@gmail.com
In reply to: Robert Haas (#152)
#155Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: Greg Smith (#147)
#156Bruce Momjian
bruce@momjian.us
In reply to: Mladen Gogala (#155)
#157Bruce Momjian
bruce@momjian.us
In reply to: Mladen Gogala (#155)
#158Shaun Thomas
sthomas@peak6.com
In reply to: Mladen Gogala (#155)
#159Tom Lane
tgl@sss.pgh.pa.us
In reply to: Mladen Gogala (#155)
#160Robert Haas
robertmhaas@gmail.com
In reply to: Bruce Momjian (#156)
#161Chris Browne
cbbrowne@acm.org
In reply to: Bruce Momjian (#121)
#162Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#160)
#163Greg Smith
gsmith@gregsmith.com
In reply to: Mladen Gogala (#155)
#164Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: Chris Browne (#161)
#165Ross J. Reedstrom
reedstrm@rice.edu
In reply to: Chris Browne (#161)
#166david@lang.hm
david@lang.hm
In reply to: Robert Haas (#153)
#167Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: Shaun Thomas (#158)
#168Chris Browne
cbbrowne@acm.org
In reply to: Bruce Momjian (#121)
#169Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Mladen Gogala (#167)
#170Shaun Thomas
sthomas@peak6.com
In reply to: Mladen Gogala (#167)
#171Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: Chris Browne (#168)
#172Mark Kirkwood
mark.kirkwood@catalyst.net.nz
In reply to: Mladen Gogala (#167)
#173Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: Shaun Thomas (#170)
#174Ben
bench@silentmedia.com
In reply to: Mladen Gogala (#171)
#175Pavel Stehule
pavel.stehule@gmail.com
In reply to: Mladen Gogala (#171)
#176Josh Berkus
josh@agliodbs.com
In reply to: Chris Browne (#168)
#177Maciek Sakrejda
msakrejda@truviso.com
In reply to: Mladen Gogala (#173)
#178Pavel Stehule
pavel.stehule@gmail.com
In reply to: Mladen Gogala (#173)
#179Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: Josh Berkus (#176)
#180Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: Mladen Gogala (#179)
#181Mark Kirkwood
mark.kirkwood@catalyst.net.nz
In reply to: Josh Berkus (#176)
#182Michael Glaesemann
grzm@seespotcode.net
In reply to: Josh Berkus (#176)
#183Greg Smith
gsmith@gregsmith.com
In reply to: david@lang.hm (#166)
#184Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: Maciek Sakrejda (#177)
#185Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: Mark Kirkwood (#181)
#186Josh Berkus
josh@agliodbs.com
In reply to: Mladen Gogala (#184)
#187Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Mladen Gogala (#184)
#188Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: Josh Berkus (#186)
#189Robert Haas
robertmhaas@gmail.com
In reply to: david@lang.hm (#166)
#190Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: Kevin Grittner (#187)
#191Joshua D. Drake
jd@commandprompt.com
In reply to: Mladen Gogala (#190)
#192Craig James
craig_james@emolecules.com
In reply to: Shaun Thomas (#170)
#193Robert Haas
robertmhaas@gmail.com
In reply to: Mladen Gogala (#190)
#194Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: Joshua D. Drake (#191)
#195Jeremy Harris
jgh@wizmail.org
In reply to: Robert Haas (#189)
#196Shaun Thomas
sthomas@peak6.com
In reply to: Mladen Gogala (#194)
#197david@lang.hm
david@lang.hm
In reply to: Robert Haas (#189)
#198Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: Robert Haas (#193)
#199Jeremy Harris
jgh@wizmail.org
In reply to: Mark Kirkwood (#172)
#200grant
grant@amadensor.com
In reply to: Mladen Gogala (#198)
#201Mark Kirkwood
mark.kirkwood@catalyst.net.nz
In reply to: Jeremy Harris (#199)
#202Robert Haas
robertmhaas@gmail.com
In reply to: david@lang.hm (#197)
#203Robert Haas
robertmhaas@gmail.com
In reply to: Mladen Gogala (#198)
#204david@lang.hm
david@lang.hm
In reply to: Robert Haas (#202)
#205Robert Haas
robertmhaas@gmail.com
In reply to: david@lang.hm (#204)
#206Conor Walsh
ctw@adverb.ly
In reply to: Robert Haas (#205)
#207Joshua D. Drake
jd@commandprompt.com
In reply to: Conor Walsh (#206)
#208Conor Walsh
ctw@adverb.ly
In reply to: Joshua D. Drake (#207)
#209Scott Marlowe
scott.marlowe@gmail.com
In reply to: Mladen Gogala (#198)
#210Jeff Davis
pgsql@j-davis.com
In reply to: Mladen Gogala (#171)
#211Scott Marlowe
scott.marlowe@gmail.com
In reply to: Robert Haas (#205)
#212Mark Mielke
mark@mark.mielke.cc
In reply to: Conor Walsh (#208)
#213Greg Smith
gsmith@gregsmith.com
In reply to: Scott Marlowe (#209)
#214Scott Marlowe
scott.marlowe@gmail.com
In reply to: Greg Smith (#213)
#215Greg Smith
gsmith@gregsmith.com
In reply to: Scott Marlowe (#214)
#216Scott Marlowe
scott.marlowe@gmail.com
In reply to: Greg Smith (#215)
#217Greg Smith
gsmith@gregsmith.com
In reply to: Scott Marlowe (#216)
#218David Wilson
david.t.wilson@gmail.com
In reply to: Robert Haas (#205)
#219Sam Gendler
sgendler@ideasculptor.com
In reply to: David Wilson (#218)
#220Vitalii Tymchyshyn
tivv00@gmail.com
In reply to: david@lang.hm (#166)
#221Vitalii Tymchyshyn
tivv00@gmail.com
In reply to: Mladen Gogala (#179)
#222david@lang.hm
david@lang.hm
In reply to: Vitalii Tymchyshyn (#220)
#223Vitalii Tymchyshyn
tivv00@gmail.com
In reply to: david@lang.hm (#222)
#224Andrew Dunstan
andrew@dunslane.net
In reply to: david@lang.hm (#222)
#225Torsten Zühlsdorff
foo@meisterderspiele.de
In reply to: Mladen Gogala (#190)
#226grant
grant@amadensor.com
In reply to: Scott Marlowe (#209)
#227Kenneth Marshall
In reply to: david@lang.hm (#197)
#228Vitalii Tymchyshyn
tivv00@gmail.com
In reply to: Kenneth Marshall (#227)
In reply to: Robert Haas (#205)
#230Nick Lello
nick.lello@rentrakmail.com
In reply to: Mark Kirkwood (#181)
#231Tom Lane
tgl@sss.pgh.pa.us
In reply to: Mark Mielke (#212)
#232Scott Marlowe
scott.marlowe@gmail.com
In reply to: grant (#226)
#233Robert Haas
robertmhaas@gmail.com
In reply to: Vitalii Tymchyshyn (#228)
#234Bruce Momjian
bruce@momjian.us
In reply to: Greg Smith (#163)
#235Bruce Momjian
bruce@momjian.us
In reply to: Mladen Gogala (#171)
#236Bruce Momjian
bruce@momjian.us
In reply to: Mladen Gogala (#198)
#237Mark Mielke
mark@mark.mielke.cc
In reply to: Tom Lane (#231)
#238Scott Marlowe
scott.marlowe@gmail.com
In reply to: Bruce Momjian (#236)
#239david@lang.hm
david@lang.hm
In reply to: Vitalii Tymchyshyn (#228)
#240Robert Haas
robertmhaas@gmail.com
In reply to: david@lang.hm (#239)
#241Scott Marlowe
scott.marlowe@gmail.com
In reply to: Robert Haas (#240)
#242Greg Smith
gsmith@gregsmith.com
In reply to: Scott Marlowe (#241)
#243Gorshkov
gorshkovlists@gmail.com
In reply to: Scott Marlowe (#214)
#244Robert Haas
robertmhaas@gmail.com
In reply to: Bruce Momjian (#121)
#245Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Robert Haas (#244)
#246Tom Lane
tgl@sss.pgh.pa.us
In reply to: Kevin Grittner (#245)
#247Robert Haas
robertmhaas@gmail.com
In reply to: Kevin Grittner (#245)
#248Shaun Thomas
sthomas@peak6.com
In reply to: Kevin Grittner (#245)
#249Shaun Thomas
sthomas@peak6.com
In reply to: Tom Lane (#246)
#250Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#246)
#251Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Shaun Thomas (#248)
#252Chris Browne
cbbrowne@acm.org
In reply to: Bruce Momjian (#121)
#253Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Robert Haas (#250)
#254Shaun Thomas
sthomas@peak6.com
In reply to: Kevin Grittner (#251)
#255Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Shaun Thomas (#254)
#256Robert Haas
robertmhaas@gmail.com
In reply to: Mark Mielke (#237)
#257Greg Smith
gsmith@gregsmith.com
In reply to: Shaun Thomas (#249)
#258Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Greg Smith (#257)
#259Craig James
craig_james@emolecules.com
In reply to: Kevin Grittner (#251)
#260Tobias Brox
tobixen@gmail.com
In reply to: Bruce Momjian (#121)
#261Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tobias Brox (#260)
#262Vitalii Tymchyshyn
tivv00@gmail.com
In reply to: Tobias Brox (#260)
#263Tobias Brox
tobixen@gmail.com
In reply to: Vitalii Tymchyshyn (#262)
#264Vitalii Tymchyshyn
tivv00@gmail.com
In reply to: Tobias Brox (#263)
#265Tobias Brox
tobixen@gmail.com
In reply to: Vitalii Tymchyshyn (#264)
#266Andrea Suisani
sickpig@opinioni.net
In reply to: Tobias Brox (#265)
#267PFC
lists@peufeu.com
In reply to: Tobias Brox (#260)
#268Rob Wultsch
wultsch@gmail.com
In reply to: Chris Browne (#252)
#269Josh Berkus
josh@agliodbs.com
In reply to: Rob Wultsch (#268)
#270Scott Marlowe
scott.marlowe@gmail.com
In reply to: Josh Berkus (#269)
#271Scott Marlowe
scott.marlowe@gmail.com
In reply to: Bruce Momjian (#121)
#272Bruce Momjian
bruce@momjian.us
In reply to: Kevin Grittner (#251)
#273Robert Haas
robertmhaas@gmail.com
In reply to: Bruce Momjian (#272)
#274pasman pasmański
pasman.p@gmail.com
In reply to: Robert Haas (#273)
#275Robert Klemme
shortcutter@googlemail.com
In reply to: Craig James (#259)