Database I/O and other performance questions.

Started by Lincoln Yeohover 25 years ago3 messagesgeneral
Jump to latest
#1Lincoln Yeoh
lyeoh@pop.jaring.my

Hi everyone,

What typical/max database I/O throughputs are people getting for simple
selects of datasets cached in memory but not explicitly clustered?
100KB/sec? 1MB/sec? 10MB/sec? (assuming << 8KB rows).

Does a "select count(*) from tablename where ..." actually retrieve all
columns from all the selected rows or does it only count the rows?

That is to say:
would/should select count(*) be slower than select
count(averysmallcolumnmaybeboolean)

I'll test stuff out in practice, but it'll be good to know what it should
be in theory.

Thanks!
Link.

#2Bruce Momjian
bruce@momjian.us
In reply to: Lincoln Yeoh (#1)
Re: Database I/O and other performance questions.

Hi everyone,

What typical/max database I/O throughputs are people getting for simple
selects of datasets cached in memory but not explicitly clustered?
100KB/sec? 1MB/sec? 10MB/sec? (assuming << 8KB rows).

I see transfer rates that match the I/O speed of the disk.

Does a "select count(*) from tablename where ..." actually retrieve all
columns from all the selected rows or does it only count the rows?

Yes, it reads all rows.

That is to say:
would/should select count(*) be slower than select
count(averysmallcolumnmaybeboolean)

Same speed.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Lincoln Yeoh (#1)
Re: Database I/O and other performance questions.

Lincoln Yeoh <lyeoh@pop.jaring.my> writes:

would/should select count(*) be slower than select
count(averysmallcolumnmaybeboolean)

There is actually a semantic difference there, see
http://www.postgresql.org/devel-corner/docs/postgres/sql-expressions.htm#SYNTAX-AGGREGATES
count(*) means the count of selected rows, but count(foo) means the
number of selected rows where the variable or expression is not NULL.

count(*) should be a fraction faster, because it doesn't expend any
cycles to check whether the specific column is NULL. The "*" in this
syntax doesn't mean "all columns" the way it does in a select list,
it's just a dummy.

regards, tom lane