why postgresql is so slow?

Started by Jinqiang Han over 22 years ago, 5 messages
#1 Jinqiang Han
postgresql@db.pku.edu.cn

hello, all.

I have a table with about 2 million rows. When I run "select * from table1" in psql, it takes about 10 minutes to get the result. I wonder if PostgreSQL can return results immediately, like DB2 does.

After that I created an index on a column named id. Executing "select * from table1 where id=10" in psql is much faster than executing "select * from table1 where id<10". Why?

By the way, I'm using PostgreSQL 7.3.

Can anyone tell me the reason? Thanks.

Jinqiang Han

#2 Richard Huxton
dev@archonet.com
In reply to: Jinqiang Han (#1)
Re: why postgresql is so slow?

On Saturday 20 September 2003 10:38, Jinqiang Han wrote:

hello, all.

This isn't really a hackers question - perhaps try the "general", "sql", etc.
lists in future. This list is for questions about the source code of PG.

I have a table with about 2 million rows. When I run "select * from
table1" in psql, it takes about 10 minutes to get the result. I wonder if
PostgreSQL can return results immediately, like DB2 does.

PG is returning *all* the rows you requested. If you want one row at a time,
perhaps use a cursor.
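For example, something along these lines (an untested sketch; note that in 7.3 a cursor has to live inside a transaction block) starts returning rows right away:

BEGIN;
DECLARE c1 CURSOR FOR SELECT * FROM table1;
FETCH 100 FROM c1;   -- fetch the first 100 rows
FETCH 100 FROM c1;   -- ...and the next 100, and so on
CLOSE c1;
COMMIT;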

After that I created an index on a column named id. Executing
"select * from table1 where id=10" in psql is much faster than executing
"select * from table1 where id<10". Why?

Because there are more records returned? You haven't supplied enough information.
If you'd like to repost to the general list, along with details of the tables
and the output of EXPLAIN ANALYSE for your queries, I'll be glad to go into
detail there.
--
Richard Huxton
Archonet Ltd

#3 Jeff
jdavis-pgsql@empires.org
In reply to: Jinqiang Han (#1)
Re: why postgresql is so slow?

Jinqiang Han wrote:

hello, all.

I have a table with about 2 million rows. When I run "select * from
table1" in psql, it takes about 10 minutes to get the result. I wonder if
PostgreSQL can return results immediately, like DB2 does.

If you're executing in psql, it's probably trying to load the whole result
set into RAM at once, then format it, and then pipe it through a pager
(e.g. more). That's not really a test of PostgreSQL; I would recommend
using a cursor or something similar so that psql doesn't give you a false
impression.
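As a rough sanity check of the backend alone (a sketch, assuming the table really is called table1, and that your psql has \timing), time a query that scans the whole table but returns only one row, so psql's formatting and paging stay out of the picture:

\timing
SELECT count(*) FROM table1;   -- full scan, but only a single row is sent back to psql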

After that I created an index on a column named id. Executing
"select * from table1 where id=10" in psql is much faster than executing
"select * from table1 where id<10". Why?

Well, that depends. First, turn on stats collecting and run "VACUUM
ANALYZE". That will collect statistics about your data, which help the
planner make a good choice.

My initial guess is that the second query is not using the index. That may
happen because if "id" is an INT column, then negatives are allowed, and
the planner figures that "id<10" only narrows it down to about half the
possible rows (in which case it would choose a sequential scan). However, we
humans know that the "id" column really holds positive INTs (or at least I
assume it does, based on your implication that "id<10" should be about as
fast as "id=10"), so "id<10" really means at most about 10 rows out of 2M
(which means it should do an index scan). Turning on stats and running
VACUUM ANALYZE will give the planner the information it needs to make
an intelligent choice.
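Concretely, that boils down to something like this (just a sketch with your table name filled in):

VACUUM ANALYZE table1;   -- gather row counts and value distributions for the planner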

If that doesn't help you, give us the table schema and run:
EXPLAIN ANALYZE SELECT * FROM table1 WHERE id=10;
and then run:
EXPLAIN ANALYZE SELECT * FROM table1 WHERE id<10;
and give us the output.

Then run "VACUUM ANALYZE" every once in a while (depending on how fast your
data changes), like every night for instance.
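For a nightly run, a cron entry along these lines would do (a hypothetical example: the database name mydb and the schedule are placeholders, and it assumes the job runs as a user allowed to connect):

0 3 * * *   vacuumdb --analyze mydb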

Regards,
Jeff Davis

#4 Christopher Browne
cbbrowne@acm.org
In reply to: Jinqiang Han (#1)
Re: why postgresql is so slow?

Jeff <jdavis-pgsql@empires.org> wrote:

Then run "VACUUM ANALYZE" every once in a while (depending on how
fast your data changes), like every night for instance.

Consider VACUUM and ANALYZE somewhat separately.

You need to ANALYZE any time the distribution of the data changes.

You need to VACUUM any time a lot of data is invalidated by UPDATEs or
DELETEs. (And when you VACUUM, it's cheap to throw in an ANALYZE.)

That turns into a variety of policies depending on what the pattern of
activity for a particular table is:

- If a table only ever gets INSERTs, you need only ever ANALYZE it,
as VACUUM should be useless. (Caveat: if INSERTs can get rolled
back because transactions abort, that leaves dead tuples for
VACUUM to work with...)

- A table that sees enormous numbers of updates (for instance, a list
of account balances) should be vacuumed REALLY frequently.

- Any time you do a VACUUM, you might as well also do an ANALYZE.
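
In SQL terms, those policies boil down to commands like these (the table name is just an example):

ANALYZE table1;          -- insert-only table: refresh statistics only
VACUUM ANALYZE table1;   -- heavily-updated table: reclaim dead tuples and refresh statistics in one pass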

pg_autovacuum applies these policies, so it's a reasonable thing to
use in 7.3/7.4...
--
If this was helpful, <http://svcs.affero.net/rm.php?r=cbbrowne> rate me
http://www3.sympatico.ca/cbbrowne/
"One of the most dangerous things in the universe is an ignorant
people with real grievances. That is nowhere near as dangerous,
however, as an informed and intelligent society with grievances. The
damage that vengeful intelligence can wreak, you cannot even imagine."
-- Miles Teg, Heretics of Dune

#5 Neil Conway
neilc@samurai.com
In reply to: Jeff (#3)
Re: why postgresql is so slow?

On Sat, 2003-09-20 at 06:14, Jeff wrote:

Well, that depends. First, turn on stats collecting and run "VACUUM
ANALYZE". That will collect statistics about your data, which help the
planner make a good choice.

The statistics collector and the statistics collected by ANALYZE have
nothing to do with one another -- stats collector stats aren't used by
the query planner. So just running ANALYZE or VACUUM ANALYZE should be
sufficient.
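A quick way to see that they are two different things (a sketch, assuming a table named table1): ANALYZE populates the planner statistics visible in pg_stats, while the statistics collector feeds the pg_stat_* views.

ANALYZE table1;
SELECT * FROM pg_stats WHERE tablename = 'table1';            -- planner statistics, used for query planning
SELECT * FROM pg_stat_user_tables WHERE relname = 'table1';   -- collector counters, not used by the planner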

-Neil