Postgresql simple query performance question
Hi
We are in the process of testing for migration of our database from Oracle to Postgresql.
I hava a simple query
Select count(*) from foo
This table has 29384048 rows and is indexed on foo_id
The tables are vacuumed and the explain plan for postgresql is
QUERY PLAN
------------------------------------------
Aggregate (cost=1194020.60..1194020.61 rows=1 width=0) (actual time=68797.280..68797.280 rows=1 loops=1)
-> Seq Scan on foo (cost=0.00..1120560.48 rows=29384048 width=0) (actual
time=0.232..60657.948 rows=29384048 loops=1)
Total runtime: 68797.358 ms
The explain plan for oracle is
OPERATION OBJECT ACCESS_PREDICATES FILTER_PREDICATES
------------------- ------------------------ -------------------- --------------------
SELECT STATEMENT () (null) (null) (null)
SORT (AGGREGATE) (null) (null) (null)
INDEX (FULL SCAN) foo_IDX_ID (null) (null)
Oracle uses index for count(*) query in this case
This query in Oracle takes only 5 sec and in postgresql it takes 1 min 10sec
The same query in oracle without the index and full table scan(like in postgresql) has the
explain plan like this and it takes 34 sec.
select /*+ full(foo1) */ count(*) from foo1
OPERATION OBJECT ACCESS_PREDICATES FILTER_PREDICATES
----------------------- ------------------ -------------------- --------------------
SELECT STATEMENT () (null) (null) (null)
SORT (AGGREGATE) (null) (null) (null)
TABLE ACCESS (FULL) foo (null) (null)
In short the query "Select count(*) from foo" takes the following time:
Postgresql - 1m 10 sec
Oracle(index scan) - 5 sec
Oracle (full table scan) - 34 sec
How can I speed up this query in postgresql ? The other postgres settings are
postgresql
max_connections = 100
shared_buffers = 50000
temp_buffers = 5000
work_mem = 16384
maintenance_work_mem = 262144
fsync = on
wal_sync_method = fsync
effective_cache_size = 300000
random_page_cost = 4
cpu_tuple_cost = 0.01
cpu_index_tuple_cost = 0.001
cpu_operator_cost = 0.0025
Are there any tuning that need to be done in the OS or database side? I had attached the iostat and vmstat results of postgresql
Thanks
__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com
Attachments:
iostat-vmstat result.txttext/plain; name="=?utf-8?q?iostat-vmstat=20result.txt?="Download
Hello
PostgreSQL doesn't use index for COUN(*)
http://www.varlena.com/GeneralBits/18.php
http://sql-info.de/en/postgresql/postgres-gotchas.html#1_7
Regards
Pavel Stehule
Show quoted text
On 06/11/2007, SHARMILA JOTHIRAJAH <sharmi_jo@yahoo.com> wrote:
Hi
We are in the process of testing for migration of our database from Oracle
to Postgresql.
I hava a simple querySelect count(*) from foo
This table has 29384048 rows and is indexed on foo_idThe tables are vacuumed and the explain plan for postgresql is
QUERY PLAN
------------------------------------------
Aggregate (cost=1194020.60..1194020.61 rows=1 width=0) (actual
time=68797.280..68797.280 rows=1 loops=1)-> Seq Scan on foo (cost=0.00..1120560.48 rows=29384048 width=0)
(actual
time=0.232..60657.948 rows=29384048 loops=1)
Total runtime: 68797.358 msThe explain plan for oracle is
OPERATION OBJECT ACCESS_PREDICATES
FILTER_PREDICATES
------------------- ------------------------ --------------------
--------------------
SELECT STATEMENT () (null) (null) (null)SORT (AGGREGATE) (null) (null) (null)
INDEX (FULL SCAN) foo_IDX_ID (null) (null)
Oracle uses index for count(*) query in this case
This query in Oracle takes only 5 sec and in postgresql it takes 1 min
10secThe same query in oracle without the index and full table scan(like in
postgresql) has theexplain plan like this and it takes 34 sec.
select /*+ full(foo1) */ count(*) from foo1
OPERATION OBJECT ACCESS_PREDICATES
FILTER_PREDICATES
----------------------- ------------------ --------------------
--------------------
SELECT STATEMENT () (null) (null) (null)SORT (AGGREGATE) (null) (null) (null)
TABLE ACCESS (FULL) foo (null) (null)In short the query "Select count(*) from foo" takes the following time:
Postgresql - 1m 10 sec
Oracle(index scan) - 5 sec
Oracle (full table scan) - 34 secHow can I speed up this query in postgresql ? The other postgres settings
arepostgresql
max_connections = 100
shared_buffers = 50000
temp_buffers = 5000
work_mem = 16384
maintenance_work_mem = 262144
fsync = on
wal_sync_method = fsync
effective_cache_size = 300000
random_page_cost = 4
cpu_tuple_cost = 0.01
cpu_index_tuple_cost = 0.001
cpu_operator_cost = 0.0025Are there any tuning that need to be done in the OS or database side? I had
attached the iostat and vmstat results of postgresqlThanks
__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com---------------------------(end of
broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly
I have no doubt you're right, Pavel.
But why not?
It could be a simple enhacement.
Il Tuesday 06 November 2007 15:11:02 Pavel Stehule ha scritto:
Hello
PostgreSQL doesn't use index for COUN(*)
http://www.varlena.com/GeneralBits/18.php
http://sql-info.de/en/postgresql/postgres-gotchas.html#1_7Regards
Pavel StehuleOn 06/11/2007, SHARMILA JOTHIRAJAH <sharmi_jo@yahoo.com> wrote:
Hi
We are in the process of testing for migration of our database from
Oracle to Postgresql.
I hava a simple querySelect count(*) from foo
This table has 29384048 rows and is indexed on foo_idThe tables are vacuumed and the explain plan for postgresql is
QUERY PLAN
------------------------------------------
Aggregate (cost=1194020.60..1194020.61 rows=1 width=0) (actual
time=68797.280..68797.280 rows=1 loops=1)-> Seq Scan on foo (cost=0.00..1120560.48 rows=29384048 width=0)
(actual
time=0.232..60657.948 rows=29384048 loops=1)
Total runtime: 68797.358 msThe explain plan for oracle is
OPERATION OBJECT ACCESS_PREDICATES
FILTER_PREDICATES
------------------- ------------------------ --------------------
--------------------
SELECT STATEMENT () (null) (null)
(null)SORT (AGGREGATE) (null) (null)
(null)INDEX (FULL SCAN) foo_IDX_ID (null) (null)
Oracle uses index for count(*) query in this case
This query in Oracle takes only 5 sec and in postgresql it takes 1 min
10secThe same query in oracle without the index and full table scan(like in
postgresql) has theexplain plan like this and it takes 34 sec.
select /*+ full(foo1) */ count(*) from foo1
OPERATION OBJECT ACCESS_PREDICATES
FILTER_PREDICATES
----------------------- ------------------ --------------------
--------------------
SELECT STATEMENT () (null) (null)
(null)SORT (AGGREGATE) (null) (null)
(null) TABLE ACCESS (FULL) foo (null)
(null)In short the query "Select count(*) from foo" takes the following time:
Postgresql - 1m 10 sec
Oracle(index scan) - 5 sec
Oracle (full table scan) - 34 secHow can I speed up this query in postgresql ? The other postgres settings
arepostgresql
max_connections = 100
shared_buffers = 50000
temp_buffers = 5000
work_mem = 16384
maintenance_work_mem = 262144
fsync = on
wal_sync_method = fsync
effective_cache_size = 300000
random_page_cost = 4
cpu_tuple_cost = 0.01
cpu_index_tuple_cost = 0.001
cpu_operator_cost = 0.0025Are there any tuning that need to be done in the OS or database side? I
had attached the iostat and vmstat results of postgresqlThanks
__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com---------------------------(end of
broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match
--
Reg me Please
In response to SHARMILA JOTHIRAJAH <sharmi_jo@yahoo.com>:
Hi
We are in the process of testing for migration of our database from Oracle to Postgresql.
I hava a simple querySelect count(*) from foo
This is asked a lot. The quick answer is that PostgreSQL method of MVCC
makes it impossible to make this query fast. Perhaps, someday, some
brilliant developer will come up with an optimization, but that hasn't
happened yet.
There may be some tweaks you can make to your tuning, see inline below.
However, if you really need a fast, accurate count of rows in that
table, I recommend you create a trigger to track it.
This table has 29384048 rows and is indexed on foo_id
The tables are vacuumed and the explain plan for postgresql is
QUERY PLAN
------------------------------------------
Aggregate (cost=1194020.60..1194020.61 rows=1 width=0) (actual time=68797.280..68797.280 rows=1 loops=1)
-> Seq Scan on foo (cost=0.00..1120560.48 rows=29384048 width=0) (actual
time=0.232..60657.948 rows=29384048 loops=1)
Total runtime: 68797.358 msThe explain plan for oracle is
OPERATION OBJECT ACCESS_PREDICATES FILTER_PREDICATES
------------------- ------------------------ -------------------- --------------------
SELECT STATEMENT () (null) (null) (null)
SORT (AGGREGATE) (null) (null) (null)
INDEX (FULL SCAN) foo_IDX_ID (null) (null)Oracle uses index for count(*) query in this case
This query in Oracle takes only 5 sec and in postgresql it takes 1 min 10secThe same query in oracle without the index and full table scan(like in postgresql) has the
explain plan like this and it takes 34 sec.
select /*+ full(foo1) */ count(*) from foo1
OPERATION OBJECT ACCESS_PREDICATES FILTER_PREDICATES
----------------------- ------------------ -------------------- --------------------
SELECT STATEMENT () (null) (null) (null)
SORT (AGGREGATE) (null) (null) (null)
TABLE ACCESS (FULL) foo (null) (null)In short the query "Select count(*) from foo" takes the following time:
Postgresql - 1m 10 sec
Oracle(index scan) - 5 sec
Oracle (full table scan) - 34 secHow can I speed up this query in postgresql ? The other postgres settings are
postgresql
max_connections = 100
shared_buffers = 50000
How much memory does this system have? What version of PostgreSQL are you
using? If you're using an 8.X version and have more 2G of RAM, this
value is likely too low. Start with 1/4 the available RAM and tune from
there.
temp_buffers = 5000
work_mem = 16384
maintenance_work_mem = 262144
fsync = on
wal_sync_method = fsync
effective_cache_size = 300000
random_page_cost = 4
cpu_tuple_cost = 0.01
cpu_index_tuple_cost = 0.001
cpu_operator_cost = 0.0025Are there any tuning that need to be done in the OS or database side? I had attached the iostat and vmstat results of postgresql
--
Bill Moran
http://www.potentialtech.com
In response to Reg Me Please <regmeplease@gmail.com>:
I have no doubt you're right, Pavel.
But why not?
It could be a simple enhacement.
It's not simple. Do some searches on the mailing lists and you will
find discussion of why it's difficult to do.
Il Tuesday 06 November 2007 15:11:02 Pavel Stehule ha scritto:
Hello
PostgreSQL doesn't use index for COUN(*)
http://www.varlena.com/GeneralBits/18.php
http://sql-info.de/en/postgresql/postgres-gotchas.html#1_7Regards
Pavel StehuleOn 06/11/2007, SHARMILA JOTHIRAJAH <sharmi_jo@yahoo.com> wrote:
Hi
We are in the process of testing for migration of our database from
Oracle to Postgresql.
I hava a simple querySelect count(*) from foo
This table has 29384048 rows and is indexed on foo_idThe tables are vacuumed and the explain plan for postgresql is
QUERY PLAN
------------------------------------------
Aggregate (cost=1194020.60..1194020.61 rows=1 width=0) (actual
time=68797.280..68797.280 rows=1 loops=1)-> Seq Scan on foo (cost=0.00..1120560.48 rows=29384048 width=0)
(actual
time=0.232..60657.948 rows=29384048 loops=1)
Total runtime: 68797.358 msThe explain plan for oracle is
OPERATION OBJECT ACCESS_PREDICATES
FILTER_PREDICATES
------------------- ------------------------ --------------------
--------------------
SELECT STATEMENT () (null) (null)
(null)SORT (AGGREGATE) (null) (null)
(null)INDEX (FULL SCAN) foo_IDX_ID (null) (null)
Oracle uses index for count(*) query in this case
This query in Oracle takes only 5 sec and in postgresql it takes 1 min
10secThe same query in oracle without the index and full table scan(like in
postgresql) has theexplain plan like this and it takes 34 sec.
select /*+ full(foo1) */ count(*) from foo1
OPERATION OBJECT ACCESS_PREDICATES
FILTER_PREDICATES
----------------------- ------------------ --------------------
--------------------
SELECT STATEMENT () (null) (null)
(null)SORT (AGGREGATE) (null) (null)
(null) TABLE ACCESS (FULL) foo (null)
(null)In short the query "Select count(*) from foo" takes the following time:
Postgresql - 1m 10 sec
Oracle(index scan) - 5 sec
Oracle (full table scan) - 34 secHow can I speed up this query in postgresql ? The other postgres settings
arepostgresql
max_connections = 100
shared_buffers = 50000
temp_buffers = 5000
work_mem = 16384
maintenance_work_mem = 262144
fsync = on
wal_sync_method = fsync
effective_cache_size = 300000
random_page_cost = 4
cpu_tuple_cost = 0.01
cpu_index_tuple_cost = 0.001
cpu_operator_cost = 0.0025Are there any tuning that need to be done in the OS or database side? I
had attached the iostat and vmstat results of postgresqlThanks
__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com---------------------------(end of
broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match--
Reg me Please---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match
--
Bill Moran
http://www.potentialtech.com
While I would not spend resources in fine tuning the count(*), I would
spend some to underastand why and how the other ones do it better.
Just to be better.
Il Tuesday 06 November 2007 15:29:34 Bill Moran ha scritto:
In response to Reg Me Please <regmeplease@gmail.com>:
I have no doubt you're right, Pavel.
But why not?
It could be a simple enhacement.It's not simple. Do some searches on the mailing lists and you will
find discussion of why it's difficult to do.Il Tuesday 06 November 2007 15:11:02 Pavel Stehule ha scritto:
Hello
PostgreSQL doesn't use index for COUN(*)
http://www.varlena.com/GeneralBits/18.php
http://sql-info.de/en/postgresql/postgres-gotchas.html#1_7Regards
Pavel StehuleOn 06/11/2007, SHARMILA JOTHIRAJAH <sharmi_jo@yahoo.com> wrote:
Hi
We are in the process of testing for migration of our database from
Oracle to Postgresql.
I hava a simple querySelect count(*) from foo
This table has 29384048 rows and is indexed on foo_idThe tables are vacuumed and the explain plan for postgresql is
QUERY PLAN
------------------------------------------
Aggregate (cost=1194020.60..1194020.61 rows=1 width=0) (actual
time=68797.280..68797.280 rows=1 loops=1)-> Seq Scan on foo (cost=0.00..1120560.48 rows=29384048 width=0)
(actual
time=0.232..60657.948 rows=29384048 loops=1)
Total runtime: 68797.358 msThe explain plan for oracle is
OPERATION OBJECT ACCESS_PREDICATES
FILTER_PREDICATES
------------------- ------------------------ --------------------
--------------------
SELECT STATEMENT () (null) (null)
(null)SORT (AGGREGATE) (null) (null)
(null)INDEX (FULL SCAN) foo_IDX_ID (null) (null)
Oracle uses index for count(*) query in this case
This query in Oracle takes only 5 sec and in postgresql it takes 1
min 10secThe same query in oracle without the index and full table scan(like
in postgresql) has theexplain plan like this and it takes 34 sec.
select /*+ full(foo1) */ count(*) from foo1
OPERATION OBJECT ACCESS_PREDICATES
FILTER_PREDICATES
----------------------- ------------------ --------------------
--------------------
SELECT STATEMENT () (null) (null)
(null)SORT (AGGREGATE) (null) (null)
(null) TABLE ACCESS (FULL) foo (null)
(null)In short the query "Select count(*) from foo" takes the following
time: Postgresql - 1m 10 sec
Oracle(index scan) - 5 sec
Oracle (full table scan) - 34 secHow can I speed up this query in postgresql ? The other postgres
settings arepostgresql
max_connections = 100
shared_buffers = 50000
temp_buffers = 5000
work_mem = 16384
maintenance_work_mem = 262144
fsync = on
wal_sync_method = fsync
effective_cache_size = 300000
random_page_cost = 4
cpu_tuple_cost = 0.01
cpu_index_tuple_cost = 0.001
cpu_operator_cost = 0.0025Are there any tuning that need to be done in the OS or database
side? I had attached the iostat and vmstat results of postgresqlThanks
__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com---------------------------(end of
broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that
your message can get through to the mailing list cleanly---------------------------(end of
broadcast)--------------------------- TIP 9: In versions below 8.0, the
planner will ignore your desire to choose an index scan if your joining
column's datatypes do not match--
Reg me Please---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match
--
Reg me Please
I understand that. But why is that when oracle is given a hint to do full table scan instead of using index to get the count, it is still faster than postgres when both has the same explain plan? Oracle takes 34 sec and postgres takes 1 m10 sec . Is there anything that can be done in postgresql for speeding this up?
Oracle --select /*+ full(foo1) */ count(*) from foo1
OPERATION OBJECT ACCESS_PREDICATES
FILTER_PREDICATES
----------------------- ------------------ --------------------SELECT STATEMENT () (null) (null)
(null)
SORT (AGGREGATE) (null) (null)
(null)
TABLE ACCESS (FULL) foo (null)
(null)
postgresql --Select count(*) from foo
This table has 29384048 rows and is indexed on foo_id
The tables are vacuumed and the explain plan for postgresql is
QUERY PLAN
------------------------------------------
Aggregate (cost=1194020.60..1194020.61 rows=1 width=0) (actual
time=68797.280..68797.280 rows=1 loops=1)-> Seq Scan on foo (cost=0.00..1120560.48 rows=29384048 width=0)
(actual time=0.232..60657.948 rows=29384048 loops=1)
Total runtime: 68797.358 ms
Thanks again
sharmila
----- Original Message ----
From: Pavel Stehule <pavel.stehule@gmail.com>
To: SHARMILA JOTHIRAJAH <sharmi_jo@yahoo.com>
Cc: pgsql-general@postgresql.org
Sent: Tuesday, November 6, 2007 9:11:02 AM
Subject: Re: [GENERAL] Postgresql simple query performance question
Hello
PostgreSQL doesn't use index for COUN(*)
http://www.varlena.com/GeneralBits/18.php
http://sql-info.de/en/postgresql/postgres-gotchas.html#1_7
Regards
Pavel Stehule
On 06/11/2007, SHARMILA JOTHIRAJAH <sharmi_jo@yahoo.com> wrote:
Hi
We are in the process of testing for migration of our database from
Oracle
to Postgresql.
I hava a simple querySelect count(*) from foo
This table has 29384048 rows and is indexed on foo_idThe tables are vacuumed and the explain plan for postgresql is
QUERY PLAN
------------------------------------------
Aggregate (cost=1194020.60..1194020.61 rows=1 width=0) (actual
time=68797.280..68797.280 rows=1 loops=1)-> Seq Scan on foo (cost=0.00..1120560.48 rows=29384048 width=0)
(actual
time=0.232..60657.948 rows=29384048 loops=1)
Total runtime: 68797.358 msThe explain plan for oracle is
OPERATION OBJECT ACCESS_PREDICATES
FILTER_PREDICATES
------------------- ------------------------ --------------------
--------------------
SELECT STATEMENT () (null) (null)
(null)
SORT (AGGREGATE) (null) (null)
(null)
INDEX (FULL SCAN) foo_IDX_ID (null) (null)
Oracle uses index for count(*) query in this case
This query in Oracle takes only 5 sec and in postgresql it takes 1
min
10sec
The same query in oracle without the index and full table scan(like
in
postgresql) has the
explain plan like this and it takes 34 sec.
select /*+ full(foo1) */ count(*) from foo1
OPERATION OBJECT ACCESS_PREDICATES
FILTER_PREDICATES
----------------------- ------------------ --------------------
--------------------
SELECT STATEMENT () (null) (null)
(null)
SORT (AGGREGATE) (null) (null)
(null)
TABLE ACCESS (FULL) foo (null)
(null)
In short the query "Select count(*) from foo" takes the following
time:
Postgresql - 1m 10 sec
Oracle(index scan) - 5 sec
Oracle (full table scan) - 34 secHow can I speed up this query in postgresql ? The other postgres
settings
are
postgresql
max_connections = 100
shared_buffers = 50000
temp_buffers = 5000
work_mem = 16384
maintenance_work_mem = 262144
fsync = on
wal_sync_method = fsync
effective_cache_size = 300000
random_page_cost = 4
cpu_tuple_cost = 0.01
cpu_index_tuple_cost = 0.001
cpu_operator_cost = 0.0025Are there any tuning that need to be done in the OS or database
side? I had
attached the iostat and vmstat results of postgresql
Thanks
__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com---------------------------(end of
broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that
your
message can get through to the mailing list cleanly
__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com
Import Notes
Resolved by subject fallback
Reg Me Please wrote:
While I would not spend resources in fine tuning the count(*), I would
spend some to underastand why and how the other ones do it better.Just to be better.
The problem is well understood, and there is extensive discussion in the
mailing lists archives. The basic problem is that with PG's
implementation of MVCC the indexes don't have row visibility
information. The simple solution of adding it to every index entry would
increase index size substantially imposing costs on every index access
and update.
There's a thread in -hackers called "Visibility map thoughts" that is
looking at the situation again and if/how to implement visibility
information in a compact form.
--
Richard Huxton
Archonet Ltd
Hi,
Oracle, eventhough is a timestamp based database, stores only one
version for each row in the table and the rest of the versions of the
same-row(which might have got created due to updates) in a separate
place called undo log. In postgres, all the versions are stored in the
table. So the table would be bigger than it is in Oracle. Try doing a
Vacuum Full and fire the query. You might save some seconds.
I think we will definitely address this problem in 8.4.
Thanks,
Gokul.
Show quoted text
On Nov 6, 2007 8:07 PM, SHARMILA JOTHIRAJAH <sharmi_jo@yahoo.com> wrote:
I understand that. But why is that when oracle is given a hint to do full
table scan instead of using index to get the count, it is still faster than
postgres when both has the same explain plan? Oracle takes 34 sec and
postgres takes 1 m10 sec . Is there anything that can be done in postgresql
for speeding this up?Oracle --select /*+ full(foo1) */ count(*) from foo1
OPERATION OBJECT ACCESS_PREDICATES
FILTER_PREDICATES
----------------------- ------------------ --------------------SELECT STATEMENT () (null) (null) (null)
SORT (AGGREGATE) (null) (null) (null)
TABLE ACCESS (FULL) foo (null) (null)postgresql --Select count(*) from foo
This table has 29384048 rows and is indexed on foo_id
The tables are vacuumed and the explain plan for postgresql is
QUERY PLAN
------------------------------------------
Aggregate (cost=1194020.60..1194020.61 rows=1 width=0) (actual
time=68797.280..68797.280 rows=1 loops=1)-> Seq Scan on foo (cost=0.00..1120560.48 rows=29384048 width=0)
(actual time=0.232..60657.948 rows=29384048 loops=1)
Total runtime: 68797.358 msThanks again
sharmila----- Original Message ----
From: Pavel Stehule <pavel.stehule@gmail.com>
To: SHARMILA JOTHIRAJAH <sharmi_jo@yahoo.com>
Cc: pgsql-general@postgresql.org
Sent: Tuesday, November 6, 2007 9:11:02 AM
Subject: Re: [GENERAL] Postgresql simple query performance questionHello
PostgreSQL doesn't use index for COUN(*)
http://www.varlena.com/GeneralBits/18.php
http://sql-info.de/en/postgresql/postgres-gotchas.html#1_7Regards
Pavel StehuleOn 06/11/2007, SHARMILA JOTHIRAJAH <sharmi_jo@yahoo.com> wrote:
Hi
We are in the process of testing for migration of our database from Oracle
to Postgresql.
I hava a simple querySelect count(*) from foo
This table has 29384048 rows and is indexed on foo_idThe tables are vacuumed and the explain plan for postgresql is
QUERY PLAN
------------------------------------------
Aggregate (cost=1194020.60..1194020.61 rows=1 width=0) (actual
time=68797.280..68797.280 rows=1 loops=1)-> Seq Scan on foo (cost=0.00..1120560.48 rows=29384048 width=0)
(actual
time=0.232..60657.948 rows=29384048 loops=1)
Total runtime: 68797.358 msThe explain plan for oracle is
OPERATION OBJECT ACCESS_PREDICATES
FILTER_PREDICATES
------------------- ------------------------ --------------------
--------------------
SELECT STATEMENT () (null) (null)(null)
SORT (AGGREGATE) (null) (null)
(null)
INDEX (FULL SCAN) foo_IDX_ID (null) (null)
Oracle uses index for count(*) query in this case
This query in Oracle takes only 5 sec and in postgresql it takes 1 min
10secThe same query in oracle without the index and full table scan(like in
postgresql) has theexplain plan like this and it takes 34 sec.
select /*+ full(foo1) */ count(*) from foo1
OPERATION OBJECT ACCESS_PREDICATES
FILTER_PREDICATES
----------------------- ------------------ --------------------
--------------------
SELECT STATEMENT () (null) (null) (null)SORT (AGGREGATE) (null) (null) (null)
TABLE ACCESS (FULL) foo (null) (null)In short the query "Select count(*) from foo" takes the following time:
Postgresql - 1m 10 sec
Oracle(index scan) - 5 sec
Oracle (full table scan) - 34 secHow can I speed up this query in postgresql ? The other postgres settings
arepostgresql
max_connections = 100
shared_buffers = 50000
temp_buffers = 5000
work_mem = 16384
maintenance_work_mem = 262144
fsync = on
wal_sync_method = fsync
effective_cache_size = 300000
random_page_cost = 4
cpu_tuple_cost = 0.01
cpu_index_tuple_cost = 0.001
cpu_operator_cost = 0.0025Are there any tuning that need to be done in the OS or database side? I
had
attached the iostat and vmstat results of postgresql
Thanks
__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com---------------------------(end of
broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com
In response to André Volpato <andre.volpato@ecomtecnologia.com.br>:
Richard Huxton escreveu:
Reg Me Please wrote:
While I would not spend resources in fine tuning the count(*), I would
spend some to underastand why and how the other ones do it better.Just to be better.
The problem is well understood, and there is extensive discussion in
the mailing lists archives. The basic problem is that with PG's
implementation of MVCC the indexes don't have row visibility
information. The simple solution of adding it to every index entry
would increase index size substantially imposing costs on every index
access and update.There's a thread in -hackers called "Visibility map thoughts" that is
looking at the situation again and if/how to implement visibility
information in a compact form.Remember that you can always use serial fields to count a table, like:
alter table foo add id serial;
select id from foo order by id desc limit 1;This should return the same value than count(*), in a few msecs.
I don't think so. What kind of accuracy do you have when rows are
deleted? Also, sequences are not transactional, so rolled-back
transactions will increment the sequence without actually adding
rows.
--
Bill Moran
http://www.potentialtech.com
Import Notes
Reply to msg id not found: 4730A6C5.3020005@ecomtecnologia.com.br
Richard Huxton escreveu:
Reg Me Please wrote:
While I would not spend resources in fine tuning the count(*), I would
spend some to underastand why and how the other ones do it better.Just to be better.
The problem is well understood, and there is extensive discussion in
the mailing lists archives. The basic problem is that with PG's
implementation of MVCC the indexes don't have row visibility
information. The simple solution of adding it to every index entry
would increase index size substantially imposing costs on every index
access and update.There's a thread in -hackers called "Visibility map thoughts" that is
looking at the situation again and if/how to implement visibility
information in a compact form.
Remember that you can always use serial fields to count a table, like:
alter table foo add id serial;
select id from foo order by id desc limit 1;
This should return the same value than count(*), in a few msecs.
--
ACV
"SHARMILA JOTHIRAJAH" <sharmi_jo@yahoo.com> writes:
I understand that. But why is that when oracle is given a hint to do full
table scan instead of using index to get the count, it is still faster than
postgres when both has the same explain plan? Oracle takes 34 sec and
postgres takes 1 m10 sec . Is there anything that can be done in postgresql
for speeding this up?
How large are the actual respective data files?
What are the columns in these tables? Do you have many char() and NUMERIC
columns?
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's On-Demand Production Tuning
Il Wednesday 07 November 2007 13:08:46 André Volpato ha scritto:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=UTF-8" http-equiv="Content-Type">
<title></title>
</head>
<body bgcolor="#ffffff" text="#000000">
Reid Thompson escreveu:
Would it be possible to avoid the so-called "HTML email body"?
--
Reg me Please
Import Notes
Reply to msg id not found: 4731AACE.6070908@ecomtecnologia.com.br
On Tue, 2007-11-06 at 09:29 -0500, Bill Moran wrote:
In response to SHARMILA JOTHIRAJAH <sharmi_jo@yahoo.com>:
Hi
We are in the process of testing for migration of our database from Oracle to Postgresql.
I hava a simple querySelect count(*) from foo
This is asked a lot. The quick answer is that PostgreSQL method of MVCC
makes it impossible to make this query fast. Perhaps, someday, some
brilliant developer will come up with an optimization, but that hasn't
happened yet.
What release level is being tested? It may already have happened.
8.3 is substantially faster at seq scans, so the tests should be re-run
on 8.3 beta.
Also, re-run the Postgres test. It should be faster the second time,
even if the database server is restarted between tests.
--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=UTF-8" http-equiv="Content-Type">
<title></title>
</head>
<body bgcolor="#ffffff" text="#000000">
Reid Thompson escreveu:
<blockquote cite="mid:1194369123.15405.4.camel@raker.ateb.com"
type="cite">
<pre wrap="">On Tue, 2007-11-06 at 14:39 -0300, André Volpato wrote:
</pre>
<blockquote type="cite">
<pre wrap="">Remember that you can always use serial fields to count a table, like:
alter table foo add id serial;
select id from foo order by id desc limit 1;
This should return the same value than count(*), in a few msecs.
--
ACV
---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to <a class="moz-txt-link-abbreviated" href="mailto:majordomo@postgresql.org">majordomo@postgresql.org</a> so that your
message can get through to the mailing list cleanly
</pre>
</blockquote>
<pre wrap=""><!---->
not so...
test=# select version();
version
----------------------------------------------------------------------------------------------------------------
PostgreSQL 8.2.4 on i686-pc-linux-gnu, compiled by GCC
i686-pc-linux-gnu-gcc (GCC) 4.1.2 (Gentoo 4.1.2 p1.0.1)
(1 row)
test=# create table serialcount(aval integer);
CREATE TABLE
test=# \timing
Timing is on.
test=# insert into serialcount values ( generate_series(1,10000000));
INSERT 0 10000000
Time: 42297.468 ms
test=# select count(*) from serialcount;
count
----------
10000000
(1 row)
Time: 6158.188 ms
test=# select count(*) from serialcount;
count
----------
10000000
(1 row)
Time: 2366.596 ms
test=# select count(*) from serialcount;
count
----------
10000000
(1 row)
Time: 2090.416 ms
test=# select count(*) from serialcount;
count
----------
10000000
(1 row)
Time: 2125.377 ms
test=# select count(*) from serialcount;
count
----------
10000000
(1 row)
Time: 2122.584 ms
test=# alter table serialcount add id serial;
NOTICE: ALTER TABLE will create implicit sequence "serialcount_id_seq"
for serial column "serialcount.id"
ALTER TABLE
Time: 51733.139 ms
test=# select id from serialcount order by id desc limit 1;
id
----------
10000000
(1 row)
Time: 41088.062 ms
test=# select id from serialcount order by id desc limit 1;
id
----------
10000000
(1 row)
Time: 35638.317 ms
test=# vacuum analyze serialcount;
VACUUM
Time: 927.760 ms
test=# select id from serialcount order by id desc limit 1;
id
----------
10000000
(1 row)
Time: 34281.178 ms
</pre>
</blockquote>
<br>
I meant to select using an index. I´ve done the same tests here, and
realized that my server is two times slower than yours:<br>
<br>
testeprog=# select version();<br>
version<br>
---------------------------------------------------------------------------------------------------------<br>
PostgreSQL 8.1.4 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.3.3
20040412 (Red Hat Linux 3.3.3-7)<br>
(1 row)<br>
<br>
testeprog=# select count(*) from test;<br>
count<br>
----------<br>
10000000<br>
(1 row)<br>
<br>
Time: 4116.613 ms<br>
<br>
testeprog=# alter table test add id serial;<br>
NOTICE: ALTER TABLE will create implicit sequence "test_id_seq" for
serial column "test.id"<br>
ALTER TABLE<br>
Time: 90617.195 ms<br>
<br>
testeprog=# select id from test order by id desc limit 1;<br>
id<br>
----------<br>
10000000<br>
(1 row)<br>
<br>
Time: 64856.553 ms<br>
<br>
testeprog=# create unique index itest1 on test using btree (id);<br>
CREATE INDEX<br>
Time: 29026.891 ms<br>
<br>
<br>
testeprog=# explain analyze select id from test order by id desc limit
1;<br>
QUERY
PLAN<br>
----------------------------------------------------------------------------------------------------------------------------------------<br>
Limit (cost=0.00..0.02 rows=1 width=4) (actual time=0.017..0.019
rows=1 loops=1)<br>
-> Index Scan Backward using itest1 on test
(cost=0.00..185954.00 rows=10000000 width=4) (actual time=0.014..0.014
rows=1 loops=1)<br>
Total runtime: 0.059 ms<br>
(3 rows)<br>
<br>
<br>
<br>
@Bill:<br>
Bill Moran wrote
<blockquote cite="mid:20071106115927.2aea950e.wmoran@potentialtech.com"
type="cite">
<pre wrap="">
I don't think so. What kind of accuracy do you have when rows are
deleted? Also, sequences are not transactional, so rolled-back
transactions will increment the sequence without actually adding
rows.
</pre>
</blockquote>
<br>
You are right, the serial hack should not work in most oltp cases.<br>
<br>
--<br>
ACV<br>
<br>
</body>
</html>
Import Notes
Reply to msg id not found: 1194369123.15405.4.camel@raker.ateb.com
Hi,
The table has 43 columns. I have attached the columns-list.They have many char() and numeric columns.
For the table size, these are the corresponding entries from the pg_class
foo is the table and the others are some of its indexes.
relname reltuples relpages
foo 2.9384E7 825699
foo_idx_pat 2.9384E7 684995
foo_idx_service 2.9384E7 433549
foo_idx_serv 2.9384E7 433435
foo_pk 2.9384E7 109057
Thanks
Sharmila
----- Original Message ----
From: Gregory Stark <stark@enterprisedb.com>
To: SHARMILA JOTHIRAJAH <sharmi_jo@yahoo.com>
Cc: Pavel Stehule <pavel.stehule@gmail.com>; pgsql-general@postgresql.org
Sent: Tuesday, November 6, 2007 8:03:48 PM
Subject: Re: [GENERAL] Postgresql simple query performance question
"SHARMILA JOTHIRAJAH" <sharmi_jo@yahoo.com> writes:
I understand that. But why is that when oracle is given a hint to do
full
table scan instead of using index to get the count, it is still
faster than
postgres when both has the same explain plan? Oracle takes 34 sec and
postgres takes 1 m10 sec . Is there anything that can be done in
postgresql
for speeding this up?
How large are the actual respective data files?
What are the columns in these tables? Do you have many char() and
NUMERIC
columns?
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's On-Demand Production Tuning
---------------------------(end of
broadcast)---------------------------
TIP 6: explain analyze is your friend
__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com
Attachments:
foo table.txttext/plain; name="=?utf-8?q?foo=20table.txt?="Download
Import Notes
Resolved by subject fallback
Hi
we are testing with version PostgreSQL 8.2.3. We already have a production system in Oracle and we wanted to migrate it to postgresql. If some tests are already done, are the results available for us to see?
Ill also check postgres 8.3 beta.
Thanks again
Sharmila
----- Original Message ----
From: Simon Riggs <simon@2ndquadrant.com>
To: Bill Moran <wmoran@potentialtech.com>
Cc: SHARMILA JOTHIRAJAH <sharmi_jo@yahoo.com>; pgsql-general@postgresql.org
Sent: Wednesday, November 7, 2007 6:34:26 AM
Subject: Re: [GENERAL] Postgresql simple query performance question
On Tue, 2007-11-06 at 09:29 -0500, Bill Moran wrote:
In response to SHARMILA JOTHIRAJAH <sharmi_jo@yahoo.com>:
Hi
We are in the process of testing for migration of our database from
Oracle to Postgresql.
I hava a simple query
Select count(*) from foo
This is asked a lot. The quick answer is that PostgreSQL method of
MVCC
makes it impossible to make this query fast. Perhaps, someday, some
brilliant developer will come up with an optimization, but that
hasn't
happened yet.
What release level is being tested? It may already have happened.
8.3 is substantially faster at seq scans, so the tests should be re-run
on 8.3 beta.
Also, re-run the Postgres test. It should be faster the second time,
even if the database server is restarted between tests.
--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com
---------------------------(end of
broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings
__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com
Import Notes
Resolved by subject fallback
Il Wednesday 07 November 2007 13:47:26 SHARMILA JOTHIRAJAH ha scritto:
Hi
we are testing with version PostgreSQL 8.2.3.
Why not using at least the current 8.2.5?
Read here
http://www.postgresql.org/docs/current/static/release.html
for details.
--
Reg me Please