Postgresql simple query performance question

Started by SHARMILA JOTHIRAJAHover 18 years ago18 messagesgeneral

Jump to latest

SHARMILA JOTHIRAJAH

sharmi_jo@yahoo.com

over 18 years ago

Hi
We are in the process of testing for migration of our database from Oracle to Postgresql.
I hava a simple query

Select count(*) from foo
This table has 29384048 rows and is indexed on foo_id

The tables are vacuumed and the explain plan for postgresql is

QUERY PLAN

------------------------------------------
Aggregate (cost=1194020.60..1194020.61 rows=1 width=0) (actual time=68797.280..68797.280 rows=1 loops=1)
-> Seq Scan on foo (cost=0.00..1120560.48 rows=29384048 width=0) (actual
time=0.232..60657.948 rows=29384048 loops=1)
Total runtime: 68797.358 ms

The explain plan for oracle is

OPERATION OBJECT ACCESS_PREDICATES FILTER_PREDICATES
------------------- ------------------------ -------------------- --------------------
SELECT STATEMENT () (null) (null) (null)
SORT (AGGREGATE) (null) (null) (null)
INDEX (FULL SCAN) foo_IDX_ID (null) (null)

Oracle uses index for count(*) query in this case
This query in Oracle takes only 5 sec and in postgresql it takes 1 min 10sec

The same query in oracle without the index and full table scan(like in postgresql) has the

explain plan like this and it takes 34 sec.

select /*+ full(foo1) */ count(*) from foo1

OPERATION OBJECT ACCESS_PREDICATES FILTER_PREDICATES
----------------------- ------------------ -------------------- --------------------
SELECT STATEMENT () (null) (null) (null)
SORT (AGGREGATE) (null) (null) (null)
TABLE ACCESS (FULL) foo (null) (null)

In short the query "Select count(*) from foo" takes the following time:
Postgresql - 1m 10 sec
Oracle(index scan) - 5 sec
Oracle (full table scan) - 34 sec

How can I speed up this query in postgresql ? The other postgres settings are

postgresql

max_connections = 100
shared_buffers = 50000
temp_buffers = 5000
work_mem = 16384
maintenance_work_mem = 262144
fsync = on
wal_sync_method = fsync
effective_cache_size = 300000
random_page_cost = 4
cpu_tuple_cost = 0.01
cpu_index_tuple_cost = 0.001
cpu_operator_cost = 0.0025

Are there any tuning that need to be done in the OS or database side? I had attached the iostat and vmstat results of postgresql

Thanks

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

Pavel Stehule

pavel.stehule@gmail.com

over 18 years ago

In reply to: SHARMILA JOTHIRAJAH (#1)

Re: Postgresql simple query performance question

Hello

PostgreSQL doesn't use index for COUN(*)

http://www.varlena.com/GeneralBits/18.php
http://sql-info.de/en/postgresql/postgres-gotchas.html#1_7

Regards
Pavel Stehule

Show quoted text

On 06/11/2007, SHARMILA JOTHIRAJAH <sharmi_jo@yahoo.com> wrote:

Hi
We are in the process of testing for migration of our database from Oracle
to Postgresql.
I hava a simple query

Select count(*) from foo
This table has 29384048 rows and is indexed on foo_id

The tables are vacuumed and the explain plan for postgresql is

QUERY PLAN

------------------------------------------
Aggregate (cost=1194020.60..1194020.61 rows=1 width=0) (actual
time=68797.280..68797.280 rows=1 loops=1)

-> Seq Scan on foo (cost=0.00..1120560.48 rows=29384048 width=0)
(actual
time=0.232..60657.948 rows=29384048 loops=1)
Total runtime: 68797.358 ms

The explain plan for oracle is

OPERATION OBJECT ACCESS_PREDICATES
FILTER_PREDICATES
------------------- ------------------------ --------------------
--------------------
SELECT STATEMENT () (null) (null) (null)

SORT (AGGREGATE) (null) (null) (null)

INDEX (FULL SCAN) foo_IDX_ID (null) (null)

Oracle uses index for count(*) query in this case
This query in Oracle takes only 5 sec and in postgresql it takes 1 min
10sec

The same query in oracle without the index and full table scan(like in
postgresql) has the

explain plan like this and it takes 34 sec.

select /*+ full(foo1) */ count(*) from foo1

OPERATION OBJECT ACCESS_PREDICATES
FILTER_PREDICATES
----------------------- ------------------ --------------------
--------------------
SELECT STATEMENT () (null) (null) (null)

SORT (AGGREGATE) (null) (null) (null)
TABLE ACCESS (FULL) foo (null) (null)

In short the query "Select count(*) from foo" takes the following time:
Postgresql - 1m 10 sec
Oracle(index scan) - 5 sec
Oracle (full table scan) - 34 sec

How can I speed up this query in postgresql ? The other postgres settings
are

postgresql

max_connections = 100
shared_buffers = 50000
temp_buffers = 5000
work_mem = 16384
maintenance_work_mem = 262144
fsync = on
wal_sync_method = fsync
effective_cache_size = 300000
random_page_cost = 4
cpu_tuple_cost = 0.01
cpu_index_tuple_cost = 0.001
cpu_operator_cost = 0.0025

Are there any tuning that need to be done in the OS or database side? I had
attached the iostat and vmstat results of postgresql

Thanks

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

---------------------------(end of
broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Reg Me Please

regmeplease@gmail.com

over 18 years ago

In reply to: Pavel Stehule (#2)

Re: Postgresql simple query performance question

I have no doubt you're right, Pavel.
But why not?
It could be a simple enhacement.

Il Tuesday 06 November 2007 15:11:02 Pavel Stehule ha scritto:

Hello

PostgreSQL doesn't use index for COUN(*)

http://www.varlena.com/GeneralBits/18.php
http://sql-info.de/en/postgresql/postgres-gotchas.html#1_7

Regards
Pavel Stehule

On 06/11/2007, SHARMILA JOTHIRAJAH <sharmi_jo@yahoo.com> wrote:

Hi
We are in the process of testing for migration of our database from
Oracle to Postgresql.
I hava a simple query

Select count(*) from foo
This table has 29384048 rows and is indexed on foo_id

The tables are vacuumed and the explain plan for postgresql is

QUERY PLAN

------------------------------------------
Aggregate (cost=1194020.60..1194020.61 rows=1 width=0) (actual
time=68797.280..68797.280 rows=1 loops=1)

-> Seq Scan on foo (cost=0.00..1120560.48 rows=29384048 width=0)
(actual
time=0.232..60657.948 rows=29384048 loops=1)
Total runtime: 68797.358 ms

The explain plan for oracle is

OPERATION OBJECT ACCESS_PREDICATES
FILTER_PREDICATES
------------------- ------------------------ --------------------
--------------------
SELECT STATEMENT () (null) (null)
(null)

SORT (AGGREGATE) (null) (null)
(null)

INDEX (FULL SCAN) foo_IDX_ID (null) (null)

Oracle uses index for count(*) query in this case
This query in Oracle takes only 5 sec and in postgresql it takes 1 min
10sec

The same query in oracle without the index and full table scan(like in
postgresql) has the

explain plan like this and it takes 34 sec.

select /*+ full(foo1) */ count(*) from foo1

OPERATION OBJECT ACCESS_PREDICATES
FILTER_PREDICATES
----------------------- ------------------ --------------------
--------------------
SELECT STATEMENT () (null) (null)
(null)

SORT (AGGREGATE) (null) (null)
(null) TABLE ACCESS (FULL) foo (null)
(null)

In short the query "Select count(*) from foo" takes the following time:
Postgresql - 1m 10 sec
Oracle(index scan) - 5 sec
Oracle (full table scan) - 34 sec

How can I speed up this query in postgresql ? The other postgres settings
are

postgresql

max_connections = 100
shared_buffers = 50000
temp_buffers = 5000
work_mem = 16384
maintenance_work_mem = 262144
fsync = on
wal_sync_method = fsync
effective_cache_size = 300000
random_page_cost = 4
cpu_tuple_cost = 0.01
cpu_index_tuple_cost = 0.001
cpu_operator_cost = 0.0025

Are there any tuning that need to be done in the OS or database side? I
had attached the iostat and vmstat results of postgresql

Thanks

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

---------------------------(end of
broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

--
Reg me Please

Bill Moran

wmoran@potentialtech.com

over 18 years ago

In reply to: SHARMILA JOTHIRAJAH (#1)

Re: Postgresql simple query performance question

In response to SHARMILA JOTHIRAJAH <sharmi_jo@yahoo.com>:

Hi
We are in the process of testing for migration of our database from Oracle to Postgresql.
I hava a simple query

Select count(*) from foo

This is asked a lot. The quick answer is that PostgreSQL method of MVCC
makes it impossible to make this query fast. Perhaps, someday, some
brilliant developer will come up with an optimization, but that hasn't
happened yet.

There may be some tweaks you can make to your tuning, see inline below.
However, if you really need a fast, accurate count of rows in that
table, I recommend you create a trigger to track it.

This table has 29384048 rows and is indexed on foo_id

The tables are vacuumed and the explain plan for postgresql is

QUERY PLAN

------------------------------------------
Aggregate (cost=1194020.60..1194020.61 rows=1 width=0) (actual time=68797.280..68797.280 rows=1 loops=1)
-> Seq Scan on foo (cost=0.00..1120560.48 rows=29384048 width=0) (actual
time=0.232..60657.948 rows=29384048 loops=1)
Total runtime: 68797.358 ms

The explain plan for oracle is

OPERATION OBJECT ACCESS_PREDICATES FILTER_PREDICATES
------------------- ------------------------ -------------------- --------------------
SELECT STATEMENT () (null) (null) (null)
SORT (AGGREGATE) (null) (null) (null)
INDEX (FULL SCAN) foo_IDX_ID (null) (null)

Oracle uses index for count(*) query in this case
This query in Oracle takes only 5 sec and in postgresql it takes 1 min 10sec

The same query in oracle without the index and full table scan(like in postgresql) has the

explain plan like this and it takes 34 sec.

select /*+ full(foo1) */ count(*) from foo1

OPERATION OBJECT ACCESS_PREDICATES FILTER_PREDICATES
----------------------- ------------------ -------------------- --------------------
SELECT STATEMENT () (null) (null) (null)
SORT (AGGREGATE) (null) (null) (null)
TABLE ACCESS (FULL) foo (null) (null)

In short the query "Select count(*) from foo" takes the following time:
Postgresql - 1m 10 sec
Oracle(index scan) - 5 sec
Oracle (full table scan) - 34 sec

How can I speed up this query in postgresql ? The other postgres settings are

postgresql

max_connections = 100
shared_buffers = 50000

How much memory does this system have? What version of PostgreSQL are you
using? If you're using an 8.X version and have more 2G of RAM, this
value is likely too low. Start with 1/4 the available RAM and tune from
there.

temp_buffers = 5000
work_mem = 16384
maintenance_work_mem = 262144
fsync = on
wal_sync_method = fsync
effective_cache_size = 300000
random_page_cost = 4
cpu_tuple_cost = 0.01
cpu_index_tuple_cost = 0.001
cpu_operator_cost = 0.0025

Are there any tuning that need to be done in the OS or database side? I had attached the iostat and vmstat results of postgresql

--
Bill Moran
http://www.potentialtech.com

Bill Moran

wmoran@potentialtech.com

over 18 years ago

In reply to: Reg Me Please (#3)

Re: Postgresql simple query performance question

In response to Reg Me Please <regmeplease@gmail.com>:

I have no doubt you're right, Pavel.
But why not?
It could be a simple enhacement.

It's not simple. Do some searches on the mailing lists and you will
find discussion of why it's difficult to do.

Il Tuesday 06 November 2007 15:11:02 Pavel Stehule ha scritto:

Hello

PostgreSQL doesn't use index for COUN(*)

http://www.varlena.com/GeneralBits/18.php
http://sql-info.de/en/postgresql/postgres-gotchas.html#1_7

Regards
Pavel Stehule

On 06/11/2007, SHARMILA JOTHIRAJAH <sharmi_jo@yahoo.com> wrote:

Hi
We are in the process of testing for migration of our database from
Oracle to Postgresql.
I hava a simple query

Select count(*) from foo
This table has 29384048 rows and is indexed on foo_id

The tables are vacuumed and the explain plan for postgresql is

QUERY PLAN

------------------------------------------
Aggregate (cost=1194020.60..1194020.61 rows=1 width=0) (actual
time=68797.280..68797.280 rows=1 loops=1)

-> Seq Scan on foo (cost=0.00..1120560.48 rows=29384048 width=0)
(actual
time=0.232..60657.948 rows=29384048 loops=1)
Total runtime: 68797.358 ms

The explain plan for oracle is

OPERATION OBJECT ACCESS_PREDICATES
FILTER_PREDICATES
------------------- ------------------------ --------------------
--------------------
SELECT STATEMENT () (null) (null)
(null)

SORT (AGGREGATE) (null) (null)
(null)

INDEX (FULL SCAN) foo_IDX_ID (null) (null)

Oracle uses index for count(*) query in this case
This query in Oracle takes only 5 sec and in postgresql it takes 1 min
10sec

The same query in oracle without the index and full table scan(like in
postgresql) has the

explain plan like this and it takes 34 sec.

select /*+ full(foo1) */ count(*) from foo1

OPERATION OBJECT ACCESS_PREDICATES
FILTER_PREDICATES
----------------------- ------------------ --------------------
--------------------
SELECT STATEMENT () (null) (null)
(null)

SORT (AGGREGATE) (null) (null)
(null) TABLE ACCESS (FULL) foo (null)
(null)

In short the query "Select count(*) from foo" takes the following time:
Postgresql - 1m 10 sec
Oracle(index scan) - 5 sec
Oracle (full table scan) - 34 sec

How can I speed up this query in postgresql ? The other postgres settings
are

postgresql

max_connections = 100
shared_buffers = 50000
temp_buffers = 5000
work_mem = 16384
maintenance_work_mem = 262144
fsync = on
wal_sync_method = fsync
effective_cache_size = 300000
random_page_cost = 4
cpu_tuple_cost = 0.01
cpu_index_tuple_cost = 0.001
cpu_operator_cost = 0.0025

Are there any tuning that need to be done in the OS or database side? I
had attached the iostat and vmstat results of postgresql

Thanks

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

---------------------------(end of
broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

--
Reg me Please

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

--
Bill Moran
http://www.potentialtech.com

Reg Me Please

regmeplease@gmail.com

over 18 years ago

In reply to: Bill Moran (#5)

Re: Postgresql simple query performance question

While I would not spend resources in fine tuning the count(*), I would
spend some to underastand why and how the other ones do it better.

Just to be better.

Il Tuesday 06 November 2007 15:29:34 Bill Moran ha scritto:

In response to Reg Me Please <regmeplease@gmail.com>:

I have no doubt you're right, Pavel.
But why not?
It could be a simple enhacement.

It's not simple. Do some searches on the mailing lists and you will
find discussion of why it's difficult to do.

Il Tuesday 06 November 2007 15:11:02 Pavel Stehule ha scritto:

Hello

PostgreSQL doesn't use index for COUN(*)

http://www.varlena.com/GeneralBits/18.php
http://sql-info.de/en/postgresql/postgres-gotchas.html#1_7

Regards
Pavel Stehule

On 06/11/2007, SHARMILA JOTHIRAJAH <sharmi_jo@yahoo.com> wrote:

Hi
We are in the process of testing for migration of our database from
Oracle to Postgresql.
I hava a simple query

Select count(*) from foo
This table has 29384048 rows and is indexed on foo_id

The tables are vacuumed and the explain plan for postgresql is

QUERY PLAN

------------------------------------------
Aggregate (cost=1194020.60..1194020.61 rows=1 width=0) (actual
time=68797.280..68797.280 rows=1 loops=1)

-> Seq Scan on foo (cost=0.00..1120560.48 rows=29384048 width=0)
(actual
time=0.232..60657.948 rows=29384048 loops=1)
Total runtime: 68797.358 ms

The explain plan for oracle is

OPERATION OBJECT ACCESS_PREDICATES
FILTER_PREDICATES
------------------- ------------------------ --------------------
--------------------
SELECT STATEMENT () (null) (null)
(null)

SORT (AGGREGATE) (null) (null)
(null)

INDEX (FULL SCAN) foo_IDX_ID (null) (null)

Oracle uses index for count(*) query in this case
This query in Oracle takes only 5 sec and in postgresql it takes 1
min 10sec

The same query in oracle without the index and full table scan(like
in postgresql) has the

explain plan like this and it takes 34 sec.

select /*+ full(foo1) */ count(*) from foo1

OPERATION OBJECT ACCESS_PREDICATES
FILTER_PREDICATES
----------------------- ------------------ --------------------
--------------------
SELECT STATEMENT () (null) (null)
(null)

SORT (AGGREGATE) (null) (null)
(null) TABLE ACCESS (FULL) foo (null)
(null)

In short the query "Select count(*) from foo" takes the following
time: Postgresql - 1m 10 sec
Oracle(index scan) - 5 sec
Oracle (full table scan) - 34 sec

How can I speed up this query in postgresql ? The other postgres
settings are

postgresql

max_connections = 100
shared_buffers = 50000
temp_buffers = 5000
work_mem = 16384
maintenance_work_mem = 262144
fsync = on
wal_sync_method = fsync
effective_cache_size = 300000
random_page_cost = 4
cpu_tuple_cost = 0.01
cpu_index_tuple_cost = 0.001
cpu_operator_cost = 0.0025

Are there any tuning that need to be done in the OS or database
side? I had attached the iostat and vmstat results of postgresql

Thanks

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

---------------------------(end of
broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that
your message can get through to the mailing list cleanly

---------------------------(end of
broadcast)--------------------------- TIP 9: In versions below 8.0, the
planner will ignore your desire to choose an index scan if your joining
column's datatypes do not match

--
Reg me Please

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

--
Reg me Please

SHARMILA JOTHIRAJAH

sharmi_jo@yahoo.com

over 18 years ago

In reply to: Reg Me Please (#6)

Re: Postgresql simple query performance question

I understand that. But why is that when oracle is given a hint to do full table scan instead of using index to get the count, it is still faster than postgres when both has the same explain plan? Oracle takes 34 sec and postgres takes 1 m10 sec . Is there anything that can be done in postgresql for speeding this up?

Oracle --select /*+ full(foo1) */ count(*) from foo1

OPERATION OBJECT ACCESS_PREDICATES
FILTER_PREDICATES
----------------------- ------------------ --------------------

SELECT STATEMENT () (null) (null)

(null)

SORT (AGGREGATE) (null) (null)

(null)

TABLE ACCESS (FULL) foo (null)

(null)

postgresql --Select count(*) from foo

This table has 29384048 rows and is indexed on foo_id

The tables are vacuumed and the explain plan for postgresql is

QUERY PLAN

------------------------------------------
Aggregate (cost=1194020.60..1194020.61 rows=1 width=0) (actual
time=68797.280..68797.280 rows=1 loops=1)

-> Seq Scan on foo (cost=0.00..1120560.48 rows=29384048 width=0)
(actual time=0.232..60657.948 rows=29384048 loops=1)
Total runtime: 68797.358 ms

Thanks again
sharmila

----- Original Message ----
From: Pavel Stehule <pavel.stehule@gmail.com>
To: SHARMILA JOTHIRAJAH <sharmi_jo@yahoo.com>
Cc: pgsql-general@postgresql.org
Sent: Tuesday, November 6, 2007 9:11:02 AM
Subject: Re: [GENERAL] Postgresql simple query performance question

Hello

PostgreSQL doesn't use index for COUN(*)

http://www.varlena.com/GeneralBits/18.php
http://sql-info.de/en/postgresql/postgres-gotchas.html#1_7

Regards
Pavel Stehule

On 06/11/2007, SHARMILA JOTHIRAJAH <sharmi_jo@yahoo.com> wrote:

Hi
We are in the process of testing for migration of our database from

Oracle

to Postgresql.
I hava a simple query

Select count(*) from foo
This table has 29384048 rows and is indexed on foo_id

The tables are vacuumed and the explain plan for postgresql is

QUERY PLAN

------------------------------------------
Aggregate (cost=1194020.60..1194020.61 rows=1 width=0) (actual
time=68797.280..68797.280 rows=1 loops=1)

-> Seq Scan on foo (cost=0.00..1120560.48 rows=29384048 width=0)
(actual
time=0.232..60657.948 rows=29384048 loops=1)
Total runtime: 68797.358 ms

The explain plan for oracle is

OPERATION OBJECT ACCESS_PREDICATES
FILTER_PREDICATES
------------------- ------------------------ --------------------
--------------------
SELECT STATEMENT () (null) (null)

(null)

SORT (AGGREGATE) (null) (null)

(null)

INDEX (FULL SCAN) foo_IDX_ID (null) (null)

Oracle uses index for count(*) query in this case
This query in Oracle takes only 5 sec and in postgresql it takes 1

min

10sec

The same query in oracle without the index and full table scan(like

postgresql) has the

explain plan like this and it takes 34 sec.

select /*+ full(foo1) */ count(*) from foo1

OPERATION OBJECT ACCESS_PREDICATES
FILTER_PREDICATES
----------------------- ------------------ --------------------
--------------------
SELECT STATEMENT () (null) (null)

(null)

SORT (AGGREGATE) (null) (null)

(null)

TABLE ACCESS (FULL) foo (null)

(null)

In short the query "Select count(*) from foo" takes the following

time:

Postgresql - 1m 10 sec
Oracle(index scan) - 5 sec
Oracle (full table scan) - 34 sec

How can I speed up this query in postgresql ? The other postgres

settings

are

postgresql

max_connections = 100
shared_buffers = 50000
temp_buffers = 5000
work_mem = 16384
maintenance_work_mem = 262144
fsync = on
wal_sync_method = fsync
effective_cache_size = 300000
random_page_cost = 4
cpu_tuple_cost = 0.01
cpu_index_tuple_cost = 0.001
cpu_operator_cost = 0.0025

Are there any tuning that need to be done in the OS or database

side? I had

attached the iostat and vmstat results of postgresql

Thanks

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

---------------------------(end of
broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that

your

message can get through to the mailing list cleanly

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

Import Notes

Resolved by subject fallback

Richard Huxton

dev@archonet.com

over 18 years ago

In reply to: Reg Me Please (#6)

Re: Postgresql simple query performance question

Reg Me Please wrote:

While I would not spend resources in fine tuning the count(*), I would
spend some to underastand why and how the other ones do it better.

Just to be better.

The problem is well understood, and there is extensive discussion in the
mailing lists archives. The basic problem is that with PG's
implementation of MVCC the indexes don't have row visibility
information. The simple solution of adding it to every index entry would
increase index size substantially imposing costs on every index access
and update.

There's a thread in -hackers called "Visibility map thoughts" that is
looking at the situation again and if/how to implement visibility
information in a compact form.

--
Richard Huxton
Archonet Ltd

Gokulakannan Somasundaram

gokul007@gmail.com

over 18 years ago

In reply to: SHARMILA JOTHIRAJAH (#7)

Re: Postgresql simple query performance question

Hi,
Oracle, eventhough is a timestamp based database, stores only one
version for each row in the table and the rest of the versions of the
same-row(which might have got created due to updates) in a separate
place called undo log. In postgres, all the versions are stored in the
table. So the table would be bigger than it is in Oracle. Try doing a
Vacuum Full and fire the query. You might save some seconds.
I think we will definitely address this problem in 8.4.

Thanks,
Gokul.

Show quoted text

On Nov 6, 2007 8:07 PM, SHARMILA JOTHIRAJAH <sharmi_jo@yahoo.com> wrote:

I understand that. But why is that when oracle is given a hint to do full
table scan instead of using index to get the count, it is still faster than
postgres when both has the same explain plan? Oracle takes 34 sec and
postgres takes 1 m10 sec . Is there anything that can be done in postgresql
for speeding this up?

Oracle --select /*+ full(foo1) */ count(*) from foo1

OPERATION OBJECT ACCESS_PREDICATES
FILTER_PREDICATES
----------------------- ------------------ --------------------

SELECT STATEMENT () (null) (null) (null)

SORT (AGGREGATE) (null) (null) (null)
TABLE ACCESS (FULL) foo (null) (null)

postgresql --Select count(*) from foo

This table has 29384048 rows and is indexed on foo_id

The tables are vacuumed and the explain plan for postgresql is

QUERY PLAN

------------------------------------------
Aggregate (cost=1194020.60..1194020.61 rows=1 width=0) (actual
time=68797.280..68797.280 rows=1 loops=1)

-> Seq Scan on foo (cost=0.00..1120560.48 rows=29384048 width=0)
(actual time=0.232..60657.948 rows=29384048 loops=1)
Total runtime: 68797.358 ms

Thanks again
sharmila

----- Original Message ----
From: Pavel Stehule <pavel.stehule@gmail.com>
To: SHARMILA JOTHIRAJAH <sharmi_jo@yahoo.com>
Cc: pgsql-general@postgresql.org
Sent: Tuesday, November 6, 2007 9:11:02 AM
Subject: Re: [GENERAL] Postgresql simple query performance question

Hello

PostgreSQL doesn't use index for COUN(*)

http://www.varlena.com/GeneralBits/18.php
http://sql-info.de/en/postgresql/postgres-gotchas.html#1_7

Regards
Pavel Stehule

On 06/11/2007, SHARMILA JOTHIRAJAH <sharmi_jo@yahoo.com> wrote:

Hi
We are in the process of testing for migration of our database from Oracle
to Postgresql.
I hava a simple query

Select count(*) from foo
This table has 29384048 rows and is indexed on foo_id

The tables are vacuumed and the explain plan for postgresql is

QUERY PLAN

------------------------------------------
Aggregate (cost=1194020.60..1194020.61 rows=1 width=0) (actual
time=68797.280..68797.280 rows=1 loops=1)

-> Seq Scan on foo (cost=0.00..1120560.48 rows=29384048 width=0)
(actual
time=0.232..60657.948 rows=29384048 loops=1)
Total runtime: 68797.358 ms

The explain plan for oracle is

OPERATION OBJECT ACCESS_PREDICATES
FILTER_PREDICATES
------------------- ------------------------ --------------------
--------------------
SELECT STATEMENT () (null) (null)

(null)

SORT (AGGREGATE) (null) (null)

(null)

INDEX (FULL SCAN) foo_IDX_ID (null) (null)

Oracle uses index for count(*) query in this case
This query in Oracle takes only 5 sec and in postgresql it takes 1 min
10sec

The same query in oracle without the index and full table scan(like in
postgresql) has the

explain plan like this and it takes 34 sec.

select /*+ full(foo1) */ count(*) from foo1

OPERATION OBJECT ACCESS_PREDICATES
FILTER_PREDICATES
----------------------- ------------------ --------------------
--------------------
SELECT STATEMENT () (null) (null) (null)

SORT (AGGREGATE) (null) (null) (null)
TABLE ACCESS (FULL) foo (null) (null)

In short the query "Select count(*) from foo" takes the following time:
Postgresql - 1m 10 sec
Oracle(index scan) - 5 sec
Oracle (full table scan) - 34 sec

How can I speed up this query in postgresql ? The other postgres settings
are

postgresql

max_connections = 100
shared_buffers = 50000
temp_buffers = 5000
work_mem = 16384
maintenance_work_mem = 262144
fsync = on
wal_sync_method = fsync
effective_cache_size = 300000
random_page_cost = 4
cpu_tuple_cost = 0.01
cpu_index_tuple_cost = 0.001
cpu_operator_cost = 0.0025

Are there any tuning that need to be done in the OS or database side? I

had

attached the iostat and vmstat results of postgresql

Thanks

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

---------------------------(end of
broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

#10

Bill Moran

wmoran@potentialtech.com

over 18 years ago

In reply to: SHARMILA JOTHIRAJAH (#1)

Re: Postgresql simple query performance question

In response to André Volpato <andre.volpato@ecomtecnologia.com.br>:

Richard Huxton escreveu:

Reg Me Please wrote:

While I would not spend resources in fine tuning the count(*), I would
spend some to underastand why and how the other ones do it better.

Just to be better.

The problem is well understood, and there is extensive discussion in
the mailing lists archives. The basic problem is that with PG's
implementation of MVCC the indexes don't have row visibility
information. The simple solution of adding it to every index entry
would increase index size substantially imposing costs on every index
access and update.

There's a thread in -hackers called "Visibility map thoughts" that is
looking at the situation again and if/how to implement visibility
information in a compact form.

Remember that you can always use serial fields to count a table, like:

alter table foo add id serial;
select id from foo order by id desc limit 1;

This should return the same value than count(*), in a few msecs.

I don't think so. What kind of accuracy do you have when rows are
deleted? Also, sequences are not transactional, so rolled-back
transactions will increment the sequence without actually adding
rows.

--
Bill Moran
http://www.potentialtech.com

Import Notes

Reply to msg id not found: 4730A6C5.3020005@ecomtecnologia.com.br

#11

André Volpato

andre.volpato@ecomtecnologia.com.br

over 18 years ago

In reply to: Richard Huxton (#8)

Re: Postgresql simple query performance question

Richard Huxton escreveu:

Reg Me Please wrote:

While I would not spend resources in fine tuning the count(*), I would
spend some to underastand why and how the other ones do it better.

Just to be better.

The problem is well understood, and there is extensive discussion in
the mailing lists archives. The basic problem is that with PG's
implementation of MVCC the indexes don't have row visibility
information. The simple solution of adding it to every index entry
would increase index size substantially imposing costs on every index
access and update.

There's a thread in -hackers called "Visibility map thoughts" that is
looking at the situation again and if/how to implement visibility
information in a compact form.

Remember that you can always use serial fields to count a table, like:

alter table foo add id serial;
select id from foo order by id desc limit 1;

This should return the same value than count(*), in a few msecs.

--
ACV

#12

Bruce Momjian

bruce@momjian.us

over 18 years ago

In reply to: SHARMILA JOTHIRAJAH (#7)

Re: Postgresql simple query performance question

"SHARMILA JOTHIRAJAH" <sharmi_jo@yahoo.com> writes:

I understand that. But why is that when oracle is given a hint to do full
table scan instead of using index to get the count, it is still faster than
postgres when both has the same explain plan? Oracle takes 34 sec and
postgres takes 1 m10 sec . Is there anything that can be done in postgresql
for speeding this up?

How large are the actual respective data files?

What are the columns in these tables? Do you have many char() and NUMERIC
columns?

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's On-Demand Production Tuning

#13

Reg Me Please

regmeplease@gmail.com

over 18 years ago

In reply to: SHARMILA JOTHIRAJAH (#1)

Re: Postgresql simple query performance question

Il Wednesday 07 November 2007 13:08:46 André Volpato ha scritto:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=UTF-8" http-equiv="Content-Type">
<title></title>
</head>
<body bgcolor="#ffffff" text="#000000">
Reid Thompson escreveu:

Would it be possible to avoid the so-called "HTML email body"?

--
Reg me Please

Import Notes

Reply to msg id not found: 4731AACE.6070908@ecomtecnologia.com.br

#14

Simon Riggs

simon@2ndQuadrant.com

over 18 years ago

In reply to: Bill Moran (#4)

Re: Postgresql simple query performance question

On Tue, 2007-11-06 at 09:29 -0500, Bill Moran wrote:

In response to SHARMILA JOTHIRAJAH <sharmi_jo@yahoo.com>:

Hi
We are in the process of testing for migration of our database from Oracle to Postgresql.
I hava a simple query

Select count(*) from foo

This is asked a lot. The quick answer is that PostgreSQL method of MVCC
makes it impossible to make this query fast. Perhaps, someday, some
brilliant developer will come up with an optimization, but that hasn't
happened yet.

What release level is being tested? It may already have happened.

8.3 is substantially faster at seq scans, so the tests should be re-run
on 8.3 beta.

Also, re-run the Postgres test. It should be faster the second time,
even if the database server is restarted between tests.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com

#15

André Volpato

andre.volpato@ecomtecnologia.com.br

over 18 years ago

In reply to: SHARMILA JOTHIRAJAH (#1)

Re: Postgresql simple query performance question

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=UTF-8" http-equiv="Content-Type">
<title></title>
</head>
<body bgcolor="#ffffff" text="#000000">
Reid Thompson escreveu:
<blockquote cite="mid:1194369123.15405.4.camel@raker.ateb.com"
type="cite">
<pre wrap="">On Tue, 2007-11-06 at 14:39 -0300, André Volpato wrote:

</pre>
<blockquote type="cite">
<pre wrap="">Remember that you can always use serial fields to count a table, like:

alter table foo add id serial;
select id from foo order by id desc limit 1;

This should return the same value than count(*), in a few msecs.

--
ACV

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to <a class="moz-txt-link-abbreviated" href="mailto:majordomo@postgresql.org">majordomo@postgresql.org</a> so that your
message can get through to the mailing list cleanly
</pre>
</blockquote>
<pre wrap="">

not so...

test=# select version();

version
----------------------------------------------------------------------------------------------------------------
PostgreSQL 8.2.4 on i686-pc-linux-gnu, compiled by GCC
i686-pc-linux-gnu-gcc (GCC) 4.1.2 (Gentoo 4.1.2 p1.0.1)
(1 row)

test=# create table serialcount(aval integer);
CREATE TABLE

test=# \timing
Timing is on.

test=# insert into serialcount values ( generate_series(1,10000000));
INSERT 0 10000000
Time: 42297.468 ms
test=# select count(*) from serialcount;
count
----------
10000000
(1 row)

Time: 6158.188 ms
test=# select count(*) from serialcount;
count
----------
10000000
(1 row)

Time: 2366.596 ms
test=# select count(*) from serialcount;
count
----------
10000000
(1 row)

Time: 2090.416 ms
test=# select count(*) from serialcount;
count
----------
10000000
(1 row)

Time: 2125.377 ms
test=# select count(*) from serialcount;
count
----------
10000000
(1 row)

Time: 2122.584 ms
test=# alter table serialcount add id serial;
NOTICE: ALTER TABLE will create implicit sequence "serialcount_id_seq"
for serial column "serialcount.id"
ALTER TABLE
Time: 51733.139 ms
test=# select id from serialcount order by id desc limit 1;
id
----------
10000000
(1 row)

Time: 41088.062 ms
test=# select id from serialcount order by id desc limit 1;
id
----------
10000000
(1 row)

Time: 35638.317 ms
test=# vacuum analyze serialcount;
VACUUM
Time: 927.760 ms
test=# select id from serialcount order by id desc limit 1;
id
----------
10000000
(1 row)

Time: 34281.178 ms
</pre>
</blockquote>
 
I meant to select using an index. I´ve done the same tests here, and
realized that my server is two times slower than yours: 
 
testeprog=# select version(); 
 version 
--------------------------------------------------------------------------------------------------------- 
PostgreSQL 8.1.4 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.3.3
20040412 (Red Hat Linux 3.3.3-7) 
(1 row) 
 
testeprog=# select count(*) from test; 
count 
---------- 
10000000 
(1 row) 
 
Time: 4116.613 ms 
 
testeprog=# alter table test add id serial; 
NOTICE: ALTER TABLE will create implicit sequence "test_id_seq" for
serial column "test.id" 
ALTER TABLE 
Time: 90617.195 ms 
 
testeprog=# select id from test order by id desc limit 1; 
 id 
---------- 
10000000 
(1 row) 
 
Time: 64856.553 ms 
 
testeprog=# create unique index itest1 on test using btree (id); 
CREATE INDEX 
Time: 29026.891 ms 
 
 
testeprog=# explain analyze select id from test order by id desc limit
1; 
 QUERY
PLAN 
---------------------------------------------------------------------------------------------------------------------------------------- 
Limit (cost=0.00..0.02 rows=1 width=4) (actual time=0.017..0.019
rows=1 loops=1) 
 -> Index Scan Backward using itest1 on test
(cost=0.00..185954.00 rows=10000000 width=4) (actual time=0.014..0.014
rows=1 loops=1) 
Total runtime: 0.059 ms 
(3 rows) 
 
 
 
@Bill: 
Bill Moran wrote
<blockquote cite="mid:20071106115927.2aea950e.wmoran@potentialtech.com"
type="cite">
<pre wrap="">
I don't think so. What kind of accuracy do you have when rows are
deleted? Also, sequences are not transactional, so rolled-back
transactions will increment the sequence without actually adding
rows.
</pre>
</blockquote>
 
You are right, the serial hack should not work in most oltp cases. 
 
-- 
ACV 
 
</body>
</html>

Import Notes

Reply to msg id not found: 1194369123.15405.4.camel@raker.ateb.com

#16

SHARMILA JOTHIRAJAH

sharmi_jo@yahoo.com

over 18 years ago

In reply to: André Volpato (#15)

Re: Postgresql simple query performance question

Hi,
The table has 43 columns. I have attached the columns-list.They have many char() and numeric columns.
For the table size, these are the corresponding entries from the pg_class
foo is the table and the others are some of its indexes.

relname reltuples relpages
foo 2.9384E7 825699
foo_idx_pat 2.9384E7 684995
foo_idx_service 2.9384E7 433549
foo_idx_serv 2.9384E7 433435
foo_pk 2.9384E7 109057

Thanks
Sharmila

----- Original Message ----
From: Gregory Stark <stark@enterprisedb.com>
To: SHARMILA JOTHIRAJAH <sharmi_jo@yahoo.com>
Cc: Pavel Stehule <pavel.stehule@gmail.com>; pgsql-general@postgresql.org
Sent: Tuesday, November 6, 2007 8:03:48 PM
Subject: Re: [GENERAL] Postgresql simple query performance question

"SHARMILA JOTHIRAJAH" <sharmi_jo@yahoo.com> writes:

I understand that. But why is that when oracle is given a hint to do

full

table scan instead of using index to get the count, it is still

faster than

postgres when both has the same explain plan? Oracle takes 34 sec and
postgres takes 1 m10 sec . Is there anything that can be done in

postgresql

for speeding this up?

How large are the actual respective data files?

What are the columns in these tables? Do you have many char() and
NUMERIC
columns?

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's On-Demand Production Tuning

---------------------------(end of
broadcast)---------------------------
TIP 6: explain analyze is your friend

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

Import Notes

Resolved by subject fallback

#17

SHARMILA JOTHIRAJAH

sharmi_jo@yahoo.com

over 18 years ago

In reply to: SHARMILA JOTHIRAJAH (#16)

Re: Postgresql simple query performance question

Hi
we are testing with version PostgreSQL 8.2.3. We already have a production system in Oracle and we wanted to migrate it to postgresql. If some tests are already done, are the results available for us to see?
Ill also check postgres 8.3 beta.
Thanks again
Sharmila

----- Original Message ----
From: Simon Riggs <simon@2ndquadrant.com>
To: Bill Moran <wmoran@potentialtech.com>
Cc: SHARMILA JOTHIRAJAH <sharmi_jo@yahoo.com>; pgsql-general@postgresql.org
Sent: Wednesday, November 7, 2007 6:34:26 AM
Subject: Re: [GENERAL] Postgresql simple query performance question

On Tue, 2007-11-06 at 09:29 -0500, Bill Moran wrote:

In response to SHARMILA JOTHIRAJAH <sharmi_jo@yahoo.com>:

Hi
We are in the process of testing for migration of our database from

Oracle to Postgresql.

I hava a simple query

Select count(*) from foo

This is asked a lot. The quick answer is that PostgreSQL method of

MVCC

makes it impossible to make this query fast. Perhaps, someday, some
brilliant developer will come up with an optimization, but that

hasn't

happened yet.

What release level is being tested? It may already have happened.

8.3 is substantially faster at seq scans, so the tests should be re-run
on 8.3 beta.

Also, re-run the Postgres test. It should be faster the second time,
even if the database server is restarted between tests.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com

---------------------------(end of
broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

Import Notes

Resolved by subject fallback

#18

Reg Me Please

regmeplease@gmail.com

over 18 years ago

In reply to: SHARMILA JOTHIRAJAH (#17)

Re: Postgresql simple query performance question

Il Wednesday 07 November 2007 13:47:26 SHARMILA JOTHIRAJAH ha scritto:

Hi
we are testing with version PostgreSQL 8.2.3.

Why not using at least the current 8.2.5?

Read here
http://www.postgresql.org/docs/current/static/release.html

for details.

--
Reg me Please

Postgresql simple query performance question

Attachments:

Attachments: