Very slow queries on 8.1

Started by David Rysdam · over 20 years ago · 12 messages · general
#1 David Rysdam
drysdam@ll.mit.edu

I'm porting an application from Sybase and I've noticed that similar
application functions take 2 to 3 times longer on Postgres than they
used to on the same machine running under Sybase. I've tried changing
various "performance tuning" parameters, such as shared_buffers and
effective_cache_size, but there's little or no effect. I'm beginning
to think there's a deeper root cause to the slowness.

Right now, I'm working on a test case that involves a table with ~360k
rows called "nb.sigs". My sample query is:

select * from nb.sigs where signum > 250000

With no index, explain says this query costs 11341. After CREATE INDEX
on the signum field, along with an ANALYZE for nb.sigs, the query costs
3456 and takes around 4 seconds to return the first row. This seems
extremely slow to me, but I can't figure out what I might be doing
wrong. Any ideas?

(If necessary, I can write an entire script that creates and populates a
table and then give my performance on that sample for someone else to
check against.)

#2 codeWarrior
gpatnude@hotmail.com
In reply to: David Rysdam (#1)
Re: Very slow queries on 8.1

What is the data type for "signum" ???

"David Rysdam" <drysdam@ll.mit.edu> wrote in message
news:437CA40A.8020507@ll.mit.edu...

Show quoted text

I'm porting an application from Sybase and I've noticed that similar
application functions take 2 to 3 times longer on postgres than they used
to on the same machine running under Sybase. I've tried changing various
"performance tuning" parameters, such as shared_buffers,
effective_cache_size, etc but there's little or no effect. I'm beginning
to think there's a deeper root cause to the slowness.

Right now, I'm working on a test case that involves a table with ~360k
rows called "nb.sigs". My sample query is:

select * from nb.sigs where signum > 250000

With no index, explain says this query costs 11341. After CREATE INDEX on
the signum field, along with an ANALYZE for nb.sigs, the query costs 3456
and takes around 4 seconds to return the first row. This seems extremely
slow to me, but I can't figure out what I might be doing wrong. Any
ideas?

(If necessary, I can write an entire script that creates and populates a
table and then give my performance on that sample for someone else to
check against.)

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Rysdam (#1)
Re: Very slow queries on 8.1

David Rysdam <drysdam@ll.mit.edu> writes:

Right now, I'm working on a test case that involves a table with ~360k
rows called "nb.sigs". My sample query is:
select * from nb.sigs where signum > 250000
With no index, explain says this query costs 11341. After CREATE INDEX
on the signum field, along with an ANALYZE for nb.sigs, the query costs
3456 and takes around 4 seconds to return the first row. This seems
extremely slow to me, but I can't figure out what I might be doing
wrong. Any ideas?

How many rows does that actually return, and what client interface are
you fetching it with? libpq, at least, likes to fetch the entire query
result before it gives it to you --- so you're talking about 4 sec to
get all the rows, not only the first one. That might be reasonable if
you're fetching 100k rows via an indexscan...

regards, tom lane

#4 David Rysdam
drysdam@ll.mit.edu
In reply to: codeWarrior (#2)
Re: Very slow queries on 8.1

int4, not null and the index is unique. I even tried clustering on it
to no avail.

codeWarrior wrote:

What is the data type for "signum" ???

#5 David Rysdam
drysdam@ll.mit.edu
In reply to: Tom Lane (#3)
Re: Very slow queries on 8.1

Tom Lane wrote:

David Rysdam <drysdam@ll.mit.edu> writes:

Right now, I'm working on a test case that involves a table with ~360k
rows called "nb.sigs". My sample query is:
select * from nb.sigs where signum > 250000
With no index, explain says this query costs 11341. After CREATE INDEX
on the signum field, along with an ANALYZE for nb.sigs, the query costs
3456 and takes around 4 seconds to return the first row. This seems
extremely slow to me, but I can't figure out what I might be doing
wrong. Any ideas?

How many rows does that actually return, and what client interface are
you fetching it with? libpq, at least, likes to fetch the entire query
result before it gives it to you --- so you're talking about 4 sec to
get all the rows, not only the first one. That might be reasonable if
you're fetching 100k rows via an indexscan...

regards, tom lane

Right, it's about 100k rows and it is through libpq (pgadmin in this
case, but my app uses libpq from pgtcl). Is there a way to tell libpq
to not do what it "likes" and do what I need instead? I didn't see
anything in the docs, but I didn't look very hard.

#6 Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Rysdam (#5)
Re: Very slow queries on 8.1

David Rysdam <drysdam@ll.mit.edu> writes:

Right, it's about 100k rows and it is through libpq (pgadmin in this
case, but my app uses libpq from pgtcl). Is there a way to tell libpq
to not do what it "likes" and do what I need instead?

The only way ATM is to declare a cursor on the query and then FETCH
whatever amount seems reasonable at a time.
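
For example (a minimal sketch; the cursor name and batch size here are
made up, and the query is the one from the original post):

```sql
BEGIN;  -- a cursor without WITH HOLD must live inside a transaction
DECLARE sigs_cur CURSOR FOR
    SELECT * FROM nb.sigs WHERE signum > 250000;
FETCH FORWARD 1000 FROM sigs_cur;  -- first batch; repeat until it returns fewer rows
CLOSE sigs_cur;
COMMIT;
```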

There is support in the on-the-wire protocol for partial fetches from
ordinary queries (ie, without the notational overhead of creating a
cursor). libpq doesn't expose that at the moment. There's a thread
right now in pgsql-interfaces about adding such a feature to libpq ...

regards, tom lane

#7 Martijn van Oosterhout
kleptog@svana.org
In reply to: David Rysdam (#5)
Re: Very slow queries on 8.1

On Thu, Nov 17, 2005 at 11:31:27AM -0500, David Rysdam wrote:

Right, it's about 100k rows and it is through libpq (pgadmin in this
case, but my app uses libpq from pgtcl). Is there a way to tell libpq
to not do what it "likes" and do what I need instead? I didn't see
anything in the docs, but I didn't look very hard.

Use the async interface. There you submit the query and retrieve rows
as they come in. It's a bit trickier to program but it can be done.

psql doesn't do this, though; it's not clear how it could anyway, given
the way it formats its output.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/


#8 Bruno Wolff III
bruno@wolff.to
In reply to: David Rysdam (#5)
Re: Very slow queries on 8.1

On Thu, Nov 17, 2005 at 11:31:27 -0500,
David Rysdam <drysdam@ll.mit.edu> wrote:

Right, it's about 100k rows and it is through libpq (pgadmin in this
case, but my app uses libpq from pgtcl). Is there a way to tell libpq
to not do what it "likes" and do what I need instead? I didn't see
anything in the docs, but I didn't look very hard.

You could use a cursor. That will bias the plan toward fast-start plans,
which might give you lower throughput if you normally plan to fetch all
of the rows, but will give you quicker access to the first row.

#9 David Rysdam
drysdam@ll.mit.edu
In reply to: Martijn van Oosterhout (#7)
Re: Very slow queries on 8.1

Martijn van Oosterhout wrote:

On Thu, Nov 17, 2005 at 11:31:27AM -0500, David Rysdam wrote:

Right, it's about 100k rows and it is through libpq (pgadmin in this
case, but my app uses libpq from pgtcl). Is there a way to tell libpq
to not do what it "likes" and do what I need instead? I didn't see
anything in the docs, but I didn't look very hard.

Use the async interface. There you submit the query and retrieve rows
as they come in. It's a bit trickier to program but it can be done.

psql doesn't do this though, it's not clear how it could anyway, given
the way it formats.

Have a nice day,

I'm experimenting with the async interface right now. Hopefully it will
fit in well.

It's OK if psql/pgadmin don't do it, as long as the app does.

#10 David Rysdam
drysdam@ll.mit.edu
In reply to: Bruno Wolff III (#8)
Re: Very slow queries on 8.1

Bruno Wolff III wrote:

On Thu, Nov 17, 2005 at 11:31:27 -0500,
David Rysdam <drysdam@ll.mit.edu> wrote:

Right, it's about 100k rows and it is through libpq (pgadmin in this
case, but my app uses libpq from pgtcl). Is there a way to tell libpq
to not do what it "likes" and do what I need instead? I didn't see
anything in the docs, but I didn't look very hard.

You could use a cursor. That will bias the plan toward fast start plans which
might give you lower throughput if you are normally planning to fetch all of
the rows, but will give you quicker access to the first row.

That is exactly what I want, but is it possible to use a cursor from pgtcl?

#11 Michael Fuhr
mike@fuhr.org
In reply to: David Rysdam (#1)
Re: Very slow queries on 8.1

On Thu, Nov 17, 2005 at 10:38:50AM -0500, David Rysdam wrote:

I'm porting an application from Sybase and I've noticed that similar
application functions take 2 to 3 times longer on postgres than they
used to on the same machine running under Sybase. I've tried changing
various "performance tuning" parameters, such as shared_buffers,
effective_cache_size, etc but there's little or no effect.

What's your OS and hardware environment and what are your non-default
PostgreSQL settings?

Right now, I'm working on a test case that involves a table with ~360k
rows called "nb.sigs". My sample query is:

select * from nb.sigs where signum > 250000

With no index, explain says this query costs 11341. After CREATE INDEX
on the signum field, along with an ANALYZE for nb.sigs, the query costs
3456 and takes around 4 seconds to return the first row.

Please post the EXPLAIN ANALYZE output for the query -- that'll
show us the query plan, the accuracy of the planner's row count
estimate, and how long the query takes to execute on the server.
It might also be useful to see the table definition and the output
of the following query:

SELECT null_frac, n_distinct, correlation
FROM pg_stats
WHERE schemaname = 'nb' AND tablename = 'sigs' AND attname = 'signum';

What client interface are you using? If the query returns a lot
of rows then you might benefit from using a cursor to fetch the
result set in chunks; otherwise the client library is probably
waiting for the entire result set to arrive before returning any
rows to you. If the result set is large then you can experience
performance problems due to a shortage of real memory.

How volatile is the data and how common are queries based on signum?
You might benefit from clustering on the signum index.

(If necessary, I can write an entire script that creates and populates a
table and then give my performance on that sample for someone else to
check against.)

If it's a short script that populates the table with canned data
then go ahead and post it.

--
Michael Fuhr

#12 David Rysdam
drysdam@ll.mit.edu
In reply to: Michael Fuhr (#11)
Re: Very slow queries on 8.1

Michael Fuhr wrote:

On Thu, Nov 17, 2005 at 10:38:50AM -0500, David Rysdam wrote:

I'm porting an application from Sybase and I've noticed that similar
application functions take 2 to 3 times longer on postgres than they
used to on the same machine running under Sybase. I've tried changing
various "performance tuning" parameters, such as shared_buffers,
effective_cache_size, etc but there's little or no effect.

What's your OS and hardware environment and what are your non-default
PostgreSQL settings?

Fedora Core 2, dual 2.8 GHz, 2 GB ram.

shared_buffers = 10000
effective_cache_size = 100000

Right now, I'm working on a test case that involves a table with ~360k
rows called "nb.sigs". My sample query is:

select * from nb.sigs where signum > 250000

With no index, explain says this query costs 11341. After CREATE INDEX
on the signum field, along with an ANALYZE for nb.sigs, the query costs
3456 and takes around 4 seconds to return the first row.

Please post the EXPLAIN ANALYZE output for the query -- that'll
show us the query plan, the accuracy of the planner's row count
estimate, and how long the query takes to execute on the server.
It might also be useful to see the table definition and the output
of the following query:

SELECT null_frac, n_distinct, correlation
FROM pg_stats
WHERE schemaname = 'nb' AND tablename = 'sigs' AND attname = 'signum';

The first things would be problematic to supply, since they are actually
on a computer that doesn't have access to the Internet or to the machine
I'm writing this on. As for the query:

 null_frac | n_distinct | correlation
-----------+------------+-------------
         0 |         -1 |           1

What client interface are you using? If the query returns a lot
of rows then you might benefit from using a cursor to fetch the
result set in chunks; otherwise the client library is probably
waiting for the entire result set to arrive before returning any
rows to you.

That does seem to be the problem. I've never worked with cursors, so
I'll have to see if I can fit our DB module into that mold.