Slow Postgresql server

Started by Jason Lustig · almost 19 years ago · 57 messages
#1 Jason Lustig
lustig@brandeis.edu

Hello all,

My website has been having issues with our new Linux/PostgreSQL
server being somewhat slow. I have done tests using Apache Benchmark
and for pages that do not connect to Postgres, the speeds are much
faster (334 requests/second v. 1-2 requests/second), so it seems that
Postgres is what's causing the problem and not Apache. I did some
research, and it seems that the bottleneck is in fact the hard
drives! Here's an excerpt from vmstat:

procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa st
 1  1    140  24780 166636 575144    0    0     0  3900 1462  3299  1  4 49 48  0
 0  1    140  24780 166636 575144    0    0     0  3828 1455  3391  0  4 48 48  0
 1  1    140  24780 166636 575144    0    0     0  2440  960  2033  0  3 48 48  0
 0  1    140  24780 166636 575144    0    0     0  2552 1001  2131  0  2 50 49  0
 0  1    140  24780 166636 575144    0    0     0  3188 1233  2755  0  3 49 48  0
 0  1    140  24780 166636 575144    0    0     0  2048  868  1812  0  2 49 49  0
 0  1    140  24780 166636 575144    0    0     0  2720 1094  2386  0  3 49 49  0

As you can see, almost 50% of the CPU is waiting on I/O. This doesn't
seem like it should be happening, however, since we are using a RAID
1 setup (160+160). We have 1GB ram, and have upped shared_buffers to
13000 and work_mem to 8096. What would cause the computer to only use
such a small percentage of the CPU, with more than half of it waiting
on I/O requests?

Thanks a lot
Jason

#2 Dennis Bjorklund
db@zigo.dhs.org
In reply to: Jason Lustig (#1)
Re: Slow Postgresql server

Jason Lustig wrote:

and work_mem to 8096. What would cause the computer to only use such a
small percentage of the CPU, with more than half of it waiting on I/O
requests?

Do your webpages write things to the database on each connect?

Maybe it does a bunch of writes, each individually committed? For every
commit pg will wait for the data to be written down to the disk platter
before it moves on. So if you do several writes you want to do them in
one transaction so you only need one commit.
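
[As a concrete illustration of that advice (the table and column names below are made up, not from the original post): group the per-request writes into one transaction so there is only a single commit, and therefore only a single wait for the disk.]

BEGIN;
INSERT INTO page_hits (url, hit_time) VALUES ('/index', now());
INSERT INTO page_hits (url, hit_time) VALUES ('/about', now());
UPDATE counters SET hits = hits + 1 WHERE name = 'total';
COMMIT;   -- one wait for the WAL flush instead of three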

/Dennis

#3 Jeff Frost
jeff@frostconsultingllc.com
In reply to: Jason Lustig (#1)
Re: Slow Postgresql server

On Wed, 11 Apr 2007, Jason Lustig wrote:

Hello all,

My website has been having issues with our new Linux/PostgreSQL server being
somewhat slow. I have done tests using Apache Benchmark and for pages that do
not connect to Postgres, the speeds are much faster (334 requests/second v.
1-2 requests/second), so it seems that Postgres is what's causing the problem
and not Apache. I did some research, and it seems that the bottleneck is in
fact the hard drives! Here's an excerpt from vmstat:

procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa st
 1  1    140  24780 166636 575144    0    0     0  3900 1462  3299  1  4 49 48  0
 0  1    140  24780 166636 575144    0    0     0  3828 1455  3391  0  4 48 48  0
 1  1    140  24780 166636 575144    0    0     0  2440  960  2033  0  3 48 48  0
 0  1    140  24780 166636 575144    0    0     0  2552 1001  2131  0  2 50 49  0
 0  1    140  24780 166636 575144    0    0     0  3188 1233  2755  0  3 49 48  0
 0  1    140  24780 166636 575144    0    0     0  2048  868  1812  0  2 49 49  0
 0  1    140  24780 166636 575144    0    0     0  2720 1094  2386  0  3 49 49  0

As you can see, almost 50% of the CPU is waiting on I/O. This doesn't seem
like it should be happening, however, since we are using a RAID 1 setup
(160+160). We have 1GB ram, and have upped shared_buffers to 13000 and
work_mem to 8096. What would cause the computer to only use such a small
percentage of the CPU, with more than half of it waiting on I/O requests?

Well, the simple answer is a slow disk subsystem. Is it hardware or software
RAID1? If hardware, what's the RAID controller? Based on your vmstat output,
I'd guess that this query activity is all writes since I see only blocks out.
Can you identify what the slow queries are? What version of postgres? How
large is the database? Can you post the non-default values in your
postgresql.conf?

I'd suggest you test your disk subsystem to see if it's as performant as you
think with bonnie++. Here's some output from my RAID1 test server:

Version 1.03 ------Sequential Output------ --Sequential Input- --Random-
-Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP
pgtest 4G 47090 92 52348 11 30954 6 41838 65 73396 8 255.9 1
------Sequential Create------ --------Random Create--------
-Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP
16 894 2 +++++ +++ 854 1 817 2 +++++ +++ 969 2

So, that's 52MB/sec block writes and 73MB/sec block reads. That's typical of
a RAID1 on 2 semi-fast SATA drives.

If you're doing writes to the DB on every web page, you might consider playing
with the commit_delay and commit_siblings parameters in the postgresql.conf.
Also, if you're doing multiple inserts as separate transactions, you should
consider batching them up in one transaction.
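
[For reference, a rough sketch of those two settings in postgresql.conf; the values below are only illustrative and need testing against the actual workload.]

commit_delay = 100        # microseconds to wait so concurrent commits can share one WAL flush
commit_siblings = 5       # only apply the delay if at least this many transactions are active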

--
Jeff Frost, Owner <jeff@frostconsultingllc.com>
Frost Consulting, LLC http://www.frostconsultingllc.com/
Phone: 650-780-7908 FAX: 650-649-1954

#4 Ron
rjpeace@earthlink.net
In reply to: Jason Lustig (#1)
Re: Slow Postgresql server

1= RAID 1 improves data =integrity=, not IO performance.
Your HD IO performance is essentially that of 1 160GB HD of whatever
performance one of those HDs have.
(what kind of HDs are they anyway? For instance 7200rpm 160GB HDs
are not particularly "high performance")
BEST case is streaming IO involving no seeks => ~50 MBps.
You can't get even that as the back end of a website.

2= 1GB of RAM is -small- for a DB server.

You need to buy RAM and HD.

Boost the RAM to 4GB, change pg config parameters appropriately and
see how much it helps.
Non-ECC RAM is currently running ~$60-$75 per GB for 1 or 2 GB sticks.
ECC RAM prices will be ~1.5x - 2x that, $120 - $150 per GB for 1 or
2 GB sticks.
(Do !not! buy 4GB sticks unless you have a large budget. Their price
per GB is still too high.)

If adding RAM helps as much as I suspect it will, find out how big
the "hot" section of your DB is and see if you can buy enough RAM to
make it RAM resident.
If you can do this, it will result in the lowest long-term DB maintenance.
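
[One rough way to measure that "hot" size, using the built-in size functions available since 8.1; a sketch, run from psql:]

SELECT pg_size_pretty(pg_database_size(current_database()));

SELECT relname, pg_size_pretty(pg_total_relation_size(oid)) AS total_size
FROM pg_class
WHERE relkind = 'r'
ORDER BY pg_total_relation_size(oid) DESC
LIMIT 10;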

If you can't do that for whatever reason, the next step is to improve
your HD subsystem.
Cheap RAID cards with enough BB cache to allow writes to be coalesced
into larger streams (reducing seeks) will help, but fundamentally
you will need more HDs.

RAID 5 is a reasonable option for most website DB workloads.
To hit the 300MBps speeds attainable by the cheap RAID cards, you are
going to need at least 7 HDs (6 HDs * 50MBps ASTR = 300MBps ASTR, plus the
equivalent of 1 HD used for the "R" in RAID). A minimum of 8
HDs is needed for this performance if you want to use RAID 6.
Most tower (not mini-tower) cases can hold this many HDs internally.
Price per MBps of HD is all over the map. The simplest (but not
necessarily best) option is to buy more of the 160GB HDs you already have.
Optimizing the money spent when buying HDs for a RAID set is a bit
more complicated than doing so for RAM. Lots of context-dependent
things affect the final decision.

I see you are mailing from Brandeis. I'm local. Drop me some
private email at the address I'm posting from if you want and I'll
send you further contact info so we can talk in more detail.

Cheers,
Ron Peacetree

At 06:02 PM 4/11/2007, Jason Lustig wrote:

Hello all,

My website has been having issues with our new Linux/PostgreSQL
server being somewhat slow. I have done tests using Apache Benchmark
and for pages that do not connect to Postgres, the speeds are much
faster (334 requests/second v. 1-2 requests/second), so it seems that
Postgres is what's causing the problem and not Apache. I did some
research, and it seems that the bottleneck is in fact the hard
drives! Here's an excerpt from vmstat:

procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa st
 1  1    140  24780 166636 575144    0    0     0  3900 1462  3299  1  4 49 48  0
 0  1    140  24780 166636 575144    0    0     0  3828 1455  3391  0  4 48 48  0
 1  1    140  24780 166636 575144    0    0     0  2440  960  2033  0  3 48 48  0
 0  1    140  24780 166636 575144    0    0     0  2552 1001  2131  0  2 50 49  0
 0  1    140  24780 166636 575144    0    0     0  3188 1233  2755  0  3 49 48  0
 0  1    140  24780 166636 575144    0    0     0  2048  868  1812  0  2 49 49  0
 0  1    140  24780 166636 575144    0    0     0  2720 1094  2386  0  3 49 49  0

As you can see, almost 50% of the CPU is waiting on I/O. This doesn't
seem like it should be happening, however, since we are using a RAID
1 setup (160+160). We have 1GB ram, and have upped shared_buffers to
13000 and work_mem to 8096. What would cause the computer to only use
such a small percentage of the CPU, with more than half of it waiting
on I/O requests?

Thanks a lot
Jason

#5 Guido Neitzer
lists@event-s.net
In reply to: Ron (#4)
Re: Slow Postgresql server

On 12.04.2007, at 07:26, Ron wrote:

You need to buy RAM and HD.

Before he does that, wouldn't it be more useful to find out WHY he
has so much IO?

Have I missed that, or has nobody suggested finding the slow queries?
(When they cause that much IO, they might be slow even with a
high shared memory setting.)

So, my first idea is to turn on query logging for queries longer
than xy milliseconds, "explain analyse" these queries and see
whether there are a lot of seq scans involved, which would explain the
high IO.
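
[For example, something along these lines in postgresql.conf, then run "explain analyse" on whatever shows up in the log; the 250 ms threshold is only an illustrative value.]

log_min_duration_statement = 250   # in milliseconds; -1 disables, 0 logs every statement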

Just an idea, perhaps I missed that step in that discussion
somewhere ...

But yes, it might also be that the server is swapping; that's
another thing to find out.

cug

#6 Ron
rjpeace@earthlink.net
In reply to: Guido Neitzer (#5)
Re: Slow Postgresql server

At 10:08 AM 4/12/2007, Guido Neitzer wrote:

On 12.04.2007, at 07:26, Ron wrote:

You need to buy RAM and HD.

Before he does that, wouldn't it be more useful to find out WHY he
has so much IO?

1= Unless I missed something, the OP described pg being used as a
backend DB for a webserver.

I know the typical IO demands of that scenario better than I sometimes want to.
:-(

2= 1GB of RAM + effectively 1 160GB HD = p*ss poor DB IO support.
~ 1/2 that RAM is going to be used for OS stuff, leaving only ~512MB
of RAM to be used supporting pg.
That RAID 1 set is effectively 1 HD head that all IO requests are
going to contend for.
Even if the HD in question is a 15Krpm screamer, that level of HW
contention has very adverse implications.

Completely agree that at some point the queries need to be examined
(ditto the table schema, etc), but this system is starting off in a
Bad Place for its stated purpose IME.
Some minimum stuff is obvious even w/o spending time looking at
anything beyond the HW config.

Cheers,
Ron Peacetree

Have I missed that, or has nobody suggested finding the slow queries?
(When they cause that much IO, they might be slow even with a
high shared memory setting.)

So, my first idea is to turn on query logging for queries longer
than xy milliseconds, "explain analyse" these queries and see
whether there are a lot of seq scans involved, which would explain the
high IO.

Just an idea, perhaps I missed that step in that discussion
somewhere ...

But yes, it might also be that the server is swapping; that's
another thing to find out.

cug

#7 Guido Neitzer
lists@event-s.net
In reply to: Ron (#6)
Re: Slow Postgresql server

On 12.04.2007, at 08:59, Ron wrote:

1= Unless I missed something, the OP described pg being used as a
backend DB for a webserver.

Yep.

I know the typical IO demands of that scenario better than I
sometimes want to.
:-(

Yep. Same here. ;-)

2= 1GB of RAM + effectively 1 160GB HD = p*ss poor DB IO support.

Absolutely right. It depends a little bit on the DB and website layout
and on the actual requirements, but yes - it's not really a kick-ass
machine ...

Completely agree that at some point the queries need to be examined
(ditto the table schema, etc), but this system is starting off in a
Bad Place for its stated purpose IME.
Some minimum stuff is obvious even w/o spending time looking at
anything beyond the HW config.

Depends. As I said - if the whole DB fits into the remaining space,
and a lot of website backend DBs do, it might just work out. But this
seems not to be the case - either the site is chewing on seq scans
all the time which will cause I/O or it is bound by the lack of
memory and swaps the whole time ... He has to find out.

cug

#8 Jeff Frost
jeff@frostconsultingllc.com
In reply to: Jason Lustig (#1)
Re: Slow Postgresql server

On Thu, 12 Apr 2007, Jason Lustig wrote:

 0 <-- BM starts here
10  0    180 700436  16420  91740    0    0     0   176  278  2923 59 41  0  0  0
11  0    180 696736  16420  91740    0    0     0     0  254  2904 57 43  0  0  0
12  0    180 691272  16420  91740    0    0     0     0  255  3043 60 39  1  0  0
 9  0    180 690396  16420  91740    0    0     0     0  254  3078 63 36  2  0  0

Obviously, I've turned off logging now but I'd like to get it running again
(without bogging down the server) so that I can profile the system and find
out which queries I need to optimize. My logging settings (with unnecessary
comments taken out) were:

So what did you get in the logs when you had logging turned on? If you have
the statement logging, perhaps it's worth running through pgfouine to generate
a report.

log_destination = 'syslog' # Valid values are combinations of
redirect_stderr = off # Enable capturing of stderr into log
log_min_duration_statement = 0 # -1 is disabled, 0 logs all statements
silent_mode = on # DO NOT USE without syslog or
log_duration = off
log_line_prefix = 'user=%u,db=%d' # Special values:
log_statement = 'none' # none, ddl, mod, all

Perhaps you just want to log slow queries > 100ms? But since you don't seem
to know what queries you're running on each web page, I'd suggest you just
turn on the following and run your benchmark against it, then turn it back
off:

log_duration = on
log_statement = 'all'

Then go grab pgfouine and run the report against the logs to see what queries
are chewing up all your time.

So you know, we're using Postgres 8.2.3. The database currently is pretty
small (we're just running a testing database right now with a few megabytes
of data). No doubt some of our queries are slow, but I was concerned because
no matter how slow the queries were (at most the worst were taking a couple
of msecs anyway), I was getting ridiculously slow responses from the server.
Outside of logging, our only other non-default postgresql.conf items are:

shared_buffers = 13000 # min 128kB or max_connections*16kB
work_mem = 8096 # min 64kB

In terms of the server itself, I think that it uses software raid. How can I
tell? Our hosting company set it up with the server so I guess I could ask
them, but is there a program I can run which will tell me the information? I
also ran bonnie++ and got this output:

Version 1.03       ------Sequential Output------ --Sequential Input- --Random-
                   -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine       Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
pgtest       2000M 29277  67 33819  15 15446   4 35144  62 48887   5 152.7   0
                   ------Sequential Create------ --------Random Create--------
                   -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
             files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                16 17886  77 +++++ +++ +++++ +++ 23258  99 +++++ +++ +++++ +++

So I'm getting 33MB and 48MB write/read respectively. Is this slow? Is there
anything I should be doing to optimize our RAID configuration?

It's not fast, but at least it's about the same speed as an average IDE drive
from this era. More disks would help, but since you indicate the DB fits in
RAM with plenty of room to spare, how about you update your
effective_cache_size to something reasonable. You can use the output of the
'free' command and take the cache number and divide by 8 to get a reasonable
value on linux. Then turn on logging and run your benchmark. After that, run
a pgfouine report against the log and post us the explain analyze from your
slow queries.
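
[A sketch of that arithmetic, using the ~575 MB "cache" figure visible in the earlier vmstat output; the real number should come from running 'free' on the box, since effective_cache_size is expressed in 8 kB pages.]

# cached was ~575144 kB; 575144 / 8 = 71893 eight-kB pages
effective_cache_size = 71893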

And if Ron is indeed local, it might be worthwhile to contact him. Someone
onsite would likely get this taken care of much faster than we can on the
mailing list.

--
Jeff Frost, Owner <jeff@frostconsultingllc.com>
Frost Consulting, LLC http://www.frostconsultingllc.com/
Phone: 650-780-7908 FAX: 650-649-1954

#9 Scott Marlowe
smarlowe@g2switchworks.com
In reply to: Guido Neitzer (#7)
Re: Slow Postgresql server

On Thu, 2007-04-12 at 10:19, Guido Neitzer wrote:

On 12.04.2007, at 08:59, Ron wrote:

Depends. As I said - if the whole DB fits into the remaining space,
and a lot of website backend DBs do, it might just work out. But this
seems not to be the case - either the site is chewing on seq scans
all the time which will cause I/O or it is bound by the lack of
memory and swaps the whole time ... He has to find out.

It could also be something as simple as a very bloated data store.

I'd ask the user what vacuum verbose says at the end.

#10 Jeff Frost
jeff@frostconsultingllc.com
In reply to: Scott Marlowe (#9)
Re: Slow Postgresql server

On Thu, 12 Apr 2007, Scott Marlowe wrote:

On Thu, 2007-04-12 at 10:19, Guido Neitzer wrote:

On 12.04.2007, at 08:59, Ron wrote:

Depends. As I said - if the whole DB fits into the remaining space,
and a lot of website backend DBs do, it might just work out. But this
seems not to be the case - either the site is chewing on seq scans
all the time which will cause I/O or it is bound by the lack of
memory and swaps the whole time ... He has to find out.

It could also be something as simple as a very bloated data store.

I'd ask the user what vacuum verbose says at the end

You know, I should answer emails at night...we didn't ask when the data
was last vacuumed or analyzed, and I believe he indicated that the only
non-default values were memory-related, so no autovacuum running.

Jason,

Before you go any further, run 'vacuum analyze;' on your DB if you're not
doing this with regularity and strongly consider enabling autovacuum.
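
[On 8.2 that boils down to roughly the following; note that autovacuum also needs the row-level stats collector turned on, which is worth double-checking in postgresql.conf.]

-- one-off cleanup, run as the table owner or a superuser:
VACUUM ANALYZE;

# postgresql.conf (8.2): autovacuum requires these
stats_start_collector = on
stats_row_level = on
autovacuum = on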

--
Jeff Frost, Owner <jeff@frostconsultingllc.com>
Frost Consulting, LLC http://www.frostconsultingllc.com/
Phone: 650-780-7908 FAX: 650-649-1954

#11 Carlos Moreno
moreno_pg@mochima.com
In reply to: Jeff Frost (#10)
Re: Slow Postgresql server

Jeff Frost wrote:

You know, I should answer emails at night...

Indeed you shouldN'T ;-)

Carlos
--

#12 Steve
cheetah@tanabi.org
In reply to: Jeff Frost (#10)
Strangely Variable Query Performance

Hey there;

On a Postgres 8.2.3 server, I've got a query that is running very slow in
some cases. With some work, I've determined the 'slow part' of the query.
:) This is a query on a table with like 10 million rows or something like
that. encounter_id is an integer and receipt is of type 'date'.

This query runs really slow [minutes] (query and explain below):

select extract(epoch from ds.receipt) from detail_summary ds where
ds.receipt >= '1998-12-30 0:0:0' and
ds.encounter_id in
(8813186,8813187,8813188,8813189,8813190,8813191,8813192,
8813193,8813194,8813195,8813196,8813197,8813198,8813199,
8813200,8813201,8813202,8813203,8813204,8813205,8813206,
8813207,8813208,8813209,8813210,8813211,8813212,8813213,
8813214,8813215,8813216,8813217,8813218,8813219,8813220,
8813221,8813222,8813223,8813224,8813225,8813226,8813227,
8813228,8813229,8813230,8813231,8813232,8813233,8813234,
8813235,8813236,8813237,8813238,8813239,8813240,8813241,
8813242,8813243,8813244,8813245,8813246,8813247,8813248,
8813249,8813250,8813251,8813252,8813253,8813254,8813255,
8813256,8813257,8813258,8813259,8813260,8813261,8813262,
8813263,8813264,8813265,8813266,8813267,8813268,8813269,
8813270,8813271,8813272,8813273,8813274,8813275,8813276,
8813277,8813278,8813279,8813280,8813281,8813282,8813283,
8813284,8815534)

Results in the 'explain' :

Seq Scan on detail_summary ds (cost=0.00..1902749.83 rows=9962 width=4)
Filter: ((receipt >= '1998-12-30'::date) AND (encounter_id = ANY
('{8813186,8813187,8813188,8813189,8813190,8813191,8813192,8813193,8813194,8813195,8813196,8813197,8813198,8813199,8813200,8813201,8813202,8813203,8813204,8813205,8813206,8813207,8813208,8813209,8813210,8813211,8813212,8813213,8813214,8813215,8813216,8813217,8813218,8813219,8813220,8813221,8813222,8813223,8813224,8813225,8813226,8813227,8813228,8813229,8813230,8813231,8813232,8813233,8813234,8813235,8813236,8813237,8813238,8813239,8813240,8813241,8813242,8813243,8813244,8813245,8813246,8813247,8813248,8813249,8813250,8813251,8813252,8813253,8813254,8813255,8813256,8813257,8813258,8813259,8813260,8813261,8813262,8813263,8813264,8813265,8813266,8813267,8813268,8813269,8813270,8813271,8813272,8813273,8813274,8813275,8813276,8813277,8813278,8813279,8813280,8813281,8813282,8813283,8813284,8815534}'::integer[])))
(2 rows)

Turning enable_seqscan to off results in a slightly more interesting
explain, but an equally slow query.

HOWEVER! The simple removal of the receipt date parameter results in a
fast query, as such:

select extract(epoch from ds.receipt) from detail_summary ds where
ds.encounter_id in
(8813186,8813187,8813188,8813189,8813190,8813191,8813192,
8813193,8813194,8813195,8813196,8813197,8813198,8813199,
8813200,8813201,8813202,8813203,8813204,8813205,8813206,
8813207,8813208,8813209,8813210,8813211,8813212,8813213,
8813214,8813215,8813216,8813217,8813218,8813219,8813220,
8813221,8813222,8813223,8813224,8813225,8813226,8813227,
8813228,8813229,8813230,8813231,8813232,8813233,8813234,
8813235,8813236,8813237,8813238,8813239,8813240,8813241,
8813242,8813243,8813244,8813245,8813246,8813247,8813248,
8813249,8813250,8813251,8813252,8813253,8813254,8813255,
8813256,8813257,8813258,8813259,8813260,8813261,8813262,
8813263,8813264,8813265,8813266,8813267,8813268,8813269,
8813270,8813271,8813272,8813273,8813274,8813275,8813276,
8813277,8813278,8813279,8813280,8813281,8813282,8813283,
8813284 ,8815534)

This query returns instantly and explains as:

Bitmap Heap Scan on detail_summary ds (cost=161.00..14819.81 rows=9963
width=4)
Recheck Cond: (encounter_id = ANY
('{8813186,8813187,8813188,8813189,8813190,8813191,8813192,8813193,8813194,8813195,8813196,8813197,8813198,8813199,8813200,8813201,8813202,8813203,8813204,8813205,8813206,8813207,8813208,8813209,8813210,8813211,8813212,8813213,8813214,8813215,8813216,8813217,8813218,8813219,8813220,8813221,8813222,8813223,8813224,8813225,8813226,8813227,8813228,8813229,8813230,8813231,8813232,8813233,8813234,8813235,8813236,8813237,8813238,8813239,8813240,8813241,8813242,8813243,8813244,8813245,8813246,8813247,8813248,8813249,8813250,8813251,8813252,8813253,8813254,8813255,8813256,8813257,8813258,8813259,8813260,8813261,8813262,8813263,8813264,8813265,8813266,8813267,8813268,8813269,8813270,8813271,8813272,8813273,8813274,8813275,8813276,8813277,8813278,8813279,8813280,8813281,8813282,8813283,8813284,8815534}'::integer[]))
-> Bitmap Index Scan on detail_summary_encounter_id_idx
(cost=0.00..160.75 rows=9963 width=0)
Index Cond: (encounter_id = ANY
('{8813186,8813187,8813188,8813189,8813190,8813191,8813192,8813193,8813194,8813195,8813196,8813197,8813198,8813199,8813200,8813201,8813202,8813203,8813204,8813205,8813206,8813207,8813208,8813209,8813210,8813211,8813212,8813213,8813214,8813215,8813216,8813217,8813218,8813219,8813220,8813221,8813222,8813223,8813224,8813225,8813226,8813227,8813228,8813229,8813230,8813231,8813232,8813233,8813234,8813235,8813236,8813237,8813238,8813239,8813240,8813241,8813242,8813243,8813244,8813245,8813246,8813247,8813248,8813249,8813250,8813251,8813252,8813253,8813254,8813255,8813256,8813257,8813258,8813259,8813260,8813261,8813262,8813263,8813264,8813265,8813266,8813267,8813268,8813269,8813270,8813271,8813272,8813273,8813274,8813275,8813276,8813277,8813278,8813279,8813280,8813281,8813282,8813283,8813284,8815534}'::integer[]))

Any thoughts? Both encounter_id and receipt date are indexed columns.
I've vacuumed and analyzed the table. I tried making a combined index of
encounter_id and receipt and it hasn't worked out any better.

Thanks!

Steve

#13 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Steve (#12)
Re: Strangely Variable Query Performance

Steve <cheetah@tanabi.org> writes:

On a Postgres 8.2.3 server, I've got a query that is running very slow in
some cases.

Could we see the exact definition of that table and its indexes?
It looks like the planner is missing the bitmap scan for some reason,
but I've not seen a case like that before.

Also, I assume the restriction on receipt date is very nonselective?
It doesn't seem to have changed the estimated rowcount much.

regards, tom lane

#14 Steve
cheetah@tanabi.org
In reply to: Tom Lane (#13)
Re: Strangely Variable Query Performance

Could we see the exact definition of that table and its indexes?
It looks like the planner is missing the bitmap scan for some reason,
but I've not seen a case like that before.

Also, I assume the restriction on receipt date is very nonselective?
It doesn't seem to have changed the estimated rowcount much.

This is true -- This particular receipt date is actually quite
meaningless. It's equivalent to saying 'all receipt dates'. I don't
think there's even any data that goes back before 2005.

Here's the table and its indexes. Before looking, a note: there are
several 'revop' indexes; these are for sorting. The customer insisted on,
frankly, meaninglessly complicated sorts. I don't think any of that
matters for our purposes here though :)

Column | Type | Modifiers
-----------------------+------------------------+--------------------------------------------------------------------
detailsummary_id | integer | not null default nextval(('detailsummary_id_seq'::text)::regclass)
detailgroup_id | integer |
receipt | date |
batchnum | integer |
encounternum | integer |
procedureseq | integer |
procedurecode | character varying(5) |
wrong_procedurecode | character varying(5) |
batch_id | integer |
encounter_id | integer |
procedure_id | integer |
carrier_id | integer |
product_line | integer |
provider_id | integer |
member_num | character varying(20) |
wrong_member_num | character varying(20) |
member_name | character varying(40) |
patient_control | character varying(20) |
rendering_prov_id | character varying(15) |
rendering_prov_name | character varying(30) |
referring_prov_id | character varying(15) |
referring_prov_name | character varying(30) |
servicedate | date |
wrong_servicedate | date |
diagnosis_codes | character varying(5)[] |
wrong_diagnosis_codes | character varying(5)[] |
ffs_charge | double precision |
export_date | date |
hedis_date | date |
raps_date | date |
diagnosis_pointers | character(1)[] |
modifiers | character(2)[] |
units | double precision |
pos | character(2) |
isduplicate | boolean |
duplicate_id | integer |
encounter_corrected | boolean |
procedure_corrected | boolean |
numerrors | integer |
encerrors_codes | integer[] |
procerror_code | integer |
error_servicedate | text |
e_duplicate_id | integer |
ecode_counts | integer[] |
p_record_status | integer |
e_record_status | integer |
e_delete_date | date |
p_delete_date | date |
b_record_status | integer |
b_confirmation | character varying(20) |
b_carrier_cobol_id | character varying(16) |
b_provider_cobol_id | character varying(20) |
b_provider_tax_id | character varying(16) |
b_carrier_name | character varying(50) |
b_provider_name | character varying(50) |
b_submitter_file_id | character varying(40) |
e_hist_carrier_id | integer |
p_hist_carrier_id | integer |
e_duplicate_id_orig | character varying(25) |
p_duplicate_id_orig | character varying(25) |
num_procerrors | integer |
num_encerrors | integer |
export_id | integer |
raps_id | integer |
hedis_id | integer |
Indexes:
"detail_summary_b_record_status_idx" btree (b_record_status)
"detail_summary_batch_id_idx" btree (batch_id)
"detail_summary_batchnum_idx" btree (batchnum)
"detail_summary_carrier_id_idx" btree (carrier_id)
"detail_summary_duplicate_id_idx" btree (duplicate_id)
"detail_summary_e_record_status_idx" btree (e_record_status)
"detail_summary_encounter_id_idx" btree (encounter_id)
"detail_summary_encounternum_idx" btree (encounternum)
"detail_summary_export_date_idx" btree (export_date)
"detail_summary_hedis_date_idx" btree (hedis_date)
"detail_summary_member_name_idx" btree (member_name)
"detail_summary_member_num_idx" btree (member_num)
"detail_summary_p_record_status_idx" btree (p_record_status)
"detail_summary_patient_control_idx" btree (patient_control)
"detail_summary_procedurecode_idx" btree (procedurecode)
"detail_summary_product_line_idx" btree (product_line)
"detail_summary_provider_id_idx" btree (provider_id)
"detail_summary_raps_date_idx" btree (raps_date)
"detail_summary_receipt_encounter_idx" btree (receipt, encounter_id)
"detail_summary_receipt_id_idx" btree (receipt)
"detail_summary_referrering_prov_id_idx" btree (referring_prov_id)
"detail_summary_rendering_prov_id_idx" btree (rendering_prov_id)
"detail_summary_rendering_prov_name_idx" btree (rendering_prov_name)
"detail_summary_servicedate_idx" btree (servicedate)
"ds_sort_1" btree (receipt date_revop, carrier_id, batchnum,
encounternum, procedurecode, encounter_id)
"ds_sort_10" btree (receipt date_revop, carrier_id, batchnum,
encounternum, procedurecode, encounter_id, procedure_id)
"ed_cbee_norev" btree (export_date, carrier_id, batchnum,
encounternum, encounter_id)
"ed_cbee_norev_p" btree (export_date, carrier_id, batchnum,
encounternum, encounter_id, procedure_id)
"ed_cbee_rev" btree (export_date date_revop, carrier_id, batchnum,
encounternum, encounter_id)
"ed_cbee_rev_p" btree (export_date date_revop, carrier_id, batchnum,
encounternum, encounter_id, procedure_id)
"mcbe" btree (member_name, carrier_id, batchnum, encounternum,
encounter_id)
"mcbe_p" btree (member_name, carrier_id, batchnum, encounternum,
encounter_id, procedure_id)
"mcbe_rev" btree (member_name text_revop, carrier_id, batchnum,
encounternum, encounter_id)
"mcbe_rev_p" btree (member_name text_revop, carrier_id, batchnum,
encounternum, encounter_id, procedure_id)
"mcbee_norev" btree (member_num, carrier_id, batchnum, encounternum,
encounter_id)
"mcbee_norev_p" btree (member_num, carrier_id, batchnum, encounternum,
encounter_id, procedure_id)
"mcbee_rev" btree (member_num text_revop, carrier_id, batchnum,
encounternum, encounter_id)
"mcbee_rev_p" btree (member_num text_revop, carrier_id, batchnum,
encounternum, encounter_id, procedure_id)
"pcbee_norev" btree (patient_control, carrier_id, batchnum,
encounternum, encounter_id)
"pcbee_norev_p" btree (patient_control, carrier_id, batchnum,
encounternum, encounter_id, procedure_id)
"pcbee_rev" btree (patient_control text_revop, carrier_id, batchnum,
encounternum, encounter_id)
"pcbee_rev_p" btree (patient_control text_revop, carrier_id, batchnum,
encounternum, encounter_id, procedure_id)
"rcbee_norev" btree (receipt, carrier_id, batchnum, encounternum,
encounter_id)
"rcbee_norev_p" btree (receipt, carrier_id, batchnum, encounternum,
encounter_id, procedure_id)
"rp_cbee_norev" btree (rendering_prov_name, carrier_id, batchnum,
encounternum, encounter_id)
"rp_cbee_norev_p" btree (rendering_prov_name, carrier_id, batchnum,
encounternum, encounter_id, procedure_id)
"rp_cbee_rev" btree (rendering_prov_name text_revop, carrier_id,
batchnum, encounternum, encounter_id)
"rp_cbee_rev_p" btree (rendering_prov_name text_revop, carrier_id,
batchnum, encounternum, encounter_id, procedure_id)
"sd_cbee_norev" btree (servicedate, carrier_id, batchnum,
encounternum, encounter_id)
"sd_cbee_norev_p" btree (servicedate, carrier_id, batchnum,
encounternum, encounter_id, procedure_id)
"sd_cbee_rev" btree (servicedate date_revop, carrier_id, batchnum,
encounternum, encounter_id)
"sd_cbee_rev_p" btree (servicedate date_revop, carrier_id, batchnum,
encounternum, encounter_id, procedure_id)
"testrev" btree (receipt date_revop, carrier_id, batchnum,
encounternum, encounter_id)
"testrev_p" btree (receipt date_revop, carrier_id, batchnum,
encounternum, encounter_id, procedure_id)

#15 Scott Marlowe
smarlowe@g2switchworks.com
In reply to: Steve (#12)
Re: Strangely Variable Query Performance

On Thu, 2007-04-12 at 16:03, Steve wrote:

Hey there;

On a Postgres 8.2.3 server, I've got a query that is running very slow in
some cases. With some work, I've determined the 'slow part' of the query.
:) This is a query on a table with like 10 million rows or something like
that. encounter_id is an integer and receipt is of type 'date'.

SNIP

Seq Scan on detail_summary ds (cost=0.00..1902749.83 rows=9962 width=4)
Filter: ((receipt >= '1998-12-30'::date) AND (encounter_id = ANY
('{8813186,8813187,8813188,8813189,8813190,8813191,8813192,8813193,8813194,8813195,8813196,8813197,8813198,8813199,8813200,8813201,8813202,8813203,8813204,8813205,8813206,8813207,8813208,8813209,8813210,8813211,8813212,8813213,8813214,8813215,8813216,8813217,8813218,8813219,8813220,8813221,8813222,8813223,8813224,8813225,8813226,8813227,8813228,8813229,8813230,8813231,8813232,8813233,8813234,8813235,8813236,8813237,8813238,8813239,8813240,8813241,8813242,8813243,8813244,8813245,8813246,8813247,8813248,8813249,8813250,8813251,8813252,8813253,8813254,8813255,8813256,8813257,8813258,8813259,8813260,8813261,8813262,8813263,8813264,8813265,8813266,8813267,8813268,8813269,8813270,8813271,8813272,8813273,8813274,8813275,8813276,8813277,8813278,8813279,8813280,8813281,8813282,8813283,8813284,8815534}'::integer[])))
(2 rows)

How accurate is the row estimate made by the planner? (explain analyze
to be sure)

#16 Jason Lustig
lustig@brandeis.edu
In reply to: Jeff Frost (#8)
Re: Slow Postgresql server

Hi all,

Wow! That's a lot to respond to. Let me go through some of the
ideas... First, I just turned on autovacuum, I forgot to do that. I'm
not seeing a major impact though. Also, I know that it's not optimal
for a dedicated server. It's not just for postgres, it's also got our
apache server on it. We're just getting started and didn't want to
make the major investment right now in getting the most expensive
server we can get. Within the next year, as our traffic grows, we
will most likely upgrade, but for now when we're in the beginning
phases of our project, we're going to work with this server.

In terms of RAID not helping speed-wise (only making an impact in
data integrity) - I was under the impression that even a mirrored
disk set improves speed, because read requests can be sent to either
of the disk controllers. Is this incorrect?

I turned on logging again, only logging queries > 5ms, and it caused
the same problems. I think it might be an issue within the OS's
logging facilities, since it's going through stderr.

Some of the queries are definitely making an impact on the speed. We
are constantly trying to improve performance, and part of that is
reassessing our indexes and denormalizing data where it would help.
We're also doing work with memcached to cache the results of some of
the more expensive operations.

Thanks for all your help guys - it's really fantastic to see the
community here! I've got a lot of database experience (mostly with ms
sql and mysql) but this is my first time doing serious work with
postgres and it's really a great system with great people too.

Jason

On Apr 12, 2007, at 11:35 AM, Jeff Frost wrote:

On Thu, 12 Apr 2007, Jason Lustig wrote:

 0 <-- BM starts here
10  0    180 700436  16420  91740    0    0     0   176  278  2923 59 41  0  0  0
11  0    180 696736  16420  91740    0    0     0     0  254  2904 57 43  0  0  0
12  0    180 691272  16420  91740    0    0     0     0  255  3043 60 39  1  0  0
 9  0    180 690396  16420  91740    0    0     0     0  254  3078 63 36  2  0  0

Obviously, I've turned off logging now but I'd like to get it
running again (without bogging down the server) so that I can
profile the system and find out which queries I need to optimize.
My logging settings (with unnecessary comments taken out) were:

So what did you get in the logs when you had logging turned on? If
you have the statement logging, perhaps it's worth running through
pgfouine to generate a report.

log_destination = 'syslog' # Valid values are combinations of
redirect_stderr = off # Enable capturing of stderr into log
log_min_duration_statement = 0 # -1 is disabled, 0 logs all statements
silent_mode = on # DO NOT USE without syslog or
log_duration = off
log_line_prefix = 'user=%u,db=%d' # Special values:
log_statement = 'none' # none, ddl, mod, all

Perhaps you just want to log slow queries > 100ms? But since you
don't seem to know what queries you're running on each web page,
I'd suggest you just turn on the following and run your benchmark
against it, then turn it back off:

log_duration = on
log_statement = 'all'

Then go grab pgfouine and run the report against the logs to see
what queries are chewing up all your time.

So you know, we're using Postgres 8.2.3. The database currently is
pretty small (we're just running a testing database right now with
a few megabytes of data). No doubt some of our queries are slow,
but I was concerned because no matter how slow the queries were
(at most the worst were taking a couple of msecs anyway), I was
getting ridiculously slow responses from the server. Outside of
logging, our only other non-default postgresql.conf items are:

shared_buffers = 13000 # min 128kB or max_connections*16kB
work_mem = 8096 # min 64kB

In terms of the server itself, I think that it uses software raid.
How can I tell? Our hosting company set it up with the server so I
guess I could ask them, but is there a program I can run which
will tell me the information? I also ran bonnie++ and got this
output:

Version 1.03       ------Sequential Output------ --Sequential Input- --Random-
                   -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine       Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
pgtest       2000M 29277  67 33819  15 15446   4 35144  62 48887   5 152.7   0
                   ------Sequential Create------ --------Random Create--------
                   -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
             files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                16 17886  77 +++++ +++ +++++ +++ 23258  99 +++++ +++ +++++ +++

So I'm getting 33MB and 48MB write/read respectively. Is this
slow? Is there anything I should be doing to optimize our RAID
configuration?

It's not fast, but at least it's about the same speed as an average
IDE drive from this era. More disks would help, but since you
indicate the DB fits in RAM with plenty of room to spare, how about
you update your effective_cache_size to something reasonable. You
can use the output of the 'free' command and take the cache number
and divide by 8 to get a reasonable value on linux. Then turn on
logging and run your benchmark. After that, run a pgfouine report
against the log and post us the explain analyze from your slow
queries.

And if Ron is indeed local, it might be worthwhile to contact him.
Someone onsite would likely get this taken care of much faster than
we can on the mailing list.

--
Jeff Frost, Owner <jeff@frostconsultingllc.com>
Frost Consulting, LLC http://www.frostconsultingllc.com/
Phone: 650-780-7908 FAX: 650-649-1954

#17 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Steve (#14)
Re: Strangely Variable Query Performance

Steve <cheetah@tanabi.org> writes:

Here's the table and its indexes. Before looking, a note: there are
several 'revop' indexes; these are for sorting. The customer insisted on,
frankly, meaninglessly complicated sorts. I don't think any of that
matters for our purposes here though :)

Oy vey ... I hope this is a read-mostly table, because having that many
indexes has got to be killing your insert/update performance.

I see that some of the revop indexes might be considered relevant to
this query, so how exactly have you got those opclasses defined?
There's built-in support for reverse sort as of CVS HEAD, but in
existing releases you must have cobbled something together, and I wonder
if that could be a contributing factor ...

regards, tom lane

#18 Steve
cheetah@tanabi.org
In reply to: Tom Lane (#17)
Re: Strangely Variable Query Performance

Oy vey ... I hope this is a read-mostly table, because having that many
indexes has got to be killing your insert/update performance.

Hahaha yeah these are read-only tables. Nightly inserts/updates.
Takes a few hours, depending on how many records (between 4 and 10
usually). But during the day, while querying, read only.

I see that some of the revop indexes might be considered relevant to
this query, so how exactly have you got those opclasses defined?
There's built-in support for reverse sort as of CVS HEAD, but in
existing releases you must have cobbled something together, and I wonder
if that could be a contributing factor ...

Here's the revops (the c functions are at the bottom):

CREATE FUNCTION ddd_date_revcmp(date, date) RETURNS integer
AS '/usr/local/pgsql/contrib/cmplib.so', 'ddd_date_revcmp'
LANGUAGE c STRICT;

CREATE FUNCTION ddd_int_revcmp(integer, integer) RETURNS integer
AS '/usr/local/pgsql/contrib/cmplib.so', 'ddd_int_revcmp'
LANGUAGE c STRICT;

CREATE FUNCTION ddd_text_revcmp(text, text) RETURNS integer
AS '/usr/local/pgsql/contrib/cmplib.so', 'ddd_text_revcmp'
LANGUAGE c STRICT;

CREATE OPERATOR CLASS date_revop
FOR TYPE date USING btree AS
OPERATOR 1 >(date,date) ,
OPERATOR 2 >=(date,date) ,
OPERATOR 3 =(date,date) ,
OPERATOR 4 <=(date,date) ,
OPERATOR 5 <(date,date) ,
FUNCTION 1 ddd_date_revcmp(date,date);

CREATE OPERATOR CLASS int4_revop
FOR TYPE integer USING btree AS
OPERATOR 1 >(integer,integer) ,
OPERATOR 2 >=(integer,integer) ,
OPERATOR 3 =(integer,integer) ,
OPERATOR 4 <=(integer,integer) ,
OPERATOR 5 <(integer,integer) ,
FUNCTION 1 ddd_int_revcmp(integer,integer);

CREATE OPERATOR CLASS text_revop
FOR TYPE text USING btree AS
OPERATOR 1 >(text,text) ,
OPERATOR 2 >=(text,text) ,
OPERATOR 3 =(text,text) ,
OPERATOR 4 <=(text,text) ,
OPERATOR 5 <(text,text) ,
FUNCTION 1 ddd_text_revcmp(text,text);

Datum ddd_date_revcmp(PG_FUNCTION_ARGS){
DateADT arg1=PG_GETARG_DATEADT(0);
DateADT arg2=PG_GETARG_DATEADT(1);

PG_RETURN_INT32(arg2 - arg1);
}

Datum ddd_int_revcmp(PG_FUNCTION_ARGS){
int32 arg1=PG_GETARG_INT32(0);
int32 arg2=PG_GETARG_INT32(1);

PG_RETURN_INT32(arg2 - arg1);
}

Datum ddd_text_revcmp(PG_FUNCTION_ARGS){
char* arg1=(char*)VARDATA(PG_GETARG_TEXT_P(0));
char* arg2=(char*)VARDATA(PG_GETARG_TEXT_P(1));

if((*arg1) != (*arg2)){
PG_RETURN_INT32(*arg2 - *arg1);
}else{
PG_RETURN_INT32(strcmp(arg2,arg1));
}
}

#19 Steve
cheetah@tanabi.org
In reply to: Scott Marlowe (#15)
Re: Strangely Variable Query Performance

Seq Scan on detail_summary ds (cost=0.00..1902749.83 rows=9962 width=4)
Filter: ((receipt >= '1998-12-30'::date) AND (encounter_id = ANY
('{8813186,8813187,8813188,8813189,8813190,8813191,8813192,8813193,8813194,8813195,8813196,8813197,8813198,8813199,8813200,8813201,8813202,8813203,8813204,8813205,8813206,8813207,8813208,8813209,8813210,8813211,8813212,8813213,8813214,8813215,8813216,8813217,8813218,8813219,8813220,8813221,8813222,8813223,8813224,8813225,8813226,8813227,8813228,8813229,8813230,8813231,8813232,8813233,8813234,8813235,8813236,8813237,8813238,8813239,8813240,8813241,8813242,8813243,8813244,8813245,8813246,8813247,8813248,8813249,8813250,8813251,8813252,8813253,8813254,8813255,8813256,8813257,8813258,8813259,8813260,8813261,8813262,8813263,8813264,8813265,8813266,8813267,8813268,8813269,8813270,8813271,8813272,8813273,8813274,8813275,8813276,8813277,8813278,8813279,8813280,8813281,8813282,8813283,8813284,8815534}'::integer[])))
(2 rows)

How accurate is the row estimate made by the planner? (explain analyze
to be sure)

Results:

Seq Scan on detail_summary ds (cost=0.00..1902749.83 rows=9962 width=4)
(actual time=62871.386..257258.249 rows=112 loops=1)
Filter: ((receipt >= '1998-12-30'::date) AND (encounter_id = ANY
('{8813186,8813187,8813188,8813189,8813190,8813191,8813192,8813193,8813194,8813195,8813196,8813197,8813198,8813199,8813200,8813201,8813202,8813203,8813204,8813205,8813206,8813207,8813208,8813209,8813210,8813211,8813212,8813213,8813214,8813215,8813216,8813217,8813218,8813219,8813220,8813221,8813222,8813223,8813224,8813225,8813226,8813227,8813228,8813229,8813230,8813231,8813232,8813233,8813234,8813235,8813236,8813237,8813238,8813239,8813240,8813241,8813242,8813243,8813244,8813245,8813246,8813247,8813248,8813249,8813250,8813251,8813252,8813253,8813254,8813255,8813256,8813257,8813258,8813259,8813260,8813261,8813262,8813263,8813264,8813265,8813266,8813267,8813268,8813269,8813270,8813271,8813272,8813273,8813274,8813275,8813276,8813277,8813278,8813279,8813280,8813281,8813282,8813283,8813284,8815534}'::integer[])))
Total runtime: 257258.652 ms

#20 Scott Marlowe
smarlowe@g2switchworks.com
In reply to: Steve (#19)
Re: Strangely Variable Query Performance

On Thu, 2007-04-12 at 17:04, Steve wrote:

Seq Scan on detail_summary ds (cost=0.00..1902749.83 rows=9962 width=4)
Filter: ((receipt >= '1998-12-30'::date) AND (encounter_id = ANY
('{8813186,8813187,8813188,8813189,8813190,8813191,8813192,8813193,8813194,8813195,8813196,8813197,8813198,8813199,8813200,8813201,8813202,8813203,8813204,8813205,8813206,8813207,8813208,8813209,8813210,8813211,8813212,8813213,8813214,8813215,8813216,8813217,8813218,8813219,8813220,8813221,8813222,8813223,8813224,8813225,8813226,8813227,8813228,8813229,8813230,8813231,8813232,8813233,8813234,8813235,8813236,8813237,8813238,8813239,8813240,8813241,8813242,8813243,8813244,8813245,8813246,8813247,8813248,8813249,8813250,8813251,8813252,8813253,8813254,8813255,8813256,8813257,8813258,8813259,8813260,8813261,8813262,8813263,8813264,8813265,8813266,8813267,8813268,8813269,8813270,8813271,8813272,8813273,8813274,8813275,8813276,8813277,8813278,8813279,8813280,8813281,8813282,8813283,8813284,8815534}'::integer[])))
(2 rows)

How accurate is the row estimate made by the planner? (explain analyze
to be sure)

Results:

Seq Scan on detail_summary ds (cost=0.00..1902749.83 rows=9962 width=4)
(actual time=62871.386..257258.249 rows=112 loops=1)
Filter: ((receipt >= '1998-12-30'::date) AND (encounter_id = ANY
('{8813186,8813187,8813188,8813189,8813190,8813191,8813192,8813193,8813194,8813195,8813196,8813197,8813198,8813199,8813200,8813201,8813202,8813203,8813204,8813205,8813206,8813207,8813208,8813209,8813210,8813211,8813212,8813213,8813214,8813215,8813216,8813217,8813218,8813219,8813220,8813221,8813222,8813223,8813224,8813225,8813226,8813227,8813228,8813229,8813230,8813231,8813232,8813233,8813234,8813235,8813236,8813237,8813238,8813239,8813240,8813241,8813242,8813243,8813244,8813245,8813246,8813247,8813248,8813249,8813250,8813251,8813252,8813253,8813254,8813255,8813256,8813257,8813258,8813259,8813260,8813261,8813262,8813263,8813264,8813265,8813266,8813267,8813268,8813269,8813270,8813271,8813272,8813273,8813274,8813275,8813276,8813277,8813278,8813279,8813280,8813281,8813282,8813283,8813284,8815534}'::integer[])))
Total runtime: 257258.652 ms

So there's a misjudgment of the number of rows returned by a factor of
about 88. That's pretty big. Since you had the same number without the
receipt date (I think...) then it's the encounter_id that's not being
counted right.

Try upping the stats target on that column and running analyze again and
see if you get closer to 112 in your analyze or not.
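
[Concretely, something along these lines; 500 is only an example target, the 8.2 maximum is 1000.]

ALTER TABLE detail_summary ALTER COLUMN encounter_id SET STATISTICS 500;
ANALYZE detail_summary;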

#21 Guido Neitzer
lists@event-s.net
In reply to: Jason Lustig (#16)
Re: Slow Postgresql server

On 12.04.2007, at 15:58, Jason Lustig wrote:

Wow! That's a lot to respond to. Let me go through some of the
ideas... First, I just turned on autovacuum, I forgot to do that.
I'm not seeing a major impact though. Also, I know that it's not
optimal for a dedicated server.

Hmm, why not? Have you recently vacuumed your db manually so it gets
cleaned up? Even a vacuum full might be useful if the db is really
bloated.

It's not just for postgres, it's also got our apache server on it.
We're just getting started and didn't want to make the major
investment right now in getting the most expensive server we can get

Hmmm, but more RAM would definitely make sense, especially in that
scenario. It really sounds like your machine is swapping to death.

What does the system say about memory usage?

Some of the queries are definitely making an impact on the speed.
We are constantly trying to improve performance, and part of that
is reassessing our indexes and denormalizing data where it would
help. We're also doing work with memcached to cache the results of
some of the more expensive operations.

Hmmm, that kills you even more, as it uses RAM. At the moment I really
don't think it has anything to do with PG itself, but rather with
not enough memory for what you want to achieve.

What might help is connection pooling, so that not so many
processes are created for the requests. It depends on your
"middleware" what you can do about that. pg_pool might be an option.

cug

#22 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Steve (#18)
Re: Strangely Variable Query Performance

Steve <cheetah@tanabi.org> writes:

Datum ddd_text_revcmp(PG_FUNCTION_ARGS){
char* arg1=(char*)VARDATA(PG_GETARG_TEXT_P(0));
char* arg2=(char*)VARDATA(PG_GETARG_TEXT_P(1));

if((*arg1) != (*arg2)){
PG_RETURN_INT32(*arg2 - *arg1);
}else{
PG_RETURN_INT32(strcmp(arg2,arg1));
}
}

[ itch... ] That code is just completely wrong, because the contents
of a TEXT datum aren't guaranteed null-terminated. It'd be better to
invoke bttextcmp and negate its result.
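
[For illustration, roughly what "invoke bttextcmp and negate its result" could look like under the usual V1 fmgr conventions; an untested sketch, not drop-in code.]

#include "postgres.h"
#include "fmgr.h"
#include "utils/builtins.h"

/* PG_MODULE_MAGIC;   -- needed once per shared library on 8.2+, if the module doesn't already declare it */

PG_FUNCTION_INFO_V1(ddd_text_revcmp);

Datum ddd_text_revcmp(PG_FUNCTION_ARGS)
{
        /* Let the built-in comparator handle the varlena text correctly,
           then flip the sign to get the reversed sort order. */
        int32   cmp = DatumGetInt32(DirectFunctionCall2(bttextcmp,
                                                        PG_GETARG_DATUM(0),
                                                        PG_GETARG_DATUM(1)));

        PG_RETURN_INT32(-cmp);
}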

That's not relevant to your immediate problem, but if you've noticed
any strange behavior with your text_revop indexes, that's the reason...

regards, tom lane

#23 Steve
cheetah@tanabi.org
In reply to: Tom Lane (#22)
Re: Strangely Variable Query Performance

[ itch... ] That code is just completely wrong, because the contents
of a TEXT datum aren't guaranteed null-terminated. It'd be better to
invoke bttextcmp and negate its result.

That's not relevant to your immediate problem, but if you've noticed
any strange behavior with your text_revop indexes, that's the reason...

The indexes have all worked, though I'll make the change anyway.
Documentation on how to code these things is pretty sketchy and I believe
I followed an example on the site if I remember right. :/

Thanks for the info though :)

Steve

#24 Steve
cheetah@tanabi.org
In reply to: Scott Marlowe (#20)
Re: Strangely Variable Query Performance

So there's a misjudgment of the number of rows returned by a factor of
about 88. That's pretty big. Since you had the same number without the
receipt date (I think...) then it's the encounter_id that's not being
counted right.

Try upping the stats target on that column and running analyze again and
see if you get closer to 112 in your analyze or not.

If I max the statistics targets at 1000, I get:

Seq Scan on detail_summary ds (cost=0.00..1903030.26 rows=1099 width=4)
Filter: ((receipt >= '1998-12-30'::date) AND (encounter_id = ANY
('{8813186,8813187,8813188,8813189,8813190,8813191,8813192,8813193,8813194,8813195,8813196,8813197,8813198,8813199,8813200,8813201,8813202,8813203,8813204,8813205,8813206,8813207,8813208,8813209,8813210,8813211,8813212,8813213,8813214,8813215,8813216,8813217,8813218,8813219,8813220,8813221,8813222,8813223,8813224,8813225,8813226,8813227,8813228,8813229,8813230,8813231,8813232,8813233,8813234,8813235,8813236,8813237,8813238,8813239,8813240,8813241,8813242,8813243,8813244,8813245,8813246,8813247,8813248,8813249,8813250,8813251,8813252,8813253,8813254,8813255,8813256,8813257,8813258,8813259,8813260,8813261,8813262,8813263,8813264,8813265,8813266,8813267,8813268,8813269,8813270,8813271,8813272,8813273,8813274,8813275,8813276,8813277,8813278,8813279,8813280,8813281,8813282,8813283,8813284,8815534}'::integer[])))

Setting it to 500 makes the estimated rows twice as much. It
seems to have no effect on anything though, either way. :)

Steve

#25 Steve
cheetah@tanabi.org
In reply to: Jason Lustig (#1)
Re: Strangely Variable Query Performance

Here's my planner parameters:

seq_page_cost = 1.0 # measured on an arbitrary scale
random_page_cost = 1.5 # same scale as above
cpu_tuple_cost = 0.001 # same scale as above
cpu_index_tuple_cost = 0.0005 # same scale as above
cpu_operator_cost = 0.00025 # same scale as above
effective_cache_size = 8192MB

default_statistics_target = 100 # range 1-1000

On a machine with 16 gigs of RAM. I tried to make it skew towards
indexes. However, even if I force it to use the indexes
(enable_seqscan=off) it doesn't make it any faster really :/

Steve

On Thu, 12 Apr 2007, Tom Lane wrote:

Scott Marlowe <smarlowe@g2switchworks.com> writes:

So there's a misjudgment of the number of rows returned by a factor of
about 88. That's pretty big. Since you had the same number without the
receipt date (I think...) then it's the encounter_id that's not being
counted right.

I don't think that's Steve's problem, though. It's certainly
misestimating, but nonetheless the cost estimate for the seqscan is
1902749.83 versus 14819.81 for the bitmap scan; it should've picked
the bitmap scan anyway.

I tried to duplicate the problem here, without any success; I get

Bitmap Heap Scan on detail_summary ds (cost=422.01..801.27 rows=100 width=4)
Recheck Cond: (encounter_id = ANY ('{8813186,8813187,8813188,8813189,8813190,8813191,8813192,8813193,8813194,8813195,8813196,8813197,8813198,8813199,8813200,8813201,8813202,8813203,8813204,8813205,8813206,8813207,8813208,8813209,8813210,8813211,8813212,8813213,8813214,8813215,8813216,8813217,8813218,8813219,8813220,8813221,8813222,8813223,8813224,8813225,8813226,8813227,8813228,8813229,8813230,8813231,8813232,8813233,8813234,8813235,8813236,8813237,8813238,8813239,8813240,8813241,8813242,8813243,8813244,8813245,8813246,8813247,8813248,8813249,8813250,8813251,8813252,8813253,8813254,8813255,8813256,8813257,8813258,8813259,8813260,8813261,8813262,8813263,8813264,8813265,8813266,8813267,8813268,8813269,8813270,8813271,8813272,8813273,8813274,8813275,8813276,8813277,8813278,8813279,8813280,8813281,8813282,8813283,8813284,8815534}'::integer[]))
Filter: (receipt >= '1998-12-30'::date)
-> Bitmap Index Scan on detail_summary_encounter_id_idx (cost=0.00..421.98 rows=100 width=0)
Index Cond: (encounter_id = ANY ('{8813186,8813187,8813188,8813189,8813190,8813191,8813192,8813193,8813194,8813195,8813196,8813197,8813198,8813199,8813200,8813201,8813202,8813203,8813204,8813205,8813206,8813207,8813208,8813209,8813210,8813211,8813212,8813213,8813214,8813215,8813216,8813217,8813218,8813219,8813220,8813221,8813222,8813223,8813224,8813225,8813226,8813227,8813228,8813229,8813230,8813231,8813232,8813233,8813234,8813235,8813236,8813237,8813238,8813239,8813240,8813241,8813242,8813243,8813244,8813245,8813246,8813247,8813248,8813249,8813250,8813251,8813252,8813253,8813254,8813255,8813256,8813257,8813258,8813259,8813260,8813261,8813262,8813263,8813264,8813265,8813266,8813267,8813268,8813269,8813270,8813271,8813272,8813273,8813274,8813275,8813276,8813277,8813278,8813279,8813280,8813281,8813282,8813283,8813284,8815534}'::integer[]))

so either this has been fixed by a post-8.2.3 bug fix (which I doubt,
it doesn't seem familiar at all) or there's some additional contributing
factor. Steve, are you using any nondefault planner parameters?

regards, tom lane

#26 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Scott Marlowe (#20)
Re: Strangely Variable Query Performance

Scott Marlowe <smarlowe@g2switchworks.com> writes:

So there's a misjudgment of the number of rows returned by a factor of
about 88. That's pretty big. Since you had the same number without the
receipt date (I think...) then it's the encounter_id that's not being
counted right.

I don't think that's Steve's problem, though. It's certainly
misestimating, but nonetheless the cost estimate for the seqscan is
1902749.83 versus 14819.81 for the bitmap scan; it should've picked
the bitmap scan anyway.

I tried to duplicate the problem here, without any success; I get

Bitmap Heap Scan on detail_summary ds (cost=422.01..801.27 rows=100 width=4)
Recheck Cond: (encounter_id = ANY ('{8813186,8813187,8813188,8813189,8813190,8813191,8813192,8813193,8813194,8813195,8813196,8813197,8813198,8813199,8813200,8813201,8813202,8813203,8813204,8813205,8813206,8813207,8813208,8813209,8813210,8813211,8813212,8813213,8813214,8813215,8813216,8813217,8813218,8813219,8813220,8813221,8813222,8813223,8813224,8813225,8813226,8813227,8813228,8813229,8813230,8813231,8813232,8813233,8813234,8813235,8813236,8813237,8813238,8813239,8813240,8813241,8813242,8813243,8813244,8813245,8813246,8813247,8813248,8813249,8813250,8813251,8813252,8813253,8813254,8813255,8813256,8813257,8813258,8813259,8813260,8813261,8813262,8813263,8813264,8813265,8813266,8813267,8813268,8813269,8813270,8813271,8813272,8813273,8813274,8813275,8813276,8813277,8813278,8813279,8813280,8813281,8813282,8813283,8813284,8815534}'::integer[]))
Filter: (receipt >= '1998-12-30'::date)
-> Bitmap Index Scan on detail_summary_encounter_id_idx (cost=0.00..421.98 rows=100 width=0)
Index Cond: (encounter_id = ANY ('{8813186,8813187,8813188,8813189,8813190,8813191,8813192,8813193,8813194,8813195,8813196,8813197,8813198,8813199,8813200,8813201,8813202,8813203,8813204,8813205,8813206,8813207,8813208,8813209,8813210,8813211,8813212,8813213,8813214,8813215,8813216,8813217,8813218,8813219,8813220,8813221,8813222,8813223,8813224,8813225,8813226,8813227,8813228,8813229,8813230,8813231,8813232,8813233,8813234,8813235,8813236,8813237,8813238,8813239,8813240,8813241,8813242,8813243,8813244,8813245,8813246,8813247,8813248,8813249,8813250,8813251,8813252,8813253,8813254,8813255,8813256,8813257,8813258,8813259,8813260,8813261,8813262,8813263,8813264,8813265,8813266,8813267,8813268,8813269,8813270,8813271,8813272,8813273,8813274,8813275,8813276,8813277,8813278,8813279,8813280,8813281,8813282,8813283,8813284,8815534}'::integer[]))

so either this has been fixed by a post-8.2.3 bug fix (which I doubt,
it doesn't seem familiar at all) or there's some additional contributing
factor. Steve, are you using any nondefault planner parameters?

regards, tom lane

#27Steve
cheetah@tanabi.org
In reply to: Jason Lustig (#1)
Re: Strangely Variable Query Performance

It's a redhat enterprise machine running AMD x64 processors.

Linux ers3.dddcorp.com 2.6.9-42.0.10.ELsmp #1 SMP Fri Feb 16 17:13:42 EST
2007 x86_64 x86_64 x86_64 GNU/Linux

It was compiled by me, straight up, nothing weird at all, no odd compiler
options or whatever :)

So yeah :/ I'm quite baffled as well,

Talk to you later,

Steve

On Thu, 12 Apr 2007, Tom Lane wrote:


Steve <cheetah@tanabi.org> writes:

Here's my planner parameters:

I copied all these, and my 8.2.x still likes the bitmap scan a lot
better than the seqscan. Furthermore, I double-checked the CVS history
and there definitely haven't been any changes in that area in REL8_2
branch since 8.2.3. So I'm a bit baffled. Maybe the misbehavior is
platform-specific ... what are you on exactly? Is there anything
nonstandard about your Postgres installation?

regards, tom lane

#28Tom Lane
tgl@sss.pgh.pa.us
In reply to: Steve (#25)
Re: Strangely Variable Query Performance

Steve <cheetah@tanabi.org> writes:

Here's my planner parameters:

I copied all these, and my 8.2.x still likes the bitmap scan a lot
better than the seqscan. Furthermore, I double-checked the CVS history
and there definitely haven't been any changes in that area in REL8_2
branch since 8.2.3. So I'm a bit baffled. Maybe the misbehavior is
platform-specific ... what are you on exactly? Is there anything
nonstandard about your Postgres installation?

regards, tom lane

#29Tom Lane
tgl@sss.pgh.pa.us
In reply to: Steve (#25)
Re: Strangely Variable Query Performance

Steve <cheetah@tanabi.org> writes:

... even if I force it to use the indexes
(enable_seqscan=off) it doesn't make it any faster really :/

Does that change the plan, or do you still get a seqscan?

BTW, how big is this table really (how many rows)?

regards, tom lane

#30Steve
cheetah@tanabi.org
In reply to: Tom Lane (#29)
Re: Strangely Variable Query Performance

Table size: 16,037,728 rows

With enable_seqscan=off I get:

Bitmap Heap Scan on detail_summary ds (cost=4211395.20..4213045.32
rows=1099 width=4)
Recheck Cond: ((receipt >= '1998-12-30'::date) AND (encounter_id = ANY
('{8813186,8813187,8813188,8813189,8813190,8813191,8813192,8813193,8813194,8813195,8813196,8813197,8813198,8813199,8813200,8813201,8813202,8813203,8813204,8813205,8813206,8813207,8813208,8813209,8813210,8813211,8813212,8813213,8813214,8813215,8813216,8813217,8813218,8813219,8813220,8813221,8813222,8813223,8813224,8813225,8813226,8813227,8813228,8813229,8813230,8813231,8813232,8813233,8813234,8813235,8813236,8813237,8813238,8813239,8813240,8813241,8813242,8813243,8813244,8813245,8813246,8813247,8813248,8813249,8813250,8813251,8813252,8813253,8813254,8813255,8813256,8813257,8813258,8813259,8813260,8813261,8813262,8813263,8813264,8813265,8813266,8813267,8813268,8813269,8813270,8813271,8813272,8813273,8813274,8813275,8813276,8813277,8813278,8813279,8813280,8813281,8813282,8813283,8813284,8815534}'::integer[])))
-> Bitmap Index Scan on detail_summary_receipt_encounter_idx
(cost=0.00..4211395.17 rows=1099 width=0)
Index Cond: ((receipt >= '1998-12-30'::date) AND (encounter_id =
ANY
('{8813186,8813187,8813188,8813189,8813190,8813191,8813192,8813193,8813194,8813195,8813196,8813197,8813198,8813199,8813200,8813201,8813202,8813203,8813204,8813205,8813206,8813207,8813208,8813209,8813210,8813211,8813212,8813213,8813214,8813215,8813216,8813217,8813218,8813219,8813220,8813221,8813222,8813223,8813224,8813225,8813226,8813227,8813228,8813229,8813230,8813231,8813232,8813233,8813234,8813235,8813236,8813237,8813238,8813239,8813240,8813241,8813242,8813243,8813244,8813245,8813246,8813247,8813248,8813249,8813250,8813251,8813252,8813253,8813254,8813255,8813256,8813257,8813258,8813259,8813260,8813261,8813262,8813263,8813264,8813265,8813266,8813267,8813268,8813269,8813270,8813271,8813272,8813273,8813274,8813275,8813276,8813277,8813278,8813279,8813280,8813281,8813282,8813283,8813284,8815534}'::integer[])))

The explain analyze is pending, running it now (however it doesn't really
appear to be any faster using this plan).

Steve

On Thu, 12 Apr 2007, Tom Lane wrote:


Steve <cheetah@tanabi.org> writes:

... even if I force it to use the indexes
(enable_seqscan=off) it doesn't make it any faster really :/

Does that change the plan, or do you still get a seqscan?

BTW, how big is this table really (how many rows)?

regards, tom lane


#31Steve
cheetah@tanabi.org
In reply to: Tom Lane (#29)
Re: Strangely Variable Query Performance

Here's the explain analyze with seqscan = off:

Bitmap Heap Scan on detail_summary ds (cost=4211395.20..4213045.32
rows=1099 width=4) (actual time=121288.825..121305.908 rows=112 loops=1)
Recheck Cond: ((receipt >= '1998-12-30'::date) AND (encounter_id = ANY
('{8813186,8813187,8813188,8813189,8813190,8813191,8813192,8813193,8813194,8813195,8813196,8813197,8813198,8813199,8813200,8813201,8813202,8813203,8813204,8813205,8813206,8813207,8813208,8813209,8813210,8813211,8813212,8813213,8813214,8813215,8813216,8813217,8813218,8813219,8813220,8813221,8813222,8813223,8813224,8813225,8813226,8813227,8813228,8813229,8813230,8813231,8813232,8813233,8813234,8813235,8813236,8813237,8813238,8813239,8813240,8813241,8813242,8813243,8813244,8813245,8813246,8813247,8813248,8813249,8813250,8813251,8813252,8813253,8813254,8813255,8813256,8813257,8813258,8813259,8813260,8813261,8813262,8813263,8813264,8813265,8813266,8813267,8813268,8813269,8813270,8813271,8813272,8813273,8813274,8813275,8813276,8813277,8813278,8813279,8813280,8813281,8813282,8813283,8813284,8815534}'::integer[])))
-> Bitmap Index Scan on detail_summary_receipt_encounter_idx
(cost=0.00..4211395.17 rows=1099 width=0) (actual
time=121256.681..121256.681 rows=112 loops=1)
Index Cond: ((receipt >= '1998-12-30'::date) AND (encounter_id =
ANY
('{8813186,8813187,8813188,8813189,8813190,8813191,8813192,8813193,8813194,8813195,8813196,8813197,8813198,8813199,8813200,8813201,8813202,8813203,8813204,8813205,8813206,8813207,8813208,8813209,8813210,8813211,8813212,8813213,8813214,8813215,8813216,8813217,8813218,8813219,8813220,8813221,8813222,8813223,8813224,8813225,8813226,8813227,8813228,8813229,8813230,8813231,8813232,8813233,8813234,8813235,8813236,8813237,8813238,8813239,8813240,8813241,8813242,8813243,8813244,8813245,8813246,8813247,8813248,8813249,8813250,8813251,8813252,8813253,8813254,8813255,8813256,8813257,8813258,8813259,8813260,8813261,8813262,8813263,8813264,8813265,8813266,8813267,8813268,8813269,8813270,8813271,8813272,8813273,8813274,8813275,8813276,8813277,8813278,8813279,8813280,8813281,8813282,8813283,8813284,8815534}'::integer[])))
Total runtime: 121306.233 ms

Your other question is answered in the other mail along with the
non-analyze'd query plan :D

Steve

On Thu, 12 Apr 2007, Tom Lane wrote:


Steve <cheetah@tanabi.org> writes:

... even if I force it to use the indexes
(enable_seqscan=off) it doesn't make it any faster really :/

Does that change the plan, or do you still get a seqscan?

BTW, how big is this table really (how many rows)?

regards, tom lane


#32Tom Lane
tgl@sss.pgh.pa.us
In reply to: Steve (#30)
Re: Strangely Variable Query Performance

Steve <cheetah@tanabi.org> writes:

With enable_seqscan=off I get:

-> Bitmap Index Scan on detail_summary_receipt_encounter_idx
(cost=0.00..4211395.17 rows=1099 width=0)
Index Cond: ((receipt >= '1998-12-30'::date) AND (encounter_id =
ANY ...

The explain analyze is pending, running it now (however it doesn't really
appear to be any faster using this plan).

Yeah, that index is nearly useless for this query --- since the receipt
condition isn't really eliminating anything, it'll have to look at every
index entry :-( ... in fact, do so again for each of the IN arms :-( :-(
So it's definitely right not to want to use that plan. Question is, why
is it seemingly failing to consider the "right" index?

I'm busy setting up my test case on an x86_64 machine right now, but
I rather fear it'll still work just fine for me. Have you got any
nondefault parameter settings besides the ones you already mentioned?

Another thing that might be interesting, if you haven't got a problem
with exclusive-locking the table for a little bit, is

BEGIN;
DROP INDEX each index except detail_summary_encounter_id_idx
EXPLAIN the problem query
ROLLBACK;

just to see if it does the right thing when it's not distracted by
all the "wrong" indexes (yeah, I'm grasping at straws here). If you
set up the above as a SQL script it should only take a second to run.
Please try this with both settings of enable_seqscan --- you don't need
to do "explain analyze" though, we just want to know which plan it picks
and what the cost estimate is.
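
(For concreteness, a minimal sketch of what such a script might look like ---
the one index dropped by name here is just an example, the real script would
drop every index on the table except detail_summary_encounter_id_idx, and the
shortened IN list stands in for the real 100-value one:

BEGIN;
DROP INDEX detail_summary_receipt_encounter_idx;
-- ... and so on for each remaining index except detail_summary_encounter_id_idx ...
EXPLAIN SELECT ds.encounter_id FROM detail_summary ds
  WHERE ds.receipt >= '1998-12-30'
    AND ds.encounter_id IN (8813186, 8813187, 8815534);
ROLLBACK;

DROP INDEX is transactional, so the ROLLBACK puts everything back exactly as
it was.)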

regards, tom lane

#33Steve
cheetah@tanabi.org
In reply to: Tom Lane (#32)
Re: Strangely Variable Query Performance

If the other indexes are removed, with enable_seqscan=on:

Bitmap Heap Scan on detail_summary ds (cost=154.10..1804.22 rows=1099
width=4)
Recheck Cond: (encounter_id = ANY
('{8813186,8813187,8813188,8813189,8813190,8813191,8813192,8813193,8813194,8813195,8813196,8813197,8813198,8813199,8813200,8813201,8813202,8813203,8813204,8813205,8813206,8813207,8813208,8813209,8813210,8813211,8813212,8813213,8813214,8813215,8813216,8813217,8813218,8813219,8813220,8813221,8813222,8813223,8813224,8813225,8813226,8813227,8813228,8813229,8813230,8813231,8813232,8813233,8813234,8813235,8813236,8813237,8813238,8813239,8813240,8813241,8813242,8813243,8813244,8813245,8813246,8813247,8813248,8813249,8813250,8813251,8813252,8813253,8813254,8813255,8813256,8813257,8813258,8813259,8813260,8813261,8813262,8813263,8813264,8813265,8813266,8813267,8813268,8813269,8813270,8813271,8813272,8813273,8813274,8813275,8813276,8813277,8813278,8813279,8813280,8813281,8813282,8813283,8813284,8815534}'::integer[]))
Filter: (receipt >= '1998-12-30'::date)
-> Bitmap Index Scan on detail_summary_encounter_id_idx
(cost=0.00..154.07 rows=1099 width=0)
Index Cond: (encounter_id = ANY
('{8813186,8813187,8813188,8813189,8813190,8813191,8813192,8813193,8813194,8813195,8813196,8813197,8813198,8813199,8813200,8813201,8813202,8813203,8813204,8813205,8813206,8813207,8813208,8813209,8813210,8813211,8813212,8813213,8813214,8813215,8813216,8813217,8813218,8813219,8813220,8813221,8813222,8813223,8813224,8813225,8813226,8813227,8813228,8813229,8813230,8813231,8813232,8813233,8813234,8813235,8813236,8813237,8813238,8813239,8813240,8813241,8813242,8813243,8813244,8813245,8813246,8813247,8813248,8813249,8813250,8813251,8813252,8813253,8813254,8813255,8813256,8813257,8813258,8813259,8813260,8813261,8813262,8813263,8813264,8813265,8813266,8813267,8813268,8813269,8813270,8813271,8813272,8813273,8813274,8813275,8813276,8813277,8813278,8813279,8813280,8813281,8813282,8813283,8813284,8815534}'::integer[]))

With it off:

Bitmap Heap Scan on detail_summary ds (cost=154.10..1804.22 rows=1099
width=4)
Recheck Cond: (encounter_id = ANY
('{8813186,8813187,8813188,8813189,8813190,8813191,8813192,8813193,8813194,8813195,8813196,8813197,8813198,8813199,8813200,8813201,8813202,8813203,8813204,8813205,8813206,8813207,8813208,8813209,8813210,8813211,8813212,8813213,8813214,8813215,8813216,8813217,8813218,8813219,8813220,8813221,8813222,8813223,8813224,8813225,8813226,8813227,8813228,8813229,8813230,8813231,8813232,8813233,8813234,8813235,8813236,8813237,8813238,8813239,8813240,8813241,8813242,8813243,8813244,8813245,8813246,8813247,8813248,8813249,8813250,8813251,8813252,8813253,8813254,8813255,8813256,8813257,8813258,8813259,8813260,8813261,8813262,8813263,8813264,8813265,8813266,8813267,8813268,8813269,8813270,8813271,8813272,8813273,8813274,8813275,8813276,8813277,8813278,8813279,8813280,8813281,8813282,8813283,8813284,8815534}'::integer[]))
Filter: (receipt >= '1998-12-30'::date)
-> Bitmap Index Scan on detail_summary_encounter_id_idx
(cost=0.00..154.07 rows=1099 width=0)
Index Cond: (encounter_id = ANY
('{8813186,8813187,8813188,8813189,8813190,8813191,8813192,8813193,8813194,8813195,8813196,8813197,8813198,8813199,8813200,8813201,8813202,8813203,8813204,8813205,8813206,8813207,8813208,8813209,8813210,8813211,8813212,8813213,8813214,8813215,8813216,8813217,8813218,8813219,8813220,8813221,8813222,8813223,8813224,8813225,8813226,8813227,8813228,8813229,8813230,8813231,8813232,8813233,8813234,8813235,8813236,8813237,8813238,8813239,8813240,8813241,8813242,8813243,8813244,8813245,8813246,8813247,8813248,8813249,8813250,8813251,8813252,8813253,8813254,8813255,8813256,8813257,8813258,8813259,8813260,8813261,8813262,8813263,8813264,8813265,8813266,8813267,8813268,8813269,8813270,8813271,8813272,8813273,8813274,8813275,8813276,8813277,8813278,8813279,8813280,8813281,8813282,8813283,8813284,8815534}'::integer[]))

Either way, it runs perfectly fast. So it looks like the indexes are
confusing this query like you suspected. Any advice? This isn't the only
query we run on this table; much as I'd absolutely love to kill off some
indexes to improve our nightly load times, I can't foul up the other
queries :)

Thank you very much for all your help on this issue, too!

Steve

On Thu, 12 Apr 2007, Tom Lane wrote:


Steve <cheetah@tanabi.org> writes:

With enable_seqscan=off I get:

-> Bitmap Index Scan on detail_summary_receipt_encounter_idx
(cost=0.00..4211395.17 rows=1099 width=0)
Index Cond: ((receipt >= '1998-12-30'::date) AND (encounter_id =
ANY ...

The explain analyze is pending, running it now (however it doesn't really
appear to be any faster using this plan).

Yeah, that index is nearly useless for this query --- since the receipt
condition isn't really eliminating anything, it'll have to look at every
index entry :-( ... in fact, do so again for each of the IN arms :-( :-(
So it's definitely right not to want to use that plan. Question is, why
is it seemingly failing to consider the "right" index?

I'm busy setting up my test case on an x86_64 machine right now, but
I rather fear it'll still work just fine for me. Have you got any
nondefault parameter settings besides the ones you already mentioned?

Another thing that might be interesting, if you haven't got a problem
with exclusive-locking the table for a little bit, is

BEGIN;
DROP INDEX each index except detail_summary_encounter_id_idx
EXPLAIN the problem query
ROLLBACK;

just to see if it does the right thing when it's not distracted by
all the "wrong" indexes (yeah, I'm grasping at straws here). If you
set up the above as a SQL script it should only take a second to run.
Please try this with both settings of enable_seqscan --- you don't need
to do "explain analyze" though, we just want to know which plan it picks
and what the cost estimate is.

regards, tom lane


#34Tom Lane
tgl@sss.pgh.pa.us
In reply to: Steve (#33)
Re: Strangely Variable Query Performance

Steve <cheetah@tanabi.org> writes:

Either way, it runs perfectly fast. So it looks like the indexes are
confusing this query like you suspected. Any advice?

Wow --- sometimes grasping at straws pays off. I was testing here with
just a subset of the indexes to save build time, but I bet that one of
the "irrelevant" ones is affecting this somehow. Time to re-test.

If you have some time to kill, it might be interesting to vary that
begin/rollback test script to leave one or two other indexes in place,
and see if you can identify exactly which other index(es) get it
confused.

I'm about to go out to dinner with the wife, but will have a closer
look when I get back, or tomorrow morning. We'll figure this out.

regards, tom lane

#35Steve
cheetah@tanabi.org
In reply to: Tom Lane (#34)
Re: Strangely Variable Query Performance

Okay -- I started leaving indexes on one by one.

The explain broke when the detail_summary_receipt_encounter_idx index was
left on (receipt, encounter_id).

Just dropping that index had no effect, but there's a LOT of indexes that
refer to receipt. So on a hunch I tried dropping all indexes that refer
to receipt date and that worked -- so it's the indexes that contain
receipt date that are the problem.

For more fun, I tried leaving the index that's just receipt date alone
(index detail_summary_receipt_id_idx) and THAT produced the correct plan;
it's all these multi-column indexes that are fouling things up, it would
seem!

.... So does this mean I should experiment with dropping those indexes?
I'm not sure if that will result in 'bad things' as there are other
complicated actions like sorts that may go real slow if I drop those
indexes. BUT I think it'll be easy to convince the customer to drop their
absurdly complicated sorts if I can come back with serious results like
what we've worked out here.

And thanks again -- have a good dinner! :)

Steve

On Thu, 12 Apr 2007, Tom Lane wrote:


Steve <cheetah@tanabi.org> writes:

Either way, it runs perfectly fast. So it looks like the indexes are
confusing this query like you suspected. Any advice?

Wow --- sometimes grasping at straws pays off. I was testing here with
just a subset of the indexes to save build time, but I bet that one of
the "irrelevant" ones is affecting this somehow. Time to re-test.

If you have some time to kill, it might be interesting to vary that
begin/rollback test script to leave one or two other indexes in place,
and see if you can identify exactly which other index(es) get it
confused.

I'm about to go out to dinner with the wife, but will have a closer
look when I get back, or tomorrow morning. We'll figure this out.

regards, tom lane


#36Tom Lane
tgl@sss.pgh.pa.us
In reply to: Steve (#35)
Re: Strangely Variable Query Performance

Steve <cheetah@tanabi.org> writes:

Okay -- I started leaving indexes on one by one.
...
.... So does this mean I should experiment with dropping those indexes?

No, I think this means there's a planner bug to fix. I haven't quite
scoped out what it is yet, though.

regards, tom lane

#37Tom Lane
tgl@sss.pgh.pa.us
In reply to: Steve (#35)
Re: Strangely Variable Query Performance

Steve <cheetah@tanabi.org> writes:

Just dropping that index had no effect, but there's a LOT of indexes that
refer to receipt. So on a hunch I tried dropping all indexes that refer
to receipt date and that worked -- so it's the indexes that contain
receipt date that are the problem.

I'm still not having any luck reproducing the failure here. Grasping at
straws again, I wonder if it's got something to do with the order in
which the planner examines the indexes --- which is OID order. Could
you send me the results of

select indexrelid::regclass from pg_index where indrelid = 'detail_summary'::regclass order by indexrelid;
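
(As an aside, a variant of that query that also pulls in pg_get_indexdef()
makes it easy to eyeball which of those indexes include receipt --- purely
optional, the plain OID-ordered list above is what matters here:

select indexrelid::regclass, pg_get_indexdef(indexrelid)
from pg_index
where indrelid = 'detail_summary'::regclass
order by indexrelid;
)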

regards, tom lane

#38Steve
cheetah@tanabi.org
In reply to: Tom Lane (#37)
Re: Strangely Variable Query Performance

Here you go:

detail_summary_b_record_status_idx
detail_summary_batch_id_idx
detail_summary_batchnum_idx
detail_summary_carrier_id_idx
detail_summary_duplicate_id_idx
detail_summary_e_record_status_idx
detail_summary_encounter_id_idx
detail_summary_encounternum_idx
detail_summary_export_date_idx
detail_summary_hedis_date_idx
detail_summary_member_name_idx
detail_summary_member_num_idx
detail_summary_p_record_status_idx
detail_summary_patient_control_idx
detail_summary_procedurecode_idx
detail_summary_product_line_idx
detail_summary_provider_id_idx
detail_summary_raps_date_idx
detail_summary_receipt_id_idx
detail_summary_referrering_prov_id_idx
detail_summary_rendering_prov_id_idx
detail_summary_rendering_prov_name_idx
detail_summary_servicedate_idx
ds_sort_1
ds_sort_10
ed_cbee_norev
ed_cbee_norev_p
ed_cbee_rev
ed_cbee_rev_p
mcbe
mcbe_p
mcbe_rev
mcbe_rev_p
mcbee_norev
mcbee_norev_p
mcbee_rev
mcbee_rev_p
pcbee_norev
pcbee_norev_p
pcbee_rev
pcbee_rev_p
rcbee_norev
rcbee_norev_p
rp_cbee_norev
rp_cbee_norev_p
rp_cbee_rev
rp_cbee_rev_p
sd_cbee_norev
sd_cbee_norev_p
sd_cbee_rev
sd_cbee_rev_p
testrev
testrev_p
detail_summary_receipt_encounter_idx

On Thu, 12 Apr 2007, Tom Lane wrote:


Steve <cheetah@tanabi.org> writes:

Just dropping that index had no effect, but there's a LOT of indexes that
refer to receipt. So on a hunch I tried dropping all indexes that refer
to receipt date and that worked -- so it's the indexes that contain
receipt date that are the problem.

I'm still not having any luck reproducing the failure here. Grasping at
straws again, I wonder if it's got something to do with the order in
which the planner examines the indexes --- which is OID order. Could
you send me the results of

select indexrelid::regclass from pg_index where indrelid = 'detail_summary'::regclass order by indexrelid;

regards, tom lane

#39Tom Lane
tgl@sss.pgh.pa.us
In reply to: Steve (#38)
Re: Strangely Variable Query Performance

Steve <cheetah@tanabi.org> writes:

On Thu, 12 Apr 2007, Tom Lane wrote:

I'm still not having any luck reproducing the failure here. Grasping at
straws again, I wonder if it's got something to do with the order in
which the planner examines the indexes --- which is OID order. Could
you send me the results of

Here you go:

Nope, still doesn't fail for me. [ baffled and annoyed... ] There must
be something about your situation that's relevant but we aren't
recognizing it. Are you in a position to let me ssh into your machine
and try to debug it? Or other options like sending me a dump of your
database? I'm about out of steam for tonight, but let's talk about it
off-list tomorrow.

regards, tom lane

#40Tom Lane
tgl@sss.pgh.pa.us
In reply to: Steve (#38)
choose_bitmap_and again (was Re: Strangely Variable Query Performance)

Steve <cheetah@tanabi.org> writes:

[ strange planner misbehavior in 8.2.3 ]

After some off-list investigation (thanks, Steve, for letting me poke
at your machine), the short answer is that the heuristics used by
choose_bitmap_and() suck. The problem query is like

select ... from ds where
ds.receipt >= '1998-12-30 0:0:0' and
ds.encounter_id in ( ... 100 distinct values ... );

and the table has a truly remarkable variety of indexes on encounter_id,
receipt, and combinations of them with other columns. The receipt
condition is actually in effect a no-op, because all receipt dates are
later than that, but because ineq_histogram_selectivity doesn't trust
histogram data unreservedly we compute a selectivity of about 0.99997
for it. That means that the indexes that cover both receipt and
encounter_id are given a selectivity score just fractionally better than
those involving encounter_id alone, and therefore they sort first in
choose_bitmap_and's sort step, and the way that that routine is coded,
only combinations of the very first index with other ones will be
considered for a bitmap heap scan. So the possibility of using just the
index on encounter_id alone is never considered, even though that
alternative is vastly cheaper than the alternatives that are considered.
(It happens that encounter_id is a low-order column in all the indexes
that include receipt, and so these scans end up covering the whole index
... multiple times even. The cost estimation is fine --- the thing
knows these are expensive --- what's falling down is the heuristic for
which combinations of indexes to consider using in a bitmap scan.)
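
(To make the shape of that concrete, here is a stripped-down sketch ---
table, column, and index names are just illustrative stand-ins for the real,
much larger schema:

create table ds (encounter_id integer, receipt date);
create index ds_encounter_id_idx on ds (encounter_id);
create index ds_receipt_encounter_idx on ds (receipt, encounter_id);

select count(*) from ds
 where ds.receipt >= '1998-12-30'
   and ds.encounter_id in (8813186, 8813187);  -- really ~100 values

With receipt matching essentially every row, the two-column index scores a
selectivity just fractionally "better" than ds_encounter_id_idx, sorts first,
and the cheap single-column index never gets considered on its own.)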

The original coding of choose_bitmap_and involved a "fuzzy" comparison
of selectivities, which would have avoided this problem, but we got rid
of that later because it had its own problems. In fact,
choose_bitmap_and has caused us enough problems that I'm thinking we
need a fundamental rethink of how it works, rather than just marginal
tweaks. If you haven't looked at this code before, the comments explain
the idea well enough:

/*
* choose_bitmap_and
* Given a nonempty list of bitmap paths, AND them into one path.
*
* This is a nontrivial decision since we can legally use any subset of the
* given path set. We want to choose a good tradeoff between selectivity
* and cost of computing the bitmap.
*
* The result is either a single one of the inputs, or a BitmapAndPath
* combining multiple inputs.
*/
...
/*
* In theory we should consider every nonempty subset of the given paths.
* In practice that seems like overkill, given the crude nature of the
* estimates, not to mention the possible effects of higher-level AND and
* OR clauses. As a compromise, we sort the paths by selectivity. We
* always take the first, and sequentially add on paths that result in a
* lower estimated cost.
*
* We also make some effort to detect directly redundant input paths, as
* can happen if there are multiple possibly usable indexes. (Another way
* it can happen is that best_inner_indexscan will find the same OR join
* clauses that create_or_index_quals has pulled OR restriction clauses
* out of, and then both versions show up as duplicate paths.) We
* consider an index redundant if any of its index conditions were already
* used by earlier indexes. (We could use predicate_implied_by to have a
* more intelligent, but much more expensive, check --- but in most cases
* simple pointer equality should suffice, since after all the index
* conditions are all coming from the same RestrictInfo lists.)
*
* You might think the condition for redundancy should be "all index
* conditions already used", not "any", but this turns out to be wrong.
* For example, if we use an index on A, and then come to an index with
* conditions on A and B, the only way that the second index can be later
* in the selectivity-order sort is if the condition on B is completely
* non-selective. In any case, we'd surely be drastically misestimating
* the selectivity if we count the same condition twice.
*
* We include index predicate conditions in the redundancy test. Because
* the test is just for pointer equality and not equal(), the effect is
* that use of the same partial index in two different AND elements is
* considered redundant. (XXX is this too strong?)
*
* Note: outputting the selected sub-paths in selectivity order is a good
* thing even if we weren't using that as part of the selection method,
* because it makes the short-circuit case in MultiExecBitmapAnd() more
* likely to apply.
*/

One idea I thought about was to sort by index scan cost, using
selectivity only as a tiebreaker for cost, rather than the other way
around as is currently done. This seems fairly plausible because
indexscans that are cheaper than other indexscans likely return fewer
rows too, and so selectivity is already accounted for to some extent ---
at least you can't have an enormously worse selectivity at lower cost,
whereas Steve's example proves it doesn't work the other way. But I'm
worried about breaking the reasoning about redundant indexes that's
mentioned in the comments.

Another alternative that would respond to the immediate problem is to
maintain the current sort order, but as we come to each index, consider
using that one alone, and throw away whatever AND we might have built up
if that one alone beats the AND-so-far. This seems more conservative,
as it's unlikely to break any cases that work well now, but on the other
hand it feels like plastering another wart atop a structure that's
already rather rickety.

Has anyone got any thoughts about the best way to do this?

regards, tom lane

#41Alvaro Herrera
alvherre@commandprompt.com
In reply to: Tom Lane (#40)
Re: [HACKERS] choose_bitmap_and again (was Re: Strangely Variable Query Performance)

Tom Lane wrote:

One idea I thought about was to sort by index scan cost, using
selectivity only as a tiebreaker for cost, rather than the other way
around as is currently done. This seems fairly plausible because
indexscans that are cheaper than other indexscans likely return fewer
rows too, and so selectivity is already accounted for to some extent ---
at least you can't have an enormously worse selectivity at lower cost,
whereas Steve's example proves it doesn't work the other way. But I'm
worried about breaking the reasoning about redundant indexes that's
mentioned in the comments.

Another alternative that would respond to the immediate problem is to
maintain the current sort order, but as we come to each index, consider
using that one alone, and throw away whatever AND we might have built up
if that one alone beats the AND-so-far. This seems more conservative,
as it's unlikely to break any cases that work well now, but on the other
hand it feels like plastering another wart atop a structure that's
already rather rickety.

Has anyone got any thoughts about the best way to do this?

How about doing both: sort the index by index scan cost; then pick the
first index on the list and start adding indexes when they lower the
cost. When adding each index, consider it by itself against the
already stacked indexes. If the cost is lower, put this index at the
top of the list, and restart the algorithm (after the sorting step of
course).

I think the concern about condition redundancy should be attacked
separately. How about just comparing whether they have common prefixes
of conditions? I admit I don't understand what would happen with
indexes defined like (lower(A), B, C) versus (A, B) for example.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#42Simon Riggs
simon@2ndquadrant.com
In reply to: Tom Lane (#40)
Re: choose_bitmap_and again (was Re: [PERFORM] Strangely Variable Query Performance)

On Fri, 2007-04-13 at 18:48 -0400, Tom Lane wrote:

Has anyone got any thoughts about the best way to do this?

I don't think we know enough to pick one variant that works in all
cases. This requires more detailed analysis of various cases.

Lets put in a parameter to allow the options to be varied. The purpose
would be to look for some more information that allows us to see what
the pre-conditions would be for each heuristic.

Initially, we say this is a beta-only feature and may be withdrawn in
the production version.

"Why did it not pick my index?" is a tiring game, but one that must be
played. I hope that the ORDER LIMIT optimization will reduce the
number of indexes chosen to maintain sort order in the output.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

#43Dann Corbit
DCorbit@connx.com
In reply to: Alvaro Herrera (#41)
Re: choose_bitmap_and again (was Re: [PERFORM] Strangely Variable Query Performance)

-----Original Message-----
From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Alvaro Herrera
Sent: Friday, April 13, 2007 4:24 PM
To: Tom Lane
Cc: pgsql-hackers@postgreSQL.org; PostgreSQL Performance; Steve
Subject: Re: [HACKERS] choose_bitmap_and again (was Re: [PERFORM] Strangely Variable Query Performance)

Tom Lane wrote:

One idea I thought about was to sort by index scan cost, using
selectivity only as a tiebreaker for cost, rather than the other way
around as is currently done. This seems fairly plausible because
indexscans that are cheaper than other indexscans likely return fewer
rows too, and so selectivity is already accounted for to some extent ---
at least you can't have an enormously worse selectivity at lower cost,
whereas Steve's example proves it doesn't work the other way. But I'm
worried about breaking the reasoning about redundant indexes that's
mentioned in the comments.

Another alternative that would respond to the immediate problem is to
maintain the current sort order, but as we come to each index, consider
using that one alone, and throw away whatever AND we might have built up
if that one alone beats the AND-so-far. This seems more conservative,
as it's unlikely to break any cases that work well now, but on the other
hand it feels like plastering another wart atop a structure that's
already rather rickety.

Has anyone got any thoughts about the best way to do this?

How about doing both: sort the index by index scan cost; then pick the
first index on the list and start adding indexes when they lower the
cost. When adding each index, consider it by itself against the
already stacked indexes. If the cost is lower, put this index at the
top of the list, and restart the algorithm (after the sorting step of
course).

I think the concern about condition redundancy should be attacked
separately. How about just comparing whether they have common prefixes
of conditions? I admit I don't understand what would happen with
indexes defined like (lower(A), B, C) versus (A, B) for example.

Instead of sorting, I suggest the quickselect() algorithm, which is
O(n).
Probably, if the list is small, it won't matter much, but it might offer
some tangible benefit.

Here is an example of the algorithm:

#include <stdlib.h>
typedef double Etype; /* Season to taste. */

extern Etype RandomSelect(Etype * A, size_t p, size_t r, size_t i);
extern size_t RandRange(size_t a, size_t b);
extern size_t RandomPartition(Etype * A, size_t p, size_t r);
extern size_t Partition(Etype * A, size_t p, size_t r);

/*
**
** In the following code, every reference to CLR means:
**
** "Introduction to Algorithms"
** By Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest
** ISBN 0-07-013143-0
*/

/*
** CLR, page 187
*/
Etype RandomSelect(Etype A[], size_t p, size_t r, size_t i)
{
size_t q,
k;
if (p == r)
return A[p];
q = RandomPartition(A, p, r);
k = q - p + 1;

if (i <= k)
return RandomSelect(A, p, q, i);
else
return RandomSelect(A, q + 1, r, i - k);
}

size_t RandRange(size_t a, size_t b)
{
size_t c = (size_t) ((double) rand() / ((double) RAND_MAX + 1) * (b
- a));
return c + a;
}

/*
** CLR, page 162
*/
size_t RandomPartition(Etype A[], size_t p, size_t r)
{
size_t i = RandRange(p, r);
Etype Temp;
Temp = A[p];
A[p] = A[i];
A[i] = Temp;
return Partition(A, p, r);
}

/*
** CLR, page 154
*/
size_t Partition(Etype A[], size_t p, size_t r)
{
Etype x,
temp;
size_t i,
j;

x = A[p];
i = p - 1;
j = r + 1;

for (;;) {
do {
j--;
} while (!(A[j] <= x));
do {
i++;
} while (!(A[i] >= x));
if (i < j) {
temp = A[i];
A[i] = A[j];
A[j] = temp;
} else
return j;
}
}

#44Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dann Corbit (#43)
Re: [HACKERS] choose_bitmap_and again (was Re: Strangely Variable Query Performance)

"Dann Corbit" <DCorbit@connx.com> writes:

Instead of sorting, I suggest the quickselect() algorithm, which is
O(n).

What for? Common cases have less than half a dozen entries. That is
not the place we need to be spending engineering effort --- what we
need to worry about is what's the choice algorithm, not implementation
details.

regards, tom lane

#45Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#41)
Re: [HACKERS] choose_bitmap_and again (was Re: Strangely Variable Query Performance)

Alvaro Herrera <alvherre@commandprompt.com> writes:

I think the concern about condition redundancy should be attacked
separately. How about just comparing whether they have common prefixes
of conditions? I admit I don't understand what would happen with
indexes defined like (lower(A), B, C) versus (A, B) for example.

I understand that issue a bit better than I did when I wrote the comment
(so I suppose I better rewrite it). The $64 reason for rejecting
AND-combinations of indexes that are using some of the same
WHERE-conditions is that if we don't, we effectively double-count the
selectivity of those conditions, causing us to prefer useless
AND-combinations. An example is "WHERE A > 42 AND B < 100" where we
have an index on A and one on (A,B). The selectivity calculation
will blindly assume that the selectivities of the two indexes are
independent and thus prefer to BitmapAnd them, when obviously there
is no point in using both. Ideally we should improve the selectivity
calculation to not get fooled like this, but that seems hard and
probably slow. So for the moment we have the heuristic that no
WHERE-clause should be used twice in any AND-combination.
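
(A toy illustration of that trap, with made-up table and index names:

create table t (a integer, b integer);
create index t_a_idx on t (a);
create index t_a_b_idx on t (a, b);
-- after loading some data and running ANALYZE:
explain select * from t where a > 42 and b < 100;

Treating the two indexscans' selectivities as independent would make a
BitmapAnd of t_a_idx with t_a_b_idx look more selective than t_a_b_idx
alone, even though the two-column index already covers both conditions.)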

Given that we are using that heuristic, it becomes important that
we visit the indexes in the "right order" --- as the code stands,
in the (A) vs (A,B) case it will consider only the first index it
arrives at in the selectivity sort order, because the second will
be rejected on the basis of re-use of the WHERE A > 42 condition.
Sorting by selectivity tends to ensure that we pick the index that
can make use of as many WHERE-clauses as possible.

The idea of considering each index alone fixes the order dependency
for cases where a single index is the best answer, but it doesn't
help much for cases where you really do want a BitmapAnd, only not
one using the index with the individually best selectivity.

We really need a heuristic here --- exhaustive search will be
impractical in cases where there are many indexes, because you'd
be looking at 2^N combinations. (In Steve's example there are
actually 38 relevant indexes, which is bad database design if
you ask me, but in any case we cannot afford to search through
2^38 possibilities.) But the one we're using now is too fragile.

Maybe we should use a cutoff similar to the GEQO one: do exhaustive
search if there are less than N relevant indexes, for some N.
But that's not going to help Steve; we still need a smarter heuristic
for what to look for above the cutoff.

regards, tom lane

#46Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#41)
Re: [HACKERS] choose_bitmap_and again (was Re: Strangely Variable Query Performance)

Alvaro Herrera <alvherre@commandprompt.com> writes:

Tom Lane wrote:

Has anyone got any thoughts about the best way to do this?

How about doing both: sort the index by index scan cost; then pick the
first index on the list and start adding indexes when they lower the
cost. When adding each index, consider it by itself against the
already stacked indexes. If the cost is lower, put this index at the
top of the list, and restart the algorithm (after the sorting step of
course).

The "restart" part of that bothers me --- it's not entirely clear that
you couldn't get into an infinite loop. (Imagine that A+B is better
than A alone, so we adopt it, but worse than C alone, so we take C as
the new leader and start over. Then perhaps C+B is better than C alone
but worse than A alone, so we take A as the leader and start over.
Maybe this is impossible but I'm unsure.)

I looked back at the gdb results I'd gotten from Steve's example and
noticed that for his 38 indexes there were only three distinct index
selectivity values, and the sort step grouped indexes by cost within
those groups. In hindsight of course this is obvious: the selectivity
depends on the set of WHERE-clauses used, so with two WHERE-clauses
there are three possible selectivities (to within roundoff error anyway)
depending on whether the index uses one or both clauses. So the
existing algorithm gets one thing right: for any two indexes that make
use of just the same set of WHERE-clauses, it will always take the one
with cheaper scan cost.

Thinking more about this leads me to the following proposal:

1. Explicitly group the indexes according to the subset of
WHERE-conditions (and partial index conditions, if any) they use.
Within each such group, discard all but the cheapest-scan-cost one.

2. Sort the remaining indexes according to scan cost.

3. For each index in order, consider it as a standalone scan, and also
consider adding it on to the AND-group led by each preceding index,
using the same logic as now: reject using any WHERE-condition twice
in a group, and then add on only if the total cost of the AND-group
scan is reduced.

This would be approximately O(N^2) in the number of relevant indexes,
whereas the current scheme is roughly linear (handwaving a bit here
because the number of WHERE-clauses is a factor too). But that seems
unlikely to be a problem for sane numbers of indexes, unlike the O(2^N)
behavior of an exhaustive search. It would get rid of (most of) the
order-sensitivity problem, since we would definitely consider the
AND-combination of every pair of combinable indexes. I can imagine
cases where it wouldn't notice useful three-or-more-way combinations
because the preceding two-way combination didn't win, but they seem
pretty remote possibilities.

Comments?

regards, tom lane

#47Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#46)
Re: [HACKERS] choose_bitmap_and again (was Re: [PERFORM] Strangely Variable Query Performance)

I wrote:

Thinking more about this leads me to the following proposal:

1. Explicitly group the indexes according to the subset of
WHERE-conditions (and partial index conditions, if any) they use.
Within each such group, discard all but the cheapest-scan-cost one.

2. Sort the remaining indexes according to scan cost.

3. For each index in order, consider it as a standalone scan, and also
consider adding it on to the AND-group led by each preceding index,
using the same logic as now: reject using any WHERE-condition twice
in a group, and then add on only if the total cost of the AND-group
scan is reduced.

Here is a patch along these lines, in fact two patches (HEAD and 8.2
versions). The 8.2 version picks up some additional partial-index
intelligence that I added to HEAD on Mar 21 but did not at that time
risk back-patching --- since this is a fairly large rewrite of the
routine, keeping the branches in sync seems best.

Steve, can you try this out on your queries and see if it makes better
or worse decisions? It seems to fix your initial complaint but I do
not have a large stock of test cases to try.

regards, tom lane

#48Steve
cheetah@tanabi.org
In reply to: Tom Lane (#47)
Re: [HACKERS] choose_bitmap_and again (was Re: [PERFORM] Strangely Variable Query Performance)

Steve, can you try this out on your queries and see if it makes better
or worse decisions? It seems to fix your initial complaint but I do
not have a large stock of test cases to try.

Wow, this is a remarkable difference. Queries that were taking
minutes to complete are coming up in seconds. Good work, I think this'll
solve my customer's needs for their demo on the 19th :)

Thank you so much!

Steve

#49Tom Lane
tgl@sss.pgh.pa.us
In reply to: Steve (#48)
Re: [HACKERS] choose_bitmap_and again (was Re: [PERFORM] Strangely Variable Query Performance)

Steve <cheetah@tanabi.org> writes:

Steve, can you try this out on your queries and see if it makes better
or worse decisions? It seems to fix your initial complaint but I do
not have a large stock of test cases to try.

Wow, this is a remarkable difference. Queries that were taking
minutes to complete are coming up in seconds. Good work, I think this'll
solve my customer's needs for their demo on the 19th :)

Can you find any cases where it makes a worse choice than before?
Another thing to pay attention to is whether the planning time gets
noticeably worse. If we can't find any cases where it loses badly
on those measures, I'll feel comfortable in applying it...

regards, tom lane

#50Steve
cheetah@tanabi.org
In reply to: Tom Lane (#49)
Re: [HACKERS] choose_bitmap_and again (was Re: [PERFORM] Strangely Variable Query Performance)

Can you find any cases where it makes a worse choice than before?
Another thing to pay attention to is whether the planning time gets
noticeably worse. If we can't find any cases where it loses badly
on those measures, I'll feel comfortable in applying it...

I'll see what I can find -- I'll let you know on Monday if I can
find any queries that perform worse. My tests so far have shown
equivalent or better performance, but I've only done sort of a
survey so far ... I've got plenty of special cases to test that should
put this through the paces.

Steve

#51Steve
cheetah@tanabi.org
In reply to: Tom Lane (#49)
Re: [HACKERS] choose_bitmap_and again (was Re: [PERFORM] Strangely Variable Query Performance)

Can you find any cases where it makes a worse choice than before?
Another thing to pay attention to is whether the planning time gets
noticeably worse. If we can't find any cases where it loses badly
on those measures, I'll feel comfortable in applying it...

Okay, here's the verdict; all the "extremely slow" queries (i.e.
queries that took more than 30 seconds and upwards of several minutes to
complete) are now running in the realm of reason. In fact, most queries
that took between 1 and 4 minutes are now down to taking about 9 seconds
which is obviously a tremendous improvement.

A few of the queries that were taking 9 seconds or less are
"slightly slower" -- meaning a second or two slower. However most of them
are running at the same speed they were before, or better.

So I'd say as far as I can tell with my application and my
dataset, this change is solid and an obvious improvement.

Talk to you later,

Steve

#52Alvaro Herrera
alvherre@commandprompt.com
In reply to: Steve (#51)
Re: [HACKERS] choose_bitmap_and again (was Re: [PERFORM] Strangely Variable Query Performance)

Steve wrote:

Can you find any cases where it makes a worse choice than before?
Another thing to pay attention to is whether the planning time gets
noticeably worse. If we can't find any cases where it loses badly
on those measures, I'll feel comfortable in applying it...

Okay, here's the verdict; all the "extremely slow" queries (i.e.
queries that took more than 30 seconds and upwards of several minutes to
complete) are now running in the realm of reason. In fact, most queries
that took between 1 and 4 minutes are now down to taking about 9 seconds
which is obviously a tremendous improvement.

A few of the queries that were taking 9 seconds or less are
"slightly slower" -- meaning a second or two slower. However most of them
are running at the same speed they were before, or better.

So I'd say as far as I can tell with my application and my
dataset, this change is solid and an obvious improvement.

Maybe it would be interesting to see in detail those cases that got a
bit slower, to further tweak the heuristic if necessary. Is the extra
time, time spent in planning or in execution?

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#53Robins Tharakan
robins@pobox.com
In reply to: Tom Lane (#39)
Fwd: Strangely Variable Query Performance

Hi Tom / Steve,

I'm one of the silent readers of the performance issues that come up on this list
(and are discussed in detail) ... just like this one.

If and when you do come up with a solution, please do post some details
about it here... (I say that because it seems that, for obvious reasons, things
must have gone off the air after Tom's last email, and one can understand that).
But an analysis, or at least a pointer, may be of help to someone (like me)
reading this list.

Thanks
Robins

---------- Forwarded message ----------
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Apr 13, 2007 10:08 AM
Subject: Re: [PERFORM] Strangely Variable Query Performance
To: Steve <cheetah@tanabi.org>
Cc: Scott Marlowe <smarlowe@g2switchworks.com>, PostgreSQL Performance <
pgsql-performance@postgresql.org>

Steve <cheetah@tanabi.org> writes:

On Thu, 12 Apr 2007, Tom Lane wrote:

I'm still not having any luck reproducing the failure here. Grasping at
straws again, I wonder if it's got something to do with the order in
which the planner examines the indexes --- which is OID order. Could
you send me the results of

Here you go:

Nope, still doesn't fail for me. [ baffled and annoyed... ] There must
be something about your situation that's relevant but we aren't
recognizing it. Are you in a position to let me ssh into your machine
and try to debug it? Or other options like sending me a dump of your
database? I'm about out of steam for tonight, but let's talk about it
off-list tomorrow.

regards, tom lane


--
Robins

#54Robins
tharakan@gmail.com
In reply to: Robins Tharakan (#53)
Fwd: Strangely Variable Query Performance

Hi Tom / Steve,

I'm one of the silent readers of the performance issues that come up on this list
(and are discussed in detail) ... just like this one.

If and when you do come up with a solution, please do post some details
about it here... (I say that because it seems that, for obvious reasons, things
must have gone off the air after Tom's last email, and one can understand that).
But an analysis, or at least a pointer, may be of help to someone (like me)
reading this list.

Thanks
Robins

---------- Forwarded message ----------
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Apr 13, 2007 10:08 AM
Subject: Re: [PERFORM] Strangely Variable Query Performance
To: Steve <cheetah@tanabi.org>
Cc: Scott Marlowe <smarlowe@g2switchworks.com >, PostgreSQL Performance <
pgsql-performance@postgresql.org>

Steve <cheetah@tanabi.org> writes:

On Thu, 12 Apr 2007, Tom Lane wrote:

I'm still not having any luck reproducing the failure here. Grasping at
straws again, I wonder if it's got something to do with the order in
which the planner examines the indexes --- which is OID order. Could
you send me the results of

Here you go:

Nope, still doesn't fail for me. [ baffled and annoyed... ] There must
be something about your situation that's relevant but we aren't
recognizing it. Are you in a position to let me ssh into your machine
and try to debug it? Or other options like sending me a dump of your
database? I'm about out of steam for tonight, but let's talk about it
off-list tomorrow.

regards, tom lane


#55Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robins (#54)
Re: Fwd: Strangely Variable Query Performance

Robins <tharakan@gmail.com> writes:

If and when you do come up with a solution, please do post some details
about them here... (i say that coz it seems that for obvious reasons, things
must have gone off air after tom's last email, and one can understand that).
But an analysis, or atleast a pointer may be of help to someone (like me)
reading this list.

Oh, sorry, the subsequent discussion moved over to pgsql-hackers:
http://archives.postgresql.org/pgsql-hackers/2007-04/msg00621.php
and -patches:
http://archives.postgresql.org/pgsql-patches/2007-04/msg00374.php

Those are good places to look if a discussion on -bugs or other lists
seems to tail off...

regards, tom lane

PS: the reason I couldn't reproduce the behavior was just that the dummy
data I was using didn't have the right statistics.

#56Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#52)
Re: [HACKERS] choose_bitmap_and again (was Re: [PERFORM] Strangely Variable Query Performance)

Alvaro Herrera <alvherre@commandprompt.com> writes:

Steve wrote:

So I'd say as far as I can tell with my application and my
dataset, this change is solid and an obvious improvement.

Maybe it would be interesting to see in detail those cases that got a
bit slower, to further tweak the heuristic if necessary. Is the extra
time, time spent in planning or in execution?

Since there doesn't seem to be vast interest out there in testing this
further, I'm going to go ahead and apply the patch to get it out of my
working directory. We can always tweak it more later if new info
surfaces.

regards, tom lane

#57Steve
cheetah@tanabi.org
In reply to: Tom Lane (#56)
Re: [HACKERS] choose_bitmap_and again (was Re: [PERFORM] Strangely Variable Query Performance)

Maybe it would be interesting to see in detail those cases that got a
bit slower, to further tweak the heuristic if necessary. Is the extra
time, time spent in planning or in execution?

Since there doesn't seem to be vast interest out there in testing this
further, I'm going to go ahead and apply the patch to get it out of my
working directory. We can always tweak it more later if new info
surfaces.

Doing my routine patching seems to have exploded my mail server,
sorry for not replying sooner!

I don't actually have planning vs. execution time statistics from
the older version for the queries in question -- they were not 'problem
queries' and therefore were never really analyzed. My customer's already
dragging me off to another issue, so I've got to shift gears.

Appreciate all your work -- thanks again!!! :)

Steve