Multiple Indexing, performance impact

Started by Daniel Åkerudover 24 years ago19 messages
#1Daniel Åkerud
zilch@home.se

I just made i simple little test application inserting 50'000 'pwgen 8' data into a table with only a primary key id and a text column.

In every run, it is all deleted and the tables are vacuumed.

Having one separate index on name it took 36 seconds
Having an additional index, also on name, it took 69 seconds.
Furthermore:
3 indexes: 97 seconds
4 indexes: 131 seconds
5 indexes: 163 seconds
6 indexes: 210 seconds
7 indexes: 319 seconds
8 indexes: 572 seconds
9 indexes: 831 seconds
10 indexes: 1219 seconds

Anyone know what causes the signifacant performance decrease after 7 indexes?

Daniel Åkerud

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Daniel Åkerud (#1)
Re: Multiple Indexing, performance impact

=?iso-8859-1?Q?Daniel_=C5kerud?= <zilch@home.se> writes:

Anyone know what causes the signifacant performance decrease after 7 indexe=
s?

I'd bet that somewhere around there, you are starting to see thrashing
of the buffer pool due to needing to touch too many different pages to
insert each tuple. What is your -B setting? If you increase it,
does the performance improve?

regards, tom lane

#3Daniel Åkerud
zilch@home.se
In reply to: Daniel Åkerud (#1)
Re: Multiple Indexing, performance impact

I did a ps ax | postmaster but found no -B, and concluded that it uses the
value specified in /etc/postgrelsql/postgresql.conf on shared_buffers (I
saw -B was shared buffer doing a man postmaster). I'll change this to 256
and rerun the test!

Will post the results here later. Please tell if this was a too puny
increase!

Daniel �kerud

=?iso-8859-1?Q?Daniel_=C5kerud?= <zilch@home.se> writes:

Anyone know what causes the signifacant performance decrease after 7

indexe=

Show quoted text

s?

I'd bet that somewhere around there, you are starting to see thrashing
of the buffer pool due to needing to touch too many different pages to
insert each tuple. What is your -B setting? If you increase it,
does the performance improve?

regards, tom lane

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Daniel Åkerud (#3)
Re: Multiple Indexing, performance impact

=?iso-8859-1?Q?Daniel_=C5kerud?= <zilch@home.se> writes:

I did a ps ax | postmaster but found no -B, and concluded that it uses the
value specified in /etc/postgrelsql/postgresql.conf on shared_buffers (I
saw -B was shared buffer doing a man postmaster). I'll change this to 256
and rerun the test!

Will post the results here later. Please tell if this was a too puny
increase!

That should be enough to see if there's a performance change, but for
future reference, yes you should go higher. On modern machines with
many megs of RAM, you should probably be using -B on the order of a few
thousand, at least for production installations. The reason the default
is so low is that we hope the system will still be able to fire up on
machines where the kernel enforces a SHMMAX limit of only a meg or so.
This hope is possibly in vain anymore anyway, since the system's
non-buffer shared-memory usage keeps creeping up; I think 7.1 is well
past 1MB shmem even with 64 buffers...

regards, tom lane

#5Daniel Åkerud
zilch@home.se
In reply to: Daniel Åkerud (#1)
Re: Multiple Indexing, performance impact

Holy ultra-violet-active macaronies :)

First I changed it to 256, then I changed it to 1024.

-B 128 is A
-B 256 is B
-B 1024 is C

New multiple-index performance data):

1. A: 36 B: 32 C: 35
2. A: 69 B: 53 C: 38
3. A: 97 B: 79 C: 40
4. A: 131 B: 98 C: 48
5. A: 163 B: 124 C: 52
6. A: 210 B: 146 C: 66
7. A: 319 B: 233 C: 149
8. A: 572 B: 438 C: 268
9. A: 831 B: 655 C:
10. A: 1219 B: 896 C:

The last test hasn't finished yet, but THANKS! I know the reson now, at
least... i'll try
2048 also.

-B equals --brutal-performance ? ;)

Thanks,
Daniel �kerud

Show quoted text

That should be enough to see if there's a performance change, but for
future reference, yes you should go higher. On modern machines with
many megs of RAM, you should probably be using -B on the order of a few
thousand, at least for production installations. The reason the default
is so low is that we hope the system will still be able to fire up on
machines where the kernel enforces a SHMMAX limit of only a meg or so.
This hope is possibly in vain anymore anyway, since the system's
non-buffer shared-memory usage keeps creeping up; I think 7.1 is well
past 1MB shmem even with 64 buffers...

regards, tom lane

#6Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Daniel Åkerud (#5)
Re: Multiple Indexing, performance impact

Holy ultra-violet-active macaronies :)

First I changed it to 256, then I changed it to 1024.

-B 128 is A
-B 256 is B
-B 1024 is C

New multiple-index performance data):

1. A: 36 B: 32 C: 35
2. A: 69 B: 53 C: 38
3. A: 97 B: 79 C: 40
4. A: 131 B: 98 C: 48
5. A: 163 B: 124 C: 52
6. A: 210 B: 146 C: 66
7. A: 319 B: 233 C: 149
8. A: 572 B: 438 C: 268
9. A: 831 B: 655 C:
10. A: 1219 B: 896 C:

The last test hasn't finished yet, but THANKS! I know the reson now, at
least... i'll try
2048 also.

Strange that even at 1024 performance still drops off at 7. Seems it
may be more than buffer thrashing.

-B equals --brutal-performance ? ;)

See my performance article on techdocs.postgresql.org.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#7Daniel Åkerud
zilch@home.se
In reply to: Bruce Momjian (#6)
Re: Multiple Indexing, performance impact

Tried with 2048 also, it complete took away the strange steep after 7:

D is now 2048

1. A: 36 B: 32 C: 35 D: 31
2. A: 69 B: 53 C: 38 D: 38
3. A: 97 B: 79 C: 40 D: 40
4. A: 131 B: 98 C: 48 D: 43
5. A: 163 B: 124 C: 52 D: 49
6. A: 210 B: 146 C: 66 D: 50
7. A: 319 B: 233 C: 149 D: 58
8. A: 572 B: 438 C: 268 D: 65
9. A: 831 B: 655 C: 437 D: 76
10. A: 1219 B: 896 C: 583 D: 79

What is the program called that flushes the buffers every 30 seconds on a
linux 2.2.x system?

Daniel �kerud

Show quoted text

Holy ultra-violet-active macaronies :)

First I changed it to 256, then I changed it to 1024.

-B 128 is A
-B 256 is B
-B 1024 is C

New multiple-index performance data):

1. A: 36 B: 32 C: 35
2. A: 69 B: 53 C: 38
3. A: 97 B: 79 C: 40
4. A: 131 B: 98 C: 48
5. A: 163 B: 124 C: 52
6. A: 210 B: 146 C: 66
7. A: 319 B: 233 C: 149
8. A: 572 B: 438 C: 268
9. A: 831 B: 655 C:
10. A: 1219 B: 896 C:

The last test hasn't finished yet, but THANKS! I know the reson now, at
least... i'll try
2048 also.

Strange that even at 1024 performance still drops off at 7. Seems it
may be more than buffer thrashing.

-B equals --brutal-performance ? ;)

See my performance article on techdocs.postgresql.org.

--
Bruce Momjian                        |  http://candle.pha.pa.us
pgman@candle.pha.pa.us               |  (610) 853-3000
+  If your life is a hard drive,     |  830 Blythe Avenue
+  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#6)
Re: Multiple Indexing, performance impact

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Strange that even at 1024 performance still drops off at 7. Seems it
may be more than buffer thrashing.

Yeah, if anything the knee in the curve seems to be worse at 1024
buffers. Curious. Deserves more investigation, perhaps.

This does remind me that I'd been thinking of suggesting that we
raise the default -B to something more reasonable, maybe 1000 or so
(yielding an 8-meg-plus shared memory area). This wouldn't prevent
people from setting it small if they have a small SHMMAX, but it's
probably time to stop letting that case drive our default setting.
Since 64 is already too much to let 7.1 fit in SHMMAX = 1MB, I think
the original rationale for using 64 is looking pretty broken anyway.
Comments?

regards, tom lane

#9Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#8)
Re: Multiple Indexing, performance impact

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Strange that even at 1024 performance still drops off at 7. Seems it
may be more than buffer thrashing.

Yeah, if anything the knee in the curve seems to be worse at 1024
buffers. Curious. Deserves more investigation, perhaps.

This does remind me that I'd been thinking of suggesting that we
raise the default -B to something more reasonable, maybe 1000 or so
(yielding an 8-meg-plus shared memory area). This wouldn't prevent
people from setting it small if they have a small SHMMAX, but it's
probably time to stop letting that case drive our default setting.
Since 64 is already too much to let 7.1 fit in SHMMAX = 1MB, I think
the original rationale for using 64 is looking pretty broken anyway.
Comments?

BSD/OS has a 4MB max but we document how to increase it by recompiling
the kernel. Maybe if we fail the startup we can tell them how to
decrease the buffers in postgresql.conf file. Seems quite clear.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#9)
Re: Multiple Indexing, performance impact

Bruce Momjian <pgman@candle.pha.pa.us> writes:

This does remind me that I'd been thinking of suggesting that we
raise the default -B to something more reasonable, maybe 1000 or so
(yielding an 8-meg-plus shared memory area).

BSD/OS has a 4MB max but we document how to increase it by recompiling
the kernel.

Hmm. Anyone like the idea of a platform-specific default established
by configure? We could set it in the template file on platforms where
the default SHMMAX is too small to allow 1000 buffers.

regards, tom lane

#11Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#10)
Re: Multiple Indexing, performance impact

Bruce Momjian <pgman@candle.pha.pa.us> writes:

This does remind me that I'd been thinking of suggesting that we
raise the default -B to something more reasonable, maybe 1000 or so
(yielding an 8-meg-plus shared memory area).

BSD/OS has a 4MB max but we document how to increase it by recompiling
the kernel.

Hmm. Anyone like the idea of a platform-specific default established
by configure? We could set it in the template file on platforms where
the default SHMMAX is too small to allow 1000 buffers.

Template file seems like a good idea for platforms that can't handle the
default. I don't think configure should be doing such tests because the
target could be a different kernel.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#11)
Re: Multiple Indexing, performance impact

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Hmm. Anyone like the idea of a platform-specific default established
by configure? We could set it in the template file on platforms where
the default SHMMAX is too small to allow 1000 buffers.

Template file seems like a good idea for platforms that can't handle the
default. I don't think configure should be doing such tests because the
target could be a different kernel.

Right, I wasn't thinking of an actual run-time test in configure, just
that we could use it to let the OS-specific template file override the
normal default.

We could offer a --with switch to manually choose the default, too.

regards, tom lane

#13Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#12)
Re: Multiple Indexing, performance impact

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Hmm. Anyone like the idea of a platform-specific default established
by configure? We could set it in the template file on platforms where
the default SHMMAX is too small to allow 1000 buffers.

Template file seems like a good idea for platforms that can't handle the
default. I don't think configure should be doing such tests because the
target could be a different kernel.

Right, I wasn't thinking of an actual run-time test in configure, just
that we could use it to let the OS-specific template file override the
normal default.

We could offer a --with switch to manually choose the default, too.

Good idea, yes. Not sure if we need a --with switch because they can
just edit the postgresql.conf or postgresql.conf.sample file.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#13)
Re: Multiple Indexing, performance impact

Bruce Momjian <pgman@candle.pha.pa.us> writes:

We could offer a --with switch to manually choose the default, too.

Good idea, yes. Not sure if we need a --with switch because they can
just edit the postgresql.conf or postgresql.conf.sample file.

Well, we have a --with switch for DEF_MAXBACKENDS, so one for the
default number of buffers doesn't seem too unreasonable. I wouldn't
bother with it if configure didn't have to touch the value anyway...
but it's just another line or two in configure.in...

regards, tom lane

#15Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#8)
Re: Multiple Indexing, performance impact

Tom Lane writes:

This does remind me that I'd been thinking of suggesting that we
raise the default -B to something more reasonable, maybe 1000 or so
(yielding an 8-meg-plus shared memory area).

On Modern(tm) systems, 8 MB is just as arbitrary and undersized as 1 MB.
So while for real use, manual tuning will still be necessary, on test
systems we'd use significant amounts of memory for nothing, or not start
up at all.

Maybe we could look around what the default limit is these days, but
raising it to arbitrary values will just paint over the fact that user
intervention is still required and that there is almost no documentation
for this.

--
Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter

#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#15)
Re: Multiple Indexing, performance impact

Peter Eisentraut <peter_e@gmx.net> writes:

Tom Lane writes:

This does remind me that I'd been thinking of suggesting that we
raise the default -B to something more reasonable, maybe 1000 or so
(yielding an 8-meg-plus shared memory area).

On Modern(tm) systems, 8 MB is just as arbitrary and undersized as 1 MB.

A fair complaint, but at least it's within an order of magnitude of
being reasonable; you don't *have* to tune it before you get something
approaching reasonable performance. 64 is two or more orders of
magnitude off.

So while for real use, manual tuning will still be necessary, on test
systems we'd use significant amounts of memory for nothing, or not start
up at all.

The thought of test postmasters was what kept me from proposing
something even higher than 1000. 8Mb is small enough that you can
still expect to run several postmasters without problems, on most
machines where you might contemplate the idea of multiple postmasters
at all.

Would you suggest that we have no default at all, and make users pick
something?

Maybe we could look around what the default limit is these days, but
raising it to arbitrary values will just paint over the fact that user
intervention is still required and that there is almost no documentation
for this.

We do need to have a section in the administrator's guide about tuning.

regards, tom lane

#17Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#14)
Re: [HACKERS] Re: Multiple Indexing, performance impact

Bruce Momjian <pgman@candle.pha.pa.us> writes:

We could offer a --with switch to manually choose the default, too.

Good idea, yes. Not sure if we need a --with switch because they can
just edit the postgresql.conf or postgresql.conf.sample file.

Well, we have a --with switch for DEF_MAXBACKENDS, so one for the
default number of buffers doesn't seem too unreasonable. I wouldn't
bother with it if configure didn't have to touch the value anyway...
but it's just another line or two in configure.in...

Yes, we could add that too, but now that we have postgresql.conf should
we even be mentioning stuff like that in configure. In the old days we
had a compiled-in limit but that is not true anymore, right?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#18Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#16)
Re: [GENERAL] Multiple Indexing, performance impact

Tom Lane writes:

Would you suggest that we have no default at all, and make users pick
something?

No. I'm concerned that PostgreSQL should work out of the box for
everyone. And I would prefer that PostgreSQL works the same on every
platform out of the box. Obviously we've already lost this on systems
where the default shmmax is 512kB (SCO OpenServer, Unixware) or 1 MB
(Solaris), and reducing the parameters is clearly not an option. But if a
plurality of systems have the default set at 4 MB or 8 MB then we should
stop there so we don't upset a large fraction of users.

Btw., do we have any data on how appropriate wal_buffers = 8 is?

--
Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter

#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#18)
Re: [GENERAL] Multiple Indexing, performance impact

Peter Eisentraut <peter_e@gmx.net> writes:

No. I'm concerned that PostgreSQL should work out of the box for
everyone.

Agreed.

And I would prefer that PostgreSQL works the same on every
platform out of the box.

Well, I'm not sure that we need to take that as far as saying that
default NBuffers can't vary across platforms. It's not like we're
adding or subtracting functionality. All I want is to have the default
setup tuned a little better than it is now.

Obviously we've already lost this on systems
where the default shmmax is 512kB (SCO OpenServer, Unixware) or 1 MB
(Solaris), and reducing the parameters is clearly not an option. But if a
plurality of systems have the default set at 4 MB or 8 MB then we should
stop there so we don't upset a large fraction of users.

Making sure that default NBuffers stays under the platform's default
SHMMAX would accomplish that goal at least as well, probably better
than trying to have a one-size-fits-all default; especially if we've
already failed to do the latter.

Btw., do we have any data on how appropriate wal_buffers = 8 is?

Not that I've seen. It looks like a rather ad-hoc choice to me...

regards, tom lane