Index tuple killing code committed
Per previous discussion, I have committed changes that cause the btree
and hash index methods to mark index tuples "killed" the first time they
are fetched after becoming globally dead. Subsequently, the killed
entries are not returned from indexscans, saving useless heap fetches.
(I haven't changed rtree and gist yet; they will need some internal
restructuring to do this efficiently. Perhaps Oleg or Teodor would like
to take that on.)
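To illustrate the mechanism, here is a minimal self-contained sketch;
the types and names below are invented for this example and are not
the committed btree/hash code:

#include <stdbool.h>
#include <stddef.h>

typedef struct IndexEntry
{
    bool    killed;     /* set once the heap tuple is globally dead */
    void   *heap_ptr;   /* stand-in for the entry's heap item pointer */
} IndexEntry;

/* Stub standing in for the heap visibility check; the real code
 * determines "dead to all transactions" from tuple visibility info. */
static bool
heap_tuple_globally_dead(void *heap_ptr)
{
    (void) heap_ptr;
    return false;
}

/* Return the next index entry worth a heap fetch, skipping killed
 * ones; the first fetch that finds a globally dead tuple marks the
 * entry so later scans never visit the heap for it again. */
static IndexEntry *
index_getnext_sketch(IndexEntry *entries, int nentries, int *pos)
{
    while (*pos < nentries)
    {
        IndexEntry *e = &entries[(*pos)++];

        if (e->killed)
            continue;               /* known dead: no heap fetch needed */

        if (heap_tuple_globally_dead(e->heap_ptr))
        {
            e->killed = true;       /* mark on first fetch after death */
            continue;
        }
        return e;                   /* possibly live: caller fetches it */
    }
    return NULL;
}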
This seems to make a useful improvement in pgbench results. Yesterday's
CVS tip gave me these results:
(Running postmaster with "-i -F -B 1024", other parameters at defaults,
and pgbench initialized with "pgbench -i -s 10 bench")
$ time pgbench -c 5 -t 1000 -n bench
tps = 26.428787 (including connections establishing)
tps = 26.443410 (excluding connections establishing)
real 3:09.74
$ time pgbench -c 5 -t 1000 -n bench
tps = 18.838304 (including connections establishing)
tps = 18.846281 (excluding connections establishing)
real 4:26.41
$ time pgbench -c 5 -t 1000 -n bench
tps = 13.541641 (including connections establishing)
tps = 13.545646 (excluding connections establishing)
real 6:10.19
Note the "-n" switches here to prevent vacuums between runs; the point
is to observe the degradation as more and more dead tuples accumulate.
With the just-committed changes I get (again starting from a freshly
initialized database):
$ time pgbench -c 5 -t 1000 -n bench
tps = 28.393271 (including connections establishing)
tps = 28.410117 (excluding connections establishing)
real 2:56.53
$ time pgbench -c 5 -t 1000 -n bench
tps = 23.498645 (including connections establishing)
tps = 23.510134 (excluding connections establishing)
real 3:33.89
$ time pgbench -c 5 -t 1000 -n bench
tps = 18.773239 (including connections establishing)
tps = 18.780936 (excluding connections establishing)
real 4:26.84
The remaining degradation is actually in seqscan performance, not
indexscan --- unless one uses a much larger -s setting, the planner will
think it ought to use seqscans for updating the "branches" and "tellers"
tables, since those nominally have just a few rows; and there's no way
to avoid scanning lots of dead tuples in a seqscan. Forcing indexscans
helps some on yesterday's CVS tip:
$ PGOPTIONS="-fs" time pgbench -c 5 -t 1000 -n bench
tps = 28.840678 (including connections establishing)
tps = 28.857442 (excluding connections establishing)
real 2:53.9
$ PGOPTIONS="-fs" time pgbench -c 5 -t 1000 -n bench
tps = 25.670674 (including connections establishing)
tps = 25.684493 (excluding connections establishing)
real 3:15.7
$ PGOPTIONS="-fs" time pgbench -c 5 -t 1000 -n bench
tps = 22.593429 (including connections establishing)
tps = 22.603928 (excluding connections establishing)
real 3:42.7
and with the changes I get:
$ PGOPTIONS="-fs" time pgbench -c 5 -t 1000 -n bench
tps = 29.445004 (including connections establishing)
tps = 29.463948 (excluding connections establishing)
real 2:50.3
$ PGOPTIONS="-fs" time pgbench -c 5 -t 1000 -n bench
tps = 30.277968 (including connections establishing)
tps = 30.301363 (excluding connections establishing)
real 2:45.6
$ PGOPTIONS="-fs" time pgbench -c 5 -t 1000 -n bench
tps = 30.209377 (including connections establishing)
tps = 30.230646 (excluding connections establishing)
real 2:46.0
This is the first time I have ever seen repeated pgbench runs without
substantial performance degradation. Not a bad result for a Friday
afternoon...
regards, tom lane
Tom Lane wrote:
> This is the first time I have ever seen repeated pgbench runs without
> substantial performance degradation. Not a bad result for a Friday
> afternoon...
Congratulations :-) This sounds great!!!
Andreas
Tom Lane wrote:
> The remaining degradation is actually in seqscan performance, not
> indexscan --- unless one uses a much larger -s setting, the planner will
> think it ought to use seqscans for updating the "branches" and "tellers"
> tables, since those nominally have just a few rows; and there's no way
> to avoid scanning lots of dead tuples in a seqscan. Forcing indexscans
> helps some on yesterday's CVS tip:
This may qualify as a "way out there" idea, or more trouble than it's
worth, but what about a table option which provides a bitmap index of
tuple status -- i.e., one bit per tuple marking whether it's known dead?
If available, a seqscan between vacuums could perhaps gain some of the
same efficiency.
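Roughly, something like this per-page side structure; this is only a
sketch of the idea, with invented names:

#include <stdbool.h>
#include <stdint.h>

#define SLOTS_PER_PAGE 128

/* Hypothetical side bitmap: one bit per tuple slot on a heap page,
 * set once the tuple is known dead, so a seqscan could skip the
 * visibility test for those slots between vacuums. */
typedef struct DeadBitmap
{
    uint8_t bits[SLOTS_PER_PAGE / 8];   /* 1 = tuple known dead */
} DeadBitmap;

static inline bool
slot_known_dead(const DeadBitmap *bm, int slot)
{
    return (bm->bits[slot >> 3] >> (slot & 7)) & 1;
}

static inline void
mark_slot_dead(DeadBitmap *bm, int slot)
{
    bm->bits[slot >> 3] |= (uint8_t) (1 << (slot & 7));
}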
> This is the first time I have ever seen repeated pgbench runs without
> substantial performance degradation. Not a bad result for a Friday
> afternoon...
Nice work!
Joe
Joe Conway <mail@joeconway.com> writes:
> This may qualify as a "way out there" idea, or more trouble than it's
> worth, but what about a table option which provides a bitmap index of
> tuple status -- i.e., one bit per tuple marking whether it's known dead?
> If available, a seqscan between vacuums could perhaps gain some of the
> same efficiency.
Hmm. I'm inclined to think that a separate bitmap index wouldn't be
worth the trouble. Under most scenarios it'd just require extra I/O
and not buy much.
However ... we could potentially take over the LP_DELETED flag bit of
heap tuples for the same use as for index tuples: set it when the tuple
is known dead for all transactions. This would save calling
HeapTupleSatisfiesSnapshot in the inner loop of heapgettup, while not
adding much expense for the normal case where the tuple's not dead.
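As a rough sketch of what that inner loop could look like (simplified
stand-ins, not the real heapgettup code; the stub below stands in for
HeapTupleSatisfiesSnapshot):

#include <stdbool.h>
#include <stddef.h>

#define LP_USED     0x01
#define LP_DELETED  0x02    /* proposed: tuple known dead to everyone */

typedef struct LinePointer
{
    unsigned flags;
    void    *tuple;         /* stand-in for the heap tuple on the page */
} LinePointer;

/* Stub standing in for HeapTupleSatisfiesSnapshot(), the expensive
 * per-tuple visibility check. */
static bool
tuple_satisfies_snapshot(void *tuple)
{
    (void) tuple;
    return true;
}

/* Scan the page's line pointers; LP_DELETED lets us skip known-dead
 * tuples without paying for the snapshot test at all. */
static void *
next_visible_tuple(LinePointer *lines, int nlines, int *pos)
{
    while (*pos < nlines)
    {
        LinePointer *lp = &lines[(*pos)++];

        if (!(lp->flags & LP_USED))
            continue;
        if (lp->flags & LP_DELETED)
            continue;       /* cheap skip: no snapshot test needed */
        if (tuple_satisfies_snapshot(lp->tuple))
            return lp->tuple;
    }
    return NULL;
}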
regards, tom lane