Reuse the dead item on unique index.

Started by Atsushi Ogawaover 20 years ago3 messageshackers
Jump to latest
#1Atsushi Ogawa
atsushi.ogawa@gmail.com

When _bt_check_unique finds a dead item that has same data as new
item, LP_DEAD is set to the item. Can we reuse this dead item instead
of inserting new item? If it is possible, the growth of btree index
can be reduced.

I think it is effective in the table like branches table of pgbench to
which the same item is updated many times.

An attached patch is test implementation of this idea. This patch is
not complete, but it might be useful to show this idea.

regards,

--- Atsushi Ogawa

Attachments:

uniq_index.patchtext/x-patch; charset=us-ascii; name=uniq_index.patchDownload+66-10
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Atsushi Ogawa (#1)
Re: Reuse the dead item on unique index.

Atsushi Ogawa <atsushi.ogawa@gmail.com> writes:

When _bt_check_unique finds a dead item that has same data as new
item, LP_DEAD is set to the item. Can we reuse this dead item instead
of inserting new item?

This strikes me as a pretty bad idea for the same reason pointed out
recently in other threads: the notion of equality embodied in a btree
opclass' equals function may have little or nothing to do with true
identity. So your assumption that it's the "same" data is faulty.

Also, I'm dubious about the assumption that "can be marked LP_DELETED"
is the same as "can be physically removed right now". The side-effects
on indexscans happening concurrently with yours could be bad. At the
very least you'd need to obtain super-exclusive lock (cf btbulkdelete)
before doing the replacement.

regards, tom lane

#3Atsushi Ogawa
atsushi.ogawa@gmail.com
In reply to: Tom Lane (#2)
Re: Reuse the dead item on unique index.

Tom Lane <tgl@sss.pgh.pa.us>:

Atsushi Ogawa <atsushi.ogawa@gmail.com> writes:

When _bt_check_unique finds a dead item that has same data as new
item, LP_DEAD is set to the item. Can we reuse this dead item instead
of inserting new item?

This strikes me as a pretty bad idea for the same reason pointed out
recently in other threads: the notion of equality embodied in a btree
opclass' equals function may have little or nothing to do with true
identity. So your assumption that it's the "same" data is faulty.

Thanks, I understand the problem.
When the size of new item and dead item is the equal, the new item can
be overwrited at the position of the dead item.

Also, I'm dubious about the assumption that "can be marked LP_DELETED"
is the same as "can be physically removed right now". The side-effects
on indexscans happening concurrently with yours could be bad. At the
very least you'd need to obtain super-exclusive lock (cf btbulkdelete)
before doing the replacement.

I agree. I will add code that checks the refcount of buffer. If refcount
is 1, current process has super-exclusive lock, and we can overwrite the
dead item. If refcount > 1, I use _bt_insertonpg.

regards,

--- Atsushi Ogawa