Adding index flag showing tuple status

Started by Bruce Momjian over 24 years ago, 6 messages
#1Bruce Momjian
pgman@candle.pha.pa.us

I am looking at adding an index tuple flag to indicate when a heap tuple
is expired so the index code can skip looking up the heap tuple.

The problem is that I can't figure out how to be sure that the heap tuple
doesn't need to be looked at by _any_ backend. Right now, we update the
transaction commit flags in the heap tuple to prevent a pg_log lookup,
but that is not enough because some transactions may still see that heap
tuple as visible.

Vacuum has a complex test that looks at currently running transactions
and stuff. Do I have to duplicate that test in the new code? Seems I
do.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#2Zeugswetter Andreas SB
ZeugswetterA@wien.spardat.at
In reply to: Bruce Momjian (#1)
AW: Adding index flag showing tuple status

I am looking at adding an index tuple flag to indicate when a
heap tuple is expired so the index code can skip looking up the heap tuple.

The problem is that I can't figure out how to be sure that the heap tuple
doesn't need to be looked at by _any_ backend. Right now, we update the
transaction commit flags in the heap tuple to prevent a pg_log lookup,
but that is not enough because some transactions may still see that heap
tuple as visible.

If you are only marking those, that need not be visible anymore, can you not
simply delete that key (slot) from the index ? I know vacuum then shows a count
mismatch, but that could probably be accounted for.

Andreas

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Zeugswetter Andreas SB (#2)
Re: AW: Adding index flag showing tuple status

I am looking at adding an index tuple flag to indicate when a
heap tuple is expired so the index code can skip looking up the heap tuple.

The problem is that I can't figure out how to be sure that the heap tuple
doesn't need to be looked at by _any_ backend. Right now, we update the
transaction commit flags in the heap tuple to prevent a pg_log lookup,
but that is not enough because some transactions may still see that heap
tuple as visible.

If you are only marking those, that need not be visible anymore, can you not
simply delete that key (slot) from the index ? I know vacuum then shows a count
mismatch, but that could probably be accounted for.

I am not sure we need this, if we implement a lightweight VACUUM per my
proposal in a nearby thread. The conditions that you could mark or
delete an index tuple under are exactly the same that VACUUM would be
looking for (viz, tuple dead as far as all remaining transactions are
concerned).

regards, tom lane

#4Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Zeugswetter Andreas SB (#2)
Re: AW: Adding index flag showing tuple status

I am looking at adding an index tuple flag to indicate when a
heap tuple is expired so the index code can skip looking up the heap tuple.

The problem is that I can't figure out how to be sure that the heap tuple
doesn't need to be looked at by _any_ backend. Right now, we update the
transaction commit flags in the heap tuple to prevent a pg_log lookup,
but that is not enough because some transactions may still see that heap
tuple as visible.

If you are only marking those, that need not be visible anymore, can you not
simply delete that key (slot) from the index ? I know vacuum then shows a count
mismatch, but that could probably be accounted for.

I wasn't going to delete it, just add a flag so index scans know they
don't need to look at the heap table.

#5Zeugswetter Andreas SB
ZeugswetterA@wien.spardat.at
In reply to: Bruce Momjian (#4)
AW: AW: Adding index flag showing tuple status

I am looking at adding an index tuple flag to indicate when a
heap tuple is expired so the index code can skip looking up the heap tuple.

The problem is that I can't figure out how to be sure that the heap tuple
doesn't need to be looked at by _any_ backend. Right now, we update the
transaction commit flags in the heap tuple to prevent a pg_log lookup,
but that is not enough because some transactions may still see that heap
tuple as visible.

If you are only marking those, that need not be visible anymore, can you not
simply delete that key (slot) from the index ? I know vacuum then shows a count
mismatch, but that could probably be accounted for.

I wasn't going to delete it, just add a flag so index scans know they
don't need to look at the heap table.

If it is only a flag, you would need to go to the same trouble that vacuum already
goes to (you cannot mark it if someone else is still interested in this snapshot).
Thus I do not see any benefit in adding a flag, versus deleting not needed keys.
To avoid the snapshot trouble you would need a xid (xmax or something), and that
is 4 bytes, and not a simple flag.
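The space argument can be made concrete with a rough sketch. The structs below are a simplified model of an index entry header (a 6-byte heap pointer plus a 16-bit info word), not the actual on-disk layout; all field names and the `ENTRY_HEAP_DEAD` bit value are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model of an index entry header. Names are illustrative only. */
typedef struct {
    uint16_t heap_block_hi;
    uint16_t heap_block_lo;
    uint16_t heap_offset;
    uint16_t info;              /* size bits plus a few flag bits */
} IndexEntryWithFlag;

#define ENTRY_HEAP_DEAD 0x2000  /* hypothetical spare bit in info */

/* The same header with a transaction id stored next to it. */
typedef struct {
    uint16_t heap_block_hi;
    uint16_t heap_block_lo;
    uint16_t heap_offset;
    uint16_t info;
    uint32_t xmax;              /* deleting transaction id, 4 bytes */
} IndexEntryWithXid;

/* Setting the flag reuses existing space in the info word. */
static void
mark_heap_dead(IndexEntryWithFlag *entry)
{
    assert(entry != NULL);
    entry->info |= ENTRY_HEAP_DEAD;
}

/* Extra bytes per index entry needed to carry an xid instead of a flag. */
static size_t
xid_overhead(void)
{
    return sizeof(IndexEntryWithXid) - sizeof(IndexEntryWithFlag);
}
```

In this model the flag costs nothing extra per entry, while the xid variant grows every entry by the 4 bytes mentioned above.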

Andreas

#6Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Zeugswetter Andreas SB (#5)
Re: AW: AW: Adding index flag showing tuple status

I wasn't going to delete it, just add a flag so index scans know they
don't need to look at the heap table.

If it is only a flag, you would need to go to the same trouble
that vacuum already goes to (you cannot mark it if someone else
is still interested in this snapshot). Thus I do not see any
benefit in adding a flag, versus deleting not needed keys. To
avoid the snapshot trouble you would need a xid (xmax or
something), and that is 4 bytes, and not a simple flag.

Yes, I would need:

GetXmaxRecent(&XmaxRecent);

which finds the minimum visible transaction for all active backends.
This is currently used in vacuum and btbuild. I would use that
visibility to set the bit. Because it is only a bit, I would need a read
lock but not a write lock. Multiple people can set the same bit.
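As a rough illustration of the kind of test involved, here is a minimal sketch that assumes each backend publishes the oldest transaction id its snapshot still cares about. `BackendSlot`, `oldest_active_xmin` and `dead_to_all` are hypothetical names, not the real backend code, and transaction id wraparound is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Hypothetical per-backend slot: whether a transaction is running and
 * the oldest transaction id its snapshot still treats as in progress. */
typedef struct {
    bool          active;
    TransactionId xmin;
} BackendSlot;

/* Return the smallest xmin among active backends, never larger than
 * `start` (for example the caller's own transaction id). */
static TransactionId
oldest_active_xmin(const BackendSlot *slots, size_t n, TransactionId start)
{
    TransactionId result = start;

    assert(slots != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (slots[i].active && slots[i].xmin < result)
            result = slots[i].xmin;
    }
    return result;
}

/* A deleted tuple is dead to every backend only if its deleter
 * committed and its xmax precedes the horizon computed above. */
static bool
dead_to_all(TransactionId xmax, bool xmax_committed, TransactionId horizon)
{
    return xmax_committed && xmax < horizon;
}
```

Only tuples that pass `dead_to_all` would be safe to flag, which is the same condition vacuum has to establish before removing a tuple.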

Basically, I need to:

	add a global XmaxRecent and set it once per transaction if needed
	add a parameter to heap_fetch to pass back an index tid
	if the heap tuple is not visible to any backend, mark the index flag
	update the index scan code to skip returning these flagged entries
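The steps above can be sketched with a toy in-memory model. Everything here is an assumption for illustration: `HeapTupleModel`, `IndexEntryModel`, `index_scan` and the `ENTRY_HEAP_DEAD` bit are made-up names, the heap is a plain array, and locking and per-snapshot visibility are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define ENTRY_HEAP_DEAD 0x2000      /* hypothetical flag bit */

typedef struct {
    TransactionId xmax;             /* 0 means the tuple was never deleted */
    bool          xmax_committed;
} HeapTupleModel;

typedef struct {
    size_t   heap_slot;             /* stands in for the heap tid */
    uint16_t flags;
} IndexEntryModel;

/* Stand-in for the heap lookup: is this tuple dead to every backend? */
static bool
heap_dead_to_all(const HeapTupleModel *tuple, TransactionId horizon)
{
    return tuple->xmax != 0 && tuple->xmax_committed && tuple->xmax < horizon;
}

/* Walk the index. Flagged entries are skipped without a heap lookup;
 * entries whose heap tuple turns out to be dead to all are flagged for
 * later scans. Returns the number of heap lookups performed and stores
 * the number of entries that still need a normal visibility check. */
static size_t
index_scan(IndexEntryModel *index, size_t n, const HeapTupleModel *heap,
           TransactionId horizon, size_t *candidates_out)
{
    size_t heap_fetches = 0;
    size_t candidates = 0;

    assert(index != NULL && heap != NULL && candidates_out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (index[i].flags & ENTRY_HEAP_DEAD)
            continue;               /* known dead: skip the heap */
        heap_fetches++;
        if (heap_dead_to_all(&heap[index[i].heap_slot], horizon))
            index[i].flags |= ENTRY_HEAP_DEAD;
        else
            candidates++;
    }
    *candidates_out = candidates;
    return heap_fetches;
}
```

In this model the first scan pays for every heap lookup, and later scans skip the entries it flagged.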

Of course, Tom thinks it may not be needed with his new vacuum so I may
never implement it.
