pgsql: Invent PageIndexTupleOverwrite, and teach BRIN and GiST to use i

Started by Tom Lanealmost 10 years ago4 messagescomitters
Jump to latest
#1Tom Lane
tgl@sss.pgh.pa.us

Invent PageIndexTupleOverwrite, and teach BRIN and GiST to use it.

PageIndexTupleOverwrite performs approximately the same function as
PageIndexTupleDelete (or PageIndexDeleteNoCompact) followed by PageAddItem
targeting the same item pointer offset. But in the case where the new
tuple is the same size as the old, it avoids shuffling other data around on
the page, because the new tuple is placed where the old one was rather than
being appended to the end of the page. This has been shown to provide a
substantial speedup for some GiST use-cases.

Also, this change allows some API simplifications: we can get rid of
the rather klugy and error-prone PAI_ALLOW_FAR_OFFSET flag for
PageAddItemExtended, since that was used only to cover a corner case
for BRIN that's better expressed by using PageIndexTupleOverwrite.

Note that this patch causes a rather subtle WAL incompatibility: the
physical page content change represented by certain WAL records is now
different than it was before, because while the tuples have the same
itempointer line numbers, the tuples themselves are in different places.
I have not bumped the WAL version number because I think it doesn't matter
unless you are trying to do bitwise comparisons of original and replayed
pages, and in any case we're early in a devel cycle and there will probably
be more WAL changes before v10 gets out the door.

There is probably room to make use of PageIndexTupleOverwrite in SP-GiST
and GIN too, but that is left for a future patch.

Andrey Borodin, reviewed by Anastasia Lubennikova, whacked around a bit
by me

Discussion: <CAJEAwVGQjGGOj6mMSgMwGvtFd5Kwe6VFAxY=uEPZWMDjzbn4VQ@mail.gmail.com>

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/b1328d78f88cdf4f7504004159e530b776f0de16

Modified Files
--------------
src/backend/access/brin/brin_pageops.c | 6 +-
src/backend/access/brin/brin_xlog.c | 9 +--
src/backend/access/gist/gist.c | 30 +++++--
src/backend/access/gist/gistxlog.c | 29 ++++++-
src/backend/storage/page/bufpage.c | 142 ++++++++++++++++++++++++++++-----
src/include/storage/bufpage.h | 3 +-
6 files changed, 179 insertions(+), 40 deletions(-)

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

#2Andrey Borodin
amborodin@acm.org
In reply to: Tom Lane (#1)
Re: pgsql: Invent PageIndexTupleOverwrite, and teach BRIN and GiST to use i

Hello!

2016-09-10 3:02 GMT+05:00 Tom Lane <tgl@sss.pgh.pa.us>:

Invent PageIndexTupleOverwrite, and teach BRIN and GiST to use it.

Discussion: <CAJEAwVGQjGGOj6mMSgMwGvtFd5Kwe6VFAxY=uEPZWMDjzbn4VQ@mail.gmail.com>

Details
-------
http://git.postgresql.org/pg/commitdiff/b1328d78f88cdf4f7504004159e530b776f0de16

Modified Files
--------------
src/backend/access/brin/brin_pageops.c | 6 +-
src/backend/access/brin/brin_xlog.c | 9 +--
src/backend/access/gist/gist.c | 30 +++++--
src/backend/access/gist/gistxlog.c | 29 ++++++-
src/backend/storage/page/bufpage.c | 142 ++++++++++++++++++++++++++++-----
src/include/storage/bufpage.h | 3 +-
6 files changed, 179 insertions(+), 40 deletions(-)

This commit changes shape of WAL records of GiST and BRIN, but does
not bump XLOG_PAGE_MAGIC. Is it for a reason?

Best regards, Andrey Borodin.

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrey Borodin (#2)
Re: pgsql: Invent PageIndexTupleOverwrite, and teach BRIN and GiST to use i

Andrew Borodin <borodin@octonica.com> writes:

2016-09-10 3:02 GMT+05:00 Tom Lane <tgl@sss.pgh.pa.us>:

Invent PageIndexTupleOverwrite, and teach BRIN and GiST to use it.

This commit changes shape of WAL records of GiST and BRIN, but does
not bump XLOG_PAGE_MAGIC. Is it for a reason?

Yeah, I explained it in part of the commit message you didn't quote:

Note that this patch causes a rather subtle WAL incompatibility: the
physical page content change represented by certain WAL records is now
different than it was before, because while the tuples have the same
itempointer line numbers, the tuples themselves are in different places.
I have not bumped the WAL version number because I think it doesn't matter
unless you are trying to do bitwise comparisons of original and replayed
pages, and in any case we're early in a devel cycle and there will probably
be more WAL changes before v10 gets out the door.

regards, tom lane

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

#4Andrey Borodin
amborodin@acm.org
In reply to: Tom Lane (#3)
Re: pgsql: Invent PageIndexTupleOverwrite, and teach BRIN and GiST to use i

2016-09-22 23:11 GMT+05:00 Tom Lane <tgl@sss.pgh.pa.us>:

Yeah, I explained it in part of the commit message you didn't quote:

Oh.. sorry, my bad. Focused on code and missed whole thing.

Note that this patch causes a rather subtle WAL incompatibility: the
physical page content change represented by certain WAL records is now
different than it was before, because while the tuples have the same
itempointer line numbers, the tuples themselves are in different places.

As for GiST, that was my initial suggestion, but tests of Anastasia
Lubennikova showed that in some cases index size is reduced by 3. It's
not possible to get 3 times less pages by compatible WAL replay.

I have not bumped the WAL version number because I think it doesn't matter
unless you are trying to do bitwise comparisons of original and replayed
pages, and in any case we're early in a devel cycle and there will probably
be more WAL changes before v10 gets out the door.

I think that's correct assumption. Sorry for time taken.

Best regards, Andrey Borodin.

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers