GIN improvements

Started by Teodor Sigaev over 17 years ago, 86 messages
#1Teodor Sigaev
teodor@sigaev.ru

Improvements to GIN indexes were presented at PGCon 2008. Presentation:
http://www.sigaev.ru/gin/fastinsert_and_multicolumn_GIN.pdf

1) multicolumn GIN
This patch ( http://www.sigaev.ru/misc/multicolumn_gin-0.2.gz ) adds multicolumn
support to GIN. The basic idea is: keys (entries in GIN terminology) extracted
from values are stored in separate tuples along with their column number. In
that case, a multicolumn clause is just the AND of the per-column clauses. Unlike
other indexes, search performance doesn't depend on which column of the index
(first, last, any subset) is used in the search clause. This property could be
used in gincostestimate, but I haven't looked at it yet.
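
A minimal sketch of the idea, not the patch's actual C code: a toy in-memory inverted index whose entries are keyed by (column number, key), so that a multicolumn condition is the intersection of the per-column matches. The class name `ToyMulticolumnGin` and its methods are hypothetical and exist only for illustration.

```python
from collections import defaultdict


class ToyMulticolumnGin:
    """Toy inverted index: each entry is keyed by (column number, key)."""

    def __init__(self):
        # (column number, key) -> set of row ids containing that key
        self.entries = defaultdict(set)

    def insert(self, row_id, row):
        # row is a tuple of arrays, one array per indexed column
        for colno, values in enumerate(row, start=1):
            for key in values:
                self.entries[(colno, key)].add(row_id)

    def overlap(self, colno, keys):
        # Rows whose column `colno` shares at least one key with `keys` (like &&)
        result = set()
        for key in keys:
            result |= self.entries.get((colno, key), set())
        return result

    def search(self, clauses):
        # clauses: {column number: keys}; the multicolumn clause is the AND
        # (set intersection) of the per-column results
        per_column = [self.overlap(colno, keys) for colno, keys in clauses.items()]
        return set.intersection(*per_column) if per_column else set()


index = ToyMulticolumnGin()
index.insert(1, ([1, 2], ["a"]))
index.insert(2, ([2, 3], ["b"]))
index.insert(3, ([4], ["a", "b"]))
```

In this toy model a lookup on column 2 alone goes through exactly the same dictionary access as a lookup on column 1, which is the column-independence property described above.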

2) fast insert into GIN
This patch ( http://www.sigaev.ru/misc/fast_insert_gin-0.4.gz ) implements the
idea of using the bulk insert technique that is used at index creation time.
Inserted rows are stored in a linked list of pending pages and are inserted into
the regular GIN structure at vacuum time. The algorithm is shown in the
presentation, but the insert completion process (vacuum) was significantly
reworked to improve concurrency. Now the list of pending pages is locked for a
much shorter time: only during deletion of pages from the list.
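
The pending-list mechanism can be sketched as follows. This is a single-process toy model under simplifying assumptions (no pages, no locking, no WAL); `ToyFastInsertGin` and its method names are hypothetical and do not correspond to functions in the patch.

```python
from collections import defaultdict


class ToyFastInsertGin:
    """Toy model: cheap appends to a pending list, bulk merge at 'vacuum' time."""

    def __init__(self):
        self.main = defaultdict(set)   # key -> row ids (the regular structure)
        self.pending = []              # (row id, keys) tuples awaiting merge

    def insert(self, row_id, keys):
        # Fast path: append the row, no per-key update of the main structure.
        self.pending.append((row_id, list(keys)))

    def vacuum(self):
        # Insert completion: group pending keys, then merge each key once.
        batch = defaultdict(set)
        for row_id, keys in self.pending:
            for key in keys:
                batch[key].add(row_id)
        for key, row_ids in batch.items():
            self.main[key] |= row_ids
        self.pending.clear()

    def search(self, key):
        # A search must consult both the main structure and the pending list.
        found = set(self.main.get(key, set()))
        found.update(row_id for row_id, keys in self.pending if key in keys)
        return found


gin = ToyFastInsertGin()
gin.insert(1, ["x", "y"])
gin.insert(2, ["y"])
before = gin.search("y")   # answered partly from the pending list
gin.vacuum()
after = gin.search("y")    # answered from the main structure only
```

The search result is the same before and after the merge; only the cost differs, because the pending list is scanned linearly.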

Open item:
What is the right time to call insert completion? Currently, it is called by
ginbulkdelete and ginvacuumcleanup; ginvacuumcleanup calls completion if
ginbulkdelete wasn't called. That's not good, but it works. The completion
process should start before ginbulkdelete because ginbulkdelete doesn't look at
pending pages at all.

Insert completion (of any index, if that method exists, I think) runs fast when
the number of inserted tuples is small, because it doesn't go through the whole
index, so IMHO the existing statistics fields should not be changed. The idea
discussed at PGCon was to have a trigger in vacuum which fires when the number
of inserted tuples becomes big. Now I don't think that idea is useful, for two
reasons: for a small number of tuples completion is cheap, and it should be
called before ginbulkdelete anyway. IMHO, it's enough to add an optional method
to pg_am (aminsertcleanup, per Tom's suggestion). This method will be called
before ambulkdelete and amvacuumcleanup. Opinions, objections, suggestions?

At the presentation some people were interested in how our changes affect
search speed after rows are inserted. The tests are below. We use the same
tables as in the presentation and measure search times (after insertion of some
rows) before and after vacuum. All times are in ms. Test tables contain 100000
rows: in the first table the number of elements in each array is 100 with
cardinality 500; in the second, 100 and 500000; in the last, 1000 and 500.

Insert 10000 into table with 100000 rows (10%)
                 |   v && '{v1}'    |
-----------------+---------+--------+ found
                 | novac-d | vac-d  | rows
-----------------+---------+--------+-------
n:100, c:500     |     118 |     35 | 19909
n:100, c:500000  |      95 |    0.7 |    25
n:1000, c:500    |     380 |     79 | 95211

Insert 1000 into table with 100000 rows (1%)
                 |   v && '{v1}'    |
-----------------+---------+--------+ found
                 | novac-d | vac-d  | rows
-----------------+---------+--------+-------
n:100, c:500     |      40 |     31 | 18327
n:100, c:500000  |      13 |    0.5 |    26
n:1000, c:500    |     102 |     71 | 87499

Insert 100 into table with 100000 rows (0.1%)
                 |   v && '{v1}'    |
-----------------+---------+--------+ found
                 | novac-d | vac-d  | rows
-----------------+---------+--------+-------
n:100, c:500     |      32 |     31 | 18171
n:100, c:500000  |     1.7 |    0.5 |    20
n:1000, c:500    |      74 |     71 | 87499

Looking at the results, it's easy to conclude that:
- the time to search the pending list is O(number of inserted rows), i.e., search
time is equal to (time of search in GIN) + K1 * (number of rows inserted since
the last vacuum).
- search time is O(average length of indexed columns). The observation made above
also applies here.
- a significant performance gap starts at around 5-10% of inserts or near
500-1000 inserts. This depends heavily on the specific dataset.
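
As a rough check of the first conclusion, K1 can be estimated from the three tables as (time before vacuum - time after vacuum) / (number of inserted rows). The sketch below only performs that arithmetic on the posted numbers; the variable and function names are illustrative, and the result is a back-of-envelope figure, not a new measurement.

```python
# inserted rows -> {dataset: (ms before vacuum, ms after vacuum)}, copied from the tables
timings = {
    10000: {"n:100, c:500": (118, 35), "n:100, c:500000": (95, 0.7), "n:1000, c:500": (380, 79)},
    1000: {"n:100, c:500": (40, 31), "n:100, c:500000": (13, 0.5), "n:1000, c:500": (102, 71)},
    100: {"n:100, c:500": (32, 31), "n:100, c:500000": (1.7, 0.5), "n:1000, c:500": (74, 71)},
}


def estimate_k1(inserted, novac_ms, vac_ms):
    # search time = (time in GIN) + K1 * inserted  =>  K1 = (novac - vac) / inserted
    return (novac_ms - vac_ms) / inserted


k1 = {
    (inserted, dataset): estimate_k1(inserted, novac, vac)
    for inserted, datasets in timings.items()
    for dataset, (novac, vac) in datasets.items()
}

for (inserted, dataset), value in sorted(k1.items()):
    print(f"{inserted:>6} rows  {dataset:<16} K1 = {value * 1000:.1f} us/row")
```

With these numbers K1 comes out at roughly 0.008 to 0.013 ms per pending row for the 100-element arrays and at about 0.03 ms per pending row for the 1000-element arrays, which is in line with the first two conclusions.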

Note that insert performance into GIN was increased by up to 10 times. See the
exact results in the presentation.

Do we need to add an option to control this (fast insertion) feature?
If so, what should the default value be? It's not clear to me.

Note: These patches are mutually exclusive because they touch the same pieces
of code and I'm too lazy to manage several dependent patches. I don't see any
problem with joining the patches into one, but IMHO it would be difficult to review.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

#2Teodor Sigaev
teodor@sigaev.ru
In reply to: Teodor Sigaev (#1)
Re: GIN improvements

2) fast insert into GIN

New version:
http://www.sigaev.ru/misc/fast_insert_gin-0.6.gz

Changes:
- added option FASTUPDATE=(1|t|true|on|enable|0|f|false|disable) for
CREATE/ALTER command for GIN indexes
- Since there weren't any comments on the first email, the optional
pg_am.aminsertcleanup method was introduced.
- added documentation

I suppose the patch is ready for review/commit...

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

#3Heikki Linnakangas
heikki@enterprisedb.com
In reply to: Teodor Sigaev (#2)
Re: GIN improvements

Teodor Sigaev wrote:

2) fast insert into GIN

New version:
http://www.sigaev.ru/misc/fast_insert_gin-0.6.gz

Changes:
- added option FASTUPDATE=(1|t|true|on|enable|0|f|false|disable) for
CREATE/ALTER command for GIN indexes

I think we should try to make it automatic. I'm not sure how.

How about having a constant sized "fastupdate" buffer, of say 100 rows
or a fixed number of pages, and when that becomes full, the next
inserter will have to pay the price of updating the index and flushing
the buffer. To keep that overhead out of the main codepath, we could
make autovacuum flush the buffers periodically.
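
A minimal sketch of that proposal, assuming a simple in-memory buffer: once the buffer is full, the next inserter flushes it before adding its own row, and a background worker may call `flush` earlier to keep foreground inserters off that path. `BoundedPendingBuffer` and its fields are hypothetical names for illustration, not anything in the patch.

```python
class BoundedPendingBuffer:
    """Toy fixed-size 'fastupdate' buffer: the inserter that finds it full pays for the flush."""

    def __init__(self, capacity=100):
        self.capacity = capacity
        self.rows = []
        self.flushed = []       # stands in for the regular index structure
        self.flush_count = 0

    def insert(self, row):
        # Returns True if this insert had to pay for flushing the buffer.
        paid = len(self.rows) >= self.capacity
        if paid:
            self.flush()
        self.rows.append(row)
        return paid

    def flush(self):
        # Also callable from a background worker (e.g. autovacuum) so that
        # foreground inserters rarely find the buffer full.
        if self.rows:
            self.flushed.extend(self.rows)
            self.rows.clear()
            self.flush_count += 1


buf = BoundedPendingBuffer(capacity=3)
paid = [buf.insert(i) for i in range(7)]
```

In this toy run with a capacity of 3, the fourth and seventh inserts are the ones that pay for a flush.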

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#4Teodor Sigaev
teodor@sigaev.ru
In reply to: Heikki Linnakangas (#3)
Re: GIN improvements

How about having a constant sized "fastupdate" buffer, of say 100 rows
or a fixed number of pages, and when that becomes full, the next
inserter will have to pay the price of updating the index and flushing

I'm not sure that is acceptable, because flushing the pending list may take
several seconds at an unpredictable moment.

the buffer. To keep that overhead out of the main codepath, we could
make autovacuum to flush the buffers periodically.

Do you mean that GIN sends a "smoke signal" to the autovacuum launcher process
to request a vacuum?

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

#5Alvaro Herrera
alvherre@commandprompt.com
In reply to: Teodor Sigaev (#4)
Re: GIN improvements

Teodor Sigaev wrote:

How about having a constant sized "fastupdate" buffer, of say 100 rows
or a fixed number of pages, and when that becomes full, the next
inserter will have to pay the price of updating the index and flushing

I'm not sure that is acceptable because flushing pending list may take
several seconds in unpredictable moment.

Perhaps we could make the fixed-size buffer be of size X, and trigger
autovac on X/3 or some such, to give it enough slack so that it would be
very unlikely to be processed by user processes.

the buffer. To keep that overhead out of the main codepath, we could
make autovacuum to flush the buffers periodically.

Do you mean that GIN sends a "smoke signal" to the autovacuum launcher
process to ask about vacuum?

Something like that, yes.

Currently, autovac only uses pgstats as trigger for actions. Maybe we
could use something else (say, a flag in shared memory?), or just stash
the info that the index needs to be processed in pgstats and have
autovac examine it.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#6Alvaro Herrera
alvherre@commandprompt.com
In reply to: Teodor Sigaev (#2)
Re: GIN improvements

Teodor Sigaev wrote:

2) fast insert into GIN

New version:
http://www.sigaev.ru/misc/fast_insert_gin-0.6.gz

Changes:
- added option FASTUPDATE=(1|t|true|on|enable|0|f|false|disable) for
CREATE/ALTER command for GIN indexes
- Since there wasn't any comments on first email, pg_am.aminsertcleanup optional
method was introduced.

Hmm, this alters the semantics of amvacuumcleanup a bit. Currently in
btvacuumcleanup we invoke btvacuumscan only if btbulkdelete was not
called. This is noticed by observing whether the "stats" pointer is
NULL. However, the patch changes this a bit because
index_vacuum_cleanup is called with the results of index_insert_cleanup,
instead of a plain NULL.

Right now this is not a problem because there is no insert_cleanup
function for btree, but I wonder if we should clean it up.

FWIW there's a typo in catalogs.sgml (finction -> function)

What's the use of the FASTUPDATE parameter? Is there a case when a user
is interested in turning it off?

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#7Teodor Sigaev
teodor@sigaev.ru
In reply to: Alvaro Herrera (#5)
Re: GIN improvements

Perhaps we could make the fixed-size buffer be of size X, and trigger
autovac on X/3 or some such, to give it enough slack so that it would be
very unlikely to be processed by user processes.

Do you mean that GIN sends a "smoke signal" to the autovacuum launcher
process to ask about vacuum?

Currently, autovac only uses pgstats as trigger for actions. Maybe we
could use something else (say, a flag in shared memory?), or just stash
the info that the index needs to be processed in pgstats and have
autovac examine it.

A flag in pgstats or shared memory is the most reasonable solution. Using the size
of buffers is not very good because other indexes might use other techniques for
delayed insert and other trigger criteria.

I suppose the best technique for GIN would be for the search procedure to set the
flag: if the time spent scanning pending pages is equal to the time of search in
the regular structure, then it's time to call insert cleanup.
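
That heuristic can be sketched as below. This is only an illustration of the decision rule under the assumption that both timings are available to the search code; `CleanupAdvisor` and its attribute names are hypothetical, and the boolean stands in for whatever flag would live in pgstats or shared memory.

```python
class CleanupAdvisor:
    """Toy heuristic: request insert cleanup once the pending scan costs as much as the main search."""

    def __init__(self):
        self.needs_cleanup = False   # stands in for a flag in pgstats or shared memory

    def record_search(self, pending_ms, main_ms):
        # Set the flag when scanning pending pages takes at least as long as
        # searching the regular structure.
        if pending_ms >= main_ms:
            self.needs_cleanup = True
        return self.needs_cleanup

    def cleanup_done(self):
        # Called after insert cleanup has emptied the pending list.
        self.needs_cleanup = False


advisor = CleanupAdvisor()
first = advisor.record_search(pending_ms=1.0, main_ms=31.0)    # short pending list
second = advisor.record_search(pending_ms=83.0, main_ms=35.0)  # long pending list
```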

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

#8Teodor Sigaev
teodor@sigaev.ru
In reply to: Alvaro Herrera (#6)
Re: GIN improvements

Right now this is not a problem because there is no insert_cleanup
function for btree, but I wonder if we should clean it up.

Look at gistbulkdelete and gistvacuumcleanup: the first function wants to send a
bool flag to the second one, and they use the GiSTBulkDelete structure instead of
the usual IndexBulkDeleteResult. When needed, btree may use the same method.

FWIW there's a typo in catalogs.sgml (finction -> function)

Thank you, will fix.

What's the use of the FASTUPDATE parameter? Is there a case when a user
is interested in turning it off?

Yeah - when search time is much, much more important (or crucial) than
insertion time. Or when the table stores read-only values.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

#9Teodor Sigaev
teodor@sigaev.ru
In reply to: Teodor Sigaev (#1)
Re: GIN improvements

1) multicolumn GIN
Unlike other indexes, the performance of search
doesn't depends on what column of index (first, last, any subset) is
used in search clause. This property can be used in gincostestimate, but
I haven't looked on it yet.

After some playing I didn't find anything in the *costestimate functions about a
difference in cost estimation between the first and any other column in clauses,
so, IMHO, the issue above isn't an issue. :)

So, I didn't see any comments/objections, and I intend to commit this patch within
the next two days and synchronize the 'fast insert into GIN' patch with CVS.

Objections?

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Teodor Sigaev (#9)
Re: GIN improvements

Teodor Sigaev <teodor@sigaev.ru> writes:

So, I didn't see any comments/objections and I intend to commit this patch for
next two days and synchronize 'fast insert into GIN' patch with CVS.
Objections?

I think it hasn't really gotten reviewed at all (certainly not by me).
If you want other people to look it over you should wait for next
month's commit fest.

regards, tom lane

#11Teodor Sigaev
teodor@sigaev.ru
In reply to: Teodor Sigaev (#1)
Re: [PATCHES] GIN improvements

Synced with current CVS HEAD; also posting to -hackers because -patches is close
to closing.

http://www.sigaev.ru/misc/fast_insert_gin-0.7.gz
http://www.sigaev.ru/misc/multicolumn_gin-0.3.gz

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

#12Heikki Linnakangas
heikki@enterprisedb.com
In reply to: Teodor Sigaev (#11)
Multi-column GIN

Dumb question:

What's the benefit of a multi-column GIN index over multiple
single-column GIN indexes?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#13Teodor Sigaev
teodor@sigaev.ru
In reply to: Heikki Linnakangas (#12)
Re: [PATCHES] Multi-column GIN

What's the benefit of a multi-column GIN index over multiple
single-column GIN indexes?

Page 12 from presentation on PgCon
(http://www.sigaev.ru/gin/fastinsert_and_multicolumn_GIN.pdf):

Multicolumn index vs. 2 single column indexes

Size:    539 Mb        538 Mb
Speed:   *1.885* ms    4.994 ms
Index:   ~340 s        ~200 s
Insert:  72 s/10000    66 s/10000

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

#14Oleg Bartunov
oleg@sai.msu.su
In reply to: Teodor Sigaev (#13)
Re: [PATCHES] Multi-column GIN

On Fri, 4 Jul 2008, Teodor Sigaev wrote:

What's the benefit of a multi-column GIN index over multiple
single-column GIN indexes?

Page 12 from presentation on PgCon
(http://www.sigaev.ru/gin/fastinsert_and_multicolumn_GIN.pdf):

Multicolumn index vs. 2 single column indexes

Size:    539 Mb        538 Mb
Speed:   *1.885* ms    4.994 ms
Index:   ~340 s        ~200 s
Insert:  72 s/10000    66 s/10000

Well, another reason is index feature-completeness.

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

#15Alvaro Herrera
alvherre@commandprompt.com
In reply to: Teodor Sigaev (#11)
Re: [PATCHES] GIN improvements

Teodor Sigaev wrote:

Sync with current CVS HEAD and post in hackers- too because patches-
close to the closing.

http://www.sigaev.ru/misc/multicolumn_gin-0.3.gz

I looked this over and it looks good in general. I was only wondering
about single-column indexes -- we're storing attribute numbers too,
right? Would it be too difficult to strip them out in that case?

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#16Teodor Sigaev
teodor@sigaev.ru
In reply to: Alvaro Herrera (#15)
Re: [PATCHES] GIN improvements

I looked this over and it looks good in general. I was only wondering
about for single-column indexes -- we're storing attribute numbers too,
right?

No, the GinState->oneCol field tells GinFormTuple and
gin_index_getattr/gintuple_get_attrnum about the actual storage format.

A single-column index is binary compatible with the current index :)
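
The effect of such a flag can be sketched as follows. The real code works on index tuples in C; the Python functions below (`form_entry`, `entry_attnum`, `entry_key`) are hypothetical stand-ins that only show how one boolean can select between the old key-only layout and the new (column number, key) layout.

```python
def form_entry(one_col, attnum, key):
    # Single-column index: store the key alone, as before the patch.
    # Multicolumn index: store the column number alongside the key.
    return (key,) if one_col else (attnum, key)


def entry_attnum(one_col, entry):
    # With one column the attribute number is implicit: it is always 1.
    return 1 if one_col else entry[0]


def entry_key(one_col, entry):
    # The key sits in a different position depending on the layout.
    return entry[0] if one_col else entry[1]


single = form_entry(True, 1, "k")    # key-only layout
multi = form_entry(False, 2, "k")    # (column number, key) layout
```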

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

#17Alvaro Herrera
alvherre@commandprompt.com
In reply to: Teodor Sigaev (#16)
Re: [PATCHES] GIN improvements

Teodor Sigaev wrote:

I looked this over and it looks good in general. I was only wondering
about for single-column indexes -- we're storing attribute numbers too,
right?

No, GinState->oneCol field signals to GinFormTuple and
gin_index_getattr/gintuple_get_attrnum about actual storage.

Single column index is binary compatible with current index :)

Ah, neat!

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#18Teodor Sigaev
teodor@sigaev.ru
In reply to: Alvaro Herrera (#17)
Re: [PATCHES] GIN improvements

I looked this over and it looks good in general.

May I consider the patch to have passed review and commit it?

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Teodor Sigaev (#18)
Re: [PATCHES] GIN improvements

Teodor Sigaev <teodor@sigaev.ru> writes:

I looked this over and it looks good in general.

May I think that patch passed review and commit it?

I'd still like to take a look.

regards, tom lane

#20Neil Conway
neilc@samurai.com
In reply to: Tom Lane (#19)
Re: [PATCHES] GIN improvements

On Tue, 2008-07-08 at 14:51 -0400, Tom Lane wrote:

I'd still like to take a look.

I was tasked with reviewing this for the current commit fest, although
so far I've just been working on grokking the rest of the GIN code. But
if you'd like to review it instead, that's fine with me.

-Neil

#21Josh Berkus
josh@agliodbs.com
In reply to: Neil Conway (#20)
Re: [PATCHES] GIN improvements

Neil,

I was tasked with reviewing this for the current commit fest, although
so far I've just been working on grokking the rest of the GIN code. But
if you'd like to review it instead, that's fine with me.

I have plenty of other stuff I could assign you if you're not needed on
GIN.

--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco

#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Teodor Sigaev (#11)
Re: [PATCHES] GIN improvements

Teodor Sigaev <teodor@sigaev.ru> writes:

http://www.sigaev.ru/misc/fast_insert_gin-0.7.gz
http://www.sigaev.ru/misc/multicolumn_gin-0.3.gz

I've committed the multicolumn one with minor revisions (fix some poor
English in docs and comments, add regression-test coverage). Do you
need more review of fast_insert yet? It looked like a number of people
commented on it already.

regards, tom lane

#23Teodor Sigaev
teodor@sigaev.ru
In reply to: Tom Lane (#22)
Re: [PATCHES] GIN improvements

I've committed the multicolumn one with minor revisions (fix some poor
English in docs and comments, add regression-test coverage). Do you

Thank you very much.

need more review of fast_insert yet? It looked like a number of people
commented on it already.

I should modify it to support/synchronize with multicolumn GIN - both patches
touch the same pieces of code, and I didn't combine them into a single patch, to
simplify review.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

#24Teodor Sigaev
teodor@sigaev.ru
In reply to: Tom Lane (#22)
Re: [PATCHES] GIN improvements

Updated: http://www.sigaev.ru/misc/fast_insert_gin-0.9.gz

need more review of fast_insert yet? It looked like a number of people
commented on it already.

It's still not clear to me whether the suggested aminsertcleanup call is acceptable.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

#25Tom Lane
tgl@sss.pgh.pa.us
In reply to: Teodor Sigaev (#24)
Re: [PATCHES] GIN improvements

Teodor Sigaev <teodor@sigaev.ru> writes:

Updated: http://www.sigaev.ru/misc/fast_insert_gin-0.9.gz
I still havn't clearness of acceptability for suggested aminsertcleanup calling.

I started to look at this. I don't understand why VACUUM does an insert
cleanup before starting to vacuum, but VACUUM FULL doesn't?

I don't particularly like the idea of adding aminsertcleanup calls
immediately before other AM operations such as ambulkdelete. It seems
to me that those operations ought to include the cleanup subroutine
themselves, if they need it; they shouldn't depend on callers to get
this right. Offhand it looks to me like the only new index AM call
needed is the one at vacuum startup, which tempts me to propose that
the new AM entry point should be called "amvacuumstartup", instead of
wiring in the assumption that what it's for is specifically cleanup
of insertions.

Comments? I can make the change if you think it's okay --- I'm busy
cleaning up docs and comments at the moment.

regards, tom lane

#26Teodor Sigaev
teodor@sigaev.ru
In reply to: Tom Lane (#25)
Re: [PATCHES] GIN improvements

I started to look at this. I don't understand why VACUUM does an insert
cleanup before starting to vacuum, but VACUUM FULL doesn't?

Hmm. Maybe I missed something, but I don't understand where and what... I tried
to track all places where ambulkdelete is called. aminsertcleanup should be called
before any ambulkdelete, because ambulkdelete doesn't scan the pending list, which
can store items to be deleted, and hence the index would keep item pointers to
absent tuples.

needed is the one at vacuum startup, which tempts me to propose that
the new AM entry point should be called "amvacuumstartup", instead of
wiring in the assumption that what it's for is specifically cleanup
of insertions.

That's possible, but inserts into the index should be forbidden between
amvacuumstartup and the last call of ambulkdelete.

Comments? I can make the change if you think it's okay --- I'm busy
cleaning up docs and comments at the moment.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

#27Tom Lane
tgl@sss.pgh.pa.us
In reply to: Teodor Sigaev (#26)
Re: [PATCHES] GIN improvements

Teodor Sigaev <teodor@sigaev.ru> writes:

I started to look at this. I don't understand why VACUUM does an insert
cleanup before starting to vacuum, but VACUUM FULL doesn't?

Hmm. May be I missed something, but I don't understand where and what... I tried
to track all places of ambultdelete call. aminsertcleanup should be called
before any ambulkdelete, because ambulkdelete doesn't scan pending list which
can store items to be deleted and hence index will store item pointers to absent
tuples.

needed is the one at vacuum startup, which tempts me to propose that
the new AM entry point should be called "amvacuumstartup", instead of
wiring in the assumption that what it's for is specifically cleanup
of insertions.

That's possible but inserts into index should be forbidden between
amvacuumstartup and last call of ambulkdelete.

Well, if that is required to be true then this whole design is pretty
broken, because VACUUM doesn't hold any lock that would guarantee that
no insert happens between the two calls. If we fold the two AM calls
into one call then it'd be okay for the index AM to take such a lock
transiently during the single index-cleanup-plus-bulkdelete call.

For VACUUM FULL there's no such issue because the whole table is locked,
but I still don't see any real point in having two successive index AM
calls when the AM could perfectly well do all the work in one call.

Maybe it'd be better if ambulkdelete *did* scan the pending list?
You'd still need at least page-level locking but perhaps not anything
stronger.

regards, tom lane

#28Teodor Sigaev
teodor@sigaev.ru
In reply to: Tom Lane (#27)
Re: [PATCHES] GIN improvements

Well, if that is required to be true then this whole design is pretty
broken, because VACUUM doesn't hold any lock that would guarantee that
no insert happens between the two calls. If we fold the two AM calls
into one call then it'd be okay for the index AM to take such a lock
transiently during the single index-cleanup-plus-bulkdelete call.

Actually, a lock isn't needed. bulkdelete just should not try to remove item
pointers that are not yet "insertcleanuped". That's easy because VacPageList is
prepared before the insertcleanup call.

Maybe it'd be better if ambulkdelete *did* scan the pending list?

I don't like that idea because it requires adding a lot of code (concurrent
deletion of pages in the list); it's much simpler to call insertcleanup inside
ginbulkdelete/ginvacuumcleanup.

You'd still need at least page-level locking but perhaps not anything
stronger.

It's close to trivial to revert this piece and add the cleanup call to
ginbulkdelete/ginvacuumcleanup. Early versions did it that way.
The reasons for the new variant were:
- deciding whether insertcleanup needs to be called; the stats argument was used
for that in both functions. If it's NULL, then call cleanup.
- I thought about a statistics-based trigger for a separate call of insertcleanup.
The trigger should fire on massive insert/update events, very similar to the
trigger on massive deletes for ambulkdelete. I'm very sorry but I haven't done it
yet, and I definitely need some help here.

Should I revert that piece?

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

#29Tom Lane
tgl@sss.pgh.pa.us
In reply to: Teodor Sigaev (#28)
Re: [PATCHES] GIN improvements

Teodor Sigaev <teodor@sigaev.ru> writes:

That's close to trivial to revert this piece to add cleanup call to
ginbulkdelete/ginvacuumcleanup. Early variants used this variant.

Yeah, I think we should do it that way.

On reflection I don't think you even need the amvacuumstartup call,
because it is *not* safe to assume that an index cleanup operation
there will guarantee that vacuum won't try to remove pending tuples.
Remember that a tuple inserted by a transaction that later aborted
is DEAD and can be reclaimed instantly by VACUUM. So while in the
case of VACUUM FULL it might be okay to call index_cleanup only
once, for regular VACUUM I think you really have to call it within
each bulkdelete operation. There's probably no point in optimizing
it away in VACUUM FULL either, since surely it'll be fast to call
index_cleanup when there's nothing in the pending list?
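
A toy illustration of why the cleanup belongs inside each bulkdelete call, under strong simplifications (single process, TIDs as plain tuples, no locking): if a dead TID is still sitting in the pending list when bulkdelete runs, a bulkdelete that looks only at the main structure misses it, and a later cleanup then merges the dangling TID into the index. `ToyGin` and `run` are hypothetical names for illustration only.

```python
class ToyGin:
    """Toy index whose bulkdelete can merge the pending list before removing dead TIDs."""

    def __init__(self):
        self.main = set()      # TIDs in the regular structure
        self.pending = []      # TIDs still in the pending list

    def insert(self, tid):
        self.pending.append(tid)

    def insert_cleanup(self):
        self.main.update(self.pending)
        self.pending.clear()

    def bulkdelete(self, dead_tids, cleanup_first=True):
        # Without the cleanup, a dead TID sitting in the pending list survives
        # this pass and is merged into the main structure later.
        if cleanup_first:
            self.insert_cleanup()
        self.main -= set(dead_tids)

    def all_tids(self):
        return self.main | set(self.pending)


def run(cleanup_first):
    gin = ToyGin()
    gin.insert((0, 1))            # inserted by a transaction that later aborted
    gin.bulkdelete([(0, 1)], cleanup_first=cleanup_first)
    gin.insert_cleanup()          # a later cleanup merges whatever is left
    return gin.all_tids()
```

Here `run(True)` leaves the index empty, while `run(False)` leaves the dead TID `(0, 1)` behind.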

- I thought about statistic-based trigger for separate call of insertcleanup.
Trigger should be fired on massive insert/update events very similar to
trigger on massive delete for ambulkdelete. I'm very sorry but I didn't do it
yet, and definitely I need some help here.

Yeah, I was going to complain about that next :-). Autovacuum isn't
going to trigger as a result of INSERT operations; somehow we have
to teach it what to do for GIN indexes. I remember we discussed this
at PGCon but I don't think we decided exactly what to do...

Do I revert that piece?

I've already made a number of changes to the patch; let me keep working
on it and send it back to you later.

regards, tom lane

#30Teodor Sigaev
teodor@sigaev.ru
In reply to: Tom Lane (#29)
Re: [PATCHES] GIN improvements

once, for regular VACUUM I think you really have to call it within
each bulkdelete operation.

Exactly what I did in last patch.

There's probably no point in optimizing
it away in VACUUM FULL either, since surely it'll be fast to call
index_cleanup when there's nothing in the pending list?

Sure, with empty pending list insertcleanup will just lock/unlock metapage.

Yeah, I was going to complain about that next :-). Autovacuum isn't
going to trigger as a result of INSERT operations; somehow we have
to teach it what to do for GIN indexes. I remember we discussed this
at PGCon but I don't think we decided exactly what to do...

So maybe we just move the insertcleanup call to ginbulkdelete/ginvacuumcleanup
but leave the aminsertcleanup field in pg_proc for the future.

I've already made a number of changes to the patch; let me keep working
on it and send it back to you later.

ok

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

#31Tom Lane
tgl@sss.pgh.pa.us
In reply to: Teodor Sigaev (#30)
Re: [PATCHES] GIN improvements

Teodor Sigaev <teodor@sigaev.ru> writes:

So, may be we just move insertcleanup call to ginbulkdelete/ginvacuumcleanup
but leave aminsertcleanup field in pg_proc for a future.

I'd be inclined not to add the extra AM call if we aren't going to use
it now. There's no very good reason to think that a definition we
settled on today would be exactly the right thing for whatever future
need might appear. Better to wait till we have a concrete example to
design around.

regards, tom lane

#32Tom Lane
tgl@sss.pgh.pa.us
In reply to: Teodor Sigaev (#30)
Re: [PATCHES] GIN improvements

I wrote:

Yeah, I was going to complain about that next :-). Autovacuum isn't
going to trigger as a result of INSERT operations; somehow we have
to teach it what to do for GIN indexes. I remember we discussed this
at PGCon but I don't think we decided exactly what to do...

One simple idea is to call aminsertcleanup (probably renamed to
something else like amanalyzehook) during ANALYZE. This seems a bit
grotty, but it has the very attractive property that we don't need to
give the autovacuum control logic any special knowledge about GIN
indexes. Either inserts or updates will lead it to trigger either
auto-ANALYZE or auto-VACUUM, and either way GIN gets a cleanup
opportunity.

A possible argument against this is that if we later fix things so
that VACUUM and ANALYZE can happen concurrently on the same table,
amanalyzehook could get called concurrently with ambulkdelete or
other vacuum-support operations. So the AM author would have to
take care to interlock that safely. But this doesn't seem like
a big deal to me --- interlocks against regular inserts/updates
are probably a harder problem anyway.

Thoughts?

regards, tom lane

#33Tom Lane
tgl@sss.pgh.pa.us
In reply to: Teodor Sigaev (#24)
Re: [PATCHES] GIN improvements

Teodor Sigaev <teodor@sigaev.ru> writes:

Updated: http://www.sigaev.ru/misc/fast_insert_gin-0.9.gz

Here is the GIN fast-insert patch back again. Changes:

* Sync with CVS HEAD
* Clean up documentation and some of the code comments
* Fix up custom reloptions code
* Suppress some compiler warnings

I didn't get much further than that because I got discouraged after
looking at the locking issues around the pending-insertions list.
It's a mess:

* shiftList() holds an exclusive lock on metapage throughout its run,
which means that it's impossible for two of them to run concurrently.
So why bother with "concurrent deletion" detection?

* shiftList does LockBufferForCleanup, which means that it can be blocked
for an indefinitely long time by a concurrent scan, and since it's holding
exclusive lock on metapage no new scans or insertions can start meanwhile.
This is not only horrid from a performance standpoint but it very probably
can result in deadlocks --- which will be deadlocks on LWLocks and thus
not even detected by the system.

* GIN index scans release lock and pin on one pending-list page before
acquiring pin and lock on the next, which means there's a race condition:
shiftList could visit and delete the next page before we get to it,
because there's a window where we're holding no buffer lock at all.
I think this isn't fatal in itself, since presumably the data in the next
page has been moved into the main index and we can scan it later, but the
scan code isn't checking whether the page has been deleted out from under
it.

* It seems also possible that once a list page has been marked
GIN_DELETED, it could be re-used for some other purpose before a
scan-in-flight reaches it -- reused either as a regular index page or as a
new list page. Neither case is being defended against. It might be that
the new-list-page case isn't a problem, or it might not.

* There is a bigger race condition, which is that after a scan has
returned a tuple from a pending page, vacuum could move the index entry
into the main index structure, and then that same scan could return that
same index entry a second time. This is a no-no, and I don't see any easy
fix.

I haven't really finished reviewing this code, but I'm going to bounce it
back to you to see if you can solve the locking problems. Unless that can
be made safe there is no point doing any more work on this patch.

regards, tom lane

#34Alvaro Herrera
alvherre@commandprompt.com
In reply to: Tom Lane (#33)
Re: [PATCHES] GIN improvements

Tom Lane wrote:

Teodor Sigaev <teodor@sigaev.ru> writes:

I didn't get much further than that because I got discouraged after
looking at the locking issues around the pending-insertions list.
It's a mess:

These are rather severe problems. Maybe there's a better solution, but
perhaps it would be good enough to lock out concurrent access to the
index while the bulkinsert procedure is working.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#35Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#34)
Re: [PATCHES] GIN improvements

Alvaro Herrera <alvherre@commandprompt.com> writes:

Tom Lane wrote:

It's a mess:

These are rather severe problems. Maybe there's a better solution, but
perhaps it would be good enough to lock out concurrent access to the
index while the bulkinsert procedure is working.

Ugh...

The idea I was toying with was to not allow GIN scans to "stop" on
pending-insertion pages; rather, they should suck out all the matching
tuple IDs into backend-local memory as fast as they can, and then return
the TIDs to the caller one at a time from that internal array. Then,
when the scan is later visiting the main part of the index, it could
check each matching TID against that array to see if it'd already
returned the TID. (So it might be an idea to sort the TID array after
gathering it, to make those subsequent checks fast via binary search.)
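
A rough sketch of that gather, sort, and binary-search idea follows. It is only an illustration, not code from the patch: `DemoTid` and the `demo_*` functions are hypothetical stand-ins for the real item-pointer type and scan-state code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a heap tuple ID: (block number, offset). */
typedef struct DemoTid {
    uint32_t block;
    uint16_t offset;
} DemoTid;

/* Total order on TIDs: by block number, then by offset. */
int demo_tid_cmp(const void *a, const void *b)
{
    const DemoTid *x = a;
    const DemoTid *y = b;

    if (x->block != y->block)
        return x->block < y->block ? -1 : 1;
    if (x->offset != y->offset)
        return x->offset < y->offset ? -1 : 1;
    return 0;
}

/* Sort the TIDs gathered from the pending list, once, after collecting. */
void demo_sort_pending(DemoTid *tids, size_t n)
{
    assert(tids != NULL || n == 0);
    if (n > 1)
        qsort(tids, n, sizeof(DemoTid), demo_tid_cmp);
}

/* True if 'tid' was already returned from the pending list. */
bool demo_already_returned(const DemoTid *sorted, size_t n, DemoTid tid)
{
    if (n == 0)
        return false;
    return bsearch(&tid, sorted, n, sizeof(DemoTid), demo_tid_cmp) != NULL;
}
```

While visiting the main part of the index, the scan would call `demo_already_returned()` for each matching TID and skip those for which it returns true.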

This would cost in backend-local memory, of course, but hopefully not
very much. The advantages are the elimination of the deadlock risk
from scan-blocks-insertcleanup-blocks-insert, and fixing the race
condition when a TID previously seen in the pending list is moved to
the main index. There are still a number of locking issues to fix
but I think they're all relatively easy to deal with.

regards, tom lane

#36Teodor Sigaev
teodor@sigaev.ru
In reply to: Tom Lane (#33)
Re: [PATCHES] GIN improvements

* shiftList() holds an exclusive lock on metapage throughout its run,
which means that it's impossible for two of them to run concurrently.
So why bother with "concurrent deletion" detection?

Because the metapage is locked only immediately before the shiftList call. While
the metapage is unlocked, another process could lock the metapage and run
shiftList. So when shiftList starts, it should check for an already-deleted
page. If shiftList sees an already-deleted page, it does nothing and reports
that to the caller.

* shiftList does LockBufferForCleanup, which means that it can be blocked
for an indefinitely long time by a concurrent scan, and since it's holding
exclusive lock on metapage no new scans or insertions can start meanwhile.
This is not only horrid from a performance standpoint but it very probably
can result in deadlocks --- which will be deadlocks on LWLocks and thus
not even detected by the system.

Oops, I see a possible scenario: UPDATE tbl SET gin_indexed_field = ... where
gin_indexed_field .... with a concurrent shiftList. Will fix. Thank you.

Nevertheless, shiftList should be fast in the typical scenario: it doesn't do
complicated work, it just marks as deleted the pages that have already been read.

* GIN index scans release lock and pin on one pending-list page before
acquiring pin and lock on the next, which means there's a race condition:
shiftList could visit and delete the next page before we get to it,
because there's a window where we're holding no buffer lock at all.

Agree, will fix.

* It seems also possible that once a list page has been marked
GIN_DELETED, it could be re-used for some other purpose before a
scan-in-flight reaches it -- reused either as a regular index page or as a

Impossible, because deletion runs from the head of the list, and so does the
scan. But deletion locks the metapage and locks pages for cleanup. So a scan can
start only from a not-yet-deleted page and will go through the list ahead of
the deletion process.

* There is a bigger race condition, which is that after a scan has
returned a tuple from a pending page, vacuum could move the index entry
into the main index structure, and then that same scan could return that
same index entry a second time. This is a no-no, and I don't see any easy
fix.

Hmm, isn't that allowed for indexes? At least GiST has had this behaviour since
its birth.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

#37Tom Lane
tgl@sss.pgh.pa.us
In reply to: Teodor Sigaev (#36)
Re: [PATCHES] GIN improvements

Teodor Sigaev <teodor@sigaev.ru> writes:

* There is a bigger race condition, which is that after a scan has
returned a tuple from a pending page, vacuum could move the index entry
into the main index structure, and then that same scan could return that
same index entry a second time. This is a no-no, and I don't see any easy
fix.

Hmm, isn't it allowed for indexes? At least GiST has this behaviour from its
birth date.

Really? Then GiST needs to be fixed too. Otherwise you risk having
queries return the same row twice. A bitmap indexscan plan would mask
such an index bug ... but a plain indexscan won't.

regards, tom lane

#38Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#37)
Re: [PATCHES] GIN improvements

I wrote:

Really? Then GiST needs to be fixed too. Otherwise you risk having
queries return the same row twice. A bitmap indexscan plan would mask
such an index bug ... but a plain indexscan won't.

BTW, there's another issue I forgot about yesterday, which is that
the planner assumes that all index AMs work correctly for backwards
scan. The place where the rubber meets the road here is that
if you DECLARE SCROLL CURSOR for a plan implemented as a plain
indexscan, then FETCH BACKWARDS is supposed to reliably generate
results consistent with previous FETCH FORWARDS, to wit the same
tuples in the reverse order.

We can assume that the query is using an MVCC snapshot, which means
that at the index level it's okay for the index to return newly-inserted
entries that weren't returned in the previous forward scan, or to not
return entries that were removed meanwhile by VACUUM. But re-ordering
live tuples is bad news.

The idea of copying the pending-tuples list into local scan state would
make this work as expected as far as the proposed patch goes, but I'm
wondering whether the behavior isn't completely broken anyway by
operations such as page splits. Do we need to change the planner to
assume that this only works nicely for btree?

regards, tom lane

#39Teodor Sigaev
teodor@sigaev.ru
In reply to: Tom Lane (#38)
Re: [PATCHES] GIN improvements

operations such as page splits. Do we need to change the planner to
assume that this only works nicely for btree?

It seems to me that direction (backward or forward) has meaning only for indexes
with amcanorder = true. With amcanorder = false, results will come in arbitrary
order for either direction.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

#40Tom Lane
tgl@sss.pgh.pa.us
In reply to: Teodor Sigaev (#39)
Re: [PATCHES] GIN improvements

Teodor Sigaev <teodor@sigaev.ru> writes:

operations such as page splits. Do we need to change the planner to
assume that this only works nicely for btree?

It seems to that direction (backward or forward) has meaning only for
indexes with amcanorder = true. With amcanorder=false results will be
occasionally for any direction.

Well, no; amcanorder specifies that the index can return results that
are sorted according to some externally meaningful ordering. The
question at hand is just whether the results of a single indexscan
are self-consistent. That's a property that can reasonably be expected
to hold regardless of amcanorder; it does hold for hash indexes for
instance. (In the case of hash we have to forbid splitting a bucket
that's actively being scanned in order to make it true.)

regards, tom lane

#41Teodor Sigaev
teodor@sigaev.ru
In reply to: Tom Lane (#37)
Re: [PATCHES] GIN improvements

queries return the same row twice. A bitmap indexscan plan would mask
such an index bug ... but a plain indexscan won't.

Fuh. :(. Well. Will fix.

GiST:
- GiST already supports both scan directions in theory, but a page split may
change the order between forward and backward scans (the user-defined pageSplit
doesn't preserve the order of tuples). Holding off splits until the end of a scan
would produce an unacceptable concurrency level.
- GiST can return one itempointer twice. That's fixable by storing the content of
the current page in memory instead of just keeping the page pinned. Will do
(backpatches too).

GIN:
- GIN doesn't support the backward scan direction and will not support it in the
near future.
- Right now GIN doesn't return the same itempointer twice, but with the current
fast_insert patch it might. So, I suppose, to fix that it's enough just to
remember the itempointers returned from the pending list and use them as a filter
for results from the regular structure. Will do.
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

#42Tom Lane
tgl@sss.pgh.pa.us
In reply to: Teodor Sigaev (#41)
Re: [PATCHES] GIN improvements

Teodor Sigaev <teodor@sigaev.ru> writes:

- GiST already supports both scan directions in theory, but page split may
change order between forward and backward scans (user-defined pageSplit doesn't
preserve order of tuples). Holding of split until end of scan will produce
unacceptable concurrency level.

- GIN doesn't support backward scan direction and will not support in close
future.

Okay. I'll see about fixing the planner to not assume that GIST or GIN
indexscans are scrollable.

The cleanest way to do this is to introduce a new bool column in pg_am
rather than hard-wiring assumptions about which AMs can do it. However
(a) that's not back-patchable and (b) it'll create a merge conflict with
your patch, if you're still going to add a new AM function column.
I think that aminsertcleanup per se isn't needed, but if we want an
"amanalyze" there'd still be a conflict. Where are we on that?

regards, tom lane

#43Teodor Sigaev
teodor@sigaev.ru
In reply to: Tom Lane (#42)
Re: [PATCHES] GIN improvements

(a) that's not back-patchable and (b) it'll create a merge conflict with
your patch, if you're still going to add a new AM function column.
I think that aminsertcleanup per se isn't needed, but if we want an
"amanalyze" there'd still be a conflict. Where are we on that?

I'll revert the aminsertcleanup framework but leave the gininsertcleanup function
as is, because I won't have enough time before the end of summer - I'd like to
finalize the patch and fixes first.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

#44Teodor Sigaev
teodor@sigaev.ru
In reply to: Teodor Sigaev (#36)
1 attachment(s)
Re: [PATCHES] GIN improvements

Reworked version of fast insertion patch for GIN.

* shiftList does LockBufferForCleanup, which means that it can be blocked
for an indefinitely long time by a concurrent scan, and since it's holding
exclusive lock on metapage no new scans or insertions can start meanwhile.
This is not only horrid from a performance standpoint but it very probably
can result in deadlocks --- which will be deadlocks on LWLocks and thus
not even detected by the system.

Ops, I see possible scenario: UPDATE tbl SET gin_indexed_field = ...
where gin_indexed_field .... with concurrent shiftList. Will fix. Thank
you.

Fixed, see below

* GIN index scans release lock and pin on one pending-list page before
acquiring pin and lock on the next, which means there's a race condition:
shiftList could visit and delete the next page before we get to it,
because there's a window where we're holding no buffer lock at all.

Agree, will fix.

Fixed

* There is a bigger race condition, which is that after a scan has
returned a tuple from a pending page, vacuum could move the index entry
into the main index structure, and then that same scan could return that
same index entry a second time. This is a no-no, and I don't see any easy
fix.

Fixed. A TIDBitmap is used for that, and also to prevent the deadlock mentioned
above. The TIDBitmap collects the matching tuples from the pending pages, and
after that it is used as a filter for the results of the regular GIN scan.

The patch extends the TIDBitmap interface with 2 functions:
bool tbm_check_tuple(TIDBitmap *tbm, const ItemPointer tid);
returns true if tid already exists in the bitmap
bool tbm_has_lossy(TIDBitmap *tbm);
returns true if the bitmap has become lossy

Also, the sequential scan of a pending page is replaced with a binary search.
It's not a big win, but it might improve performance.
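
To illustrate how an exact bitmap with those two queries can act as a filter, here is a toy model. It is a sketch under stated assumptions and does not reproduce PostgreSQL's TIDBitmap: `ToyBitmap`, `ToyTid` and the `toy_*` functions are hypothetical, the capacity is fixed, and the bitmap simply flags itself lossy when it runs out of room.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_PAGES   8   /* exact page entries before going lossy */
#define TOY_MAX_OFFSETS 64  /* offsets 1..64 are tracked per page */

typedef struct ToyTid {
    uint32_t block;
    uint16_t offset;
} ToyTid;

typedef struct ToyBitmap {
    uint32_t blocks[TOY_MAX_PAGES];
    uint64_t bits[TOY_MAX_PAGES];
    size_t   npages;
    bool     lossy;
} ToyBitmap;

void toy_init(ToyBitmap *bm)
{
    bm->npages = 0;
    bm->lossy = false;
}

/* Remember a TID found on a pending page. */
void toy_add_tuple(ToyBitmap *bm, ToyTid tid)
{
    assert(tid.offset >= 1 && tid.offset <= TOY_MAX_OFFSETS);
    for (size_t i = 0; i < bm->npages; i++) {
        if (bm->blocks[i] == tid.block) {
            bm->bits[i] |= UINT64_C(1) << (tid.offset - 1);
            return;
        }
    }
    if (bm->npages == TOY_MAX_PAGES) {
        bm->lossy = true;   /* out of room: exact membership is lost */
        return;
    }
    bm->blocks[bm->npages] = tid.block;
    bm->bits[bm->npages] = UINT64_C(1) << (tid.offset - 1);
    bm->npages++;
}

/* Analogue of tbm_check_tuple: true if tid is already in the bitmap. */
bool toy_check_tuple(const ToyBitmap *bm, ToyTid tid)
{
    if (tid.offset < 1 || tid.offset > TOY_MAX_OFFSETS)
        return false;
    for (size_t i = 0; i < bm->npages; i++) {
        if (bm->blocks[i] == tid.block)
            return ((bm->bits[i] >> (tid.offset - 1)) & 1) != 0;
    }
    return false;
}

/* Analogue of tbm_has_lossy: true once exact membership was lost. */
bool toy_has_lossy(const ToyBitmap *bm)
{
    return bm->lossy;
}
```

In this model, the regular scan would skip any TID for which `toy_check_tuple()` returns true, and the filter can only be trusted while `toy_has_lossy()` is false.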
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

Attachments:

fast_insert_gin-0.15.gz (application/x-tar)

#45Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Teodor Sigaev (#44)
Re: [PATCHES] GIN improvements

There's a pretty fundamental issue with this patch, which is that while
buffering the inserts in the "list pages" makes the inserts fast, all
subsequent queries become slower until the tuples have been properly
inserted into the index. I'm sure it's a good tradeoff in many cases,
but there has got to be a limit to it. Currently, if you create an empty
table, and load millions of tuples into it using INSERTs, the index
degenerates into just a pile of "fast" tuples that every query needs to
grovel through. The situation will only be rectified at the next vacuum,
but if there's no deletes or updates on the table, just inserts,
autovacuum won't happen until the next anti-wraparound vacuum.

To make things worse, a query will fail if all the matching
fast-inserted tuples don't fit in the non-lossy tid bitmap. That's
another reason to limit the number of list pages; queries will start
failing otherwise.

Yet another problem is that if so much work is offloaded to autovacuum,
it can tie up autovacuum workers for a very long time. And the work can
happen at an unfortunate time, when the system is busy, and affect other
queries. There's no vacuum_delay_point()s in gininsertcleanup, so
there's no way to throttle that work.

I think we need a hard limit on the number of list pages, before we can
consider accepting this patch. After the limit is full, the next
inserter can flush the list, inserting the tuples in the list into the
tree, or just fall back to regular, slow, inserts.
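
The two options can be sketched as a small decision function. This is only an illustration of the proposal, not code from the patch: the enum, the function name and the `flush_on_limit` switch are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* What an inserter does with a new tuple (hypothetical names). */
typedef enum DemoInsertAction {
    DEMO_APPEND_TO_LIST,     /* fast path: add to the pending list */
    DEMO_FLUSH_THEN_APPEND,  /* limit hit: move the list into the tree first */
    DEMO_REGULAR_INSERT      /* limit hit: bypass the list, insert into the tree */
} DemoInsertAction;

/*
 * Choose the action for the next insert, given the current number of
 * pending-list pages and a hard limit on that number.
 */
DemoInsertAction demo_choose_insert(uint32_t pending_pages,
                                    uint32_t max_pending_pages,
                                    bool flush_on_limit)
{
    assert(max_pending_pages > 0);
    if (pending_pages < max_pending_pages)
        return DEMO_APPEND_TO_LIST;
    return flush_on_limit ? DEMO_FLUSH_THEN_APPEND : DEMO_REGULAR_INSERT;
}
```

Either way, the cost of a query stays bounded, because it never has to read more than `max_pending_pages` list pages.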

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#46Gregory Stark
stark@enterprisedb.com
In reply to: Heikki Linnakangas (#45)
Re: [PATCHES] GIN improvements

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

I think we need a hard limit on the number of list pages, before we can
consider accepting this patch. After the limit is full, the next inserter can
flush the list, inserting the tuples in the list into the tree, or just fall
back to regular, slow, inserts.

I do like the idea of having the work fall to vacuum though. Perhaps we need
some way for autovacuum to ask an access method what shape an index is in and
whether it needs vacuuming? Or more likely a separate command from vacuum that
specifically cleans up an index.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's On-Demand Production Tuning

#47Alvaro Herrera
alvherre@commandprompt.com
In reply to: Gregory Stark (#46)
Re: [PATCHES] GIN improvements

Gregory Stark wrote:

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

I think we need a hard limit on the number of list pages, before we can
consider accepting this patch. After the limit is full, the next inserter can
flush the list, inserting the tuples in the list into the tree, or just fall
back to regular, slow, inserts.

I do like the idea of having the work fall to vacuum though. Perhaps we need
some way for autovacuum to ask an access method what shape an index is in and
whether it needs vacuuming? Or more likely a separate command from vacuum that
specifically cleans up an index.

Yeah, this is what we agreed to in Ottawa. We need to collect some
different stats for the GIN indexes where this is active, and ensure
that autovacuum checks them.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#48Teodor Sigaev
teodor@sigaev.ru
In reply to: Heikki Linnakangas (#45)
Re: [PATCHES] GIN improvements

grovel through. The situation will only be rectified at the next vacuum,
but if there's no deletes or updates on the table, just inserts,
autovacuum won't happen until the next anti-wraparound vacuum.

There is no agreement on this point, see
http://archives.postgresql.org/message-id/2818.1216753264@sss.pgh.pa.us

Yet another problem is that if so much work is offloaded to autovacuum,
it can tie up autovacuum workers for a very long time. And the work can
happen on an unfortunate time, when the system is busy, and affect other
queries. There's no vacuum_delay_point()s in gininsertcleanup, so
there's no way to throttle that work.

Will insert.

I think we need a hard limit on the number of list pages, before we can
consider accepting this patch. After the limit is full, the next
inserter can flush the list, inserting the tuples in the list into the
tree, or just fall back to regular, slow, inserts.

A hard limit is not a very good solution:
- If an insert has to flush the list when the limit is reached, then sometimes an
insert or update will take an unacceptable amount of time. A small limit is not
very helpful, and a large one will take a lot of time. Although if we calculate
the limit from the work_mem setting, then maybe it will be useful: bulk insert
will collect all pending pages in memory at once.
- Falling back to regular insert will take a long time for an update of the whole
table, and that was one of the reasons for the patch. Users forget to drop the
GIN index before a global update, and the query runs forever.
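
As a back-of-the-envelope illustration of deriving such a limit from work_mem, the function below just divides the memory budget by the page size. It is a sketch, not code from the patch: the function name is hypothetical, and it assumes work_mem is given in kilobytes and that one pending page costs one block of memory.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of pending-list pages that fit into the work_mem budget, so that
 * a flush can hold the whole list in memory at once.
 */
uint64_t demo_pending_page_limit(uint64_t work_mem_kb, uint32_t block_size)
{
    assert(block_size > 0);
    return (work_mem_kb * 1024) / block_size;
}
```

For example, with a 1 MB budget and 8 kB pages, the limit would be 128 pages.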

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

#49Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Teodor Sigaev (#48)
Re: [PATCHES] GIN improvements

Teodor Sigaev wrote:

- Falling back to regular insert will take long time for update of whole
table - and that was one of reasons of that patch. Users forget to drop
GIN index before a global update and query runs forever.

If *that* is a use case we're interested in, the incoming tuples could
be accumulated in backend-private memory, and inserted into the index at
commit. That would be a lot simpler, with no need to worry about
concurrent inserts or vacuums.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#50Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#49)
Re: [PATCHES] GIN improvements

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

Teodor Sigaev wrote:

- Falling back to regular insert will take long time for update of whole
table - and that was one of reasons of that patch. Users forget to drop
GIN index before a global update and query runs forever.

If *that* is a use case we're interested in, the incoming tuples could
be accumulated in backend-private memory, and inserted into the index at
commit. That would be a lot simpler, with no need to worry about
concurrent inserts or vacuums.

Doesn't work --- the index would yield wrong answers for later queries
in the same transaction.

regards, tom lane

#51Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#50)
Re: [PATCHES] GIN improvements

Tom Lane wrote:

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

If *that* is a use case we're interested in, the incoming tuples could
be accumulated in backend-private memory, and inserted into the index at
commit. That would be a lot simpler, with no need to worry about
concurrent inserts or vacuums.

Doesn't work --- the index would yield wrong answers for later queries
in the same transaction.

Queries would still need to check the backend-private list.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#52Greg Stark
greg.stark@enterprisedb.com
In reply to: Heikki Linnakangas (#51)
Re: [PATCHES] GIN improvements

On 3 Dec 2008, at 06:57 AM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:

Tom Lane wrote:

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

If *that* is a use case we're interested in, the incoming tuples
could be accumulated in backend-private memory, and inserted into
the index at commit. That would be a lot simpler, with no need to
worry about concurrent inserts or vacuums.

Doesn't work --- the index would yield wrong answers for later queries
in the same transaction.

Queries would still need to check the backend-private list.

More to the point -- at least if I'm guessing right about Tom's
thoughts -- queries would still have to check the heap. That is, the
backend-private list would just be a proxy for buffered *index* tuples.

If we do this though it would be really nice to do it at a higher
level than the indexam. If we could do it for any indexam that
provides a kind of bulk insert method that would be great.

I'm just not sure how to support all the indexable operators for the
various indexams on the local buffered list.

Incidentally buffering btree index inserts was originally Heikki's idea.

#53Tom Lane
tgl@sss.pgh.pa.us
In reply to: Greg Stark (#52)
Re: [PATCHES] GIN improvements

Greg Stark <greg.stark@enterprisedb.com> writes:

If we do this though it would be really nice to do it at a higher
level than the indexam. If we could do it for any indexam that
provides a kind of bulk insert method that would be great.

I'm just not sure how to support all the indexable operators for the
various indexams on the local buffered list.

In principle, just return all those TIDs marked "lossy, please recheck".
This is a bit brute-force but I'm not sure any useful optimization is
possible.

regards, tom lane

#54Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#53)
Re: [PATCHES] GIN improvements

Tom Lane wrote:

Greg Stark <greg.stark@enterprisedb.com> writes:

If we do this though it would be really nice to do it at a higher
level than the indexam. If we could do it for any indexam that
provides a kind of bulk insert method that would be great.

I'm just not sure how to support all the indexable operators for the
various indexams on the local buffered list.

In principle, just return all those TIDs marked "lossy, please recheck".
This is a bit brute-force but I'm not sure any useful optimization is
possible.

You could flush the local buffer to the index whenever the index is
queried. Not sure if it's better than returning them for recheck, though.

This wouldn't work for unique indexes, BTW, but that's not a problem for
GIN.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#55Teodor Sigaev
teodor@sigaev.ru
In reply to: Heikki Linnakangas (#45)
1 attachment(s)
Re: [PATCHES] GIN improvements

Changes:
- added vacuum_delay_point() in gininsertcleanup
- added a trigger to run vacuum based on the number of tuples inserted since
the last vacuum. That number represents the tuples actually inserted into
the index, and it is calculated as
tuples_inserted + tuples_updated - tuples_hot_updated.
The trigger fires when instuples > vac_base_thresh, because search time is
linear in the number of pending pages (tuples)
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

Attachments:

fast_insert_gin-0.17.gz (application/x-tar)
<���� ����n��^��h.�6�]h�p�kr�nv���.����O���o+w��&kn'�Z�/��y3��i�)����BMN��j�2���
L�ZM�yNn��i�eE���l��u�r��X����h�R
�vA3u8�����L��A�o%>��Y�LE��59�;!���}Xph��r�
5���r�A����O��;���p$������:9���p�
sV����`\�T��Yj���_&/%���]�"_��}L1���������W�z���������M�)o�F��z�(��s�B�x��Z�3q�(��QEf�z9�^������<�=L_p;(�nq5��VPL.u���4[����3Q;;�|u'�p�����Pm.������9Xj�����������B���,�bU�qo�|������<�u�b��"^����cI�
�������G��r�T�5�`Le��"�a��Z'�8M�����?��"�x���+��R\������=t*'i����?Z����{r[�������hT�5*+�:��M`�Wv
z"�;�
�8�Vg�������0;�>m�@���{�[*�!t� 43�JV�
�?����,;]�:�y[���gv||zqt~IB��oN+�O��_�t����=#Xd��$I�w��� �r��������'�b�_��x:p�s-����Fs��c_n����f�O��	&.��0��1t�	/��L��
���5����L`���m(�<��J`� �(��0�qQA�l��+�d��*����+D�%���!0�8�X��������U���7�<�����#=Q�K��b�2,�C[��y�.��G��z��@����B�lf�`��jz�	���6m��
����J�M�������������U?C���l�X^x~��y���u�]�����.9���-�
qp!�X��)������
��ys~��8+�,�\Ye��0���2u�e�#�3�2�>�������}�v_c�|0����U]�Q�����hWh��E}��!�{_(�Sy����{n/��z}m�����z��M�FnR��1r����<|�w����9V�]*�_aN�����D]�!����!��&��
�o�������=h�����T��������Dn3����!���|���m	�;	#D��i-��?|��U�A���4�`���t�W1�c��?�I2��D��W��R��C��omc(:~�o~�^�wrt�}wtN�������H�������
Q��D��������oh��)���N�����������-yaQ�;.I�J��L��]�exb�wR���$������}���?���:}����"�aSB����P��,�.xD�,{�U<a������m���YC�eK.��'��f�����|�u��aj������[a9�|��b��5'x:�/ �u�'���c[�{+}�".ph��fDI�W&�����*��w�d���V'q
�1���,����������F���h+�U�8���v����c��G��PX��&�4�V&n��
O�=��9�Q���#���R�����d����U7�59If���X7 @D]��bMvO����iO	C����q<��+�+�o|�iT���s� Q
�r}����9�d�a8���3�
���pq��D��(��Rc�P����V��H��
C��=��}���|aO���p|����p�f~86\-�����*}�g'q�"_�P���ap�A��/u����T�&t�2���n��v�c�q��L${�7�^��*��!�lwcu�\v#9�&�h�,TC��3;�^�~b�'��w��r��_���'A���m���m�^}l6�G
(5Ee�]�����i���
:N~�SM�mnc��l���ze������p���Y���	Q�w���p�Zye���*�EW�t�t����ML�����;�`�h�}Xx�r�R�7�f\����V��`��2�����������BsGJ�"�I����(��%��8���M�N'w����X=�#�_6?v�5;=f���&������!=W���m���A�:W�2�p[����^=SL����A�T"���A��)��o� �} �w��
R-�`7��+5�*�O�������w�E�V!ngu��S@�E��]n?��
y@�z�Ma(G��PG��WH*#<JLJ%$�2r]K6�����/�����3Nz�4����7���Qh(�B���
�SV������mk��K���D&��7xT��iX�+*��
XE}:0o��*�Sy��O� Q�v������p�F$��?�R�z��(,���>Sm�*�Q��67=��<N����5���&�*�k��_j��S����^�
�#�9��s���G�������K�T�V���[��t.��Dm��tb��td��z��>�/U[X�Y����?��8��Q���d���������3:)a���&������pk�e��������
�:�h{�
E.?
I���49��������TY6�����^�A�e�T#�-$�m�0�7����!����6�T/)m�5)�@=8�>���x��.d������U��X��#Of�J������Ps��z"����j$+����My>�
F�B����q�����g���m���o ��-l�#H`edS�*z��0�P	��
�[#g�X�I�Fl��Z-��r���x6 ��<t	�Q)s��NoYj��2G����A?���`:��9����_�)���l6�j6����\�iv�����c�;�^nm��J�`')Y��/Y�]"��gQ�F�K��d����5	v��FM'����kL���_�����Qd`o��5��-j��x��Pz���4��db+&S}m�~^��5����A��\V�ef��j�%�,S|Z��[7DI��n}5<����X�t����^Me�O�Ai�~
C���gH�
:C*��}C��4�:�-����p��:9��oJqI���*f�w��tC��\�����@���E5�b<�i���b�hjtzb�p�1N>�}@��E�=����,g��������O���L~�0@S��/��%S��+�Ezrr~�����\1
>�DJC���%����w%|�,��2���U��=��0}� �b�J�%�0`<���:oJ[B\��<�k�e?�{bE+�2�!o3�C��1�[8��#mUI�$�KeG����O���Ws�����������l���1m��D����!�~K�������_�. BM���$���|��i.hC������&+RB`�������c"���*�N��S�L�b����0p�2��"�/?��O-�(�y�R�}������v!c�Uw�z����]��s����E��9���Xs=��J�T�q�a�|��N������Gz���f�N���j7���SA���e|�@�%��
V��@������tpr������
���!%4�C�����1,���?9��I���fZN����\d����I���P����'��co�b:���9����Ev,�S��~�������1L��cn�{�N��]��9�)�{��r������4����N���(335"$���^����lO�sy�����r�g��#����������_�w��?���������FO����F���i�����N����N���Q9�tD!�\��d��4�\�}�Z�J�BX�f����j����#s��y�N�z��ek|i�����V)������X8����J(7�\y�OF�����Q��N�X/���p�����x8�oSpGy�d��������� e{&R�����������%I%�?���`���GX<[s���:�����#Pc������=�X��a���k�%7B)���S��@����������x~�~�eX��q�7������U��`�C_�z:(k@g���Q������C�@�Os_�~�M��hI���d�=��V�O���G������W�������|������6������^���f�������H��
�����vw����`j��������ln�[*��U3����t;>a�K�������L�3-�?�ar�z,��A����2�H\�m�8�A��-����0>����}�-[$V1�N��.aJF,~cO�(�
�79=.��/���F���o]I�r:�z��wj��VS�)+<o��z����>n9�}��t={������7�����u�r��T��$]�H�G>W{�h���^Zu���^o:�����������w���8�)H/�+&��
�G_�����dI|E���C��t:Jm_���]�������B���}���PX�J���UW0V`G���z��4��p�d
r�F�����~��?�����	�*���x����JV,�xr�\&L�su4p0�g0�����M���Br��B)����x���J+�R�������/�=�WG���E
pO���>��>�Z,�F>q��$y�����=�����������^�����Q�#�`�������)}����������Q]��9o���|�U�
])��O����<��O��EW�
3���|���'�����{�2���g���0�v�����
��"��s�1�v�=��13{�tfs�W�����
�~�U�� �5����kc��i���Wm�Op}���;^�uL!�'-Y�-?��I��F=���\/��g�����*P���m������My_��
����u)M=��*}]�5�NN�`�<��~`\�2���_��;�\t�O�*���s�b�Wn�,���9��*0h���g`�B�[�B>�-W�pD���0�V��:�����h+��&Sh��S���9u\Z�S�H:����s���'I������5}�'��������?������%��`��[ILq�x�����i��R���.��t�a�����d�5W+�N�����Th��e7�=Wi�)���:�*����
�%6�?�R����h6���[�� �V��/�����Z�������c�6�w��J�^}��_`N[0/[?0p�*�|��`�^L�������
|pD��?��qI�����*�����0��@6%�W�pAM\G 2��R�fX������up
��������r����97`g��`"�T"�^�����2Q��+Q7��)4\�
���Ew=E��0F��Su�p����$�(`��Z��6���z�i������N��?u�������a?W��`2o��iB%L�z������������)�@l�a�O-x,���*7��jdW�p��'G�/u�~���X��`�Rl'�Mt�w0�wQ?8??����%�=�����L���;h�(�4�d�h��a/{�i�[4��
��tT/!���_E��;:�S/�/�����Y��-R�w�J����ip"�����'__���D=����g^j�|���4�N��D�,�����3;O�ib��)�1���L��-GY�rY�g�4���X��Y��N"��D;��9�xs~���x�wf���V�#=��o�(���lK�N�"���N����	v����7OB��i6��V���t���Y8.v�;��UP��w����������e�i���:�"�/����ZF�.=rP�U�
��[����~�Y"����BR�z�heV�����C�3�F=g��n��Q�V���j7z8�����L����7���E�i�b��5��]��O�i���R���)����O�)��6R�����C���(��%�_�����m�����B����(�����f��a)�>]���g�c�(�x�=O�;L���a�.7�e(����7�?����* �R"Y�)����Y��Y�B}��%ex3
K���g���Y��Q����Y5<*^��Q�
���V(-(jN��f�U1���v��}�U�N^��u�(L��G��xl�/����-N�v���$S���$���F��#�k��zC�{��D3]�����2����*^����:(QaY�4�Z�2���MR1������u�:����x�/���e��1������y�d��o���
 �������Ak�;�KB�-r�I�o�kI�G���+�H���(�{p�P����s�V�3�,�)r]h%�V��S���P��d�Cd��v�zsP��k"*�������/�w�>�Y���>��Y��
Mr4�s{�*�Hx�bt(��K��X���w����=c���e|�+�/3�	�b�@>��'k �X�'3/{4w1w����'��PHzIv��F0R�
}��4��&�����\�&4@]�����-4�[h����o��B��������e7����~�)�S�-��[L����o1�F�b
�t�<�b
����x���;@�b��|�9�s�-���D�sH�o1	�b��$|�I���-&�[L������Ip�6�c�e�#�����6�ayI�=�������O�$�_���x��s�?�y����j��sV~@���7�M�����
�p��FN���v��UnQ����'�5��/V�[i�!���!~�k����(��2����"y_��j4���*��G��G�i�[��i���������|_���B�V"�;E����F�G	S��f�T��)+��a��aL�u���][��[Y��l]�\U�:b�����������7;x���T=�t����/"������L�W��?�I���X^��>F�i0R�����bDc��Z���bu�e��X��
m���<*}��������6������/�^��F�o�O_��/V�2�b�����H'a/�����#_�5��JQg�&�~����'��k��Z/����f.p��M,/nmY���0�18�j�7��c��S�b��/^5T�]���pU�����u��<��5�;[:/����.����]��N@;�@����|�	^���@O����7Dr��
�hu@��[�1�@Q��r{7����a�����\���d.F�����\D�tB!a/~����/���	\��M����(��;��H^q&���_�6��F9"\�1<�?�8	��m��i����i�@pl#���1K�kw�\�F��"/E	Z����7{7�f��+���(�=�{�b�%���m��0�U�S����;��{���'����K�#/y��x.��Sq�@'�������X�w�Yj.����!b���t	�^9�s7�$��0J��h��|��D]_�XT����?N�s��������1P��.&��Q�_L�=�)��(f������A���u���5��N������<W%��;��$T�[��R�t���F�#��.��]����*#9�h�sw�^6�`pg�����f�����hI_M��K
�����i�u1�=�q���xA��@"yv��V#��Gh>sXrm������c8��������~?b�N_D��I����-��Kh�����1g�?������H�����������J���w-�H�c���]�Q�K��L�[����
\��C�����������z�����>�+�0g�h����	3����n���)��RX�`���	>��g!
�8��!'?�x���S@�*��a:7����RFK�R^�N��!����`����K4��C ����#7�8bp�Sj���Tc4j}3�Q���������C��S����6\��w����@��B�
	��}z'�� *�>�P�����*��H���K���iT-�1�d�� +>��h`��XV�+�r���x���</���������h�������w}��M���I��D�p��J���*	Lmh\�M*��7m}�1���b\���L�^�U}�H�S%���"y`�)17�Mj ���W�=�=����+>m/�c �>X��f��5DI�����#�{S��i���x�E���\��p�)}�`&��z���j��cA��9���%�|�N
Ry���+B�z����*�U� �Q+��^�/�3Sa���.�{lVo��[�a�l���g��~�����m������������
�� aB��;�b%iF�� uSh#��
O���8Z!P�V+v���%u�����T����@~�o�3���T������N����E�XyH��Ckt����G~��pe#��K������OP��*�)s<\�@����N������\7�C�����P����fk��c���f�������-n�lI8������r^������������rB��Xi�K�+b��-����Bk�$�x��/����N�f�1�Q�Q4�:�_c%���T����~0���`Z~$-��f)��Yy�� 	U?�����s~�l%�HP�\�U/�p��0y��y�a��b
��{)�?�Y�1��'�G������OX�=�}�����\;�~A���U�������h��P��j9.���`8Mo��Q<M-������[���3��2����A*��
��q�c'e^L9���$����	���b��-��5#pV���a�C���|��	YP���~	+�g������N/O���gW�kk���M{�lX��t\���I^wUJ+�����
H;�Q����I@_c�<~�"��-��2(�QHg�H���^���
=����o�����^Qf�����������������^�y�TL��z?WD�Q��?�.�~L�4��Nk���uA���y^I�c�I��f�P��ZW�ftRYY�Zon=3�	_��-3�hI��q����_9j"u���Q�"���d��q���&��n"m���$/��5�����x
*����l�Y���[�u%����5x����#4)8���R��8��%��Jy�m?����Z�$��7z��z��V�����Y�}��9��F�Q����m����C��<��`F��y?WXB��Z#�9�T&��N��V{�J�<
�
J��@>0�1��+�M���H�T-��J�~�7������u��C��Gy�{g�>v�H��E����@w�sn��S��dRNB��Yyau�	P����?�0�nx�
#56Jeff Davis
pgsql@j-davis.com
In reply to: Teodor Sigaev (#55)
1 attachment(s)
Re: [PATCHES] GIN improvements

On Fri, 2008-12-12 at 20:36 +0300, Teodor Sigaev wrote:

Changes:
- added vacuum_delay_point() in gininsertcleanup
- add trigger to run vacuum by number of inserted tuples from
last vacuum. Number of inserted tuples represents number
of really inserted tuples in index and it is calculated as
tuples_inserted + tuples_updated - tuples_hot_updated.
Trigger fires on instuples > vac_base_thresh because search time is linear
on number of pending pages (tuples)

Hi,

Comments:

1.

You use something like the following in a few places:

    START_CRIT_SECTION();
    ...
    l = PageAddItem(...);
    if (l == InvalidOffsetNumber)
        elog(ERROR, "failed to add item to index page in \"%s\"",
             RelationGetRelationName(index));

ERROR is no use here: inside a critical section it is promoted to PANIC,
which is obviously unacceptable. It looks to me like those conditions
can't happen anyway, so it's probably better to add a comment explaining
why they can't happen, and use Assert().
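
A minimal standalone sketch of that suggestion, assuming a toy page model rather than the real PostgreSQL buffer API. ToyPage, toy_page_add_item, toy_write_items and toy_crit_section_count are hypothetical names invented for illustration; only the shape (check what can fail before the critical section, assert inside it) is the point:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a page: tracks only free space and item count. */
typedef struct ToyPage
{
    size_t free_bytes;
    int    nitems;
} ToyPage;

/* Hypothetical stand-in for the critical-section nesting counter. */
static int toy_crit_section_count = 0;

/* Returns the 1-based slot of the new item, or 0 if it does not fit. */
static int
toy_page_add_item(ToyPage *page, size_t item_size)
{
    if (item_size > page->free_bytes)
        return 0;
    page->free_bytes -= item_size;
    page->nitems++;
    return page->nitems;
}

/*
 * Adds n items. Returns n on success, or -1 (a recoverable failure)
 * when the items cannot all fit; in that case the page is untouched.
 */
static int
toy_write_items(ToyPage *page, const size_t *sizes, int n)
{
    size_t total = 0;
    int    i;

    /* Everything that can fail is checked while failure is still cheap. */
    for (i = 0; i < n; i++)
        total += sizes[i];
    if (total > page->free_bytes)
        return -1;

    toy_crit_section_count++;   /* enter "critical section" */
    for (i = 0; i < n; i++)
    {
        int slot = toy_page_add_item(page, sizes[i]);

        /* Cannot fail: the total size was verified above. */
        assert(slot != 0);
        (void) slot;
    }
    toy_crit_section_count--;   /* leave "critical section" */

    return n;
}
```

With this shape a failure to fit is reported before any page is modified, and the check inside the critical section only documents an invariant.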

2. It appears to be properly triggering autovacuum when only inserts
happen, so I think that issue is solved.

3. Simple performance result with autovacuum off:

create table random(i int[]);
insert into random
  select ARRAY[(random() * 10)::int, (random() * 10)::int,
               (random() * 10)::int, (random() * 10)::int,
               (random() * 10)::int, (random() * 10)::int,
               (random() * 10)::int, (random() * 10)::int,
               (random() * 10)::int, (random() * 10)::int]
  from generate_series(1, 1000000);

\timing on
drop table foo;
create table foo(i int[]);
create index foogin on foo using gin (i);
insert into foo select i from random;
vacuum foo;

Without patch:
INSERT: 71s
VACUUM: 2s
total: 73s

With patch:
INSERT: 33s
VACUUM: 12s
total: 45s

So, there is a performance advantage. This was just a quick test to make
sure the numbers matched my expectations.

4. Heikki mentioned:
http://archives.postgresql.org/pgsql-hackers/2008-11/msg01832.php

"To make things worse, a query will fail if all the matching
fast-inserted tuples don't fit in the non-lossy tid bitmap."

That issue still remains, correct? Is there a resolution to that?

5. I attached a newer version merged with HEAD.

6. You defined:

#define GinPageHasFullRow(page) ( GinPageGetOpaque(page)->flags & GIN_LIST_FULLROW )

But in many places you still do the same check without using that macro.
The macro has only one call site, so I suggest either removing the macro
entirely, or using it every place you check that flag.
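
To illustrate the second option, here is a small standalone sketch, assuming a toy flags field instead of the real GinPageOpaqueData. ToyOpaque, TOY_LIST, TOY_LIST_FULLROW and both Toy* macros are hypothetical names; the idea is simply that every test and every set of the flag goes through one pair of macros:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits, standing in for GIN_LIST and GIN_LIST_FULLROW. */
#define TOY_LIST          (1 << 0)
#define TOY_LIST_FULLROW  (1 << 1)

/* Hypothetical stand-in for the page's opaque area. */
typedef struct ToyOpaque
{
    uint16_t flags;
} ToyOpaque;

/* Single place that knows how the flag is tested and set. */
#define ToyPageHasFullRow(opaque) (((opaque)->flags & TOY_LIST_FULLROW) != 0)
#define ToyPageSetFullRow(opaque) ((opaque)->flags |= TOY_LIST_FULLROW)
```

Call sites then read `if (ToyPageHasFullRow(opaque))` instead of repeating the bit test by hand.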

7. I don't understand this chunk of code:

ItemPointerData item = pos->item;

if ( scanGetCandidate(scan, pos) == false ||
     !ItemPointerEquals(&pos->item, &item) )
    elog(ERROR,"Could not process tuple"); /* XXX should not be here ! */

How can (!ItemPointerEquals(&pos->item, &item)) ever happen?

And how can (scanGetCandidate(scan, pos) == false) ever happen? Should
that be an Assert() instead?

If those can happen during normal operation, then we need a better error
message there.

Regards,
Jeff Davis

Attachments:

gin-fast-insert-20081221.patch (text/x-patch; charset=utf-8)
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index e34a2f9..6cce1f3 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -551,6 +551,13 @@
      </row>
 
      <row>
+      <entry><structfield>aminsertcleanup</structfield></entry>
+      <entry><type>regproc</type></entry>
+      <entry><literal><link linkend="catalog-pg-proc"><structname>pg_proc</structname></link>.oid</literal></entry>
+      <entry>Post-<command>INSERT</command> cleanup function (optional)</entry>
+     </row>
+
+     <row>
       <entry><structfield>amvacuumcleanup</structfield></entry>
       <entry><type>regproc</type></entry>
       <entry><literal><link linkend="catalog-pg-proc"><structname>pg_proc</structname></link>.oid</literal></entry>
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 7493ca9..d7236f8 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3525,6 +3525,11 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
         This setting can be overridden for individual tables by entries in
         <structname>pg_autovacuum</>.
        </para>
+       <para>
+        This parameter also affects vacuuming of a table with a <acronym>GIN</acronym>
+        index: it specifies the minimum number of inserted or updated
+        tuples needed to trigger a <command>VACUUM</> on that table.
+       </para>
       </listitem>
      </varlistentry>
 
diff --git a/doc/src/sgml/gin.sgml b/doc/src/sgml/gin.sgml
index 1c5841a..adc77c4 100644
--- a/doc/src/sgml/gin.sgml
+++ b/doc/src/sgml/gin.sgml
@@ -188,9 +188,45 @@
   list of heap pointers (PL, posting list) if the list is small enough.
  </para>
 
+ <sect2 id="gin-fast-update">
+  <title>GIN fast update technique</title>
+
+  <para>
+   Updating a <acronym>GIN</acronym> index tends to be slow because of the
+   intrinsic nature of inverted indexes: inserting or updating one heap row
+   can cause many inserts into the index (one for each key extracted
+   from the indexed value). As of
+   <productname>PostgreSQL</productname> 8.4, this problem is alleviated
+   by postponing most of the work until the next <command>VACUUM</>.
+   Newly inserted index entries are temporarily stored in an unsorted list of
+   pending entries.  <command>VACUUM</> inserts all pending entries into the
+   main <acronym>GIN</acronym> index data structure,
+   using the same bulk insert techniques used during initial index creation.
+   This greatly improves <acronym>GIN</acronym> index update speed, even
+   counting the additional vacuum overhead.
+  </para>
+
+  <para>
+   The disadvantage of this approach is that searches must scan the list
+   of pending entries in addition to searching the regular index, and so
+   a large list of pending entries will slow searches significantly.
+   It's recommended to use properly-configured autovacuum with tables
+   having <acronym>GIN</acronym> indexes, to keep this overhead to
+   reasonable levels.
+  </para>
+
+  <para>
+   If consistently-fast search speed is more important than update speed,
+   use of pending entries can be disabled by turning off the
+   <literal>FASTUPDATE</literal> storage parameter for a
+   <acronym>GIN</acronym> index.  See <xref linkend="sql-createindex"
+   endterm="sql-createindex-title"> for details.
+  </para>
+ </sect2>
+
  <sect2 id="gin-partial-match">
   <title>Partial match algorithm</title>
-  
+
   <para>
    GIN can support <quote>partial match</> queries, in which the query
    does not determine an exact match for one or more keys, but the possible
@@ -225,11 +261,18 @@
    <term>Create vs insert</term>
    <listitem>
     <para>
-     In most cases, insertion into a <acronym>GIN</acronym> index is slow
+     Insertion into a <acronym>GIN</acronym> index can be slow
      due to the likelihood of many keys being inserted for each value.
      So, for bulk insertions into a table it is advisable to drop the GIN
      index and recreate it after finishing bulk insertion.
     </para>
+
+    <para>
+     As of <productname>PostgreSQL</productname> 8.4, this advice is less
+     necessary since delayed indexing is used (see <xref
+     linkend="gin-fast-update"> for details).  But for very large updates
+     it may still be best to drop and recreate the index.
+    </para>
    </listitem>
   </varlistentry>
 
diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index 8b502e6..b75ccc9 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -265,7 +265,7 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] <replaceable class="parameter">name</re
    <para>
     The <literal>WITH</> clause can specify <firstterm>storage parameters</>
     for indexes.  Each index method can have its own set of allowed storage
-    parameters.  The built-in index methods all accept a single parameter:
+    parameters.  All built-in index methods accept this parameter:
    </para>
 
    <variablelist>
@@ -292,6 +292,36 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] <replaceable class="parameter">name</re
 
    </variablelist>
 
+   <para>
+    <acronym>GIN</acronym> indexes accept an additional parameter:
+   </para>
+
+   <variablelist>
+
+   <varlistentry>
+    <term><literal>FASTUPDATE</></term>
+    <listitem>
+    <para>
+     This setting controls usage of the fast update technique described in
+     <xref linkend="gin-fast-update">.  It is a Boolean parameter:
+     <literal>ON</> enables fast update, <literal>OFF</> disables it.
+     (Alternative spellings of <literal>ON</> and <literal>OFF</> are
+     allowed as described in <xref linkend="config-setting">.)  The
+     default is <literal>ON</>.
+    </para>
+
+    <note>
+     <para>
+      Turning <literal>FASTUPDATE</> off via <command>ALTER INDEX</> prevents
+      future insertions from going into the list of pending index entries,
+      but does not in itself flush previous entries.  You might want to do a
+      <command>VACUUM</> afterward to ensure the pending list is emptied.
+     </para>
+    </note>
+    </listitem>
+   </varlistentry>
+
+   </variablelist>
   </refsect2>
 
   <refsect2 id="SQL-CREATEINDEX-CONCURRENTLY">
@@ -500,6 +530,13 @@ CREATE UNIQUE INDEX title_idx ON films (title) WITH (fillfactor = 70);
   </para>
 
   <para>
+   To create a <acronym>GIN</> index with fast update turned off:
+<programlisting>
+CREATE INDEX gin_idx ON documents_table (locations) WITH (fastupdate = off);
+</programlisting>
+  </para>
+
+  <para>
    To create an index on the column <literal>code</> in the table
    <literal>films</> and have the index reside in the tablespace
    <literal>indexspace</>:
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index bee0667..952481c 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -63,6 +63,13 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] ANALYZE [ <replaceable class="PARAMETER">
    blocks.  This form is much slower and requires an exclusive lock on each
    table while it is being processed.
   </para>
+
+  <para>
+   For tables with <acronym>GIN</> indexes, <command>VACUUM</command> (in
+   any form) also completes any delayed index insertions, by moving pending
+   index entries to the appropriate places in the main <acronym>GIN</> index
+   structure.  (See <xref linkend="gin-fast-update"> for more details.)
+  </para>
  </refsect1>
 
  <refsect1>
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 1b1310c..8560c07 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -3224,7 +3224,9 @@ SELECT plainto_tsquery('supernovae stars');
     </listitem>
     <listitem>
      <para>
-      GIN indexes are about ten times slower to update than GiST
+      GIN indexes are moderately slower to update than GiST indexes, but
+      about 10 times slower if fast update support is disabled
+      (see <xref linkend="gin-fast-update"> for details).
      </para>
     </listitem>
     <listitem>
diff --git a/src/backend/access/gin/Makefile b/src/backend/access/gin/Makefile
index 93442ae..99ded7a 100644
--- a/src/backend/access/gin/Makefile
+++ b/src/backend/access/gin/Makefile
@@ -14,6 +14,6 @@ include $(top_builddir)/src/Makefile.global
 
 OBJS = ginutil.o gininsert.o ginxlog.o ginentrypage.o gindatapage.o \
 	ginbtree.o ginscan.o ginget.o ginvacuum.o ginarrayproc.o \
-	ginbulk.o
+	ginbulk.o ginfast.o
 
 include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/access/gin/ginbulk.c b/src/backend/access/gin/ginbulk.c
index 5219e55..63b5be5 100644
--- a/src/backend/access/gin/ginbulk.c
+++ b/src/backend/access/gin/ginbulk.c
@@ -197,6 +197,8 @@ ginInsertRecordBA(BuildAccumulator *accum, ItemPointer heapptr, OffsetNumber att
 	if (nentry <= 0)
 		return;
 
+	Assert(ItemPointerIsValid(heapptr) && attnum >= FirstOffsetNumber);
+
 	i = nentry - 1;
 	for (; i > 0; i >>= 1)
 		nbit++;
diff --git a/src/backend/access/gin/gindatapage.c b/src/backend/access/gin/gindatapage.c
index bf0651d..3c188f3 100644
--- a/src/backend/access/gin/gindatapage.c
+++ b/src/backend/access/gin/gindatapage.c
@@ -43,8 +43,14 @@ MergeItemPointers(ItemPointerData *dst, ItemPointerData *a, uint32 na, ItemPoint
 
 	while (aptr - a < na && bptr - b < nb)
 	{
-		if (compareItemPointers(aptr, bptr) > 0)
+		int cmp = compareItemPointers(aptr, bptr); 
+		if (cmp > 0)
 			*dptr++ = *bptr++;
+		else if ( cmp == 0 )
+		{
+			*dptr++ = *bptr++;
+			aptr++;
+		}
 		else
 			*dptr++ = *aptr++;
 	}
@@ -630,11 +636,16 @@ insertItemPointer(GinPostingTreeScan *gdi, ItemPointerData *items, uint32 nitem)
 		gdi->stack = ginFindLeafPage(&gdi->btree, gdi->stack);
 
 		if (gdi->btree.findItem(&(gdi->btree), gdi->stack))
-			elog(ERROR, "item pointer (%u,%d) already exists",
-			ItemPointerGetBlockNumber(gdi->btree.items + gdi->btree.curitem),
-				 ItemPointerGetOffsetNumber(gdi->btree.items + gdi->btree.curitem));
-
-		ginInsertValue(&(gdi->btree), gdi->stack);
+		{
+			/*
+			 * gdi->btree.items[ gdi->btree.curitem ] already exists in index
+			 */
+			gdi->btree.curitem ++;
+			LockBuffer(gdi->stack->buffer, GIN_UNLOCK);
+			freeGinBtreeStack(gdi->stack);
+		}
+		else
+			ginInsertValue(&(gdi->btree), gdi->stack);
 
 		gdi->stack = NULL;
 	}
diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c
new file mode 100644
index 0000000..3ca335d
--- /dev/null
+++ b/src/backend/access/gin/ginfast.c
@@ -0,0 +1,761 @@
+/*-------------------------------------------------------------------------
+ *
+ * ginfast.c
+ *    Fast insert routines for the Postgres inverted index access method.
+ *    Pending entries are stored in a linear list of pages, and vacuum
+ *    transfers them into the regular structure.
+ *
+ * Portions Copyright (c) 1996-2008, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ *			$PostgreSQL$
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/gin.h"
+#include "access/tuptoaster.h"
+#include "catalog/index.h"
+#include "commands/vacuum.h"
+#include "miscadmin.h"
+#include "storage/bufmgr.h"
+#include "utils/memutils.h"
+
+
+static int32
+writeListPage(Relation index, Buffer buffer, IndexTuple *tuples, int32 ntuples, BlockNumber rightlink)
+{
+	Page			page = BufferGetPage(buffer);
+	int 			i, freesize, size=0;
+	OffsetNumber	l, off;
+
+	START_CRIT_SECTION();
+
+	GinInitBuffer(buffer, GIN_LIST);
+
+	off = FirstOffsetNumber;
+
+	for(i=0; i<ntuples; i++)
+	{	
+		size += IndexTupleSize(tuples[i]);
+		l = PageAddItem(page, (Item)tuples[i], IndexTupleSize(tuples[i]), off, false, false);
+
+		if (l == InvalidOffsetNumber)
+			elog(ERROR, "failed to add item to index page in \"%s\"",
+					 RelationGetRelationName(index));
+
+		off++;
+	}
+
+	GinPageGetOpaque(page)->rightlink = rightlink;
+	/*
+	 * tail page may contain only the whole row(s) or final
+	 * part of row placed on previous pages
+	 */
+	if ( rightlink == InvalidBlockNumber )
+		GinPageSetFullRow(page);
+
+	freesize = PageGetFreeSpace(page);
+
+	MarkBufferDirty(buffer);
+
+	if (!index->rd_istemp)
+	{
+		XLogRecData				rdata[2];
+		ginxlogInsertListPage	data;
+		XLogRecPtr  			recptr;
+		char	   			   *ptr;
+
+		rdata[0].buffer = buffer;
+		rdata[0].buffer_std = true;
+		rdata[0].data = (char*)&data;
+		rdata[0].len = sizeof(ginxlogInsertListPage);
+		rdata[0].next = rdata+1;
+
+		rdata[1].buffer = InvalidBuffer;
+		ptr = rdata[1].data = palloc( size );
+		rdata[1].len = size;
+		rdata[1].next = NULL;
+
+		for(i=0; i<ntuples; i++)
+		{
+			memcpy(ptr, tuples[i], IndexTupleSize(tuples[i]));
+			ptr += IndexTupleSize(tuples[i]);
+		}
+
+		data.blkno = BufferGetBlockNumber(buffer);
+		data.rightlink = rightlink;
+		data.ntuples = ntuples;
+
+		recptr = XLogInsert(RM_GIN_ID, XLOG_GIN_INSERT_LISTPAGE, rdata);
+		PageSetLSN(page, recptr);
+		PageSetTLI(page, ThisTimeLineID);
+	}
+
+	UnlockReleaseBuffer(buffer);
+
+	END_CRIT_SECTION();
+
+	return freesize;
+}
+
+static void
+makeSublist(Relation index, IndexTuple *tuples, int32 ntuples, GinMetaPageData *res)
+{
+	Buffer			curBuffer = InvalidBuffer;
+	Buffer			prevBuffer = InvalidBuffer;
+	int 			i, size = 0, tupsize;
+	int 			startTuple = 0;
+
+	Assert(ntuples > 0);
+
+	/*
+	 * Split tuples for pages
+	 */
+	for(i=0;i<ntuples;i++)
+	{
+		if ( curBuffer == InvalidBuffer )
+		{
+			curBuffer = GinNewBuffer(index);
+
+			if ( prevBuffer != InvalidBuffer )
+			{
+				writeListPage(index, prevBuffer, tuples+startTuple, i-startTuple, BufferGetBlockNumber(curBuffer));
+			}
+			else
+			{
+				res->head = BufferGetBlockNumber(curBuffer);
+			}
+
+			prevBuffer = curBuffer;
+			startTuple = i;
+			size = 0;
+		}
+
+		tupsize = IndexTupleSize(tuples[i]) + sizeof(ItemIdData);
+
+		if ( size + tupsize >= GinListPageSize )
+		{
+			i--;
+			curBuffer = InvalidBuffer;
+		}
+		else
+		{
+			size += tupsize;
+		}
+	}
+
+	/*
+	 * Write last page
+	 */
+	res->tail = BufferGetBlockNumber(curBuffer);
+	res->tailFreeSize = writeListPage(index, curBuffer, tuples+startTuple, ntuples-startTuple, InvalidBlockNumber);
+}
+
+#define GIN_PAGE_FREESIZE \
+	( BLCKSZ - MAXALIGN(SizeOfPageHeaderData) - MAXALIGN(sizeof(GinPageOpaqueData)) ) 
+/*
+ * Inserts collected values during normal insertion. Function guarantees
+ * that all values of heap will be stored sequentially with
+ * preserving order
+ */
+void
+ginHeapTupleFastInsert(Relation index, GinTupleCollector *collector)
+{
+	Buffer				metabuffer;
+	Page				metapage;
+	GinMetaPageData    *metadata = NULL;
+	XLogRecData			rdata[2];
+	Buffer				buffer = InvalidBuffer;
+	Page				page = NULL;
+	ginxlogUpdateMeta	data;
+	bool				separateList = false;
+
+	if ( collector->ntuples == 0 )
+		return;
+
+	data.node = index->rd_node;
+	data.ntuples = 0;
+	data.newRightlink = data.prevTail = InvalidBlockNumber;
+
+	rdata[0].buffer = InvalidBuffer;
+	rdata[0].data = (char *) &data;
+	rdata[0].len = sizeof(ginxlogUpdateMeta);
+	rdata[0].next = NULL;
+
+	metabuffer = ReadBuffer(index, GIN_METAPAGE_BLKNO);
+	metapage = BufferGetPage(metabuffer);
+
+	if ( collector->sumsize + collector->ntuples * sizeof(ItemIdData) > GIN_PAGE_FREESIZE )
+	{
+		/*
+		 * Total size is greater than one page => make sublist
+		 */
+		separateList = true;
+	}
+	else
+	{
+		LockBuffer(metabuffer, GIN_EXCLUSIVE);
+		metadata = GinPageGetMeta(metapage);
+
+		if ( metadata->head == InvalidBlockNumber || 
+			collector->sumsize + collector->ntuples * sizeof(ItemIdData) > metadata->tailFreeSize )
+		{
+			/*
+			 * Pending list is empty or total size is greater than freespace 
+			 * on tail page => make sublist
+			 * We unlock metabuffer to keep high concurrency
+			 */
+			separateList = true;
+			LockBuffer(metabuffer, GIN_UNLOCK);
+		}
+	}
+
+	if ( separateList )
+	{
+		GinMetaPageData		sublist;
+
+		/*
+		 * We should make sublist separately and append it to the tail
+		 */
+		memset( &sublist, 0, sizeof(GinMetaPageData) );
+
+		makeSublist(index, collector->tuples, collector->ntuples, &sublist); 
+
+		/*
+		 * metapage was unlocked, see above
+		 */
+		LockBuffer(metabuffer, GIN_EXCLUSIVE);
+		metadata = GinPageGetMeta(metapage);
+
+		if ( metadata->head == InvalidBlockNumber )
+		{
+			/*
+			 * Sublist becomes main list
+			 */
+			START_CRIT_SECTION();
+			memcpy(metadata, &sublist, sizeof(GinMetaPageData) );
+			memcpy(&data.metadata, &sublist, sizeof(GinMetaPageData) );
+		}
+		else
+		{
+			/*
+			 * merge lists
+			 */
+
+			data.prevTail = metadata->tail;
+			buffer = ReadBuffer(index, metadata->tail);
+			LockBuffer(buffer, GIN_EXCLUSIVE);
+			page = BufferGetPage(buffer);
+			Assert(GinPageGetOpaque(page)->rightlink == InvalidBlockNumber);
+
+			START_CRIT_SECTION();
+
+			GinPageGetOpaque(page)->rightlink = sublist.head;
+			metadata->tail = sublist.tail;
+			metadata->tailFreeSize = sublist.tailFreeSize;
+
+			memcpy(&data.metadata, metadata, sizeof(GinMetaPageData) );
+			data.newRightlink = sublist.head;
+
+			MarkBufferDirty(buffer);
+		}
+	}
+	else
+	{
+		/*
+		 * Insert into tail page, metapage is already locked
+		 */
+
+		OffsetNumber	l, off;
+		int				i, tupsize;
+		char			*ptr;
+
+		buffer = ReadBuffer(index, metadata->tail);
+		LockBuffer(buffer, GIN_EXCLUSIVE);
+		page = BufferGetPage(buffer);
+		off = (PageIsEmpty(page)) ? FirstOffsetNumber :
+				OffsetNumberNext(PageGetMaxOffsetNumber(page));
+
+		rdata[0].next = rdata + 1;
+
+		rdata[1].buffer = buffer;
+		rdata[1].buffer_std = true;
+		ptr = rdata[1].data = (char *) palloc( collector->sumsize );
+		rdata[1].len = collector->sumsize;
+		rdata[1].next = NULL;
+
+		data.ntuples = collector->ntuples;
+
+		START_CRIT_SECTION();
+
+		for(i=0; i<collector->ntuples; i++)
+		{
+			tupsize = IndexTupleSize(collector->tuples[i]);
+			l = PageAddItem(page, (Item)collector->tuples[i], tupsize, off, false, false);
+
+			if (l == InvalidOffsetNumber)
+				elog(ERROR, "failed to add item to index page in \"%s\"",
+						 RelationGetRelationName(index));
+
+			memcpy(ptr, collector->tuples[i], tupsize);
+			ptr+=tupsize;
+
+			off++;
+		}
+
+		metadata->tailFreeSize -= collector->sumsize + collector->ntuples * sizeof(ItemIdData);
+		memcpy(&data.metadata, metadata, sizeof(GinMetaPageData) );
+		MarkBufferDirty(buffer);
+	}
+
+	/*
+	 *  Make real write 
+	 */
+
+	MarkBufferDirty(metabuffer);
+	if ( !index->rd_istemp )
+	{
+		XLogRecPtr  recptr;
+
+		recptr = XLogInsert(RM_GIN_ID, XLOG_GIN_UPDATE_META_PAGE, rdata);
+		PageSetLSN(metapage, recptr);
+		PageSetTLI(metapage, ThisTimeLineID);
+
+		if ( buffer != InvalidBuffer )
+		{
+			PageSetLSN(page, recptr);
+			PageSetTLI(page, ThisTimeLineID);
+		}
+	}
+
+	if (buffer != InvalidBuffer)
+		UnlockReleaseBuffer(buffer);
+	UnlockReleaseBuffer(metabuffer);
+
+	END_CRIT_SECTION();
+}
+
+/*
+ * Collect values from one tuple to be indexed. All values for
+ * one tuple should be written at once - to guarantee consistent state
+ */
+uint32
+ginHeapTupleFastCollect(Relation index, GinState *ginstate, GinTupleCollector *collector,
+		OffsetNumber attnum, Datum value, ItemPointer item)
+{
+	Datum	   *entries;
+	int32		i,
+				nentries;
+
+	entries = extractEntriesSU(ginstate, attnum, value, &nentries);
+
+	if (nentries == 0)
+		/* nothing to insert */
+		return 0;
+
+	/*
+	 * Allocate/reallocate memory for storing collected tuples
+	 */
+	if ( collector->tuples == NULL )
+	{
+		collector->lentuples = nentries * index->rd_att->natts;
+		collector->tuples = (IndexTuple*)palloc(sizeof(IndexTuple) * collector->lentuples);
+	}
+	
+	while ( collector->ntuples + nentries > collector->lentuples )
+	{
+		collector->lentuples *= 2;
+		collector->tuples = (IndexTuple*)repalloc( collector->tuples,
+													sizeof(IndexTuple) * collector->lentuples);
+	}
+
+	/*
+	 * Create the array of tuples
+	 */
+	for (i = 0; i < nentries; i++)
+	{
+		int32 tupsize;
+
+		collector->tuples[collector->ntuples + i] = GinFormTuple(ginstate, attnum, entries[i], NULL, 0);
+		collector->tuples[collector->ntuples + i]->t_tid = *item;
+		tupsize = IndexTupleSize(collector->tuples[collector->ntuples + i]);
+
+		if ( tupsize > TOAST_INDEX_TARGET || tupsize >= GinMaxItemSize)
+			elog(ERROR, "huge tuple");
+
+		collector->sumsize += tupsize;
+	}
+
+	collector->ntuples += nentries;
+
+	return nentries;
+}
+
+/*
+ * Deletes pages from the head of the list up to, but not including, newHead.
+ * If newHead == InvalidBlockNumber, the function drops the whole list.
+ * Returns true if a concurrent completion process is running.
+ */
+static bool
+shiftList(Relation index, Buffer metabuffer, BlockNumber newHead,
+		  IndexBulkDeleteResult *stats)
+{
+#define	NDELETE_AT_ONCE	(16)
+	Buffer					buffers[NDELETE_AT_ONCE];
+	ginxlogDeleteListPages	data;
+	XLogRecData				rdata[1];
+	Page					metapage;
+	GinMetaPageData 	   *metadata;
+	BlockNumber				blknoToDelete;
+
+	metapage = BufferGetPage(metabuffer);
+	metadata = GinPageGetMeta(metapage);
+	blknoToDelete = metadata->head;
+
+	data.node = index->rd_node;
+
+	rdata[0].buffer = InvalidBuffer;
+	rdata[0].data = (char *) &data;
+	rdata[0].len = sizeof(ginxlogDeleteListPages);
+	rdata[0].next = NULL;
+
+	do
+	{
+		Page		page;
+		int			i;
+
+		data.ndeleted = 0;
+		while( data.ndeleted < NDELETE_AT_ONCE && blknoToDelete != newHead )
+		{
+			data.toDelete[ data.ndeleted ] = blknoToDelete; 
+			buffers[ data.ndeleted ] = ReadBuffer(index, blknoToDelete);
+			LockBufferForCleanup( buffers[ data.ndeleted ] );
+			page = BufferGetPage( buffers[ data.ndeleted ] );
+
+			data.ndeleted++;
+			stats->pages_deleted++;
+
+			if ( GinPageIsDeleted(page) )
+			{
+				/* concurrent deletion process is detected */
+				for(i=0;i<data.ndeleted;i++)
+					UnlockReleaseBuffer( buffers[i] );
+
+				return true;
+			}
+
+			blknoToDelete = GinPageGetOpaque( page )->rightlink;
+		}
+
+		START_CRIT_SECTION();
+		
+		metadata->head = blknoToDelete;
+		if ( blknoToDelete == InvalidBlockNumber )
+		{
+			metadata->tail = InvalidBlockNumber;
+			metadata->tailFreeSize = 0;
+		}
+		memcpy( &data.metadata, metadata, sizeof(GinMetaPageData));
+		MarkBufferDirty( metabuffer );
+
+		for(i=0; i<data.ndeleted; i++)
+		{
+			page = BufferGetPage( buffers[ i ] );
+			GinPageGetOpaque( page )->flags = GIN_DELETED;
+			MarkBufferDirty( buffers[ i ] );
+		}
+
+		if ( !index->rd_istemp )
+		{
+			XLogRecPtr  recptr;
+
+			recptr = XLogInsert(RM_GIN_ID, XLOG_GIN_DELETE_LISTPAGE, rdata);
+			PageSetLSN(metapage, recptr);
+			PageSetTLI(metapage, ThisTimeLineID);
+
+			for(i=0; i<data.ndeleted; i++)
+			{
+				page = BufferGetPage( buffers[ i ] );
+				PageSetLSN(page, recptr);
+				PageSetTLI(page, ThisTimeLineID);
+			}
+		}
+
+		for(i=0; i<data.ndeleted; i++)
+			UnlockReleaseBuffer( buffers[ i ] );
+
+		END_CRIT_SECTION();
+	} while( blknoToDelete != newHead );
+
+	return false;
+}
+
+typedef struct DatumArray
+{
+	Datum	*values;
+	int32	 nvalues;
+	int32	 maxvalues;
+} DatumArray;
+
+static
+void addDatum(DatumArray *datums, Datum datum)
+{
+	if ( datums->nvalues >= datums->maxvalues)
+	{
+		datums->maxvalues *= 2;
+		datums->values = (Datum*)repalloc( datums->values, sizeof(Datum)*datums->maxvalues);
+	}
+
+	datums->values[ datums->nvalues++ ] = datum; 
+}
+
+/*
+ * Go through all tuples on page and collect values in memory
+ */
+
+static void 
+processPendingPage(BuildAccumulator *accum, DatumArray *da, Page page, OffsetNumber startoff)
+{
+	ItemPointerData	heapptr;
+	OffsetNumber 	i,maxoff;
+	OffsetNumber	attrnum, curattnum;
+
+	maxoff = PageGetMaxOffsetNumber(page);
+	Assert( maxoff >= FirstOffsetNumber );
+	ItemPointerSetInvalid(&heapptr);
+	attrnum = 0;
+
+	for (i = startoff; i <= maxoff; i = OffsetNumberNext(i))
+	{
+		IndexTuple  itup = (IndexTuple) PageGetItem(page, PageGetItemId(page, i));
+
+		curattnum = gintuple_get_attrnum(accum->ginstate, itup);
+
+		if ( !ItemPointerIsValid(&heapptr) )
+		{
+			heapptr = itup->t_tid;
+			attrnum = curattnum;
+		}
+		else if ( !(ItemPointerEquals(&heapptr, &itup->t_tid) && curattnum == attrnum) )
+		{
+			/*
+			 * We can insert several datums per call, but only for one heap tuple
+			 * and one column.
+			 */
+			ginInsertRecordBA(accum, &heapptr, attrnum, da->values, da->nvalues);
+			da->nvalues = 0;
+			heapptr = itup->t_tid;
+			attrnum = curattnum;
+		}
+		addDatum(da, gin_index_getattr(accum->ginstate, itup));
+	}
+
+	ginInsertRecordBA(accum, &heapptr, attrnum, da->values, da->nvalues);
+}
+
+/*
+ * Moves tuples from pending pages into the regular GIN structure.
+ * The function doesn't require special locking and can be called
+ * at any time, but only by one process at a time.
+ */ 
+
+Datum 
+gininsertcleanup(PG_FUNCTION_ARGS)
+{
+	IndexVacuumInfo	   *info = (IndexVacuumInfo *) PG_GETARG_POINTER(0);
+	IndexBulkDeleteResult *stats = (IndexBulkDeleteResult *) PG_GETARG_POINTER(1);
+	Relation 			index = info->index;
+	GinState  			ginstate;
+	Buffer				metabuffer, buffer;
+	Page				metapage, page;
+	GinMetaPageData    *metadata;
+	MemoryContext		opCtx, oldCtx;
+	BuildAccumulator	accum;
+	DatumArray			datums;
+	BlockNumber			blkno;
+
+	/* Set up all-zero stats if ginbulkdelete wasn't called */
+	if (stats == NULL)
+		stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+
+	initGinState(&ginstate, index);
+
+	metabuffer = ReadBuffer(index, GIN_METAPAGE_BLKNO);
+	LockBuffer(metabuffer, GIN_SHARE);
+	metapage = BufferGetPage(metabuffer);
+	metadata = GinPageGetMeta(metapage);
+
+	if ( metadata->head == InvalidBlockNumber )
+	{
+		UnlockReleaseBuffer(metabuffer);
+		PG_RETURN_POINTER(stats);
+	}
+
+	/*
+	 * Init
+	 */
+	datums.maxvalues=128;
+	datums.nvalues = 0;
+	datums.values = (Datum*)palloc(sizeof(Datum)*datums.maxvalues);
+
+	ginInitBA(&accum);
+	accum.ginstate = &ginstate;
+
+	opCtx = AllocSetContextCreate(CurrentMemoryContext,
+									"Gin refresh temporary context",
+									ALLOCSET_DEFAULT_MINSIZE,
+									ALLOCSET_DEFAULT_INITSIZE,
+									ALLOCSET_DEFAULT_MAXSIZE);
+
+	oldCtx = MemoryContextSwitchTo(opCtx);
+
+	/*
+	 * Read and lock head
+	 */
+	blkno = metadata->head;
+	buffer = ReadBuffer(index, blkno);
+	LockBuffer(buffer, GIN_SHARE);
+	page = BufferGetPage(buffer);
+
+	LockBuffer(metabuffer, GIN_UNLOCK);
+
+	for(;;)
+	{
+		/*
+		 * reset datum's collector and read page's datums into memory
+		 */
+		datums.nvalues = 0;
+
+		if ( GinPageIsDeleted(page) )
+		{
+			/* concurrent completion is running */
+			UnlockReleaseBuffer( buffer );
+			break;
+		}
+
+		processPendingPage(&accum, &datums, page, FirstOffsetNumber);
+
+		vacuum_delay_point();
+
+		/*
+		 * Is it time to flush memory to disk?
+		 */
+		if ( GinPageGetOpaque(page)->rightlink == InvalidBlockNumber || 
+				( GinPageHasFullRow(page) && accum.allocatedMemory > maintenance_work_mem * 1024L ) )
+		{
+			ItemPointerData    *list;
+			uint32      		nlist;
+			Datum       		entry;
+			OffsetNumber  		maxoff, attnum;
+
+			/*
+			 * Unlock current page to increase performance.
+			 * Changes of page will be checked later by comparing 
+			 * maxoff after completion of memory flush. 
+			 */
+			maxoff = PageGetMaxOffsetNumber(page);
+			LockBuffer(buffer, GIN_UNLOCK);
+
+			/*
+			 * Moving collected data into regular structure can take
+			 * significant amount of time - so, run it without locking pending
+			 * list.
+			 */
+			while ((list = ginGetEntry(&accum, &attnum, &entry, &nlist)) != NULL)
+			{
+				vacuum_delay_point();
+				ginEntryInsert(index, &ginstate, attnum, entry, list, nlist, FALSE);
+			}
+
+			/*
+			 * Lock the whole list to remove pages 
+			 */
+			LockBuffer(metabuffer, GIN_EXCLUSIVE);
+			LockBuffer(buffer, GIN_SHARE);
+
+			if ( GinPageIsDeleted(page) )
+			{
+				/* concurrent completion is running */
+				UnlockReleaseBuffer(buffer);
+				LockBuffer(metabuffer, GIN_UNLOCK);
+				break;
+			}
+
+			/*
+			 * While we kept the page unlocked it might have been changed -
+			 * read the additions separately. One page is rather
+			 * small, so the extra memory used isn't very big, although
+			 * we should reinit the accumulator. We need to make the
+			 * check only once because now both page and metapage are
+			 * locked. The insertion algorithm guarantees that inserted row(s)
+			 * will not continue on the next page.
+			 */
+			if ( PageGetMaxOffsetNumber(page) != maxoff )
+			{
+				ginInitBA(&accum);
+				datums.nvalues = 0;
+				processPendingPage(&accum, &datums, page, maxoff+1);
+
+				while ((list = ginGetEntry(&accum, &attnum, &entry, &nlist)) != NULL)
+					ginEntryInsert(index, &ginstate, attnum, entry, list, nlist, FALSE);
+			}
+
+			/*
+			 * Remember next page - it will become a new head
+			 */
+			blkno = GinPageGetOpaque(page)->rightlink;
+			UnlockReleaseBuffer(buffer); /* shiftList will do exclusive locking */
+
+			/*
+			 * remove the pages just read from the pending list; at this point
+			 * all of their content is in the regular structure
+			 */
+			if ( shiftList(index, metabuffer, blkno, stats) )
+			{
+				/* concurrent completion is running */
+				LockBuffer(metabuffer, GIN_UNLOCK);
+				break;
+			}
+
+			Assert( blkno == metadata->head );
+			LockBuffer(metabuffer, GIN_UNLOCK);
+
+			/*
+			 * if we removed the whole list, just exit
+			 */
+			if ( blkno == InvalidBlockNumber )
+				break;
+
+			/*
+			 * reinit state
+			 */
+			MemoryContextReset(opCtx);
+			ginInitBA(&accum);
+		}
+		else
+		{
+			blkno = GinPageGetOpaque(page)->rightlink;
+			UnlockReleaseBuffer(buffer);
+		}
+
+
+		/*
+		 * Read next page in pending list
+		 */
+		CHECK_FOR_INTERRUPTS();
+		buffer = ReadBuffer(index, blkno);
+		LockBuffer(buffer, GIN_SHARE);
+		page = BufferGetPage(buffer);
+	}
+
+	ReleaseBuffer(metabuffer);
+	MemoryContextSwitchTo(oldCtx);
+	MemoryContextDelete(opCtx);
+
+	PG_RETURN_POINTER(stats);
+}
diff --git a/src/backend/access/gin/ginget.c b/src/backend/access/gin/ginget.c
index 23131e5..69c15fc 100644
--- a/src/backend/access/gin/ginget.c
+++ b/src/backend/access/gin/ginget.c
@@ -268,6 +268,15 @@ startScanEntry(Relation index, GinState *ginstate, GinScanEntry entry)
 	Page			page;
 	bool			needUnlock = TRUE;
 
+	entry->buffer = InvalidBuffer;
+	entry->offset = InvalidOffsetNumber;
+	entry->list = NULL;
+	entry->nlist = 0;
+	entry->partialMatch = NULL;
+	entry->partialMatchResult = NULL;
+	entry->reduceResult = FALSE;
+	entry->predictNumberResult = 0;
+
 	if (entry->master != NULL)
 	{
 		entry->isFinished = entry->master->isFinished;
@@ -285,14 +294,6 @@ startScanEntry(Relation index, GinState *ginstate, GinScanEntry entry)
 	page = BufferGetPage(stackEntry->buffer);
 
 	entry->isFinished = TRUE;
-	entry->buffer = InvalidBuffer;
-	entry->offset = InvalidOffsetNumber;
-	entry->list = NULL;
-	entry->nlist = 0;
-	entry->partialMatch = NULL;
-	entry->partialMatchResult = NULL;
-	entry->reduceResult = FALSE;
-	entry->predictNumberResult = 0;
 
 	if ( entry->isPartialMatch )
 	{
@@ -350,9 +351,10 @@ startScanEntry(Relation index, GinState *ginstate, GinScanEntry entry)
 
 			entry->buffer = scanBeginPostingTree(gdi);
 			/*
-			 * We keep buffer pinned because we need to prevent deletition
+			 * We keep buffer pinned because we need to prevent deletion of
 			 * page during scan. See GIN's vacuum implementation. RefCount
-			 * is increased to keep buffer pinned after freeGinBtreeStack() call.
+			 * is increased to keep buffer pinned after freeGinBtreeStack()
+			 * call.
 			 */
 			IncrBufferRefCount(entry->buffer);
 
@@ -429,6 +431,15 @@ startScan(IndexScanDesc scan)
 	uint32		i;
 	GinScanOpaque so = (GinScanOpaque) scan->opaque;
 
+	/*
+	 * If isScanFastInsert is still true, set up to scan the pending-insert
+	 * list rather than the main index.
+	 */
+	if (so->isScanFastInsert)
+	{
+		return;
+	}
+
 	for (i = 0; i < so->nkeys; i++)
 		startScanKey(scan->indexRelation, &so->ginstate, so->keys + i);
 }
@@ -671,6 +682,336 @@ keyGetItem(Relation index, GinState *ginstate, MemoryContext tempCtx,
 	return FALSE;
 }
 
+typedef struct fastPosition {
+	Buffer			fastBuffer;
+	OffsetNumber	firstOffset;
+	OffsetNumber	lastOffset;
+	ItemPointerData	item;
+} fastPosition;
+
+
+/*
+ * Get ItemPointer of next heap row to be checked from fast insert storage.
+ * Returns false if there are no more.
+ *
+ * The fastBuffer is presumed pinned and share-locked on entry, and is 
+ * pinned and share-locked on success exit.  On failure exit it's released.
+ */
+static bool
+scanGetCandidate(IndexScanDesc scan, fastPosition *pos)
+{
+	OffsetNumber		maxoff;
+	Page				page;
+	IndexTuple  		itup;
+
+	ItemPointerSetInvalid( &pos->item );
+	for(;;)
+	{
+		page = BufferGetPage(pos->fastBuffer);
+
+		maxoff = PageGetMaxOffsetNumber(page);
+		if ( pos->firstOffset > maxoff )
+		{
+			BlockNumber blkno = GinPageGetOpaque(page)->rightlink;
+			if ( blkno == InvalidBlockNumber )
+			{
+				UnlockReleaseBuffer(pos->fastBuffer);
+				pos->fastBuffer=InvalidBuffer;
+
+				return false;
+			}
+			else
+			{
+				/*
+				 * Here we must prevent deletion of the next page by the
+				 * insertcleanup process, which uses LockBufferForCleanup.
+				 * So, we pin the next page before unpinning the current one
+				 */
+				Buffer	tmpbuf = ReadBuffer(scan->indexRelation, blkno);
+
+				UnlockReleaseBuffer( pos->fastBuffer);
+				pos->fastBuffer=tmpbuf;
+				LockBuffer( pos->fastBuffer, GIN_SHARE );
+
+				pos->firstOffset = FirstOffsetNumber;
+			}
+		}
+		else
+		{
+			itup = (IndexTuple) PageGetItem(page, PageGetItemId(page, pos->firstOffset));
+			pos->item = itup->t_tid;
+			if ( GinPageGetOpaque(page)->flags & GIN_LIST_FULLROW )
+			{
+				/*
+				 * find itempointer to the next row
+				 */
+				for(pos->lastOffset = pos->firstOffset+1; pos->lastOffset<=maxoff; pos->lastOffset++)
+				{
+					itup = (IndexTuple) PageGetItem(page, PageGetItemId(page, pos->lastOffset));
+					if (!ItemPointerEquals(&pos->item, &itup->t_tid))
+						break;
+				}
+			}
+			else
+			{
+				/*
+				 * All itempointers are the same on this page
+				 */
+				pos->lastOffset = maxoff + 1;
+			}
+			break;
+		}
+	}
+
+	return true;
+}
+
+static bool
+matchPartialInPendingList(GinState *ginstate, Page page, OffsetNumber off, 
+	OffsetNumber maxoff, Datum value, OffsetNumber attrnum,
+	Datum	*datum, bool *datumExtracted, StrategyNumber strategy)
+{
+	IndexTuple  		itup;
+	int 				res;
+
+	for( ; off < maxoff; off++ )
+	{
+		itup = (IndexTuple) PageGetItem(page, PageGetItemId(page, off));
+		if ( attrnum != gintuple_get_attrnum(ginstate, itup) )
+			return false;
+		
+		if (datumExtracted[ off-1 ] == false)
+		{
+			datum[ off-1 ] = gin_index_getattr(ginstate, itup);
+			datumExtracted[  off-1 ] = true;
+		}
+
+		res = DatumGetInt32(FunctionCall3(&ginstate->comparePartialFn[attrnum],
+						  value,
+						  datum[ off-1 ],
+						  UInt16GetDatum(strategy)));
+		if ( res == 0 )
+			return true;
+		else if (res>0)
+			return false;
+	}
+
+	return false;
+}
+/*
+ * Sets the entryRes array for each key by looking at
+ * every entry per indexed value (row) in fast insert storage.
+ * Returns true if at least one datum was matched by a key's entry.
+ *
+ * The fastBuffer is presumed pinned and share-locked on entry.
+ */
+static bool
+collectDatumForItem(IndexScanDesc scan, fastPosition *pos)
+{
+	GinScanOpaque 		so = (GinScanOpaque) scan->opaque;
+	OffsetNumber		attrnum;
+	Page				page;
+	IndexTuple  		itup;
+	int					i, j;
+	bool				hasMatch = false;
+
+	/*
+	 * Resets entryRes
+	 */
+	for (i = 0; i < so->nkeys; i++)
+	{
+		GinScanKey	key = so->keys + i;
+		memset( key->entryRes, FALSE, key->nentries );
+	}
+
+	for(;;)
+	{
+		Datum				datum[ BLCKSZ/sizeof(IndexTupleData) ];
+		bool				datumExtracted[ BLCKSZ/sizeof(IndexTupleData) ];
+
+		Assert( pos->lastOffset > pos->firstOffset );
+		memset(datumExtracted + pos->firstOffset - 1, 0, sizeof(bool) * (pos->lastOffset - pos->firstOffset ));
+
+		page = BufferGetPage(pos->fastBuffer);
+
+		for(i = 0; i < so->nkeys; i++)
+		{
+			GinScanKey  key = so->keys + i;
+
+			for(j=0; j<key->nentries; j++)
+			{
+				OffsetNumber		StopLow = pos->firstOffset,
+									StopHigh = pos->lastOffset,
+									StopMiddle;
+				GinScanEntry		entry = key->scanEntry + j;
+
+				if ( key->entryRes[j] )
+					continue;
+
+				while (StopLow < StopHigh)
+				{
+					StopMiddle = StopLow + ((StopHigh - StopLow) >> 1);
+
+					itup = (IndexTuple) PageGetItem(page, PageGetItemId(page, StopMiddle));
+					attrnum = gintuple_get_attrnum(&so->ginstate, itup);
+
+					if (key->attnum < attrnum)
+						StopHigh = StopMiddle;
+					else if (key->attnum > attrnum)
+						StopLow = StopMiddle + 1;
+					else
+					{
+						int res;
+
+						if (datumExtracted[ StopMiddle-1 ] == false)
+						{
+							datum[ StopMiddle-1 ] = gin_index_getattr(&so->ginstate, itup);
+							datumExtracted[  StopMiddle-1 ] = true;
+						}
+						res =  compareEntries(&so->ginstate, 
+									entry->attnum, 
+									entry->entry, 
+									datum[ StopMiddle-1 ]);
+
+						if ( res == 0 )
+						{
+							if ( entry->isPartialMatch )
+								key->entryRes[j] = matchPartialInPendingList(&so->ginstate, page, StopMiddle, 
+																	pos->lastOffset, entry->entry, entry->attnum,
+																	datum, datumExtracted, entry->strategy);
+							else
+								key->entryRes[j] = true;
+							break;
+						}
+						else if ( res < 0  )
+							StopHigh = StopMiddle;
+						else
+							StopLow = StopMiddle + 1;
+					}
+				}
+
+				if ( StopLow>=StopHigh && entry->isPartialMatch )
+					key->entryRes[j] = matchPartialInPendingList(&so->ginstate, page, StopHigh, 
+																	pos->lastOffset, entry->entry, entry->attnum,
+																	datum, datumExtracted, entry->strategy);
+
+				hasMatch |= key->entryRes[j];
+			}
+		}
+	
+		pos->firstOffset = pos->lastOffset;
+
+		if ( GinPageGetOpaque(page)->flags & GIN_LIST_FULLROW )
+		{
+			/*
+			 * We have scanned all values from one tuple; go to the next one
+			 */
+
+			return hasMatch;
+		}
+		else
+		{
+			ItemPointerData item = pos->item;
+
+			if ( scanGetCandidate(scan, pos) == false || !ItemPointerEquals(&pos->item, &item) )
+				elog(ERROR,"Could not process tuple");  /* XXX should not be here ! */
+		}
+	}
+
+	return hasMatch;
+}
+
+/*
+ * Collect all matched rows from pending list in bitmap
+ */
+static TIDBitmap*
+scanFastInsert(IndexScanDesc scan)
+{
+	GinScanOpaque so = (GinScanOpaque) scan->opaque;
+	MemoryContext	oldCtx;
+	bool			recheck, keyrecheck, match;
+	TIDBitmap	   *tbm = NULL;
+	int				i;
+	fastPosition	pos;
+	Buffer      	metabuffer = ReadBuffer(scan->indexRelation, GIN_METAPAGE_BLKNO);
+	BlockNumber 	blkno;
+
+	LockBuffer(metabuffer, GIN_SHARE);
+	blkno = GinPageGetMeta(BufferGetPage(metabuffer))->head;
+
+	/*
+	 * fetch head of list before unlocking metapage.
+	 * head page must be pinned to prevent deletion by vacuum process
+	 */
+	if ( blkno == InvalidBlockNumber )
+	{
+		/* No pending list, so proceed with normal scan */
+		UnlockReleaseBuffer( metabuffer );
+		return NULL;
+	}
+
+	pos.fastBuffer = ReadBuffer(scan->indexRelation, blkno);
+	LockBuffer(pos.fastBuffer, GIN_SHARE);
+	pos.firstOffset = FirstOffsetNumber;
+	UnlockReleaseBuffer( metabuffer );
+
+	/*
+	 * loop for each heap row
+	 */
+	while( scanGetCandidate(scan, &pos) )
+	{
+
+		/*
+		 * Check entries in rows and setup entryRes array
+		 */
+		if (!collectDatumForItem(scan, &pos))
+			continue;
+
+		/*
+		 * check for consistency
+		 */
+		oldCtx = MemoryContextSwitchTo(so->tempCtx);
+		recheck = false;
+		match = true; 
+
+		for (i = 0; match && i < so->nkeys; i++)
+		{
+			GinScanKey  	key = so->keys + i;
+
+			keyrecheck = true;
+
+			if ( DatumGetBool(FunctionCall4(&so->ginstate.consistentFn[ key->attnum-1 ],
+									 PointerGetDatum(key->entryRes),
+									 UInt16GetDatum(key->strategy),
+									 key->query,
+									 PointerGetDatum(&keyrecheck))) == false )
+			{
+				match = false;
+			}
+
+			recheck |= keyrecheck;
+		}
+
+		MemoryContextSwitchTo(oldCtx);
+		MemoryContextReset(so->tempCtx);
+
+		if ( match )
+		{
+			if ( tbm == NULL )
+				tbm = tbm_create( work_mem * 1024L );
+			tbm_add_tuples(tbm, &pos.item, 1, recheck);
+		}
+	}
+
+	if ( tbm && tbm_has_lossy(tbm) )	
+			ereport(ERROR,
+					(errcode(ERRCODE_OUT_OF_MEMORY),
+					errmsg("not enough memory to store result of pending list or VACUUM table" ),
+					errhint("Increase the \"work_mem\" parameter.")));
+
+	return tbm;
+}
+
 /*
  * Get heap item pointer from scan
  * returns true if found
@@ -693,44 +1034,112 @@ scanGetItem(IndexScanDesc scan, ItemPointerData *item, bool *recheck)
 	 */
 	*recheck = false;
 
-	ItemPointerSetMin(item);
-	for (i = 0; i < so->nkeys; i++)
+	/*
+	 * First of all we should check fast insert list of pages
+	 */
+	if ( so->isScanFastInsert )
 	{
-		GinScanKey	key = so->keys + i;
+		if ( so->scanFastTuples )
+		{
+			/*
+			 * Items from the pending list are already collected in memory
+			 */
 
-		if (keyGetItem(scan->indexRelation, &so->ginstate, so->tempCtx,
-					   key, &keyrecheck))
-			return FALSE;		/* finished one of keys */
-		if (compareItemPointers(item, &key->curItem) < 0)
-			*item = key->curItem;
-		*recheck |= keyrecheck;
-	}
+			if ( so->scanFastResult == NULL || so->scanFastOffset >= so->scanFastResult->ntuples )
+			{
+				so->scanFastResult = tbm_iterate( so->scanFastTuples );
 
-	for (i = 1; i <= so->nkeys; i++)
-	{
-		GinScanKey	key = so->keys + i - 1;
+				if ( so->scanFastResult == NULL )
+				{
+					/* scan of pending pages is finished */
+					so->isScanFastInsert = false;
+					startScan(scan);
+					return scanGetItem(scan, item, recheck);
+				}
+				Assert( so->scanFastResult->ntuples >= 0 );
+				so->scanFastOffset = 0;
+			}
+	
+			ItemPointerSet(item,
+							so->scanFastResult->blockno,
+							so->scanFastResult->offsets[ so->scanFastOffset ]);
+			*recheck = true; /* be conservative due to concurrent 
+								removal from pending list */
 
-		for (;;)
+			so->scanFastOffset ++;
+
+			return true;
+		} 
+		else
 		{
-			int			cmp = compareItemPointers(item, &key->curItem);
+			/*
+			 * Collect ItemPointers in memory
+			 */
+			so->scanFastTuples = scanFastInsert(scan);
 
-			if (cmp == 0)
-				break;
-			else if (cmp > 0)
+			if ( so->scanFastTuples == NULL )
 			{
-				if (keyGetItem(scan->indexRelation, &so->ginstate, so->tempCtx,
-							   key, &keyrecheck))
-					return FALSE;		/* finished one of keys */
-				*recheck |= keyrecheck;
+				/* nothing found */
+				so->isScanFastInsert = false;
+				startScan(scan);
 			}
 			else
-			{					/* returns to begin */
+			{
+				tbm_begin_iterate(so->scanFastTuples);
+			}
+
+			return scanGetItem(scan, item, recheck);
+		}
+	}
+
+	/*
+	 * Regular scanning with filtering by already returned
+	 * ItemPointers from pending list
+	 */
+
+	do
+	{
+		ItemPointerSetMin(item);
+		*recheck = false;
+
+		for (i = 0; i < so->nkeys; i++)
+		{
+			GinScanKey	key = so->keys + i;
+
+			if (keyGetItem(scan->indexRelation, &so->ginstate, so->tempCtx,
+					   key, &keyrecheck))
+				return FALSE;		/* finished one of keys */
+			if (compareItemPointers(item, &key->curItem) < 0)
 				*item = key->curItem;
-				i = 0;
+			*recheck |= keyrecheck;
+		}
+
+		for (i = 1; i <= so->nkeys; i++)
+		{
+			GinScanKey	key = so->keys + i - 1;
+
+			for (;;)
+			{
+				int			cmp = compareItemPointers(item, &key->curItem);
+	
+				if (cmp == 0)
 				break;
+				else if (cmp > 0)
+				{
+					if (keyGetItem(scan->indexRelation, &so->ginstate, so->tempCtx,
+								   key, &keyrecheck))
+						return FALSE;		/* finished one of keys */
+					*recheck |= keyrecheck;
+				}
+				else
+				{					/* returns to begin */
+					*item = key->curItem;
+					i = 0;
+					break;
+				}
 			}
 		}
-	}
+	} while( so->scanFastTuples && tbm_check_tuple(so->scanFastTuples, item) );
 
 	return TRUE;
 }
diff --git a/src/backend/access/gin/gininsert.c b/src/backend/access/gin/gininsert.c
index 4be89bc..062ddba 100644
--- a/src/backend/access/gin/gininsert.c
+++ b/src/backend/access/gin/gininsert.c
@@ -138,7 +138,7 @@ addItemPointersToTuple(Relation index, GinState *ginstate, GinBtreeStack *stack,
 /*
  * Inserts only one entry to the index, but it can add more than 1 ItemPointer.
  */
-static void
+void
 ginEntryInsert(Relation index, GinState *ginstate, OffsetNumber attnum, Datum value, 
 				ItemPointerData *items, uint32 nitem, bool isBuild)
 {
@@ -273,7 +273,7 @@ ginbuild(PG_FUNCTION_ARGS)
 	IndexBuildResult *result;
 	double		reltuples;
 	GinBuildState buildstate;
-	Buffer		buffer;
+	Buffer		RootBuffer, MetaBuffer;
 	ItemPointerData *list;
 	Datum		entry;
 	uint32		nlist;
@@ -286,11 +286,17 @@ ginbuild(PG_FUNCTION_ARGS)
 
 	initGinState(&buildstate.ginstate, index);
 
+	/* initialize the meta page */
+	MetaBuffer = GinNewBuffer(index);
+
 	/* initialize the root page */
-	buffer = GinNewBuffer(index);
+	RootBuffer = GinNewBuffer(index);
+
 	START_CRIT_SECTION();
-	GinInitBuffer(buffer, GIN_LEAF);
-	MarkBufferDirty(buffer);
+	GinInitMetabuffer(MetaBuffer);
+	MarkBufferDirty(MetaBuffer);
+	GinInitBuffer(RootBuffer, GIN_LEAF);
+	MarkBufferDirty(RootBuffer);
 
 	if (!index->rd_istemp)
 	{
@@ -303,16 +309,19 @@ ginbuild(PG_FUNCTION_ARGS)
 		rdata.len = sizeof(RelFileNode);
 		rdata.next = NULL;
 
-		page = BufferGetPage(buffer);
-
-
 		recptr = XLogInsert(RM_GIN_ID, XLOG_GIN_CREATE_INDEX, &rdata);
+		
+		page = BufferGetPage(RootBuffer);
 		PageSetLSN(page, recptr);
 		PageSetTLI(page, ThisTimeLineID);
 
+		page = BufferGetPage(MetaBuffer);
+		PageSetLSN(page, recptr);
+		PageSetTLI(page, ThisTimeLineID);
 	}
 
-	UnlockReleaseBuffer(buffer);
+	UnlockReleaseBuffer(MetaBuffer);
+	UnlockReleaseBuffer(RootBuffer);
 	END_CRIT_SECTION();
 
 	/* build the index */
@@ -417,9 +426,26 @@ gininsert(PG_FUNCTION_ARGS)
 
 	initGinState(&ginstate, index);
 
-	for(i=0; i<ginstate.origTupdesc->natts;i++)
-		if ( !isnull[i] )
-			res += ginHeapTupleInsert(index, &ginstate, (OffsetNumber)(i+1), values[i], ht_ctid);
+	if ( GinGetUseFastUpdate(index) )
+	{
+		GinTupleCollector	collector;
+
+		memset(&collector, 0, sizeof(GinTupleCollector));
+		for(i=0; i<ginstate.origTupdesc->natts;i++)
+			if ( !isnull[i] )
+				res += ginHeapTupleFastCollect(index, &ginstate, &collector,
+												(OffsetNumber)(i+1), values[i], ht_ctid);
+
+		ginHeapTupleFastInsert(index, &collector);
+	}
+	else
+	{
+		for(i=0; i<ginstate.origTupdesc->natts;i++)
+			if ( !isnull[i] ) 
+				res += ginHeapTupleInsert(index, &ginstate, 
+												(OffsetNumber)(i+1), values[i], ht_ctid);
+
+	}
 
 	MemoryContextSwitchTo(oldCtx);
 	MemoryContextDelete(insertCtx);
diff --git a/src/backend/access/gin/ginscan.c b/src/backend/access/gin/ginscan.c
index bc51e94..0c0ce52 100644
--- a/src/backend/access/gin/ginscan.c
+++ b/src/backend/access/gin/ginscan.c
@@ -159,6 +159,9 @@ newScanKey(IndexScanDesc scan)
 				 errmsg("GIN indexes do not support whole-index scans")));
 
 	so->isVoidRes = false;
+	so->isScanFastInsert = true;
+	so->scanFastTuples = NULL;
+	so->scanFastResult = NULL;
 
 	for (i = 0; i < scan->numberOfKeys; i++)
 	{
@@ -233,8 +236,11 @@ ginrescan(PG_FUNCTION_ARGS)
 	else
 	{
 		freeScanKeys(so->keys, so->nkeys);
+		if ( so->scanFastTuples )
+			tbm_free( so->scanFastTuples );
 	}
 
+	so->scanFastTuples = NULL;
 	so->keys = NULL;
 
 	if (scankey && scan->numberOfKeys > 0)
@@ -256,6 +262,8 @@ ginendscan(PG_FUNCTION_ARGS)
 	if (so != NULL)
 	{
 		freeScanKeys(so->keys, so->nkeys);
+		if ( so->scanFastTuples )
+			tbm_free( so->scanFastTuples );
 
 		MemoryContextDelete(so->tempCtx);
 
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index 5e71c85..6633dce 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -21,6 +21,7 @@
 #include "storage/freespace.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
+#include "utils/guc.h"
 
 void
 initGinState(GinState *state, Relation index)
@@ -57,7 +58,7 @@ initGinState(GinState *state, Relation index)
 						CurrentMemoryContext);
 
 		/*
-		 * Check opclass capability to do partial match. 
+		 * Check opclass capability to do partial match.
 		 */
 		if ( index_getprocid(index, i+1, GIN_COMPARE_PARTIAL_PROC) != InvalidOid )
 		{
@@ -88,7 +89,7 @@ gintuple_get_attrnum(GinState *ginstate, IndexTuple tuple)
 		bool    isnull;
 
 		/*
-		 * First attribute is always int16, so we can safely use any 
+		 * First attribute is always int16, so we can safely use any
 		 * tuple descriptor to obtain first attribute of tuple
 		 */
 		res = index_getattr(tuple, FirstOffsetNumber, ginstate->tupdesc[0],
@@ -213,6 +214,20 @@ GinInitBuffer(Buffer b, uint32 f)
 	GinInitPage(BufferGetPage(b), f, BufferGetPageSize(b));
 }
 
+void
+GinInitMetabuffer(Buffer b)
+{
+	GinMetaPageData	*metadata;
+	Page 			 page = BufferGetPage(b);
+
+	GinInitPage(page, GIN_META, BufferGetPageSize(b));
+
+	metadata = GinPageGetMeta(page);
+
+	metadata->head = metadata->tail = InvalidBlockNumber;
+	metadata->tailFreeSize = 0;
+}
+
 int
 compareEntries(GinState *ginstate, OffsetNumber attnum, Datum a, Datum b)
 {
@@ -310,12 +325,10 @@ extractEntriesSU(GinState *ginstate, OffsetNumber attnum, Datum value, int32 *ne
 	return entries;
 }
 
-Datum
-ginoptions(PG_FUNCTION_ARGS)
+static int
+parseFillfactor(char *value, bool validate)
 {
-	Datum		reloptions = PG_GETARG_DATUM(0);
-	bool		validate = PG_GETARG_BOOL(1);
-	bytea	   *result;
+	int	fillfactor;
 
 	/*
 	 * It's not clear that fillfactor is useful for GIN, but for the moment
@@ -324,10 +337,73 @@ ginoptions(PG_FUNCTION_ARGS)
 #define GIN_MIN_FILLFACTOR			10
 #define GIN_DEFAULT_FILLFACTOR		100
 
-	result = default_reloptions(reloptions, validate,
-								GIN_MIN_FILLFACTOR,
-								GIN_DEFAULT_FILLFACTOR);
-	if (result)
-		PG_RETURN_BYTEA_P(result);
-	PG_RETURN_NULL();
+	if (value == NULL)
+		return GIN_DEFAULT_FILLFACTOR;
+
+	if (!parse_int(value, &fillfactor, 0, NULL))
+	{
+		if (validate)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+					 errmsg("fillfactor must be an integer: \"%s\"",
+							value)));
+		return GIN_DEFAULT_FILLFACTOR;
+	}
+
+	if (fillfactor < GIN_MIN_FILLFACTOR || fillfactor > 100)
+	{
+		if (validate)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+					 errmsg("fillfactor=%d is out of range (should be between %d and 100)",
+						fillfactor, GIN_MIN_FILLFACTOR)));
+		return GIN_DEFAULT_FILLFACTOR;
+	}
+
+	return fillfactor;
+}
+
+static bool
+parseFastupdate(char *value, bool validate)
+{
+	bool	result;
+
+	if (value == NULL)
+		return GIN_DEFAULT_USE_FASTUPDATE;
+
+	if (!parse_bool(value, &result))
+	{
+		if (validate)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+					 errmsg("fastupdate=\"%s\" is not recognized",
+							value)));
+		return GIN_DEFAULT_USE_FASTUPDATE;
+	}
+
+	return result;
+}
+
+Datum
+ginoptions(PG_FUNCTION_ARGS)
+{
+	Datum		reloptions = PG_GETARG_DATUM(0);
+	bool		validate = PG_GETARG_BOOL(1);
+	static const char *const gin_keywords[2] = {"fillfactor", "fastupdate"};
+	char	   *values[2];
+	GinOptions *options;
+
+	parseRelOptions(reloptions, 2, gin_keywords, values, validate);
+
+	/* If no options, just return NULL */
+	if (values[0] == NULL && values[1] == NULL)
+		PG_RETURN_NULL();
+
+	options = (GinOptions *) palloc(sizeof(GinOptions));
+	SET_VARSIZE(options, sizeof(GinOptions));
+
+	options->std.fillfactor = parseFillfactor(values[0], validate);
+	options->useFastUpdate = parseFastupdate(values[1], validate);
+
+	PG_RETURN_BYTEA_P(options);
 }
diff --git a/src/backend/access/gin/ginvacuum.c b/src/backend/access/gin/ginvacuum.c
index b180cd7..4146995 100644
--- a/src/backend/access/gin/ginvacuum.c
+++ b/src/backend/access/gin/ginvacuum.c
@@ -595,7 +595,14 @@ ginbulkdelete(PG_FUNCTION_ARGS)
 
 	/* first time through? */
 	if (stats == NULL)
-		stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+	{
+		stats = (IndexBulkDeleteResult *)DatumGetPointer(
+			DirectFunctionCall2(gininsertcleanup,
+									PG_GETARG_DATUM(0),
+									PG_GETARG_DATUM(1)
+		));
+	}
+
 	/* we'll re-count the tuples each time */
 	stats->num_index_tuples = 0;
 
@@ -703,9 +710,18 @@ ginvacuumcleanup(PG_FUNCTION_ARGS)
 	BlockNumber lastBlock = GIN_ROOT_BLKNO,
 				lastFilledBlock = GIN_ROOT_BLKNO;
 
-	/* Set up all-zero stats if ginbulkdelete wasn't called */
+	/*
+	 * Set up all-zero stats and finalize fast insertion
+	 * if ginbulkdelete wasn't called
+	 */
 	if (stats == NULL)
-		stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+	{
+		stats = (IndexBulkDeleteResult *)DatumGetPointer(
+			DirectFunctionCall2(gininsertcleanup,
+									PG_GETARG_DATUM(0),
+									PG_GETARG_DATUM(1)
+		));
+	}
 
 	/*
 	 * XXX we always report the heap tuple count as the number of index
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index 0d40bfb..76db49c 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -71,20 +71,30 @@ static void
 ginRedoCreateIndex(XLogRecPtr lsn, XLogRecord *record)
 {
 	RelFileNode *node = (RelFileNode *) XLogRecGetData(record);
-	Buffer		buffer;
+	Buffer		RootBuffer, MetaBuffer;
 	Page		page;
 
-	buffer = XLogReadBuffer(*node, GIN_ROOT_BLKNO, true);
-	Assert(BufferIsValid(buffer));
-	page = (Page) BufferGetPage(buffer);
+	MetaBuffer = XLogReadBuffer(*node, GIN_METAPAGE_BLKNO, true);
+	Assert(BufferIsValid(MetaBuffer));
+	GinInitMetabuffer(MetaBuffer);
+
+	page = (Page) BufferGetPage(MetaBuffer);
+	PageSetLSN(page, lsn);
+	PageSetTLI(page, ThisTimeLineID);
 
-	GinInitBuffer(buffer, GIN_LEAF);
+	RootBuffer = XLogReadBuffer(*node, GIN_ROOT_BLKNO, true);
+	Assert(BufferIsValid(RootBuffer));
+	page = (Page) BufferGetPage(RootBuffer);
+
+	GinInitBuffer(RootBuffer, GIN_LEAF);
 
 	PageSetLSN(page, lsn);
 	PageSetTLI(page, ThisTimeLineID);
 
-	MarkBufferDirty(buffer);
-	UnlockReleaseBuffer(buffer);
+	MarkBufferDirty(MetaBuffer);
+	UnlockReleaseBuffer(MetaBuffer);
+	MarkBufferDirty(RootBuffer);
+	UnlockReleaseBuffer(RootBuffer);
 }
 
 static void
@@ -433,6 +443,161 @@ ginRedoDeletePage(XLogRecPtr lsn, XLogRecord *record)
 	}
 }
 
+static void
+ginRedoUpdateMetapage(XLogRecPtr lsn, XLogRecord *record)
+{
+	ginxlogUpdateMeta *data = (ginxlogUpdateMeta*) XLogRecGetData(record);
+	Buffer		metabuffer;
+	Page		metapage;
+
+	metabuffer = XLogReadBuffer(data->node, GIN_METAPAGE_BLKNO, false);
+	metapage = BufferGetPage(metabuffer);
+
+	if (!XLByteLE(lsn, PageGetLSN(metapage)))
+	{
+		memcpy( GinPageGetMeta(metapage), &data->metadata, sizeof(GinMetaPageData));
+		PageSetLSN(metapage, lsn);
+		PageSetTLI(metapage, ThisTimeLineID);
+		MarkBufferDirty(metabuffer);
+	}
+
+	if ( data->ntuples > 0 )
+	{
+		/*
+		 * insert into tail page
+		 */
+		if (!(record->xl_info & XLR_BKP_BLOCK_1))
+		{
+			Buffer 	buffer = XLogReadBuffer(data->node, data->metadata.tail, false);
+			Page 	page = BufferGetPage(buffer);
+
+			if ( !XLByteLE(lsn, PageGetLSN(page)))
+			{
+				OffsetNumber l, off = (PageIsEmpty(page)) ? FirstOffsetNumber :
+						OffsetNumberNext(PageGetMaxOffsetNumber(page));
+				int				i, tupsize;
+				IndexTuple		tuples = (IndexTuple) (XLogRecGetData(record) + sizeof(ginxlogUpdateMeta));
+
+				for(i=0; i<data->ntuples; i++)
+				{
+					tupsize = IndexTupleSize(tuples);
+
+					l = PageAddItem(page, (Item)tuples, tupsize, off, false, false);
+
+					if (l == InvalidOffsetNumber)
+						elog(ERROR, "failed to add item to index page");
+
+					tuples = (IndexTuple)( ((char*)tuples) + tupsize );
+				}
+
+				PageSetLSN(page, lsn);
+				PageSetTLI(page, ThisTimeLineID);
+				MarkBufferDirty(buffer);
+			}
+			UnlockReleaseBuffer(buffer);
+		}
+	}
+	else if ( data->prevTail != InvalidBlockNumber )
+	{
+		/*
+		 * New tail
+		 */
+
+		Buffer 	buffer = XLogReadBuffer(data->node, data->prevTail, false);
+		Page 	page = BufferGetPage(buffer);
+
+		if ( !XLByteLE(lsn, PageGetLSN(page)))
+		{
+			GinPageGetOpaque(page)->rightlink = data->newRightlink;
+
+			PageSetLSN(page, lsn);
+			PageSetTLI(page, ThisTimeLineID);
+			MarkBufferDirty(buffer);
+		}
+		UnlockReleaseBuffer(buffer);
+	}
+
+	UnlockReleaseBuffer(metabuffer);
+}
+
+static void
+ginRedoInsertListPage(XLogRecPtr lsn, XLogRecord *record)
+{
+	ginxlogInsertListPage *data = (ginxlogInsertListPage*) XLogRecGetData(record);
+	Buffer			buffer;
+	Page			page;
+	OffsetNumber 	l, off = FirstOffsetNumber;
+	int				i, tupsize;
+	IndexTuple      tuples = (IndexTuple) (XLogRecGetData(record) + sizeof(ginxlogInsertListPage));
+
+	if (record->xl_info & XLR_BKP_BLOCK_1)
+		return;
+
+	buffer = XLogReadBuffer(data->node, data->blkno, true);
+	page = BufferGetPage(buffer);
+
+	GinInitBuffer(buffer, GIN_LIST);
+	GinPageGetOpaque(page)->rightlink = data->rightlink;
+	if ( data->rightlink == InvalidBlockNumber )
+		GinPageSetFullRow(page);
+
+	for(i=0; i<data->ntuples; i++)
+	{
+		tupsize = IndexTupleSize(tuples);
+
+		l = PageAddItem(page, (Item)tuples, tupsize, off, false, false);
+
+		if (l == InvalidOffsetNumber)
+			elog(ERROR, "failed to add item to index page");
+
+		tuples = (IndexTuple)( ((char*)tuples) + tupsize );
+	}
+
+	PageSetLSN(page, lsn);
+	PageSetTLI(page, ThisTimeLineID);
+	MarkBufferDirty(buffer);
+
+	UnlockReleaseBuffer(buffer);
+}
+
+static void
+ginRedoDeleteListPages(XLogRecPtr lsn, XLogRecord *record)
+{
+	ginxlogDeleteListPages *data = (ginxlogDeleteListPages*) XLogRecGetData(record);
+	Buffer		metabuffer;
+	Page		metapage;
+	int 		i;
+
+	metabuffer = XLogReadBuffer(data->node, GIN_METAPAGE_BLKNO, false);
+	metapage = BufferGetPage(metabuffer);
+
+	if (!XLByteLE(lsn, PageGetLSN(metapage)))
+	{
+		memcpy( GinPageGetMeta(metapage), &data->metadata, sizeof(GinMetaPageData));
+		PageSetLSN(metapage, lsn);
+		PageSetTLI(metapage, ThisTimeLineID);
+		MarkBufferDirty(metabuffer);
+	}
+
+	for(i=0; i<data->ndeleted; i++)
+	{
+		Buffer 	buffer = XLogReadBuffer(data->node,data->toDelete[i],false);
+		Page	page = BufferGetPage(buffer);
+
+		if ( !XLByteLE(lsn, PageGetLSN(page)))
+		{
+			GinPageGetOpaque(page)->flags = GIN_DELETED;
+
+			PageSetLSN(page, lsn);
+			PageSetTLI(page, ThisTimeLineID);
+			MarkBufferDirty(buffer);
+		}
+
+		UnlockReleaseBuffer(buffer);
+	}
+	UnlockReleaseBuffer(metabuffer);
+}
+
 void
 gin_redo(XLogRecPtr lsn, XLogRecord *record)
 {
@@ -459,6 +624,15 @@ gin_redo(XLogRecPtr lsn, XLogRecord *record)
 		case XLOG_GIN_DELETE_PAGE:
 			ginRedoDeletePage(lsn, record);
 			break;
+		case XLOG_GIN_UPDATE_META_PAGE:
+			ginRedoUpdateMetapage(lsn, record);
+			break;
+		case XLOG_GIN_INSERT_LISTPAGE:
+			ginRedoInsertListPage(lsn, record);
+			break;
+		case XLOG_GIN_DELETE_LISTPAGE: 
+			ginRedoDeleteListPages(lsn, record);
+			break;
 		default:
 			elog(PANIC, "gin_redo: unknown op code %u", info);
 	}
@@ -514,6 +688,18 @@ gin_desc(StringInfo buf, uint8 xl_info, char *rec)
 			appendStringInfo(buf, "Delete page, ");
 			desc_node(buf, ((ginxlogDeletePage *) rec)->node, ((ginxlogDeletePage *) rec)->blkno);
 			break;
+		case XLOG_GIN_UPDATE_META_PAGE:
+			appendStringInfo(buf, "Update metapage, ");
+			desc_node(buf, ((ginxlogUpdateMeta *) rec)->node, ((ginxlogUpdateMeta *) rec)->metadata.tail); 
+			break;
+		case XLOG_GIN_INSERT_LISTPAGE:
+			appendStringInfo(buf, "Insert new list page, ");
+			desc_node(buf, ((ginxlogInsertListPage *) rec)->node, ((ginxlogInsertListPage *) rec)->blkno); 
+			break;
+		case XLOG_GIN_DELETE_LISTPAGE:
+			appendStringInfo(buf, "Delete list page (%d), ", ((ginxlogDeleteListPages *) rec)->ndeleted);
+			desc_node(buf, ((ginxlogDeleteListPages *) rec)->node, ((ginxlogDeleteListPages *) rec)->metadata.head); 
+			break;
 		default:
 			elog(PANIC, "gin_desc: unknown op code %u", info);
 	}
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 2fc6f05..4434ab4 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -193,6 +193,7 @@ CREATE VIEW pg_stat_all_tables AS
             pg_stat_get_tuples_updated(C.oid) AS n_tup_upd, 
             pg_stat_get_tuples_deleted(C.oid) AS n_tup_del,
             pg_stat_get_tuples_hot_updated(C.oid) AS n_tup_hot_upd,
+            pg_stat_get_fresh_inserted_tuples(C.oid) AS n_fresh_tup, 
             pg_stat_get_live_tuples(C.oid) AS n_live_tup, 
             pg_stat_get_dead_tuples(C.oid) AS n_dead_tup,
             pg_stat_get_last_vacuum_time(C.oid) as last_vacuum,
diff --git a/src/backend/nodes/tidbitmap.c b/src/backend/nodes/tidbitmap.c
index ffc882f..0864a04 100644
--- a/src/backend/nodes/tidbitmap.c
+++ b/src/backend/nodes/tidbitmap.c
@@ -306,6 +306,47 @@ tbm_add_tuples(TIDBitmap *tbm, const ItemPointer tids, int ntids,
 }
 
 /*
+ * tbm_check_tuple - Check presence of tuple's ID in a TIDBitmap
+ */
+bool
+tbm_check_tuple(TIDBitmap *tbm, const ItemPointer tid) {
+	BlockNumber 	blk = ItemPointerGetBlockNumber(tid);
+	OffsetNumber 	off = ItemPointerGetOffsetNumber(tid);
+	PagetableEntry *page;
+	int				wordnum,
+					bitnum;
+
+	/* safety check to ensure we don't overrun bit array bounds */
+	if (off < 1 || off > MAX_TUPLES_PER_PAGE)
+		elog(ERROR, "tuple offset out of range: %u", off);
+
+	if (tbm_page_is_lossy(tbm, blk))
+		return true; /* whole page is already marked */
+
+	page = tbm_get_pageentry(tbm, blk);
+	if (page->ischunk)
+	{
+		wordnum = bitnum = 0;
+	}
+	else
+	{
+		wordnum = WORDNUM(off - 1);
+		bitnum = BITNUM(off - 1);
+	}
+
+	return ( page->words[wordnum] & ((bitmapword) 1 << bitnum) ) ? true : false; 
+}
+
+/*
+ * tbm_has_lossy - returns true if there is at least one lossy page
+ */
+bool
+tbm_has_lossy(TIDBitmap *tbm)
+{
+	return (tbm->nchunks>0);
+}
+
+/*
  * tbm_union - set union
  *
  * a is modified in-place, b is not changed
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 2c68779..324ae44 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2477,6 +2477,58 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map)
 }
 
 /*
+ * relation_has_pending_indexes
+ *
+ * Returns true if relation has indexes with delayed insertion.
+ * Currently, only GIN has that possibility.
+ */
+
+static bool
+relation_has_pending_indexes(Oid relid, Form_pg_class classForm)
+{
+	Relation	rel;
+	List       *indexoidlist;
+	ListCell   *indexoidscan;
+	bool		has = false;
+
+	/* only an ordinary cataloged heap can contain such indexes */
+	if ( classForm->relkind != RELKIND_RELATION )
+		return false;
+
+	/* has no indexes at all */
+	if ( classForm->relhasindex == false )
+		return false;
+
+	rel = RelationIdGetRelation(relid);
+
+	indexoidlist = RelationGetIndexList(rel);
+
+	foreach(indexoidscan, indexoidlist)
+	{
+		Oid         indexoid = lfirst_oid(indexoidscan);
+		Relation	irel = RelationIdGetRelation(indexoid);
+
+		/*
+		 *  Currently, only GIN 
+		 */
+		if ( irel->rd_rel->relam == GIN_AM_OID )
+		{
+			RelationClose(irel);
+			has = true;
+			break;
+		}
+
+		RelationClose(irel);
+	}
+
+	list_free(indexoidlist);
+
+	RelationClose(rel);
+
+	return has;
+}
+
+/*
  * relation_needs_vacanalyze
  *
  * Check whether a relation needs to be vacuumed or analyzed; return each into
@@ -2533,7 +2585,8 @@ relation_needs_vacanalyze(Oid relid,
 
 	/* number of vacuum (resp. analyze) tuples at this time */
 	float4		vactuples,
-				anltuples;
+				anltuples,
+				instuples;
 
 	/* freeze parameters */
 	int			freeze_max_age;
@@ -2598,6 +2651,7 @@ relation_needs_vacanalyze(Oid relid,
 		vactuples = tabentry->n_dead_tuples;
 		anltuples = tabentry->n_live_tuples + tabentry->n_dead_tuples -
 			tabentry->last_anl_tuples;
+		instuples = tabentry->n_inserted_tuples;
 
 		vacthresh = (float4) vac_base_thresh + vac_scale_factor * reltuples;
 		anlthresh = (float4) anl_base_thresh + anl_scale_factor * reltuples;
@@ -2611,8 +2665,13 @@ relation_needs_vacanalyze(Oid relid,
 			 NameStr(classForm->relname),
 			 vactuples, vacthresh, anltuples, anlthresh);
 
-		/* Determine if this table needs vacuum or analyze. */
-		*dovacuum = force_vacuum || (vactuples > vacthresh);
+		/*
+		 * Determine if this table needs vacuum or analyze.
+		 * Use vac_base_thresh as the threshold for instuples because
+		 * the search time of GIN's pending pages is linear in their number.
+		 */
+		*dovacuum = force_vacuum || (vactuples > vacthresh) || 
+				(relation_has_pending_indexes(relid, classForm) && instuples > vac_base_thresh);
 		*doanalyze = (anltuples > anlthresh);
 	}
 	else
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 5ae0ec1..24573ff 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -3537,6 +3537,9 @@ pgstat_recv_tabstat(PgStat_MsgTabstat *msg, int len)
 			tabentry->tuples_updated = tabmsg[i].t_counts.t_tuples_updated;
 			tabentry->tuples_deleted = tabmsg[i].t_counts.t_tuples_deleted;
 			tabentry->tuples_hot_updated = tabmsg[i].t_counts.t_tuples_hot_updated;
+			tabentry->n_inserted_tuples = tabmsg[i].t_counts.t_tuples_inserted + 
+										  tabmsg[i].t_counts.t_tuples_updated -
+										  tabmsg[i].t_counts.t_tuples_hot_updated;
 			tabentry->n_live_tuples = tabmsg[i].t_counts.t_new_live_tuples;
 			tabentry->n_dead_tuples = tabmsg[i].t_counts.t_new_dead_tuples;
 			tabentry->blocks_fetched = tabmsg[i].t_counts.t_blocks_fetched;
@@ -3560,6 +3563,9 @@ pgstat_recv_tabstat(PgStat_MsgTabstat *msg, int len)
 			tabentry->tuples_updated += tabmsg[i].t_counts.t_tuples_updated;
 			tabentry->tuples_deleted += tabmsg[i].t_counts.t_tuples_deleted;
 			tabentry->tuples_hot_updated += tabmsg[i].t_counts.t_tuples_hot_updated;
+			tabentry->n_inserted_tuples += tabmsg[i].t_counts.t_tuples_inserted + 
+										  tabmsg[i].t_counts.t_tuples_updated -
+										  tabmsg[i].t_counts.t_tuples_hot_updated;
 			tabentry->n_live_tuples += tabmsg[i].t_counts.t_new_live_tuples;
 			tabentry->n_dead_tuples += tabmsg[i].t_counts.t_new_dead_tuples;
 			tabentry->blocks_fetched += tabmsg[i].t_counts.t_blocks_fetched;
@@ -3570,6 +3576,8 @@ pgstat_recv_tabstat(PgStat_MsgTabstat *msg, int len)
 		tabentry->n_live_tuples = Max(tabentry->n_live_tuples, 0);
 		/* Likewise for n_dead_tuples */
 		tabentry->n_dead_tuples = Max(tabentry->n_dead_tuples, 0);
+		/* Likewise for n_inserted_tuples */
+		tabentry->n_inserted_tuples = Max(tabentry->n_inserted_tuples, 0);
 
 		/*
 		 * Add per-table stats to the per-database entry, too.
@@ -3770,6 +3778,7 @@ pgstat_recv_vacuum(PgStat_MsgVacuum *msg, int len)
 		tabentry->n_live_tuples = msg->m_tuples;
 	/* Resetting dead_tuples to 0 is an approximation ... */
 	tabentry->n_dead_tuples = 0;
+	tabentry->n_inserted_tuples = 0;
 	if (msg->m_analyze)
 	{
 		if (msg->m_scanned_all)
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 77c2baa..381de6f 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -31,6 +31,7 @@ extern Datum pg_stat_get_tuples_updated(PG_FUNCTION_ARGS);
 extern Datum pg_stat_get_tuples_deleted(PG_FUNCTION_ARGS);
 extern Datum pg_stat_get_tuples_hot_updated(PG_FUNCTION_ARGS);
 extern Datum pg_stat_get_live_tuples(PG_FUNCTION_ARGS);
+extern Datum pg_stat_get_fresh_inserted_tuples(PG_FUNCTION_ARGS);
 extern Datum pg_stat_get_dead_tuples(PG_FUNCTION_ARGS);
 extern Datum pg_stat_get_blocks_fetched(PG_FUNCTION_ARGS);
 extern Datum pg_stat_get_blocks_hit(PG_FUNCTION_ARGS);
@@ -209,6 +210,20 @@ pg_stat_get_live_tuples(PG_FUNCTION_ARGS)
 	PG_RETURN_INT64(result);
 }
 
+Datum
+pg_stat_get_fresh_inserted_tuples(PG_FUNCTION_ARGS)
+{
+	Oid			relid = PG_GETARG_OID(0);
+	int64		result;
+	PgStat_StatTabEntry *tabentry;
+
+	if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL)
+		result = 0;
+	else
+		result = (int64) (tabentry->n_inserted_tuples);
+
+	PG_RETURN_INT64(result);
+}
 
 Datum
 pg_stat_get_dead_tuples(PG_FUNCTION_ARGS)
diff --git a/src/include/access/gin.h b/src/include/access/gin.h
index 0fd2cbd..f514358 100644
--- a/src/include/access/gin.h
+++ b/src/include/access/gin.h
@@ -21,6 +21,7 @@
 #include "storage/buf.h"
 #include "storage/off.h"
 #include "storage/relfilenode.h"
+#include "utils/rel.h"
 
 
 /*
@@ -52,11 +53,34 @@ typedef struct GinPageOpaqueData
 
 typedef GinPageOpaqueData *GinPageOpaque;
 
-#define GIN_ROOT_BLKNO	(0)
+#define GIN_METAPAGE_BLKNO	(0)
+#define GIN_ROOT_BLKNO	(1)
 
 #define GIN_DATA		  (1 << 0)
 #define GIN_LEAF		  (1 << 1)
 #define GIN_DELETED		  (1 << 2)
+#define GIN_META		  (1 << 3)
+#define GIN_LIST		  (1 << 4)
+#define GIN_LIST_FULLROW  (1 << 5)   /* makes sense only on GIN_LIST page */
+
+typedef struct GinMetaPageData
+{
+	/*
+	 * Pointers to head and tail of list of GIN_LIST pages.  These store
+	 * fast-inserted entries that haven't yet been moved into the regular
+	 * GIN structure.
+	 */
+	BlockNumber			head;
+	BlockNumber         tail;
+
+	/*
+	 * Free space in bytes in the list's tail page.
+	 */
+	uint32				tailFreeSize;
+} GinMetaPageData;
+
+#define GinPageGetMeta(p) \
+	((GinMetaPageData *) PageGetContents(p))
 
 /*
  * Works on page
@@ -68,6 +92,8 @@ typedef GinPageOpaqueData *GinPageOpaque;
 #define GinPageSetNonLeaf(page)    ( GinPageGetOpaque(page)->flags &= ~GIN_LEAF )
 #define GinPageIsData(page)    ( GinPageGetOpaque(page)->flags & GIN_DATA )
 #define GinPageSetData(page)   ( GinPageGetOpaque(page)->flags |= GIN_DATA )
+#define GinPageHasFullRow(page)    ( GinPageGetOpaque(page)->flags & GIN_LIST_FULLROW )
+#define GinPageSetFullRow(page)   ( GinPageGetOpaque(page)->flags |= GIN_LIST_FULLROW )
 
 #define GinPageIsDeleted(page) ( GinPageGetOpaque(page)->flags & GIN_DELETED)
 #define GinPageSetDeleted(page)    ( GinPageGetOpaque(page)->flags |= GIN_DELETED)
@@ -135,6 +161,20 @@ typedef struct
 	 - GinPageGetOpaque(page)->maxoff * GinSizeOfItem(page) \
 	 - MAXALIGN(sizeof(GinPageOpaqueData)))
 
+/*
+ * storage type for GIN's options.  Must be upward compatible with
+ * StdRdOptions, since we might call RelationGetFillFactor().
+ */
+typedef struct GinOptions
+{
+	StdRdOptions std;			/* standard options */
+	bool		useFastUpdate;	/* use fast updates? */
+} GinOptions;
+
+#define GIN_DEFAULT_USE_FASTUPDATE  true
+#define GinGetUseFastUpdate(relation) \
+    ((relation)->rd_options ? \
+	 ((GinOptions *) (relation)->rd_options)->useFastUpdate : GIN_DEFAULT_USE_FASTUPDATE)
 
 #define GIN_UNLOCK	BUFFER_LOCK_UNLOCK
 #define GIN_SHARE	BUFFER_LOCK_SHARE
@@ -234,12 +274,49 @@ typedef struct ginxlogDeletePage
 	BlockNumber rightLink;
 } ginxlogDeletePage;
 
+
+#define XLOG_GIN_UPDATE_META_PAGE 0x60
+
+typedef struct ginxlogUpdateMeta
+{
+	RelFileNode     node;
+	GinMetaPageData metadata;
+	BlockNumber     prevTail;
+	BlockNumber     newRightlink;
+	int32           ntuples; /* if ntuples > 0, metadata.tail was extended with
+								these tuples; otherwise a new sublist was inserted */
+	/* array of inserted tuples follows */
+} ginxlogUpdateMeta;
+
+#define XLOG_GIN_INSERT_LISTPAGE  0x70
+
+typedef struct ginxlogInsertListPage
+{
+	RelFileNode     node;
+	BlockNumber     blkno;
+	BlockNumber     rightlink;
+	int32           ntuples;
+	/* array of inserted tuples follows */
+} ginxlogInsertListPage;
+
+#define XLOG_GIN_DELETE_LISTPAGE  0x80
+
+#define NDELETE_AT_ONCE (16)
+typedef struct ginxlogDeleteListPages
+{
+	RelFileNode     node;
+	GinMetaPageData metadata;
+	int32           ndeleted;
+	BlockNumber     toDelete[ NDELETE_AT_ONCE ];
+} ginxlogDeleteListPages;
+
 /* ginutil.c */
 extern Datum ginoptions(PG_FUNCTION_ARGS);
 extern void initGinState(GinState *state, Relation index);
 extern Buffer GinNewBuffer(Relation index);
 extern void GinInitBuffer(Buffer b, uint32 f);
 extern void GinInitPage(Page page, uint32 f, Size pageSize);
+extern void GinInitMetabuffer(Buffer b);
 extern int	compareEntries(GinState *ginstate, OffsetNumber attnum, Datum a, Datum b);
 extern int	compareAttEntries(GinState *ginstate, OffsetNumber attnum_a, Datum a, 
 												  OffsetNumber attnum_b, Datum b);
@@ -252,6 +329,8 @@ extern OffsetNumber gintuple_get_attrnum(GinState *ginstate, IndexTuple tuple);
 /* gininsert.c */
 extern Datum ginbuild(PG_FUNCTION_ARGS);
 extern Datum gininsert(PG_FUNCTION_ARGS);
+extern void ginEntryInsert(Relation index, GinState *ginstate, OffsetNumber attnum, Datum value,
+							ItemPointerData *items, uint32 nitem, bool isBuild);
 
 /* ginxlog.c */
 extern void gin_redo(XLogRecPtr lsn, XLogRecord *record);
@@ -425,6 +504,10 @@ typedef struct GinScanOpaqueData
 	uint32		nkeys;
 	bool		isVoidRes;		/* true if ginstate.extractQueryFn guarantees
 								 * that nothing will be found */
+	bool				isScanFastInsert; /* true while the scan is processing fast-update pages */
+	TIDBitmap			*scanFastTuples;	  
+	TBMIterateResult	*scanFastResult;
+	OffsetNumber		scanFastOffset;
 } GinScanOpaqueData;
 
 typedef GinScanOpaqueData *GinScanOpaque;
@@ -488,4 +571,23 @@ extern void ginInsertRecordBA(BuildAccumulator *accum,
 				  OffsetNumber attnum, Datum *entries, int32 nentry);
 extern ItemPointerData *ginGetEntry(BuildAccumulator *accum, OffsetNumber *attnum, Datum *entry, uint32 *n);
 
+/* ginfast.c */
+
+typedef struct GinTupleCollector {
+	IndexTuple	*tuples;
+	uint32		 ntuples;
+	uint32		 lentuples;
+	uint32		 sumsize;
+} GinTupleCollector; 
+
+extern void ginHeapTupleFastInsert(Relation index, GinTupleCollector *collector);  
+extern uint32 ginHeapTupleFastCollect(Relation index, GinState *ginstate,
+					GinTupleCollector *collector,
+					OffsetNumber attnum, Datum value, ItemPointer item);
+
+#define GinListPageSize   \
+    ( BLCKSZ - SizeOfPageHeaderData - MAXALIGN(sizeof(GinPageOpaqueData)) )
+
+extern Datum gininsertcleanup(PG_FUNCTION_ARGS);
+
 #endif
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index d405d82..9165f08 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2928,6 +2928,8 @@ DATA(insert OID = 1933 (  pg_stat_get_tuples_deleted	PGNSP PGUID 12 1 0 0 f f f
 DESCR("statistics: number of tuples deleted");
 DATA(insert OID = 1972 (  pg_stat_get_tuples_hot_updated PGNSP PGUID 12 1 0 0 f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ pg_stat_get_tuples_hot_updated _null_ _null_ _null_ ));
 DESCR("statistics: number of tuples hot updated");
+DATA(insert OID = 2316 (  pg_stat_get_fresh_inserted_tuples	PGNSP PGUID 12 1 0 0 f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ pg_stat_get_fresh_inserted_tuples _null_ _null_ _null_ ));
+DESCR("statistics: number of inserted tuples since last vacuum");
 DATA(insert OID = 2878 (  pg_stat_get_live_tuples	PGNSP PGUID 12 1 0 0 f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ pg_stat_get_live_tuples _null_ _null_ _null_ ));
 DESCR("statistics: number of live tuples");
 DATA(insert OID = 2879 (  pg_stat_get_dead_tuples	PGNSP PGUID 12 1 0 0 f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ pg_stat_get_dead_tuples _null_ _null_ _null_ ));
diff --git a/src/include/nodes/tidbitmap.h b/src/include/nodes/tidbitmap.h
index 56d6a0d..c8dbeea 100644
--- a/src/include/nodes/tidbitmap.h
+++ b/src/include/nodes/tidbitmap.h
@@ -49,6 +49,8 @@ extern void tbm_free(TIDBitmap *tbm);
 extern void tbm_add_tuples(TIDBitmap *tbm,
 						   const ItemPointer tids, int ntids,
 						   bool recheck);
+extern bool tbm_check_tuple(TIDBitmap *tbm, const ItemPointer tid);
+extern bool	tbm_has_lossy(TIDBitmap *tbm);
 
 extern void tbm_union(TIDBitmap *a, const TIDBitmap *b);
 extern void tbm_intersect(TIDBitmap *a, const TIDBitmap *b);
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 4a1e274..79754dc 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -476,6 +476,8 @@ typedef struct PgStat_StatTabEntry
 	PgStat_Counter tuples_deleted;
 	PgStat_Counter tuples_hot_updated;
 
+	PgStat_Counter n_inserted_tuples; /* tuples inserted or non-HOT updated
+									   * since last vacuum */
 	PgStat_Counter n_live_tuples;
 	PgStat_Counter n_dead_tuples;
 	PgStat_Counter last_anl_tuples;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 977b17c..c037696 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1291,14 +1291,14 @@ SELECT viewname, definition FROM pg_views WHERE schemaname <> 'information_schem
  pg_shadow                | SELECT pg_authid.rolname AS usename, pg_authid.oid AS usesysid, pg_authid.rolcreatedb AS usecreatedb, pg_authid.rolsuper AS usesuper, pg_authid.rolcatupdate AS usecatupd, pg_authid.rolpassword AS passwd, (pg_authid.rolvaliduntil)::abstime AS valuntil, pg_authid.rolconfig AS useconfig FROM pg_authid WHERE pg_authid.rolcanlogin;
  pg_stat_activity         | SELECT s.datid, d.datname, s.procpid, s.usesysid, u.rolname AS usename, s.current_query, s.waiting, s.xact_start, s.query_start, s.backend_start, s.client_addr, s.client_port FROM pg_database d, pg_stat_get_activity(NULL::integer) s(datid, procpid, usesysid, current_query, waiting, xact_start, query_start, backend_start, client_addr, client_port), pg_authid u WHERE ((s.datid = d.oid) AND (s.usesysid = u.oid));
  pg_stat_all_indexes      | SELECT c.oid AS relid, i.oid AS indexrelid, n.nspname AS schemaname, c.relname, i.relname AS indexrelname, pg_stat_get_numscans(i.oid) AS idx_scan, pg_stat_get_tuples_returned(i.oid) AS idx_tup_read, pg_stat_get_tuples_fetched(i.oid) AS idx_tup_fetch FROM (((pg_class c JOIN pg_index x ON ((c.oid = x.indrelid))) JOIN pg_class i ON ((i.oid = x.indexrelid))) LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace))) WHERE (c.relkind = ANY (ARRAY['r'::"char", 't'::"char"]));
- pg_stat_all_tables       | SELECT c.oid AS relid, n.nspname AS schemaname, c.relname, pg_stat_get_numscans(c.oid) AS seq_scan, pg_stat_get_tuples_returned(c.oid) AS seq_tup_read, (sum(pg_stat_get_numscans(i.indexrelid)))::bigint AS idx_scan, ((sum(pg_stat_get_tuples_fetched(i.indexrelid)))::bigint + pg_stat_get_tuples_fetched(c.oid)) AS idx_tup_fetch, pg_stat_get_tuples_inserted(c.oid) AS n_tup_ins, pg_stat_get_tuples_updated(c.oid) AS n_tup_upd, pg_stat_get_tuples_deleted(c.oid) AS n_tup_del, pg_stat_get_tuples_hot_updated(c.oid) AS n_tup_hot_upd, pg_stat_get_live_tuples(c.oid) AS n_live_tup, pg_stat_get_dead_tuples(c.oid) AS n_dead_tup, pg_stat_get_last_vacuum_time(c.oid) AS last_vacuum, pg_stat_get_last_autovacuum_time(c.oid) AS last_autovacuum, pg_stat_get_last_analyze_time(c.oid) AS last_analyze, pg_stat_get_last_autoanalyze_time(c.oid) AS last_autoanalyze FROM ((pg_class c LEFT JOIN pg_index i ON ((c.oid = i.indrelid))) LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace))) WHERE (c.relkind = ANY (ARRAY['r'::"char", 't'::"char"])) GROUP BY c.oid, n.nspname, c.relname;
+ pg_stat_all_tables       | SELECT c.oid AS relid, n.nspname AS schemaname, c.relname, pg_stat_get_numscans(c.oid) AS seq_scan, pg_stat_get_tuples_returned(c.oid) AS seq_tup_read, (sum(pg_stat_get_numscans(i.indexrelid)))::bigint AS idx_scan, ((sum(pg_stat_get_tuples_fetched(i.indexrelid)))::bigint + pg_stat_get_tuples_fetched(c.oid)) AS idx_tup_fetch, pg_stat_get_tuples_inserted(c.oid) AS n_tup_ins, pg_stat_get_tuples_updated(c.oid) AS n_tup_upd, pg_stat_get_tuples_deleted(c.oid) AS n_tup_del, pg_stat_get_tuples_hot_updated(c.oid) AS n_tup_hot_upd, pg_stat_get_fresh_inserted_tuples(c.oid) AS n_fresh_tup, pg_stat_get_live_tuples(c.oid) AS n_live_tup, pg_stat_get_dead_tuples(c.oid) AS n_dead_tup, pg_stat_get_last_vacuum_time(c.oid) AS last_vacuum, pg_stat_get_last_autovacuum_time(c.oid) AS last_autovacuum, pg_stat_get_last_analyze_time(c.oid) AS last_analyze, pg_stat_get_last_autoanalyze_time(c.oid) AS last_autoanalyze FROM ((pg_class c LEFT JOIN pg_index i ON ((c.oid = i.indrelid))) LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace))) WHERE (c.relkind = ANY (ARRAY['r'::"char", 't'::"char"])) GROUP BY c.oid, n.nspname, c.relname;
  pg_stat_bgwriter         | SELECT pg_stat_get_bgwriter_timed_checkpoints() AS checkpoints_timed, pg_stat_get_bgwriter_requested_checkpoints() AS checkpoints_req, pg_stat_get_bgwriter_buf_written_checkpoints() AS buffers_checkpoint, pg_stat_get_bgwriter_buf_written_clean() AS buffers_clean, pg_stat_get_bgwriter_maxwritten_clean() AS maxwritten_clean, pg_stat_get_buf_written_backend() AS buffers_backend, pg_stat_get_buf_alloc() AS buffers_alloc;
  pg_stat_database         | SELECT d.oid AS datid, d.datname, pg_stat_get_db_numbackends(d.oid) AS numbackends, pg_stat_get_db_xact_commit(d.oid) AS xact_commit, pg_stat_get_db_xact_rollback(d.oid) AS xact_rollback, (pg_stat_get_db_blocks_fetched(d.oid) - pg_stat_get_db_blocks_hit(d.oid)) AS blks_read, pg_stat_get_db_blocks_hit(d.oid) AS blks_hit, pg_stat_get_db_tuples_returned(d.oid) AS tup_returned, pg_stat_get_db_tuples_fetched(d.oid) AS tup_fetched, pg_stat_get_db_tuples_inserted(d.oid) AS tup_inserted, pg_stat_get_db_tuples_updated(d.oid) AS tup_updated, pg_stat_get_db_tuples_deleted(d.oid) AS tup_deleted FROM pg_database d;
  pg_stat_sys_indexes      | SELECT pg_stat_all_indexes.relid, pg_stat_all_indexes.indexrelid, pg_stat_all_indexes.schemaname, pg_stat_all_indexes.relname, pg_stat_all_indexes.indexrelname, pg_stat_all_indexes.idx_scan, pg_stat_all_indexes.idx_tup_read, pg_stat_all_indexes.idx_tup_fetch FROM pg_stat_all_indexes WHERE ((pg_stat_all_indexes.schemaname = ANY (ARRAY['pg_catalog'::name, 'information_schema'::name])) OR (pg_stat_all_indexes.schemaname ~ '^pg_toast'::text));
- pg_stat_sys_tables       | SELECT pg_stat_all_tables.relid, pg_stat_all_tables.schemaname, pg_stat_all_tables.relname, pg_stat_all_tables.seq_scan, pg_stat_all_tables.seq_tup_read, pg_stat_all_tables.idx_scan, pg_stat_all_tables.idx_tup_fetch, pg_stat_all_tables.n_tup_ins, pg_stat_all_tables.n_tup_upd, pg_stat_all_tables.n_tup_del, pg_stat_all_tables.n_tup_hot_upd, pg_stat_all_tables.n_live_tup, pg_stat_all_tables.n_dead_tup, pg_stat_all_tables.last_vacuum, pg_stat_all_tables.last_autovacuum, pg_stat_all_tables.last_analyze, pg_stat_all_tables.last_autoanalyze FROM pg_stat_all_tables WHERE ((pg_stat_all_tables.schemaname = ANY (ARRAY['pg_catalog'::name, 'information_schema'::name])) OR (pg_stat_all_tables.schemaname ~ '^pg_toast'::text));
+ pg_stat_sys_tables       | SELECT pg_stat_all_tables.relid, pg_stat_all_tables.schemaname, pg_stat_all_tables.relname, pg_stat_all_tables.seq_scan, pg_stat_all_tables.seq_tup_read, pg_stat_all_tables.idx_scan, pg_stat_all_tables.idx_tup_fetch, pg_stat_all_tables.n_tup_ins, pg_stat_all_tables.n_tup_upd, pg_stat_all_tables.n_tup_del, pg_stat_all_tables.n_tup_hot_upd, pg_stat_all_tables.n_fresh_tup, pg_stat_all_tables.n_live_tup, pg_stat_all_tables.n_dead_tup, pg_stat_all_tables.last_vacuum, pg_stat_all_tables.last_autovacuum, pg_stat_all_tables.last_analyze, pg_stat_all_tables.last_autoanalyze FROM pg_stat_all_tables WHERE ((pg_stat_all_tables.schemaname = ANY (ARRAY['pg_catalog'::name, 'information_schema'::name])) OR (pg_stat_all_tables.schemaname ~ '^pg_toast'::text));
  pg_stat_user_functions   | SELECT p.oid AS funcid, n.nspname AS schemaname, p.proname AS funcname, pg_stat_get_function_calls(p.oid) AS calls, (pg_stat_get_function_time(p.oid) / 1000) AS total_time, (pg_stat_get_function_self_time(p.oid) / 1000) AS self_time FROM (pg_proc p LEFT JOIN pg_namespace n ON ((n.oid = p.pronamespace))) WHERE ((p.prolang <> (12)::oid) AND (pg_stat_get_function_calls(p.oid) IS NOT NULL));
  pg_stat_user_indexes     | SELECT pg_stat_all_indexes.relid, pg_stat_all_indexes.indexrelid, pg_stat_all_indexes.schemaname, pg_stat_all_indexes.relname, pg_stat_all_indexes.indexrelname, pg_stat_all_indexes.idx_scan, pg_stat_all_indexes.idx_tup_read, pg_stat_all_indexes.idx_tup_fetch FROM pg_stat_all_indexes WHERE ((pg_stat_all_indexes.schemaname <> ALL (ARRAY['pg_catalog'::name, 'information_schema'::name])) AND (pg_stat_all_indexes.schemaname !~ '^pg_toast'::text));
- pg_stat_user_tables      | SELECT pg_stat_all_tables.relid, pg_stat_all_tables.schemaname, pg_stat_all_tables.relname, pg_stat_all_tables.seq_scan, pg_stat_all_tables.seq_tup_read, pg_stat_all_tables.idx_scan, pg_stat_all_tables.idx_tup_fetch, pg_stat_all_tables.n_tup_ins, pg_stat_all_tables.n_tup_upd, pg_stat_all_tables.n_tup_del, pg_stat_all_tables.n_tup_hot_upd, pg_stat_all_tables.n_live_tup, pg_stat_all_tables.n_dead_tup, pg_stat_all_tables.last_vacuum, pg_stat_all_tables.last_autovacuum, pg_stat_all_tables.last_analyze, pg_stat_all_tables.last_autoanalyze FROM pg_stat_all_tables WHERE ((pg_stat_all_tables.schemaname <> ALL (ARRAY['pg_catalog'::name, 'information_schema'::name])) AND (pg_stat_all_tables.schemaname !~ '^pg_toast'::text));
+ pg_stat_user_tables      | SELECT pg_stat_all_tables.relid, pg_stat_all_tables.schemaname, pg_stat_all_tables.relname, pg_stat_all_tables.seq_scan, pg_stat_all_tables.seq_tup_read, pg_stat_all_tables.idx_scan, pg_stat_all_tables.idx_tup_fetch, pg_stat_all_tables.n_tup_ins, pg_stat_all_tables.n_tup_upd, pg_stat_all_tables.n_tup_del, pg_stat_all_tables.n_tup_hot_upd, pg_stat_all_tables.n_fresh_tup, pg_stat_all_tables.n_live_tup, pg_stat_all_tables.n_dead_tup, pg_stat_all_tables.last_vacuum, pg_stat_all_tables.last_autovacuum, pg_stat_all_tables.last_analyze, pg_stat_all_tables.last_autoanalyze FROM pg_stat_all_tables WHERE ((pg_stat_all_tables.schemaname <> ALL (ARRAY['pg_catalog'::name, 'information_schema'::name])) AND (pg_stat_all_tables.schemaname !~ '^pg_toast'::text));
  pg_statio_all_indexes    | SELECT c.oid AS relid, i.oid AS indexrelid, n.nspname AS schemaname, c.relname, i.relname AS indexrelname, (pg_stat_get_blocks_fetched(i.oid) - pg_stat_get_blocks_hit(i.oid)) AS idx_blks_read, pg_stat_get_blocks_hit(i.oid) AS idx_blks_hit FROM (((pg_class c JOIN pg_index x ON ((c.oid = x.indrelid))) JOIN pg_class i ON ((i.oid = x.indexrelid))) LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace))) WHERE (c.relkind = ANY (ARRAY['r'::"char", 't'::"char"]));
  pg_statio_all_sequences  | SELECT c.oid AS relid, n.nspname AS schemaname, c.relname, (pg_stat_get_blocks_fetched(c.oid) - pg_stat_get_blocks_hit(c.oid)) AS blks_read, pg_stat_get_blocks_hit(c.oid) AS blks_hit FROM (pg_class c LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace))) WHERE (c.relkind = 'S'::"char");
  pg_statio_all_tables     | SELECT c.oid AS relid, n.nspname AS schemaname, c.relname, (pg_stat_get_blocks_fetched(c.oid) - pg_stat_get_blocks_hit(c.oid)) AS heap_blks_read, pg_stat_get_blocks_hit(c.oid) AS heap_blks_hit, (sum((pg_stat_get_blocks_fetched(i.indexrelid) - pg_stat_get_blocks_hit(i.indexrelid))))::bigint AS idx_blks_read, (sum(pg_stat_get_blocks_hit(i.indexrelid)))::bigint AS idx_blks_hit, (pg_stat_get_blocks_fetched(t.oid) - pg_stat_get_blocks_hit(t.oid)) AS toast_blks_read, pg_stat_get_blocks_hit(t.oid) AS toast_blks_hit, (pg_stat_get_blocks_fetched(x.oid) - pg_stat_get_blocks_hit(x.oid)) AS tidx_blks_read, pg_stat_get_blocks_hit(x.oid) AS tidx_blks_hit FROM ((((pg_class c LEFT JOIN pg_index i ON ((c.oid = i.indrelid))) LEFT JOIN pg_class t ON ((c.reltoastrelid = t.oid))) LEFT JOIN pg_class x ON ((t.reltoastidxid = x.oid))) LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace))) WHERE (c.relkind = ANY (ARRAY['r'::"char", 't'::"char"])) GROUP BY c.oid, n.nspname, c.relname, t.oid, x.oid;
#57Teodor Sigaev
teodor@sigaev.ru
In reply to: Jeff Davis (#56)
1 attachment(s)
Re: [PATCHES] GIN improvements

New version. Changes:
- synced with current CVS
- added all your changes
- autovacuum will run if fast update mode is turned on and
the fresh-tuple trigger fires
- gincostestimate now tries to calculate the cost of scanning the pending pages.
gincostestimate sets disable_cost if it believes that the tidbitmap will
become lossy. So tidbitmap has a new method: estimation of the
maximum number of tuples for which non-lossy mode is guaranteed.

START_CRIT_SECTION();
...
l = PageAddItem(...);
if (l == InvalidOffsetNumber)
elog(ERROR, "failed to add item to index page in \"%s\"",
RelationGetRelationName(index));

It's no use using ERROR, because it will turn into PANIC, which is

I did it the same way as in other GIN/GiST places. BTW, BTree directly emits PANIC if
PageAddItem fails.
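
The escalation discussed above (an ERROR raised inside a critical section ends up as a PANIC) can be illustrated with a toy model. This is only a sketch and not PostgreSQL source: crit_section_count, start_crit_section, end_crit_section and effective_level are hypothetical stand-ins for the backend's critical-section counter and its error-level handling.

```c
/* Hypothetical severity levels for the sketch. */
typedef enum { LVL_ERROR, LVL_PANIC } Level;

/* Hypothetical stand-in for the critical-section nesting counter. */
static int crit_section_count = 0;

static void start_crit_section(void) { crit_section_count++; }
static void end_crit_section(void)   { crit_section_count--; }

/*
 * Inside a critical section an ERROR cannot be recovered from,
 * so the level actually reported is PANIC.
 */
static Level effective_level(Level requested)
{
    if (requested == LVL_ERROR && crit_section_count > 0)
        return LVL_PANIC;
    return requested;
}
```

Under this model, asking for ERROR or for PANIC inside the critical section gives the same outcome, which is why the choice between them is mostly a matter of making the intent visible in the code.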

4. Heikki mentioned:
http://archives.postgresql.org/pgsql-hackers/2008-11/msg01832.php

"To make things worse, a query will fail if all the matching
fast-inserted tuples don't fit in the non-lossy tid bitmap."

That issue still remains, correct? Is there a resolution to that?

Now gincostestimate can forbid an index scan via disable_cost (see Changes). Of
course, it doesn't prevent failure in the case of a large update (for example), but it
prevents it in most cases. BTW, because the pending list is scanned sequentially, the cost of
the scan grows quickly and the index scan becomes non-optimal.

5. I attached a newer version merged with HEAD.

Thank you

6. You defined:

#define GinPageHasFullRow(page) ( GinPageGetOpaque(page)->flags &
GIN_LIST_FULLROW )

Fixed

7. I don't understand this chunk of code:

How can (!ItemPointerEquals(&pos->item, &item)) ever happen?

And how can (scanGetCandidate(scan, pos) == false) ever happen? Should
that be an Assert() instead?

If those can happen during normal operation, then we need a better error
message there.

It should be an assert, but then the assert-enabled and assert-disabled code will differ :(.
In both cases scanGetCandidate() must be called, but in assert-enabled code
we also need to check the return value and pos->item.
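
One common way to keep both builds doing the same work is to make the call unconditionally and assert only on its result. The sketch below uses standard C assert() rather than the backend's Assert(), and ItemPtr, ScanPos, scan_get_candidate and advance_expecting are hypothetical simplifications, not the patch's actual definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified item pointer: (block, offset). */
typedef struct { unsigned block; unsigned offset; } ItemPtr;

/* Hypothetical scan position over a small array of candidates. */
typedef struct {
    const ItemPtr *items;
    int            nitems;
    int            next;   /* index of the next candidate to return */
    ItemPtr        item;   /* last candidate returned */
} ScanPos;

static bool item_equals(const ItemPtr *a, const ItemPtr *b)
{
    return a->block == b->block && a->offset == b->offset;
}

/* Advances the position; returns false when no candidates remain. */
static bool scan_get_candidate(ScanPos *pos)
{
    if (pos->next >= pos->nitems)
        return false;
    pos->item = pos->items[pos->next++];
    return true;
}

/*
 * The call always happens; only the check is compiled out when
 * assertions are disabled (NDEBUG).
 */
static void advance_expecting(ScanPos *pos, const ItemPtr *expected)
{
    bool found = scan_get_candidate(pos);

    assert(found && item_equals(&pos->item, expected));
    (void) found;      /* avoid unused-variable warnings without asserts */
    (void) expected;
}
```

The scan position advances identically in both builds, so the assert-enabled and assert-disabled code paths differ only in the check itself.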

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

Attachments:

fast_insert_gin-0.20.gz (application/x-tar)
#58Jeff Davis
pgsql@j-davis.com
In reply to: Teodor Sigaev (#57)
Re: [PATCHES] GIN improvements

On Fri, 2009-01-16 at 15:39 +0300, Teodor Sigaev wrote:

START_CRIT_SECTION();
...
l = PageAddItem(...);
if (l == InvalidOffsetNumber)
elog(ERROR, "failed to add item to index page in \"%s\"",
RelationGetRelationName(index));

It's no use using ERROR, because it will turn into PANIC, which is

I did that similar to other GIN/GiST places. BTW, BTree directly emits PANIC if
PageAddItem fails

I'd still prefer PANIC over an ERROR that will always turn into a PANIC.
I'll leave it as you did, though.

4. Heikki mentioned:
http://archives.postgresql.org/pgsql-hackers/2008-11/msg01832.php

"To make things worse, a query will fail if all the matching
fast-inserted tuples don't fit in the non-lossy tid bitmap."

That issue still remains, correct? Is there a resolution to that?

Now gincostestimate can forbid an index scan via disable_cost (see Changes). Of
course, it doesn't prevent failure in the case of a large update (for example), but it
prevents it in most cases. BTW, because the pending list is scanned sequentially, the cost of
the scan grows quickly and the index scan becomes non-optimal.

Is this a 100% bulletproof solution, or is it still possible for a query
to fail due to the pending list? It relies on the stats collector, so
perhaps in rare cases it could still fail?

It might be surprising though, that after an UPDATE and before a VACUUM,
the gin index just stops working (if work_mem is too low). For many
use-cases, if GIN is not used, it's just as bad as the query failing,
because it would be so slow.

Can you explain why the tbm must not be lossy?

Also, can you clarify why a large update can cause a problem? In the
previous discussion, you suggested that it force normal index inserts
after a threshold based on work_mem:

http://archives.postgresql.org/pgsql-hackers/2008-12/msg00065.php

Regards,
Jeff Davis

#59Teodor Sigaev
teodor@sigaev.ru
In reply to: Jeff Davis (#58)
1 attachment(s)
Re: [PATCHES] GIN improvements

Changes:
Results of the pending list's scan are now placed directly in the resulting
tidbitmap. This saves cycles spent on filtering results and reduces memory usage.
Also, it makes it unnecessary to check the lossiness of the tbm.

Is this a 100% bulletproof solution, or is it still possible for a query
to fail due to the pending list? It relies on the stats collector, so
perhaps in rare cases it could still fail?

Yes :(

Can you explain why the tbm must not be lossy?

The problem with a lossy tbm has two aspects:
- The amgettuple interface has no way to work with a page-wide result instead
of an exact ItemPointer. amgettuple cannot return just a block number as
amgetbitmap can.
- Because of a concurrent vacuum process: while we scan the pending list, its
content could be transferred into the regular structure of the index, and then we will
find the same tuple twice. Again, amgettuple has no protection from that;
only amgetbitmap has it. So we need to filter results from the regular GIN
by results from the pending list. And for filtering we can't use a lossy tbm.

v0.21 prevents that failure on a call of gingetbitmap, because now all results
are collected in a single resulting tidbitmap.
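
Why a single exact bitmap removes the need for separate filtering can be seen in a toy model: adding the same tuple twice (once from the pending list, once from the regular structure after a concurrent cleanup moved it) leaves the bitmap unchanged. The ToyTidBitmap type, its fixed sizes and the toy_tbm_* functions are hypothetical and much simpler than the real tidbitmap.

```c
#include <stdint.h>
#include <string.h>

#define TOY_MAX_BLOCKS  8    /* hypothetical toy limits */
#define TOY_MAX_OFFSETS 32

/* Toy exact (non-lossy) TID bitmap: one bit per (block, offset). */
typedef struct { uint32_t bits[TOY_MAX_BLOCKS]; } ToyTidBitmap;

static void toy_tbm_init(ToyTidBitmap *tbm)
{
    memset(tbm, 0, sizeof(*tbm));
}

/* Adding a TID that is already present changes nothing. */
static void toy_tbm_add(ToyTidBitmap *tbm, unsigned block, unsigned offset)
{
    if (block < TOY_MAX_BLOCKS && offset < TOY_MAX_OFFSETS)
        tbm->bits[block] |= (uint32_t) 1 << offset;
}

/* Number of distinct TIDs stored. */
static int toy_tbm_count(const ToyTidBitmap *tbm)
{
    int n = 0;

    for (int b = 0; b < TOY_MAX_BLOCKS; b++)
        for (unsigned o = 0; o < TOY_MAX_OFFSETS; o++)
            if (tbm->bits[b] & ((uint32_t) 1 << o))
                n++;
    return n;
}
```

A lossy bitmap would record only that some tuple on a block matched, which is not enough to tell whether a specific tuple was already returned. That is the reason exact filtering cannot rely on it.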

Also, can you clarify why a large update can cause a problem? In the

If a query looks like
UPDATE tbl SET col=... WHERE col ... and the planner chooses a GIN indexscan over col,
then there is a chance that the pending list grows beyond the non-lossy limit.

previous discussion, you suggested that it force normal index inserts
after a threshold based on work_mem:

http://archives.postgresql.org/pgsql-hackers/2008-12/msg00065.php

I see only two guaranteed solutions to the problem:
- after the limit is reached, force normal index inserts. One of the motivations for the
patch was a frequent question from users: why is an update of a whole table with a GIN index
so slow? So this way will not resolve that question.
- after the limit is reached, force cleanup of the pending list by calling
gininsertcleanup. Not very good, because users will sometimes see a huge
execution time for a simple insert. Although users who run a huge update should be
satisfied.

I have difficulty choosing between them. It seems to me that the second way is
better: if a user gets very long insertion times, then (auto)vacuum of his
installation should be tweaked.
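
The second option can be sketched as follows. This is only an illustration of the control flow, with counters instead of real pages: ToyGin, pending_limit, toy_insert_cleanup and toy_fast_insert are hypothetical names, and how the limit would be derived (for example from work_mem) is left open.

```c
#include <stddef.h>

/* Hypothetical toy index: counts of pending and fully indexed tuples. */
typedef struct {
    size_t pending;        /* tuples waiting in the pending list */
    size_t indexed;        /* tuples in the regular structure */
    size_t pending_limit;  /* hypothetical threshold */
    size_t cleanups;       /* number of forced cleanups so far */
} ToyGin;

/* Moves everything from the pending list into the regular structure. */
static void toy_insert_cleanup(ToyGin *idx)
{
    idx->indexed += idx->pending;
    idx->pending = 0;
    idx->cleanups++;
}

/* Fast insert; the inserter that reaches the limit pays for the cleanup. */
static void toy_fast_insert(ToyGin *idx)
{
    idx->pending++;
    if (idx->pending >= idx->pending_limit)
        toy_insert_cleanup(idx);
}
```

The sketch makes the trade-off visible: most inserts are cheap, but the one that crosses the limit carries the cost of the whole cleanup, which is the occasional long insert mentioned above.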

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

Attachments:

fast_insert_gin-0.21.gz (application/x-tar)
#60Alvaro Herrera
alvherre@commandprompt.com
In reply to: Teodor Sigaev (#57)
Re: [PATCHES] GIN improvements

Teodor Sigaev wrote:

New version. Changes:
- synced with current CVS

I notice you added a fillfactor reloption in ginoptions, but does it
really make sense? I recall removing it because the original code
contained a comment that says "this is here because default_reloptions
wants it, but it has no effect".

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#61Teodor Sigaev
teodor@sigaev.ru
In reply to: Alvaro Herrera (#60)
Re: [PATCHES] GIN improvements

I notice you added a fillfactor reloption in ginoptions, but does it
really make sense? I recall removing it because the original code
contained a comment that says "this is here because default_reloptions
wants it, but it has no effect".

I didn't change how the fillfactor value is recognized, although GIN doesn't use it
for now.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

#62Alvaro Herrera
alvherre@commandprompt.com
In reply to: Teodor Sigaev (#61)
Re: [PATCHES] GIN improvements

Teodor Sigaev wrote:

I notice you added a fillfactor reloption in ginoptions, but does it
really make sense? I recall removing it because the original code
contained a comment that says "this is here because default_reloptions
wants it, but it has no effect".

I didn't change how the fillfactor value is recognized, although GIN doesn't
use it for now.

I suggest you take StdRdOptions out of the GinOptions struct, and leave
fillfactor out of ginoptions. I don't think there's much point in
supporting options that don't actually do anything. If the user tries
to set fillfactor for a gin index, he will get an error. Which is a
good thing IMHO.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#63Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#62)
Re: [PATCHES] GIN improvements

Alvaro Herrera <alvherre@commandprompt.com> writes:

Teodor Sigaev wrote:

I didn't change how the fillfactor value is recognized, although GIN doesn't
use it for now.

I suggest you take StdRdOptions out of the GinOptions struct, and leave
fillfactor out of ginoptions. I don't think there's much point in
supporting options that don't actually do anything. If the user tries
to set fillfactor for a gin index, he will get an error. Which is a
good thing IMHO.

+1 ... appearing to accept an option that doesn't really do anything is
likely to confuse users. We didn't have much choice in the previous
incarnation of reloptions, but I think now we should throw errors when
we can.
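
The behaviour argued for here (reject an option the access method does not implement, rather than accept it silently) can be sketched as follows. This is only an illustration, not PostgreSQL's reloptions code: `option_is_supported`, `gin_like_options` and `example_option` are hypothetical names invented for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical list of options an access method really implements. */
static const char *const gin_like_options[] = { "example_option", NULL };

/*
 * Return 1 if `name` appears in the NULL-terminated `supported` list,
 * otherwise 0.  A caller would raise an error on 0 instead of quietly
 * storing an option that has no effect.
 */
int
option_is_supported(const char *const *supported, const char *name)
{
	size_t		i;

	for (i = 0; supported[i] != NULL; i++)
	{
		if (strcmp(supported[i], name) == 0)
			return 1;
	}
	return 0;
}
```

With a list like this, a request to set `fillfactor` would fail the check and could be reported to the user as an error.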

regards, tom lane

#64Teodor Sigaev
teodor@sigaev.ru
In reply to: Alvaro Herrera (#62)
1 attachment(s)
Re: [PATCHES] GIN improvements

I suggest you take StdRdOptions out of the GinOptions struct, and leave
fillfactor out of ginoptions. I don't think there's much point in
supporting options that don't actually do anything. If the user tries
to set fillfactor for a gin index, he will get an error. Which is a
good thing IMHO.

Oh, I see. Fixed.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

Attachments:

fast_insert_gin-0.22.gz (application/x-tar)

#65Jeff Davis
pgsql@j-davis.com
In reply to: Teodor Sigaev (#59)
Re: [PATCHES] GIN improvements

On Mon, 2009-01-19 at 19:53 +0300, Teodor Sigaev wrote:

I see only two guaranteed solutions to the problem:
- after the limit is reached, force normal index inserts. One of the motivations
for the patch was a frequent question from users: why is updating a whole table
with a GIN index so slow? This approach would not answer that question.
- after the limit is reached, force cleanup of the pending list by calling
gininsertcleanup. Not very good, because users will sometimes see a huge
execution time for a simple insert. Users who run a huge update should be
satisfied, though.

I have difficulty choosing between them. It seems to me the second way is
better: if a user sees very long insertion times, then the (auto)vacuum settings
of his installation should be tweaked.

I agree that the second solution sounds better to me.

With the new Visibility Map, it's more reasonable to run VACUUM more
often, so those that are inserting single tuples at a time should not
encounter the long insert time.
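
The second option (clean up the pending list once a limit is reached) can be pictured with a toy model. This is a sketch under stated assumptions, not the patch's code: `ToyIndex`, `toy_insert`, `toy_cleanup` and `PENDING_LIMIT` are hypothetical, and the "main structure" is reduced to a counter.

```c
#include <stddef.h>

#define PENDING_LIMIT 4			/* hypothetical limit, counted in entries */

typedef struct
{
	int			pending[PENDING_LIMIT];	/* entries not yet merged */
	size_t		npending;		/* number of entries in the pending list */
	size_t		nmerged;		/* entries moved into the main structure */
	size_t		ncleanups;		/* how many forced cleanups have run */
} ToyIndex;

/* Move every pending entry into the main structure (stand-in for cleanup). */
static void
toy_cleanup(ToyIndex *idx)
{
	idx->nmerged += idx->npending;
	idx->npending = 0;
	idx->ncleanups++;
}

/* Fast insert: append to the pending list, cleaning up first if it is full. */
void
toy_insert(ToyIndex *idx, int value)
{
	if (idx->npending == PENDING_LIMIT)
		toy_cleanup(idx);		/* this one insert pays for the whole batch */
	idx->pending[idx->npending++] = value;
}
```

The model shows the trade-off discussed above: most inserts are cheap appends, but the insert that finds the list full absorbs the cost of the whole cleanup, unless vacuum has already emptied the list.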

I'm still looking at the rest of the patch.

Regards,
Jeff Davis

#66Bryce Nesbitt
bryce2@obviously.com
In reply to: Jeff Davis (#65)
1 attachment(s)
New pg_dump patch, --no-stats flag, disables sending to statistics collector

This patch adds another flag to pg_dump, this time to disable statistics
collection. This is useful if you don't want pg_dump activity to show
(or clutter) your normal statistics. This may be appropriate for an
organization that regularly dumps a database for backup purposes, but
wants to analyze only the application's database use.

This is patched against CVS HEAD from today, though the code is quite
version independent. This patch is unsolicited, and as far as I know
has not been discussed on the list prior to now.

Comments?

Attachments:

pg_dump.patch (text/x-diff)
Index: pg_dump.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/bin/pg_dump/pg_dump.c,v
retrieving revision 1.514
diff -c -2 -r1.514 pg_dump.c
*** pg_dump.c	18 Jan 2009 20:44:45 -0000	1.514
--- pg_dump.c	20 Jan 2009 20:47:25 -0000
***************
*** 236,239 ****
--- 236,240 ----
  	static int  outputNoTablespaces = 0;
  	static int	use_setsessauth = 0;
+ 	static int  noStatsCollection = 0;
  
  	static struct option long_options[] = {
***************
*** 278,281 ****
--- 279,283 ----
  		{"role", required_argument, NULL, 3},
  		{"use-set-session-authorization", no_argument, &use_setsessauth, 1},
+ 		{"no-stats", no_argument, &noStatsCollection, 1},
  
  		{NULL, 0, NULL, 0}
***************
*** 430,433 ****
--- 432,437 ----
  				else if (strcmp(optarg, "no-tablespaces") == 0)
  					outputNoTablespaces = 1;
+ 				else if (strcmp(optarg, "no-stats") == 0)
+ 					noStatsCollection = 1;
  				else if (strcmp(optarg, "use-set-session-authorization") == 0)
  					use_setsessauth = 1;
***************
*** 613,616 ****
--- 617,629 ----
  		do_sql_command(g_conn, "SET statement_timeout = 0");
  
+	/* 
+	 * Disable collection of statistics.  pg_dump's activity may be very different
+	 * from what you are trying to analyze in the stats tables.
+	 */
+ 	if( noStatsCollection ) {
+ 		do_sql_command(g_conn, "SET stats_block_level = false");
+ 		do_sql_command(g_conn, "SET stats_row_level   = false");
+ 	}
+ 
  	/*
  	 * Start serializable transaction to dump consistent data.
***************
*** 833,836 ****
--- 846,850 ----
  	printf(_("  -U, --username=NAME      connect as specified database user\n"));
  	printf(_("  -W, --password           force password prompt (should happen automatically)\n"));
+ 	printf(_("  --no-stats               disable statistics collection (superuser only)\n"));
  
  	printf(_("\nIf no database name is supplied, then the PGDATABASE environment\n"
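
The option-table entry added by the patch relies on the flag form of `getopt_long()`, where the library stores a value into an `int` itself. A stand-alone sketch of that mechanism is below; `parse_flags` and `no_stats` are hypothetical names for this example and are not part of pg_dump.

```c
#include <getopt.h>
#include <stddef.h>

/* Set to 1 by getopt_long() when --no-stats is seen. */
static int	no_stats = 0;

/*
 * Scan argv for the long option --no-stats.
 * Returns 0 on success, -1 if an unrecognized option is found.
 */
int
parse_flags(int argc, char **argv)
{
	static struct option long_options[] = {
		{"no-stats", no_argument, &no_stats, 1},
		{NULL, 0, NULL, 0}
	};
	int			c;

	no_stats = 0;
	optind = 1;					/* restart scanning on every call */
	opterr = 0;					/* keep getopt_long() from printing messages */

	while ((c = getopt_long(argc, argv, "", long_options, NULL)) != -1)
	{
		if (c == '?')
			return -1;
		/* c == 0: a flag-style option already stored its value */
	}
	return 0;
}
```

Because the `flag` pointer is non-NULL, no `case` branch is needed for the option; the caller simply tests `no_stats` after parsing.
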
#67Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Bryce Nesbitt (#66)
Re: New pg_dump patch, --no-stats flag, disables sending to statistics collector

On Tue, Jan 20, 2009 at 4:04 PM, Bryce Nesbitt <bryce2@obviously.com> wrote:

This patch adds another flag to pg_dump, this time to disable statistics
collection. This is useful if you don't want pg_dump activity to show (or
clutter) your normal statistics. This may be appropriate for an
organization that regularly dumps a database for backup purposes, but wants
to analyze only the application's database use.

I haven't looked at the patch or its functional use... but off the
top of my head, one question: is there a reason not to make this
the default behaviour?

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

#68Bruce Momjian
bruce@momjian.us
In reply to: Jaime Casanova (#67)
Re: New pg_dump patch, --no-stats flag, disables sending to statistics collector

Jaime Casanova wrote:

On Tue, Jan 20, 2009 at 4:04 PM, Bryce Nesbitt <bryce2@obviously.com> wrote:

This patch adds another flag to pg_dump, this time to disable statistics
collection. This is useful if you don't want pg_dump activity to show (or
clutter) your normal statistics. This may be appropriate for an
organization that regularly dumps a database for backup purposes, but wants
to analyze only the application's database use.

I haven't looked at the patch or its functional use... but off the
top of my head, one question: is there a reason not to make this
the default behaviour?

If this is a generally desired feature (and I question that), I think we
need a more general solution.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#69Greg Smith
gsmith@gregsmith.com
In reply to: Bryce Nesbitt (#66)
Re: New pg_dump patch, --no-stats flag, disables sending to statistics collector

On Tue, 20 Jan 2009, Bryce Nesbitt wrote:

This patch adds another flag to pg_dump, this time to disable statistics
collection.

You can pass session parameters to anything that uses the standard libpq
library using PGOPTIONS. See
http://www.postgresql.org/docs/8.3/static/config-setting.html for a
sample. I suspect that something like:

PGOPTIONS='-c stats_block_level=false -c stats_row_level=false' pg_dump

would do the same thing as your patch without having to touch the code.
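
For a wrapper program that launches pg_dump, the same `PGOPTIONS` string can be assembled in code. The helper below is a sketch only; `build_pgoptions` is a hypothetical name, and the parameter names passed to it are whatever the target server accepts. The two names used in this thread appear to have been merged into `track_counts` in 8.3, so check the documentation for your server version.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Write "-c name1=value1 -c name2=value2 ..." into buf.
 * Returns 0 on success, -1 if buf is too small to hold the result.
 */
int
build_pgoptions(char *buf, size_t buflen,
				const char *const names[], const char *const values[],
				size_t n)
{
	size_t		used = 0;
	size_t		i;

	if (buflen == 0)
		return -1;
	buf[0] = '\0';

	for (i = 0; i < n; i++)
	{
		/* separate settings with a single space, none before the first */
		int			w = snprintf(buf + used, buflen - used, "%s-c %s=%s",
								 i == 0 ? "" : " ", names[i], values[i]);

		if (w < 0 || (size_t) w >= buflen - used)
			return -1;			/* output error or truncation */
		used += (size_t) w;
	}
	return 0;
}
```

The resulting string could then be exported with the POSIX call `setenv("PGOPTIONS", buf, 1)` before the child process is started, which has the same effect as the shell form shown above.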

That's a pretty obscure bit of information though, and it would be
worthwhile to update the documentation suggesting such a syntax because I
think this would be handy for a lot of people. I was already planning to
do that for another use case (pgbench) once the 8.4 work here shifts from
development to testing and I have some more time for writing.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

#70Bryce Nesbitt
bryce2@obviously.com
In reply to: Bruce Momjian (#68)
Re: New pg_dump patch, --no-stats flag, disables sending to statistics collector

Bruce Momjian wrote:

Jaime Casanova wrote:

I haven't looked at the patch or its functional use... but off the
top of my head, one question: is there a reason not to make this
the default behaviour?

If this is a generally desired feature (and I question that), I think we
need a more general solution.

I'm not a big fan of flags, preferring good defaults. But I was not
bold enough to suggest this as a new default, as someone would probably
want the opposite flag. If you're measuring total server load (rather
than analyzing an application), you may want to see pg_dump activity.

As for a "general" solution: one could add the ability to inject
arbitrary sql just prior to a dump run. That would let someone roll
their own by injecting "SET stats_block_level = false", or make any
other arbitrary settings changes.

Or one might slice the statistics collector by role or user (so your
'backup' role would keep a separate tally).

On the other hand, the flag's advantage is simplicity and directness.

#71Josh Berkus
josh@agliodbs.com
In reply to: Bruce Momjian (#68)
Re: New pg_dump patch, --no-stats flag, disables sending to statistics collector

Bruce,

If this is a generally desired feature (and I question that), I think we
need a more general solution.

I'd argue that it is generally desired (or some convenient workaround)
but not urgently so. I'd also argue that if we're going to have a
--no-stats flag, it should exist for the other client utilities as well;
if I don't want pg_dump showing up, I probably don't want Vacuum showing
up, or various other already-debugged maintenance routines.

I'd suggest putting this into the first patch review for 8.5.

--Josh

#72Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#71)
Re: New pg_dump patch, --no-stats flag, disables sending to statistics collector

Josh Berkus <josh@agliodbs.com> writes:

Bruce,

If this is a generally desired feature (and I question that), I think we
need a more general solution.

I'd argue that it is generally desired (or some convenient workaround)
but not urgently so.

One person asking for it does not make it "generally desired". I think
that the use-case is more than adequately served already by using
PGOPTIONS, or by running pg_dump under a user id that has the
appropriate GUC settings applied via ALTER USER.

regards, tom lane

#73Joshua D. Drake
jd@commandprompt.com
In reply to: Tom Lane (#72)
Re: New pg_dump patch, --no-stats flag, disables sending to statistics collector

On Tue, 2009-01-20 at 18:37 -0500, Tom Lane wrote:

Josh Berkus <josh@agliodbs.com> writes:

Bruce,

If this is a generally desired feature (and I question that), I think we
need a more general solution.

I'd argue that it is generally desired (or some convenient workaround)
but not urgently so.

One person asking for it does not make it "generally desired". I think
that the use-case is more than adequately served already by using
PGOPTIONS, or by running pg_dump under a user id that has the
appropriate GUC settings applied via ALTER USER.

+1

Sincerely,

Joshua D. Drake

regards, tom lane

--
PostgreSQL - XMPP: jdrake@jabber.postgresql.org
Consulting, Development, Support, Training
503-667-4564 - http://www.commandprompt.com/
The PostgreSQL Company, serving since 1997

#74Bryce Nesbitt
bryce2@obviously.com
In reply to: Josh Berkus (#71)
Re: New pg_dump patch, --no-stats flag, disables sending to statistics collector

Josh Berkus wrote:

I'd argue that it is generally desired (or some convenient workaround)
but not urgently so. I'd also argue that if we're going to have a
--no-stats flag, it should exist for the other client utilities as
well; if I don't want pg_dump showing up, I probably don't want Vacuum
showing up, or various other already-debugged maintenance routines.

I'd suggest putting this into the first patch review for 8.5.

--Josh

As pg_dumpall calls pg_dump, I think this is covered or at least
coverable. As for vacuum, I've never seen that activity in the stats. Can you
supply a more specific scenario where routine maintenance is harmfully
cluttering the stats table? A specific utility that needs attention?

For this feature I'm not so sure about "generally desired" -- I'll bet
most people don't even think about this. The question is among those
who DO think about it, what's the best behavior? Can it be argued that
excluding pg_dump is "generally desirable", for the average use case?

If there is not enough demand for a dedicated flag, I may submit a man
page patch documenting the Do-It-Yourself solution proposed by Greg
Smith, or the one proposed by Tom Lane.

G'day
-Bryce

PS: Note that no respondent on the psql user's lists thought excluding
pg_dump was even possible -- so that at least argues for desirability of
instructional material :-).

#75Teodor Sigaev
teodor@sigaev.ru
In reply to: Jeff Davis (#65)
1 attachment(s)
Re: [PATCHES] GIN improvements

- after the limit is reached, force cleanup of the pending list by calling
gininsertcleanup. Not very good, because users will sometimes see a huge
execution time for a simple insert. Although users who run a huge update
should be satisfied.

I have difficulty choosing between them. It seems to me the second way is
better: if a user sees very long insertion times, then the (auto)vacuum
settings of his installation should be tweaked.

I agree that the second solution sounds better to me.

Done. Now GIN counts the number of pending tuples and pages and stores them on
the metapage. Index cleanup can start during normal insertion in two cases:
- the number of pending tuples is too high to keep a guaranteed non-lossy tidbitmap
- the pending pages' content doesn't fit into work_mem.
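The two conditions above can be sketched as a simple check. This is only an illustration under stated assumptions, not the patch's code: `PendingListStats`, `needs_cleanup`, `nonlossy_tuple_limit` and the 8192-byte page size are hypothetical names and values chosen for the example, and the real patch may measure the pending content differently.

```python
from dataclasses import dataclass


@dataclass
class PendingListStats:
    """Counters assumed to be kept on the GIN metapage (hypothetical model)."""
    n_pending_tuples: int
    n_pending_pages: int


def needs_cleanup(stats, nonlossy_tuple_limit, work_mem_bytes, page_size=8192):
    """Return True if an insert should trigger cleanup of the pending list.

    Case 1: too many pending tuples to keep the tidbitmap non-lossy.
    Case 2: the pending pages no longer fit into work_mem (approximated
            here as page count times page size).
    """
    too_many_tuples = stats.n_pending_tuples > nonlossy_tuple_limit
    too_much_data = stats.n_pending_pages * page_size > work_mem_bytes
    return too_many_tuples or too_much_data


one_mb = 1024 * 1024
print(needs_cleanup(PendingListStats(10, 1), 100, one_mb))     # small list
print(needs_cleanup(PendingListStats(101, 1), 100, one_mb))    # tuple limit exceeded
print(needs_cleanup(PendingListStats(10, 200), 100, one_mb))   # pages exceed work_mem
```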

BTW, gincostestimate could use that information for cost estimation, but is
opening the index and reading the metapage in amcostestimate acceptable?

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

Attachments:

fast_insert_gin-0.23.gzapplication/x-tar; name=fast_insert_gin-0.23.gzDownload
#76Jeff Davis
pgsql@j-davis.com
In reply to: Teodor Sigaev (#75)
Re: [PATCHES] GIN improvements

On Wed, 2009-01-21 at 15:06 +0300, Teodor Sigaev wrote:

Done. Now GIN counts the number of pending tuples and pages and stores them on
the metapage. Index cleanup can start during normal insertion in two cases:
- the number of pending tuples is too high to keep a guaranteed non-lossy tidbitmap
- the pending pages' content doesn't fit into work_mem.

Great, thanks. I will take a look at this version tonight.

Because time is short, I will mark it as "Ready for committer review"
now. I think all of the major issues have been addressed, and I'll just
be looking at the code and testing it.

BTW, gincostestimate could use that information for cost estimation, but is
opening the index and reading the metapage in amcostestimate acceptable?

That sounds reasonable to me. I think that's what the index-specific
cost estimators are for. Do you expect a performance impact?

Regards,
Jeff Davis

#77Teodor Sigaev
teodor@sigaev.ru
In reply to: Jeff Davis (#76)
1 attachment(s)
Re: [PATCHES] GIN improvements

BTW, gincostestimate could use that information for cost estimation, but is
opening the index and reading the metapage in amcostestimate acceptable?

That sounds reasonable to me. I think that's what the index-specific
cost estimators are for.

Done.

Do you expect a performance impact?

I'm afraid of that and will test tomorrow. But the statistics from the index are exact.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

Attachments:

fast_insert_gin-0.24.gzapplication/x-tar; name=fast_insert_gin-0.24.gzDownload
#78Teodor Sigaev
teodor@sigaev.ru
In reply to: Teodor Sigaev (#77)
1 attachment(s)
Re: [PATCHES] GIN improvements

I'm very sorry, but v0.24 has a silly bug with an uninitialized value :(.
A new version is attached.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

Attachments:

fast_insert_gin-0.25.gzapplication/x-tar; name=fast_insert_gin-0.25.gzDownload
#79Tom Lane
tgl@sss.pgh.pa.us
In reply to: Teodor Sigaev (#78)
Re: [PATCHES] GIN improvements

Teodor Sigaev <teodor@sigaev.ru> writes:

I'm very sorry, but v0.24 has a silly bug with an uninitialized value :(.
A new version is attached.

I looked at this a little bit --- it needs proofreading ("VACUUME"?).

Do we really need an additional column in pgstat table entries in
order to store something that looks like it can be derived from the
other columns? The stats tables are way too big already.

Also, I really think it's a pretty bad idea to make index cost
estimation depend on the current state of the index's pending list
--- that state seems far too transient to base plan choices on.
It's particularly got to be nuts to turn off indexscans entirely
if the pending list is "too full".  Having some lossy pages might
not be great but I don't believe it can be so bad that you should
go to a seqscan all the time.

regards, tom lane

#80Teodor Sigaev
teodor@sigaev.ru
In reply to: Tom Lane (#79)
1 attachment(s)
Re: [PATCHES] GIN improvements

I looked at this a little bit --- it needs proofreading ("VACUUME"?).

Sorry, VACUUME fixed

Do we really need an additional column in pgstat table entries in
order to store something that looks like it can be derived from the
other columns? The stats tables are way too big already.

It can't be derived, because vacuum resets n_inserted_tuples to zero, but it
doesn't reset tuples_inserted, tuples_updated and tuples_hot_updated. So
n_inserted_tuples can be calculated from them only until the first vacuum occurs.
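A toy model of the point being made, as a sketch only. The class `TableStats` is hypothetical, and the formula in `derived()` (inserts plus non-HOT updates) is my assumption about what n_inserted_tuples counts; the message does not spell it out.

```python
class TableStats:
    """Toy model of the per-table counters named in the message above."""

    def __init__(self):
        self.tuples_inserted = 0
        self.tuples_updated = 0
        self.tuples_hot_updated = 0
        self.n_inserted_tuples = 0   # reset by vacuum, unlike the others

    def insert(self, n=1):
        self.tuples_inserted += n
        self.n_inserted_tuples += n

    def update(self, n=1, hot=False):
        self.tuples_updated += n
        if hot:
            # HOT updates create no new index entries (assumption of this model)
            self.tuples_hot_updated += n
        else:
            self.n_inserted_tuples += n

    def vacuum(self):
        self.n_inserted_tuples = 0

    def derived(self):
        """Attempt to derive n_inserted_tuples from the cumulative counters."""
        return self.tuples_inserted + self.tuples_updated - self.tuples_hot_updated


stats = TableStats()
stats.insert(5)
stats.update(2)
stats.update(1, hot=True)
print(stats.n_inserted_tuples, stats.derived())   # the two agree before any vacuum

stats.vacuum()
stats.insert(2)
print(stats.n_inserted_tuples, stats.derived())   # they diverge after a vacuum
```

In this model the derived value keeps growing across vacuums, so it stops matching the separately stored counter as soon as the first vacuum has run.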

Also, I really think it's a pretty bad idea to make index cost
estimation depend on the current state of the index's pending list
--- that state seems far too transient to base plan choices on.

I asked about that. v0.23 used statistics data by calling
pg_stat_get_fresh_inserted_tuples(), so I have reverted to that.
It's possible to add pending list information to IndexOptInfo, if that's acceptable.

It's particularly got to be nuts to turn off indexscans entirely
if the pending list is "too full". Having some lossy pages might
not be great but I don't believe it can be so bad that you should
go to a seqscan all the time.

It's impossible to return a "lossy page" via the amgettuple interface. With
amgetbitmap, though, it works well, and GIN will not emit an error even if the
bitmap becomes lossy.

In the attached version, gincostestimate will disable the index scan if the
estimated number of matching tuples in the pending list is greater than the
non-lossy limit of tidbitmap. That estimate is the product of indexSelectivity
and the number of tuples in the pending list.
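
To make the rule above concrete, here is a minimal sketch of the check, not the
patch's actual code. The function names and the idea of passing the non-lossy
limit in as a plain number are assumptions for illustration only; the real
limit comes from tidbitmap and is not computed here.

```c
#include <stdbool.h>

/*
 * Estimated number of matching tuples still sitting in the pending list:
 * the index selectivity multiplied by the number of pending tuples.
 * (Hypothetical helper; the name is not from the patch.)
 */
double
pending_list_matches(double index_selectivity, double pending_tuples)
{
    return index_selectivity * pending_tuples;
}

/*
 * Returns true when the estimate exceeds the number of entries the bitmap
 * can hold without becoming lossy. nonlossy_limit is an assumed input here.
 */
bool
should_disable_indexscan(double index_selectivity,
                         double pending_tuples,
                         double nonlossy_limit)
{
    return pending_list_matches(index_selectivity, pending_tuples)
           > nonlossy_limit;
}
```

With a selectivity of 0.5 and a limit of 1000 entries, the index scan stays
enabled at 1000 pending tuples and is disabled at 4000.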

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

Attachments:

fast_insert_gin-0.26.gz (application/x-tar)
#81Jeff Davis
pgsql@j-davis.com
In reply to: Tom Lane (#79)
Re: [PATCHES] GIN improvements

On Mon, 2009-02-02 at 20:38 -0500, Tom Lane wrote:

Also, I really think it's a pretty bad idea to make index cost
estimation depend on the current state of the index's pending list
--- that state seems far too transient to base plan choices on.

I'm confused by this. Don't we want to base the plan choice on the most
current data, even if it is transient?

Regards,
Jeff Davis

#82Robert Haas
robertmhaas@gmail.com
In reply to: Jeff Davis (#81)
Re: [PATCHES] GIN improvements

On Wed, Feb 4, 2009 at 1:39 PM, Jeff Davis <pgsql@j-davis.com> wrote:

On Mon, 2009-02-02 at 20:38 -0500, Tom Lane wrote:

Also, I really think it's a pretty bad idea to make index cost
estimation depend on the current state of the index's pending list
--- that state seems far too transient to base plan choices on.

I'm confused by this. Don't we want to base the plan choice on the most
current data, even if it is transient?

Regards,
Jeff Davis

Well, there's nothing to force that plan to be invalidated when the
state of the pending list changes, is there?

...Robert

#83Jeff Davis
pgsql@j-davis.com
In reply to: Robert Haas (#82)
Re: [PATCHES] GIN improvements

On Wed, 2009-02-04 at 14:40 -0500, Robert Haas wrote:

Well, there's nothing to force that plan to be invalidated when the
state of the pending list changes, is there?

Would it be unreasonable to invalidate cached plans during the pending
list cleanup?

Anyway, it just strikes me as strange to expect a plan to be a good plan
for very long. Can you think of an example where we applied this rule
before?

Regards,
Jeff Davis

#84Robert Haas
robertmhaas@gmail.com
In reply to: Jeff Davis (#83)
Re: [PATCHES] GIN improvements

On Wed, Feb 4, 2009 at 4:23 PM, Jeff Davis <pgsql@j-davis.com> wrote:

On Wed, 2009-02-04 at 14:40 -0500, Robert Haas wrote:

Well, there's nothing to force that plan to be invalidated when the
state of the pending list changes, is there?

Would it be unreasonable to invalidate cached plans during the pending
list cleanup?

Anyway, it just strikes me as strange to expect a plan to be a good plan
for very long. Can you think of an example where we applied this rule
before?

Well, I am not an expert on this topic.

But, plans for prepared statements and statements within PL/pgsql
functions are cached for the lifetime of the session, which in some
situations could be quite long.

I would think that invalidating significantly more often would be bad
for performance.

...Robert

#85Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jeff Davis (#83)
Re: [PATCHES] GIN improvements

Jeff Davis <pgsql@j-davis.com> writes:

On Wed, 2009-02-04 at 14:40 -0500, Robert Haas wrote:

Well, there's nothing to force that plan to be invalidated when the
state of the pending list changes, is there?

Would it be unreasonable to invalidate cached plans during the pending
list cleanup?

If the pending list cleanup is done by VACUUM then such an invalidation
already happens (VACUUM forces it after updating pg_class.reltuples/
relpages). What's bothering me is the lack of any reasonable mechanism
for invalidating plans in the other direction, ie when the list grows
past the threshold where this code wants to turn off indexscans. Since
the threshold depends on parameters that can vary across sessions, you'd
more or less have to send a global invalidation after every addition to
the list, in case that addition put it over the threshold in some other
session's view. That's unreasonably often, in my book.

Also, as mentioned earlier, I'm pretty down on the idea of a threshold
where indexscans suddenly turn off entirely; that's not my idea of how
the planner ought to work.

But the real bottom line is: if autovacuum is working properly, it
should clean up the index before the list ever gets to the point where
it'd be sane to turn off indexscans. So I don't see why we need to hack
the planner for this at all. If any hacking is needed, it should be
in the direction of making sure autovacuum puts sufficient priority
on this task.

regards, tom lane

#86Teodor Sigaev
teodor@sigaev.ru
In reply to: Tom Lane (#85)
Re: [PATCHES] GIN improvements

But the real bottom line is: if autovacuum is working properly, it
should clean up the index before the list ever gets to the point where
it'd be sane to turn off indexscans. So I don't see why we need to hack
the planner for this at all. If any hacking is needed, it should be
in the direction of making sure autovacuum puts sufficient priority
on this task.

Autovacuum will start if the table has a GIN index with fastupdate=on and the
number of tuples inserted since the last vacuum is greater than
autovacuum_vacuum_threshold.
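
As a rough sketch of that condition (the struct and function names are
hypothetical and only illustrate the predicate, they are not taken from the
patch):

```c
#include <stdbool.h>

/* Hypothetical per-table state; the field names are illustrative only. */
typedef struct TableVacuumState
{
    bool has_gin_fastupdate;    /* table has a GIN index with fastupdate=on */
    long inserted_since_vacuum; /* tuples inserted since the last vacuum */
} TableVacuumState;

/*
 * True when autovacuum should be started for the table so that the GIN
 * pending list gets cleaned up: a fastupdate GIN index exists and the
 * insert count is strictly greater than the threshold.
 */
bool
gin_needs_autovacuum(const TableVacuumState *t,
                     long autovacuum_vacuum_threshold)
{
    return t->has_gin_fastupdate &&
           t->inserted_since_vacuum > autovacuum_vacuum_threshold;
}
```
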
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/