pgsql: Buffering GiST index build algorithm.

Started by Heikki Linnakangasalmost 15 years ago7 messagescomitters
Jump to latest
#1Heikki Linnakangas
heikki.linnakangas@enterprisedb.com

Buffering GiST index build algorithm.

When building a GiST index that doesn't fit in cache, buffers are attached
to some internal nodes in the index. This speeds up the build by avoiding
random I/O that would otherwise be needed to traverse all the way down the
tree to the find right leaf page for tuple.

Alexander Korotkov

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/5edb24a8983e4a103e26153853d91141f818227c

Modified Files
--------------
doc/src/sgml/gist.sgml | 34 +
doc/src/sgml/ref/create_index.sgml | 20 +
src/backend/access/common/reloptions.c | 11 +
src/backend/access/gist/Makefile | 2 +-
src/backend/access/gist/README | 135 ++++
src/backend/access/gist/gist.c | 211 +-----
src/backend/access/gist/gistbuild.c | 1068 ++++++++++++++++++++++++++++
src/backend/access/gist/gistbuildbuffers.c | 787 ++++++++++++++++++++
src/backend/access/gist/gistutil.c | 27 +-
src/backend/access/gist/gistxlog.c | 6 +-
src/include/access/gist_private.h | 182 +++++-
11 files changed, 2297 insertions(+), 186 deletions(-)

#2Thom Brown
thom@linux.com
In reply to: Heikki Linnakangas (#1)
Re: pgsql: Buffering GiST index build algorithm.

On 8 September 2011 15:56, Heikki Linnakangas <heikki.linnakangas@iki.fi> wrote:

Buffering GiST index build algorithm.

When building a GiST index that doesn't fit in cache, buffers are attached
to some internal nodes in the index. This speeds up the build by avoiding
random I/O that would otherwise be needed to traverse all the way down the
tree to the find right leaf page for tuple.

Sorry Heikki...

s/additionaly/additionally/

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#3Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Thom Brown (#2)
Re: pgsql: Buffering GiST index build algorithm.

On 08.09.2011 18:04, Thom Brown wrote:

On 8 September 2011 15:56, Heikki Linnakangas<heikki.linnakangas@iki.fi> wrote:

Buffering GiST index build algorithm.

When building a GiST index that doesn't fit in cache, buffers are attached
to some internal nodes in the index. This speeds up the build by avoiding
random I/O that would otherwise be needed to traverse all the way down the
tree to the find right leaf page for tuple.

Sorry Heikki...

s/additionaly/additionally/

Sigh, fixed. I also reworded the sentence a bit. Thanks once again, Thom.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#4Andrew Dunstan
andrew@dunslane.net
In reply to: Heikki Linnakangas (#1)
Re: pgsql: Buffering GiST index build algorithm.

On 09/08/2011 10:56 AM, Heikki Linnakangas wrote:

Buffering GiST index build algorithm.

When building a GiST index that doesn't fit in cache, buffers are attached
to some internal nodes in the index. This speeds up the build by avoiding
random I/O that would otherwise be needed to traverse all the way down the
tree to the find right leaf page for tuple.

This seems to have broken MSVC builds:

"C:\prog\bf\root\HEAD\pgsql.5584\pgsql.sln" (default target) (1) ->
(postgres target) ->
.\src\backend\access\gist\gistbuild.c(423): warning C4013: 'round' undefined; assuming extern returning int

"C:\prog\bf\root\HEAD\pgsql.5584\pgsql.sln" (default target) (1) ->
(postgres target) ->
gistbuild.obj : error LNK2019: unresolved external symbol round referenced in function calculatePagesPerBuffer
.\Debug\postgres\postgres.exe : fatal error LNK1120: 1 unresolved externals

Maybe we need to include math.h. And while we're about it, should the
result of round() be cast to an int, since that's what the function returns?

cheers

andrew

#5Andrew Dunstan
andrew@dunslane.net
In reply to: Andrew Dunstan (#4)
Re: pgsql: Buffering GiST index build algorithm.

On 09/08/2011 04:13 PM, Andrew Dunstan wrote:

On 09/08/2011 10:56 AM, Heikki Linnakangas wrote:

Buffering GiST index build algorithm.

When building a GiST index that doesn't fit in cache, buffers are
attached
to some internal nodes in the index. This speeds up the build by
avoiding
random I/O that would otherwise be needed to traverse all the way
down the
tree to the find right leaf page for tuple.

This seems to have broken MSVC builds:

"C:\prog\bf\root\HEAD\pgsql.5584\pgsql.sln" (default target) (1) ->
(postgres target) ->
.\src\backend\access\gist\gistbuild.c(423): warning C4013:
'round' undefined; assuming extern returning int

"C:\prog\bf\root\HEAD\pgsql.5584\pgsql.sln" (default target) (1) ->
(postgres target) ->
gistbuild.obj : error LNK2019: unresolved external symbol round
referenced in function calculatePagesPerBuffer
.\Debug\postgres\postgres.exe : fatal error LNK1120: 1
unresolved externals

Maybe we need to include math.h. And while we're about it, should the
result of round() be cast to an int, since that's what the function
returns?

Or use rint(), which we certainly have.

cheers

andrew

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#5)
Re: pgsql: Buffering GiST index build algorithm.

Andrew Dunstan <andrew@dunslane.net> writes:

Maybe we need to include math.h. And while we're about it, should the
result of round() be cast to an int, since that's what the function
returns?

Or use rint(), which we certainly have.

Yeah, this broke the build on my old HPUX box too. I committed a quick
fix using rint(), but on reflection I wonder if it shouldn't be ceil().

regards, tom lane

#7Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#6)
Re: pgsql: Buffering GiST index build algorithm.

On 08.09.2011 23:44, Tom Lane wrote:

Andrew Dunstan<andrew@dunslane.net> writes:

Maybe we need to include math.h. And while we're about it, should the
result of round() be cast to an int, since that's what the function
returns?

Or use rint(), which we certainly have.

Yeah, this broke the build on my old HPUX box too. I committed a quick
fix using rint(),

Thanks!

but on reflection I wonder if it shouldn't be ceil().

Doesn't really matter. The exact value of that parameter isn't critical
to the algorithm, and it can't go down to 0.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com