GiST splitting on empty pages

Started by Andrew Gierth over 11 years ago, 2 messages
#1Andrew Gierth
andrew@tao11.riddles.org.uk

This is from Bug #11555, which is still in moderation as I type this
(analysis was done via IRC).

The GiST insertion code appears to have no length checks at all on the
inserted entry. index_form_tuple checks for length <= 8191, with the
default blocksize, but obviously a tuple less than 8191 bytes may
still not fit on the page due to page header info etc.
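
To illustrate the gap between the two limits, here is a rough sketch of the arithmetic. The header, line-pointer, and GiST special-area sizes are assumptions for a default 8 kB build and ignore alignment padding; the function names are made up for this example and are not PostgreSQL APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All sizes below are assumptions for a default build; check your tree. */
#define BLCKSZ            8192  /* default block size */
#define INDEX_TUPLE_MAX   8191  /* limit checked by index_form_tuple (per the report) */
#define PAGE_HEADER_SIZE  24    /* assumed fixed page header size */
#define LINE_POINTER_SIZE 4     /* assumed size of one item pointer */
#define GIST_OPAQUE_SIZE  16    /* assumed GiST per-page special area */

/* Bytes left for a single tuple on an otherwise empty page (hypothetical helper). */
static size_t empty_page_capacity(void)
{
    size_t overhead = PAGE_HEADER_SIZE + LINE_POINTER_SIZE + GIST_OPAQUE_SIZE;

    assert(overhead < BLCKSZ);
    return BLCKSZ - overhead;
}

/* Mirrors the length check described above. */
static bool passes_form_tuple_check(size_t len)
{
    return len <= INDEX_TUPLE_MAX;
}

/* True only if the tuple would fit on a page holding nothing else. */
static bool fits_on_empty_page(size_t len)
{
    return len <= empty_page_capacity();
}
```

Under these assumptions, any tuple larger than the empty-page capacity but no larger than 8191 bytes passes the first check and fails the second, which is the window the bug falls into.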

However, the GiST insertion code assumes that if the tuple doesn't fit,
the page must be split, with no check for whether the page is already
empty.

So this crashes with infinite recursion in gistSplit (which also lacks
a check_stack_depth() call):

create extension pg_trgm;

create table t1 (a text);
create index on t1 using gist (a gist_trgm_ops);

insert into t1 values (
'vvsehbxxlezgwtyvbgdyburmhxmuzbwvwoaepxbbbaiyuguwnxvprmbzkotkqqfnegruds'
'wftedolykzvfonsqndehouxuibazwdrtjlynzjlkihqxvjnimrpbmnvupvtlzlejxdwwmh'
'hvpxtkggstyivlvqgkmawmlbvjerfrzmnokgyrnrllagwwxdgjddwofrxjidbiqowbvusi'
'mdumkrihuprxsnmyekhnojvexsftmybzcwlmntuijlfcyracciqqrmuoeairzkqbgcouvi'
'cfthhszhvbplshxmwcnetnokovmdpimrnfzuxzcsaseszcfvetaxgoivjuzzclqprqkopn'
'hqgmjgoocsicqpqylatkzvvqlmhwbwjjmpvvwkkyctatirstsldsgzismqonmxxzntvkdf'
'ifzizharbsdfkjetcrjqfwocvcvqmywuevddddvevyjgiozkfialfpnarjqdinymibqlem'
'qakzgtofeuoeftutulclpkynxgoostaitkizewfirunxnhqhttsiervbxkpqdqyxbhxfdc'
'nvwbskiirbckkgbfizqypuoorpvovzqiunjnxswpuyaefbkobbmrvbgmrbbmbsvwffjcxf'
'ssesxjtiyvjkmemsrdusqvklspqbsohkhlcevwmtrveeaqwrurjknuwfkngcbnnjzpnvma'
'odvsiwjfnewxpjslocyveajsjjhxeuxsxtlgqvldzhbortagvybazlsjuagyueqsycyoxj'
'swrtljnlpikrjjccswczuxelpdnorlyjhpdszqdozngjxilqfoqalumaxapplnzscclctp'
'rtdxdagorlchmocypayepjrpcusowldnfgkihrxzcagoojndjmizwzoyugmqsqeyxpgege'
'ejelytulxeyfdufsszzfqrvupskwxrbbafnyzhkwlicpchivhhaxywsopdlnumpusctrje'
'ovmqlpytlfamdziwnxzyltlaodciummihzzsoxmadmmgluczscxdwiekmgsgsfpaeostme'
'tprfwcazbtbzwyibiuhbbahqalftfryyhpeeseduxftudcvmwdoxdvodgtxllvktkoxdta'
'xrgqmjtiwqlknpfctmwqyhliawxyzrywienvogdkwovkeanxmjnkrztsvrqviprquflimp'
'tjeouiphfcrtnisgaoxrjfgbwahijuxddbsxkhfqjwjwfcdgrbxagdcdekmoekshmkfwsl'
'mbivynyctpdrqkutnzdaohkgpwqvsihfkpajczlwoonfziynibnwjxczttumcbrnrswtri'
'qgxelwmjjvlwruuutnoozqpregjbaajrhhvsdicndnkvhepbseprvfjzmsamtkearzsuiu'
'ilhsgpwwqoafgvkpuwhujbenbwnuqvoygwrnlnjccjhiesyyogtyhymiuzclvrkbobpapy'
'crhjalcykreepmdbvyaxkvpuxdwmdllfcspdesbpjguysyaowbmhwbcufyhiksonleqpws'
'ffyzerxefufrcctexydegnxvajzrywjebiegzckfxqxzsqdpohuvusrvbrmanwepeivelg'
'jiwhhoxlemszimraisrvterytwhpasvkarrhgptlclklyblhuccnhumbqtqrllcldutkkn'
'vmyfyxhkecmhqubcvsvmkgxmsbgllqyhdxmbuulzwygmtipoakakqywjadvltusxrfymzk'
'mwjsjcayqbirlzpiipmebfyucqabcampwvigxieoknfwnvfvlranxyiaoibringfjolgxq'
'uhdaeqwjmhamvxldxzlzqunxawmmdjcyrgzvxvfjcfwydhbbhmbxhovhlhtoqwnicmeahj'
'jkpgitojuvwvtekomvwfkncxvfkzfrjpcyvlskvmfrizwsoiokoyxqwsvhsazbpbalmsvh'
'fbznavgoeuystwjpoexhfwjvxkgkdcridrwdncsrxrkqntgbydjdzszwcfgghyolqlodnh'
'ukyfblyhnwkwajpzgsfnynlnybynfmuzxseyddfrapnaycafugsstdfsfefkqaknsplwsq'
'ntgbufdukybcrugxnmbsxrsielxqiqhwjnxdtbydzzgqunnhzoawgsflecbmtjjcxggqhe'
'tgeaxynkgmzgjgzordrtqkdznaftqnyktdkrcxcikbouiniarathkxgyxmsnzrytuikwfm'
'eqotkxgtxxtrfeomclyvzymxrggcdmpicebmbifyfzpldexgqvbptnnlutnxfdfihhuipa'
'hvaxgdbdkszliszvetpsrvvxddeymuytpyrvctzmlyytrxovreojzjhcnlazgzsvykrbdq'
'nopmhgjwcbaqlaasdneemkdfgcpxdtoqhddoknlmomdzmdrprvtegxkmzajctytacxpmka'
'zzncyzgqpxmjcsgmfgmojfndgpawckwbjjeijlzzjmilfpxkwkzfqmjxbjteuqfeaknjvm'
'iezrqegnodynjpasmbbffwvlavwnfraowzfmdmaspygyograisgopcaqxwednerkexwijw'
'azvhyjnpkwiqkxsloqhsuvwlfbjbjtykturrefhpcpfnnyybpftjaqvfsfhbygmraejekq'
'umfzztyxuocoydftixzqzxwlpxpyczowmuwlnuiiilxgocaxaaozxklnialkaagmucyixh'
'qgsnnhqnfqntpleaymbkxckdpfgnnduejnrwuikayytokyoilqtisdmhisvwwpafcscxan'
'xylrnvpcebsxjlbvtkoogkegqhzsfzgdyrulnknslgqusrqmbebhpfofnnysnewlvqxjal'
'cmrshkjxwkcxsrdhwquujhzftvwqexbgjtyqdioatqxliatfrnabvzhoueeybgflzecdmq'
'dghbsqclvuyvvtudiohm'
);

Suggested fixes (probably all of these are appropriate):

1. gistSplit should have check_stack_depth()

2. gistSplit should probably refuse to split anything if called with
only one item (which is the value being inserted).

3. somewhere before reaching gistSplit it might make sense to check
explicitly (e.g. in gistFormTuple) whether the tuple will actually
fit on a page.

4. pg_trgm probably should do something more sensible with large leaf
items, but this is a peripheral issue since ultimately the gist core
must enforce these limits rather than rely on the opclass.
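
Fixes 1 and 2 can be sketched with a toy model. This is not the gistSplit code: the page capacity, the depth limit standing in for check_stack_depth(), and every name below are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_CAPACITY 8148  /* assumed usable bytes on an empty page */
#define MAX_DEPTH     64    /* stand-in for check_stack_depth() */

typedef enum { SPLIT_OK, SPLIT_TOO_LARGE, SPLIT_TOO_DEEP } SplitResult;

/*
 * Toy recursive split: halve the item list until every group fits on one
 * page. The caller must set *pages_out to 0; on SPLIT_OK it holds the
 * number of pages used.
 */
static SplitResult toy_split(const size_t *sizes, size_t n, int depth,
                             int *pages_out)
{
    size_t total = 0;
    size_t half;
    SplitResult r;

    if (depth > MAX_DEPTH)
        return SPLIT_TOO_DEEP;          /* fix 1: bound the recursion */
    if (n == 0)
        return SPLIT_OK;                /* nothing to place */

    for (size_t i = 0; i < n; i++)
        total += sizes[i];
    if (total <= PAGE_CAPACITY)
    {
        (*pages_out)++;                 /* whole group fits on one page */
        return SPLIT_OK;
    }

    /*
     * Fix 2: a single oversized item cannot be split. Without this check,
     * n == 1 yields an empty left half and the same one-item right half,
     * so the function would call itself forever.
     */
    if (n == 1)
        return SPLIT_TOO_LARGE;

    half = n / 2;
    r = toy_split(sizes, half, depth + 1, pages_out);
    if (r != SPLIT_OK)
        return r;
    return toy_split(sizes + half, n - half, depth + 1, pages_out);
}
```

In this model, three 4000-byte items end up on two pages, while a lone 8180-byte item is rejected with an error result instead of recursing.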

--
Andrew (irc:RhodiumToad)

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Andrew Gierth (#1)
Re: GiST splitting on empty pages

On 10/03/2014 05:03 AM, Andrew Gierth wrote:

> This is from Bug #11555, which is still in moderation as I type this
> (analysis was done via IRC).
>
> The GiST insertion code appears to have no length checks at all on the
> inserted entry. index_form_tuple checks for length <= 8191, with the
> default blocksize, but obviously a tuple less than 8191 bytes may
> still not fit on the page due to page header info etc.
>
> However, the GiST insertion code assumes that if the tuple doesn't fit,
> the page must be split, with no check for whether the page is already
> empty.

[script to reproduce]

Thanks for the analysis!

> Suggested fixes (probably all of these are appropriate):
>
> 1. gistSplit should have check_stack_depth()
>
> 2. gistSplit should probably refuse to split anything if called with
> only one item (which is the value being inserted).
>
> 3. somewhere before reaching gistSplit it might make sense to check
> explicitly (e.g. in gistFormTuple) whether the tuple will actually
> fit on a page.
>
> 4. pg_trgm probably should do something more sensible with large leaf
> items, but this is a peripheral issue since ultimately the gist core
> must enforce these limits rather than rely on the opclass.

Fixed: I did 1 and 2. Number 3 would make a lot of sense, but I
couldn't fully convince myself that we can safely put the check in
gistFormTuple() without causing some cases to fail that currently work.
I think the code sometimes forms tuples that are never added to the
insert as such, and are used only to "union" them with existing tuples
on internal pages. Or maybe not, but in any case, the check in
gistSplit() is enough to stop the crash.

- Heikki
