Partial match fix for fast scan

Started by Alexander Korotkovalmost 12 years ago4 messages
#1Alexander Korotkov
aekorotkov@gmail.com
1 attachment(s)

Hi,

GIN partial match appears to be broken after fast scan. Following simple
test case raises assertion failure.

create extension btree_gin;
create table test as (select id, random() as val from
generate_series(1,1000000) id);
create index test_idx on test using gin (val);
vacuum test;
select * from test where val between 0.1 and 0.9;

Attached patch fixes bugs in entryGetItem function.
I would especially point that "continue;" checks "while" condition even if
it's postfix "while". That's why I surrounded tbm_iterate with another
"while".

------
With best regards,
Alexander Korotkov.

Attachments:

ginfastscanfix.patchapplication/octet-stream; name=ginfastscanfix.patchDownload
diff --git a/src/backend/access/gin/ginget.c b/src/backend/access/gin/ginget.c
new file mode 100644
index dfc3dc4..d698433
*** a/src/backend/access/gin/ginget.c
--- b/src/backend/access/gin/ginget.c
*************** entryGetItem(GinState *ginstate, GinScan
*** 732,739 ****
  
  		do
  		{
! 			if (entry->matchResult == NULL ||
! 				entry->offset >= entry->matchResult->ntuples)
  			{
  				entry->matchResult = tbm_iterate(entry->matchIterator);
  
--- 732,741 ----
  
  		do
  		{
! 			while (entry->matchResult == NULL ||
! 				(entry->matchResult->ntuples > 0 &&
! 				 entry->offset >= entry->matchResult->ntuples) ||
! 				entry->matchResult->blockno < advancePastBlk)
  			{
  				entry->matchResult = tbm_iterate(entry->matchIterator);
  
*************** entryGetItem(GinState *ginstate, GinScan
*** 752,758 ****
  				 */
  				if (entry->matchResult->blockno < advancePastBlk ||
  					(entry->matchResult->blockno == advancePastBlk &&
! 					 entry->matchResult->offsets[entry->offset] <= advancePastOff))
  				{
  					entry->offset = entry->matchResult->ntuples;
  					continue;
--- 754,761 ----
  				 */
  				if (entry->matchResult->blockno < advancePastBlk ||
  					(entry->matchResult->blockno == advancePastBlk &&
! 					 entry->matchResult->ntuples > 0 &&
! 					 entry->matchResult->offsets[entry->matchResult->ntuples - 1] <= advancePastOff))
  				{
  					entry->offset = entry->matchResult->ntuples;
  					continue;
*************** entryGetItem(GinState *ginstate, GinScan
*** 767,772 ****
--- 770,778 ----
  				entry->offset = 0;
  			}
  
+ 			if (entry->isFinished == TRUE)
+ 				break;
+ 
  			if (entry->matchResult->ntuples < 0)
  			{
  				/*
#2Fabrízio de Royes Mello
fabriziomello@gmail.com
In reply to: Alexander Korotkov (#1)
Re: Partial match fix for fast scan

On Thu, Apr 10, 2014 at 11:09 AM, Alexander Korotkov
<aekorotkov@gmail.com>wrote:

Hi,

GIN partial match appears to be broken after fast scan. Following simple
test case raises assertion failure.

create extension btree_gin;
create table test as (select id, random() as val from
generate_series(1,1000000) id);
create index test_idx on test using gin (val);
vacuum test;
select * from test where val between 0.1 and 0.9;

Attached patch fixes bugs in entryGetItem function.
I would especially point that "continue;" checks "while" condition even if
it's postfix "while". That's why I surrounded tbm_iterate with another
"while".

Interesting... to me (using current master) your test case doesn't fail...

fabrizio=# select * from test where val between 0.1 and 0.9;
id | val
--------+-------------------
1 | 0.554413774050772
2 | 0.767866868525743
3 | 0.601187175605446
...

But fail if I change the values of between clause:

fabrizio=# select * from test where val between 0.1 and 0.19;
ERROR: tuple offset out of range: 8080

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL

Show quoted text

Timbira: http://www.timbira.com.br
Blog sobre TI: http://fabriziomello.blogspot.com
Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello

#3Alexander Korotkov
aekorotkov@gmail.com
In reply to: Fabrízio de Royes Mello (#2)
Re: Partial match fix for fast scan

On Thu, Apr 10, 2014 at 8:22 PM, Fabrízio de Royes Mello <
fabriziomello@gmail.com> wrote:

On Thu, Apr 10, 2014 at 11:09 AM, Alexander Korotkov <aekorotkov@gmail.com

wrote:

Hi,

GIN partial match appears to be broken after fast scan. Following simple
test case raises assertion failure.

create extension btree_gin;
create table test as (select id, random() as val from
generate_series(1,1000000) id);
create index test_idx on test using gin (val);
vacuum test;
select * from test where val between 0.1 and 0.9;

Attached patch fixes bugs in entryGetItem function.
I would especially point that "continue;" checks "while" condition even
if it's postfix "while". That's why I surrounded tbm_iterate with another
"while".

Interesting... to me (using current master) your test case doesn't fail...

fabrizio=# select * from test where val between 0.1 and 0.9;
id | val
--------+-------------------
1 | 0.554413774050772
2 | 0.767866868525743
3 | 0.601187175605446
...

But fail if I change the values of between clause:

fabrizio=# select * from test where val between 0.1 and 0.19;
ERROR: tuple offset out of range: 8080

It must be compiled with --enable-cassert to fail on assertion.

------
With best regards,
Alexander Korotkov.

#4Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Alexander Korotkov (#3)
Re: Partial match fix for fast scan

On 04/10/2014 10:00 PM, Alexander Korotkov wrote:

On Thu, Apr 10, 2014 at 8:22 PM, Fabr�zio de Royes Mello <
fabriziomello@gmail.com> wrote:

On Thu, Apr 10, 2014 at 11:09 AM, Alexander Korotkov <aekorotkov@gmail.com

wrote:

GIN partial match appears to be broken after fast scan. Following simple
test case raises assertion failure.

create extension btree_gin;
create table test as (select id, random() as val from
generate_series(1,1000000) id);
create index test_idx on test using gin (val);
vacuum test;
select * from test where val between 0.1 and 0.9;

Attached patch fixes bugs in entryGetItem function.
I would especially point that "continue;" checks "while" condition even
if it's postfix "while". That's why I surrounded tbm_iterate with another
"while".

Interesting... to me (using current master) your test case doesn't fail...

fabrizio=# select * from test where val between 0.1 and 0.9;
id | val
--------+-------------------
1 | 0.554413774050772
2 | 0.767866868525743
3 | 0.601187175605446
...

But fail if I change the values of between clause:

fabrizio=# select * from test where val between 0.1 and 0.19;
ERROR: tuple offset out of range: 8080

It must be compiled with --enable-cassert to fail on assertion.

I'm actually getting the "tuple offset out of range" with Fabrizio's
query, even after your fix (not every time, run it a few times,
launching a new connection each time). So there's another bug lurking
there. The problem seems to be that even though we've checked in the
innermost loop that not all of the items on the page are <= advancePast,
that situation can change later if advancePast is advanced. So on next
invocation entryGetItem, the loop to skip past advancePast might read
bogus offsets in the array.

I pushed a patch, which includes Alexander's fix, and also fixes the
second issue.

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers