Index scan optimization

Started by Rajeev rastogiover 11 years ago16 messages
#1Rajeev rastogi
rajeev.rastogi@huawei.com
1 attachment(s)

I have observed a scope of considerable performance improvement in-case of index by a very minor code change.
Consider the below schema:

create table tbl2(id1 int, id2 varchar(10), id3 int);
create index idx2 on tbl2(id2, id3);

Query as:
select count(*) from tbl2 where id2>'a' and id3>990000;

As per current design, it takes following steps to retrieve index tuples:

1. Find the scan start position by searching first position in BTree as per the first key condition i.e. as per id2>'a'

2. Then it fetches each tuples from position found in step-1.

3. For each tuple, it matches all scan key condition, in our example it matches both scan key condition.

4. If condition match, it returns the tuple otherwise scan stops.

Now problem is here that already first scan key condition is matched to find the scan start position (Step-1), so it is obvious that any further tuple also will match the first scan key condition (as records are sorted).
So comparison on first scan key condition again in step-3 seems to be redundant.

So my proposal is to skip the condition check on the first scan key condition for every tuple.

I have prototype code to see the result, Below are my some test data as per the query attached with this mail:
Time:
Original Code took: 3621 ms
Optimized Code took: 2576 ms
So overall performance improvement is around 29%.

Instruction:
Original Code took: 20057
Optimized Code took: 14557
So overall instruction improvement is around 27%.
Also I observed that only for function varstr_cmp, around 661M instruction was taken, which is completely saved for optimized code.

This optimization can be extended:

1. For a case where once all scan key matches the condition, no need to check condition for further tuples for any scan key. Just keep returning the tuple.

2. Above approach can be used for '<' operator also by finding the position of last matching tuples.

I would like to submit the patch for this improvement.
Please provide your feedback. Also let me know if I am missing something.

Thanks and Regards,
Kumar Rajeev Rastogi

Attachments:

test code.txttext/plain; name="test code.txt"Download
#2Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Rajeev rastogi (#1)
Re: Index scan optimization

On 09/22/2014 07:47 AM, Rajeev rastogi wrote:

I have observed a scope of considerable performance improvement in-case of index by a very minor code change.
Consider the below schema:

create table tbl2(id1 int, id2 varchar(10), id3 int);
create index idx2 on tbl2(id2, id3);

Query as:
select count(*) from tbl2 where id2>'a' and id3>990000;

As per current design, it takes following steps to retrieve index tuples:

1. Find the scan start position by searching first position in BTree as per the first key condition i.e. as per id2>'a'

2. Then it fetches each tuples from position found in step-1.

3. For each tuple, it matches all scan key condition, in our example it matches both scan key condition.

4. If condition match, it returns the tuple otherwise scan stops.

Now problem is here that already first scan key condition is matched to find the scan start position (Step-1), so it is obvious that any further tuple also will match the first scan key condition (as records are sorted).
So comparison on first scan key condition again in step-3 seems to be redundant.

So my proposal is to skip the condition check on the first scan key condition for every tuple.

The same happens in a single-column case. If you have a query like
"SELECT * FROM tbl2 where id2 > 'a'", once you've found the start
position of the scan, you know that all the rows that follow match too.

I would like to submit the patch for this improvement.
Please provide your feedback. Also let me know if I am missing something.

Yeah, sounds like a good idea. This scenario might not arise very often,
but it should be cheap to check, so I doubt it will add any measurable
overhead to the cases where the optimization doesn't help.

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Rajeev rastogi
rajeev.rastogi@huawei.com
In reply to: Heikki Linnakangas (#2)
Re: Index scan optimization

On 22 September 2014 12:35, Heikki Linnakangas:

I have observed a scope of considerable performance improvement in-

case of index by a very minor code change.

Consider the below schema:

create table tbl2(id1 int, id2 varchar(10), id3 int); create index
idx2 on tbl2(id2, id3);

Query as:
select count(*) from tbl2 where id2>'a' and
id3>990000;

As per current design, it takes following steps to retrieve index

tuples:

1. Find the scan start position by searching first position in

BTree as per the first key condition i.e. as per id2>'a'

2. Then it fetches each tuples from position found in step-1.

3. For each tuple, it matches all scan key condition, in our

example it matches both scan key condition.

4. If condition match, it returns the tuple otherwise scan

stops.

Now problem is here that already first scan key condition is matched

to find the scan start position (Step-1), so it is obvious that any
further tuple also will match the first scan key condition (as records
are sorted).

So comparison on first scan key condition again in step-3 seems to be

redundant.

So my proposal is to skip the condition check on the first scan key

condition for every tuple.

The same happens in a single-column case. If you have a query like
"SELECT * FROM tbl2 where id2 > 'a'", once you've found the start
position of the scan, you know that all the rows that follow match too.

Very much true.

I would like to submit the patch for this improvement.
Please provide your feedback. Also let me know if I am missing

something.

Yeah, sounds like a good idea. This scenario might not arise very often,
but it should be cheap to check, so I doubt it will add any measurable
overhead to the cases where the optimization doesn't help.

Thanks, I shall start to prepare a patch for this optimization and share in 1 or 2 days.

Thanks and Regards,
Kumar Rajeev Rastogi

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Stephen Frost
sfrost@snowman.net
In reply to: Rajeev rastogi (#3)
Re: Index scan optimization

* Rajeev rastogi (rajeev.rastogi@huawei.com) wrote:

Thanks, I shall start to prepare a patch for this optimization and share in 1 or 2 days.

This sounded interesting to me also- please be sure to put it on the
open commitfest once you have posted the patch.

Thanks!

Stephen

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#2)
Re: Index scan optimization

Heikki Linnakangas <hlinnakangas@vmware.com> writes:

On 09/22/2014 07:47 AM, Rajeev rastogi wrote:

So my proposal is to skip the condition check on the first scan key condition for every tuple.

The same happens in a single-column case. If you have a query like
"SELECT * FROM tbl2 where id2 > 'a'", once you've found the start
position of the scan, you know that all the rows that follow match too.

... unless you're doing a backwards scan.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Tom Lane (#5)
Re: Index scan optimization

On 09/22/2014 04:45 PM, Tom Lane wrote:

Heikki Linnakangas <hlinnakangas@vmware.com> writes:

On 09/22/2014 07:47 AM, Rajeev rastogi wrote:

So my proposal is to skip the condition check on the first scan key condition for every tuple.

The same happens in a single-column case. If you have a query like
"SELECT * FROM tbl2 where id2 > 'a'", once you've found the start
position of the scan, you know that all the rows that follow match too.

... unless you're doing a backwards scan.

Sure. And you have to still check for NULLs. Have to get the details right..

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Rajeev rastogi
rajeev.rastogi@huawei.com
In reply to: Heikki Linnakangas (#6)
1 attachment(s)
Re: Index scan optimization

On 22 September 2014 19:17, Heikki Linnakangas wrote:

On 09/22/2014 04:45 PM, Tom Lane wrote:

Heikki Linnakangas <hlinnakangas@vmware.com> writes:

On 09/22/2014 07:47 AM, Rajeev rastogi wrote:

So my proposal is to skip the condition check on the first scan key

condition for every tuple.

The same happens in a single-column case. If you have a query like
"SELECT * FROM tbl2 where id2 > 'a'", once you've found the start
position of the scan, you know that all the rows that follow match

too.

... unless you're doing a backwards scan.

Sure. And you have to still check for NULLs. Have to get the details
right..

I have finished implementation of the discussed optimization.
I got a performance improvement of around "30%" on the schema and data shared in earlier mail.

I also tested for the index scan case, where our optimization is not done and observed that there
is no effect on those query because of this change.

Change details:
I have added a new flag as SK_BT_MATCHED as part of sk_flags (ScanKey structure), the value used for this
0x00040000, which was unused.
Inside the function _bt_first, once we finish finding the start scan position based on the first key,
I am appending the flag SK_BT_MATCHED to the first key.
Then in the function _bt_checkkeys, during the key comparison, I am checking if the key has SK_BT_MATCHED flag set, if yes then
there is no need to further comparison. But if the tuple is having NULL value, then even if this flag is set, we will continue
with further comparison (this handles the Heikki point of checking NULLs).

I will add this patch to the next CommitFest.

Please let me know your feedback.

Thanks and Regards,
Kumar Rajeev Rastogi

Attachments:

index_scan_opt.patchapplication/octet-stream; name=index_scan_opt.patchDownload
*** a/src/backend/access/nbtree/nbtsearch.c
--- b/src/backend/access/nbtree/nbtsearch.c
***************
*** 997,1002 **** _bt_first(IndexScanDesc scan, ScanDirection dir)
--- 997,1017 ----
  		offnum = OffsetNumberPrev(offnum);
  
  	/*
+ 	 * By here means scan position has been decided based on the first key.
+ 	 * So check if further tuple also expected to match as per direction.
+ 	 * If so then we can set SK_BT_MATCHED flag for this key so that next check
+ 	 * of this key can avoided.
+ 	 */
+ 	if ((startKeys[0]->sk_strategy == BTGreaterEqualStrategyNumber
+ 					|| startKeys[0]->sk_strategy == BTGreaterStrategyNumber)
+ 					&& 	ScanDirectionIsForward(dir))
+ 		startKeys[0]->sk_flags |= SK_BT_MATCHED;
+ 	else if ((startKeys[0]->sk_strategy == BTLessEqualStrategyNumber
+ 					|| startKeys[0]->sk_strategy == BTLessStrategyNumber)
+ 					&& ScanDirectionIsBackward(dir))
+ 		startKeys[0]->sk_flags |= SK_BT_MATCHED;
+ 
+ 	/*
  	 * Now load data from the first page of the scan.
  	 */
  	if (!_bt_readpage(scan, dir, offnum))
*** a/src/backend/access/nbtree/nbtutils.c
--- b/src/backend/access/nbtree/nbtutils.c
***************
*** 1429,1434 **** _bt_checkkeys(IndexScanDesc scan,
--- 1429,1443 ----
  		bool		isNull;
  		Datum		test;
  
+ 		/*
+ 		 * If tuple does not have NULL values and this key has already matched,
+ 		 * then the current tuple is expected to match. So safely deformation of
+ 		 * tuple and comparison of keys can be skiped.
+ 		 */
+ 		if (key->sk_flags & SK_BT_MATCHED && !IndexTupleHasNulls(tuple))
+ 			continue;
+ 
+ 
  		/* row-comparison keys need special processing */
  		if (key->sk_flags & SK_ROW_HEADER)
  		{
*** a/src/include/access/nbtree.h
--- b/src/include/access/nbtree.h
***************
*** 617,622 **** typedef BTScanOpaqueData *BTScanOpaque;
--- 617,623 ----
   */
  #define SK_BT_REQFWD	0x00010000		/* required to continue forward scan */
  #define SK_BT_REQBKWD	0x00020000		/* required to continue backward scan */
+ #define SK_BT_MATCHED	0x00040000		/* required to skip further key match */
  #define SK_BT_INDOPTION_SHIFT  24		/* must clear the above bits */
  #define SK_BT_DESC			(INDOPTION_DESC << SK_BT_INDOPTION_SHIFT)
  #define SK_BT_NULLS_FIRST	(INDOPTION_NULLS_FIRST << SK_BT_INDOPTION_SHIFT)
#8Rajeev rastogi
rajeev.rastogi@huawei.com
In reply to: Stephen Frost (#4)
Re: Index scan optimization

On 22 September 2014 18:51, Stephen Frost Wrote:

* Rajeev rastogi (rajeev.rastogi@huawei.com) wrote:

Thanks, I shall start to prepare a patch for this optimization and share in 1 or 2 days.

This sounded interesting to me also- please be sure to put it on the open commitfest once you have posted the patch.

Thanks, I have added it to next CommitFest i.e. October CommitFest.

Thanks and Regards,
Kumar Rajeev Rastogi

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Simon Riggs
simon@2ndquadrant.com
In reply to: Heikki Linnakangas (#6)
Re: Index scan optimization

On 22 September 2014 14:46, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:

On 09/22/2014 04:45 PM, Tom Lane wrote:

Heikki Linnakangas <hlinnakangas@vmware.com> writes:

On 09/22/2014 07:47 AM, Rajeev rastogi wrote:

So my proposal is to skip the condition check on the first scan key
condition for every tuple.

The same happens in a single-column case. If you have a query like
"SELECT * FROM tbl2 where id2 > 'a'", once you've found the start
position of the scan, you know that all the rows that follow match too.

... unless you're doing a backwards scan.

Sure. And you have to still check for NULLs. Have to get the details right..

And also that the tests don't use volatile functions.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Haribabu Kommi
kommi.haribabu@gmail.com
In reply to: Rajeev rastogi (#7)
Re: Index scan optimization

On Tue, Sep 23, 2014 at 10:38 PM, Rajeev rastogi
<rajeev.rastogi@huawei.com> wrote:

On 22 September 2014 19:17, Heikki Linnakangas wrote:

On 09/22/2014 04:45 PM, Tom Lane wrote:

Heikki Linnakangas <hlinnakangas@vmware.com> writes:

On 09/22/2014 07:47 AM, Rajeev rastogi wrote:

So my proposal is to skip the condition check on the first scan key

condition for every tuple.

The same happens in a single-column case. If you have a query like
"SELECT * FROM tbl2 where id2 > 'a'", once you've found the start
position of the scan, you know that all the rows that follow match

too.

... unless you're doing a backwards scan.

Sure. And you have to still check for NULLs. Have to get the details
right..

I have finished implementation of the discussed optimization.
I got a performance improvement of around "30%" on the schema and data shared in earlier mail.

I also tested for the index scan case, where our optimization is not done and observed that there
is no effect on those query because of this change.

Change details:
I have added a new flag as SK_BT_MATCHED as part of sk_flags (ScanKey structure), the value used for this
0x00040000, which was unused.
Inside the function _bt_first, once we finish finding the start scan position based on the first key,
I am appending the flag SK_BT_MATCHED to the first key.
Then in the function _bt_checkkeys, during the key comparison, I am checking if the key has SK_BT_MATCHED flag set, if yes then
there is no need to further comparison. But if the tuple is having NULL value, then even if this flag is set, we will continue
with further comparison (this handles the Heikki point of checking NULLs).

Hi,

I reviewed index scan optimization patch, the following are the observations.

- Patch applies cleanly.
- Compiles without warnings
- All regress tests are passed.

There is a good performance gain with the patch in almost all scenarios.

I have a question regarding setting of key flags matched. Only the
first key was set as matched
even if we have multiple index conditions. Is there any reason behind that?

If any volatile function is present in the index condition, the index
scan itself is not choosen,
Is there any need of handling the same similar to NULLS?

Thanks for the patch.

Regards,
Hari Babu
Fujitsu Australia

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Rajeev rastogi
rajeev.rastogi@huawei.com
In reply to: Haribabu Kommi (#10)
Re: Index scan optimization

On 26 October 2014 10:42, Haribabu Kommi wrote:

Hi,

I reviewed index scan optimization patch, the following are the
observations.

- Patch applies cleanly.
- Compiles without warnings
- All regress tests are passed.

There is a good performance gain with the patch in almost all scenarios.

Thanks for reviewing.

I have a question regarding setting of key flags matched. Only the
first key was set as matched even if we have multiple index conditions.
Is there any reason behind that?

Actually the keysCount can be more than one only if the key strategy is BTEqualStrategyNumber (i.e. = condition)
But our optimization is only for the '>' or '>=' condition only. So adding code to set flag for all keys is of no use.

Let me know if I am missing anything here?

If any volatile function is present in the index condition, the index
scan itself is not choosen, Is there any need of handling the same
similar to NULLS?

Do you mean to say that whether we should add any special handling for volatile function?
If yes then as you said since index scan itself is not selected for such functions, then
I feel We don’t need to add anything for that.
Any opinion?

Thanks and Regards,
Kumar Rajeev Rastogi

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Haribabu Kommi
kommi.haribabu@gmail.com
In reply to: Rajeev rastogi (#11)
Re: Index scan optimization

On Mon, Oct 27, 2014 at 10:38 PM, Rajeev rastogi
<rajeev.rastogi@huawei.com> wrote:

On 26 October 2014 10:42, Haribabu Kommi wrote:

I have a question regarding setting of key flags matched. Only the
first key was set as matched even if we have multiple index conditions.
Is there any reason behind that?

Actually the keysCount can be more than one only if the key strategy is BTEqualStrategyNumber (i.e. = condition)
But our optimization is only for the '>' or '>=' condition only. So adding code to set flag for all keys is of no use.

Let me know if I am missing anything here?

Thanks. I understood the point.

If any volatile function is present in the index condition, the index
scan itself is not choosen, Is there any need of handling the same
similar to NULLS?

Do you mean to say that whether we should add any special handling for volatile function?
If yes then as you said since index scan itself is not selected for such functions, then
I feel We don’t need to add anything for that.

I also have the same opinion. I marked the patch as ready for committer.

Regards,
Hari Babu
Fujitsu Australia

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13Simon Riggs
simon@2ndQuadrant.com
In reply to: Haribabu Kommi (#12)
Re: Index scan optimization

On 30 October 2014 08:43, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:

I marked the patch as ready for committer.

This looks very interesting.

The value of the patch seems to come from skipping costly checks. Your
performance test has a leading VARCHAR column, so in that specific
case skipping the leading column is a big win. I don't see it would be
in all cases.

Can we see a few tests on similar things to check for other
opportunities and regressions.
* Single column bigint index
* Multi-column bigint index
* 5 column index with mixed text and integers

The explanatory comments need some work to more clearly explain what
this patch does.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14Rajeev rastogi
rajeev.rastogi@huawei.com
In reply to: Simon Riggs (#13)
1 attachment(s)
Re: Index scan optimization

On 16 November 2014 19:30, Simon Riggs Wrote:

Sent: 16 November 2014 19:30

I marked the patch as ready for committer.

This looks very interesting.

The value of the patch seems to come from skipping costly checks. Your
performance test has a leading VARCHAR column, so in that specific case
skipping the leading column is a big win. I don't see it would be in
all cases.

Yes you are right. Best improvement will be for a case where leading column comparison is very costly (e.g. VARCHAR).

Can we see a few tests on similar things to check for other
opportunities and regressions.

* Single column bigint index

It gives around 5% improvement.

* Multi-column bigint index

It gives around 5% improvement.

* 5 column index with mixed text and integers

It gives around 15% improvement.

As we can see in-case of bigint, improvement is not as high as for VARCHAR because comparison of bigint data-type is not costly.
So this being one of the worst scenario performance test and I think even in-case of worst case 5% improvement with so less/safe code changes
should be OK specially since for other scenario (like varchar index) improvement is high (15-30%).

Also even for bigint (or any other similar data-type) improvement can increase if number of records going to be selected increases.

Test-case used for testing is attached.

Please provide your opinion.

The explanatory comments need some work to more clearly explain what
this patch does.

Please help me to understand this point, you want me to add more comments about patch in this mail chain or in code.

Thanks and Regards,
Kumar Rajeev Rastogi

Attachments:

new_test_cases.txttext/plain; name=new_test_cases.txtDownload
#15Simon Riggs
simon@2ndQuadrant.com
In reply to: Rajeev rastogi (#14)
Re: Index scan optimization

On 17 November 2014 05:49, Rajeev rastogi <rajeev.rastogi@huawei.com> wrote:

* Single column bigint index

It gives around 5% improvement.

I confirm 5% improvement in both short and long index scans, on the
least beneficial datatype. Looks likely to be a very positive win
overall.

TEST 1: 100000x
select id2 into x from tbl2 where id2>990000 limit 1;

Patched
Time: 617.026 ms
Time: 604.769 ms
Time: 613.392 ms
Time: 600.484 ms
Time: 607.895 ms
Time: 609.366 ms
Time: 606.187 ms
Time: 602.643 ms
Time: 613.115 ms
Time: 605.705 ms

Unpatched
Time: 642.786 ms
Time: 649.880 ms
Time: 637.400 ms
Time: 635.937 ms
Time: 641.210 ms
Time: 641.988 ms
Time: 637.839 ms
Time: 640.265 ms
Time: 640.237 ms
Time: 637.669 ms

TEST 2 10000x

select count(*) into x from tbl2 where id2>990000;

Patched:
Time: 17277.083 ms
Time: 17373.752 ms
Time: 17208.220 ms

Unpatched:
Time: 18186.062 ms
Time: 18067.664 ms
Time: 18118.807 ms

The explanatory comments need some work to more clearly explain what
this patch does.

Please help me to understand this point, you want me to add more comments about patch in this mail chain or in code.

The language used in the comments is not clear enough. I'll do my best
to improve that, then look to commit this in about 5 hours.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16Rajeev rastogi
rajeev.rastogi@huawei.com
In reply to: Simon Riggs (#15)
Re: Index scan optimization

On 17 November 2014 16:05, Simon Riggs Wrote:

I confirm 5% improvement in both short and long index scans, on the least beneficial datatype. Looks likely to be a very positive win overall.

Thanks a lot for testing and confirmation.

The language used in the comments is not clear enough. I'll do my best to improve that, then look to commit this in about 5 hours.

Thanks a lot for support. Please let me know if I also need to add something.

Thanks and Regards,
Kumar Rajeev Rastogi

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers