Proposal: scan key push down to heap [WIP]

Started by Dilip Kumar · over 9 years ago · 36 messages · pgsql-hackers
#1Dilip Kumar
dilipbalaut@gmail.com

Hi Hackers,

I would like to propose a patch for pushing down the scan key to heap.

Currently, scan keys are pushed down to the heap only for system table
scans. I have implemented a POC patch that does the same for normal table scans.

This patch extracts expressions from the qual and prepares scan
keys from them. The POC version supports only quals of the form
"var OP const", because such quals can be pushed down using the
existing framework.

The purpose of this work is to implement the basic functionality first and
analyze the results. If the results are good, we can extend it to
other types of expressions.

However, when we later expand support to more complex
expressions, we will need to be very careful in selecting which
expressions are pushable. We must not push down something very
expensive, which could cause contention with concurrent write operations (as
HeapKeyTest is done under the page lock).

Performance Test: (tests run on a local machine, with all default settings).

Setup:
----------

create table test(a int, b varchar, c varchar, d int);
insert into test values(generate_series(1,10000000), repeat('x', 30),
repeat('y', 30), generate_series(1,10000000));
analyze test;

Test query:
--------------
select count(*) from test where a < $1;

Results: (execution time in ms)
------------
Selectivity   Head (ms)   Patch (ms)   Gain
0.01          612         307          49%
0.1           623         353          43%
0.2           645         398          38%
0.5           780         535          31%
0.8           852         590          30%
1             913         730          20%

Instructions: (CPU instructions measured with the callgrind tool):

Query: select count(*) from test where a < 100000;

Head:  10,815,730,925
Patch:  4,780,047,331

Summary:
--------------
1. ~50% reduction in both instructions and execution time.
2. We see a ~20% execution-time reduction even at selectivity 1
(when all tuples are selected). The likely reason is that
HeapKeyTest is much simpler than ExecQual. It's possible
that when we later support a wider variety of keys, the
gain at high selectivity will come down.

WIP patch attached.

Thoughts?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachments:

heap_scankey_pushdown_v1.patch (application/octet-stream, +151/-9)
heap_scankey_pushdown.png (image/png)
#2Robert Haas
robertmhaas@gmail.com
In reply to: Dilip Kumar (#1)
Re: Proposal: scan key push down to heap [WIP]

On Tue, Oct 11, 2016 at 4:57 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:

This patch will extract the expression from qual and prepare the scan
keys. Currently in POC version I have only supported "var OP const"
type of qual, because these type of quals can be pushed down using
existing framework.

Purpose of this work is to first implement the basic functionality and
analyze the results. If results are good then we can extend it for
other type of expressions.

However in future when we try to expand the support for complex
expressions, then we need to be very careful while selecting
pushable expression. It should not happen that we push something very
complex, which may cause contention with other write operation (as
HeapKeyTest is done under page lock).

I seriously doubt that this should EVER be supported for anything
other than "var op const", and even then only for very simple
operators. For example, texteq() is probably too complicated, because
it might de-TOAST, and that might involve doing a LOT of work while
holding the buffer content lock. But int4eq() would be OK.

Part of the trick if we want to make this work is going to be figuring
out how we'll identify which operators are safe.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Dilip Kumar
dilipbalaut@gmail.com
In reply to: Robert Haas (#2)
Re: Proposal: scan key push down to heap [WIP]

On Thu, Oct 13, 2016 at 6:05 AM, Robert Haas <robertmhaas@gmail.com> wrote:

I seriously doubt that this should EVER be supported for anything
other than "var op const", and even then only for very simple
operators.

Yes, with the existing key pushdown infrastructure only "var op const"
is possible, but if we extend it, I think we can cover many other simple
expressions, e.g.

Unary operators: ISNULL, NOTNULL
Other simple expressions: "Var op Const", "Var op Var", "Var op
SimpleExpr(Var, Const)" ..

We cannot push down arbitrary function expressions because we cannot
guarantee that they are safe, but can we push down built-in functions? (I
think we can identify their safety based on their data types, but I am not
too sure about this point.)

For example, texteq() is probably too complicated, because

it might de-TOAST, and that might involve doing a LOT of work while
holding the buffer content lock. But int4eq() would be OK.

Part of the trick if we want to make this work is going to be figuring
out how we'll identify which operators are safe.

Yes, I agree that's the difficult part. Can't we process the full qual
list and decide whether the operators are safe based on their
data types?

What I mean is that instead of checking the safety of each operator, like
texteq(), text_le()...,
we can directly discard any operator involving such data types.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dilip Kumar (#3)
Re: Proposal: scan key push down to heap [WIP]

Dilip Kumar <dilipbalaut@gmail.com> writes:

On Thu, Oct 13, 2016 at 6:05 AM, Robert Haas <robertmhaas@gmail.com> wrote:

I seriously doubt that this should EVER be supported for anything
other than "var op const", and even then only for very simple
operators.

Yes, with existing key push down infrastructure only "var op const",
but If we extend that I think we can cover many other simple
expressions, i.e

I think it is a mistake to let this idea drive any additional
complication of the ScanKey data structure. That will have negative
impacts on indexscan performance, not to mention require touching
quite a lot of unrelated code. And as Robert points out, we do not
want to execute anything expensive while holding the buffer LWLock.

Part of the trick if we want to make this work is going to be figuring
out how we'll identify which operators are safe.

Yes, I agree that's the difficult part. Can't we process full qual
list and decide decide the operator are safe or not based on their
datatype ?

Possibly restricting it to builtin, immutable functions on non-toastable
data types would do. Or for more safety, only allow pass-by-value data
types. But I have a feeling that there might still be counterexamples.

regards, tom lane


#5Andres Freund
andres@anarazel.de
In reply to: Dilip Kumar (#1)
Re: Proposal: scan key push down to heap [WIP]

Hi,

On 2016-10-11 17:27:56 +0530, Dilip Kumar wrote:

I would like to propose a patch for pushing down the scan key to heap.

I think that makes sense. Both because scankey eval is faster than
generic expression eval, and because it saves a lot of function calls in
heavily filtered cases.

However in future when we try to expand the support for complex
expressions, then we need to be very careful while selecting
pushable expression. It should not happen that we push something very
complex, which may cause contention with other write operation (as
HeapKeyTest is done under page lock).

I don't think it's a good idea to do this under the content lock in any
case. But luckily I don't think we have to do so at all.

Due to pagemode - which is used by pretty much everything iterating over
heaps, and definitely seqscans - the key test already happens without
the content lock held, in heapgettup_pagemode():
/* ----------------
 * heapgettup_pagemode - fetch next heap tuple in page-at-a-time mode
 *
 * Same API as heapgettup, but used in page-at-a-time mode
 *
 * The internal logic is much the same as heapgettup's too, but there are some
 * differences: *****we do not take the buffer content lock**** (that only needs to
 * happen inside heapgetpage), and we iterate through just the tuples listed
 * in rs_vistuples[] rather than all tuples on the page. Notice that
 * lineindex is 0-based, where the corresponding loop variable lineoff in
 * heapgettup is 1-based.
 * ----------------
 */
static void
heapgettup_pagemode(HeapScanDesc scan,
                    ScanDirection dir,
                    int nkeys,
                    ScanKey key)
...
    /*
     * advance the scan until we find a qualifying tuple or run out of stuff
     * to scan
     */
    for (;;)
    {
        while (linesleft > 0)
        {
            lineoff = scan->rs_vistuples[lineindex];
            lpp = PageGetItemId(dp, lineoff);
            Assert(ItemIdIsNormal(lpp));
...
            /*
             * if current tuple qualifies, return it.
             */
            if (key != NULL)
            {
                bool        valid;

                HeapKeyTest(tuple, RelationGetDescr(scan->rs_rd),
                            nkeys, key, valid);
                if (valid)
                {
                    scan->rs_cindex = lineindex;
                    return;
                }
            }

Instructions: (Cpu instructions measured with callgrind tool):

Note that callgrind's numbers aren't very meaningful these days. CPU
pipelining and speculative/out-of-order execution make them very
inaccurate.

Greetings,

Andres Freund


#6Dilip Kumar
dilipbalaut@gmail.com
In reply to: Andres Freund (#5)
Re: Proposal: scan key push down to heap [WIP]

On Fri, Oct 14, 2016 at 11:54 AM, Andres Freund <andres@anarazel.de> wrote:

I don't think it's a good idea to do this under the content lock in any
case. But luckily I don't think we have to do so at all.

Due to pagemode - which is used by pretty much everything iterating over
heaps, and definitely seqscans - the key test already happens without
the content lock held, in heapgettup_pagemode():

Yes, you are right. After this clarification, it's clear that
even if we push down the key, we are not evaluating it under the buffer
content lock.

As of now, I have done my further analysis keeping in mind only
pushing 'var op const'. Below are my observations.

#1. If we process the qual in the executor, all temporary memory allocations
are made under the "per_tuple_context", which gets reset after each tuple is
processed. But if we do it in the heap layer, they will be made under
the "per_query_context". This restricts us from pushing down any qual that
needs to allocate memory, such as one on a "toastable" field.

Is there any other way to handle this?

#2. Currently quals are ordered by cost (see
order_qual_clauses), but once we push down some of the quals,
those quals will always be executed first. Can this create a problem?

Consider the example below:

create or replace function f1(anyelement) returns bool as
$$
begin
raise notice '%', $1;
return true;
end;
$$ LANGUAGE plpgsql cost 0.01;

select * from t3 where a>1 and f1(b);

In this case "f1(b)" will always be executed as the first qual (its cost is
set very low by the user), so it will raise a notice for every tuple.
But if we push down the "a>1" qual to the heap, only qualifying tuples will
be passed to "f1(b)".

Is this a behaviour change?

I know it can also impact performance, because when the user has
given some function a very low cost, it should be executed
first, as it may discard most of the tuples.

One solution to this could be:
1. Only push down if all quals have the same cost.
2. Push down quals until we find the first non-pushable qual (this way
we maintain the same qual execution order as in the
existing system).

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


#7Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Dilip Kumar (#6)
Re: Proposal: scan key push down to heap [WIP]

Dilip Kumar wrote:

#2. Currently quals are ordered based on cost (refer
order_qual_clauses), But once we pushdown some of the quals, then
those quals will always be executed first. Can this create problem ?

We don't promise order of execution (which is why we can afford to sort
on cost), but I think it makes sense to keep a rough ordering based on
cost, and not let this push-down affect those ordering decisions too
much.

I think it's fine to push-down quals that are within the same order of
magnitude of the cost of a pushable condition, while keeping any other
much-costlier qual (which could otherwise be pushed down) together with
non-pushable conditions; this would sort-of guarantee within-order-of-
magnitude order of execution of quals.

Hopefully an example clarifies what I mean. Let's suppose you have
three quals, where qual2 is non-pushable but 1 and 3 are. cost(1)=10,
cost(2)=11, cost(3)=12. Currently, they are executed in that order.

If you were to compare costs in the straightforward way, you would push
only 1 (because 3 is costlier than 2 which is not push-down-able). With
fuzzy comparisons, you'd push both 1 and 3, because cost of 3 is close
enough to that of qual 2.

But if cost(3)=100 then only push qual 1, and let qual 3 be evaluated
together with 2 outside the scan node.

BTW, should we cost push-down-able quals differently, say discount some
fraction of the cost, to reflect the fact that they are cheaper to run?
However, since the decision of which ones to push down depends on the
cost, and the cost would depend on which ones we push down, it looks
rather messy.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#7)
Re: Proposal: scan key push down to heap [WIP]

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

BTW, should we cost push-down-able quals differently, say discount some
fraction of the cost, to reflect the fact that they are cheaper to run?
However, since the decision of which ones to push down depends on the
cost, and the cost would depend on which ones we push down, it looks
rather messy.

I don't think our cost model is anywhere near refined enough to make it
worth trying to distinguish that. (Frankly, I'm pretty skeptical of this
entire patch being worth the trouble...)

regards, tom lane


#9Dilip Kumar
dilipbalaut@gmail.com
In reply to: Alvaro Herrera (#7)
Re: Proposal: scan key push down to heap [WIP]

On Tue, Oct 25, 2016 at 10:35 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

We don't promise order of execution (which is why we can afford to sort
on cost), but I think it makes sense to keep a rough ordering based on
cost, and not let this push-down affect those ordering decisions too
much.

OK, thanks for the clarification.

I think it's fine to push-down quals that are within the same order of
magnitude of the cost of a pushable condition, while keeping any other
much-costlier qual (which could otherwise be pushed down) together with
non-pushable conditions; this would sort-of guarantee within-order-of-
magnitude order of execution of quals.

Hopefully an example clarifies what I mean. Let's suppose you have
three quals, where qual2 is non-pushable but 1 and 3 are. cost(1)=10,
cost(2)=11, cost(3)=12. Currently, they are executed in that order.

If you were to compare costs in the straightforward way, you would push
only 1 (because 3 is costlier than 2 which is not push-down-able). With
fuzzy comparisons, you'd push both 1 and 3, because cost of 3 is close
enough to that of qual 2.

But if cost(3)=100 then only push qual 1, and let qual 3 be evaluated
together with 2 outside the scan node.

After putting more thought into this, IMHO it need not be so
complicated. Currently we are talking about pushing only "var op
const", and the cost of all such functions is very low and fixed at 1.

Do we really need to take care of a user-defined function declared
with a very low cost?
While building index conditions we don't take care of such things
either: index conditions are always evaluated first, and only then is
the filter applied.

Am I missing something?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


#10Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#8)
Re: Proposal: scan key push down to heap [WIP]

On 2016-10-25 13:18:47 -0400, Tom Lane wrote:

(Frankly, I'm pretty skeptical of this entire patch being worth the
trouble...)

The gains are quite noticeable in some cases. So if we can make it work
without noticeable downsides...

What I'm worried about though is that this, afaics, will quite
noticeably *increase* total cost in cases with a noticeable number of
columns and a not that selective qual. The reason for that being that
HeapKeyTest() uses heap_getattr(), whereas upper layers use
slot_getattr(). The latter "caches" repeated deforms, the former
doesn't... That'll lead to deforming being essentially done twice, and
it's quite often already a major cost of query processing.


#11Amit Kapila
amit.kapila16@gmail.com
In reply to: Andres Freund (#10)
Re: Proposal: scan key push down to heap [WIP]

On Wed, Oct 26, 2016 at 12:01 PM, Andres Freund <andres@anarazel.de> wrote:

On 2016-10-25 13:18:47 -0400, Tom Lane wrote:

(Frankly, I'm pretty skeptical of this entire patch being worth the
trouble...)

The gains are quite noticeable in some cases. So if we can make it work
without noticeable downsides...

+1.

What I'm worried about though is that this, afaics, will quite
noticeably *increase* total cost in cases with a noticeable number of
columns and a not that selective qual. The reason for that being that
HeapKeyTest() uses heap_getattr(), whereas upper layers use
slot_getattr(). The latter "caches" repeated deforms, the former
doesn't... That'll lead to deforming being essentially done twice, and
it's quite often already a major cost of query processing.

heap_getattr() also has some caching mechanism for the tuple
offset, though it might not be as good as slot_getattr()'s. I think
if we decide to form the scan key from a qual only when the qual refers to
a fixed-length column that comes before any variable-length column, the
increased cost will be alleviated. Do you have any other idea to
alleviate such cost?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


#12Andres Freund
andres@anarazel.de
In reply to: Amit Kapila (#11)
Re: Proposal: scan key push down to heap [WIP]

On 2016-10-28 11:23:22 +0530, Amit Kapila wrote:

On Wed, Oct 26, 2016 at 12:01 PM, Andres Freund <andres@anarazel.de> wrote:

What I'm worried about though is that this, afaics, will quite
noticeably *increase* total cost in cases with a noticeable number of
columns and a not that selective qual. The reason for that being that
HeapKeyTest() uses heap_getattr(), whereas upper layers use
slot_getattr(). The latter "caches" repeated deforms, the former
doesn't... That'll lead to deforming being essentially done twice, and
it's quite often already a major cost of query processing.

heap_getattr() also has some caching mechanism to cache the tuple
offset , however it might not be as good as slot_getattr().

It's most definitely not as good. In fact, my measurements show it to be
a net negative in a number of cases.

I think if we decide to form the scan key from a qual only when qual
refers to fixed length column and that column is before any varlen
column, the increased cost will be alleviated. Do you have any other
idea to alleviate such cost?

Well, that'll also make the feature not particularly useful :(. My
suspicion is that the way to succeed here isn't to rely more on testing
as part of the scan, but to create a more general fastpath for qual
evaluation, which atm is a *LOT* more heavyweight than what
HeapKeyTest() does. But maybe I'm biased, since I'm working on the
latter...

Andres


#13Amit Kapila
amit.kapila16@gmail.com
In reply to: Andres Freund (#12)
Re: Proposal: scan key push down to heap [WIP]

On Fri, Oct 28, 2016 at 12:16 PM, Andres Freund <andres@anarazel.de> wrote:

On 2016-10-28 11:23:22 +0530, Amit Kapila wrote:

I think if we decide to form the scan key from a qual only when qual
refers to fixed length column and that column is before any varlen
column, the increased cost will be alleviated. Do you have any other
idea to alleviate such cost?

Well, that'll also make the feature not particularly useful :(.

Yeah, the number of cases it can benefit will certainly be reduced, but
it can still be a win wherever it applies, which is not bad.
Another thing to consider here is whether we can use
selectivity as a measure to decide if we can push the quals down. If
the quals are selective, then I think the impact of the less effective
caching can be negated and we can in turn see benefits. For example, we
can take a table with 20 to 40 columns and a large number of rows,
use different columns in the quals, and then test at different
selectivities to see how the performance varies with this patch.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


#14Dilip Kumar
dilipbalaut@gmail.com
In reply to: Andres Freund (#10)
Re: Proposal: scan key push down to heap [WIP]

On Wed, Oct 26, 2016 at 12:01 PM, Andres Freund <andres@anarazel.de> wrote:

The gains are quite noticeable in some cases. So if we can make it work
without noticeable downsides...

What I'm worried about though is that this, afaics, will quite
noticeably *increase* total cost in cases with a noticeable number of
columns and a not that selective qual. The reason for that being that
HeapKeyTest() uses heap_getattr(), whereas upper layers use
slot_getattr(). The latter "caches" repeated deforms, the former
doesn't... That'll lead to deforming being essentially done twice, and
it's quite often already a major cost of query processing.

What about putting a slot reference inside HeapScanDesc? I know it
will make the heap layer use an executor structure, but it's just a thought.

I have quickly hacked it this way: we keep a slot reference in
HeapScanDesc and use slot_getattr directly inside HeapKeyTest
(only if we have a valid slot, otherwise heap_getattr), and measured
the worst-case performance (what you
mentioned above).

My Test: (21-column table with a varchar at the beginning + quals on the
last few columns + varying selectivity)

postgres=# \d test
Table "public.test"
Column | Type | Modifiers
--------+-------------------+-----------
f1 | integer |
f2 | character varying |
f3 | integer |
f4 | integer |
f5 | integer |
f6 | integer |
f7 | integer |
f8 | integer |
f9 | integer |
f10 | integer |
f11 | integer |
f12 | integer |
f13 | integer |
f14 | integer |
f15 | integer |
f16 | integer |
f17 | integer |
f18 | integer |
f19 | integer |
f20 | integer |
f21 | integer |

tuple count : 10000000 (10 million)
explain analyze select * from test where f21 < $1 and f20 < $1 and f19
< $1 and f15 < $1 and f10 < $1; ($1 varies from 1 million to 10 million).

Target code base:
-----------------------
1. Head
2. Heap_scankey_pushdown_v1
3. My hack for keeping slot reference in HeapScanDesc
(v1+use_slot_in_HeapKeyTest)

Result:
Selectivity   Head   scan_key_pushdown_v1   v1+use_slot_in_HeapKeyTest
0.1           3880   2980                   2747
0.2           4041   3187                   2914
0.5           5051   4921                   3626
0.8           5378   7296                   3879
1.0           6161   8525                   4575

The performance graph is attached.

Observation:
----------------
1. heap_scankey_pushdown_v1 starts degrading at very high
selectivity (this behaviour is only visible if the table has 20 or more
columns; I tested with 10 columns and did not see any
regression in v1).

2. (v1+use_slot_in_HeapKeyTest) is always the winner, even at very high selectivity.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachments:

heap_scan_key_pushdown.png (image/png)
#15Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#12)
Re: Proposal: scan key push down to heap [WIP]

On Fri, Oct 28, 2016 at 2:46 AM, Andres Freund <andres@anarazel.de> wrote:

Well, that'll also make the feature not particularly useful :(. My
suspicion is that the way to suceed here isn't to rely more on testing
as part of the scan, but create a more general fastpath for qual
evaluation, which atm is a *LOT* more heavyweight than what
HeapKeyTest() does. But maybe I'm biased since I'm working on the
latter...

I think you might be right, but I'm not very clear on what the
timeline for your work is. It would be easier to say, sure, let's put
this on hold if we knew that in a month or two we could come back and
retest after you've made some progress. But I don't know whether
we're talking about months or years.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#16Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#15)
Re: Proposal: scan key push down to heap [WIP]

On 2016-10-31 09:28:00 -0400, Robert Haas wrote:

On Fri, Oct 28, 2016 at 2:46 AM, Andres Freund <andres@anarazel.de> wrote:

Well, that'll also make the feature not particularly useful :(. My
suspicion is that the way to suceed here isn't to rely more on testing
as part of the scan, but create a more general fastpath for qual
evaluation, which atm is a *LOT* more heavyweight than what
HeapKeyTest() does. But maybe I'm biased since I'm working on the
latter...

I think you might be right, but I'm not very clear on what the
timeline for your work is.

Me neither. But I think if we can stomach Dilip's approach of using a
slot in heapam, then I think my concerns are addressed, and this is
probably going go to be a win regardless of faster expression evaluation
and/or batching.

It would be easier to say, sure, let's put
this on hold if we knew that in a month or two we could come back and
retest after you've made some progress. But I don't know whether
we're talking about months or years.

I sure hope it's months.

Andres


#17KaiGai Kohei
kaigai@ak.jp.nec.com
In reply to: Dilip Kumar (#14)
Re: Proposal: scan key push down to heap [WIP]

-----Original Message-----
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Dilip Kumar
Sent: Saturday, October 29, 2016 3:48 PM
To: Andres Freund
Cc: Tom Lane; Alvaro Herrera; pgsql-hackers
Subject: Re: [HACKERS] Proposal: scan key push down to heap [WIP]

On Wed, Oct 26, 2016 at 12:01 PM, Andres Freund <andres@anarazel.de> wrote:

The gains are quite noticeable in some cases. So if we can make it work
without noticeable downsides...

What I'm worried about though is that this, afaics, will quite
noticeably *increase* total cost in cases with a noticeable number of
columns and a not that selective qual. The reason for that being that
HeapKeyTest() uses heap_getattr(), whereas upper layers use
slot_getattr(). The latter "caches" repeated deforms, the former
doesn't... That'll lead to deforming being essentially done twice, and
it's quite often already a major cost of query processing.

What about putting slot reference inside HeapScanDesc ?. I know it
will make ,heap layer use executor structure but just a thought.

I have quickly hacked this way where we use slot reference in
HeapScanDesc and directly use
slot_getattr inside HeapKeyTest (only if we have valid slot otherwise
use _heap_getattr) and measure the worst case performance (what you
have mentioned above.)

My Test: (21 column table with varchar in beginning + qual is on last
few column + varying selectivity )

postgres=# \d test
Table "public.test"
Column | Type | Modifiers
--------+-------------------+-----------
f1 | integer |
f2 | character varying |
f3 | integer |
f4 | integer |
f5 | integer |
f6 | integer |
f7 | integer |
f8 | integer |
f9 | integer |
f10 | integer |
f11 | integer |
f12 | integer |
f13 | integer |
f14 | integer |
f15 | integer |
f16 | integer |
f17 | integer |
f18 | integer |
f19 | integer |
f20 | integer |
f21 | integer |

tuple count : 10000000 (10 Million)
explain analyze select * from test where f21< $1 and f20 < $1 and f19
< $1 and f15 < $1 and f10 < $1; ($1 vary from 1Million to 1Million).

Target code base:
-----------------------
1. Head
2. Heap_scankey_pushdown_v1
3. My hack for keeping slot reference in HeapScanDesc
(v1+use_slot_in_HeapKeyTest)

Result:
Selectivity Head scan_key_pushdown_v1 v1+use_slot_in_HeapKeyTest
0.1 3880 2980 2747
0.2 4041 3187 2914
0.5 5051 4921 3626
0.8 5378 7296 3879
1.0 6161 8525 4575

Performance graph is attached in the mail..

Observation:
----------------
1. Heap_scankey_pushdown_v1, start degrading after very high
selectivity (this behaviour is only visible if table have 20 or more
columns, I tested with 10 columns but with that I did not see any
regression in v1).

2. (v1+use_slot_in_HeapKeyTest) is always winner, even at very high selectivity.

Prior to this interface change, a starting point might be to restrict scan key
pushdown to cases where the OpExpr references a column with a static
attcacheoff. Such a column does not require walking the tuple from the head,
so tuple deforming cost will not be a downside.

By the way, I'm a bit skeptical about whether this enhancement is really
worth the work it requires, because we can now easily increase the number
of processor cores to run a seq-scan with a qualifier, especially when it
has high selectivity.
What are your thoughts?

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>


#18Dilip Kumar
dilipbalaut@gmail.com
In reply to: Dilip Kumar (#14)
Re: Proposal: scan key push down to heap [WIP]

On Sat, Oct 29, 2016 at 12:17 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:

What about putting slot reference inside HeapScanDesc ?. I know it
will make ,heap layer use executor structure but just a thought.

I have quickly hacked this way where we use slot reference in
HeapScanDesc and directly use
slot_getattr inside HeapKeyTest (only if we have valid slot otherwise
use _heap_getattr) and measure the worst case performance (what you
have mentioned above.)

My test: a 21-column table with a varchar near the beginning, quals on
the last few columns, and varying selectivity.

postgres=# \d test
Table "public.test"
Column | Type | Modifiers
--------+-------------------+-----------
f1 | integer |
f2 | character varying |
f3 | integer |
f4 | integer |
f5 | integer |
f6 | integer |
f7 | integer |
f8 | integer |
f9 | integer |
f10 | integer |
f11 | integer |
f12 | integer |
f13 | integer |
f14 | integer |
f15 | integer |
f16 | integer |
f17 | integer |
f18 | integer |
f19 | integer |
f20 | integer |
f21 | integer |

tuple count : 10000000 (10 Million)
explain analyze select * from test where f21 < $1 and f20 < $1 and f19
< $1 and f15 < $1 and f10 < $1; ($1 varies from 1 million to 10 million).

Target code base:
-----------------------
1. Head
2. Heap_scankey_pushdown_v1
3. My hack for keeping slot reference in HeapScanDesc
(v1+use_slot_in_HeapKeyTest)

Result:
Selectivity Head scan_key_pushdown_v1 v1+use_slot_in_HeapKeyTest
0.1 3880 2980 2747
0.2 4041 3187 2914
0.5 5051 4921 3626
0.8 5378 7296 3879
1.0 6161 8525 4575

A performance graph is attached to the mail.

Observation:
----------------
1. Heap_scankey_pushdown_v1 starts degrading at very high
selectivity (this behaviour is only visible if the table has 20 or more
columns; with 10 columns I did not see any regression in v1).

2. (v1+use_slot_in_HeapKeyTest) is always the winner, even at very high
selectivity.

The patch for this (storing the slot reference in HeapScanDesc) is attached.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachments:

heap_scankey_pushdown_v2.patch (application/octet-stream, +194 -18)
#19Robert Haas
robertmhaas@gmail.com
In reply to: KaiGai Kohei (#17)
Re: Proposal: scan key push down to heap [WIP]

On Tue, Nov 1, 2016 at 8:31 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

By the way, I'm a bit skeptical about whether this enhancement is really worth
the work it requires, because we can now easily increase the number of
processor cores running a seq-scan with a qualifier, especially when it has
high selectivity.
What are your thoughts?

Are you saying we don't need to bother making sequential scans faster
because we could just use parallel sequential scan instead? That
doesn't sound right to me.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#20Dilip Kumar
dilipbalaut@gmail.com
In reply to: Robert Haas (#19)
Re: Proposal: scan key push down to heap [WIP]

I have done a performance analysis of the TPC-H queries and saw a
visible gain in 5 queries (10-25%).

Performance Data:

Benchmark : TPCH (S.F. 10)
shared_buffer : 20GB
work_mem : 50MB
Machine : POWER

Results are the median of three runs (explain analyze results for both
head/patch are attached in TPCH_out.tar).

Execution time in (ms)
Query   Head   Patch   Improvement
Q3 18475 16558 10%
Q4 7526 5856 22%
Q7 19386 17425 10%
Q10 16994 15019 11%
Q12 13852 10117 26%

Currently we have two major problems with this patch:

Problem 1: As Andres has mentioned, HeapKeyTest uses heap_getattr,
whereas ExecQual uses slot_getattr(). So we can hit a worst-case
performance problem when very few tuples are filtered out and the
table has many columns with quals on most of them.
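The cost asymmetry behind problem 1 can be illustrated with a toy counter model (demo_heap_getattr and demo_slot_getattr are invented names, and the counters only approximate per-attribute decode work): heap_getattr on a tuple with no cached offsets must walk attributes from the start on every call, while a slot deforms incrementally and remembers how far it got.

```c
#include <assert.h>

static int walk_steps;      /* work done by repeated heap_getattr calls */
static int deform_steps;    /* work done through the slot's cache */
static int slot_nvalid;     /* attributes already deformed in the "slot" */

/* heap_getattr-like path: walks attributes 1..attnum on every call */
static void
demo_heap_getattr(int attnum)
{
    walk_steps += attnum;
}

/* slot_getattr-like path: deforms only the gap past what is cached */
static void
demo_slot_getattr(int attnum)
{
    if (attnum > slot_nvalid)
    {
        deform_steps += attnum - slot_nvalid;
        slot_nvalid = attnum;
    }
}
```

With quals on columns 18, 19, and 20 of a wide table, the walking path pays 18+19+20 units while the slot path pays 20 in total, which matches the direction of the regression seen in v1.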

Problem 2: In HeapKeyTest we are under per_query_ctx, whereas in
ExecQual we are under per_tuple_ctx, so in the former we cannot afford
any palloc.

In this patch I have addressed both concerns by exposing executor
information to the heap layer (I exposed per_tuple_ctx and the slot to
HeapScanDesc), which is not a very good design.

I have other ideas in mind for solving these concerns; please share
your thoughts.

For problem 1:
I think it's better to give the task of key pushdown to the optimizer,
where we can make the decision based mostly on two parameters:
1. Selectivity.
2. The column number on which the qual is given.
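A planner-side heuristic combining those two parameters might look like the sketch below. The thresholds are entirely made up for illustration (a real patch would derive the decision from costing, not hard-coded cutoffs), and demo_should_pushdown is an invented name.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical planner heuristic: push a scan key down to the heap only
 * when it is likely to pay off.  Thresholds here are illustrative. */
static bool
demo_should_pushdown(double selectivity, int attnum, int natts)
{
    /* At very high selectivity almost every tuple passes, so the extra
     * deforming work in HeapKeyTest outweighs the saved ExecQual calls. */
    if (selectivity > 0.8)
        return false;

    /* A qual on a late column forces deforming of everything before it,
     * which is where the v1 regression on wide tables showed up. */
    if (attnum > natts / 2)
        return false;

    return true;
}
```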

For problem 2:
I think to solve this we need to limit the datatypes we push down to
the heap (that is, we can push down any datatype that doesn't need a
palloc in its qual test).
1. Don't push down variable-length datatypes.
2. Some other fixed-length datatypes like 'name' can also
palloc (e.g. nameregexeq), so we need to block them as well.
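That datatype restriction could be sketched as a simple predicate. The OIDs below are PostgreSQL's built-in type OIDs (19 is the 'name' type, 23 is int4, 25 is text), but demo_type_pushdown_safe is an invented name and the blocklist is certainly incomplete; it only illustrates the two rules above.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical check: is a qual on this type safe to evaluate inside
 * HeapKeyTest, where no palloc is allowed?
 */
static bool
demo_type_pushdown_safe(int typlen, unsigned int typeoid)
{
    if (typlen < 0)             /* varlena: detoasting may palloc */
        return false;
    if (typeoid == 19)          /* name: operators like nameregexeq palloc */
        return false;
    return true;
}
```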

*Note: To understand exactly which keys are pushed down in the
attached explain analyze output, refer to this example:

without patch:
-> Parallel Seq Scan on orders (cost=0.00..369972.00 rows=225038
width=20) (actual time=0.025..3216.157 rows=187204 loops=3)
Filter: ((o_orderdate >= '1995-01-01'::date) AND
(o_orderdate < '1995-04-01 00:00:00'::timestamp without time zone))
Rows Removed by Filter: 4812796

with patch:
-> Parallel Seq Scan on orders (cost=0.00..369972.00 rows=225038
width=20) (actual time=0.015..1884.993 rows=187204 loops=3)
Filter: ((o_orderdate >= '1995-01-01'::date) AND
(o_orderdate < '1995-04-01 00:00:00'::timestamp without time zone))

1. On head, the plan shows how many rows were discarded by the filter
(Rows Removed by Filter: 4812796). If we push down all the keys,
this value is no longer shown.

2. We can also see how much the actual time of the Seq Scan node is
reduced (in the above example it was 3216.157 ms on head versus
1884.993 ms with the patch).

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


Attachments:

heap_scankey_pushdown_expr_ctx.patch (application/octet-stream, +204 -19)
TPCH_out.tar (application/x-tar)
#21Robert Haas
robertmhaas@gmail.com
In reply to: Dilip Kumar (#20)
#22Dilip Kumar
dilipbalaut@gmail.com
In reply to: Robert Haas (#21)
#23Dilip Kumar
dilipbalaut@gmail.com
In reply to: Dilip Kumar (#22)
#24Dilip Kumar
dilipbalaut@gmail.com
In reply to: Dilip Kumar (#23)
#25Dilip Kumar
dilipbalaut@gmail.com
In reply to: Dilip Kumar (#24)
#26Robert Haas
robertmhaas@gmail.com
In reply to: Dilip Kumar (#24)
#27Dilip Kumar
dilipbalaut@gmail.com
In reply to: Robert Haas (#26)
#28Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#26)
#29Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#28)
#30Robert Haas
robertmhaas@gmail.com
In reply to: Dilip Kumar (#27)
#31Dilip Kumar
dilipbalaut@gmail.com
In reply to: Robert Haas (#30)
#32Robert Haas
robertmhaas@gmail.com
In reply to: Dilip Kumar (#31)
#33Andres Freund
andres@anarazel.de
In reply to: Dilip Kumar (#31)
#34Haribabu Kommi
kommi.haribabu@gmail.com
In reply to: Andres Freund (#33)
#35Dilip Kumar
dilipbalaut@gmail.com
In reply to: Andres Freund (#33)
#36Robert Haas
robertmhaas@gmail.com
In reply to: Dilip Kumar (#35)