Failures with installcheck and low work_mem value in 13~

Started by Michael Paquierover 5 years ago9 messages
#1Michael Paquier
michael@paquier.xyz

Hi all,

Attempting to run installcheck with 13~ and a value of work_mem lower
than the default causes two failures, both related to incremental
sorts (here work_mem = 1MB):
1) Test incremental_sort:
@@ -4,12 +4,13 @@
 select * from (select * from tenk1 order by four) t order by four, ten;
             QUERY PLAN             
 -----------------------------------
- Sort
+ Incremental Sort
    Sort Key: tenk1.four, tenk1.ten
+   Presorted Key: tenk1.four
    ->  Sort
          Sort Key: tenk1.four
          ->  Seq Scan on tenk1
-(5 rows)
+(6 rows)
2) Test join:
@@ -2368,12 +2368,13 @@
    ->  Merge Left Join
          Merge Cond: (x.thousand = y.unique2)
          Join Filter: ((x.twothousand = y.hundred) AND (x.fivethous = y.unique2))
-         ->  Sort
+         ->  Incremental Sort
                Sort Key: x.thousand, x.twothousand, x.fivethous
-               ->  Seq Scan on tenk1 x
+               Presorted Key: x.thousand
+               ->  Index Scan using tenk1_thous_tenthous on tenk1 x
          ->  Materialize
                ->  Index Scan using tenk1_unique2 on tenk1 y
-(9 rows)
+(10 rows)

There are of course regression failures when changing the relation
page size or such, but we should have tests more portable when it
comes to work_mem (this issue does not exist in ~12) or people running
installcheck on a new instance would be surprised. Please note that I
have not looked at the problem in details, but a simple solution would
be to enforce work_mem in those code paths to keep the two plans
stable.

Thanks,
--
Michael

#2Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#1)
Re: Failures with installcheck and low work_mem value in 13~

On Mon, Jun 15, 2020 at 10:29:41PM +0900, Michael Paquier wrote:

Attempting to run installcheck with 13~ and a value of work_mem lower
than the default causes two failures, both related to incremental
sorts (here work_mem = 1MB):
1) Test incremental_sort:
@@ -4,12 +4,13 @@
select * from (select * from tenk1 order by four) t order by four, ten;
QUERY PLAN             
-----------------------------------
- Sort
+ Incremental Sort
Sort Key: tenk1.four, tenk1.ten
+   Presorted Key: tenk1.four
->  Sort
Sort Key: tenk1.four
->  Seq Scan on tenk1
-(5 rows)
+(6 rows)

Looking at this one, it happens that this is the first test in
incremental_sort.sql, and we have the following comment:
-- When we have to sort the entire table, incremental sort will
-- be slower than plain sort, so it should not be used.
explain (costs off)
select * from (select * from tenk1 order by four) t order by four, ten;

When using such a low value of work_mem, why do we switch to an
incremental sort if we know that it is going to be slower than a plain
sort? Something looks wrong in the planner choice here.
--
Michael

#3Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Michael Paquier (#2)
Re: Failures with installcheck and low work_mem value in 13~

On Tue, Jun 16, 2020 at 02:39:47PM +0900, Michael Paquier wrote:

On Mon, Jun 15, 2020 at 10:29:41PM +0900, Michael Paquier wrote:

Attempting to run installcheck with 13~ and a value of work_mem lower
than the default causes two failures, both related to incremental
sorts (here work_mem = 1MB):
1) Test incremental_sort:
@@ -4,12 +4,13 @@
select * from (select * from tenk1 order by four) t order by four, ten;
QUERY PLAN
-----------------------------------
- Sort
+ Incremental Sort
Sort Key: tenk1.four, tenk1.ten
+   Presorted Key: tenk1.four
->  Sort
Sort Key: tenk1.four
->  Seq Scan on tenk1
-(5 rows)
+(6 rows)

Looking at this one, it happens that this is the first test in
incremental_sort.sql, and we have the following comment:
-- When we have to sort the entire table, incremental sort will
-- be slower than plain sort, so it should not be used.
explain (costs off)
select * from (select * from tenk1 order by four) t order by four, ten;

When using such a low value of work_mem, why do we switch to an
incremental sort if we know that it is going to be slower than a plain
sort? Something looks wrong in the planner choice here.

I don't think it's particularly wrong. The "full sort" can't be done in
memory with such low work_mem value, while the incremental sort can. So
I think the planner choice is sane, it's more than the comment does not
explicitly state this depends on work_mem too.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#4Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Michael Paquier (#1)
Re: Failures with installcheck and low work_mem value in 13~

On Mon, Jun 15, 2020 at 10:29:41PM +0900, Michael Paquier wrote:

Hi all,

Attempting to run installcheck with 13~ and a value of work_mem lower
than the default causes two failures, both related to incremental
sorts (here work_mem = 1MB):
1) Test incremental_sort:
@@ -4,12 +4,13 @@
select * from (select * from tenk1 order by four) t order by four, ten;
QUERY PLAN
-----------------------------------
- Sort
+ Incremental Sort
Sort Key: tenk1.four, tenk1.ten
+   Presorted Key: tenk1.four
->  Sort
Sort Key: tenk1.four
->  Seq Scan on tenk1
-(5 rows)
+(6 rows)
2) Test join:
@@ -2368,12 +2368,13 @@
->  Merge Left Join
Merge Cond: (x.thousand = y.unique2)
Join Filter: ((x.twothousand = y.hundred) AND (x.fivethous = y.unique2))
-         ->  Sort
+         ->  Incremental Sort
Sort Key: x.thousand, x.twothousand, x.fivethous
-               ->  Seq Scan on tenk1 x
+               Presorted Key: x.thousand
+               ->  Index Scan using tenk1_thous_tenthous on tenk1 x
->  Materialize
->  Index Scan using tenk1_unique2 on tenk1 y
-(9 rows)
+(10 rows)

There are of course regression failures when changing the relation
page size or such, but we should have tests more portable when it
comes to work_mem (this issue does not exist in ~12) or people running
installcheck on a new instance would be surprised. Please note that I
have not looked at the problem in details, but a simple solution would
be to enforce work_mem in those code paths to keep the two plans
stable.

I don't think the tests can be made not to depend on work_mem, because
it costing of sort / incremental sort depends on the value. I agree
setting the work_mem at the beginning of the test script is the right
solution.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tomas Vondra (#4)
Re: Failures with installcheck and low work_mem value in 13~

Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:

I don't think the tests can be made not to depend on work_mem, because
it costing of sort / incremental sort depends on the value. I agree
setting the work_mem at the beginning of the test script is the right
solution.

I'm a bit skeptical about changing anything here. There are quite
a large number of GUCs that can affect the regression results, and
it wouldn't be sane to try to force them all to fixed values. For
one thing, that'd be a PITA to maintain, and for another, it's not
infrequently useful to run the tests with nonstandard settings to
see what happens.

Is there a good reason for being concerned about work_mem in particular
and this test script in particular?

regards, tom lane

In reply to: Tom Lane (#5)
Re: Failures with installcheck and low work_mem value in 13~

On Fri, Jun 19, 2020 at 10:27 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:

I don't think the tests can be made not to depend on work_mem, because
it costing of sort / incremental sort depends on the value. I agree
setting the work_mem at the beginning of the test script is the right
solution.

I'm a bit skeptical about changing anything here. There are quite
a large number of GUCs that can affect the regression results, and
it wouldn't be sane to try to force them all to fixed values. For
one thing, that'd be a PITA to maintain, and for another, it's not
infrequently useful to run the tests with nonstandard settings to
see what happens.

+1

--
Peter Geoghegan

#7Michael Paquier
michael@paquier.xyz
In reply to: Peter Geoghegan (#6)
Re: Failures with installcheck and low work_mem value in 13~

On Fri, Jun 19, 2020 at 10:28:56AM -0700, Peter Geoghegan wrote:

On Fri, Jun 19, 2020 at 10:27 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm a bit skeptical about changing anything here. There are quite
a large number of GUCs that can affect the regression results, and
it wouldn't be sane to try to force them all to fixed values. For
one thing, that'd be a PITA to maintain, and for another, it's not
infrequently useful to run the tests with nonstandard settings to
see what happens.

+1

We cared about such plan stability that in the past FWIW, see for
example c588df9 as work_mem is a setting that people like to change.
Why should this be different? work_mem is a popular configuration
setting. Perhaps people will not complain about that being an issue
if running installcheck, we'll know with the time. Anyway, I am fine
to just change my default configuration if the conclusion is to not
touch that and let it be, but I find a bit annoying that switching
work_mem from 4MB to 1MB is enough to destabilize the tests. And this
worked just fine in past releases.
--
Michael

In reply to: Michael Paquier (#7)
Re: Failures with installcheck and low work_mem value in 13~

On Fri, Jun 19, 2020 at 7:48 PM Michael Paquier <michael@paquier.xyz> wrote:

We cared about such plan stability that in the past FWIW, see for
example c588df9 as work_mem is a setting that people like to change.
Why should this be different? work_mem is a popular configuration
setting.

The RMT met today. We determined that it wasn't worth adjusting this
test to pass with non-standard work_mem values.

"make installcheck" also fails with lower random_page_cost settings.
There doesn't seem to be any reason to permit a non-standard setting
to cause installcheck to fail elsewhere, while not tolerating the same
issue here, with work_mem.

Thanks
--
Peter Geoghegan

#9Michael Paquier
michael@paquier.xyz
In reply to: Peter Geoghegan (#8)
Re: Failures with installcheck and low work_mem value in 13~

On Thu, Jun 25, 2020 at 12:15:54PM -0700, Peter Geoghegan wrote:

The RMT met today. We determined that it wasn't worth adjusting this
test to pass with non-standard work_mem values.

Okay, thanks for the feedback. We'll see how it works out.
--
Michael