PostgreSQL 10: Segmentation fault when using GROUPING SETS with all unsortable columns

Started by Huong Dangminhalmost 8 years ago6 messages
#1Huong Dangminh
huo-dangminh@ys.jp.nec.com
2 attachment(s)

Hi,

I have found a case which could get segmentation fault when using GROUPING SETS.
It occurs when columns in GROUPING SETS are all unsortable but hashable.

Attached grouping_sets_segv.zip include module to reproduce this problem.

I think it made from below change.

https://www.postgresql.org/docs/current/static/release-10.html#id-1.11.6.8.5.3.6
---
Allow hashed aggregation to be used with grouping sets (Andrew Gierth)
---
https://github.com/postgres/postgres/commit/b5635948ab165b6070e7d05d111f966e07570d81

bt from attached grouping_sets_segv.zip is like below.
---------------
Program terminated with signal 11, Segmentation fault.
#0 0x0000000000797855 in consider_groupingsets_paths (root=0x11fe078, grouped_rel=0x120cfc0, path=0x11ff488, is_sorted=0 '\000',
can_hash=1 '\001', target=0x11ffb20, gd=0x11fedb8, agg_costs=0x7fffed813850, dNumGroups=400) at planner.c:4205
4205 unhashed_rollup = lfirst(l_start);
Missing separate debuginfos, use: debuginfo-install glibc-2.17-196.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-8.el7.x86_64 libcom_err-1.42.9-7.el7.x86_64 libselinux-2.2.2-6.el7.x86_64 libxml2-2.9.1-6.el7_2.3.x86_64 openssl-libs-1.0.1e-42.el7.x86_64 pcre-8.32-14.el7.x86_64 tde_for_pg10-1.2.0-0.el7.x86_64 xz-libs-5.1.2-9alpha.el7.x86_64 zlib-1.2.7-17.el7.x86_64
(gdb) bt
#0 0x0000000000797855 in consider_groupingsets_paths (root=0x11fe078, grouped_rel=0x120cfc0, path=0x11ff488, is_sorted=0 '\000',
can_hash=1 '\001', target=0x11ffb20, gd=0x11fedb8, agg_costs=0x7fffed813850, dNumGroups=400) at planner.c:4205
#1 0x0000000000797404 in create_grouping_paths (root=0x11fe078, input_rel=0x11fe290, target=0x11ffb20, agg_costs=0x7fffed813850,
gd=0x11fedb8) at planner.c:4045
#2 0x0000000000793a06 in grouping_planner (root=0x11fe078, inheritance_update=0 '\000', tuple_fraction=0) at planner.c:1895
#3 0x0000000000791ed8 in subquery_planner (glob=0x11b2a50, parse=0x11b2448, parent_root=0x0, hasRecursion=0 '\000', tuple_fraction=0)
at planner.c:862
#4 0x0000000000790d9e in standard_planner (parse=0x11b2448, cursorOptions=256, boundParams=0x0) at planner.c:334
#5 0x0000000000790b30 in planner (parse=0x11b2448, cursorOptions=256, boundParams=0x0) at planner.c:210
#6 0x00000000008756a1 in pg_plan_query (querytree=0x11b2448, cursorOptions=256, boundParams=0x0) at postgres.c:796
#7 0x00000000008757ce in pg_plan_queries (querytrees=0x11febc8, cursorOptions=256, boundParams=0x0) at postgres.c:862
#8 0x0000000000875a92 in exec_simple_query (
query_string=0x11b1060 "select c1,c2 from testtbl_gouping_sets group by grouping sets(c1,c2);") at postgres.c:1027
#9 0x0000000000879dc2 in PostgresMain (argc=1, argv=0x115d570, dbname=0x115d3d8 "si2", username=0x112dfc0 "postgres")
at postgres.c:4088
#10 0x00000000007dddb7 in BackendRun (port=0x1155fc0) at postmaster.c:4405
#11 0x00000000007dd53f in BackendStartup (port=0x1155fc0) at postmaster.c:4077
#12 0x00000000007d9a47 in ServerLoop () at postmaster.c:1755
#13 0x00000000007d9086 in PostmasterMain (argc=1, argv=0x112be90) at postmaster.c:1363
#14 0x00000000007184a2 in main (argc=1, argv=0x112be90) at main.c:228
(gdb) p l_start
$1 = (ListCell *) 0x0
(gdb) p gd->rollups
$2 = (List *) 0x0
(gdb)
---------------

Look at related source I found that,
In planner.c:preprocess_grouping_sets, we do not update gd->rollups if all of columns in
GROUPING SETS are unsortable but hashable.

After that in planner.c:consider_groupingsets_paths we used gd->rollups and made the SEGV above.

Not yet fully understand the related commit, but I think it is fine to put
ERRCODE_FEATURE_NOT_SUPPORTED error from preprocess_grouping_sets when all columns in
GROUPING SETS are unsortable. Is that right?

I have tried to write a patch (attached).
Could anyone confirm for me?

---
Thanks and best regards,
Dang Minh Huong
NEC Solution Innovators, Ltd.
http://www.nec-solutioninnovators.co.jp/en/

Attachments:

grouping_sets_all_unsortable_columns.patchapplication/octet-stream; name=grouping_sets_all_unsortable_columns.patchDownload
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 4dae405..4aebbf9 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -2173,7 +2173,10 @@ preprocess_grouping_sets(PlannerInfo *root)
 		if (sortable_sets)
 			sets = extract_rollup_sets(sortable_sets);
 		else
-			sets = NIL;
+			ereport(ERROR,
+					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+					 errmsg("could not implement GROUP BY"),
+					 errdetail("Some of the datatypes only support hashing, while others only support sorting.")));
 	}
 	else
 		sets = extract_rollup_sets(parse->groupingSets);
grouping_sets_segv.zipapplication/x-zip-compressed; name=grouping_sets_segv.zipDownload
#2Huong Dangminh
huo-dangminh@ys.jp.nec.com
In reply to: Huong Dangminh (#1)
RE: PostgreSQL 10: Segmentation fault when using GROUPING SETS with all unsortable columns

Hi,

I have found a case which could get segmentation fault when using GROUPING
SETS.
It occurs when columns in GROUPING SETS are all unsortable but hashable.

I have added this thread to current CF.
please let me know if you need more information.

---
Thanks and best regards,
Dang Minh Huong
NEC Solution Innovators, Ltd.
http://www.nec-solutioninnovators.co.jp/en/

#3Andres Freund
andres@anarazel.de
In reply to: Huong Dangminh (#1)
Re: PostgreSQL 10: Segmentation fault when using GROUPING SETS with all unsortable columns

Hi,

On 2018-03-17 23:43:22 +0000, Huong Dangminh wrote:

Hi,

I have found a case which could get segmentation fault when using GROUPING SETS.
It occurs when columns in GROUPING SETS are all unsortable but hashable.

Attached grouping_sets_segv.zip include module to reproduce this problem.

I think it made from below change.

https://www.postgresql.org/docs/current/static/release-10.html#id-1.11.6.8.5.3.6
---
Allow hashed aggregation to be used with grouping sets (Andrew Gierth)
---
https://github.com/postgres/postgres/commit/b5635948ab165b6070e7d05d111f966e07570d81

Andrew, any comments?

Regards,

Andres

#4Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Huong Dangminh (#1)
Re: PostgreSQL 10: Segmentation fault when using GROUPING SETS with all unsortable columns

"Huong" == Huong Dangminh <huo-dangminh@ys.jp.nec.com> writes:

Huong> Hi,

Huong> I have found a case which could get segmentation fault when
Huong> using GROUPING SETS. It occurs when columns in GROUPING SETS are
Huong> all unsortable but hashable.

Yes, mea culpa. will fix.

BTW, you should have reported this as a bug via the pgsql-bugs list or
the bug submission form.

Huong> Attached grouping_sets_segv.zip include module to reproduce this
Huong> problem.

A much simpler way to reproduce it is to use one of the builtin types
that isn't sortable (I tend to use xid for testing).

Huong> Not yet fully understand the related commit, but I think it is
Huong> fine to put ERRCODE_FEATURE_NOT_SUPPORTED error from
Huong> preprocess_grouping_sets when all columns in GROUPING SETS are
Huong> unsortable. Is that right?

No, that's definitely wrong. The intent is to be able to generate a
hashed path in this case, it's just the logic that tries to prefer
sorting to hashing when the input arrives already sorted is doing the
wrong thing for unsortable data.

--
Andrew (irc:RhodiumToad)

#5Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Andrew Gierth (#4)
1 attachment(s)
Re: PostgreSQL 10: Segmentation fault when using GROUPING SETS with all unsortable columns

"Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

Huong> Not yet fully understand the related commit, but I think it is
Huong> fine to put ERRCODE_FEATURE_NOT_SUPPORTED error from
Huong> preprocess_grouping_sets when all columns in GROUPING SETS are
Huong> unsortable. Is that right?

Andrew> No, that's definitely wrong. The intent is to be able to
Andrew> generate a hashed path in this case, it's just the logic that
Andrew> tries to prefer sorting to hashing when the input arrives
Andrew> already sorted is doing the wrong thing for unsortable data.

Attached is what I think is the correct fix, which I'll commit shortly.

--
Andrew (irc:RhodiumToad)

Attachments:

gsfix.patchtext/x-patchDownload
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 9c4a1baf5f..7fabe017c2 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -4017,7 +4017,28 @@ consider_groupingsets_paths(PlannerInfo *root,
 
 		Assert(can_hash);
 
-		if (pathkeys_contained_in(root->group_pathkeys, path->pathkeys))
+		/*
+		 * If the input is coincidentally sorted usefully (which can happen
+		 * even if is_sorted is false, since that only means that our caller
+		 * has set up the sorting for us), then save some hashtable space by
+		 * making use of that. But we need to watch out for degenerate cases:
+		 *
+		 * 1) If there are any empty grouping sets, then group_pathkeys might
+		 * be NIL if all non-empty grouping sets are unsortable. In this case,
+		 * there will be a rollup containing only empty groups, and the
+		 * pathkeys_contained_in test is vacuously true; this is ok.
+		 *
+		 * XXX: the above relies on the fact that group_pathkeys is generated
+		 * from the first rollup. If we add the ability to consider multiple
+		 * sort orders for grouping input, this assumption might fail.
+		 *
+		 * 2) If there are no empty sets and only unsortable sets, then the
+		 * rollups list will be empty (and thus l_start == NULL), and
+		 * group_pathkeys will be NIL; we must ensure that the vacuously-true
+		 * pathkeys_contain_in test doesn't cause us to crash.
+		 */
+		if (l_start != NULL &&
+			pathkeys_contained_in(root->group_pathkeys, path->pathkeys))
 		{
 			unhashed_rollup = lfirst_node(RollupData, l_start);
 			exclude_groups = unhashed_rollup->numGroups;
diff --git a/src/test/regress/expected/groupingsets.out b/src/test/regress/expected/groupingsets.out
index d21a494a9d..c7deec2ff4 100644
--- a/src/test/regress/expected/groupingsets.out
+++ b/src/test/regress/expected/groupingsets.out
@@ -1018,6 +1018,18 @@ explain (costs off)
          ->  Values Scan on "*VALUES*"
 (9 rows)
 
+-- unsortable cases
+select unsortable_col, count(*)
+  from gstest4 group by grouping sets ((unsortable_col),(unsortable_col))
+  order by unsortable_col::text;
+ unsortable_col | count 
+----------------+-------
+              1 |     4
+              1 |     4
+              2 |     4
+              2 |     4
+(4 rows)
+
 -- mixed hashable/sortable cases
 select unhashable_col, unsortable_col,
        grouping(unhashable_col, unsortable_col),
diff --git a/src/test/regress/sql/groupingsets.sql b/src/test/regress/sql/groupingsets.sql
index eb68028603..c32d23b8d7 100644
--- a/src/test/regress/sql/groupingsets.sql
+++ b/src/test/regress/sql/groupingsets.sql
@@ -292,6 +292,11 @@ explain (costs off)
   select a, b, grouping(a,b), array_agg(v order by v)
     from gstest1 group by cube(a,b);
 
+-- unsortable cases
+select unsortable_col, count(*)
+  from gstest4 group by grouping sets ((unsortable_col),(unsortable_col))
+  order by unsortable_col::text;
+
 -- mixed hashable/sortable cases
 select unhashable_col, unsortable_col,
        grouping(unhashable_col, unsortable_col),
#6Huong Dangminh
huo-dangminh@ys.jp.nec.com
In reply to: Andrew Gierth (#4)
RE: PostgreSQL 10: Segmentation fault when using GROUPING SETS with all unsortable columns

Hi,

Thanks for response and the fix patch.

"Huong" == Huong Dangminh <huo-dangminh@ys.jp.nec.com> writes:

Huong> Hi,

Huong> I have found a case which could get segmentation fault when Huong>
using GROUPING SETS. It occurs when columns in GROUPING SETS are Huong>
all unsortable but hashable.

Yes, mea culpa. will fix.

BTW, you should have reported this as a bug via the pgsql-bugs list or the
bug submission form.

Sorry. I will be careful in next time.

Huong> Attached grouping_sets_segv.zip include module to reproduce this
Huong> problem.

A much simpler way to reproduce it is to use one of the builtin types that
isn't sortable (I tend to use xid for testing).

Yeah. I did not realize xid type also do.

Huong> Not yet fully understand the related commit, but I think it is
Huong> fine to put ERRCODE_FEATURE_NOT_SUPPORTED error from Huong>
preprocess_grouping_sets when all columns in GROUPING SETS are Huong>
unsortable. Is that right?

No, that's definitely wrong. The intent is to be able to generate a hashed
path in this case, it's just the logic that tries to prefer sorting to hashing
when the input arrives already sorted is doing the wrong thing for unsortable
data.

Thanks, It should be like that.
And the patch work fine for me.

---
Thanks and best regards,
Dang Minh Huong
NEC Solution Innovators, Ltd.
http://www.nec-solutioninnovators.co.jp/en/