GROUP BY DISTINCT

Started by Vik Fearingabout 5 years ago22 messageshackersdocs
Jump to latest
#1Vik Fearing
vik@postgresfriends.org
hackersdocs

When combining multiple grouping items, such as rollups and cubes, the
resulting flattened grouping sets can contain duplicate items. The
standard provides for this by allowing GROUP BY DISTINCT to deduplicate
them prior to doing the actual work.

For example:
GROUP BY ROLLUP (a,b), ROLLUP (a,c)

expands to the sets:
(a,b,c), (a,b), (a,b), (a,c), (a), (a), (a,c), (a), ()

but:
GROUP BY DISTINCT ROLLUP (a,b), ROLLUP (a,c)

expands to just the sets:
(a,b,c), (a,b), (a,c), (a), ()

Attached is a patch to implement this for PostgreSQL.
--
Vik Fearing

Attachments:

0001-implement-GROUP-BY-DISTINCT.v01.patchtext/x-patch; charset=UTF-8; name=0001-implement-GROUP-BY-DISTINCT.v01.patchDownload+284-17
#2Erik Rijkers
er@xs4all.nl
In reply to: Vik Fearing (#1)
hackersdocs
Re: GROUP BY DISTINCT

On 2021.02.21. 13:52 Vik Fearing <vik@postgresfriends.org> wrote:

Attached is a patch to implement this for PostgreSQL.
[]

The changed line that gets stuffed into sql_features is missing a terminal value (to fill the 'comments' column).
This line:
'+T434 GROUP BY DISTINCT YES'

(A tab at the end will do, I suppose; that's how I fixed the patch locally)

Erik Rijkers

#3Vik Fearing
vik@postgresfriends.org
In reply to: Erik Rijkers (#2)
hackersdocs
Re: GROUP BY DISTINCT

On 2/21/21 3:06 PM, er@xs4all.nl wrote:

On 2021.02.21. 13:52 Vik Fearing <vik@postgresfriends.org> wrote:

Attached is a patch to implement this for PostgreSQL.
[]

The changed line that gets stuffed into sql_features is missing a terminal value (to fill the 'comments' column).
This line:
'+T434 GROUP BY DISTINCT YES'

(A tab at the end will do, I suppose; that's how I fixed the patch locally)

Argh. Fixed.

Thank you for looking at it!
--
Vik Fearing

Attachments:

0001-implement-GROUP-BY-DISTINCT.v02.patchtext/x-patch; charset=UTF-8; name=0001-implement-GROUP-BY-DISTINCT.v02.patchDownload+284-17
#4Georgios Kokolatos
gkokolatos@protonmail.com
In reply to: Vik Fearing (#3)
hackersdocs
Re: GROUP BY DISTINCT

The following review has been posted through the commitfest application:
make installcheck-world: not tested
Implements feature: not tested
Spec compliant: not tested
Documentation: not tested

Hi,

this is a useful feature, thank you for implementing. I gather that it follows the standard, if so,
then there are definitely no objections from me.

The patch in version 2, applies cleanly and passes all the tests.
It contains documentation which seems correct to a non native speaker.

As a minor gripe, I would note the addition of list_int_cmp.
The block

+               /* Sort each groupset individually */
+               foreach(cell, result)
+                       list_sort(lfirst(cell), list_int_cmp);

Can follow suit from the rest of the code, and define a static cmp_list_int_asc(), as
indeed the same patch does for cmp_list_len_contents_asc.
This is indeed point of which I will not hold a too strong opinion.

Overall :+1: from me.

I will be bumping to 'Ready for Committer' unless objections.

#5Vik Fearing
vik@postgresfriends.org
In reply to: Georgios Kokolatos (#4)
hackersdocs
Re: GROUP BY DISTINCT

On 3/2/21 4:06 PM, Georgios Kokolatos wrote:

As a minor gripe, I would note the addition of list_int_cmp.
The block

+               /* Sort each groupset individually */
+               foreach(cell, result)
+                       list_sort(lfirst(cell), list_int_cmp);

Can follow suit from the rest of the code, and define a static cmp_list_int_asc(), as
indeed the same patch does for cmp_list_len_contents_asc.
This is indeed point of which I will not hold a too strong opinion.

I did it this way because list_int_cmp is a general purpose function for
int lists that can be reused elsewhere in the future. Whereas
cmp_list_len_contents_asc is very specific to this case so I kept it local.

I'm happy to change it around if that's what consensus wants.

Overall :+1: from me.

Thanks for looking at it!

I will be bumping to 'Ready for Committer' unless objections.

In that case, here is another patch that fixes a typo in the docs
mentioned privately to me by Erik. The typo (and a gratuitous rebase)
is the only change to what you just reviewed.
--
Vik Fearing

Attachments:

0001-implement-GROUP-BY-DISTINCT.v03.patchtext/x-patch; charset=UTF-8; name=0001-implement-GROUP-BY-DISTINCT.v03.patchDownload+284-17
#6Georgios Kokolatos
gkokolatos@protonmail.com
In reply to: Vik Fearing (#5)
hackersdocs
Re: GROUP BY DISTINCT

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Tuesday, March 2, 2021 5:51 PM, Vik Fearing <vik@postgresfriends.org> wrote:

On 3/2/21 4:06 PM, Georgios Kokolatos wrote:

As a minor gripe, I would note the addition of list_int_cmp.
The block

- /* Sort each groupset individually */

- foreach(cell, result)

- list_sort(lfirst(cell), list_int_cmp);

Can follow suit from the rest of the code, and define a static cmp_list_int_asc(), as
indeed the same patch does for cmp_list_len_contents_asc.
This is indeed point of which I will not hold a too strong opinion.

I did it this way because list_int_cmp is a general purpose function for
int lists that can be reused elsewhere in the future. Whereas
cmp_list_len_contents_asc is very specific to this case so I kept it local.

Of course. I got the intention and I have noted my opinion.

I'm happy to change it around if that's what consensus wants.

As before, I will not hold a too strong opinion.

Overall :+1: from me.

Thanks for looking at it!

I will be bumping to 'Ready for Committer' unless objections.

In that case, here is another patch that fixes a typo in the docs
mentioned privately to me by Erik. The typo (and a gratuitous rebase)
is the only change to what you just reviewed.

Thank you. The typo was indistiguishable to me too.

My :+1: stands for version 3 of the patch. Updating status in the
commitfest to reflect that.

//Georgios -- https://www.vmware.com

Show quoted text

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Vik Fearing

#7Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Vik Fearing (#5)
hackersdocs
Re: GROUP BY DISTINCT

Hi Vik,

The patch seems quite ready, I have just two comments.

1) Shouldn't this add another <indexterm> for DISTINCT, somewhere in the
documentation? Now the index points just to the SELECT DISTINCT part.

2) The part in gram.y that wraps/unwraps the boolean flag as an integer,
in order to stash it in the group lists is rather ugly, IMHO. It forces
all the places handling the list to be aware of this (there are not
many, but still ...). And there are no other places doing (bool) intVal
so it's not like there's a precedent for this.

I think the clean solution is to make group_clause produce a struct with
two fields, and just use that. Not sure how invasive that will be
outside gram.y, though.

Also, the all_or_distinct vs. distinct_or_all seems a bit error-prone. I
wonder if we can come up with some clearer names, describing the context
of those types.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#8Pantelis Theodosiou
ypercube@gmail.com
In reply to: Tomas Vondra (#7)
hackersdocs
Fwd: GROUP BY DISTINCT

---------- Forwarded message ---------
From: Tomas Vondra <tomas.vondra@enterprisedb.com>
Date: Fri, Mar 12, 2021 at 11:33 PM
Subject: Re: GROUP BY DISTINCT
To: Vik Fearing <vik@postgresfriends.org>, Georgios Kokolatos <
gkokolatos@protonmail.com>, <pgsql-hackers@lists.postgresql.org>
Cc: Erik Rijkers <er@xs4all.nl>

Hi Vik,

The patch seems quite ready, I have just two comments.

1) Shouldn't this add another <indexterm> for DISTINCT, somewhere in the
documentation? Now the index points just to the SELECT DISTINCT part.

.....

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

After reading the above thread in hackers, I noticed that the index does
not point to aggrgeate functions either and DISTINCT is not mentioned in
the aggregate functions page either:
https://www.postgresql.org/docs/current/functions-aggregate.html
Shouldn't it be mentioned with an example of COUNT(DISTINCT ...) or
aggregate_function(DISTINCT ...) in general ?

Best regards

Pantelis Theodosiou

#9Vik Fearing
vik@postgresfriends.org
In reply to: Tomas Vondra (#7)
hackersdocs
Re: GROUP BY DISTINCT

On 3/13/21 12:33 AM, Tomas Vondra wrote:

Hi Vik,

The patch seems quite ready, I have just two comments.

Thanks for taking a look.

1) Shouldn't this add another <indexterm> for DISTINCT, somewhere in the
documentation? Now the index points just to the SELECT DISTINCT part.

Good idea; I never think about the index.

2) The part in gram.y that wraps/unwraps the boolean flag as an integer,
in order to stash it in the group lists is rather ugly, IMHO. It forces
all the places handling the list to be aware of this (there are not
many, but still ...). And there are no other places doing (bool) intVal
so it's not like there's a precedent for this.

There is kind of a precedent for it, I was copying off of TriggerEvents
and func_alias_clause.

I think the clean solution is to make group_clause produce a struct with
two fields, and just use that. Not sure how invasive that will be
outside gram.y, though.

I didn't want to create a whole new parse node for it, but Andrew Gierth
pointed me towards SelectLimit so I did it like that and I agree it is
much cleaner.

Also, the all_or_distinct vs. distinct_or_all seems a bit error-prone. I
wonder if we can come up with some clearer names, describing the context
of those types.

I turned this into an enum for ALL/DISTINCT/default and the caller can
choose what it wants to do with default. I think that's a lot cleaner,
too. Maybe DISTINCT ON should be changed to fit in that? I left it
alone for now.

I also snuck in something that all of us overlooked which is outputting
the DISTINCT in ruleutils.c. I didn't add a test for it but that would
have been an unfortunate bug.

New patch attached, rebased on 15639d5e8f.
--
Vik Fearing

Attachments:

0001-implement-GROUP-BY-DISTINCT.v04.patchtext/x-patch; charset=UTF-8; name=0001-implement-GROUP-BY-DISTINCT.v04.patchDownload+333-28
#10Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Vik Fearing (#9)
hackersdocs
Re: GROUP BY DISTINCT

On 3/16/21 9:21 AM, Vik Fearing wrote:

On 3/13/21 12:33 AM, Tomas Vondra wrote:

Hi Vik,

The patch seems quite ready, I have just two comments.

Thanks for taking a look.

1) Shouldn't this add another <indexterm> for DISTINCT, somewhere in the
documentation? Now the index points just to the SELECT DISTINCT part.

Good idea; I never think about the index.

2) The part in gram.y that wraps/unwraps the boolean flag as an integer,
in order to stash it in the group lists is rather ugly, IMHO. It forces
all the places handling the list to be aware of this (there are not
many, but still ...). And there are no other places doing (bool) intVal
so it's not like there's a precedent for this.

There is kind of a precedent for it, I was copying off of TriggerEvents
and func_alias_clause.

I see. I was looking for "(bool) intVal" but you're right TriggerEvents
code does something similar.

I think the clean solution is to make group_clause produce a struct with
two fields, and just use that. Not sure how invasive that will be
outside gram.y, though.

I didn't want to create a whole new parse node for it, but Andrew Gierth
pointed me towards SelectLimit so I did it like that and I agree it is
much cleaner.

I agree, that's much cleaner.

Also, the all_or_distinct vs. distinct_or_all seems a bit error-prone. I
wonder if we can come up with some clearer names, describing the context
of those types.

I turned this into an enum for ALL/DISTINCT/default and the caller can
choose what it wants to do with default. I think that's a lot cleaner,
too. Maybe DISTINCT ON should be changed to fit in that? I left it
alone for now.

I think DISTINCT ON is a different kind of animal, because that is a
list of expressions, not just a simple enum state.

I also snuck in something that all of us overlooked which is outputting
the DISTINCT in ruleutils.c. I didn't add a test for it but that would
have been an unfortunate bug.

Oh!

New patch attached, rebased on 15639d5e8f.

Thanks. At this point it seems fine to me, no further comments.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#11Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Tomas Vondra (#10)
hackersdocs
Re: GROUP BY DISTINCT

On 3/16/21 3:52 PM, Tomas Vondra wrote:

On 3/16/21 9:21 AM, Vik Fearing wrote:

On 3/13/21 12:33 AM, Tomas Vondra wrote:

Hi Vik,

The patch seems quite ready, I have just two comments.

Thanks for taking a look.

1) Shouldn't this add another <indexterm> for DISTINCT, somewhere in the
documentation? Now the index points just to the SELECT DISTINCT part.

Good idea; I never think about the index.

2) The part in gram.y that wraps/unwraps the boolean flag as an integer,
in order to stash it in the group lists is rather ugly, IMHO. It forces
all the places handling the list to be aware of this (there are not
many, but still ...). And there are no other places doing (bool) intVal
so it's not like there's a precedent for this.

There is kind of a precedent for it, I was copying off of TriggerEvents
and func_alias_clause.

I see. I was looking for "(bool) intVal" but you're right TriggerEvents
code does something similar.

I think the clean solution is to make group_clause produce a struct with
two fields, and just use that. Not sure how invasive that will be
outside gram.y, though.

I didn't want to create a whole new parse node for it, but Andrew Gierth
pointed me towards SelectLimit so I did it like that and I agree it is
much cleaner.

I agree, that's much cleaner.

Also, the all_or_distinct vs. distinct_or_all seems a bit error-prone. I
wonder if we can come up with some clearer names, describing the context
of those types.

I turned this into an enum for ALL/DISTINCT/default and the caller can
choose what it wants to do with default. I think that's a lot cleaner,
too. Maybe DISTINCT ON should be changed to fit in that? I left it
alone for now.

I think DISTINCT ON is a different kind of animal, because that is a
list of expressions, not just a simple enum state.

I also snuck in something that all of us overlooked which is outputting
the DISTINCT in ruleutils.c. I didn't add a test for it but that would
have been an unfortunate bug.

Oh!

New patch attached, rebased on 15639d5e8f.

Thanks. At this point it seems fine to me, no further comments.

Pushed. Thanks for the patch.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#12Pantelis Theodosiou
ypercube@gmail.com
In reply to: Pantelis Theodosiou (#8)
hackersdocs
DISTINCT term in aggregate function

Hi, I didn't think of including you in this suggestion.
Or the pdsql-docs was not the right list to post? I didn't want to mix it
with the GROUP BY DISTINCT patch.

Please check my suggestion.

Best regards
Pantelis Theodosiou

---------- Forwarded message ---------
From: Pantelis Theodosiou <ypercube@gmail.com>
Date: Sat, Mar 13, 2021 at 1:03 AM
Subject: Fwd: GROUP BY DISTINCT
To: <pgsql-docs@lists.postgresql.org>

---------- Forwarded message ---------
From: Tomas Vondra <tomas.vondra@enterprisedb.com>
Date: Fri, Mar 12, 2021 at 11:33 PM
Subject: Re: GROUP BY DISTINCT
To: Vik Fearing <vik@postgresfriends.org>, Georgios Kokolatos <
gkokolatos@protonmail.com>, <pgsql-hackers@lists.postgresql.org>
Cc: Erik Rijkers <er@xs4all.nl>

Hi Vik,

The patch seems quite ready, I have just two comments.

1) Shouldn't this add another <indexterm> for DISTINCT, somewhere in the
documentation? Now the index points just to the SELECT DISTINCT part.

.....

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

After reading the above thread in hackers, I noticed that the index does
not point to aggrgeate functions either and DISTINCT is not mentioned in
the aggregate functions page either:
https://www.postgresql.org/docs/current/functions-aggregate.html
Shouldn't it be mentioned with an example of COUNT(DISTINCT ...) or
aggregate_function(DISTINCT ...) in general ?

Best regards

Pantelis Theodosiou

#13Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Pantelis Theodosiou (#12)
hackersdocs
Re: DISTINCT term in aggregate function

Sorry, I'm not reading pgsql-docs very often, so I missed the post.
Yeah, we should probably add an indexterm to the other places too.

regards

On 3/18/21 7:03 PM, Pantelis Theodosiou wrote:

Hi, I didn't think of including you in this suggestion.
Or the pdsql-docs was not the right list to post? I didn't want to mix
it with the GROUP BY DISTINCT patch.

Please check my suggestion.

Best regards
Pantelis Theodosiou

---------- Forwarded message ---------
From: *Pantelis Theodosiou* <ypercube@gmail.com <mailto:ypercube@gmail.com>>
Date: Sat, Mar 13, 2021 at 1:03 AM
Subject: Fwd: GROUP BY DISTINCT
To: <pgsql-docs@lists.postgresql.org
<mailto:pgsql-docs@lists.postgresql.org>>

---------- Forwarded message ---------
From: *Tomas Vondra* <tomas.vondra@enterprisedb.com
<mailto:tomas.vondra@enterprisedb.com>>
Date: Fri, Mar 12, 2021 at 11:33 PM
Subject: Re: GROUP BY DISTINCT
To: Vik Fearing <vik@postgresfriends.org
<mailto:vik@postgresfriends.org>>, Georgios Kokolatos
<gkokolatos@protonmail.com <mailto:gkokolatos@protonmail.com>>,
<pgsql-hackers@lists.postgresql.org
<mailto:pgsql-hackers@lists.postgresql.org>>
Cc: Erik Rijkers <er@xs4all.nl <mailto:er@xs4all.nl>>

Hi Vik,

The patch seems quite ready, I have just two comments.

1) Shouldn't this add another <indexterm> for DISTINCT, somewhere in the
documentation? Now the index points just to the SELECT DISTINCT part.

.....

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com <http://www.enterprisedb.com&gt;
The Enterprise PostgreSQL Company

After reading the above thread in hackers, I noticed that the index does
not point to aggrgeate functions either and DISTINCT is not mentioned in
the aggregate functions page
either: https://www.postgresql.org/docs/current/functions-aggregate.html
<https://www.postgresql.org/docs/current/functions-aggregate.html&gt;
Shouldn't it be mentioned with an example of COUNT(DISTINCT ...)  or
aggregate_function(DISTINCT ...) in general ?

Best regards

Pantelis Theodosiou

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#14Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Tomas Vondra (#11)
hackersdocs
Re: GROUP BY DISTINCT

On 3/18/21 6:25 PM, Tomas Vondra wrote:

On 3/16/21 3:52 PM, Tomas Vondra wrote:

On 3/16/21 9:21 AM, Vik Fearing wrote:

On 3/13/21 12:33 AM, Tomas Vondra wrote:

Hi Vik,

The patch seems quite ready, I have just two comments.

Thanks for taking a look.

1) Shouldn't this add another <indexterm> for DISTINCT, somewhere in the
documentation? Now the index points just to the SELECT DISTINCT part.

Good idea; I never think about the index.

2) The part in gram.y that wraps/unwraps the boolean flag as an integer,
in order to stash it in the group lists is rather ugly, IMHO. It forces
all the places handling the list to be aware of this (there are not
many, but still ...). And there are no other places doing (bool) intVal
so it's not like there's a precedent for this.

There is kind of a precedent for it, I was copying off of TriggerEvents
and func_alias_clause.

I see. I was looking for "(bool) intVal" but you're right TriggerEvents
code does something similar.

I think the clean solution is to make group_clause produce a struct with
two fields, and just use that. Not sure how invasive that will be
outside gram.y, though.

I didn't want to create a whole new parse node for it, but Andrew Gierth
pointed me towards SelectLimit so I did it like that and I agree it is
much cleaner.

I agree, that's much cleaner.

Also, the all_or_distinct vs. distinct_or_all seems a bit error-prone. I
wonder if we can come up with some clearer names, describing the context
of those types.

I turned this into an enum for ALL/DISTINCT/default and the caller can
choose what it wants to do with default. I think that's a lot cleaner,
too. Maybe DISTINCT ON should be changed to fit in that? I left it
alone for now.

I think DISTINCT ON is a different kind of animal, because that is a
list of expressions, not just a simple enum state.

I also snuck in something that all of us overlooked which is outputting
the DISTINCT in ruleutils.c. I didn't add a test for it but that would
have been an unfortunate bug.

Oh!

New patch attached, rebased on 15639d5e8f.

Thanks. At this point it seems fine to me, no further comments.

Pushed. Thanks for the patch.

Hmmm, this seems to fail on lapwing with this error:

parse_agg.c: In function 'expand_grouping_sets':
parse_agg.c:1851:23: error: value computed is not used
[-Werror=unused-value]
cc1: all warnings being treated as errors

That line is this:

foreach_delete_current(result, cell);

and I don't see how any of the values close by could be unused ...

The only possibility I can think of is some sort of issue in the old-ish
gcc release (4.7.2).

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#15Thomas Munro
thomas.munro@gmail.com
In reply to: Tomas Vondra (#14)
hackersdocs
Re: GROUP BY DISTINCT

On Fri, Mar 19, 2021 at 8:27 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:

Hmmm, this seems to fail on lapwing with this error:

parse_agg.c: In function 'expand_grouping_sets':
parse_agg.c:1851:23: error: value computed is not used
[-Werror=unused-value]
cc1: all warnings being treated as errors

That line is this:

foreach_delete_current(result, cell);

and I don't see how any of the values close by could be unused ...

The only possibility I can think of is some sort of issue in the old-ish
gcc release (4.7.2).

No sure what's going on there, but data points: I tried a 32 bit build
here (that's the other special thing about lapwing) and didn't see the
warning. GCC 10. Also curculio (gcc 4.2) and snapper (gcc 4.7) are
also showing this warning, but they don't have -Werror so they don't
fail. sidewinder (gcc 4.8) is not showing the warning.

#16Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Thomas Munro (#15)
hackersdocs
Re: GROUP BY DISTINCT

On 3/18/21 10:02 PM, Thomas Munro wrote:

On Fri, Mar 19, 2021 at 8:27 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:

Hmmm, this seems to fail on lapwing with this error:

parse_agg.c: In function 'expand_grouping_sets':
parse_agg.c:1851:23: error: value computed is not used
[-Werror=unused-value]
cc1: all warnings being treated as errors

That line is this:

foreach_delete_current(result, cell);

and I don't see how any of the values close by could be unused ...

The only possibility I can think of is some sort of issue in the old-ish
gcc release (4.7.2).

No sure what's going on there, but data points: I tried a 32 bit build
here (that's the other special thing about lapwing) and didn't see the
warning. GCC 10. Also curculio (gcc 4.2) and snapper (gcc 4.7) are
also showing this warning, but they don't have -Werror so they don't
fail. sidewinder (gcc 4.8) is not showing the warning.

Thanks for the info. So it's likely related to older gcc releases. The
question is how to tweak the code to get rid of this ...

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#17Thomas Munro
thomas.munro@gmail.com
In reply to: Tomas Vondra (#16)
hackersdocs
Re: GROUP BY DISTINCT

On Fri, Mar 19, 2021 at 10:14 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:

The only possibility I can think of is some sort of issue in the old-ish
gcc release (4.7.2).

No sure what's going on there, but data points: I tried a 32 bit build
here (that's the other special thing about lapwing) and didn't see the
warning. GCC 10. Also curculio (gcc 4.2) and snapper (gcc 4.7) are
also showing this warning, but they don't have -Werror so they don't
fail. sidewinder (gcc 4.8) is not showing the warning.

Thanks for the info. So it's likely related to older gcc releases. The
question is how to tweak the code to get rid of this ...

It's frustrating to have to do press-ups to fix a problem because a
zombie Debian 7 system is running with -Werror (though it's always
possible that it's telling us something interesting...). Anyway, I
think someone with a GCC < 4.8 compiler would have to investigate. I
was hoping to help, but none of my systems have one in easy-to-install
format...

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Munro (#17)
hackersdocs
Re: GROUP BY DISTINCT

Thomas Munro <thomas.munro@gmail.com> writes:

On Fri, Mar 19, 2021 at 10:14 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:

Thanks for the info. So it's likely related to older gcc releases. The
question is how to tweak the code to get rid of this ...

It's frustrating to have to do press-ups to fix a problem because a
zombie Debian 7 system is running with -Werror (though it's always
possible that it's telling us something interesting...). Anyway, I
think someone with a GCC < 4.8 compiler would have to investigate. I
was hoping to help, but none of my systems have one in easy-to-install
format...

Hmm ... prairiedog isn't showing the warning, but maybe gaur will.
I can take a look if nobody else is stepping up.

regards, tom lane

#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#18)
hackersdocs
Re: GROUP BY DISTINCT

I wrote:

Hmm ... prairiedog isn't showing the warning, but maybe gaur will.

Bingo:

parse_agg.c: In function 'expand_grouping_sets':
parse_agg.c:1851:5: warning: value computed is not used

This is gcc 4.5, but hopefully whatever shuts it up will also work on 4.7.
I'll work on figuring that out.

regards, tom lane

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#19)
hackersdocs
Re: GROUP BY DISTINCT

I wrote:

This is gcc 4.5, but hopefully whatever shuts it up will also work on 4.7.
I'll work on figuring that out.

Actually, the problem is pretty obvious after comparing this use
of foreach_delete_current() to every other one. I'm not sure why
the compiler warnings are phrased just as they are, but the fix
I just pushed does make 4.5 happy.

regards, tom lane

#21Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Tom Lane (#20)
hackersdocs
#22Vik Fearing
vik@postgresfriends.org
In reply to: Tomas Vondra (#21)
hackersdocs