problem with query and group by error

Started by Susan Cassidyabout 12 years ago6 messagesgeneral
Jump to latest
#1Susan Cassidy
susan.cassidy@decisionsciencescorp.com

I have a large query:
SELECT distinct on (srs.scan_run_id) srs.scan_run_id,
srs.run_request_number, srs.container_id, srs.manifest_id,
srs.scan_system_name_id,
srs.scan_site_name_id, srs.scan_site_nickname_id,
to_char(srs.start_time, 'MM/DD/YY HH24:MI:SS'),
to_char(srs.stop_time, 'MM/DD/YY HH24:MI:SS'), srs.system_operator,
srs.system_baseline_configuration_file_version_id,
srs.container_contents, srs.container_run_truth_data,
srs.scan_type_id, sty.scan_type,
srs.hardware_version_or_hardware_identifier_id,
srs.software_version_id, srs.operator_notes,
to_char(srs.expiration_date, 'MM/DD/YY HH24:MI:SS'),
srs.scan_outcome_id,
to_char(srs.alarm_time, 'MM/DD/YY HH24:MI:SS'),
srs.alarm_decision_id, srs.material_detected_id, srs.data_access,
ssn.scan_system_name, ssn.simulation, ssitenames.scan_site_name,
ssitenicknames.scan_site_nickname,
hvhi.hardware_version_or_hardware_identifier_name,
sv.software_version, sc.description
from scan_run_summary srs left outer join scan_system_names ssn on
srs.scan_system_name_id = ssn.scan_system_name_id
left outer join scan_site_names ssitenames on
srs.scan_site_name_id = ssitenames.scan_site_name_id
left outer join scan_site_nicknames ssitenicknames on
srs.scan_site_nickname_id = ssitenicknames.scan_site_nickname_id
left outer join hardware_version_or_hardware_identifiers hvhi on
srs.hardware_version_or_hardware_identifier_id =
hvhi.hardware_version_or_hardware_identifier_id
left outer join software_versions sv on srs.software_version_id =
sv.software_version_id
left outer join scenes sc on srs.container_run_truth_data =
sc.scene_id
left outer join scan_types sty on srs.scan_type_id =
sty.scan_type_id
join scene_thing_instances sti on srs.container_run_truth_data =
sti.scene_id
join scene_things stg on sti.scene_thing_id = stg.scene_thing_id
group by srs.scan_run_id
;

That gives this error:

ERROR: column "sty.scan_type" must appear in the GROUP BY clause or be
used in an aggregate function
LINE 5: srs.scan_type_id, sty.scan_type, srs.hardware_version...

I don't see why sty.scan_type should be singled out as requiring a group by
clause, when there are many other columns specified.

If I add scan_type to the group by, then it gives the same error, but with
ssn.scan_system_name.

Am I going to have to specify all the columns in the group by clause?

I originally had the query without the group by, but I had duplicate rows,
so I added a group by to eliminate them.

Thanks,
Susan

#2David G. Johnston
david.g.johnston@gmail.com
In reply to: Susan Cassidy (#1)
Re: problem with query and group by error

Susan Cassidy-3 wrote

I have a large query:
SELECT distinct on (srs.scan_run_id) srs.scan_run_id,
srs.run_request_number, srs.container_id, srs.manifest_id,
srs.scan_system_name_id,
srs.scan_site_name_id, srs.scan_site_nickname_id,
to_char(srs.start_time, 'MM/DD/YY HH24:MI:SS'),
to_char(srs.stop_time, 'MM/DD/YY HH24:MI:SS'), srs.system_operator,
srs.system_baseline_configuration_file_version_id,
srs.container_contents, srs.container_run_truth_data,
srs.scan_type_id, sty.scan_type,
srs.hardware_version_or_hardware_identifier_id,
srs.software_version_id, srs.operator_notes,
to_char(srs.expiration_date, 'MM/DD/YY HH24:MI:SS'),
srs.scan_outcome_id,
to_char(srs.alarm_time, 'MM/DD/YY HH24:MI:SS'),
srs.alarm_decision_id, srs.material_detected_id, srs.data_access,
ssn.scan_system_name, ssn.simulation, ssitenames.scan_site_name,
ssitenicknames.scan_site_nickname,
hvhi.hardware_version_or_hardware_identifier_name,
sv.software_version, sc.description
from scan_run_summary srs left outer join scan_system_names ssn on
srs.scan_system_name_id = ssn.scan_system_name_id
left outer join scan_site_names ssitenames on
srs.scan_site_name_id = ssitenames.scan_site_name_id
left outer join scan_site_nicknames ssitenicknames on
srs.scan_site_nickname_id =
ssitenicknames.scan_site_nickname_id
left outer join hardware_version_or_hardware_identifiers hvhi on
srs.hardware_version_or_hardware_identifier_id =
hvhi.hardware_version_or_hardware_identifier_id
left outer join software_versions sv on srs.software_version_id =
sv.software_version_id
left outer join scenes sc on srs.container_run_truth_data =
sc.scene_id
left outer join scan_types sty on srs.scan_type_id =
sty.scan_type_id
join scene_thing_instances sti on srs.container_run_truth_data =
sti.scene_id
join scene_things stg on sti.scene_thing_id = stg.scene_thing_id
group by srs.scan_run_id
;

That gives this error:

ERROR: column "sty.scan_type" must appear in the GROUP BY clause or be
used in an aggregate function
LINE 5: srs.scan_type_id, sty.scan_type, srs.hardware_version...

I don't see why sty.scan_type should be singled out as requiring a group
by
clause, when there are many other columns specified.

If I add scan_type to the group by, then it gives the same error, but with
ssn.scan_system_name.

Am I going to have to specify all the columns in the group by clause?

I originally had the query without the group by, but I had duplicate rows,
so I added a group by to eliminate them.

Thanks,
Susan

Newer releases (not sure which) are capable of identifying a primary key in
a group by and allow you to omit all dependent columns of said primary key.

However, as a general rule, every non-aggregated column must appear in the
GROUP BY. There is not "default behavior" for columns not appearing in
group by nor that have been aggregated.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/problem-with-query-and-group-by-error-tp5793127p5793128.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3Emanuel Calvo
emanuel.calvo@2ndquadrant.com
In reply to: Susan Cassidy (#1)
Re: problem with query and group by error

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

El 21/02/14 14:40, Susan Cassidy escribi�:

I have a large query: SELECT distinct on (srs.scan_run_id)
srs.scan_run_id, srs.run_request_number, srs.container_id,
srs.manifest_id, srs.scan_system_name_id, srs.scan_site_name_id,
srs.scan_site_nickname_id, to_char(srs.start_time, 'MM/DD/YY
HH24:MI:SS'), to_char(srs.stop_time, 'MM/DD/YY HH24:MI:SS'),
srs.system_operator,
srs.system_baseline_configuration_file_version_id,
srs.container_contents, srs.container_run_truth_data,
srs.scan_type_id, sty.scan_type,
srs.hardware_version_or_hardware_identifier_id,
srs.software_version_id, srs.operator_notes,
to_char(srs.expiration_date, 'MM/DD/YY HH24:MI:SS'),
srs.scan_outcome_id, to_char(srs.alarm_time, 'MM/DD/YY
HH24:MI:SS'), srs.alarm_decision_id, srs.material_detected_id,
srs.data_access, ssn.scan_system_name, ssn.simulation,
ssitenames.scan_site_name, ssitenicknames.scan_site_nickname,
hvhi.hardware_version_or_hardware_identifier_name,
sv.software_version, sc.description from scan_run_summary srs left
outer join scan_system_names ssn on srs.scan_system_name_id =
ssn.scan_system_name_id left outer join scan_site_names ssitenames
on srs.scan_site_name_id = ssitenames.scan_site_name_id left outer
join scan_site_nicknames ssitenicknames on
srs.scan_site_nickname_id = ssitenicknames.scan_site_nickname_id
left outer join hardware_version_or_hardware_identifiers hvhi on
srs.hardware_version_or_hardware_identifier_id =
hvhi.hardware_version_or_hardware_identifier_id left outer join
software_versions sv on srs.software_version_id =
sv.software_version_id left outer join scenes sc on
srs.container_run_truth_data = sc.scene_id left outer join
scan_types sty on srs.scan_type_id = sty.scan_type_id join
scene_thing_instances sti on srs.container_run_truth_data =
sti.scene_id join scene_things stg on sti.scene_thing_id =
stg.scene_thing_id group by srs.scan_run_id ;

That gives this error:

ERROR: column "sty.scan_type" must appear in the GROUP BY clause
or be used in an aggregate function LINE 5:
srs.scan_type_id, sty.scan_type, srs.hardware_version...

I don't see why sty.scan_type should be singled out as requiring a
group by clause, when there are many other columns specified.

Many not always means enough.

If I add scan_type to the group by, then it gives the same error,
but with ssn.scan_system_name.

Am I going to have to specify all the columns in the group by
clause?

Yes, except those that are inside the aggregate.

http://www.postgresql.org/docs/9.3/static/queries-table-expressions.html

Section 7.2.3

I originally had the query without the group by, but I had
duplicate rows, so I added a group by to eliminate them.

Thanks, Susan

- --
- --
Emanuel Calvo
Consultant // 2ndQuadrant
Bs. As., Argentina // (GMT-3)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.18 (Darwin)
Comment: GPGTools - http://gpgtools.org
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQIcBAEBCgAGBQJTB5KAAAoJEIBeI/HMagHm5iQQAI8WcPcdUJrfTyensI1oI1ig
2zqq+mLVfgCHnh+9+1AH7eESl7mSqpk3cD6L5FoMNWVsG/5VKZG/vEcgB1IuN8DQ
RPVa0MBAV03HTWX37HXAleyu++vQ8BCIUIgfsmmrWpmJonhssnwW91uHvisl3hXq
dujypLtT6Xcu+0b+jtAwsayX6H5dH7g1ODzU8ofxS6o8SRxe3zCCJIykeK81PR0Q
L55WH30xx1YRPhj48OFrPuvcCRS1M3nhWTTlh5OQ6UCzMMCCmUv2bR2nurqq9gBs
lUx3iB+ra2fnIIZYcZHocMFaWJUOQQ4+dj9LsUxyel8qOLZvIzcNrSrA868XlZNT
IXoWm9IYfKyyZYtHD7PdwLPSZuFYDqW0ll+GMm3/wbaK2NOIW7p8C4/DylIxbUgO
DXkt8y3Hn05UjpfgFCDiOrMeXvdEjlb66aNiIePYmsJWDq6/CF8fj77EXZ3KP6t7
JUJ7YzDRtW99M+GsOYOLjVvMbE7NfS1KUKt/NNKGFsZAJ/TmQlHyFxYThIVYeYq2
FFqp6s1cWYJILhFD150zCZt2DpDt7NmNuczm7gJEb61avUHIZIrTw9VThcTF+Yh0
5YmJmF+wsewKy38jiyHWGRChH5n65NscZDzMO0NpfO8VR1KN4Su/ahMA+GADLpE9
WjLHQHHHRWcoiOCOgiyF
=ighX
-----END PGP SIGNATURE-----

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#4Scott Marlowe
scott.marlowe@gmail.com
In reply to: Susan Cassidy (#1)
Re: problem with query and group by error

On Fri, Feb 21, 2014 at 10:40 AM, Susan Cassidy
<susan.cassidy@decisionsciencescorp.com> wrote:

I originally had the query without the group by, but I had duplicate rows,
so I added a group by to eliminate them.

Have you tried select distinct or select distinct on ()?

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#5Susan Cassidy
susan.cassidy@decisionsciencescorp.com
In reply to: Scott Marlowe (#4)
Re: problem with query and group by error

I tried distinct on srs.scan_run_id, which is a primary key, and got an
error, but I tried it again just now, and it worked fine.

Thanks for having me try it again.

I had ended up with
group by srs.scan_run_id, sty.scan_type, ssn.scan_system_name,
ssn.simulation, ssitenames.scan_site_name,
ssitenames.scan_site_name, ssitenicknames.scan_site_nickname,
hvhi.hardware_version_or_hardware_identifier_name, sv.software_version,
sc.description

because I kept adding column names that I got errors on, but this is by no
means the whole list. So, I'm still confused as to why I got the error in
the first place.

Someone said something about
"Yes, except those that are inside the aggregate."
but I don't have an aggregate specified.

Susan

On Fri, Feb 21, 2014 at 10:38 AM, Scott Marlowe <scott.marlowe@gmail.com>wrote:

Show quoted text

On Fri, Feb 21, 2014 at 10:40 AM, Susan Cassidy
<susan.cassidy@decisionsciencescorp.com> wrote:

I originally had the query without the group by, but I had duplicate

rows,

so I added a group by to eliminate them.

Have you tried select distinct or select distinct on ()?

#6David G. Johnston
david.g.johnston@gmail.com
In reply to: Susan Cassidy (#5)
Re: problem with query and group by error

Susan Cassidy-3 wrote

Someone said something about
"Yes, except those that are inside the aggregate."
but I don't have an aggregate specified.

So every column then...

As soon as you add "group by" the rule becomes - every column is either a
group determinate or is aggregated. If you simply want to remove duplicates
you can write:

Select distinct ... From

No group by clause required and every output column is used to create an
implicit non-duplicated group.

I haven't tried to figure out what you are trying to do with this query so I
don't know which, if either, form is more correct but grouping without
aggregates is unusual and I also find that using distinct is not that common
a need of you have a correctly normalized database. IOW you should not use
group by or distinct to "make the query work" but only if you
know/understand why doing so is necessary.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/problem-with-query-and-group-by-error-tp5793127p5793140.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general