Feedback about Drupal SQL debugging

Started by Jean-Michel Pouréover 16 years ago30 messages

Dear friends,

I have been using PostgreSQL since 6.3 releases and I am a real fan.
Of course, I never use nor trust MySQL to deliver data.

Now I use Drupal 6.3 with PostgreSQL 8.4.

I loose a lot of time correcting Drupal SQL.
You may be interested in my developer feedback.

I gathered some real examples here: Guidelines for writing MySQL and
PostgreSQL compliant SQL => http://drupal.org/node/555514

This page gathers most frequent problems that Drupal users and
developers encounter when using PostgreSQL.

I would be delighted to have your feedback.
Could some issues reasonably be fixed for a better Drupal support?

Kind regards,
Jean-Michel

#2Joshua D. Drake
jd@commandprompt.com
In reply to: Jean-Michel Pouré (#1)
Re: Feedback about Drupal SQL debugging

This page gathers most frequent problems that Drupal users and
developers encounter when using PostgreSQL.

I would be delighted to have your feedback.
Could some issues reasonably be fixed for a better Drupal support?

Well I doubt we would do anything to copy MySQL. However Drupal has
already made strides to solve this in 7.x. The abstraction layer is much
smarter.

Joshua D. Drake

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering

#3D'Arcy J.M. Cain
darcy@druid.net
In reply to: Jean-Michel Pouré (#1)
Re: Feedback about Drupal SQL debugging

On Fri, 21 Aug 2009 18:22:41 +0200
Jean-Michel Pouré <jm@poure.com> wrote:

I gathered some real examples here: Guidelines for writing MySQL and
PostgreSQL compliant SQL => http://drupal.org/node/555514

This page gathers most frequent problems that Drupal users and
developers encounter when using PostgreSQL.

I would be delighted to have your feedback.

It looks to me like you could just reference SQL99 rather than
mentioning PostgreSQL other than as an example of a standards compliant
database engine.

How would those constructs work in MS-SQL or Oracle?

-- 
D'Arcy J.M. Cain <darcy@druid.net>         |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 425 1212     (DoD#0082)    (eNTP)   |  what's for dinner.
#4Bruce Momjian
bruce@momjian.us
In reply to: Jean-Michel Pouré (#1)
Re: Feedback about Drupal SQL debugging

Jean-Michel Pour��� wrote:
-- Start of PGP signed section.

Dear friends,

I have been using PostgreSQL since 6.3 releases and I am a real fan.
Of course, I never use nor trust MySQL to deliver data.

Now I use Drupal 6.3 with PostgreSQL 8.4.

I loose a lot of time correcting Drupal SQL.
You may be interested in my developer feedback.

I gathered some real examples here: Guidelines for writing MySQL and
PostgreSQL compliant SQL => http://drupal.org/node/555514

This page gathers most frequent problems that Drupal users and
developers encounter when using PostgreSQL.

I would be delighted to have your feedback.
Could some issues reasonably be fixed for a better Drupal support?

I doubt we are going to change Postgres to improve Drupal support --- it
would be better to fix Drupal.

However, I have a few suggestions:

For this item, http://drupal.org/node/555580, use || for concatentation.

I can't believe MySQL doesn't support multi-column indexes,
http://drupal.org/node/555558.

For this item, I think you want DELETE FROM history USING ...,
http://drupal.org/node/555562.

The SQL standard doesn't support multiple deletes, so odds are we will
not either, http://drupal.org/node/555648.

I show multi-value INSERT was added in PG 8.2, not 8.4, * Add
support for multiple-row VALUES clauses, per SQL standard (Joe, Tom),
http://drupal.org/node/555568.

I am confused because I thought Drupal worked with Postgres, but looking
at your list, it seems it doesn't.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#5Joshua D. Drake
jd@commandprompt.com
In reply to: Bruce Momjian (#4)
Re: Feedback about Drupal SQL debugging

On Fri, 2009-08-21 at 12:50 -0400, Bruce Momjian wrote:

I show multi-value INSERT was added in PG 8.2, not 8.4, * Add
support for multiple-row VALUES clauses, per SQL standard (Joe, Tom),
http://drupal.org/node/555568.

I am confused because I thought Drupal worked with Postgres, but looking
at your list, it seems it doesn't.

Drupal itself works perfectly on Postgres. The problem is module authors
that are MySQL developers who never test their module on Postgres.

Joshua D. Drake

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering

#6Andrew Dunstan
andrew@dunslane.net
In reply to: Joshua D. Drake (#2)
Re: Feedback about Drupal SQL debugging

Joshua D. Drake wrote:

This page gathers most frequent problems that Drupal users and
developers encounter when using PostgreSQL.

I would be delighted to have your feedback.
Could some issues reasonably be fixed for a better Drupal support?

Well I doubt we would do anything to copy MySQL. However Drupal has
already made strides to solve this in 7.x. The abstraction layer is much
smarter.

Joshua,

Since you haven't shown us what page this refers to, I at least am
totally in the dark about what is being discussed.

cheers

andrew

#7Greg Stark
gsstark@mit.edu
In reply to: Andrew Dunstan (#6)
Re: Feedback about Drupal SQL debugging

2009/8/21 Andrew Dunstan <andrew@dunslane.net>:

Since you haven't shown us what page this refers to, I at least am totally
in the dark about what is being discussed.

It was in the original post

http://drupal.org/node/555514

--
greg
http://mit.edu/~gsstark/resume.pdf

#8Andrew Dunstan
andrew@dunslane.net
In reply to: Greg Stark (#7)
Re: Feedback about Drupal SQL debugging

Greg Stark wrote:

2009/8/21 Andrew Dunstan <andrew@dunslane.net>:

Since you haven't shown us what page this refers to, I at least am totally
in the dark about what is being discussed.

It was in the original post

http://drupal.org/node/555514

Darn. Our mail system sucks badly. For some insane reason some
postgresbut not all emails to me get badly delayed.

cheers

andrew

#9Robert Haas
robertmhaas@gmail.com
In reply to: Jean-Michel Pouré (#1)
Re: Feedback about Drupal SQL debugging

2009/8/21 Jean-Michel Pouré <jm@poure.com>:

Dear friends,

I have been using PostgreSQL since 6.3 releases and I am a real fan.
Of course, I never use nor trust MySQL to deliver data.

Now I use Drupal 6.3 with PostgreSQL 8.4.

I loose a lot of time correcting Drupal SQL.
You may be interested in my developer feedback.

I gathered some real examples here: Guidelines for writing MySQL and
PostgreSQL compliant SQL => http://drupal.org/node/555514

This page gathers most frequent problems that Drupal users and
developers encounter when using PostgreSQL.

I would be delighted to have your feedback.
Could some issues reasonably be fixed for a better Drupal support?

A lot of these issues seem to have easy workarounds, so I'm not sure
what the big deal is. If you don't write standards-compliant SQL, you
shouldn't be surprised when you find out that it's not portable. Most
of those constructions wouldn't work on Microsoft SQL server either,
and I bet at least some of them would fail under Oracle as well.

For the int/varchar casting issue, you might try putting single quotes
around the values. I would expect that to work in both databases,
though I don't use MySQL.

...Robert

#10Josh Berkus
josh@agliodbs.com
In reply to: Jean-Michel Pouré (#1)
Re: Feedback about Drupal SQL debugging

Jean-Michel,

Thank you for doing this!

I've registered for the Drupal site so that I can fix and/or expand some
of your items.

People who know Drupal better than me should add to them.

If you want to discuss Drupal & PostgreSQL again, please post on the
pgsql-advocacy list or the pgsql-php mailing lists. pgsql-hackers isn't
the best place to get people to help you.

BTW, why don't we have a multi-argument version of CONCAT()? In 8.4, it
would be possible ... I should add it to mysqlcompat library.

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#10)
Re: Feedback about Drupal SQL debugging

Josh Berkus <josh@agliodbs.com> writes:

BTW, why don't we have a multi-argument version of CONCAT()?

Why wouldn't people use the SQL-standard || operator instead?

regards, tom lane

In reply to: Josh Berkus (#10)
Re: Feedback about Drupal SQL debugging

I've registered for the Drupal site so that I can fix and/or expand
some
of your items.

Thanks. I corrected the index on dual fields page.

If you want to discuss Drupal & PostgreSQL again, please post on the
pgsql-advocacy list or the pgsql-php mailing lists. pgsql-hackers
isn't
the best place to get people to help you.

I would prefer no, please. This post is made to understand what needs to
be done at PostgreSQL level for better Drupal supports.

As written previously, Drupal developers write MySQL code. Some of this
code is not portable, okay.

BTW, why don't we have a multi-argument version of CONCAT()? In 8.4,
it
would be possible ... I should add it to mysqlcompat library.

yes. In PostgreSQL core ...

PostgreSQL requires all non-aggregated fields to be present in the GROUP
BY clause (I fixed 10 such issues in Drupal code).
http://drupal.org/node/555530

Why can't PostgreSQL add the required field automatically? Could this be
added to PostgreSQL to-do-list?

Kind regards,
Jean-Michel

#13David Fetter
david@fetter.org
In reply to: Tom Lane (#11)
Re: Feedback about Drupal SQL debugging

On Fri, Aug 21, 2009 at 04:19:43PM -0400, Tom Lane wrote:

Josh Berkus <josh@agliodbs.com> writes:

BTW, why don't we have a multi-argument version of CONCAT()?

Why wouldn't people use the SQL-standard || operator instead?

Because by default, MySQL uses that as, get this, "logical OR."

Cheers,
David (grateful he's with a project that doesn't just gratuitously go
around breaking stuff)
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Fetter (#13)
Re: Feedback about Drupal SQL debugging

David Fetter <david@fetter.org> writes:

On Fri, Aug 21, 2009 at 04:19:43PM -0400, Tom Lane wrote:

Josh Berkus <josh@agliodbs.com> writes:

BTW, why don't we have a multi-argument version of CONCAT()?

Why wouldn't people use the SQL-standard || operator instead?

Because by default, MySQL uses that as, get this, "logical OR."

Egad. Well, I think that's something for the mysqlcompat project
not core ...

regards, tom lane

#15Josh Berkus
josh@agliodbs.com
In reply to: Tom Lane (#14)
Re: Feedback about Drupal SQL debugging

On 8/21/09 3:17 PM, Tom Lane wrote:

David Fetter <david@fetter.org> writes:

On Fri, Aug 21, 2009 at 04:19:43PM -0400, Tom Lane wrote:

Josh Berkus <josh@agliodbs.com> writes:

BTW, why don't we have a multi-argument version of CONCAT()?

Why wouldn't people use the SQL-standard || operator instead?

Because by default, MySQL uses that as, get this, "logical OR."

Egad. Well, I think that's something for the mysqlcompat project
not core ...

Yeah, I'll write one. I'll also blog it as an example of the new
variable argument functions.

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com

#16Andrew Dunstan
andrew@dunslane.net
In reply to: Jean-Michel Pouré (#12)
Re: Feedback about Drupal SQL debugging

Jean-Michel Pour� wrote:

BTW, why don't we have a multi-argument version of CONCAT()? In 8.4,
it
would be possible ... I should add it to mysqlcompat library.

yes. In PostgreSQL core ...

No. That is exactly where it shouldn't go. And frankly, Drupal
developers should stop writing non-portable code. Isn't there a Mysql
mode that is supposed to conform to the standard, at least more than
their default mode?

PostgreSQL requires all non-aggregated fields to be present in the GROUP
BY clause (I fixed 10 such issues in Drupal code).
http://drupal.org/node/555530

Why can't PostgreSQL add the required field automatically? Could this be
added to PostgreSQL to-do-list?

Isn't that contrary to the standard?

cheers

andrew

#17Greg Stark
gsstark@mit.edu
In reply to: Jean-Michel Pouré (#12)
Re: Feedback about Drupal SQL debugging

2009/8/21 Jean-Michel Pouré <jm@poure.com>:

PostgreSQL requires all non-aggregated fields to be present in the GROUP
BY clause (I fixed 10 such issues in Drupal code).
http://drupal.org/node/555530

Why can't PostgreSQL add the required field automatically? Could this be
added to PostgreSQL to-do-list?

This is a more complex (and more interesting) topic than what your
blog discusses.

Firstly understand what MySQL is *actually* doing:

select a,b,c from tab group by a

Only sorts and groups by "a" as instructed. The b columns and c
columns are not included in the grouping. So if you have data like:

a,b,c
1,1,1
1,2,2
2,1,1
2,2,2

You'll get two groups because there are only two values of "a". One
group will have a=1 and one group will have a=2. Which value you get
for b and c will be completely arbitrary and unpredictable.

If Postgres added b,c to the GROUP BY it would produce four groups,
because there four different values of <a,b,c>. You *can* get
something similar to MySQL's behaviour using DISTINCT ON:

select distinct on (a) a,b,c from a ORDER BY a,b,c

But Postgres insists you have an ORDER BY which has to agree with the
DISTINCT ON columns and provide some extra column(s) to determine
which values of b,c are chosen.

If Postgres changed on this front it would be to support the SQL
Standard concept of "functional dependency". In cases where some
columns are guaranteed to be unique you can leave them out of the
GROUP BY but still use them in the select list. This isn't MySQL's
behaviour of just allowing you to leave them out and hope that it
doesn't matter which row's values are used. The database has to
actually determine that it really doesn't matter. Typically that would
be because you've grouped by a set of columns which form the key of a
unique constraint, in which case every other column from that table
would also necessarily be the same since they would all come from the
same row of that table.

--
greg
http://mit.edu/~gsstark/resume.pdf

#18Joshua D. Drake
jd@commandprompt.com
In reply to: Andrew Dunstan (#16)
Re: Feedback about Drupal SQL debugging

On Friday 21 August 2009 04:01:36 pm Andrew Dunstan wrote:

Jean-Michel Pouré wrote:

BTW, why don't we have a multi-argument version of CONCAT()? In 8.4,
it
would be possible ... I should add it to mysqlcompat library.

yes. In PostgreSQL core ...

No. That is exactly where it shouldn't go. And frankly, Drupal
developers should stop writing non-portable code. Isn't there a Mysql
mode that is supposed to conform to the standard, at least more than
their default mode?

This is all solved with Drupal 7. We really shouldn't be burning time on this.

Joshua D. Drake
--
PostgreSQL.org Major Contributor
Command Prompt, Inc : 503-667-4564 - http://www.commandprompt.com/
Since 1997, Consulting, Development, Support, Training

#19Andrew Dunstan
andrew@dunslane.net
In reply to: Greg Stark (#17)
Re: Feedback about Drupal SQL debugging

Greg Stark wrote:

If Postgres changed on this front it would be to support the SQL
Standard concept of "functional dependency". In cases where some
columns are guaranteed to be unique you can leave them out of the
GROUP BY but still use them in the select list. This isn't MySQL's
behaviour of just allowing you to leave them out and hope that it
doesn't matter which row's values are used. The database has to
actually determine that it really doesn't matter. Typically that would
be because you've grouped by a set of columns which form the key of a
unique constraint, in which case every other column from that table
would also necessarily be the same since they would all come from the
same row of that table.

That would make much more sense.

You can also get the effect of picking an arbitrary row now by use
max(column) or min(column) in place of the straight column.

cheers

andrew

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#16)
Re: Feedback about Drupal SQL debugging

Andrew Dunstan <andrew@dunslane.net> writes:

Jean-Michel Pour� wrote:

Why can't PostgreSQL add the required field automatically? Could this be
added to PostgreSQL to-do-list?

Isn't that contrary to the standard?

As of SQL99 it's supposed to be legal if you're grouping by a primary key
(or some other cases where the other columns can be proved functionally
dependent on the grouping columns, but that's the most useful one).
We haven't got round to implementing that, but I'm not sure that it
would make the Drupal code work anyway. Are they actually writing to
spec here, or just doing whatever mysql will let them?

BTW, I was under the impression there already *was* a TODO entry about
improving our standards compliance in this area. I can't find it in
the list right now, though.

regards, tom lane

#21Stephen Frost
sfrost@snowman.net
In reply to: Greg Stark (#17)
Re: Feedback about Drupal SQL debugging

* Greg Stark (gsstark@mit.edu) wrote:

select distinct on (a) a,b,c from a ORDER BY a,b,c

But Postgres insists you have an ORDER BY which has to agree with the
DISTINCT ON columns and provide some extra column(s) to determine
which values of b,c are chosen.

Not quite technically correct. You have to have an ORDER BY which
includes the columns inside the DISTINCT ON, but not any more than that.
At that point, the values you get for the other columns are arbitrary.
PG does *allow* you to provide other columns in the ORDER BY, so you can
specify which values from those other columns should be used.

I'm not advocating that we force another column to be used, nor do I
think you are, but I have to admit that I don't think I've ever used it
w/o other columns in the ORDER BY.

If Postgres changed on this front it would be to support the SQL
Standard concept of "functional dependency". In cases where some
columns are guaranteed to be unique you can leave them out of the
GROUP BY but still use them in the select list. This isn't MySQL's
behaviour of just allowing you to leave them out and hope that it
doesn't matter which row's values are used. The database has to
actually determine that it really doesn't matter. Typically that would
be because you've grouped by a set of columns which form the key of a
unique constraint, in which case every other column from that table
would also necessarily be the same since they would all come from the
same row of that table.

Hrmm. That sounds kinda neat, but you'd still have to specify one of
the columns in the GROUP BY, I presume? Or could you just say 'GROUP
BY' without any columns, and have it GROUP BY the key of the table
you're using?

Thanks,

Stephen

#22Greg Stark
gsstark@mit.edu
In reply to: Stephen Frost (#21)
Re: Feedback about Drupal SQL debugging

2009/8/22 Stephen Frost <sfrost@snowman.net>:

Hrmm.  That sounds kinda neat, but you'd still have to specify one of
the columns in the GROUP BY, I presume?  Or could you just say 'GROUP
BY' without any columns, and have it GROUP BY the key of the table
you're using?

You would have to specify the key. I think typically you would have
something like:

SELECT a.*, sum(b.col)
FROM a,b
GROUP BY a.pk

Since you have the primary key of a in your group by column you're
allowed to use any columns from a in your select list even if they're
not listed in the group by clause.

The database knows that it can use those values from any output row of
the group since they'll all come from the same orginal row of a. Or
possibly it could use some plan that doesn't involve multiplying that
data in the first place.

--
greg
http://mit.edu/~gsstark/resume.pdf

#23Stephen Frost
sfrost@snowman.net
In reply to: Greg Stark (#22)
Re: Feedback about Drupal SQL debugging

* Greg Stark (gsstark@mit.edu) wrote:

You would have to specify the key. I think typically you would have
something like:

SELECT a.*, sum(b.col)
FROM a,b
GROUP BY a.pk

Ahhh, ok, this makes more sense. This is SQL standard? Do we have a
TODO for it?

The database knows that it can use those values from any output row of
the group since they'll all come from the same orginal row of a. Or
possibly it could use some plan that doesn't involve multiplying that
data in the first place.

Right. It strikes me as a relativly small amount of work to get the
initial "just add the columns to the group by" logic implemented. I'd
start from exactly where that ERROR comes from, to minimize any
performance hit from having to go figure out if the columns in the GROUP
BY comprise a key. Doing something different in the planner based on
that could come later, if necessary.

I havn't looked at any code yet, but those who are familiar with these
areas- any gotchas you can think of off-hand to make this more difficult
than I'm hoping it is?

Thanks,

Stephen

#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Stephen Frost (#23)
Re: Feedback about Drupal SQL debugging

Stephen Frost <sfrost@snowman.net> writes:

Right. It strikes me as a relativly small amount of work to get the
initial "just add the columns to the group by" logic implemented.

Well, no, you *aren't* adding the columns to the GROUP BY. You're just
not throwing the error. You really don't want to add redundant columns
to GROUP BY because it makes more work for the planner and executor
(unless the planner can figure out they're redundant, which in itself
takes work).

This is a bit trickier than it looks because it makes the validity of a
query dependent on the existence of an appropriate uniqueness
constraint; thus for example DROP CONSTRAINT might invalidate a stored
rule or view. See prior discussions.

regards, tom lane

#25Stephen Frost
sfrost@snowman.net
In reply to: Tom Lane (#24)
Re: Feedback about Drupal SQL debugging

* Tom Lane (tgl@sss.pgh.pa.us) wrote:

Stephen Frost <sfrost@snowman.net> writes:

Right. It strikes me as a relativly small amount of work to get the
initial "just add the columns to the group by" logic implemented.

Well, no, you *aren't* adding the columns to the GROUP BY. You're just
not throwing the error. You really don't want to add redundant columns
to GROUP BY because it makes more work for the planner and executor
(unless the planner can figure out they're redundant, which in itself
takes work).

Hmm, right. Possibly also add some bit of info to pass to something
down the line, if necessary.

This is a bit trickier than it looks because it makes the validity of a
query dependent on the existence of an appropriate uniqueness
constraint; thus for example DROP CONSTRAINT might invalidate a stored
rule or view. See prior discussions.

Ah, yes. Couldn't the dependency system be used to handle this though?
If you try to drop that constraint it'll complain unless you use cascade
which would drop the view? I'll try and find older discussions. Sadly,
I don't see it on the TODO. Perhaps I can add it.

Thanks,

Stephen

#26Greg Stark
gsstark@mit.edu
In reply to: Stephen Frost (#23)
Re: Feedback about Drupal SQL debugging

2009/8/22 Stephen Frost <sfrost@snowman.net>:

* Greg Stark (gsstark@mit.edu) wrote:

You would have to specify the key. I think typically you would have
something like:

SELECT a.*, sum(b.col)
   FROM a,b
 GROUP BY a.pk

Ahhh, ok, this makes more sense.  This is SQL standard?

Incidentally it makes even more sense that MySQL would do what they do
when you remember that they didn't have subqueries until recently. So
MySQL programmers had all become accustomed to the circumlocutions
like:

SELECT a.*
FROM a left join b USING (a.b_id = b.id)
WHERE b.id IS NULL
GROUP BY a.id

to express the much simpler

select * from a where b_id in (select id from b)

So not many uses of it in MySQL actually *would* be valid if we
implemented the shortcut. But MySQL doesn't enforce that so it serves
that purpose as well as what we get out of DISTINCT ON.

--
greg
http://mit.edu/~gsstark/resume.pdf

#27Stephen Frost
sfrost@snowman.net
In reply to: Greg Stark (#26)
Re: Feedback about Drupal SQL debugging

* Greg Stark (gsstark@mit.edu) wrote:

So not many uses of it in MySQL actually *would* be valid if we
implemented the shortcut. But MySQL doesn't enforce that so it serves
that purpose as well as what we get out of DISTINCT ON.

That's probably a good thing- if they're valid then we'd probably return
something different which would be a suprise. I'm not really looking at
this from the "help MySQL apps" point of view.. It just strikes me as
something nice to have.

Thanks,

Stephen

#28Greg Stark
gsstark@mit.edu
In reply to: Stephen Frost (#25)
Re: Feedback about Drupal SQL debugging

2009/8/22 Stephen Frost <sfrost@snowman.net>:

This is a bit trickier than it looks because it makes the validity of a
query dependent on the existence of an appropriate uniqueness
constraint; thus for example DROP CONSTRAINT might invalidate a stored
rule or view.  See prior discussions.

Ah, yes.  Couldn't the dependency system be used to handle this though?
If you try to drop that constraint it'll complain unless you use cascade
which would drop the view?  I'll try and find older discussions.  Sadly,
I don't see it on the TODO.  Perhaps I can add it.

All this wasn't possible before 8.3 so there are a whole slew of
optimizations that have been kind of waiting in the wings until we got
that infrastructure.

The first step is probably to do the opposite of what we're talking
about here: cases where people *have* added extra columns to the GROUP
BY key so they can use those columns in their select list. We can
remove those columns from the sort or hash comparison key if there's a
column (or columns) which is a unique constraint key for the same
source. Similarly we can remove columns from an ORDER BY if the order
key has earlier columns which are already a unique key for the same
source.

That would be fairly simple and it would provide a good test case for
the dependency tracking stuff and plan invalidation triggered by
constraint ddl. It wouldn't be a massive performance change but it
would help some cases where the sort node is comparing a lot of
redundant keys.

--
greg
http://mit.edu/~gsstark/resume.pdf

#29Tom Lane
tgl@sss.pgh.pa.us
In reply to: Greg Stark (#28)
Re: Feedback about Drupal SQL debugging

Greg Stark <gsstark@mit.edu> writes:

The first step is probably to do the opposite of what we're talking
about here: cases where people *have* added extra columns to the GROUP
BY key so they can use those columns in their select list. We can
remove those columns from the sort or hash comparison key if there's a
column (or columns) which is a unique constraint key for the same
source. Similarly we can remove columns from an ORDER BY if the order
key has earlier columns which are already a unique key for the same
source.

This is something we could only do at plan time --- if we do it at parse
time we risk making a robust query into one that will break when
somebody drops a constraint. So it's not really the inverse of the
other case.

regards, tom lane

#30Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#20)
Re: Feedback about Drupal SQL debugging

On fre, 2009-08-21 at 20:07 -0400, Tom Lane wrote:

As of SQL99 it's supposed to be legal if you're grouping by a primary key
(or some other cases where the other columns can be proved functionally
dependent on the grouping columns, but that's the most useful one).
We haven't got round to implementing that, but I'm not sure that it
would make the Drupal code work anyway. Are they actually writing to
spec here, or just doing whatever mysql will let them?

BTW, I was under the impression there already *was* a TODO entry about
improving our standards compliance in this area. I can't find it in
the list right now, though.

I added "Add support for functional dependencies" just now, with a
comment how this relates to GROUP BY.