effective SELECT from child tables

Started by Ilia Kantorover 20 years ago45 messageshackers
Jump to latest
#1Ilia Kantor
ilia@obnovlenie.ru

Let table A be inherited by A1, A2, A3.

How to select from A records where actual relations are A1, A2 ?

I found a way somewhere, it sounds like SELECT : WHERE tableoid IN (a1.oid,
a2.oid),

but tableoid checks actually do seq scan.

Like: SELECT * FROM sometable WHERE tableoid =anything will do seq. scan on
sometable..

So such way seems very ineffective: it seq scans and filters records..

Maybe new constraint_exclusion staff could help to exclude non-matching
tables from inheritance query ?

#2Josh Berkus
josh@agliodbs.com
In reply to: Ilia Kantor (#1)
Re: effective SELECT from child tables

Ilia,

Maybe new constraint_exclusion staff could help to exclude non-matching
tables from inheritance query ?

Yes, that's exactly what it's for. Your testing is welcome. Download 8.1
and try it today.

--
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ilia Kantor (#1)
Re: effective SELECT from child tables

"Ilia Kantor" <ilia@obnovlenie.ru> writes:

Let table A be inherited by A1, A2, A3.

How to select from A records where actual relations are A1, A2 ?

Why not just select directly from the child tables? I can't get excited
about optimizing the case you propose.

regards, tom lane

#4Ilia Kantor
ilia@obnovlenie.ru
In reply to: Josh Berkus (#2)
Re: effective SELECT from child tables

Maybe new constraint_exclusion staff could help to exclude non-matching
tables from inheritance query ?

Yes, that's exactly what it's for. Your testing is welcome. Download 8.1

and try it today.

Great, I'm developing on 8.1b2 now...
But could you be more particular about the solution ?

Only the way I can think of is to add "relname" field into parent table, add
"BEFORE INSERT" trigger to each child that will set it appropriately and
CHECK (relname=<here goes table name>).
It works in this case..
More than that.. I can create indexes on each table with "WHERE
relname!=<table name>", then they are used with DELETE/UPDATE WHERE relname
IN(..).

But the whole idea to adding an extra field, trigger and duplicating table
name multiple times.. Feels a bit stinky to me..

Could you suggest another path ?

#5Ilia Kantor
ilia@obnovlenie.ru
In reply to: Tom Lane (#3)
Re: effective SELECT from child tables

Let table A be inherited by A1, A2, A3.
How to select from A records where actual relations are A1, A2 ?

Why not just select directly from the child tables? I can't get excited
about optimizing the case you propose.

Because "WHERE concrete_class IN (a,b,c)" is much more convenient and
flexible way of forming select then manually split request into many unions.

Also, this query runs on top of "abstract class", so inheritance really
assists me here.

#6Mark Mielke
mark@mark.mielke.cc
In reply to: Ilia Kantor (#1)
Re: effective SELECT from child tables

On Tue, Sep 27, 2005 at 09:30:55PM +0400, Ilia Kantor wrote:

Let table A be inherited by A1, A2, A3.
How to select from A records where actual relations are A1, A2 ?

If A1 and A2 will be naturally together, where compared to A, or A3,
why not introduce an intermediate table?

A would be inherited by A12, and A3. A12 would be inherited by A1, and A2.

You can do the UNION yourself, as well.

Cheers,
mark

--
mark@mielke.cc / markm@ncf.ca / markm@nortel.com __________________________
. . _ ._ . . .__ . . ._. .__ . . . .__ | Neighbourhood Coder
|\/| |_| |_| |/ |_ |\/| | |_ | |/ |_ |
| | | | | \ | \ |__ . | | .|. |__ |__ | \ |__ | Ottawa, Ontario, Canada

One ring to rule them all, one ring to find them, one ring to bring them all
and in the darkness bind them...

http://mark.mielke.cc/

#7Simon Riggs
simon@2ndQuadrant.com
In reply to: Ilia Kantor (#5)
Re: effective SELECT from child tables

On Wed, 2005-09-28 at 12:13 +0400, Ilia Kantor wrote:

Let table A be inherited by A1, A2, A3.
How to select from A records where actual relations are A1, A2 ?

Why not just select directly from the child tables? I can't get excited
about optimizing the case you propose.

Because "WHERE concrete_class IN (a,b,c)" is much more convenient and
flexible way of forming select then manually split request into many unions.

Also, this query runs on top of "abstract class", so inheritance really
assists me here.

If you treat the sub-class Discriminator as a data item rather than some
additional syntax for class membership then you will find this works
very straightforwardly for you.

Include the Discriminator as a column in A and it will be inherited by
all A1, A2, A3.
e.g. concrete_class char(1) not null

then setup constraints on each table like so

ALTER TABLE A1 ADD CHECK (concrete_class = 'A')
ALTER TABLE A2 ADD CHECK (concrete_class = 'B')
ALTER TABLE A3 ADD CHECK (concrete_class = 'C')

then when you run a query like

SELECT * FROM A WHERE concrete_class IN ('A','B')

you will find that table A3, which corresponds to concrete_class C has
been excluded from your query.

Presumably A, B, C are all mutually exclusive, so the end result will be
the same as if you had used a UNION ALL set query.

This will add 1 byte per row in your superclass... and requires no
index. You can even add this as a DEFAULT value for each child table, so
the actual column concrete_class need not be mentioned in an INSERT
statement.

(I've got plans to add an ABSTRACT keyword to tables to follow along the
same set of OO terminology in describing this situation. In next
release, not 8.1)

Best Regards, Simon Riggs

#8Ilia Kantor
ilia@obnovlenie.ru
In reply to: Simon Riggs (#7)
Re: effective SELECT from child tables

If you treat the sub-class Discriminator as a data item rather than some
additional syntax for class membership then you will find this works
very straightforwardly for you.

Your suggestion is essentially the same as mine..
There exists tableoid, pretty much suited to tell between tables in the case
of inheritance.. I can't see a "real" need to add a special "class
classifier" to each table..
This solution is a workaround. It will work, just can't make myself love it.

#9Martijn van Oosterhout
kleptog@svana.org
In reply to: Ilia Kantor (#8)
Re: effective SELECT from child tables

On Wed, Sep 28, 2005 at 11:00:46PM +0400, Ilia Kantor wrote:

Your suggestion is essentially the same as mine..
There exists tableoid, pretty much suited to tell between tables in the case
of inheritance.. I can't see a "real" need to add a special "class
classifier" to each table..
This solution is a workaround. It will work, just can't make myself love it.

I wonder if it would be possible to tweak the constraints exclusion
code so that if it sees something of the form "tableoid = X" to exclude
other tables...

You know, assume each table has a constraint "tableoid = OID".

Still, it is a fairly unusual feature.
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

Show quoted text

Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
tool for doing 5% of the work and then sitting around waiting for someone
else to do the other 95% so you can sue them.

#10Josh Berkus
josh@agliodbs.com
In reply to: Ilia Kantor (#4)
Re: effective SELECT from child tables

Ilia,

Well, Simon is still writing the CE docs. In the meantime:

http://archives.postgresql.org/pgsql-hackers/2005-07/msg00461.php

--josh

#11Simon Riggs
simon@2ndQuadrant.com
In reply to: Martijn van Oosterhout (#9)
Re: effective SELECT from child tables

On Wed, 2005-09-28 at 22:24 +0200, Martijn van Oosterhout wrote:

On Wed, Sep 28, 2005 at 11:00:46PM +0400, Ilia Kantor wrote:

Your suggestion is essentially the same as mine..
There exists tableoid, pretty much suited to tell between tables in the case
of inheritance.. I can't see a "real" need to add a special "class
classifier" to each table..
This solution is a workaround. It will work, just can't make myself love it.

I wonder if it would be possible to tweak the constraints exclusion
code so that if it sees something of the form "tableoid = X" to exclude
other tables...

You know, assume each table has a constraint "tableoid = OID".

Still, it is a fairly unusual feature.

As I pointed out, the solution I proposed works with CE, in comparison
with selecting a tableoid, which does not. It also costs 1 byte per row.

Jim Nasby requested a similar feature a couple of months back.
Essentially this request reduces to the idea of having attributes that
are constant for all rows of a table. That doesn't have any side or
additional benefits AFAICS, so improving that case isn't going to be at
the top of a worthwhile-improvements list for a long time yet,
especially since the CE list already has at least 10 items on it.

Best Regards, Simon Riggs

#12Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Martijn van Oosterhout (#9)
Re: effective SELECT from child tables

On Wed, Sep 28, 2005 at 10:24:18PM +0200, Martijn van Oosterhout wrote:

On Wed, Sep 28, 2005 at 11:00:46PM +0400, Ilia Kantor wrote:

Your suggestion is essentially the same as mine..
There exists tableoid, pretty much suited to tell between tables in the case
of inheritance.. I can't see a "real" need to add a special "class
classifier" to each table..
This solution is a workaround. It will work, just can't make myself love it.

I wonder if it would be possible to tweak the constraints exclusion
code so that if it sees something of the form "tableoid = X" to exclude
other tables...

You know, assume each table has a constraint "tableoid = OID".

Still, it is a fairly unusual feature.

Well, it's possibly a good way to do list partitioning where you can
drop the partitioning key out of each partition, something I would love
to have for stats.distributed.net (I'm actually working on a project
that does exactly this with a UNION ALL view and rules...)
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

#13Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Simon Riggs (#7)
Re: effective SELECT from child tables

On Wed, Sep 28, 2005 at 07:25:46PM +0100, Simon Riggs wrote:

Include the Discriminator as a column in A and it will be inherited by
all A1, A2, A3.
e.g. concrete_class char(1) not null

<snip>

This will add 1 byte per row in your superclass... and requires no

I thought char was actually stored variable-length...? I know there's a
type that actually acts like char does on most databases, but I can't
remember what it is off-hand (it should be mentioned in docs 8.3...)
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

#14Simon Riggs
simon@2ndQuadrant.com
In reply to: Jim Nasby (#13)
Re: effective SELECT from child tables

On Fri, 2005-09-30 at 18:30 -0500, Jim C. Nasby wrote:

I thought char was actually stored variable-length...? I know there's a
type that actually acts like char does on most databases, but I can't
remember what it is off-hand (it should be mentioned in docs 8.3...)

You are correct on that point, but IMHO that doesn't change the
situation sufficiently for the main part of this thread.

Best Regards, Simon Riggs

#15Martijn van Oosterhout
kleptog@svana.org
In reply to: Jim Nasby (#13)
Re: effective SELECT from child tables

On Fri, Sep 30, 2005 at 06:30:10PM -0500, Jim C. Nasby wrote:

On Wed, Sep 28, 2005 at 07:25:46PM +0100, Simon Riggs wrote:

Include the Discriminator as a column in A and it will be inherited by
all A1, A2, A3.
e.g. concrete_class char(1) not null

<snip>

This will add 1 byte per row in your superclass... and requires no

I thought char was actually stored variable-length...? I know there's a
type that actually acts like char does on most databases, but I can't
remember what it is off-hand (it should be mentioned in docs 8.3...)

IIRC, this is the difference between "char" and char(1). The latter is
variable length and can store any character per current encoding, hence
the variable length. "char" on the other hand is a one byte (presumably
ASCII) character. It's used mainly in the system catalogs...
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

Show quoted text

Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
tool for doing 5% of the work and then sitting around waiting for someone
else to do the other 95% so you can sue them.

#16Mark Mielke
mark@mark.mielke.cc
In reply to: Simon Riggs (#11)
Re: effective SELECT from child tables

On Fri, Sep 30, 2005 at 09:54:39PM +0100, Simon Riggs wrote:

On Wed, 2005-09-28 at 22:24 +0200, Martijn van Oosterhout wrote:

I wonder if it would be possible to tweak the constraints exclusion
code so that if it sees something of the form "tableoid = X" to exclude
other tables...
You know, assume each table has a constraint "tableoid = OID".
Still, it is a fairly unusual feature.

As I pointed out, the solution I proposed works with CE, in comparison
with selecting a tableoid, which does not. It also costs 1 byte per row.

I can't see the use of an extra char column in a row, that is static
for an entire table, as anything except an ugly hack. It might work.
It doesn't make it right.

Jim Nasby requested a similar feature a couple of months back.
Essentially this request reduces to the idea of having attributes that
are constant for all rows of a table. That doesn't have any side or
additional benefits AFAICS, so improving that case isn't going to be at
the top of a worthwhile-improvements list for a long time yet,
especially since the CE list already has at least 10 items on it.

It has the 'side or additional benefit' being requested here. The ability
to filter the child table by some attribute. For example, if the child
tables are used for partitioning, and the attribute were to keep a date
range, the field restriction optimization could be used to automatically
determine the set of tables to use for the date range specified. With
such a change, it would even work automatically if the date ranges
overlapped for some reason. Selecting a table name by date is hacky. This
sort of solution would be a general solution to the problem.

If the original code suggested - the use of the table oid, and comparing
it to the rows table oid, is supported by PostgreSQL (I have never tried
it), it seems like a miss for the optimizer not to understand what it
means. Each row has one table that it is associated with. It's quite
obviously criteria that could be used to restrict the table search, and
belongs in any COMPLETE implementation of a restriction optimizer.

I still prefer Jim Nasby's model, though. I think it easily extends such
that the table row oid IS an automatic constant for all rows of a table.
It's a natural extension, and takes away the exceptional nature of the
table row oid.

Cheers,
mark

--
mark@mielke.cc / markm@ncf.ca / markm@nortel.com __________________________
. . _ ._ . . .__ . . ._. .__ . . . .__ | Neighbourhood Coder
|\/| |_| |_| |/ |_ |\/| | |_ | |/ |_ |
| | | | | \ | \ |__ . | | .|. |__ |__ | \ |__ | Ottawa, Ontario, Canada

One ring to rule them all, one ring to find them, one ring to bring them all
and in the darkness bind them...

http://mark.mielke.cc/

#17Martijn van Oosterhout
kleptog@svana.org
In reply to: Mark Mielke (#16)
Re: effective SELECT from child tables

On Sat, Oct 01, 2005 at 10:05:22AM -0400, mark@mark.mielke.cc wrote:

It has the 'side or additional benefit' being requested here. The ability
to filter the child table by some attribute. For example, if the child
tables are used for partitioning, and the attribute were to keep a date
range, the field restriction optimization could be used to automatically
determine the set of tables to use for the date range specified. With
such a change, it would even work automatically if the date ranges
overlapped for some reason. Selecting a table name by date is hacky. This
sort of solution would be a general solution to the problem.

This is what "Constraint Exclusion" does. It uses CHECK constraints on
a table to filter out tables that obviously don't apply to a query.
It's just the the specific case of "tableoid = XXX" is not supported
right now.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

Show quoted text

Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
tool for doing 5% of the work and then sitting around waiting for someone
else to do the other 95% so you can sue them.

#18Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Martijn van Oosterhout (#17)
Re: effective SELECT from child tables

To clarify, this is a hard-coded implementation of what I'm asking for:
http://cvs.distributed.net/viewcvs.cgi/stats-sql/logdb/ in a nutshell:

CREATE TABLE log_other (
project_id smallint NOT NULL
...
)

CREATE TABLE log_8 (
-- No project_id
...
)
CREATE TABLE log_24, log_25, log_5...
CREATE VIEW log AS
SELECT * FROM log_other
UNION ALL SELECT 8 AS project_id, * FROM log_8
...

So the end result is that for cases where project_id is 5, 8, 24, or 25,
the data will be stored in tables that don't have the project_id.

If I were to use this on the main table for
http://stats.distributed.net, which has ~130M rows, I would be able to
save 130M*4 bytes (4 instead of 2 due to alignment), or 520MB. The logdb
will have many times that number of rows, so the savings will be even
larger.

Note that this technique wouldn't help at all for something like date
partitioning, because you have to store the date in the partitioned
table.

On Sat, Oct 01, 2005 at 04:35:49PM +0200, Martijn van Oosterhout wrote:

On Sat, Oct 01, 2005 at 10:05:22AM -0400, mark@mark.mielke.cc wrote:

It has the 'side or additional benefit' being requested here. The ability
to filter the child table by some attribute. For example, if the child
tables are used for partitioning, and the attribute were to keep a date
range, the field restriction optimization could be used to automatically
determine the set of tables to use for the date range specified. With
such a change, it would even work automatically if the date ranges
overlapped for some reason. Selecting a table name by date is hacky. This
sort of solution would be a general solution to the problem.

This is what "Constraint Exclusion" does. It uses CHECK constraints on
a table to filter out tables that obviously don't apply to a query.
It's just the the specific case of "tableoid = XXX" is not supported
right now.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
tool for doing 5% of the work and then sitting around waiting for someone
else to do the other 95% so you can sue them.

--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

#19Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Simon Riggs (#14)
Re: effective SELECT from child tables

On Sat, Oct 01, 2005 at 09:43:03AM +0100, Simon Riggs wrote:

On Fri, 2005-09-30 at 18:30 -0500, Jim C. Nasby wrote:

I thought char was actually stored variable-length...? I know there's a
type that actually acts like char does on most databases, but I can't
remember what it is off-hand (it should be mentioned in docs 8.3...)

You are correct on that point, but IMHO that doesn't change the
situation sufficiently for the main part of this thread.

Depends on how many tuples and how wide they are. 4 extra bytes over a
couple hundred million rows isn't something to sneeze at. :)
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

#20Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Martijn van Oosterhout (#15)
Re: effective SELECT from child tables

On Sat, Oct 01, 2005 at 02:13:09PM +0200, Martijn van Oosterhout wrote:

On Fri, Sep 30, 2005 at 06:30:10PM -0500, Jim C. Nasby wrote:

On Wed, Sep 28, 2005 at 07:25:46PM +0100, Simon Riggs wrote:

Include the Discriminator as a column in A and it will be inherited by
all A1, A2, A3.
e.g. concrete_class char(1) not null

<snip>

This will add 1 byte per row in your superclass... and requires no

I thought char was actually stored variable-length...? I know there's a
type that actually acts like char does on most databases, but I can't
remember what it is off-hand (it should be mentioned in docs 8.3...)

IIRC, this is the difference between "char" and char(1). The latter is
variable length and can store any character per current encoding, hence
the variable length. "char" on the other hand is a one byte (presumably
ASCII) character. It's used mainly in the system catalogs...

According to the docs, char == char(1).
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

#21Andrew Dunstan
andrew@dunslane.net
In reply to: Jim Nasby (#20)
#22Martijn van Oosterhout
kleptog@svana.org
In reply to: Jim Nasby (#18)
#23Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Martijn van Oosterhout (#22)
#24Simon Riggs
simon@2ndQuadrant.com
In reply to: Jim Nasby (#18)
#25Hannu Krosing
hannu@tm.ee
In reply to: Simon Riggs (#24)
#26Bruce Momjian
bruce@momjian.us
In reply to: Hannu Krosing (#25)
#27Mark Mielke
mark@mark.mielke.cc
In reply to: Martijn van Oosterhout (#17)
#28Hannu Krosing
hannu@tm.ee
In reply to: Bruce Momjian (#26)
#29Martijn van Oosterhout
kleptog@svana.org
In reply to: Hannu Krosing (#28)
#30Bruce Momjian
bruce@momjian.us
In reply to: Martijn van Oosterhout (#29)
#31Mark Mielke
mark@mark.mielke.cc
In reply to: Martijn van Oosterhout (#17)
#32Hannu Krosing
hannu@tm.ee
In reply to: Martijn van Oosterhout (#29)
#33Martijn van Oosterhout
kleptog@svana.org
In reply to: Hannu Krosing (#32)
#34Bruce Momjian
bruce@momjian.us
In reply to: Martijn van Oosterhout (#33)
#35Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#34)
#36Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#35)
#37Hannu Krosing
hannu@tm.ee
In reply to: Bruce Momjian (#34)
#38Hannu Krosing
hannu@tm.ee
In reply to: Tom Lane (#35)
#39Bruce Momjian
bruce@momjian.us
In reply to: Hannu Krosing (#37)
#40Bruce Momjian
bruce@momjian.us
In reply to: Hannu Krosing (#38)
#41Rod Taylor
rbt@rbt.ca
In reply to: Bruce Momjian (#40)
#42Rod Taylor
rbt@rbt.ca
In reply to: Rod Taylor (#41)
#43Bruce Momjian
bruce@momjian.us
In reply to: Rod Taylor (#41)
#44Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Simon Riggs (#24)
#45Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Bruce Momjian (#40)