Views With Unions

Started by Chris Browneover 22 years ago15 messagesgeneral
Jump to latest
#1Chris Browne
cbbrowne@acm.org

This is stepping back quite a while; let me point people to the thread
of 2003-02 where Mariusz Czu\x{0142}ada <manieq@idea.net.pl> was
looking for a way of optimizing a VIEW that was a UNION.

<http://archives.postgresql.org/pgsql-performance/2003-02/msg00095.php&gt;

The subject has come up a few times through PostgreSQL history, and
I'd imagine to think I may have a little something new to offer to it.

Let's consider a table used to store log information:

create table log_table (
request_time timestamp with time zone,
object character varying, -- What they asked for
request_type character(8), -- What they did to it
request_size integer,
requestor inet,
request_status integer,
result_size integer,
request_detail character varying
);
create index log_times on log_table(request_time);
create index log_object on log_table(object);

Every time "something happens," an entry goes into this table.
Unfortunately, the table is likely to grow to tremendous size, over
time, and there are all sorts of troublesome things about purging it:

-> Fragmentation may waste space and destroy the usefulness of
indices;

-> Deleting data row by row will cause replication logic to go mad,
as triggers get invoked for every single row modified;

-> The action of deletion will draw the data we just decided was
_useless_ into memory, injuring cache utilization badly as we fill
the cache with trash.

The obvious thought: Create several tables, and join them together
into a view. So instead of log_table being a table, we have
log_table_1 thru log_table_3, each with the schema describe above, and
define the view:

create view log_table as select * from log_table_1 union all
select * from log_table_2 union all
select * from log_table_3;

It's easy enough (modulo a little debugging and pl/pgsql work :-)) to
turn this into an updatable view so that inserts into log_table use a
different log table every (day|week|month). And we can TRUNCATE the
eldest one, which is a cheap operation.

This approach also resembles the way the "O guys" handle partitioned
tables, so it's not merely about "logs."

Unfortunately, selects on the VIEW are, at present, unable to make use
of the indices. So if we want all log entries for June 11th, the
query:

select * from log_table where request_time between 'june 11 2003' and
'june 12 2003';

returns a plan:
Subquery Scan log_table (cost=0.00..10950.26 rows=177126 width=314)
-> Append (cost=0.00..10950.26 rows=177126 width=314)
-> Subquery Scan *SELECT* 1 (cost=0.00..3089.07 rows=50307 width=71)
-> Seq Scan on log_table_1 (cost=0.00..3089.07 rows=50307 width=71)
-> Subquery Scan *SELECT* 2 (cost=0.00..602.92 rows=9892 width=314)
-> Seq Scan on log_table_2 (cost=0.00..602.92 rows=9892 width=314)
-> Subquery Scan *SELECT* 3 (cost=0.00..2390.09 rows=39209 width=314)
-> Seq Scan on log_table_3 (cost=0.00..2390.09 rows=39209 width=314)

In effect, the query is materialized into:

select * from
(select * from log_table_1 union all select * from log_table_2
union all select * from log_table_3) as merger
where [request_time between 'june 11 2003' and 'june 12 2003'];

What would perform better would be to attach the WHERE clause to each
of the union members. (Everyone stop and sing "Solidarity Forever"
:-))

E.g.:

select * from
(
select * from log_table_1 where request_time between 'june 11 2003' and 'june 12 2003' union all
select * from log_table_2 where request_time between 'june 11 2003' and 'june 12 2003' union all
select * from log_table_3 where request_time between 'june 11 2003' and 'june 12 2003' union all
) as merged_version;

Subquery Scan merged_version (cost=0.00..947.04 rows=247 width=314) (actual time=55.86..1776.42 rows=20124 loops=1)
-> Append (cost=0.00..947.04 rows=247 width=314) (actual time=55.84..1483.60 rows=20124 loops=1)
-> Subquery Scan *SELECT* 1 (cost=0.00..3.02 rows=1 width=71) (actual time=55.83..289.81 rows=3422 loops=1)
-> Index Scan using log_table_1_trans_on_idx on log_table_1 (cost=0.00..3.02 rows=1 width=71) (actual time=55.80..239.84 rows=3422 loops=1)
-> Subquery Scan *SELECT* 2 (cost=0.00..191.38 rows=49 width=314) (actual time=62.32..1115.15 rows=16702 loops=1)
-> Index Scan using log_table_2_trans_on_idx on log_table_2 (cost=0.00..191.38 rows=49 width=314) (actual time=62.29..873.63 rows=16702 loops=1)
-> Subquery Scan *SELECT* 3 (cost=0.00..752.64 rows=196 width=314) (actual time=26.69..26.69 rows=0 loops=1)
-> Index Scan using log_table_3_trans_on_idx on log_table_3 (cost=0.00..752.64 rows=196 width=314) (actual time=26.69..26.69 rows=0 loops=1)
Total runtime: 1806.39 msec

Which is nice and quick, as it cuts each set down to size _before_
merging them.

Mariusz had been looking, back in February, for an optimization that
would, in effect, throw away the UNION ALL clauses that were
unnecessary. Tom Lane and Stephan Szabo, in discussing this,
observed, quite rightly, that this is liable to be an obscure sort of
optimization:

Tom Lane writes:

Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:

Yeah, but I think what he's hoping is that it'll notice that
"key=1 and key=3" would be noticed as a false condition so that it doesn't
scan those tables since a row presumably can't satisify both. The question
would be, is the expense of checking the condition for all queries
greater than the potential gain for these sorts of queries.

Yes, this is the key point: we won't put in an optimization that
wins on a small class of queries unless there is no material cost
added for planning cases where it doesn't apply.

In contrast, I would argue that adding the WHERE clause in as an extra
condition on each of the UNION subqueries is an optimization that is
likely to win in _most_ cases.

It helps with the example I illustrated; it would help with Mariusz'
scenario, not by outright eliminating UNION subqueries, but rather by
making their result sets empty.

select key, value from view123 where key = 2

transforms into...

select key, value from tab1 where key=1 [and key = 2]
union all
select key, value from tab2 where key=2 [and key = 2]
union all
select key, value from tab3 where key=3 [and key = 2];

The generalization is that:

select * from
(select [fields1] from t1 where [cond1] (UNION|UNION ALL|INTERSECT)
select [fields2] from t2 where [cond2] (UNION|UNION ALL|INTERSECT)
...
select [fieldsn] from tn where [condn]) as COMBINATION
WHERE [globalcond];

is equivalent to:

select * from
(select [fields1] from t1 where ([cond1]) and [globalcond] (UNION|UNION ALL|INTERSECT)
select [fields2] from t2 where ([cond2]) and [globalcond] (UNION|UNION ALL|INTERSECT)
...
select [fieldsn] from tn where ([condn]) and [globalcond]
) as COMBINATION;

[globalcond] has to be expressed in terms of the fields available for
each subquery, but that already needs to be true, because the global
condition at present is being applied to the fields that are given by
the UNION/INTERSECT/UNION ALL.
--
let name="cbbrowne" and tld="libertyrms.info" in String.concat "@" [name;tld];;
<http://dev6.int.libertyrms.com/&gt;
Christopher Browne
(416) 646 3304 x124 (land)

#2Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Chris Browne (#1)
Re: Views With Unions

On Thu, 31 Jul 2003, Christopher Browne wrote:

select * from log_table where request_time between 'june 11 2003' and
'june 12 2003';

returns a plan:
Subquery Scan log_table (cost=0.00..10950.26 rows=177126 width=314)
-> Append (cost=0.00..10950.26 rows=177126 width=314)
-> Subquery Scan *SELECT* 1 (cost=0.00..3089.07 rows=50307 width=71)
-> Seq Scan on log_table_1 (cost=0.00..3089.07 rows=50307 width=71)
-> Subquery Scan *SELECT* 2 (cost=0.00..602.92 rows=9892 width=314)
-> Seq Scan on log_table_2 (cost=0.00..602.92 rows=9892 width=314)
-> Subquery Scan *SELECT* 3 (cost=0.00..2390.09 rows=39209 width=314)
-> Seq Scan on log_table_3 (cost=0.00..2390.09 rows=39209 width=314)

What version are you using? In 7.3 and up it should be willing to
consider moving the clause down, unless there's something like a type
mismatch (because in that case it may not be equivalent without a bunch
more work on the clause).

#3Rajesh Kumar Mallah
mallah@trade-india.com
In reply to: Stephan Szabo (#2)
Re: Views With Unions

Stephan Szabo wrote:

On Thu, 31 Jul 2003, Christopher Browne wrote:

select * from log_table where request_time between 'june 11 2003' and
'june 12 2003';

returns a plan:
Subquery Scan log_table (cost=0.00..10950.26 rows=177126 width=314)
-> Append (cost=0.00..10950.26 rows=177126 width=314)
-> Subquery Scan *SELECT* 1 (cost=0.00..3089.07 rows=50307 width=71)
-> Seq Scan on log_table_1 (cost=0.00..3089.07 rows=50307 width=71)
-> Subquery Scan *SELECT* 2 (cost=0.00..602.92 rows=9892 width=314)
-> Seq Scan on log_table_2 (cost=0.00..602.92 rows=9892 width=314)
-> Subquery Scan *SELECT* 3 (cost=0.00..2390.09 rows=39209 width=314)
-> Seq Scan on log_table_3 (cost=0.00..2390.09 rows=39209 width=314)

What version are you using? In 7.3 and up it should be willing to
consider moving the clause down, unless there's something like a type
mismatch (because in that case it may not be equivalent without a bunch
more work on the clause).

Dear Chris,

I had the same problem(type mismatch) and it was solved finally.
check the list
"factoring problem ... " subject only 2 weeks back .

regds
mallah.

Show quoted text

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

#4Chris Browne
cbbrowne@acm.org
In reply to: Stephan Szabo (#2)
Re: Views With Unions

Stephan Szabo said:

On Thu, 31 Jul 2003, Christopher Browne wrote:

select * from log_table where request_time between 'june 11 2003'
and
'june 12 2003';

returns a plan:
Subquery Scan log_table (cost=0.00..10950.26 rows=177126 width=314)
-> Append (cost=0.00..10950.26 rows=177126 width=314)
-> Subquery Scan *SELECT* 1 (cost=0.00..3089.07 rows=50307
width=71)
-> Seq Scan on log_table_1 (cost=0.00..3089.07
rows=50307 width=71)
-> Subquery Scan *SELECT* 2 (cost=0.00..602.92 rows=9892
width=314)
-> Seq Scan on log_table_2 (cost=0.00..602.92
rows=9892 width=314)
-> Subquery Scan *SELECT* 3 (cost=0.00..2390.09 rows=39209
width=314)
-> Seq Scan on log_table_3 (cost=0.00..2390.09
rows=39209 width=314)

What version are you using? In 7.3 and up it should be willing to
consider moving the clause down, unless there's something like a type
mismatch (because in that case it may not be equivalent without a bunch
more work on the clause).

That was 7.2.4, although I had also tried it on 7.4 (yesterday's CVS).

Which provides four findings:

1. On 7.2.4, adding additional type info just doesn't help, fitting with
the notion that, consistent with your comment, improvement wouldn't happen
earlier than 7.3.

There's no help on 7.2 :-(, and the system I'm initially most interested
in using this on is still on 7.2.

2. When I retried on 7.4, it _did_ find search paths based on Index Scan,
when I added in additional type information. So the optimization I was
wishing for _is_ there :-). In the longer term, that's very good news.

3. I'll have to test this out on 7.3.4, now, as I hadn't, and it sounds
as though that is an interesting case.

4. It's often necessary to expressly specify type information in queries
to get the optimizer to do the Right Thing.
--
(reverse (concatenate 'string "ofni.smrytrebil@" "enworbbc"))
<http://dev6.int.libertyrms.info/&gt;
Christopher Browne
(416) 646 3304 x124 (land)

#5Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Chris Browne (#4)
Re: Views With Unions

On Fri, 1 Aug 2003, Christopher Browne wrote:

Stephan Szabo said:

What version are you using? In 7.3 and up it should be willing to
consider moving the clause down, unless there's something like a type
mismatch (because in that case it may not be equivalent without a bunch
more work on the clause).

That was 7.2.4, although I had also tried it on 7.4 (yesterday's CVS).

Which provides four findings:

1. On 7.2.4, adding additional type info just doesn't help, fitting with
the notion that, consistent with your comment, improvement wouldn't happen
earlier than 7.3.

There's no help on 7.2 :-(, and the system I'm initially most interested
in using this on is still on 7.2.

If you really wanted you could try going back and finding the diffs
associated with this in the CVS history or committers archives and see if
you can make equivalent changes to 7.2, but that's possibly going to be
difficult.

2. When I retried on 7.4, it _did_ find search paths based on Index Scan,
when I added in additional type information. So the optimization I was
wishing for _is_ there :-). In the longer term, that's very good news.

3. I'll have to test this out on 7.3.4, now, as I hadn't, and it sounds
as though that is an interesting case.

4. It's often necessary to expressly specify type information in queries
to get the optimizer to do the Right Thing.

Especially for cases like this. It takes the safer route of not pushing
things down when it's not sure if pushing down might change the semantics
(for example if a union piece has a different type from the union
output, simply pushing clauses down unchanged could change the results)

Tom would probably be willing to relax conditions if it could be proven
safe even for the wierd outlying cases with char and varchar and such.

#6Chris Browne
cbbrowne@acm.org
In reply to: Stephan Szabo (#5)
Re: Views With Unions

Stephan Szabo <sszabo@megazone.bigpanda.com> writes:

On Fri, 1 Aug 2003, Christopher Browne wrote:

Stephan Szabo said:

What version are you using? In 7.3 and up it should be willing to
consider moving the clause down, unless there's something like a type
mismatch (because in that case it may not be equivalent without a bunch
more work on the clause).

That was 7.2.4, although I had also tried it on 7.4 (yesterday's CVS).

Which provides four findings:

1. On 7.2.4, adding additional type info just doesn't help, fitting with
the notion that, consistent with your comment, improvement wouldn't happen
earlier than 7.3.

There's no help on 7.2 :-(, and the system I'm initially most interested
in using this on is still on 7.2.

If you really wanted you could try going back and finding the diffs
associated with this in the CVS history or committers archives and see if
you can make equivalent changes to 7.2, but that's possibly going to be
difficult.

Somehow I don't think that'll fly; I have taken a brief look at some
of the optimizer code, and I very much don't want to leap into that at
the moment. (I don't imagine I'd be able to muster much enthusiasm
for the idea from others that are involved, either. More likely, I'm
understating the probable opposition to the idea... :-))

I was hoping there would be some help on 7.2, but can live without it.
This approach to improving log purgeability is NOT the sort of thing
that you deploy on a day's notice because it seems like a "neat idea."
If it waits a couple months to be implemented, that's doubtless OK.

2. When I retried on 7.4, it _did_ find search paths based on Index Scan,
when I added in additional type information. So the optimization I was
wishing for _is_ there :-). In the longer term, that's very good news.

3. I'll have to test this out on 7.3.4, now, as I hadn't, and it sounds
as though that is an interesting case.

It turns out nicely on 7.3.4, using index scans for the subqueries in
the query:

select count(*) from log_table where event_date between
'2003-04-01' and '2003-05-01';

Which is a Good Thing.

4. It's often necessary to expressly specify type information in
queries to get the optimizer to do the Right Thing.

Especially for cases like this. It takes the safer route of not
pushing things down when it's not sure if pushing down might change
the semantics (for example if a union piece has a different type
from the union output, simply pushing clauses down unchanged could
change the results)

Tom would probably be willing to relax conditions if it could be
proven safe even for the wierd outlying cases with char and varchar
and such.

Evidently the dates of the form '2003-04-01' and such are getting
types promoted properly enough. I don't see anything to "lobby" for
at this point.

The DOMAIN case I mentioned the other day had something odd going on
that LOST the type information associated with the domain. Albeit
that was on 7.3, whereas the changes in DOMAIN functionality that make
them meaningfully useful come in 7.4...
--
let name="cbbrowne" and tld="libertyrms.info" in name ^ "@" ^ tld;;
<http://dev6.int.libertyrms.com/&gt;
Christopher Browne
(416) 646 3304 x124 (land)

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Chris Browne (#6)
Re: Views With Unions

Christopher Browne <cbbrowne@libertyrms.info> writes:

The DOMAIN case I mentioned the other day had something odd going on
that LOST the type information associated with the domain. Albeit
that was on 7.3, whereas the changes in DOMAIN functionality that make
them meaningfully useful come in 7.4...

Domains were a work-in-progress in 7.3, and to some extent still are.
Please try to test out 7.4beta and let us know about deficiencies you
find.

regards, tom lane

#8Francisco J Reyes
fran@natserv.net
In reply to: Tom Lane (#7)
Domains (Was [PERFORM] Views With Unions)

On Fri, 1 Aug 2003, Tom Lane wrote:

Domains were a work-in-progress in 7.3, and to some extent still are.
Please try to test out 7.4beta and let us know about deficiencies you
find.

Are domains user defined types? That they seem to be based on what I see
on the docs.

Any drawbacks to using them?

Right now I have a new database I am making and wanted some consistency
accros some tables. Currently I used inheritance to enforce the consitency
since a good number of fields needed to be common among the tables AND the
inheritted tables are basically a supperset of the data, so some times I
would want to access the inheritted tables and other times the parent/main
table.

#9Ron Johnson
ron.l.johnson@cox.net
In reply to: Francisco J Reyes (#8)
Re: Domains (Was [PERFORM] Views With Unions)

On Fri, 2003-08-01 at 12:26, Francisco J Reyes wrote:

On Fri, 1 Aug 2003, Tom Lane wrote:

[snip]

accros some tables. Currently I used inheritance to enforce the consitency
since a good number of fields needed to be common among the tables AND the
inheritted tables are basically a supperset of the data, so some times I
would want to access the inheritted tables and other times the parent/main
table.

Isn't this when you'd really want child tables, instead?

-- 
+-----------------------------------------------------------------+
| Ron Johnson, Jr.        Home: ron.l.johnson@cox.net             |
| Jefferson, LA  USA                                              |
|                                                                 |
| "I'm not a vegetarian because I love animals, I'm a vegetarian  |
|  because I hate vegetables!"                                    |
|    unknown                                                      |
+-----------------------------------------------------------------+
#10Jochem van Dieten
jochemd@oli.tudelft.nl
In reply to: Francisco J Reyes (#8)
Re: Domains (Was [PERFORM] Views With Unions)

Francisco J Reyes wrote:

Are domains user defined types? That they seem to be based on what I see
on the docs.

They are similar a bit to user defined types, but there are some
important differences.

First of all, domains are based on other datatypes, so you would
have to have an existing datatype that is a sufficiently close match.
Secondly, domains can be mixed directly with the datatype they
are based on and other domains that are based on that datatype.
User defined types require you to first cast them (which in turn
implies a casting function needs to be defined) to the other/a
common type before you can mix them.

Overall, I tend to see them more as macro's then as real
datatypes. Typically I use them to verify that data is properly
formatted (like the phone number example).

Please note that the SQL:1999 concept of user defined types is
quite different from the one in PostgreSQL (in his book "Advanced
SQL:1999" Jim Melton even warns that domains may be deprecated in
future SQL standards in favour of SQL-style user defined types).

Jochem

#11Francisco J Reyes
fran@natserv.net
In reply to: Ron Johnson (#9)
Inheritance vs child tables (Was Domains)

On Fri, 1 Aug 2003, Ron Johnson wrote:

On Fri, 2003-08-01 at 12:26, Francisco J Reyes wrote:

On Fri, 1 Aug 2003, Tom Lane wrote:

[snip]

Currently I used inheritance to enforce the consitency
since a good number of fields needed to be common among the tables AND the
inheritted tables are basically a supperset of the data, so some times I
would want to access the inheritted tables and other times the parent/main
table.

Isn't this when you'd really want child tables, instead?

I think both ways can accomplish the same (if not very simmilar
functionality), however I find using inherittance easier.
Not really sure about efficiency though.

A simple example of the type of design I am planning to do would be:

Table A
Userid
date entered
last changed

Table B inherited from A(additional fields)
person name
birthday

Table C inherited from A(additional fields)
book
isbn
comment

I plan to keep track of how many records a user has so with inherittance
it's easy to do this. I can count for the user in Table A and find out how
many records he/she has or I can count in each of the inheritted tables
and see how many there are for that particular table.

Inheritance makes it easier to see everything for a userid or just a
particular type of records.

#12Francisco J Reyes
fran@natserv.net
In reply to: Jochem van Dieten (#10)
Re: Domains (Was [PERFORM] Views With Unions)

On Sat, 2 Aug 2003, Jochem van Dieten wrote:

Francisco J Reyes wrote:

Are domains user defined types? That they seem to be based on what I see
on the docs.

They are similar a bit to user defined types,

......

Please note that the SQL:1999 concept of user defined types is
quite different from the one in PostgreSQL (in his book "Advanced
SQL:1999" Jim Melton even warns that domains may be deprecated in
future SQL standards in favour of SQL-style user defined types).

Thanks for the explanation and the warning. Based on your feedback, don't
quite see how domains would be usefull to me so I will not use them for
now.

#13Ron Johnson
ron.l.johnson@cox.net
In reply to: Francisco J Reyes (#11)
Re: Inheritance vs child tables (Was Domains)

On Sat, 2003-08-02 at 13:22, Francisco J Reyes wrote:

On Fri, 1 Aug 2003, Ron Johnson wrote:

On Fri, 2003-08-01 at 12:26, Francisco J Reyes wrote:

On Fri, 1 Aug 2003, Tom Lane wrote:

[snip]

Currently I used inheritance to enforce the consitency
since a good number of fields needed to be common among the tables AND the
inheritted tables are basically a supperset of the data, so some times I
would want to access the inheritted tables and other times the parent/main
table.

Isn't this when you'd really want child tables, instead?

I think both ways can accomplish the same (if not very simmilar
functionality), however I find using inherittance easier.
Not really sure about efficiency though.

A simple example of the type of design I am planning to do would be:

Table A
Userid
date entered
last changed

Table B inherited from A(additional fields)
person name
birthday

Table C inherited from A(additional fields)
book
isbn
comment

I plan to keep track of how many records a user has so with inherittance
it's easy to do this. I can count for the user in Table A and find out how
many records he/she has or I can count in each of the inheritted tables
and see how many there are for that particular table.

Inheritance makes it easier to see everything for a userid or just a
particular type of records.

But isn't this what LEFT OUTER JOIN is for?

Attached is a zip of the sql and results of what I mean.

Plain inner joins or LOJ with "WHERE {B|C}.whatever IS NOT NULL"
also pare things dawn.

Of course, just yesterday, in a post on -general or -performance,
I read that LEFT OUTER JOIN isn't particularly efficient in PG.

Also, wouldn't it be odd to have a userid without a name? So,
why isn't table_b combined with table_a? But all circumstances
are different, I guess...

-- 
+-----------------------------------------------------------------+
| Ron Johnson, Jr.        Home: ron.l.johnson@cox.net             |
| Jefferson, LA  USA                                              |
|                                                                 |
| "I'm not a vegetarian because I love animals, I'm a vegetarian  |
|  because I hate vegetables!"                                    |
|    unknown                                                      |
+-----------------------------------------------------------------+

Attachments:

oj_example.zipapplication/zip; name=oj_example.zipDownload
#14Ron Johnson
ron.l.johnson@cox.net
In reply to: Jochem van Dieten (#10)
Re: Domains (Was [PERFORM] Views With Unions)

On Sat, 2003-08-02 at 11:29, Jochem van Dieten wrote:

Francisco J Reyes wrote:

Are domains user defined types? That they seem to be based on what I see
on the docs.

They are similar a bit to user defined types, but there are some
important differences.

First of all, domains are based on other datatypes, so you would
have to have an existing datatype that is a sufficiently close match.
Secondly, domains can be mixed directly with the datatype they
are based on and other domains that are based on that datatype.
User defined types require you to first cast them (which in turn
implies a casting function needs to be defined) to the other/a
common type before you can mix them.

Overall, I tend to see them more as macro's then as real
datatypes. Typically I use them to verify that data is properly
formatted (like the phone number example).

They are also (primarily?) used on other DBMSs to "prove correctness".
Thus, for example, everywhere you see a field of type TXNID_DOM,
you know that it is a "transaction id", no matter what the field
name is.

It's also theoretically possible to make type modifications simpler.
Say you have domain TXNID_DOM of type INTEGER, and you are running
up against the 2,100,000,000 threshold:
ALTER DOMAIN TXNID_DOM BIGINT;

Now, you have 10^9 more transaction ids. Of course, the down side
is that *every* table with a field of type TXNID_DOM is modified,
and that might take a *REALLY*LONG*TIME*...

-- 
+-----------------------------------------------------------------+
| Ron Johnson, Jr.        Home: ron.l.johnson@cox.net             |
| Jefferson, LA  USA                                              |
|                                                                 |
| "I'm not a vegetarian because I love animals, I'm a vegetarian  |
|  because I hate vegetables!"                                    |
|    unknown                                                      |
+-----------------------------------------------------------------+
#15Francisco J Reyes
fran@natserv.net
In reply to: Ron Johnson (#13)
Re: Inheritance vs child tables (Was Domains)

On Sat, 2 Aug 2003, Ron Johnson wrote:

Inheritance makes it easier to see everything for a userid or just a
particular type of records.

But isn't this what LEFT OUTER JOIN is for?

Yes but the more tables you have the more cumbersome it would become to do
with outer joins.
Imagine a parent table and 20 children tables. To get a count of all
records the user has I either have to do a nasty/ugly union or do 20
counts and then add them (ie doing the separate counts and keeping
track of them with a language like PHP)

Of course, just yesterday, in a post on -general or -performance,
I read that LEFT OUTER JOIN isn't particularly efficient in PG.

And it's probably worse when many tables are involved.

Also, wouldn't it be odd to have a userid without a name? So,
why isn't table_b combined with table_a?

I have a separate table with user information.
The main reason I thought of inherittance was because I need to do
accounting and keep track of how many records a user has for certain type
of data or in total. Inheritance makes this really easy.

Table A, B and C are not combined because B, C and onward have totally
different type of data and they are not one to one.

There are times when children tables make more sense like:

*person table
-person id
-name
-address

*phones
-person id
-phone type (ie fax, home, work)
-area
-phone

*emails
-person id
-email type (home, work)
-email

In my opinion a case like that is best handled with children tables.
Specially if there are only a couple of childre tables.

On my case I have about 8 inherited tables and what I believe inheritance
does for me is:
* Easy way to count both a grand total or a table per inherited table.
* Easy to work with each inheritted table, which will be very often.
* Much simpler queries/reporting