Inheritance

Started by Jan Johanssonover 9 years ago17 messages

jan.johansson.mr@gmail.com

over 9 years ago

Hi,

I've been reading some threads about inheritance, and how complicated it
seems to let child (children) to inherit constraints and indexes
(behaviors). In fact this seems to have been the issue for years, with no
resolution.

However, inheritance is a very good feature, and it would be great to have
it feature complete.

To try to unlock the feature, to be more complete, how about introducing
restriction to inheritance like:

- Allow single (behavior) inheritance (model here is quite a few modern
languages, such as C#, D, ...)
- Allow multiple declarative inheritance (interface like, the inheritance
almost works like this today though)

If, with these restrictions (or maybe only the first), do you think that it
will simplify implementation and make it more feature complete?

Kind regards,
Jan Johansson

Jim Nasby

Jim.Nasby@BlueTreble.com

over 9 years ago

In reply to: Jan Johansson (#1)

Re: Inheritance

On 5/22/16 1:37 AM, Jan Johansson wrote:

- Allow single (behavior) inheritance (model here is quite a few modern
languages, such as C#, D, ...)
- Allow multiple declarative inheritance (interface like, the
inheritance almost works like this today though)

If, with these restrictions (or maybe only the first), do you think that
it will simplify implementation and make it more feature complete?

I think you'll need to be a bit more specific to elicit a response. What
exactly are you proposing to change?
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532) mobile: 512-569-9461

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Merlin Moncure

mmoncure@gmail.com

over 9 years ago

In reply to: Jim Nasby (#2)

Re: Inheritance

On Mon, May 23, 2016 at 10:21 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:

On 5/22/16 1:37 AM, Jan Johansson wrote:

- Allow single (behavior) inheritance (model here is quite a few modern
languages, such as C#, D, ...)
- Allow multiple declarative inheritance (interface like, the
inheritance almost works like this today though)

If, with these restrictions (or maybe only the first), do you think that
it will simplify implementation and make it more feature complete?

I think you'll need to be a bit more specific to elicit a response. What
exactly are you proposing to change?

I would guess OP is complaining about what everyone complains about.
What people want is for tables to have a base set of shared columns by
a varying set of derived type dependent columns. Constraints on the
shared columns will be enforced on all the derived columns.

Postgres doesn't work that way, and the documentation disclaims this:
"Note: Although inheritance is frequently useful, it has not been
integrated with unique constraints or foreign keys, which limits its
usefulness. See Section 5.8 for more detail."

Personally, I don't think this will ever be fixed. The reason why it
doesn't work is due to some foundational implementation decisions that
would have to be revisited.

This feature was very much a product of the time, at the height of the
"Object Relational" fad. The trend for postgres has been in the exact
opposite direction, towards the SQL standard. Further complicating
matters, inheritance has been repurposed to be the foundation for
table partitioning, making heavy changes problematic.

The classic SQL approach to the problem, establishing a base table
plus a type and derived tables with a pkey:pkey link isn't a very bad
one from a data modelling perspective and serialization to the
application is increasingly going to be handled by json going forward
as opposed to hacking the postgres type system. This really reduces
the value proposition of heavy changes to the inheritance features.
If there was consensus on that point, I suppose the way forward is
some documentation restructuring, starting with removing the
increasingly baroque tutorial (trivia: the tutorial was the old manual
at one point).

merlin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Jim Nasby

Jim.Nasby@BlueTreble.com

over 9 years ago

In reply to: Merlin Moncure (#3)

Re: Inheritance

On 5/23/16 11:05 AM, Merlin Moncure wrote:

Postgres doesn't work that way, and the documentation disclaims this:
"Note: Although inheritance is frequently useful, it has not been
integrated with unique constraints or foreign keys, which limits its
usefulness. See Section 5.8 for more detail."

Personally, I don't think this will ever be fixed. The reason why it
doesn't work is due to some foundational implementation decisions that
would have to be revisited.

If the complaint is really about FKs/UNIQUE (and really AFAIK it's only
UNIQUE that's the problem), then I agree: it should be addressed. It's a
major impediment to partitioning (and generic inheritance).

This feature was very much a product of the time, at the height of the
"Object Relational" fad. The trend for postgres has been in the exact
opposite direction, towards the SQL standard. Further complicating
matters, inheritance has been repurposed to be the foundation for
table partitioning, making heavy changes problematic.

I don't see why partitioning complicates fixing these issues. ISTM it's
the exact same complaint for both inheritance and partitioning.

I also disagree about PK:PK FK's between a bunch of completely
independent tables being a good way to model this stuff. It doubles the
complexity of every query against a child table and doesn't perform
nearly as well, because your data locality goes down the tubes.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532) mobile: 512-569-9461

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Tom Lane

tgl@sss.pgh.pa.us

over 9 years ago

In reply to: Jim Nasby (#4)

Re: Inheritance

Jim Nasby <Jim.Nasby@BlueTreble.com> writes:

On 5/23/16 11:05 AM, Merlin Moncure wrote:

This feature was very much a product of the time, at the height of the
"Object Relational" fad. The trend for postgres has been in the exact
opposite direction, towards the SQL standard. Further complicating
matters, inheritance has been repurposed to be the foundation for
table partitioning, making heavy changes problematic.

I don't see why partitioning complicates fixing these issues. ISTM it's
the exact same complaint for both inheritance and partitioning.

My feeling about it is that we need to provide a partitioning feature
that doesn't rely on the current notion of inheritance at all. We've
heard from multiple users who want to use large numbers of partitions,
enough that simply having a separate relcache entry for each partition
would be a performance problem, never mind the current approach to
planning queries over inheritance trees. So the partitions need to be
objects much simpler than full-fledged tables.

If we had that, and encouraged people to migrate simple partitioning
use-cases to it, that might take off enough pressure that we could
afford to consider more-complicated inheritance schemes rather than
treating inheritance as an unfortunate legacy design. But we're
some years away from being able to do that.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Joe Conway

mail@joeconway.com

over 9 years ago

In reply to: Tom Lane (#5)

Re: Inheritance

On 05/23/2016 03:05 PM, Tom Lane wrote:

Jim Nasby <Jim.Nasby@BlueTreble.com> writes:

I don't see why partitioning complicates fixing these issues. ISTM it's
the exact same complaint for both inheritance and partitioning.

My feeling about it is that we need to provide a partitioning feature
that doesn't rely on the current notion of inheritance at all. We've
heard from multiple users who want to use large numbers of partitions,
enough that simply having a separate relcache entry for each partition
would be a performance problem, never mind the current approach to
planning queries over inheritance trees. So the partitions need to be
objects much simpler than full-fledged tables.

I wonder if it wouldn't make sense to define a partition as a list of
segments within a single table that represent the partition?

But then again, maybe we need to start with a clear notion of what
problems people are trying to solve when they use partitions. At least
some of the historic reasons are no longer valid.

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development

Tom Lane

tgl@sss.pgh.pa.us

over 9 years ago

In reply to: Joe Conway (#6)

Re: Inheritance

Joe Conway <mail@joeconway.com> writes:

But then again, maybe we need to start with a clear notion of what
problems people are trying to solve when they use partitions. At least
some of the historic reasons are no longer valid.

That's true. Just because people want to have a gazillion partitions
doesn't necessarily mean it's a good design that we need to support well.
Some investigation would be a smart use of time.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Alvaro Herrera

alvherre@2ndquadrant.com

over 9 years ago

In reply to: Tom Lane (#5)

Re: Inheritance

Tom Lane wrote:

My feeling about it is that we need to provide a partitioning feature
that doesn't rely on the current notion of inheritance at all. We've
heard from multiple users who want to use large numbers of partitions,
enough that simply having a separate relcache entry for each partition
would be a performance problem, never mind the current approach to
planning queries over inheritance trees. So the partitions need to be
objects much simpler than full-fledged tables.

Sorry to hijack the thread, but I agree on this, and I'm worried that
the patch being floated for partitioning may paint us on a corner from
which it may be difficult to get out.

--
ï¿½lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Craig Ringer

craig@2ndquadrant.com

over 9 years ago

In reply to: Merlin Moncure (#3)

Re: Inheritance

On 24 May 2016 at 00:05, Merlin Moncure <mmoncure@gmail.com> wrote:

This feature was very much a product of the time, at the height of the
"Object Relational" fad. The trend for postgres has been in the exact
opposite direction, towards the SQL standard. Further complicating
matters, inheritance has been repurposed to be the foundation for
table partitioning, making heavy changes problematic.

Indeed.

I find it notable that no popular ORM has bothered adopting PostgreSQL's
inheritance features, and instead just use big left joins or repeated
SELECTs to implement parent/child relationships, with foreign keys
enforcing constraints.

I consider inheritance mostly useless without the ability to have UNIQUE
indexes that span a parent relation and all its children. You can use them
for partitioning only by sacrificing a bunch of integrity protection or
creating messy chains of FKs between individual partitions.

I'd rather like to quietly deprecate inheritance and eventually remove it
once we have real partitioning and some time has passed...

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#10

Jim Nasby

Jim.Nasby@BlueTreble.com

over 9 years ago

In reply to: Craig Ringer (#9)

Re: Inheritance

On 5/23/16 10:30 PM, Craig Ringer wrote:

I find it notable that no popular ORM has bothered adopting PostgreSQL's
inheritance features, and instead just use big left joins or repeated
SELECTs to implement parent/child relationships, with foreign keys
enforcing constraints.

Since when do we consider ORMs to be an example of how to do good
database design?

In this case, I'm sure no ORM uses the feature because either the ORM's
authors have no clue it exists (except maybe for partitioning) or
because we're the only mainstream database that has it.

I consider inheritance mostly useless without the ability to have UNIQUE
indexes that span a parent relation and all its children. You can use
them for partitioning only by sacrificing a bunch of integrity
protection or creating messy chains of FKs between individual partitions.

There's ways around this issue, but I agree that it's a very unhappy
situation right now.

I'd rather like to quietly deprecate inheritance and eventually remove
it once we have real partitioning and some time has passed...

IMO that's a very unfortunate attitude to have for the "Worlds Most
Advanced Open Source Database". Now that OO programming isn't treated as
a magic bullet it's proven to be an extremely powerful tool, especially
when used to encapsulate complex data. The
ndarray->Series->DataFrame->Panel inheritance tree is a great example of
this. Feature-wise, we need to be doing *more* of this kind of work, not
less. Lack of support for OO paradigms was one of the drivers for NoSQL.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532) mobile: 512-569-9461

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11

Konstantin Knizhnik

k.knizhnik@postgrespro.ru

over 9 years ago

In reply to: Jim Nasby (#10)

Re: Inheritance

There is one aspect of inheritance support which was not mentioned:
polymorphic queries.
Actually polymorphism is the fundamental feature of OOP, without it
there is no behavioral inheritance and inheritance can be considered
just as "syntax sugar" for sharing some common subset of attributes
between tables.

The main problem with supporting polymorphic queries is that SQL query
returns set of tuples, not set of objects.
So there is no nice way to return both instances of based and derived
tables. There are several alternatives
(for example return joined set of attributes in all derived tables,
leaving missed as NULLs) but none of them is good.

There is more specific problem - polymorphic function calls, see:

http://postgresql.nabble.com/Polymorphic-function-calls-td5784772.html

which in principle can be solved... But looks like there is no such
intention.

I do not know how much people will be interested in normal OO support
from DBMS.
Right now PostgreQSL inheritance is used for different purposes:
partitioning, FDW-based replication or sharding, ...
But IMHO it is mostly because current inheritance implementation just
not make it possible to use it for OOP.
As far as I know before there were a lot of discussions whether arrays,
json and other non-traditional for RDBMS types is needed.
But right now them are quite popular and frequently used. The same thing
can happen with OO support: it will be popular if provide necessary
functionality.

Right now OO DBMS applications are mostly using ORMs and them are not
using PostgreSQL inheritance.
Mostly because ORMs are used to be more flexible in providing different
approaches for implementing inheritance (for example in most cases
placing all derived classes in single table works better than approach
based on appending results of several queries).
Another reason is that ORM usually support more than one DBMS, and since
most of them have no built-in support for inheritance,
ORM has to implement inheritance itself.
But I do not think that presence of ORM excludes necessity to have
internal support of OO in DBMS.

On 24.05.2016 16:51, Jim Nasby wrote:

On 5/23/16 10:30 PM, Craig Ringer wrote:

I find it notable that no popular ORM has bothered adopting PostgreSQL's
inheritance features, and instead just use big left joins or repeated
SELECTs to implement parent/child relationships, with foreign keys
enforcing constraints.

Since when do we consider ORMs to be an example of how to do good
database design?

In this case, I'm sure no ORM uses the feature because either the
ORM's authors have no clue it exists (except maybe for partitioning)
or because we're the only mainstream database that has it.

I consider inheritance mostly useless without the ability to have UNIQUE
indexes that span a parent relation and all its children. You can use
them for partitioning only by sacrificing a bunch of integrity
protection or creating messy chains of FKs between individual
partitions.

There's ways around this issue, but I agree that it's a very unhappy
situation right now.

I'd rather like to quietly deprecate inheritance and eventually remove
it once we have real partitioning and some time has passed...

IMO that's a very unfortunate attitude to have for the "Worlds Most
Advanced Open Source Database". Now that OO programming isn't treated
as a magic bullet it's proven to be an extremely powerful tool,
especially when used to encapsulate complex data. The
ndarray->Series->DataFrame->Panel inheritance tree is a great example
of this. Feature-wise, we need to be doing *more* of this kind of
work, not less. Lack of support for OO paradigms was one of the
drivers for NoSQL.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12

Craig Ringer

craig@2ndquadrant.com

over 9 years ago

In reply to: Jim Nasby (#10)

Re: Inheritance

On 24 May 2016 at 21:51, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:

On 5/23/16 10:30 PM, Craig Ringer wrote:

I find it notable that no popular ORM has bothered adopting PostgreSQL's
inheritance features, and instead just use big left joins or repeated
SELECTs to implement parent/child relationships, with foreign keys
enforcing constraints.

Since when do we consider ORMs to be an example of how to do good database
design?

I don't, but I do consider them a useful measure of what the majority of
users in the field are seeing and doing.

I'd rather like to quietly deprecate inheritance and eventually remove

it once we have real partitioning and some time has passed...

IMO that's a very unfortunate attitude to have for the "Worlds Most
Advanced Open Source Database". Now that OO programming isn't treated as a
magic bullet it's proven to be an extremely powerful tool, especially when
used to encapsulate complex data.

Absolutely, and I use OO heavily. But a half-assed "object relational"
feature in the database that only kind-of works isn't OO, and it's
generally less useful than using existing relational-to-oo modelling
techniques like FK-related 1:1 child tables for specialisation.

I'm so sick of seeing "object relational" in PostgreSQL's tagline. It's
nonsense. We do so much so well, why focus on this bogus half-feature so
much?

The ndarray->Series->DataFrame->Panel inheritance tree is a great example
of this. Feature-wise, we need to be doing *more* of this kind of work, not
less. Lack of support for OO paradigms was one of the drivers for NoSQL.

I agree.

Note that there was enough enthusiasm to adopt whole new database engines,
but not enough to use PostgreSQL's existing features for that. Partly
because they suck. In particular, people looking for this tend to want to
be able to create new subtypes without having to mess around making schema
changes and modelling everything.

There's much more future in improving document-structured storage like
jsonb, and possibly extending in future toward hybrid storage with some
normal cols and some dynamic cols, than with Pg's pseudo-object-relational
inheritance feature.

I really wish we could just remove it. I know it's impossible because it's
used for partitioning and because there's a significant and important,
albeit tiny relative to total users, group who're utterly reliant on it for
their DB schemas. I know of one health engine provider who uses it heavily
for example. It's a pity, because deleting the complexity around
inheritance entirely would simplify a number of useful features. I'd
certainly be in favour of allowing new PostgreSQL features _not_ to support
tables with inheritance, making it a semi-deprecated feature that new
functionality doesn't have to support. (I know that goes against the usual
way of things, and I'm usually really against introducing things where X
works unless you have Y or A enabled or it's a full moon on a Thursday, but
IMO the benefit justifies the cost for making inheritance support optional).

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#13

Craig Ringer

craig@2ndquadrant.com

over 9 years ago

In reply to: Konstantin Knizhnik (#11)

Re: Inheritance

On 24 May 2016 at 22:45, Konstantin Knizhnik <k.knizhnik@postgrespro.ru>
wrote:

There is one aspect of inheritance support which was not mentioned:
polymorphic queries.
Actually polymorphism is the fundamental feature of OOP, without it there
is no behavioral inheritance and inheritance can be considered just as
"syntax sugar" for sharing some common subset of attributes between tables.

The main problem with supporting polymorphic queries is that SQL query
returns set of tuples, not set of objects.
So there is no nice way to return both instances of based and derived
tables. There are several alternatives
(for example return joined set of attributes in all derived tables,
leaving missed as NULLs) but none of them is good.

Exactly. We have a sort-of-object-ish storage option, but none of the
surrounding stuff to make it useful for actual OO / object-relational work.

The "joined set of attributes" approach is exactly what ORMs already do,
and many direct implementations of the same idea will use too. So we'd
offer no advantage over what they already do in a way that works with
multiple DBMSes, except we might be able to do it faster. Maybe.

The lack of polymorphism is critical. It's not really usefully OO but it
costs you important relational features if you use it. We have some very
limited polymorphism in the sense that you can query the parent table and
see rows in child tables, but you only get the subset of cols that exists
at that level of the heirarchy.

One thing I'd like to explore one day is a nice, user-friendly way to
express "SELECT this row and the corresponding sets of rows from [these
tables and their children in turn] as a structured object". Right now users
have to write series of LEFT JOINs and de-duplicate the left-hand sides. Or
do multiple queries (n+1 selects), possibly expensively with repeated join
work involved. Or they have to write pretty baroque queries to construct a
jsonb object with jsonb_agg with multiple levels of group-by in
subqueries-in-from. We should be able to do this for them, so they can say

SELECTOBJECT customer
CHILD JOIN invoice ON (customer.customer_id = invoice.customer_id AND
invoice_date > $1)
CHILD JOIN invoiceline USING (invoice_id)
CHILD JOIN address USING (customer_id)
WHERE customer.in_debt_collections;

instead of the current nested mess of aggregation and subqueries needed,
like:

SELECT
to_jsonb(customer) || (
SELECT jsonb_build_object('invoices', jsonb_agg(invoice_obj))
FROM (
SELECT to_jsonb(invoice) || jsonb_build_object('invoice_lines',
jsonb_agg(invoice_line))
FROM invoice
LEFT OUTER JOIN invoice_line ON (invoice.invoice_id =
invoice_line.invoice_id)
WHERE invoice.customer_id = customer.customer_id AND invoice_date

= current_date

GROUP BY invoice.invoice_id
) invoice_obj
) || (
SELECT jsonb_build_object('addresses', jsonb_agg(address))
FROM address
WHERE address.customer_id = customer.customer_id
)
FROM customer
WHERE customer.in_debt_collections

which is just pure joy to read, and gets even more contorted as the layers
of parent/child relationships get deeper. The resulting query plan clearly
expresses the desired result, but writing it in SQL is horrible:

QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on customer (cost=0.00..130226.27 rows=1270 width=64)
SubPlan 1
-> Aggregate (cost=77.47..77.48 rows=1 width=24)
-> Subquery Scan on invoice_obj (cost=77.37..77.46 rows=3
width=24)
-> HashAggregate (cost=77.37..77.43 rows=3 width=104)
Group Key: invoice.invoice_id
-> Hash Right Join (cost=50.84..77.36 rows=3
width=104)
Hash Cond: (invoice_line.invoice_id =
invoice.invoice_id)
-> Seq Scan on invoice_line
(cost=0.00..22.00 rows=1200 width=68)
-> Hash (cost=50.80..50.80 rows=3 width=40)
-> Seq Scan on invoice
(cost=0.00..50.80 rows=3 width=40)
Filter: ((customer_id =
customer.customer_id) AND (invoice_date >= ('now'::cstring)::date))
SubPlan 2
-> Aggregate (cost=25.02..25.03 rows=1 width=32)
-> Seq Scan on address (cost=0.00..25.00 rows=6 width=32)
Filter: (customer_id = customer.customer_id)
(16 rows)

Maybe grouping sets can help avoid the nested joins, but I couldn't figure
out how without wrapping the grouping set output query in another query to
aggregate the produced objects into a top level one.

Inheritance does nothing to help with this.

But I do not think that presence of ORM excludes necessity to have
internal support of OO in DBMS.

I'd agree if it was a finished and general OO feature. I just don't think
what we have now is.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#14

Merlin Moncure

mmoncure@gmail.com

over 9 years ago

In reply to: Craig Ringer (#13)

Re: Inheritance

On Tue, May 24, 2016 at 9:47 PM, Craig Ringer <craig@2ndquadrant.com> wrote:

On 24 May 2016 at 22:45, Konstantin Knizhnik <k.knizhnik@postgrespro.ru>
wrote:

There is one aspect of inheritance support which was not mentioned:
polymorphic queries.
Actually polymorphism is the fundamental feature of OOP, without it there
is no behavioral inheritance and inheritance can be considered just as
"syntax sugar" for sharing some common subset of attributes between tables.

The main problem with supporting polymorphic queries is that SQL query
returns set of tuples, not set of objects.
So there is no nice way to return both instances of based and derived
tables. There are several alternatives
(for example return joined set of attributes in all derived tables,
leaving missed as NULLs) but none of them is good.

Exactly. We have a sort-of-object-ish storage option, but none of the
surrounding stuff to make it useful for actual OO / object-relational work.

The "joined set of attributes" approach is exactly what ORMs already do, and
many direct implementations of the same idea will use too. So we'd offer no
advantage over what they already do in a way that works with multiple
DBMSes, except we might be able to do it faster. Maybe.

The lack of polymorphism is critical. It's not really usefully OO but it
costs you important relational features if you use it. We have some very
limited polymorphism in the sense that you can query the parent table and
see rows in child tables, but you only get the subset of cols that exists at
that level of the heirarchy.

One thing I'd like to explore one day is a nice, user-friendly way to
express "SELECT this row and the corresponding sets of rows from [these
tables and their children in turn] as a structured object". Right now users
have to write series of LEFT JOINs and de-duplicate the left-hand sides. Or
do multiple queries (n+1 selects), possibly expensively with repeated join
work involved. Or they have to write pretty baroque queries to construct a
jsonb object with jsonb_agg with multiple levels of group-by in
subqueries-in-from. We should be able to do this for them, so they can say

SELECTOBJECT customer
CHILD JOIN invoice ON (customer.customer_id = invoice.customer_id AND
invoice_date > $1)
CHILD JOIN invoiceline USING (invoice_id)
CHILD JOIN address USING (customer_id)
WHERE customer.in_debt_collections;

instead of the current nested mess of aggregation and subqueries needed,
like:

SELECT
to_jsonb(customer) || (
SELECT jsonb_build_object('invoices', jsonb_agg(invoice_obj))
FROM (
SELECT to_jsonb(invoice) || jsonb_build_object('invoice_lines',
jsonb_agg(invoice_line))
FROM invoice
LEFT OUTER JOIN invoice_line ON (invoice.invoice_id =
invoice_line.invoice_id)
WHERE invoice.customer_id = customer.customer_id AND invoice_date >=
current_date
GROUP BY invoice.invoice_id
) invoice_obj
) || (
SELECT jsonb_build_object('addresses', jsonb_agg(address))
FROM address
WHERE address.customer_id = customer.customer_id
)
FROM customer
WHERE customer.in_debt_collections

Well, I don't know. There's a lot of ways to write that type of thing.
Personally, I tend to prefer to delay the serialization to json as
long as possible (although it's sometimes unavoidable) because it
keeps the query cleaner. I also sometimes use the array() subquery
syntax for sake of brevity, but this query could be restructured to
use proper aggregation on all levels if you're concerned about
performance (this query would tend to underperform yours for very
large compositions because of the second subquery scan vs the hash
join OTOH, it's a faster serialization model). I didn't test the
syntax, but you get the idea.

SELECT to_json(q)
FROM
(
SELECT
c.*,
array(
SELECT
i.*,
array(
SELECT il
FROM invoice_line il
WHERE il.invoice_id = i.invoice_id
) AS invoice_lines
FROM invoice i
WHERE i.customer_id = c.customer_id AND invoice_date >= current_date
) AS invoices
FROM customer c
WHERE c.in_debt_collections
) q

The point is this: the postgresql type system is flexible enough that
you can do arbitrary constructions pretty easy and the situation has
been one of continuous improvement over the last several releases. It
isn't perfect, but json enhancements FWICT have made syntactical
approaches to the problem a dead end; json gets the job done is less
likely to cause problems with the SQL standard down the road. For the
same set of reasons I no longer use crosstab.

In the 15+ years I've been watching postgres inheritance has gone
precisely nowhere, and there other ways to do the things it can do
that also supply a much broader range of use cases. Plus, I'm biased:
I happen to think the 90's OO style of inheritance is pretty dumb :-).

merlin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15

Jim Nasby

Jim.Nasby@BlueTreble.com

over 9 years ago

In reply to: Craig Ringer (#12)

Re: Inheritance

On 5/24/16 8:52 PM, Craig Ringer wrote:

Absolutely, and I use OO heavily. But a half-assed "object relational"
feature in the database that only kind-of works isn't OO, and it's
generally less useful than using existing relational-to-oo modelling
techniques like FK-related 1:1 child tables for specialisation.

How is it less useful than that? To me, the FK "solution" is the
absolute worst of everything: you still have all the separate child
tables that you must explicitly query *and* you have to get all the
joins correct as well. And hope it doesn't have horrible performance.

Note that there was enough enthusiasm to adopt whole new database
engines, but not enough to use PostgreSQL's existing features for that.
Partly because they suck. In particular, people looking for this tend to
want to be able to create new subtypes without having to mess around
making schema changes and modelling everything.

Which is a decision people have come to regret, because then your
codebase somehow has to deal with 38 different versions of what a
"customer" is.

There's much more future in improving document-structured storage like
jsonb, and possibly extending in future toward hybrid storage with some
normal cols and some dynamic cols, than with Pg's
pseudo-object-relational inheritance feature.

I don't see why we can't do both. There's many cases where more
flexibility in what output tuples look like would be very valuable. The
JSON features are one aspect; crosstab is another.

Postgres is well past the point where our relational features are the
big selling point. It's now about scale, an incredibly robust storage
engine, and all the extensiblity opportunities. We've moved from being
an RDBMS to being a "Data Platform". Improving our OO capabilities just
continues that.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532) mobile: 512-569-9461

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16

Craig Ringer

craig@2ndquadrant.com

over 9 years ago

In reply to: Jim Nasby (#15)

Re: Inheritance

On 26 May 2016 at 01:56, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:

On 5/24/16 8:52 PM, Craig Ringer wrote:

Absolutely, and I use OO heavily. But a half-assed "object relational"
feature in the database that only kind-of works isn't OO, and it's
generally less useful than using existing relational-to-oo modelling
techniques like FK-related 1:1 child tables for specialisation.

How is it less useful than that? To me, the FK "solution" is the absolute
worst of everything: you still have all the separate child tables that you
must explicitly query *and* you have to get all the joins correct as well.
And hope it doesn't have horrible performance.

Note that there was enough enthusiasm to adopt whole new database

engines, but not enough to use PostgreSQL's existing features for that.
Partly because they suck. In particular, people looking for this tend to
want to be able to create new subtypes without having to mess around
making schema changes and modelling everything.

Which is a decision people have come to regret, because then your codebase
somehow has to deal with 38 different versions of what a "customer" is.

Oh, I totally agree there. It's almost as bad as people serialising Java
objects into the DB. Ugh. It's a bad, bad idea.

It's also what people seem to want to do, and I understand that somewhat
given the pain involved in full table rewrites under extended locks and the
hoop-jumping required to avoid them. It's particularly painful with
something app devs tend to need, but RDBMS designers prefer to ignore:
user-defined columns/attributes. Find me someone who *doesn't* want the
ability for their app users/deployers/etc to add arbitrary attributes to
customer records etc w/o code changes. jsonb helps a lot there, but you
lose Pg's type system and have to use a different query syntax etc.

Lower-pain ways to make schema changes and blend dynamic columns
(application-user-defined columns) with normal columns would help a lot
there. A pseudo-column that can store a TOASTable record extension that's a
set of colname/coltype/coltypmod . Storage is the relatively easy bit
though, the problem is how to work with that though the planner and
executor and output useful results...

There's much more future in improving document-structured storage like

jsonb, and possibly extending in future toward hybrid storage with some
normal cols and some dynamic cols, than with Pg's
pseudo-object-relational inheritance feature.

I don't see why we can't do both. There's many cases where more
flexibility in what output tuples look like would be very valuable. The
JSON features are one aspect; crosstab is another.

Agreed. A real PIVOT, especially one we can use as a source of tuples for
subqueries etc, would be really useful and is the sort of thing we'd need
some degree of dynamic column handling for.

Postgres is well past the point where our relational features are the big
selling point. It's now about scale, an incredibly robust storage engine,
and all the extensiblity opportunities. We've moved from being an RDBMS to
being a "Data Platform". Improving our OO capabilities just continues that.

Right. I'm just not convinced table inheritance is a useful part of that
picture.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#17

Jim Nasby

Jim.Nasby@BlueTreble.com

over 9 years ago

In reply to: Craig Ringer (#16)

Re: Inheritance

On 5/25/16 8:19 PM, Craig Ringer wrote:

Postgres is well past the point where our relational features are
the big selling point. It's now about scale, an incredibly robust
storage engine, and all the extensiblity opportunities. We've moved
from being an RDBMS to being a "Data Platform". Improving our OO
capabilities just continues that.

Right. I'm just not convinced table inheritance is a useful part of that
picture.

In it's current state it's certainly not the best tool in the toolbox,
but IMHO there's still plenty of room for the ability to support
restricted polymorphism, because of all the benefits of not allowing
willy-nilly stuff in the schema. There's certainly other cases (ie:
per-customer custom fields) where willy-nilly is what you actually need.

And certainly things that reduce table modification pain for *any*
operations are most welcome! I think allowing queued backgrounded
processes would be a huge win there. If we had real stored procedures
(IE: ability to control transaction state) and a modest background
scheduler then it wouldn't be hard to build that.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532) mobile: 512-569-9461

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers