YADP - Yet another Dependency Patch

Started by Rod Tayloralmost 24 years ago9 messageshackers
Jump to latest
#1Rod Taylor
rbt@rbt.ca

This one has been corrected to fit in with Toms recent changes, as well
as the changes with command/ restructuring.

Please accept or reject quickly, else risk conflicts.

Attachments:

pg_depend.ctext/x-c; charset=ISO-8859-1; name=pg_depend.cDownload
pg_depend.htext/x-c-header; charset=ISO-8859-1; name=pg_depend.hDownload
depend.patch2text/plain; charset=ISO-8859-1; name=depend.patch2Download+1427-469
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Rod Taylor (#1)
Re: YADP - Yet another Dependency Patch

Rod Taylor <rbt@zort.ca> writes:

This one has been corrected to fit in with Toms recent changes, as well
as the changes with command/ restructuring.
Please accept or reject quickly, else risk conflicts.

This is interesting but certainly far from ready for prime time.

Some random comments, in no particular order:

1. I don't like the code that installs and removes ad-hoc dependencies
from relations to type Oid. On its own terms it's wrong (if it were
right, then you'd also need to be installing dependencies for Tid, Xid,
and the types of the other system columns), but on a larger scale I
don't see the point of expending cycles, disk space, and code complexity
to record these dependencies. The system *cannot* work if you drop type
Oid. So it'd seem to make more sense to wire in a notion that certain
types, tables, etc are "pinned" and may never be dropped; then there's
no need to create explicit dependencies on them. If you want an
explicit representation of pinning in the pg_depends table, perhaps it
would work to create a row claiming that "table 0 / Oid 0 / subid 0"
depends on a pinned object.

2. Is it really necessary to treat pg_depends as a bootstrapped
relation? That adds a lot of complexity, as you've evidently already
found, and it does not seem necessary if you're going to load the system
dependencies in a later step of the initdb process. You can just make
the dependency-adding routines be no-ops in bootstrap mode; then create
pg_depends as an ordinary system catalog; and finally load the entries
post-bootstrap.

3. Isn't there a better way to find the initial dependencies? That
SELECT is truly ugly, and more to the point is highly likely to break
anytime someone rearranges the catalogs. I'd like to see it generated
automatically (maybe using a tool like findoidjoins); or perhaps we
could do the discovery of the columns to look at on-the-fly.

4. Do not use the parser's RESTRICT/CASCADE tokens as enumerated type
values. They change value every time someone tweaks the grammar.
(Yes, I know you copied from extant code; that code is on my hitlist.)
Define your own enum type instead of creating a lot of bogus
dependencies on parser/parser.h.

5. Avoid using heapscans on pg_depend; it's gonna be way too big for
that to give acceptable performance. Make sure you have indexes
available to match your searches, and use the systable_scan routines.

6. The tests on relation names in dependDelete, getObjectName are (a)
slow and (b) not schema-aware. Can you make these into OID comparisons
instead?

7. The namespaceIdGetNspname routine you added is unsafe (it would need
to pstrdup the name to be sure it's still valid after releasing the
syscache entry); but more to the point, it duplicates a routine already
present in lsyscache.c (which is where this sort of utility generally
belongs, anyway).

8. Aggregate code seems unaware that aggfinalfn is optional.

I have to leave, but reserve the right to make more comments later ;-)

In general though, this seems like a cool approach and definitely
worth pursuing. Keep at it!

regards, tom lane

#3Rod Taylor
rbt@rbt.ca
In reply to: Rod Taylor (#1)
Re: [PATCHES] YADP - Yet another Dependency Patch

[ copied to hackers ]

1. I don't like the code that installs and removes ad-hoc

dependencies

from relations to type Oid. On its own terms it's wrong (if it were

...

explicit representation of pinning in the pg_depends table, perhaps

it

would work to create a row claiming that "table 0 / Oid 0 / subid 0"
depends on a pinned object.

Yes, a pinned dependency makes much more sense.

int4, bool, varchar, and name are in the same boat.

I'll make it so dependCreate() will ignore adding any additional
dependencies on pinned types (class 0, Oid 0, SubID 0) and
dependDelete() will never allow deletion when that dependency exists.

2. Is it really necessary to treat pg_depends as a bootstrapped
relation? That adds a lot of complexity, as you've evidently

already

found, and it does not seem necessary if you're going to load the

system

dependencies in a later step of the initdb process. You can just

make

the dependency-adding routines be no-ops in bootstrap mode; then

create

pg_depends as an ordinary system catalog; and finally load the

entries

post-bootstrap.

Ack.. <sound of hand hitting head>. All that work to avoid a simple
if statement.

Ahh well.. learning at it's finest :)

3. Isn't there a better way to find the initial dependencies? That
SELECT is truly ugly, and more to the point is highly likely to

break

anytime someone rearranges the catalogs. I'd like to see it

generated

automatically (maybe using a tool like findoidjoins); or perhaps we
could do the discovery of the columns to look at on-the-fly.

I'm not entirely sure how to approach this, but it does appear that
findoidjoins would find all the relations.

So... I could create a pg_ function which will find all oid joins,
and call dependCreate() for each entry it finds. That way
dependCreate will ignore anything that was pinned (see above)
automagically. It would also make initdb quite slow, and would add a
pg_ function that one should normally avoid during normal production.
Then again, I suppose it could be used to recreate missing
dependencies if a user was manually fiddling with that table.

initdb would call SELECT pg_findSystemDepends(); or something.

4. Do not use the parser's RESTRICT/CASCADE tokens as enumerated

type

values. They change value every time someone tweaks the grammar.
(Yes, I know you copied from extant code; that code is on my

hitlist.)

Define your own enum type instead of creating a lot of bogus
dependencies on parser/parser.h.

All but one of those will go away once the functions are modified to
accept the actual RESTRICT or CASCADE bit. That was going to be step
2 of the process but I suppose I could do it now, along with a rather
large regression test. The only place that RESTRICT will be used is
dependDelete(); Nowhere else will care. It'll simply pass on what
was given to it by the calling function from utility.c or a cascading
dependDelete. Of course, gram.y will be littered with the
'opt_restrictcascade' tag.

The RESTRICT usage is more of a current placeholder. I've marked the
includes as /* FOR RESTRICT */ for that reason, make them easy to
remove later.

6. The tests on relation names in dependDelete, getObjectName are

(a)

slow and (b) not schema-aware. Can you make these into OID

comparisons

instead?

Ahh yes. Good point.

#4Bruce Momjian
bruce@momjian.us
In reply to: Rod Taylor (#3)
Re: [PATCHES] YADP - Yet another Dependency Patch

Rod Taylor wrote:

[ copied to hackers ]

1. I don't like the code that installs and removes ad-hoc

dependencies

from relations to type Oid. On its own terms it's wrong (if it were

Looks good to me. I realize this is a huge chunk of code. The only
ultra-minor thing I saw was the use of dashes for comment blocks when
not needed:

/* ----
* word
* ----
*/

We use dashes like this only for comments that shouldn't be reformatted
by pgindent.

The one thing I would like to know is what things does it track, and
what does it not track? Does it complete any TODO items, or do we save
that for later?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#5Rod Taylor
rbt@rbt.ca
In reply to: Bruce Momjian (#4)
Re: [HACKERS] YADP - Yet another Dependency Patch

Yes, I've removed all of those.

I've submitted a list of stuff it tracked earlier, but will do so
again with the next patch. Basically, anything simple and
straightforward ;)

It doesn't manage dependencies of function code, view internals,
default internals as I don't know how to find navigate an arbitrary
parse tree for that information. Some function code isn't even
available (C), so its next to impossible.

No, it doesn't directly tick off any todo item other than the first in
dependency. Make pg_depend table ;)

--
Rod Taylor

Your eyes are weary from staring at the CRT. You feel sleepy. Notice
how restful it is to watch the cursor blink. Close your eyes. The
opinions stated above are yours. You cannot imagine why you ever felt
otherwise.

----- Original Message -----
From: "Bruce Momjian" <pgman@candle.pha.pa.us>
To: "Rod Taylor" <rbt@zort.ca>
Cc: "Tom Lane" <tgl@sss.pgh.pa.us>; <pgsql-patches@postgresql.org>;
"Hackers List" <pgsql-hackers@postgresql.org>
Sent: Tuesday, April 16, 2002 1:15 AM
Subject: Re: [HACKERS] [PATCHES] YADP - Yet another Dependency Patch

Rod Taylor wrote:

[ copied to hackers ]

1. I don't like the code that installs and removes ad-hoc

dependencies

from relations to type Oid. On its own terms it's wrong (if it

were

Looks good to me. I realize this is a huge chunk of code. The only
ultra-minor thing I saw was the use of dashes for comment blocks

when

not needed:

/* ----
* word
* ----
*/

We use dashes like this only for comments that shouldn't be

reformatted

by pgindent.

The one thing I would like to know is what things does it track, and
what does it not track? Does it complete any TODO items, or do we

save

that for later?

--
Bruce Momjian                        |  http://candle.pha.pa.us
pgman@candle.pha.pa.us               |  (610) 853-3000
+  If your life is a hard drive,     |  830 Blythe Avenue
+  Christ can be your backup.        |  Drexel Hill, Pennsylvania

19026

Show quoted text
#6Bruce Momjian
bruce@momjian.us
In reply to: Rod Taylor (#5)
Re: [HACKERS] YADP - Yet another Dependency Patch

Rod Taylor wrote:

Yes, I've removed all of those.

I've submitted a list of stuff it tracked earlier, but will do so
again with the next patch. Basically, anything simple and
straightforward ;)

Thanks for an updated list of tracked items. I know you posted it
earlier, but some are complaining recently that patches are coming in
too fast or with too long of a gap between discussion and patch arrival,
and they are getting confused trying to track where we are going.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#7Rod Taylor
rbt@rbt.ca
In reply to: Rod Taylor (#1)
Re: [PATCHES] YADP - Yet another Dependency Patch

3. Isn't there a better way to find the initial dependencies? That
SELECT is truly ugly, and more to the point is highly likely to

break

anytime someone rearranges the catalogs. I'd like to see it

generated

automatically (maybe using a tool like findoidjoins); or perhaps we
could do the discovery of the columns to look at on-the-fly.

I'm having a really hard time coming up with a good method for this.

The key problems with doing what findoidjoins does is recording things
like indexes, complex types, etc. as items that should be implicitly
dropped when the table is dropped. Also, in the case of indicies the
work required to 'discover' 2 value foreign keys against pg_attribute
(attnum and attrelid) would be a royal pain in the ass -- especially
when it's in the form of a vector (ugh). There are enough of those to
warrent something other than hardcoding the relations in c code.

It might be possible to create foreign keys for the system tables and
use those to discover the associations. Any foreign key in pg_catalog
made up of an object address (table, oid column) or (table, oid
column, int column) could potentially have each foreign key tuple set
(join tables by key) recorded as a dependency.

So... Any thoughts to adding a no-op foreign key reference? A key
thats used for style purposes only.

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Rod Taylor (#7)
Re: [PATCHES] YADP - Yet another Dependency Patch

"Rod Taylor" <rbt@zort.ca> writes:

3. Isn't there a better way to find the initial dependencies? That
SELECT is truly ugly, and more to the point is highly likely to
break anytime someone rearranges the catalogs.

I'm having a really hard time coming up with a good method for this.

Well, do we actually need an *accurate* representation of the
dependencies? You seemed to like my idea of pinning essential stuff,
and in reality all of the initial catalog structures ought to be pinned.
Maybe it would be sufficient to just make "pinned" entries for
everything that appears in the initial catalogs. Someone who's really
intent on manually deleting, say, the "box" datatype could be expected
to be bright enough to figure out how to remove the pg_depends entry
that's preventing him from doing so.

(There are a very small number of things that are specifically intended
to be droppable, like the "public" namespace, but seems like excluding
that short list from the pg_depends entries would be more maintainable
than the approach you've got now.)

regards, tom lane

#9Rod Taylor
rbt@rbt.ca
In reply to: Rod Taylor (#1)
Re: [PATCHES] YADP - Yet another Dependency Patch

Thats what I was going to propose if no-one could figure out a way of
automatically gathering system table dependencies.

It would be nice (for a minimallist db) to be able to drop a bunch of
stuff, but a number of other things would need to be done as well
(full system compression for example).

--
Rod
----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>
To: "Rod Taylor" <rbt@zort.ca>
Cc: "Hackers List" <pgsql-hackers@postgresql.org>
Sent: Thursday, April 18, 2002 1:24 AM
Subject: Re: [HACKERS] [PATCHES] YADP - Yet another Dependency Patch

"Rod Taylor" <rbt@zort.ca> writes:

3. Isn't there a better way to find the initial dependencies?

That

SELECT is truly ugly, and more to the point is highly likely to
break anytime someone rearranges the catalogs.

I'm having a really hard time coming up with a good method for

this.

Well, do we actually need an *accurate* representation of the
dependencies? You seemed to like my idea of pinning essential

stuff,

and in reality all of the initial catalog structures ought to be

pinned.

Maybe it would be sufficient to just make "pinned" entries for
everything that appears in the initial catalogs. Someone who's

really

intent on manually deleting, say, the "box" datatype could be

expected

to be bright enough to figure out how to remove the pg_depends entry
that's preventing him from doing so.

(There are a very small number of things that are specifically

intended

to be droppable, like the "public" namespace, but seems like

excluding

that short list from the pg_depends entries would be more

maintainable

Show quoted text

than the approach you've got now.)

regards, tom lane