RFC: Restructuring pg_aggregate
I was originally planning to revise pg_aggregate along the same lines
as pg_proc and so forth: add an aggnamespace column and fix the search
code to be namespace-aware. But it seemed a tad annoying that standard
function lookups would thereby incur *two* namespace-aware searches:
one in pg_aggregate and one in pg_proc. Thinking about that, it
occurred to me that there is another way. Why shouldn't aggregate
functions have entries in pg_proc? Then one search would cover both
possibilities, and we could eliminate some duplicate code in the parser.
Doing things this way would mean that one could not create an aggregate
function with the same name and input arguments as a regular function
(at least, not in the same namespace). However, doing so has always
been a bad idea, and it seems like it'd be a step forward not back for
the system to reject it as a naming conflict.
A more serious objection is that this will break client applications
that know about the pg_aggregate table. However, 7.3 is already going
to force a lot of reprogramming of clients that inspect system tables,
because of the addition of namespaces. Restructuring pg_aggregate
doesn't seem like it makes life all that much worse.
I would envision this working like so:
In pg_proc: add a boolean column "proisagg" to mark function entries
that are aggregates. A row for an aggregate function would contain
a pointer to a dummy C function that would just raise an error if
called (which shouldn't ever happen, but just in case some bit of
code doesn't get updated, this would be a good safety check).
In pg_aggregate: remove the columns aggname, aggowner, aggbasetype,
aggfinaltype, and add a column aggfnoid containing the OID of the
aggregate's pg_proc row. (pg_aggregate itself doesn't need OIDs
anymore, and its only index will be on aggfnoid.) Essentially this
reduces pg_aggregate to an auxiliary extension of pg_proc, carrying
the fields aggtransfn, aggfinalfn, aggtranstype, agginitval for those
pg_proc rows that need them.
An interesting aspect of this is that the catalog structure would now
be prepared to support aggregate functions with more than one input,
which is a feature we've been asked for occasionally. I am *not*
volunteering to make that happen right now ... but the catalog
structures would be ready for it.
Comments, objections, better ideas?
regards, tom lane
A more serious objection is that this will break client applications
that know about the pg_aggregate table. However, 7.3 is already going
to force a lot of reprogramming of clients that inspect system tables,
because of the addition of namespaces. Restructuring pg_aggregate
doesn't seem like it makes life all that much worse.
How about putting a note in the 7.3 release that tells them not to rely on
sequential attnums in tn pg_attribute anymore. That should make it easier
to implement column dropping in the future?
Chris
Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:
How about putting a note in the 7.3 release that tells them not to rely on
sequential attnums in tn pg_attribute anymore. That should make it easier
to implement column dropping in the future?
That seems like pure speculation to me, if not outright wrong. If we
can't renumber the attnums it'll be because the table's tuples still
have data at a particular column position. Since we'll need to know
the datatype of that data (if only to skip over it correctly), there
will still have to be a pg_attribute entry for the dropped column.
Thus, what people will more likely have to watch out for is pg_attribute
rows marked "deleted" in some fashion.
We are actually not that far away from being able to do DROP COLUMN,
if people don't mind being slow to recover the space used by a dropped
column. It'd work like this:
1. Add an "attisdropped" boolean to pg_attribute.
2. DROP COLUMN sets this flag and changes attname to something like
"***deleted_NNN". (Changing attname is only necessary to allow the
same column name to be reused without drawing a unique-index error.)
That's it --- it's done.
3. Column lookup, expansion of *, etc have to be taught to ignore
columns marked attisdropped.
The idea is that the extant data sits there but is invisible. Inserts
of new rows in the table would always insert a NULL in the dropped
column (which'd fall out more or less for free, there being no way
to tell the system to insert anything else). Over time, UPDATEs of
extant rows would also replace the dropped data with NULLs.
I suspect there are only about half a dozen key places that would have
to explicitly check attisdropped. None of the low-level executor
machinery would care at all, since it's dealing with "real" tuples where
the attribute is still there, at least as a NULL.
Hiroshi's "DROP_COLUMN_HACK" was essentially along this line, but
I think he made a representational mistake by trying to change the
attnums of dropped columns to be negative values. That means that
a lot of low-level places *do* need to know about the dropped-column
convention, else they can't make any sense of tuple layouts. The
negative-attnum idea might have been a little easier for clients
inspecting pg_attribute to cope with, but in practice I think they'd
need to be taught about dropped columns anyway --- as evidenced by
your remark suggesting that gaps in the sequence of positive attnums
would break clients.
regards, tom lane
PS: Once you have that, try this on for size: ALTER COLUMN is
ALTER DROP COLUMN;
ALTER ADD COLUMN newtype;
UPDATE foo SET newcol = coercion_fn(oldcol);
That last couldn't be expressed as an SQL statement because the parser
wouldn't allow access to oldcol, but there's nothing stopping it at the
implementation level.
This approach changes the user-visible column ordering, which'd be
a tad annoying, so probably something based on building a new version of
the table would be better. But as a quick hack this would be doable.
Actually, given the DROP feature a user could do it for himself:
ALTER ADD COLUMN tempcol newtype;
UPDATE foo SET tempcol = coercion_fn(oldcol);
ALTER DROP COLUMN oldcol;
ALTER RENAME COLUMN tempcol to oldcol;
which seems like an okay approach, especially since it'd allow the
UPDATE computing the new column values to be of arbitrary complexity,
not just a simple coercion of one existing column.
Tom Lane writes:
Why shouldn't aggregate functions have entries in pg_proc? Then one
search would cover both possibilities, and we could eliminate some
duplicate code in the parser.
Furthermore, we could make the new function privileges apply to aggregates
as well.
--
Peter Eisentraut peter_e@gmx.net
Peter Eisentraut <peter_e@gmx.net> writes:
Tom Lane writes:
Why shouldn't aggregate functions have entries in pg_proc?
Furthermore, we could make the new function privileges apply to aggregates
as well.
Good point. Another thing that would fall out for free is that the
aggregate type-coercion rules would become exactly like the function
type-coercion rules; right now they are a tad stupider.
regards, tom lane
That seems like pure speculation to me, if not outright wrong. If we
can't renumber the attnums it'll be because the table's tuples still
have data at a particular column position. Since we'll need to know
the datatype of that data (if only to skip over it correctly), there
will still have to be a pg_attribute entry for the dropped column.
Thus, what people will more likely have to watch out for is pg_attribute
rows marked "deleted" in some fashion.
You know there is a way to do this and not break client compatibility.
Rename the current pg_attribute relation to pg_baseatt or something. Fix
all references to it in the code. Create a system view called pg_attribute
which is SELECT * (except attisdropped) FROM pg_baseattr WHERE NOT
attisdropped.
More work though, of course.
We are actually not that far away from being able to do DROP COLUMN,
if people don't mind being slow to recover the space used by a dropped
column. It'd work like this:
Logical vs. physical column numbers would still be quite handy tho. If
you're going to break compatibility, may as well do all breaks at once?
1. Add an "attisdropped" boolean to pg_attribute.
2. DROP COLUMN sets this flag and changes attname to something like
"***deleted_NNN". (Changing attname is only necessary to allow the
same column name to be reused without drawing a unique-index error.)
That's it --- it's done.3. Column lookup, expansion of *, etc have to be taught to ignore
columns marked attisdropped.The idea is that the extant data sits there but is invisible. Inserts
of new rows in the table would always insert a NULL in the dropped
column (which'd fall out more or less for free, there being no way
to tell the system to insert anything else). Over time, UPDATEs of
extant rows would also replace the dropped data with NULLs.
Would it be possible to modify VACUUM FULL in some way so as to permanently
remove these tuples? Surely people would like an actual space-saving column
drop?
Chris
"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes:
You know there is a way to do this and not break client compatibility.
Rename the current pg_attribute relation to pg_baseatt or something. Fix
all references to it in the code. Create a system view called pg_attribute
which is SELECT * (except attisdropped) FROM pg_baseattr WHERE NOT
attisdropped.
Wasn't your original concern that the attnums wouldn't be consecutive?
How is this view going to hide that?
Logical vs. physical column numbers would still be quite handy tho.
But confusing as all hell, at *all* levels of the code ... I've thought
about that quite a bit, and I can't see that we could expect to make it
work without a lot of hard-to-find bugs. Too many places where it's
not instantly obvious which set of numbers you should be using.
regards, tom lane
Tom Lane wrote:
Hiroshi's "DROP_COLUMN_HACK" was essentially along this line, but
I think he made a representational mistake by trying to change the
attnums of dropped columns to be negative values.
Negative attnums had 2 advantages then. It had a big
advantage that initdb isn't needed. Note that it was
only a trial hack and there was no consensus on the way.
It was very easy to change the implementation to use
attisdropped. OTOH physical/logical attnums approach
needed the change on pg_class, pg_attribute and so
I've never had a chance to open the patch to public.
It was also more sensitive about oversights of needed
changes than the attisdropped flag approach.
That means that
a lot of low-level places *do* need to know about the dropped-column
convention, else they can't make any sense of tuple layouts.
Why ? As you already mentioned, there were not that many places
to be changed.
Well what's changed since then ?
regards,
Hiroshi Inoue
Hiroshi Inoue <Inoue@tpf.co.jp> writes:
Tom Lane wrote:
That means that
a lot of low-level places *do* need to know about the dropped-column
convention, else they can't make any sense of tuple layouts.
Why ? As you already mentioned, there were not that many places
to be changed.
There are not many places to change if the implementation uses
attisdropped, because we *only* have to hide the existence of the column
at the parser level. The guts of the system don't know anything funny
is going on; a dropped column looks the same as an undropped one
throughout the executor. But with negative attnums, even such basic
routines as heap_formtuple have to know about it, no?
regards, tom lane
Hiroshi Inoue wrote:
Tom Lane wrote:
Hiroshi's "DROP_COLUMN_HACK" was essentially along this line, but
I think he made a representational mistake by trying to change the
attnums of dropped columns to be negative values.Negative attnums had 2 advantages then. It had a big
advantage that initdb isn't needed. Note that it was
only a trial hack and there was no consensus on the way.
It was very easy to change the implementation to use
attisdropped. OTOH physical/logical attnums approach
needed the change on pg_class, pg_attribute and so
I've never had a chance to open the patch to public.
It was also more sensitive about oversights of needed
changes than the attisdropped flag approach.That means that
a lot of low-level places *do* need to know about the dropped-column
convention, else they can't make any sense of tuple layouts.Why ? As you already mentioned, there were not that many places
to be changed.Well what's changed since then ?
Here is an old email from me that outlines the idea of having a
physical/logical attribute numbering system, and the advantages. For
implementation, I thought we could do most of the work by filtering what
the client saw, and let the server just worry about physical numbering,
except for 'SELECT *' expansion.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Attachments:
/bjm/0text/plainDownload
Tom Lane wrote:
Hiroshi Inoue <Inoue@tpf.co.jp> writes:
Tom Lane wrote:
That means that
a lot of low-level places *do* need to know about the dropped-column
convention, else they can't make any sense of tuple layouts.Why ? As you already mentioned, there were not that many places
to be changed.There are not many places to change if the implementation uses
attisdropped, because we *only* have to hide the existence of the column
at the parser level. The guts of the system don't know anything funny
is going on; a dropped column looks the same as an undropped one
throughout the executor. But with negative attnums, even such basic
routines as heap_formtuple have to know about it, no?
When a tuple descriptor is made, the info of
dropped columns are placed at (their physical
position - 1) index in the same way as ordinary
columns. There are few places where conversions
between negative attnums and the physical positions
are needed.
The following is my posting more than 2 years ago.
What's changed since then.
regards,
Hiroshi Inoue
I don't want a final implementation this time.
What I want is to provide a quick hack for both others and me
to judge whether this direction is good or not.
My idea is essentially an invisible column implementation.
DROP COLUMN would change the target pg_attribute tuple
as follows..
attnum -> an offset - attnum;
atttypid -> 0
We would be able to see where to change by tracking error/
crashes caused by this change.
I would also change attname to '*already dropped %d' for
examle to avoid duplicate attname.
Bruce Momjian wrote:
Hiroshi Inoue wrote:
Tom Lane wrote:
Hiroshi's "DROP_COLUMN_HACK" was essentially along this line, but
I think he made a representational mistake by trying to change the
attnums of dropped columns to be negative values.Negative attnums had 2 advantages then. It had a big
advantage that initdb isn't needed. Note that it was
only a trial hack and there was no consensus on the way.
It was very easy to change the implementation to use
attisdropped. OTOH physical/logical attnums approach
needed the change on pg_class, pg_attribute and so
I've never had a chance to open the patch to public.
It was also more sensitive about oversights of needed
changes than the attisdropped flag approach.That means that
a lot of low-level places *do* need to know about the dropped-column
convention, else they can't make any sense of tuple layouts.Why ? As you already mentioned, there were not that many places
to be changed.Well what's changed since then ?
Here is an old email from me that outlines the idea of having a
physical/logical attribute numbering system, and the advantages.
I already tried physical/logical attribute implementation
pretty long ago. Where are new ideas to solve the problems
that the approach has ?
regards,
Hiroshi Inoue
Hiroshi Inoue wrote:
Why ? As you already mentioned, there were not that many places
to be changed.Well what's changed since then ?
Here is an old email from me that outlines the idea of having a
physical/logical attribute numbering system, and the advantages.I already tried physical/logical attribute implementation
pretty long ago. Where are new ideas to solve the problems
that the approach has ?
Good question. I am suggesting more than just the drop column fix. It
could be used for smaller data files to reduce padding, fix for
inheritance problems with ADD COLUMN, and performance of moving
varlena's to the end of the row.
Also, my idea was to have the physical/logical mapping happen closer to
the client, so the backend mostly only deals with physical. I was
thinking of having the libpq backend communication layer actually do the
reordering of the return results.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Bruce Momjian wrote:
Hiroshi Inoue wrote:
Why ? As you already mentioned, there were not that many places
to be changed.Well what's changed since then ?
Here is an old email from me that outlines the idea of having a
physical/logical attribute numbering system, and the advantages.I already tried physical/logical attribute implementation
pretty long ago. Where are new ideas to solve the problems
that the approach has ?Good question. I am suggesting more than just the drop column fix. It
could be used for smaller data files to reduce padding, fix for
inheritance problems with ADD COLUMN, and performance of moving
varlena's to the end of the row.Also, my idea was to have the physical/logical mapping happen closer to
the client, so the backend mostly only deals with physical.
If the client has to bear the some part, isn't the invisible
column approach much simpler ?
I've put a pretty much time into DROP COLUMN feature but
I am really disappointed to see the comments in this thread.
What DROP COLUMN has brought me seems only a waste of time.
Possibly I must have introduced either implementation forcibly.
regards,
Hiroshi Inoue
If the client has to bear the some part, isn't the invisible
column approach much simpler ?I've put a pretty much time into DROP COLUMN feature but
I am really disappointed to see the comments in this thread.
What DROP COLUMN has brought me seems only a waste of time.
I kind of agree with Hiroshi here. All I want to be able to do is drop
columns from my tables, and reclaim the space. I've got all sorts of
production tables with columns just sitting there doing nothing, awaiting
the time that I can happily drop them. It seems to me that whatever we do
will require some kind of client breakage.
Chris
Hiroshi Inoue wrote:
If the client has to bear the some part, isn't the invisible
column approach much simpler ?I've put a pretty much time into DROP COLUMN feature but
I am really disappointed to see the comments in this thread.
What DROP COLUMN has brought me seems only a waste of time.Possibly I must have introduced either implementation forcibly.
I understand. I personally think maybe we have been a little to picky
about patches being accepted. Sometimes when something is not 100%
perfect, we do nothing rather than accept the patch, and replace or
improve it later. The DROP COLUMN approach you had clearly is one of
them.
Personally, now that we have relfilenode, I think we should implement
drop of columns by just recreating the table without the column.
The big problem with DROP COLUMN was that we couldn't decide on what to
do, so we did nothing, which is probably worse than just choosing one
and doing it.
Our code is littered with my 80% solutions for LIKE optimization,
optimizer statistics, BETWEEN, and lots of other solutions that have met
a need and are now being replaced with better code. My code was not
great, but I hadn't dont them, PostgreSQL would have had even more
missing features than we do now. DROP COLUMN is clearly one where we
missed getting something that works and would keep people happy.
As far as my proposal, my idea was not to do it in the client, but
rather to do it just before the data is sent from/to the client. Maybe
that is a stupid idea. I never really researched it. My idea was more
to make the physical/logical column numbers distinct so certain tricks
could be performed. It wasn't for DROP COLUMN specifically, and in fact
to do DROP COLUMN with my code, there would have to be more code similar
to what you had where clients would see a column and have to skip it. I
was focusing more on physical/logical to enable other features.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Christopher Kings-Lynne wrote:
If the client has to bear the some part, isn't the invisible
column approach much simpler ?I've put a pretty much time into DROP COLUMN feature but
I am really disappointed to see the comments in this thread.
What DROP COLUMN has brought me seems only a waste of time.I kind of agree with Hiroshi here. All I want to be able to do is drop
columns from my tables, and reclaim the space. I've got all sorts of
production tables with columns just sitting there doing nothing, awaiting
the time that I can happily drop them. It seems to me that whatever we do
will require some kind of client breakage.
Actually, what we need to do to reclaim space is to enable table
recreation without the column, now that we have relfilenode for file
renaming. It isn't hard to do, but no one has focused on it. I want to
focus on it, but have not had the time, obviously, and would be very
excited to assist someone else.
Hiroshi's fine idea of marking certain columns as unused would not have
reclaimed the missing space, just as my idea of physical/logical column
distinction would not reclaim the space either. Again, my
physical/logical idea is more for fixing other problems and
optimization, not DROP COLUMN.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Actually, what we need to do to reclaim space is to enable table
recreation without the column, now that we have relfilenode for file
renaming. It isn't hard to do, but no one has focused on it. I want to
focus on it, but have not had the time, obviously, and would be very
excited to assist someone else.
I'm happy to help - depends if it's within my skill level or not tho. Most
of the time the problem I have is finding where to make the changes, not
actually making the changes themselves. So, count me in.
Hiroshi's fine idea of marking certain columns as unused would not have
reclaimed the missing space, just as my idea of physical/logical column
distinction would not reclaim the space either. Again, my
physical/logical idea is more for fixing other problems and
optimization, not DROP COLUMN.
Question: Is it _possible_ to reclaim the space during a VACUUM FULL? I do
not know enough about the file format to know this. What happens if the
VACUUM is stopped halfway thru reclaiming a column in a table?
Bruce: WRT modifying libpq to do the translation - won't this cause probs
for JDBC and ODBC people?
Chris
Christopher Kings-Lynne wrote:
Actually, what we need to do to reclaim space is to enable table
recreation without the column, now that we have relfilenode for file
renaming. It isn't hard to do, but no one has focused on it. I want to
focus on it, but have not had the time, obviously, and would be very
excited to assist someone else.I'm happy to help - depends if it's within my skill level or not tho. Most
of the time the problem I have is finding where to make the changes, not
actually making the changes themselves. So, count me in.
OK, let me mention that I have had great success with chat sessions with
PostgreSQL developers. They can code and ask questions and I can answer
quickly. Seems to be speeding things along for some people. I am:
AIM bmomjian
ICQ 151255111
Yahoo bmomjian
MSN root@candle.pha.pa.us
I am also on the PostgreSQL IRC channel. As far as where to start, I
think the CLUSTER command would be a good start because it just reorders
the existing table. Then DROP COLUMN can come out of that by removing
the column during the copy, and removing mention of the column from
pg_attribute, and of course renumbering the gap.
Hiroshi's fine idea of marking certain columns as unused would not have
reclaimed the missing space, just as my idea of physical/logical column
distinction would not reclaim the space either. Again, my
physical/logical idea is more for fixing other problems and
optimization, not DROP COLUMN.Question: Is it _possible_ to reclaim the space during a VACUUM FULL? I do
not know enough about the file format to know this. What happens if the
VACUUM is stopped halfway thru reclaiming a column in a table?
Not really. I moves only whole tuples, and only certain ones.
Bruce: WRT modifying libpq to do the translation - won't this cause probs
for JDBC and ODBC people?
No, not in libpq, but rather in backend/libpq, the backend part of the
connection. My idea is for the user to think things are in a different
order in the row than they actually appear on disk. I haven't really
researched it enough to understand its validity.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Christopher Kings-Lynne wrote:
If the client has to bear the some part, isn't the invisible
column approach much simpler ?I've put a pretty much time into DROP COLUMN feature but
I am really disappointed to see the comments in this thread.
What DROP COLUMN has brought me seems only a waste of time.I kind of agree with Hiroshi here. All I want to be able to do is drop
columns from my tables, and reclaim the space. I've got all sorts of
production tables with columns just sitting there doing nothing, awaiting
the time that I can happily drop them.
It seems to me that whatever we do
will require some kind of client breakage.
Physical/logical attnum approach was mainly to not break
clients. I implemented it on trial but the implementation
was hard to maintain unfortunately. It's pretty difficult
to decide whether the number is physical or logical in
many cases.
regards,
Hiroshi Inoue