patch for parallel pg_dump
So this is the parallel pg_dump patch, generalizing the existing
parallel restore and allowing parallel dumps for the directory archive
format. The patch works on Windows and Unix.
In the first phase of a parallel pg_dump/pg_restore, the catalog
backup/restore is done in a single process, which then forks off worker
processes that are connected to the master process by pipes (on
Windows, the pg_pipe implementation is used). These pipes are only used
for a few commands and status messages. The processes then work on the
items that the master assigns to them; in other words, the worker
processes do not terminate after each item but stay around until the
end of the parallel part of the dump/restore. Once a worker finishes
its current item and sends the status back to the master, it is
assigned the next item, and so forth.
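In rough outline, a worker's main loop looks something like the sketch
below; the command strings ("DUMP n" / "OK n") and the function name
are placeholders for the example, not the patch's actual pipe protocol.

#include <stdio.h>
#include <string.h>
#include <unistd.h>

static void
worker_loop(int cmd_fd, int status_fd)
{
    char        buf[64];
    ssize_t     n;

    /* Stay around until the master closes the command pipe. */
    while ((n = read(cmd_fd, buf, sizeof(buf) - 1)) > 0)
    {
        int         item;

        buf[n] = '\0';
        if (sscanf(buf, "DUMP %d", &item) == 1)
        {
            /* ... dump the item over this worker's own DB connection ... */
            snprintf(buf, sizeof(buf), "OK %d", item);
            (void) write(status_fd, buf, strlen(buf));
        }
    }
}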
In parallel restore, the master closes its own connection to the
database before forking off worker processes, just as it does now. In
parallel dump, however, we need to hold the master's connection open so
that we can detect deadlocks. The issue is that somebody could have
requested an exclusive lock after the master initially requested a
shared lock on all tables. Therefore, each worker process also requests
a shared lock on the table with NOWAIT, and if this fails, we know that
there is a conflicting lock in between and that we need to abort the
dump.
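The worker-side probe amounts to something like this sketch
(simplified; lock_table_or_die is an invented name, and the table name
is assumed to be properly quoted already):

#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>

static void
lock_table_or_die(PGconn *conn, const char *qualified_name)
{
    char        query[256];
    PGresult   *res;

    /* Take the same shared lock the master already holds, but with
     * NOWAIT: if an exclusive-lock request is queued in between, this
     * fails immediately instead of deadlocking. */
    snprintf(query, sizeof(query),
             "LOCK TABLE %s IN ACCESS SHARE MODE NOWAIT", qualified_name);
    res = PQexec(conn, query);
    if (PQresultStatus(res) != PGRES_COMMAND_OK)
    {
        fprintf(stderr, "could not lock %s: %s",
                qualified_name, PQerrorMessage(conn));
        PQclear(res);
        exit(1);
    }
    PQclear(res);
}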
Parallel pg_dump sorts the tables and indexes by their sizes so that
it can start with the largest items first.
The connections of the parallel dump use the synchronized snapshot
feature. However, there's also an option --no-synchronized-snapshots
which can be used to dump from an older PostgreSQL version.
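For reference, the mechanism behind this (available since PostgreSQL
9.2) is pg_export_snapshot() on the master connection and SET
TRANSACTION SNAPSHOT on each worker; a bare-bones sketch without error
handling:

#include <stdio.h>
#include <string.h>
#include <libpq-fe.h>

/* On the master, inside its open transaction: export the snapshot and
 * hand the returned ID to each worker. */
static char *
export_master_snapshot(PGconn *master)
{
    PGresult   *res = PQexec(master, "SELECT pg_export_snapshot()");
    char       *snapshot_id = strdup(PQgetvalue(res, 0, 0));

    PQclear(res);
    return snapshot_id;
}

/* On each worker: adopt the master's snapshot so all connections see
 * the same consistent view of the database. */
static void
import_snapshot(PGconn *worker, const char *snapshot_id)
{
    char        query[128];

    PQclear(PQexec(worker, "BEGIN ISOLATION LEVEL REPEATABLE READ"));
    snprintf(query, sizeof(query),
             "SET TRANSACTION SNAPSHOT '%s'", snapshot_id);
    PQclear(PQexec(worker, query));
}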
I'm also attaching another use case for the parallel backup as a
separate patch: a new archive format that I named "pg_backup_mirror".
It's basically the parallel version of "pg_dump | psql", so it does a
parallel dump and restore of a database from one host to another. The
patch for this is fairly small, but it's still a bit rough and needs
some more work and discussion. Depending on how quickly (or not) we get
done with the review of the main patch, we can then include this one as
well or postpone it.
Joachim
On Sun, Jan 15, 2012 at 1:01 PM, Joachim Wieland <joe@mcknight.de> wrote:
So this is the parallel pg_dump patch, generalizing the existing
parallel restore and allowing parallel dumps for the directory archive
format. The patch works on Windows and Unix.
This patch introduces a large amount of notational churn, as perhaps
well-illustrated by this hunk:
static int
! dumpBlobs(Archive *AHX, void *arg)
{
+ /*
+ * This is a data dumper routine, executed in a child for parallel backup,
+ * so it must not access the global g_conn but AH->connection instead.
+ */
+ ArchiveHandle *AH = (ArchiveHandle *) AHX;
It seems pretty grotty to have a static function that gets an argument
of the wrong type, and then just immediately turns around and casts it
to something else. It's not exactly obvious that that's even safe,
especially if you don't know that ArchiveHandle is a struct whose
first element is an Archive. But even if you do know that subclassing
is intended, that doesn't prove that the particular Archive object is
always going to be an ArchiveHandle under the hood. If it is, why not
just pass it as an ArchiveHandle to begin with? I think this needs to
be revised in some way. At least in the few cases I checked, the only
point of getting at the ArchiveHandle this way was to find
AH->connection, which suggests to me either that AH->connection should
be in the "public" section, inside Archive rather than ArchiveHandle,
or else that we ought to just pass the connection object to this
function (and all of its friends who have similar issues) as a
separate argument. Either way, I think that would make this patch
both cleaner and less invasive. In fact we might want to pull out
just that change and commit it separately to simplify review of the
remaining work.
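To spell out the pattern at issue, it boils down to this (field and
function names abbreviated for illustration, not the actual pg_dump
declarations):

typedef struct Archive
{
    int         remoteVersion;  /* the "public" part, visible to pg_dump.c */
} Archive;

typedef struct ArchiveHandle
{
    Archive     public;         /* must be first, or the casts break */
    void       *connection;     /* "private" state such as the PGconn */
} ArchiveHandle;

static void
dumpSomething(Archive *AHX)
{
    /* Safe only as long as every Archive really is an ArchiveHandle. */
    ArchiveHandle *AH = (ArchiveHandle *) AHX;

    (void) AH->connection;      /* ... use the hidden connection ... */
}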
It's not clear to me why fmtQualifiedId needs to move to dumputils.c.
The way you have it, fmtQualifiedId() is now with fmtId(), but no
longer with fmtCopyColumnList(), the only other similarly named
function in that directory. That may be more logical, or it may not,
but rearranging the code like this makes it a lot harder to review,
and I would prefer that we make such changes as separate commits if
we're going to do them, so that diff can do something sensible with
the changes that are integral to the patch.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Fri, Jan 27, 2012 at 10:57 AM, Robert Haas <robertmhaas@gmail.com> wrote:
Woops, hit send a little too soon there. I'll try to make some more
substantive comments after looking at this more.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Fri, Jan 27, 2012 at 10:58 AM, Robert Haas <robertmhaas@gmail.com> wrote:
It's not clear to me why fmtQualifiedId needs to move to dumputils.c.
The way you have it, fmtQualifiedId() is now with fmtId(), but no
longer with fmtCopyColumnList(), the only other similarly named
function in that directory. That may be more logical, or it may not,
but rearranging the code like this makes it a lot harder to review,
and I would prefer that we make such changes as separate commits if
we're going to do them, so that diff can do something sensible with
the changes that are integral to the patch.

Woops, hit send a little too soon there. I'll try to make some more
substantive comments after looking at this more.
And maybe retract some of the bogus ones, like the above: I see why
you moved this, now - parallel.c needs it.
Still looking...
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Sun, Jan 15, 2012 at 1:01 PM, Joachim Wieland <joe@mcknight.de> wrote:
So this is the parallel pg_dump patch, generalizing the existing
parallel restore and allowing parallel dumps for the directory archive
format. The patch works on Windows and Unix.
It seems a little unfortunate that we are using threads here on
Windows and processes on Linux. Maybe it's too late to revisit that
decision, but it seems like a recipe for subtle bugs.
In parallel restore, the master closes its own connection to the
database before forking off worker processes, just as it does now. In
parallel dump, however, we need to hold the master's connection open so
that we can detect deadlocks. The issue is that somebody could have
requested an exclusive lock after the master initially requested a
shared lock on all tables. Therefore, each worker process also requests
a shared lock on the table with NOWAIT, and if this fails, we know that
there is a conflicting lock in between and that we need to abort the
dump.
I think this is an acceptable limitation, but the window where it can
happen seems awfully wide right now. As things stand, it seems like
we don't try to lock the table in the child until we're about to
access it, which means that, on a large database, we could dump out
99% of the database and then be forced to abort the dump because of a
conflicting lock on the very last table. We could fix that by having
every child lock every table right at the beginning, so that all
possible failures of this type would happen before we do any work, but
that will eat up a lot of lock table space. It would be nice if the
children could somehow piggyback on the parent's locks, but I don't
see any obvious way to make that work. Maybe we just have to live
with it the way it is, but I worry that people whose dumps fail 10
hours into a 12 hour parallel dump are going to be grumpy.
The connections of the parallel dump use the synchronized snapshot
feature. However, there's also an option --no-synchronized-snapshots
which can be used to dump from an older PostgreSQL version.
Right now, you have it set up so that --no-synchronized-snapshots is
ignored even if synchronized snapshots are supported, which doesn't
make much sense to me. I think you should allow
--no-synchronized-snapshots with any release, and error out if it's
not specified and the version is too old to work without it. It's
probably not a good idea to run with --no-synchronized-snapshots ever,
and doubly so if they're not available, but I'd rather leave that
decision to the user. (Imagine, for example, that we discovered a bug
in our synchronized snapshot implementation.)
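In other words, something like this sketch (names invented):

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

static void
check_snapshot_support(int remoteVersion, int numWorkers,
                       bool no_synchronized_snapshots)
{
    /* 90200 = 9.2, the first release with exportable snapshots. */
    if (numWorkers > 1 && remoteVersion < 90200 &&
        !no_synchronized_snapshots)
    {
        fprintf(stderr,
                "this server version does not support synchronized "
                "snapshots; rerun with --no-synchronized-snapshots if "
                "you accept the risk\n");
        exit(1);
    }
}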
I am tempted to advocate calling the flag --unsynchronized-snapshots,
because to me that underscores the danger a little more clearly, but
perhaps that is too clever.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 27.01.2012 18:46, Robert Haas wrote:
On Sun, Jan 15, 2012 at 1:01 PM, Joachim Wieland <joe@mcknight.de> wrote:
In parallel restore, the master closes its own connection to the
database before forking off worker processes, just as it does now. In
parallel dump, however, we need to hold the master's connection open so
that we can detect deadlocks. The issue is that somebody could have
requested an exclusive lock after the master initially requested a
shared lock on all tables. Therefore, each worker process also requests
a shared lock on the table with NOWAIT, and if this fails, we know that
there is a conflicting lock in between and that we need to abort the
dump.

I think this is an acceptable limitation, but the window where it can
happen seems awfully wide right now. As things stand, it seems like
we don't try to lock the table in the child until we're about to
access it, which means that, on a large database, we could dump out
99% of the database and then be forced to abort the dump because of a
conflicting lock on the very last table. We could fix that by having
every child lock every table right at the beginning, so that all
possible failures of this type would happen before we do any work, but
that will eat up a lot of lock table space. It would be nice if the
children could somehow piggyback on the parent's locks, but I don't
see any obvious way to make that work. Maybe we just have to live
with it the way it is, but I worry that people whose dumps fail 10
hours into a 12 hour parallel dump are going to be grumpy.
If the master process keeps the locks it acquires in the beginning, you
could fall back to dumping those tables where the child lock fails using
the master connection.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On Fri, Jan 27, 2012 at 11:53 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
If the master process keeps the locks it acquires in the beginning, you
could fall back to dumping those tables where the child lock fails using the
master connection.
Hmm, that's a thought.
Another idea I just had is to allow a transaction that has imported a
snapshot to jump the queue when attempting to acquire a lock that the
backend from which the snapshot was imported already holds. We don't
want to allow queue-jumping in general because there are numerous
places in the code where we rely on the current behavior to avoid
starving strong lockers, but it seems like it might be reasonable to
allow it in this special case.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Fri, Jan 27, 2012 at 4:57 PM, Robert Haas <robertmhaas@gmail.com> wrote:
But even if you do know that subclassing
is intended, that doesn't prove that the particular Archive object is
always going to be an ArchiveHandle under the hood. If it is, why not
just pass it as an ArchiveHandle to begin with?
I know that you took back some of your comments, but I'm with you
here. Archive is allocated as an ArchiveHandle and then cast back to
Archive*, so you always know that an Archive is an ArchiveHandle. I'm
all for getting rid of Archive and just using ArchiveHandle throughout
pg_dump which would get rid of these useless casts. I admit that I
might have made it a bit worse by adding a few more of these casts but
the fundamental issue was already there and there is precedent for
casting between them in both directions :-)
Joachim
Joachim Wieland <joe@mcknight.de> writes:
I know that you took back some of your comments, but I'm with you
here. Archive is allocated as an ArchiveHandle and then cast back to
Archive*, so you always know that an Archive is an ArchiveHandle. I'm
all for getting rid of Archive and just using ArchiveHandle throughout
pg_dump which would get rid of these useless casts.
I'd like to see a more thoroughgoing look at the basic structure of
pg_dump. Everybody who's ever looked at that code has found it
confusing, with the possible exception of the original author who is
long gone from the project anyway. I don't know exactly what would make
it better, but the useless distinction between Archive and ArchiveHandle
seems like a minor annoyance, not the core disease.
Not that there'd be anything wrong with starting with that.
regards, tom lane
On Sun, Jan 29, 2012 at 12:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Joachim Wieland <joe@mcknight.de> writes:
I know that you took back some of your comments, but I'm with you
here. Archive is allocated as an ArchiveHandle and then cast back to
Archive*, so you always know that an Archive is an ArchiveHandle. I'm
all for getting rid of Archive and just using ArchiveHandle throughout
pg_dump which would get rid of these useless casts.

I'd like to see a more thoroughgoing look at the basic structure of
pg_dump. Everybody who's ever looked at that code has found it
confusing, with the possible exception of the original author who is
long gone from the project anyway. I don't know exactly what would make
it better, but the useless distinction between Archive and ArchiveHandle
seems like a minor annoyance, not the core disease.

Not that there'd be anything wrong with starting with that.
After some study, I'm reluctant to completely abandon the
Archive/ArchiveHandle distinction because it seems to me that it does
serve some purpose: right now, nothing in pg_dump.c - where all the
code to actually dump stuff lives - knows anything about what's inside
the ArchiveHandle, just the Archive. So having two data structures
really does serve a purpose: if it's part of the Archive, you need it
in order to query the system catalogs and generate SQL. If it isn't,
you don't. Considering how much more crap there is inside an
ArchiveHandle than an Archive, I don't think we should lightly abandon
that distinction.
Now, that having been said, there are probably better ways of making
that distinction than what we have here; Archive and ArchiveHandle
could be better named, perhaps, and we could have pointers from one
structure to the other instead of magically embedding them inside each
other. All the function pointers that are part of the ArchiveHandle
could be separated out into a static, constant structure with a name
like ArchiveMethod, and we could keep a pointer to the appropriate
ArchiveMethod inside each ArchiveHandle instead of copying all the
pointers into it. I think that'd be a lot less confusing.
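Sketched out, roughly this (names invented for illustration):

#include <stddef.h>

typedef struct SketchHandle SketchHandle;

typedef struct ArchiveMethod
{
    void        (*WriteData) (SketchHandle *AH, const void *data, size_t len);
    void        (*CloseArchive) (SketchHandle *AH);
} ArchiveMethod;

struct SketchHandle
{
    const ArchiveMethod *methods;   /* one pointer instead of N copied ones */
};

static void dirWriteData(SketchHandle *AH, const void *data, size_t len) {}
static void dirCloseArchive(SketchHandle *AH) {}

/* One static, constant table per archive format, e.g. the directory
 * format; each handle just points at the right table. */
static const ArchiveMethod directoryMethods = {
    dirWriteData,
    dirCloseArchive,
};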
But the immediate problem is that pg_dump.c is heavily reliant on
global variables, which isn't going to fly if we want this code to use
threads on Windows (or anywhere else). It's also bad style. So I
suggest that we refactor pg_dump.c to get rid of g_conn and g_fout.
Getting rid of g_fout should be fairly easy: in many places, we're
already passing Archive *fout as a parameter. If we pass it as a
parameter to the functions that need it but aren't yet getting it as a
parameter, then it can cease to exist as a global variable and become
local to main().
Getting rid of g_conn is a little harder. Joachim's patch takes the
view that we can safely overload the existing ArchiveHandle.connection
member. Currently, that member is the connection to which we are
doing a direct-to-database restore; what he's proposing to do is also
use it to store the connection from which are doing the dump. But it
seems possible that we might someday want to dump from one database
and restore into another database at the same time, so maybe we ought
to play it safe and use different variables for those things. So I'm
inclined to think we could just add a "PGconn *remote_connection"
field to the Archive structure (to go with all of the similarly
named "remote" fields, all of which describe the DB to be dumped) and
then that would be available everywhere that the Archive itself is.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
But the immediate problem is that pg_dump.c is heavily reliant on
global variables, which isn't going to fly if we want this code to use
threads on Windows (or anywhere else). It's also bad style. So I
suggest that we refactor pg_dump.c to get rid of g_conn and g_fout.
I've looked at that a bit in the past and decided that the notational
overhead would be too much. OTOH, if we're going to be forced into it
in order to support parallel pg_dump, we might as well do it first in a
separate patch.
... But it
seems possible that we might someday want to dump from one database
and restore into another database at the same time, so maybe we ought
to play it safe and use different variables for those things. So I'm
inclined to think we could just add a "PGconn *remote_connection"
field to the Archive structure (to go with all of the similarly
named "remote" fields, all of which describe the DB to be dumped) and
then that would be available everywhere that the Archive itself is.
I always thought that the "remote" terminology was singularly unhelpful.
It implies there's a "local" connection floating around somewhere, which
of course there is not, and it does nothing to remind you of whether the
connection leads to a database being dumped or a database being restored
into. If we are going to have two fields, could we name them something
less opaque, perhaps "src_connection" and "dest_connection"?
regards, tom lane
On Mon, Jan 30, 2012 at 12:20 PM, Robert Haas <robertmhaas@gmail.com> wrote:
But the immediate problem is that pg_dump.c is heavily reliant on
global variables, which isn't going to fly if we want this code to use
threads on Windows (or anywhere else). It's also bad style.
Technically, since most of pg_dump.c dumps the catalog and since this
isn't done in parallel but only in the master process, most functions
need not be changed for the parallel dump. Only those that are
called from the worker threads need to be changed; this has been done
in, e.g., dumpBlobs(), the change that you quoted upthread.
But it
seems possible that we might someday want to dump from one database
and restore into another database at the same time, so maybe we ought
to play it safe and use different variables for those things.
Actually I've tried that but in the end concluded that it's best to
have at most one database connection in an ArchiveHandle if you don't
want to do a lot more refactoring. Besides the normal connection
parameters like host, port, ... there's also std_strings, encoding,
savedPassword, currUser/currSchema, lo_buf, remoteVersion ... for which
it wouldn't be obvious where they belong.
Speaking about refactoring, I'm happy to also throw in the idea to
make the dump and restore more symmetrical than they are now. I kinda
disliked RestoreOptions* being a member of the ArchiveHandle without
having something similar for the dump. Ideally I'd say there should be
a DumpOptions and a RestoreOptions field (with a "struct Connection"
being part of them containing all the different connection
parameters). They could be a union if you wanted to allow only one
connection, or not if you want more than one.
On Tue, Jan 31, 2012 at 12:55 AM, Joachim Wieland <joe@mcknight.de> wrote:
On Mon, Jan 30, 2012 at 12:20 PM, Robert Haas <robertmhaas@gmail.com> wrote:
But the immediate problem is that pg_dump.c is heavily reliant on
global variables, which isn't going to fly if we want this code to use
threads on Windows (or anywhere else). It's also bad style.

Technically, since most of pg_dump.c dumps the catalog and since this
isn't done in parallel but only in the master process, most functions
need not be changed for the parallel dump. Only those that are
called from the worker threads need to be changed; this has been done
in, e.g., dumpBlobs(), the change that you quoted upthread.
If we're going to go to the trouble of cleaning this up, I'd prefer to
rationalize it rather than doing just the absolute bare minimum amount
of stuff needed to make it appear to work.
But it
seems possible that we might someday want to dump from one database
and restore into another database at the same time, so maybe we ought
to play it safe and use different variables for those things.

Actually I've tried that but in the end concluded that it's best to
have at most one database connection in an ArchiveHandle if you don't
want to do a lot more refactoring. Besides the normal connection
parameters like host, port, ... there's also std_strings, encoding,
savedPassword, currUser/currSchema, lo_buf, remoteVersion ... for which
it wouldn't be obvious where they belong.
And just for added fun and excitement, they all have inconsistent
naming conventions and inadequate documentation.
I think if we need more refactoring in order to support multiple
database connections, we should go do that refactoring. The current
situation is not serving anyone well.
Speaking about refactoring, I'm happy to also throw in the idea to
make the dump and restore more symmetrical than they are now. I kinda
disliked RestoreOptions* being a member of the ArchiveHandle without
having something similar for the dump. Ideally I'd say there should be
a DumpOptions and a RestoreOptions field (with a "struct Connection"
being part of them containing all the different connection
parameters). They could be a union if you wanted to allow only one
connection, or not if you want more than one.
I like the idea of a struct Connection. I think that could make a lot of sense.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Tue, Jan 31, 2012 at 9:05 AM, Robert Haas <robertmhaas@gmail.com> wrote:
And just for added fun and excitement, they all have inconsistent
naming conventions and inadequate documentation.

I think if we need more refactoring in order to support multiple
database connections, we should go do that refactoring. The current
situation is not serving anyone well.
I guess I'd find it cleaner to have just one connection per Archive
(or ArchiveHandle). If you need two connections, why not just have two
Archive objects, as they would have different characteristics anyway,
one for dumping data, one to restore.
On Tue, Jan 31, 2012 at 4:46 PM, Joachim Wieland <joe@mcknight.de> wrote:
On Tue, Jan 31, 2012 at 9:05 AM, Robert Haas <robertmhaas@gmail.com> wrote:
And just for added fun and excitement, they all have inconsistent
naming conventions and inadequate documentation.

I think if we need more refactoring in order to support multiple
database connections, we should go do that refactoring. The current
situation is not serving anyone well.

I guess I'd find it cleaner to have just one connection per Archive
(or ArchiveHandle). If you need two connections, why not just have two
Archive objects, as they would have different characteristics anyway,
one for dumping data, one to restore.
I think we're more-or-less proposing to rename "Archive" to
"Connection", aren't we?
And then ArchiveHandle can store all the things that aren't related to
a specific connection.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Feb 1, 2012 at 12:24 PM, Robert Haas <robertmhaas@gmail.com> wrote:
I think we're more-or-less proposing to rename "Archive" to
"Connection", aren't we?And then ArchiveHandle can store all the things that aren't related to
a specific connection.
How about something like this:
(Hopefully you'll come up with better names...)
StateHandle {
Connection
Schema
Archive
Methods
union {
DumpOptions
RestoreOptions
}
}
Dumping would mean to do:
Connection -> Schema -> Archive using DumpOptions through the
specified Methods
Restore:
Archive -> Schema -> Connection using RestoreOptions through the
specified Methods
Dumping from one database and restoring into another one would be two
StateHandles with different Connections, Archive == NULL, Schema
pointing to the same Schema, Methods most likely also pointing to the
same function pointer table and each with different Options in the
union of course.
Granted, you could say that above I've only grouped the elements of
the ArchiveHandle, but I don't really see that breaking it up into
several objects makes it any better or easier. There could be some
accessor functions to hide the details of the whole object from the
different consumers.
On Thu, Feb 2, 2012 at 8:31 PM, Joachim Wieland <joe@mcknight.de> wrote:
On Wed, Feb 1, 2012 at 12:24 PM, Robert Haas <robertmhaas@gmail.com> wrote:
I think we're more-or-less proposing to rename "Archive" to
"Connection", aren't we?And then ArchiveHandle can store all the things that aren't related to
a specific connection.How about something like that:
(Hopefully you'll come up with better names...)StateHandle {
Connection
Schema
Archive
Methods
union {
DumpOptions
RestoreOptions
}
}

Dumping would mean to do:
Connection -> Schema -> Archive using DumpOptions through the
specified Methods

Restore:
Archive -> Schema -> Connection using RestoreOptions through the
specified Methods

Dumping from one database and restoring into another one would be two
StateHandles with different Connections, Archive == NULL, Schema
pointing to the same Schema, Methods most likely also pointing to the
same function pointer table and each with different Options in the
union of course.

Granted, you could say that above I've only grouped the elements of
the ArchiveHandle, but I don't really see that breaking it up into
several objects makes it any better or easier. There could be some
accessor functions to hide the details of the whole object from the
different consumers.
I'm not sure I understand what the various structures would contain.
My gut feeling for how to begin grinding through this is to go through
and do the following:
1. Rename Archive to Connection.
2. Add a PGconn object to it.
3. Change the declaration inside ArchiveHandle from "Archive public"
to "Connection source_connection".
I think that'd get us significantly closer to sanity and be pretty
simple to understand, and then we can take additional passes over it
until we're happy with what we've got.
If you're OK with that much I'll go do it.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Fri, Feb 3, 2012 at 7:52 AM, Robert Haas <robertmhaas@gmail.com> wrote:
If you're OK with that much I'll go do it.
Sure, go ahead!
On Fri, Feb 3, 2012 at 10:43 AM, Joachim Wieland <joe@mcknight.de> wrote:
On Fri, Feb 3, 2012 at 7:52 AM, Robert Haas <robertmhaas@gmail.com> wrote:
If you're OK with that much I'll go do it.
Sure, go ahead!
It turns out that (as you anticipated) there are some problems with my
previous proposal. For one thing, an Archive isn't really just a
connection, because it's also used as a data sink by passing it to
functions like ArchiveEntry(). For two things, as you probably
realized but I failed to, ConnectDatabase() is already setting
AH->connection to the same PGconn it returns, so the idea that we can
potentially have multiple connection objects using the existing
framework is not really true; or at least it's going to require more
work than I originally thought. I think it might still be worth doing
at some point, but I think we probably need to clean up some of the
rest of this mess first.
I've now rejiggered things so that the Archive is passed down to
everything that needs it, so the global variable g_fout is gone. I've
also added a couple of additional accessors to pg_backup_db.c so that
most places that issue queries can simply use those routines without
needing to peek under the hood into the ArchiveHandle. This is not
quite enough to get rid of g_conn, but it's close: the major stumbling
block at this point is probably exit_nicely(). The gyrations we're
going through to make sure that AH->connection gets closed before
exiting are fairly annoying; maybe we should invent something in
dumputils.c along the line of the backend's on_shmem_exit().
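Such a hook could look roughly like this (a sketch modeled on the
backend's on_shmem_exit(); the names and the fixed-size callback stack
are illustrative only):

#include <stdio.h>
#include <stdlib.h>

#define MAX_ON_EXIT_NICELY 20

typedef void (*on_exit_nicely_callback) (int code, void *arg);

static struct
{
    on_exit_nicely_callback func;
    void       *arg;
}           on_exit_nicely_list[MAX_ON_EXIT_NICELY];

static int  on_exit_nicely_index;

/* Register a callback, e.g. one that closes AH->connection. */
static void
on_exit_nicely(on_exit_nicely_callback function, void *arg)
{
    if (on_exit_nicely_index >= MAX_ON_EXIT_NICELY)
    {
        fprintf(stderr, "out of on_exit_nicely slots\n");
        exit(1);
    }
    on_exit_nicely_list[on_exit_nicely_index].func = function;
    on_exit_nicely_list[on_exit_nicely_index].arg = arg;
    on_exit_nicely_index++;
}

/* Run the callbacks in reverse registration order, then exit. */
static void
exit_nicely(int code)
{
    while (--on_exit_nicely_index >= 0)
        on_exit_nicely_list[on_exit_nicely_index].func(code,
                    on_exit_nicely_list[on_exit_nicely_index].arg);
    exit(code);
}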
I'm starting to think it might make sense to press on with this
refactoring just a bit further and eliminate the distinction between
Archive and ArchiveHandle. Given a few more accessors (and it really
looks like it would only be a few), pg_dump.c could treat an
ArchiveHandle as an opaque struct, which would accomplish more or less
the same thing as the current design, but in a less confusing fashion
- i.e. without giving the reader the idea that the author desperately
wishes he were coding in C++. That would allow simplification in a
number of other places; just to take one example, we wouldn't need both
appendStringLiteralAH and appendStringLiteralAHX.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
most places that issue queries can simply use those routines without
needing to peek under the hood into the ArchiveHandle. This is not
quite enough to get rid of g_conn, but it's close: the major stumbling
block at this point is probably exit_nicely(). The gyrations we're
going through to make sure that AH->connection gets closed before
exiting are fairly annoying; maybe we should invent something in
dumputils.c along the line of the backend's on_shmem_exit().
Do we actually care about closing the connection? Worst case is that
the backend logs an "unexpected EOF" message. But yeah, an atexit hook
might be the easiest solution.
regards, tom lane