Non-text mode for pg_dumpall
Tom and Nathan opined recently that providing for non-text mode for
pg_dumpall would be a Good Thing (TM). Not having it has been a
long-standing complaint, so I've decided to give it a go.
I think we would need to restrict it to directory mode, at least to
begin with. I would have a toc.dat with a different magic block (say
"PGGLO" instead of "PGDMP") containing the global entries (roles,
tablespaces, databases). Then for each database there would be a
subdirectory (named for its toc entry) with a standard directory mode
dump for that database. These could be generated in parallel (possibly
by pg_dumpall calling pg_dump for each database). pg_restore on
detecting a global-type toc.dat would restore the globals and then each
of the databases (again possibly in parallel).
I'm sure there are many wrinkles I haven't thought of, but I don't see
any insurmountable obstacles, just a significant amount of code.
Barring the unforeseen, my aim is to have a preliminary patch by the
September CF.
Following that I would turn my attention to using it in pg_upgrade.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
On Mon, Jun 10, 2024 at 08:58:49AM -0400, Andrew Dunstan wrote:
Tom and Nathan opined recently that providing for non-text mode for
pg_dumpall would be a Good Thing (TM). Not having it has been a
long-standing complaint, so I've decided to give it a go.
Thank you!
I'm curious why we couldn't also support the "custom" format.
Following that I would turn my attention to using it in pg_upgrade.
+1
--
nathan
On 2024-06-10 Mo 10:14, Nathan Bossart wrote:
I'm curious why we couldn't also support the "custom" format.
We could, but the housekeeping would be a bit harder. We'd need to keep
pointers to the offsets of the per-database TOCs (I don't want to have a
single per-cluster TOC). And we can't produce it in parallel, so I'd
rather start with something we can produce in parallel.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
On Mon, Jun 10, 2024 at 4:14 PM Nathan Bossart <nathandbossart@gmail.com>
wrote:
On Mon, Jun 10, 2024 at 08:58:49AM -0400, Andrew Dunstan wrote:
Not having it has been a long-standing complaint, so I've decided to give it a go.
Indeed, this has been quite annoying!
I'm curious why we couldn't also support the "custom" format.
Or maybe even a combo - a directory of custom format files? Plus that one
special file being globals? I'd say that's what most use cases I've seen
would prefer.
--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
On Mon, Jun 10, 2024 at 10:51:42AM -0400, Andrew Dunstan wrote:
On 2024-06-10 Mo 10:14, Nathan Bossart wrote:
I'm curious why we couldn't also support the "custom" format.
We could, but the housekeeping would be a bit harder. We'd need to keep
pointers to the offsets of the per-database TOCs (I don't want to have a
single per-cluster TOC). And we can't produce it in parallel, so I'd rather
start with something we can produce in parallel.
Got it.
--
nathan
On Mon, Jun 10, 2024 at 04:52:06PM +0200, Magnus Hagander wrote:
On Mon, Jun 10, 2024 at 4:14 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
I'm curious why we couldn't also support the "custom" format.
Or maybe even a combo - a directory of custom format files? Plus that one
special file being globals? I'd say that's what most use cases I've seen
would prefer.
Is there a particular advantage to that approach as opposed to just using
"directory" mode for everything? I know pg_upgrade uses "custom" mode for
each of the databases, so a combo approach would be a closer match to the
existing behavior, but that doesn't strike me as an especially strong
reason to keep doing it that way.
--
nathan
On Mon, Jun 10, 2024 at 5:03 PM Nathan Bossart <nathandbossart@gmail.com>
wrote:
Is there a particular advantage to that approach as opposed to just using
"directory" mode for everything? I know pg_upgrade uses "custom" mode for
each of the databases, so a combo approach would be a closer match to the
existing behavior, but that doesn't strike me as an especially strong
reason to keep doing it that way.
A gazillion files to deal with? Much easier to work with individual custom
files if you're moving databases around and things like that.
Much easier to monitor eg sizes/dates if you're using it for backups.
It's not things that are make-it-or-break-it or anything, but there are
some smaller things that definitely can be useful.
--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
On Mon, Jun 10, 2024 at 05:45:19PM +0200, Magnus Hagander wrote:
A gazillion files to deal with? Much easier to work with individual custom
files if you're moving databases around and things like that.
Much easier to monitor eg sizes/dates if you're using it for backups.
It's not things that are make-it-or-break-it or anything, but there are
some smaller things that definitely can be useful.
Makes sense, thanks for elaborating.
--
nathan
Magnus Hagander <magnus@hagander.net> writes:
On Mon, Jun 10, 2024 at 5:03 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
Is there a particular advantage to that approach as opposed to just using
"directory" mode for everything?
A gazillion files to deal with? Much easier to work with individual custom
files if you're moving databases around and things like that.
Much easier to monitor eg sizes/dates if you're using it for backups.
You can always tar up the directory tree after-the-fact if you want
one file. Sure, that step's not parallelized, but I think we'd need
some non-parallelized copying to create such a file anyway.
regards, tom lane
On 2024-06-10 Mo 12:21, Tom Lane wrote:
You can always tar up the directory tree after-the-fact if you want
one file. Sure, that step's not parallelized, but I think we'd need
some non-parallelized copying to create such a file anyway.
Yeah.
I think I can probably allow for Magnus' suggestion fairly easily, but
if I have to choose I'm going to go for the format that can be produced
with the maximum parallelism.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
On Mon, Jun 10, 2024 at 6:21 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
You can always tar up the directory tree after-the-fact if you want
one file. Sure, that step's not parallelized, but I think we'd need
some non-parallelized copying to create such a file anyway.
That would require double the disk space.
But you can also just run pg_dump manually on each database and a
pg_dumpall -g like people are doing today -- I thought this whole thing was
about making it more convenient :)
--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
Hi all,
With the help of Andrew and Dilip Kumar, I made a poc patch to dump all the
databases in archive format and then restore them using pg_restore.
Brief about the patch:
new option to pg_dumpall:
-F, --format=d|p (directory|plain) output file format (directory, plain
text (default))
Ex: ./pg_dumpall --format=directory --file=dumpDirName
dumps are laid out as:
global.dat  ::: global SQL commands in simple plain format
map.dat     ::: "dboid dbname" entries for all databases, in simple text form
databases/  :::
    subdir dboid1 -> toc.dat and data files in archive format
    subdir dboid2 -> toc.dat and data files in archive format
    etc.
---------------------------------------------------------------------------
new options to pg_restore:
-g, --globals-only restore only global objects, no databases
--exclude-database=PATTERN exclude databases whose name matches PATTERN
When the -g/--globals-only option is given, only the globals are restored,
no databases.
*Design*:
When --format=directory is specified and there is no toc.dat file in the
main directory, then we check for global.dat and map.dat to restore all
databases. If both files exist in the directory, then we first restore all
globals from global.dat and then restore the databases one by one from the
map.dat list.
While restoring, we skip the databases that are given with --exclude-database.
---------------------------------------------------------------------------
NOTE:
If needed, a single database can be restored from its particular subdirectory.
Ex: ./pg_restore --format=directory -d postgres dumpDirName/databases/5
-- here, 5 is the dboid of the postgres db
-- to get the dboid, look up the dbname in map.dat
--------------------------------------------------------------------------
Please let me know feedback for the attached patch.
--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com
Attachments:
v01_poc_pg_dumpall_with_directory_31dec.patch
Here, I am attaching an updated patch. I fixed some bugs in the v01 patch
and also did some code cleanup.
TODO WIP 1: after excluding databases, we have the paths of all the
databases that need to be restored, so we can launch parallel workers for
each database. I am studying this part.
TODO WIP 2: exclude-database=NAME: for pg_restore, I am using NAME as of
now; I will try to make it a PATTERN. The PATTERN should be matched against
the map.dat file.
Please have a look at the patch and let me know your feedback.
On Tue, 31 Dec 2024 at 23:53, Mahendra Singh Thalor <mahi6run@gmail.com>
wrote:
--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com
Attachments:
v02_poc_pg_dumpall_with_directory_2nd_jan.patch
On Thu, Jan 02, 2025 at 02:05:13AM +0530, Mahendra Singh Thalor wrote:
Here, I am attaching an updated patch. I fixed some bugs of v01 patch and
did some code cleanup also.
Thank you for picking this up! I started to review it, but the
documentation changes didn't build, and a few tests in check-world are
failing. Would you mind resolving those issues? Also, if you haven't
already, please add an entry to the next commitfest [0] to ensure that 1)
this feature is tracked and 2) the automated tests will run.
+ if (dbfile)
+ {
+ printfPQExpBuffer(&cmd, "\"%s\" %s %s", pg_dump_bin,
+ dbfile, create_opts);
+ appendPQExpBufferStr(&cmd, " -F d ");
+ }
Have you given any thought to allowing a directory of custom format files,
as discussed upthread [1]? Perhaps that is better handled as a follow-up
patch, but it'd be good to understand the plan, anyway.
[0]: https://commitfest.postgresql.org
[1]: /messages/by-id/CABUevExoQ26jo+aQ9QZq+UMA1aD6gfpm9xBnh_t5e0DhaCeRYA@mail.gmail.com
--
nathan
On Mon, 6 Jan 2025 at 23:05, Nathan Bossart <nathandbossart@gmail.com> wrote:
Thank you for picking this up! I started to review it, but the
documentation changes didn't build, and a few tests in check-world are
failing. Would you mind resolving those issues? Also, if you haven't
already, please add an entry to the next commitfest [0] to ensure that 1)
this feature is tracked and 2) the automated tests will run.
Thanks Nathan for the quick response.
I fixed the documentation build and check-world failures in the latest
patch. Now the docs build and check-world passes.
I added a commitfest entry for this patch: https://commitfest.postgresql.org/52/5495/
+ if (dbfile)
+ {
+     printfPQExpBuffer(&cmd, "\"%s\" %s %s", pg_dump_bin,
+                       dbfile, create_opts);
+     appendPQExpBufferStr(&cmd, " -F d ");
+ }
Have you given any thought to allowing a directory of custom format files,
as discussed upthread [1]? Perhaps that is better handled as a follow-up
patch, but it'd be good to understand the plan, anyway.
I will make these changes and test them; I will share my findings after
doing some testing.
Apart from these bugs, I added code to handle --exclude-database=PATTERN.
Earlier I was using only a NAME to skip databases during restore.
TODO: .pl test cases for the newly added options.
Here, I am attaching an updated patch for review and feedback.
[1]: /messages/by-id/CABUevExoQ26jo+aQ9QZq+UMA1aD6gfpm9xBnh_t5e0DhaCeRYA@mail.gmail.com
--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com
Attachments:
v03-pg_dumpall-with-directory-format-and-restore-08_jan.patch
Hi all,
On Wed, 8 Jan 2025 at 00:34, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:
I will make these changes and will test. I will update my findings after
doing some testing.
In the latest patch, I added dump and restore support for the
directory/custom/tar/plain formats. Please consider this patch for review
and testing.
*Design*:
When we give --format=d|c|t, we dump all global SQL commands into
global.dat in plain SQL format, and we write a map.dat file with dbname and
dboid. For each database, we create a separate subdirectory, named by its
dboid, under the databases directory, and dump it in the given archive
format (d|c|t).
While restoring, we first restore all global SQL commands from global.dat
and then restore the databases one by one. As we support --exclude-database
with pg_dumpall, we support the same with pg_restore as well, to skip
restoring databases that match the specified patterns.
If we want to restore a single database, then we can specify its particular
subdirectory from the databases folder. To find it, we look up the dbname
in map.dat.
*TODO*: Now I will work on test cases for these newly added options to
pg_dumpall and pg_restore.
Here, I am attaching the v04 patch for testing and review.
--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com
Attachments:
v04-pg_dumpall-with-directory-format-and-restore-08_jan.patch
On Wed, 8 Jan 2025 at 20:07, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:
Sorry, my mistake: v04 was a delta patch on top of v03.
Here, I am attaching the v05 patch for testing and review.
--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com
Attachments:
v05_pg_dumpall-with-directory-tar-custom-format-08-jan.patch
Hi,
On Wed, 8 Jan 2025 at 17:41, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:
Here, I am attaching the v05 patch for testing and review.
Just FWIW, I did a quick test tonight. It applies cleanly, compiles OK. I
did a dump:
$ pg_dumpall -Fd -f dir
and then a restore (after dropping the databases I had):
$ pg_restore -Cd postgres -v dir
It worked really well. That's great.
Quick thing to fix: you've got this error message:
pg_restore: error: -d/--dbanme should be given when using archive dump of
pg_dumpall
I guess it is --dbname, rather than --dbanme.
Of course, it needs much more testing, but this feature would be great to
have. Thanks for working on this!
--
Guillaume.
On Thu, 9 Jan 2025 at 02:30, Guillaume Lelarge <guillaume@lelarge.info> wrote:
Hi,
Le mer. 8 janv. 2025 à 17:41, Mahendra Singh Thalor <mahi6run@gmail.com> a écrit :
On Wed, 8 Jan 2025 at 20:07, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:
Hi all,
On Wed, 8 Jan 2025 at 00:34, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:
On Mon, 6 Jan 2025 at 23:05, Nathan Bossart <nathandbossart@gmail.com> wrote:
On Thu, Jan 02, 2025 at 02:05:13AM +0530, Mahendra Singh Thalor wrote:
Here, I am attaching an updated patch. I fixed some bugs of v01 patch and
did some code cleanup also.Thank you for picking this up! I started to review it, but the
documentation changes didn't build, and a few tests in check-world are
failing. Would you mind resolving those issues? Also, if you haven't
already, please add an entry to the next commitfest [0] to ensure that 1)
this feature is tracked and 2) the automated tests will run.Thanks Nathan for the quick response.
I fixed bugs of documentation changes and check-world in the latest patch. Now docs are building and check-world is passing.
I added entry into commitfest for this patch.[0]
+ if (dbfile) + { + printfPQExpBuffer(&cmd, "\"%s\" %s %s", pg_dump_bin, + dbfile, create_opts); + appendPQExpBufferStr(&cmd, " -F d "); + }Have you given any thought to allowing a directory of custom format files,
as discussed upthread [1]? Perhaps that is better handled as a follow-up
patch, but it'd be good to understand the plan, anyway.I will make these changes and will test. I will update my findings after doing some testing.
In the latest patch, I added dump and restoring for directory/custom/tar/plain formats. Please consider this patch for review and testing.
Design:
When we give --format=d|c|t, we dump all global SQL commands into global.dat in plain SQL format, and we write a map.dat file mapping each dbname to its dboid. For each database, we create a separate subdirectory, named for its dboid, under the databases directory, and dump it in the chosen archive format (d|c|t).
While restoring, we first restore all global SQL commands from global.dat and then restore the databases one by one. Since we support --exclude-database with pg_dumpall, we support it with pg_restore as well, to skip restoring databases that match the specified patterns.
If we want to restore a single database, we can specify its particular subdirectory under the databases folder. To get the subdirectory name, we look up the dbname in map.dat.
TODO: Next I will work on test cases for the new options added to pg_dumpall and pg_restore.
Here, I am attaching the v04 patch for testing and review.
Sorry. My mistake.
v04 was a delta patch on top of v03.
Here, I am attaching the v05 patch for testing and review.
Just FWIW, I did a quick test tonight. It applies cleanly, compiles OK. I did a dump:
Thanks for testing and review.
$ pg_dumpall -Fd -f dir
and then a restore (after dropping the databases I had):
$ pg_restore -Cd postgres -v dir
It worked really well. That's great.
Quick thing to fix: you've got this error message:
pg_restore: error: -d/--dbanme should be given when using archive dump of pg_dumpall
I guess it is --dbname, rather than --dbanme.
Fixed.
Of course, it needs much more testing, but this feature would be great to have. Thanks for working on this!
Apart from the above typo, I fixed some review comments that I received
from Andrew in an offline discussion. Thanks Andrew for the quick review.
Here, I am attaching an updated patch for review and testing.
--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com
Attachments:
v06_pg_dumpall-with-directory-tar-custom-format-08-jan.patch (+1124 -46)
On Thu, 9 Jan 2025 at 08:11, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:
Hi all,
Based on some testing (the dump was shared by Andrew; thanks, Andrew), I fixed
some more bugs in the attached patch.
There are some open points for this patch. I will address these points in
follow-up patches as well.
*Point 1*: With pg_dumpall, we have the option --exclude-database=PATTERN, and
there the PATTERN is validated by the server because we have a connection. But
in pg_restore, we may not have a database connection in some cases, so how
should we handle these patterns? Or should we support only plain NAMES for
--exclude-database?
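One option for Point 1, sketched here as an assumption (in Python, not the patch's C), is to match patterns on the client against the database names recorded in map.dat, with no server connection at all. Shell-style fnmatch globbing below is a stand-in; pg_dumpall's patterns are normally expanded by the server with psql-style rules, so the semantics would need to be reconciled.

```python
from fnmatch import fnmatchcase

def is_excluded(dbname, exclude_patterns):
    """Client-side --exclude-database check: True if dbname matches any pattern.

    fnmatch semantics ('*', '?') are an assumption standing in for the
    server-side pattern expansion that pg_dumpall uses when connected.
    """
    return any(fnmatchcase(dbname, pat) for pat in exclude_patterns)
```

Restricting --exclude-database in pg_restore to plain names would sidestep the mismatch entirely, at the cost of symmetry with pg_dumpall.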
*Point 2*:
For each database, we register an entry in the on_exit_nicely array for its AH
(archive handle), but the maximum size of the array is MAX_ON_EXIT_NICELY=20,
so after restoring 20 databases we hit a fatal error. Should my code reset this
array between databases, or do we need to increase the array size?
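The "reset between databases" option for Point 2 can be illustrated like this (a Python sketch of the idea only; in the C code this would mean unregistering or reusing the on_exit_nicely slot after each database's AH is closed, and the helper names here are hypothetical):

```python
# Hypothetical model of an exit-callback list with per-database cleanup.
exit_callbacks = []

def register_exit_callback(fn, arg):
    # In the real code this is on_exit_nicely(), bounded by MAX_ON_EXIT_NICELY.
    exit_callbacks.append((fn, arg))

def unregister_exit_callback(fn, arg):
    # Removing the entry once the database's archive handle is closed keeps
    # the list bounded no matter how many databases are restored.
    exit_callbacks.remove((fn, arg))
```

The alternative, growing the array (or making it dynamically allocated) instead of capping it at 20, trades a little memory for simpler bookkeeping; either way the current fixed cap fails once a cluster has more than 20 databases.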
Here, I am attaching an updated patch for review and testing.
--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com