Adding pg_dump flag for parallel export to pipes
Hi Hackers,
We are proposing the ability to specify a pipe command to pg_dump by a
flag. And attaching the patch set.
Why : Currently it is quite simple to pipe the output of pg_dump for
text format to a pipe at command line and do any manipulations
necessary. Following is an example :
pg_dump <flags> <dbname> | lz4 | pv -L 10k | ssh remote.host
"cat - > remote.dump.lz4"
Here we first compress the stream using lz4 and then send it over ssh
to a remote host to be saved as a file while rate-limiting the network
usage to 10KB/s.
Something like this is not possible for format=directory (-Fd) since
all you can provide is the directory name to store the individual
files. Note it is not possible to do this irrespective of the usage of
the parallel dump option ('--jobs' flag).
While the directory format supports compression using a flag, the rest
of the operations in the above example are not possible. And a pipe
command provides more flexibility in what compression algorithm one
wants to use.
This patch set provides pg_dump the ability to pipe the data in the
directory mode by using a new flag '--pipe-command' (in both parallel
and non-parallel mode).
We also add a similar option to pg_restore.
The following can be the major use cases of these changes :
1. Stream pg_dump output to a cloud storage
2. SSH the data to a remote host (with or without throttling)
3. Custom compression options
Usage Examples : Here is an example of how the pipe-command will look like.
pg_dump -Fd mydb --pipe-command="cat > dumpdir/%f" (dumpdir
should exist beforehand.)
This is equivalent to
pg_dump -Fd mydb --file=dumpdir
(Please note that the flags '--file' or '--pipe-command' can't be used
together.)
For the more complex scenario as mentioned above, the command will be
(with the parallelism of 5) :
pg_dump -Fd mydb -j 5 --pipe-command="lz4 | pv -L 10k | ssh
remote.host "cat > dumpdir/%f""
Please note the use of %f in the above examples. As a user would
almost always want to write the post-processing output to a file (or
perhaps a cloud location), we provide a format specifier %f in the
command. The implementation of pipe-command replaces these format
specifiers with the corresponding file names. These file names are the
same as they would be in the current usage of directory format with
'--file' flag (<dump_id>.dat, toc.dat, blob_NNN.toc,
blob_<blob_id>.dat).
The usage of this flag with pg_restore will also be similar. Here is
an example of restoring from a gzip compressed dump directory.
pg_restore -C -Fd -d postgres --pipe-commnad="cat
dumpdir/%f.gz | gunzip"
The new flag in pg_restore also works with '-l' and '-L' options
pg_restore -C -Fd -d postgres --pipe-commnad="cat dumpdir/%f" -L db.list
Implementation Details : Here are the major changes :
1. We reuse the same variables which store the file name to store
the pipe command. And add a new bool fSpecIsPipe in _archiveHandle
(similar bools in pg_dump.c and pg_restore.c) to specify if it's a
pipe command.
2. In the cases when the above bool is set to true, we use popen
and pclose instead of fopen and fclose.
3. To enable the format specifier %f in the pipe-command, we make
changes to the file name creation logic in a few places. Currently the
file name (corresponding to a table or large object) is appended to
the directory name provided by '--file' command. In case of
'--pipe-command', we use 'replace_percent_placeholders' to replace %f
with the corresponding file name. This change is made for both table
files and LO TOC files.
With these core changes, the rest of the code continues working as-is.
We are attaching 4 patches for this change :
001-pg_dump_pipe has the pg_dump pipe support code.
002-pg_restore_pipe has the pg_restore pipe support.
003-pg_dump_basic_tests has a few basic validation tests for
correctmflag combinations. We need to write more automated tests in
002_pg_dump.pl but have been running into some issues with environment
setup due to which certain pipe commands result in the shell process
becoming defunct. These same commands are working fine in manual
testing. We are still looking into this.
004-pg_dump_documentation has the proposed documentation changes.
We are working on the above test issues and cleanup of the patches.
Open Questions : There are a couple of open questions in the implementation :
1. Currently the LO TOC file (blob_NNN.toc) is opened in the
append mode. This is not possible with popen for the pipe command.
From reading the code, it seems to us that this file doesn't need to
be opened in the append mode. As '_StartLOs' is called once per
archive entry in WriteDataChunksForToCEntry followed by the dumper
function and then '_EndLOs', it should be okay to change this to 'w'
mode. But this code has been there since the start so we haven't made
that change yet. In the patch, we have changed it to 'w' pipe-command
only and added the ideas for potential solutions in the comments.
2. We are also not sure yet on how to handle the environment
issues when trying to add new tests to 002_pg_dump.pl.
Please let us know what you think.
Thanks & Regards,
Nitin Motiani
Google
Attachments:
002-pg_restore_pipe_v4.patchapplication/octet-stream; name=002-pg_restore_pipe_v4.patchDownload+64-22
001-pg_dump_pipe_v4.patchapplication/octet-stream; name=001-pg_dump_pipe_v4.patchDownload+234-78
004-pg_dump_documentation_v4.patchapplication/octet-stream; name=004-pg_dump_documentation_v4.patchDownload+126-1
003-pg_dump_basic_tests_v4.patchapplication/octet-stream; name=003-pg_dump_basic_tests_v4.patchDownload+36-0
Just to bring this out separately : Does anybody have any idea why pipe
commands close inside tests ?
Re: 003-pg_dump_basic_tests has a few basic validation tests for
correctmflag combinations. We need to write more automated tests in
002_pg_dump.pl but have been running into some issues with environment
setup due to which certain pipe commands result in the shell process
becoming defunct. These same commands are working fine in manual
testing. We are still looking into this.
----
Hannu
On Mon, Apr 7, 2025 at 7:17 PM Nitin Motiani <nitinmotiani@google.com>
wrote:
Show quoted text
Hi Hackers,
We are proposing the ability to specify a pipe command to pg_dump by a
flag. And attaching the patch set.Why : Currently it is quite simple to pipe the output of pg_dump for
text format to a pipe at command line and do any manipulations
necessary. Following is an example :pg_dump <flags> <dbname> | lz4 | pv -L 10k | ssh remote.host
"cat - > remote.dump.lz4"Here we first compress the stream using lz4 and then send it over ssh
to a remote host to be saved as a file while rate-limiting the network
usage to 10KB/s.Something like this is not possible for format=directory (-Fd) since
all you can provide is the directory name to store the individual
files. Note it is not possible to do this irrespective of the usage of
the parallel dump option ('--jobs' flag).While the directory format supports compression using a flag, the rest
of the operations in the above example are not possible. And a pipe
command provides more flexibility in what compression algorithm one
wants to use.This patch set provides pg_dump the ability to pipe the data in the
directory mode by using a new flag '--pipe-command' (in both parallel
and non-parallel mode).We also add a similar option to pg_restore.
The following can be the major use cases of these changes :
1. Stream pg_dump output to a cloud storage
2. SSH the data to a remote host (with or without throttling)
3. Custom compression optionsUsage Examples : Here is an example of how the pipe-command will look like.
pg_dump -Fd mydb --pipe-command="cat > dumpdir/%f" (dumpdir
should exist beforehand.)This is equivalent to
pg_dump -Fd mydb --file=dumpdir
(Please note that the flags '--file' or '--pipe-command' can't be used
together.)For the more complex scenario as mentioned above, the command will be
(with the parallelism of 5) :pg_dump -Fd mydb -j 5 --pipe-command="lz4 | pv -L 10k | ssh
remote.host "cat > dumpdir/%f""Please note the use of %f in the above examples. As a user would
almost always want to write the post-processing output to a file (or
perhaps a cloud location), we provide a format specifier %f in the
command. The implementation of pipe-command replaces these format
specifiers with the corresponding file names. These file names are the
same as they would be in the current usage of directory format with
'--file' flag (<dump_id>.dat, toc.dat, blob_NNN.toc,
blob_<blob_id>.dat).The usage of this flag with pg_restore will also be similar. Here is
an example of restoring from a gzip compressed dump directory.pg_restore -C -Fd -d postgres --pipe-commnad="cat
dumpdir/%f.gz | gunzip"The new flag in pg_restore also works with '-l' and '-L' options
pg_restore -C -Fd -d postgres --pipe-commnad="cat dumpdir/%f" -L
db.listImplementation Details : Here are the major changes :
1. We reuse the same variables which store the file name to store
the pipe command. And add a new bool fSpecIsPipe in _archiveHandle
(similar bools in pg_dump.c and pg_restore.c) to specify if it's a
pipe command.
2. In the cases when the above bool is set to true, we use popen
and pclose instead of fopen and fclose.
3. To enable the format specifier %f in the pipe-command, we make
changes to the file name creation logic in a few places. Currently the
file name (corresponding to a table or large object) is appended to
the directory name provided by '--file' command. In case of
'--pipe-command', we use 'replace_percent_placeholders' to replace %f
with the corresponding file name. This change is made for both table
files and LO TOC files.With these core changes, the rest of the code continues working as-is.
We are attaching 4 patches for this change :
001-pg_dump_pipe has the pg_dump pipe support code.
002-pg_restore_pipe has the pg_restore pipe support.
003-pg_dump_basic_tests has a few basic validation tests for
correctmflag combinations. We need to write more automated tests in
002_pg_dump.pl but have been running into some issues with environment
setup due to which certain pipe commands result in the shell process
becoming defunct. These same commands are working fine in manual
testing. We are still looking into this.
004-pg_dump_documentation has the proposed documentation changes.We are working on the above test issues and cleanup of the patches.
Open Questions : There are a couple of open questions in the
implementation :1. Currently the LO TOC file (blob_NNN.toc) is opened in the
append mode. This is not possible with popen for the pipe command.
From reading the code, it seems to us that this file doesn't need to
be opened in the append mode. As '_StartLOs' is called once per
archive entry in WriteDataChunksForToCEntry followed by the dumper
function and then '_EndLOs', it should be okay to change this to 'w'
mode. But this code has been there since the start so we haven't made
that change yet. In the patch, we have changed it to 'w' pipe-command
only and added the ideas for potential solutions in the comments.
2. We are also not sure yet on how to handle the environment
issues when trying to add new tests to 002_pg_dump.pl.Please let us know what you think.
Thanks & Regards,
Nitin Motiani
If there are no objections we will add this to the commitfest
Show quoted text
On Mon, Apr 7, 2025 at 9:48 PM Hannu Krosing <hannuk@google.com> wrote:
Just to bring this out separately : Does anybody have any idea why pipe commands close inside tests ?
Re: 003-pg_dump_basic_tests has a few basic validation tests for
correctmflag combinations. We need to write more automated tests in
002_pg_dump.pl but have been running into some issues with environment
setup due to which certain pipe commands result in the shell process
becoming defunct. These same commands are working fine in manual
testing. We are still looking into this.----
HannuOn Mon, Apr 7, 2025 at 7:17 PM Nitin Motiani <nitinmotiani@google.com> wrote:
Hi Hackers,
We are proposing the ability to specify a pipe command to pg_dump by a
flag. And attaching the patch set.Why : Currently it is quite simple to pipe the output of pg_dump for
text format to a pipe at command line and do any manipulations
necessary. Following is an example :pg_dump <flags> <dbname> | lz4 | pv -L 10k | ssh remote.host
"cat - > remote.dump.lz4"Here we first compress the stream using lz4 and then send it over ssh
to a remote host to be saved as a file while rate-limiting the network
usage to 10KB/s.Something like this is not possible for format=directory (-Fd) since
all you can provide is the directory name to store the individual
files. Note it is not possible to do this irrespective of the usage of
the parallel dump option ('--jobs' flag).While the directory format supports compression using a flag, the rest
of the operations in the above example are not possible. And a pipe
command provides more flexibility in what compression algorithm one
wants to use.This patch set provides pg_dump the ability to pipe the data in the
directory mode by using a new flag '--pipe-command' (in both parallel
and non-parallel mode).We also add a similar option to pg_restore.
The following can be the major use cases of these changes :
1. Stream pg_dump output to a cloud storage
2. SSH the data to a remote host (with or without throttling)
3. Custom compression optionsUsage Examples : Here is an example of how the pipe-command will look like.
pg_dump -Fd mydb --pipe-command="cat > dumpdir/%f" (dumpdir
should exist beforehand.)This is equivalent to
pg_dump -Fd mydb --file=dumpdir
(Please note that the flags '--file' or '--pipe-command' can't be used
together.)For the more complex scenario as mentioned above, the command will be
(with the parallelism of 5) :pg_dump -Fd mydb -j 5 --pipe-command="lz4 | pv -L 10k | ssh
remote.host "cat > dumpdir/%f""Please note the use of %f in the above examples. As a user would
almost always want to write the post-processing output to a file (or
perhaps a cloud location), we provide a format specifier %f in the
command. The implementation of pipe-command replaces these format
specifiers with the corresponding file names. These file names are the
same as they would be in the current usage of directory format with
'--file' flag (<dump_id>.dat, toc.dat, blob_NNN.toc,
blob_<blob_id>.dat).The usage of this flag with pg_restore will also be similar. Here is
an example of restoring from a gzip compressed dump directory.pg_restore -C -Fd -d postgres --pipe-commnad="cat
dumpdir/%f.gz | gunzip"The new flag in pg_restore also works with '-l' and '-L' options
pg_restore -C -Fd -d postgres --pipe-commnad="cat dumpdir/%f" -L db.list
Implementation Details : Here are the major changes :
1. We reuse the same variables which store the file name to store
the pipe command. And add a new bool fSpecIsPipe in _archiveHandle
(similar bools in pg_dump.c and pg_restore.c) to specify if it's a
pipe command.
2. In the cases when the above bool is set to true, we use popen
and pclose instead of fopen and fclose.
3. To enable the format specifier %f in the pipe-command, we make
changes to the file name creation logic in a few places. Currently the
file name (corresponding to a table or large object) is appended to
the directory name provided by '--file' command. In case of
'--pipe-command', we use 'replace_percent_placeholders' to replace %f
with the corresponding file name. This change is made for both table
files and LO TOC files.With these core changes, the rest of the code continues working as-is.
We are attaching 4 patches for this change :
001-pg_dump_pipe has the pg_dump pipe support code.
002-pg_restore_pipe has the pg_restore pipe support.
003-pg_dump_basic_tests has a few basic validation tests for
correctmflag combinations. We need to write more automated tests in
002_pg_dump.pl but have been running into some issues with environment
setup due to which certain pipe commands result in the shell process
becoming defunct. These same commands are working fine in manual
testing. We are still looking into this.
004-pg_dump_documentation has the proposed documentation changes.We are working on the above test issues and cleanup of the patches.
Open Questions : There are a couple of open questions in the implementation :
1. Currently the LO TOC file (blob_NNN.toc) is opened in the
append mode. This is not possible with popen for the pipe command.
From reading the code, it seems to us that this file doesn't need to
be opened in the append mode. As '_StartLOs' is called once per
archive entry in WriteDataChunksForToCEntry followed by the dumper
function and then '_EndLOs', it should be okay to change this to 'w'
mode. But this code has been there since the start so we haven't made
that change yet. In the patch, we have changed it to 'w' pipe-command
only and added the ideas for potential solutions in the comments.
2. We are also not sure yet on how to handle the environment
issues when trying to add new tests to 002_pg_dump.pl.Please let us know what you think.
Thanks & Regards,
Nitin Motiani
On Tue, Apr 8, 2025 at 7:48 AM Hannu Krosing <hannuk@google.com> wrote:
Just to bring this out separately : Does anybody have any idea why pipe commands close inside tests ?
Re: 003-pg_dump_basic_tests has a few basic validation tests for
correctmflag combinations. We need to write more automated tests in
002_pg_dump.pl but have been running into some issues with environment
setup due to which certain pipe commands result in the shell process
becoming defunct. These same commands are working fine in manual
testing. We are still looking into this.
No comment on the wider project except that it looks generally useful,
and I can see that it's not possible to use the conventional POSIX
filename "-" to represent stdout, because you need to write to
multiple files so you need to come up with *something* along the lines
you're proposing here. But I was interested in seeing if I could help
with that technical problem you mentioned above, and I don't see that
happening with the current patches. Do I understand correctly that
the problem you encountered is in some other tests that you haven't
attached yet? Could you post what you have so that others can see the
problem and perhaps have a chance of helping? I also recommend using
git format-patch when you post patches so that you have a place to
write a commit message including a note about which bits are WIP and
known not to work correctly yet.
Thanks for the feedback, Thomas.
No comment on the wider project except that it looks generally useful,
and I can see that it's not possible to use the conventional POSIX
filename "-" to represent stdout, because you need to write to
multiple files so you need to come up with *something* along the lines
you're proposing here. But I was interested in seeing if I could help
with that technical problem you mentioned above, and I don't see that
happening with the current patches. Do I understand correctly that
the problem you encountered is in some other tests that you haven't
attached yet? Could you post what you have so that others can see the
problem and perhaps have a chance of helping?
Yes, we didn't add the failed tests to the patch. We'll add those and
send new patches.
I also recommend using
git format-patch when you post patches so that you have a place to
write a commit message including a note about which bits are WIP and
known not to work correctly yet.
Will follow these recommendations when sending the next set of patches.
Regards,
Nitin Motiani
Google
Hi,
Apologies for the delay on this thread.
On Mon, Apr 28, 2025 at 1:52 PM Nitin Motiani <nitinmotiani@google.com> wrote:
Thanks for the feedback, Thomas.
Do I understand correctly that
the problem you encountered is in some other tests that you haven't
attached yet? Could you post what you have so that others can see the
problem and perhaps have a chance of helping?Yes, we didn't add the failed tests to the patch. We'll add those and
send new patches.
I'm attaching the patch files generated using git format-patch.
0001 has the pg_dump pipe support code.
0002 has the pg_restore pipe support.
0003 has a few basic validation tests for correct flag combinations.
0004 has the proposed documentation changes.
The above 4 are the same as before.
The 0005 patch is the new WIP patch file. This includes the tests
which we have been trying to add but which are failing (although the
same commands run fine manually).
The tests in this patch are added to src/bin/pg_dump/t/002_pg_dump.pl.
The original attempt was to have a test case with dump and restore
commands using the new flag and run it in multiple scenarios. But
since that was failing, for the ease of debugging I added a few
standalone tests which just run a pg_dump with the pipe-command flag.
In these tests, if the pipe-command is a simple command like 'cat' or
'gzip', the test passes. But if the pipe-command itself uses a pipe
(either to a file or another command), the test fails.
In the following test
['pg_dump', '-Fd', '-B', 'postgres', "--pipe-command=\"cat > $tempdir/%f\"",],]
I get the below error.
# 'sh: line 1: cat >
/usr/local/google/home/nitinmotiani/postgresql/src/bin/pg_dump/tmp_check/tmp_test_XpFO/toc.dat:
No such file or directory
I can see that the temp directory tmp_test_XpFO exists. Even when I
changed the test to use an absolute path to an existing directory, I
got the same error. When I do manual testing with the same
pipe-command, it works fine. That is why we think there is some issue
with our environment setup for the tap test where it is not able to
parse the command.
I also ran the following loop (started just before starting the test
run) to print the output of ps commands around 'cat >' to see what
happens.
for i in $(seq 1 10000); do ps --forest -ef | grep "cat >" -A 5 >>
~/ps_output.txt; done
The printed results showed that the child process with the pipe
command became defunct.
nitinmo+ 3180211 3180160 5 17:05 pts/1 00:00:00 | |
\_ /usr/local/google/home/nitinmotiani/postgresql/tmp_install/usr/local/pgsql/bin/pg_dump
-Fd -B p ostgres --pipe-command="cat >
/usr/local/google/home/nitinmotiani/postgresql/src/bin/pg_dump/definite_dumpdir/%f"
nitinmo+ 3180215 3180211 0 17:05 pts/1 00:00:00 | |
\_ [sh] <defunct>
We are not sure how to handle this issue. Please let us know your thoughts.
Thanks & Regards,
Nitin Motiani
Google
Attachments:
v5-0003-Add-basic-tests-for-pipe-command.patchapplication/octet-stream; name=v5-0003-Add-basic-tests-for-pipe-command.patchDownload+36-1
v5-0002-Add-pipe-command-support-in-pg_restore.patchapplication/octet-stream; name=v5-0002-Add-pipe-command-support-in-pg_restore.patchDownload+64-23
v5-0005-WIP-This-is-WIP-patch-for-adding-tests-to-pg_dump.patchapplication/octet-stream; name=v5-0005-WIP-This-is-WIP-patch-for-adding-tests-to-pg_dump.patchDownload+38-1
v5-0001-Add-pipe-command-support-for-directory-mode-of-pg.patchapplication/octet-stream; name=v5-0001-Add-pipe-command-support-for-directory-mode-of-pg.patchDownload+216-58
v5-0004-Add-documentation-for-pipe-command-in-pg_dump-and.patchapplication/octet-stream; name=v5-0004-Add-documentation-for-pipe-command-in-pg_dump-and.patchDownload+123-2
I have added this to the commitfest
We would be grateful for any reviews and feedback on this.
When adding to commitfest I tried to put Nitin as "first author" as he
has done the bulk of the work (I did just a quick pg_dump-only PoC)
but it looks like Commitfest just orders all provided authors
alphabetically .
Hi,
Very interesting patch. One question: is it possible with this patch to pipe pg_dump directory output directly into pg_restore with this patch? Looking at the code I don't believe that is the case but figured I would ask.
Thanks,
Andrew Jackson
Hi,
Went ahead and experimented with your patch a bit. To answer my previous question this patch can be used to pipe pg_dump directly into pg_restore. This should absolutely be added as another use case to your list above as it is a well known limitation that you can use pg_dump/psql to do buffered copy but only with a single process, while using pg_dump/pg_restore is capable of multiprocessed copy but it must be saved to disk in its entirety before the restore can begin. This is extremely frustrating when dealing with large databases where you don't want multiple copies saved on disk and because it's not as fast as it can be. With this patch you can get the best of both worlds.
Example dump
```bash
pg_dump --jobs=4 -Fd "${connection_str}" --pipe-command="mkfifo dumpdir/%f; cat >> dumpdir/%f"
```
Example restore run in different process
```bash
pg_restore --jobs=4 -Fd --dbname="${another_connection_str}" ./dumpdir
```
Thanks,
Andrew Jackson
On Thu, Jun 5, 2025 at 6:09 PM Nitin Motiani <nitinmotiani@google.com> wrote:
Hi,
Apologies for the delay on this thread.
On Mon, Apr 28, 2025 at 1:52 PM Nitin Motiani <nitinmotiani@google.com> wrote:
Thanks for the feedback, Thomas.
Do I understand correctly that
the problem you encountered is in some other tests that you haven't
attached yet? Could you post what you have so that others can see the
problem and perhaps have a chance of helping?Yes, we didn't add the failed tests to the patch. We'll add those and
send new patches.I'm attaching the patch files generated using git format-patch.
0001 has the pg_dump pipe support code.
0002 has the pg_restore pipe support.
0003 has a few basic validation tests for correct flag combinations.
0004 has the proposed documentation changes.The above 4 are the same as before.
The 0005 patch is the new WIP patch file. This includes the tests
which we have been trying to add but which are failing (although the
same commands run fine manually).The tests in this patch are added to src/bin/pg_dump/t/002_pg_dump.pl.
The original attempt was to have a test case with dump and restore
commands using the new flag and run it in multiple scenarios. But
since that was failing, for the ease of debugging I added a few
standalone tests which just run a pg_dump with the pipe-command flag.
In these tests, if the pipe-command is a simple command like 'cat' or
'gzip', the test passes. But if the pipe-command itself uses a pipe
(either to a file or another command), the test fails.In the following test
['pg_dump', '-Fd', '-B', 'postgres', "--pipe-command=\"cat > $tempdir/%f\"",],]
I get the below error.
# 'sh: line 1: cat >
/usr/local/google/home/nitinmotiani/postgresql/src/bin/pg_dump/tmp_check/tmp_test_XpFO/toc.dat:
No such file or directoryI can see that the temp directory tmp_test_XpFO exists. Even when I
changed the test to use an absolute path to an existing directory, I
got the same error. When I do manual testing with the same
pipe-command, it works fine. That is why we think there is some issue
with our environment setup for the tap test where it is not able to
parse the command.I also ran the following loop (started just before starting the test
run) to print the output of ps commands around 'cat >' to see what
happens.for i in $(seq 1 10000); do ps --forest -ef | grep "cat >" -A 5 >>
~/ps_output.txt; doneThe printed results showed that the child process with the pipe
command became defunct.nitinmo+ 3180211 3180160 5 17:05 pts/1 00:00:00 | |
\_ /usr/local/google/home/nitinmotiani/postgresql/tmp_install/usr/local/pgsql/bin/pg_dump
-Fd -B p ostgres --pipe-command="cat >
/usr/local/google/home/nitinmotiani/postgresql/src/bin/pg_dump/definite_dumpdir/%f"
nitinmo+ 3180215 3180211 0 17:05 pts/1 00:00:00 | |
\_ [sh] <defunct>We are not sure how to handle this issue. Please let us know your thoughts.
The latest patch set is not applying on HEAD can you rebase the patch
set. And also there are many TODOs in the patch, if those TODOs are
just good to do and you are planning for future development better to
get rid of those. OTOH if some of those TODOs are mandatory to do
before we can commit the patch then are you planning to work on those
soon? I am planning to review this patch so are you planning to send
the rebased version with implementing the TODO which are required for
the first version.
--
Regards,
Dilip Kumar
Google
On Tue, Sep 9, 2025 at 12:07 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
The latest patch set is not applying on HEAD can you rebase the patch
set. And also there are many TODOs in the patch, if those TODOs are
just good to do and you are planning for future development better to
get rid of those. OTOH if some of those TODOs are mandatory to do
before we can commit the patch then are you planning to work on those
soon? I am planning to review this patch so are you planning to send
the rebased version with implementing the TODO which are required for
the first version.
Thanks for the feedback, Dilip. We will rebase the patch soon and send
it. Regarding the TODOs, some of those are plans for future
development (i.e. refactor). There are also TODOs in the first patch
file 0001 which are actually removed in file 0002. We can clean those
up or combine the two files. Other than that, some are about the open
questions. We will remove those from the code and will discuss those
issues on the thread.
Thanks,
Nitin Motiani
Google
Hi Dilip,
I have rebased the patches and cleaned up the TODOs. Please take a
look. The fifth patch is still WIP because I'm working on figuring out
the test failure. I'll report my findings on that soon.
Thanks,
Nitin Motiani
Google
Show quoted text
On Tue, Sep 9, 2025 at 8:41 PM Nitin Motiani <nitinmotiani@google.com> wrote:
On Tue, Sep 9, 2025 at 12:07 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
The latest patch set is not applying on HEAD can you rebase the patch
set. And also there are many TODOs in the patch, if those TODOs are
just good to do and you are planning for future development better to
get rid of those. OTOH if some of those TODOs are mandatory to do
before we can commit the patch then are you planning to work on those
soon? I am planning to review this patch so are you planning to send
the rebased version with implementing the TODO which are required for
the first version.Thanks for the feedback, Dilip. We will rebase the patch soon and send
it. Regarding the TODOs, some of those are plans for future
development (i.e. refactor). There are also TODOs in the first patch
file 0001 which are actually removed in file 0002. We can clean those
up or combine the two files. Other than that, some are about the open
questions. We will remove those from the code and will discuss those
issues on the thread.Thanks,
Nitin Motiani
Attachments:
v6-0003-Add-basic-tests-for-pipe-command.patchapplication/x-patch; name=v6-0003-Add-basic-tests-for-pipe-command.patchDownload+36-1
v6-0004-Add-documentation-for-pipe-command-in-pg_dump-and.patchapplication/x-patch; name=v6-0004-Add-documentation-for-pipe-command-in-pg_dump-and.patchDownload+123-2
v6-0002-Add-pipe-command-support-in-pg_restore.patchapplication/x-patch; name=v6-0002-Add-pipe-command-support-in-pg_restore.patchDownload+62-17
v6-0005-WIP-This-is-WIP-patch-for-adding-tests-to-pg_dump.patchapplication/x-patch; name=v6-0005-WIP-This-is-WIP-patch-for-adding-tests-to-pg_dump.patchDownload+38-1
v6-0001-Add-pipe-command-support-for-directory-mode-of-pg.patchapplication/x-patch; name=v6-0001-Add-pipe-command-support-for-directory-mode-of-pg.patchDownload+220-57
On Tue, Jan 13, 2026 at 4:20 PM Nitin Motiani <nitinmotiani@google.com> wrote:
Hi Dilip,
I have rebased the patches and cleaned up the TODOs. Please take a
look. The fifth patch is still WIP because I'm working on figuring out
the test failure. I'll report my findings on that soon.
Attaching the new version after rebasing and fixing a fallthrough bug
introduced in v6.
Thanks,
Nitin Motiani
Google
Attachments:
v7-0004-Add-documentation-for-pipe-command-in-pg_dump-and.patchapplication/x-patch; name=v7-0004-Add-documentation-for-pipe-command-in-pg_dump-and.patchDownload+123-2
v7-0001-Add-pipe-command-support-for-directory-mode-of-pg.patchapplication/x-patch; name=v7-0001-Add-pipe-command-support-for-directory-mode-of-pg.patchDownload+221-57
v7-0005-WIP-This-is-WIP-patch-for-adding-tests-to-pg_dump.patchapplication/x-patch; name=v7-0005-WIP-This-is-WIP-patch-for-adding-tests-to-pg_dump.patchDownload+38-1
v7-0003-Add-basic-tests-for-pipe-command.patchapplication/x-patch; name=v7-0003-Add-basic-tests-for-pipe-command.patchDownload+36-1
v7-0002-Add-pipe-command-support-in-pg_restore.patchapplication/x-patch; name=v7-0002-Add-pipe-command-support-in-pg_restore.patchDownload+62-17
Hi,
I fixed the failing tap test. Attaching rebased patches along with the
new test. The last patch is still WIP because I need to add more test
cases.
Thanks,
Nitin Motiani
Google
Attachments:
v8-0003-Add-basic-tests-for-pipe-command.patchapplication/x-patch; name=v8-0003-Add-basic-tests-for-pipe-command.patchDownload+36-1
v8-0005-POC-Add-tests-to-pg_dump.pl.patchapplication/x-patch; name=v8-0005-POC-Add-tests-to-pg_dump.pl.patchDownload+19-1
v8-0004-Add-documentation-for-pipe-command-in-pg_dump-and.patchapplication/x-patch; name=v8-0004-Add-documentation-for-pipe-command-in-pg_dump-and.patchDownload+123-2
v8-0002-Add-pipe-command-support-in-pg_restore.patchapplication/x-patch; name=v8-0002-Add-pipe-command-support-in-pg_restore.patchDownload+62-17
v8-0001-Add-pipe-command-support-for-directory-mode-of-pg.patchapplication/x-patch; name=v8-0001-Add-pipe-command-support-for-directory-mode-of-pg.patchDownload+220-57
Rebased and added some extra logs for testing.
Thanks
Attachments:
v9-0005-POC-Add-tests-to-pg_dump.pl.patchapplication/x-patch; name=v9-0005-POC-Add-tests-to-pg_dump.pl.patchDownload+28-2
v9-0004-Add-documentation-for-pipe-command-in-pg_dump-and.patchapplication/x-patch; name=v9-0004-Add-documentation-for-pipe-command-in-pg_dump-and.patchDownload+123-2
v9-0001-Add-pipe-command-support-for-directory-mode-of-pg.patchapplication/x-patch; name=v9-0001-Add-pipe-command-support-for-directory-mode-of-pg.patchDownload+223-60
v9-0002-Add-pipe-command-support-in-pg_restore.patchapplication/x-patch; name=v9-0002-Add-pipe-command-support-in-pg_restore.patchDownload+72-28
v9-0003-Add-basic-tests-for-pipe-command.patchapplication/x-patch; name=v9-0003-Add-basic-tests-for-pipe-command.patchDownload+36-1
On Wed, 4 Mar 2026 at 15:27, Nitin Motiani <nitinmotiani@google.com> wrote:
Rebased and added some extra logs for testing.
Thanks
Thanks Nitin for the updated patch.
+1 for this idea.
Here are my few review comments.
*Comment1*:
02 is not applying so this should be rebased.
fix:
if (!no_globals)
n_errors = restore_global_objects(global_path,
tmpopts, filespec_is_pipe);
*Comment2*:
03 patch has test cases. Please change these error messages as per other
errors in code.
+command_fails_like(
+ [ 'pg_dump', '-Fd', '--pipe-command="cat"', '--compress=lz4',
'test'],
+ qr/\Qpg_dump: hint: Option --pipe-command is not supported with any
compression type\E/,
+ 'pg_dump: hint: Option --pipe-command is not supported with any
compression type'
+);
Above should be:
+command_fails_like(
+ [ 'pg_dump', '-Fd', '--pipe-command="cat"', '--compress=lz4',
'test'],
+ qr/\Qpg_dump: error: option --pipe-command is not supported with
any compression type\E/,
+ 'pg_dump option --pipe-command is not supported with any
compression type'
+);
*Comment3*:
Please can we rename "--pipe-command" to "--pipe" only.
*Commen4*: --pipe or --pipe-command should be added into the usage function
in pg_dump.c and pg_restore.c files.
*Comment5*: I think we can support this new pipe option with pg_dumpall
also as we support directory mode in pg_dumpall from v19.
*Comment6:*
mst@localhost:~/pg60/postgres/inst/bin$ ./pg_dump -Fd --pipe-command="a"
-d postgres
sh: line 1: a: command not found
pg_dump: error: could not close file: Success
pg_dump: error: could not close TOC file: Success
mst@localhost:~/pg60/postgres/inst/bin$
we should add more processing for the "--pipe-command" argument so that we
don't get the above error. I think we should validate the path.
*Comment7*: + bool path_is_pipe_command)
I think we can rename to is_pipe, similarly filespec_is_pipe ->
is_pipe, fSpecIsPipe->is_pipe
*Comment8*:
+ This option is only supported with the directory output + format. It can be used to write to multiple streams which + otherwise would not be possible with the directory mode. + For each stream, it starts a process which runs the + specified command and pipes the pg_dump output to this + process. + This option is not valid if <option>--file</option> + is also specified. + </para> + <para> + The pipe-command can be used to perform operations like compress + using a custom algorithm, filter, or write the output to a cloud + storage etc. The user would need a way to pipe the final output of + each stream to a file. To handle that, the pipe command supports a format + specifier %f. And all the instances of %f in the command string + will be replaced with the corresponding file name which + would have been used in the directory mode with <option>--file</option>. + See <xref linkend="pg-dump-examples"/> below. + </para> + </listitem> + </varlistentry>
please use 80 chars in all lines.
*Comment9*:
mst@localhost:~/pg60/postgres/inst/bin$ ./pg_restore dumpdir
--pipe-command="cat"
pg_restore: hint: Only one of [filespec, --pipe-command] allowed
Above error should be similar to the other errors in pg_restore.
*Comment10*:
mst@localhost:~/pg60/postgres/inst/bin$ ./pg_restore --pipe-command="cat"
-d postgres --format=tar
pg_restore: error: could not open TOC file "cat" for input: No such file or
directory
mst@localhost:~/pg60/postgres/inst/bin$
we should add an error for non-directory format. If no format is specified,
then we can continue or for unknown format also we can report an error.
I will do some more review and testing.
--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com
On Sat, Mar 14, 2026 at 10:37 PM Mahendra Singh Thalor
<mahi6run@gmail.com> wrote:
Thanks for reviewing this.
Comment3:
Please can we rename "--pipe-command" to "--pipe" only.
the original choice was inspired by the COPY command which has
COPY ... TO { 'filename' | PROGRAM 'command' | STDOUT }
COPY ... FROM { 'filename' | PROGRAM 'command' | STDIN }
and the lack of TO or FROM, so PROGRAM 'command' became --pipe-command
Other options could be --to-pipe for pg_dump and --from-pipe for pg_restore.
Or --to-program , --from-program to be closer to COPY syntax,
Or even --pipe-command-pattern to signify that there is some name
replacement done for each invocation
But thinking more about it the option name does indeed not need to be
fully self-documenting :)
So I would be ok to shortening it to just --pipe
---
Regards
Hannu