Set AUTOCOMMIT to on in script output by pg_dump

Started by Shinya Katoover 1 year ago15 messages
Jump to latest
#1Shinya Kato
Shinya11.Kato@oss.nttdata.com

Hi hackers!

When SQL scripts created with pg_dump/pg_dumpall/pg_restore are executed
in psql with AUTOCOMMIT turned off, they will not succeed in many cases.
This is because the script contains SQL statements that cannot be
executed within a transaction block.

If you simply add set AUTOCOMMIT on to the scripts created by
pg_dump/pg_dumpall/pg_restore, they will work fine.
A patch is attached

No documentation has been added as we could not find any documentation
on the details in the script.

Do you think?

Regards,
Shinya Kato
NTT DATA GROUP CORPORATION

Attachments:

v1-0001-Set-AUTOCOMMIT-to-on-in-script-output-by-pg_dump.patchtext/x-diff; name=v1-0001-Set-AUTOCOMMIT-to-on-in-script-output-by-pg_dump.patchDownload+13-1
#2Yugo Nagata
nagata@sraoss.co.jp
In reply to: Shinya Kato (#1)
Re: Set AUTOCOMMIT to on in script output by pg_dump

On Wed, 09 Oct 2024 11:10:37 +0900
Shinya Kato <Shinya11.Kato@oss.nttdata.com> wrote:

Hi hackers!

When SQL scripts created with pg_dump/pg_dumpall/pg_restore are executed
in psql with AUTOCOMMIT turned off, they will not succeed in many cases.
This is because the script contains SQL statements that cannot be
executed within a transaction block.

If you simply add set AUTOCOMMIT on to the scripts created by
pg_dump/pg_dumpall/pg_restore, they will work fine.
A patch is attached

No documentation has been added as we could not find any documentation
on the details in the script.

Do you think?

I am not sure if it is good to include psql's meta-command in pg_dump/pg_dumpall
results. Can we assume users will always use psql to restore the pg_dump results?

Regards,
Yugo Nagata

--
Yugo Nagata <nagata@sraoss.co.jp>

#3David G. Johnston
david.g.johnston@gmail.com
In reply to: Yugo Nagata (#2)
Re: Set AUTOCOMMIT to on in script output by pg_dump

On Tuesday, October 8, 2024, Yugo Nagata <nagata@sraoss.co.jp> wrote:

On Wed, 09 Oct 2024 11:10:37 +0900
Shinya Kato <Shinya11.Kato@oss.nttdata.com> wrote:

Hi hackers!

When SQL scripts created with pg_dump/pg_dumpall/pg_restore are executed
in psql with AUTOCOMMIT turned off, they will not succeed in many cases.
This is because the script contains SQL statements that cannot be
executed within a transaction block.

If you simply add set AUTOCOMMIT on to the scripts created by
pg_dump/pg_dumpall/pg_restore, they will work fine.
A patch is attached

I am not sure if it is good to include psql's meta-command in
pg_dump/pg_dumpall
results. Can we assume users will always use psql to restore the pg_dump
results?

Agreed. If we aren’t already outputting psql-only stuff I am a strong -1
for making this the first such case.

It would be nice to describe exactly when there is a problem as well since
very few things require being outside of a transaction. There might be
documentation or code patches possible here to improve matters (like adding
a switch to output begin/commit in the places we’re a user might want
single-transaction behavior) but this approach breaks well-established
encapsulation and overrides user expectations in a bad way (since
autocommit=on is the default they choose to turn it off so turning it back
on silently - not even documented - is bad.)

David J.

#4Tatsuo Ishii
ishii@postgresql.org
In reply to: David G. Johnston (#3)
Re: Set AUTOCOMMIT to on in script output by pg_dump

I am not sure if it is good to include psql's meta-command in
pg_dump/pg_dumpall
results. Can we assume users will always use psql to restore the pg_dump
results?

Agreed. If we aren’t already outputting psql-only stuff I am a strong -1
for making this the first such case.

I think the pg_dumpall output already includes "\connect".

It would be nice to describe exactly when there is a problem as well since
very few things require being outside of a transaction. There might be
documentation or code patches possible here to improve matters (like adding
a switch to output begin/commit in the places we’re a user might want
single-transaction behavior) but this approach breaks well-established
encapsulation and overrides user expectations in a bad way (since
autocommit=on is the default they choose to turn it off so turning it back
on silently - not even documented - is bad.)

+1.

Best reagards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: David G. Johnston (#3)
Re: Set AUTOCOMMIT to on in script output by pg_dump

"David G. Johnston" <david.g.johnston@gmail.com> writes:

On Tuesday, October 8, 2024, Yugo Nagata <nagata@sraoss.co.jp> wrote:

On Wed, 09 Oct 2024 11:10:37 +0900
Shinya Kato <Shinya11.Kato@oss.nttdata.com> wrote:

When SQL scripts created with pg_dump/pg_dumpall/pg_restore are executed
in psql with AUTOCOMMIT turned off, they will not succeed in many cases.

Agreed. If we aren’t already outputting psql-only stuff I am a strong -1
for making this the first such case.

I really doubt that this is the only way in which you can break a
pg_dump script by executing it in a non-default psql environment.
We'd likely be better advised to spend some documentation effort
recommending that pg_dump scripts be executed under "psql --no-psqlrc".

If AUTOCOMMIT were a mainstream feature then maybe it'd be worth
doing something about this, but IMO it's a deprecated backwater,
so I'm not very excited about it.

If we do want to do something about it, the patch needs more thought
about where to put the additional output. As an example, it looks
like it breaks the expectation that pg_dump-to-text should generate
output identical to pg_dump-to-archive followed by pg_restore-to-text.

... but this approach breaks well-established
encapsulation and overrides user expectations in a bad way (since
autocommit=on is the default they choose to turn it off so turning it back
on silently - not even documented - is bad.)

That particular angle doesn't bother me so much, because pg_dump
scripts already feel free to change search_path as well as a bunch
of other server parameters.

regards, tom lane

#6David G. Johnston
david.g.johnston@gmail.com
In reply to: Tom Lane (#5)
Re: Set AUTOCOMMIT to on in script output by pg_dump

On Tuesday, October 8, 2024, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"David G. Johnston" <david.g.johnston@gmail.com> writes:

On Tuesday, October 8, 2024, Yugo Nagata <nagata@sraoss.co.jp> wrote:

On Wed, 09 Oct 2024 11:10:37 +0900
Shinya Kato <Shinya11.Kato@oss.nttdata.com> wrote:

When SQL scripts created with pg_dump/pg_dumpall/pg_restore are

executed

in psql with AUTOCOMMIT turned off, they will not succeed in many

cases.

Agreed. If we aren’t already outputting psql-only stuff I am a strong -1
for making this the first such case.

I really doubt that this is the only way in which you can break a
pg_dump script by executing it in a non-default psql environment.
We'd likely be better advised to spend some documentation effort
recommending that pg_dump scripts be executed under "psql --no-psqlrc".

+1

Reinforcing that our output script basically assumes a default execution
environment seems worth mentioning even if it seems self-evident once it’s
said.

... but this approach breaks well-established

encapsulation and overrides user expectations in a bad way (since
autocommit=on is the default they choose to turn it off so turning it

back

on silently - not even documented - is bad.)

That particular angle doesn't bother me so much, because pg_dump
scripts already feel free to change search_path as well as a bunch
of other server parameters.

I wasn’t referring to the idea these should be restorable on non-PostgreSQL
systems though, only that if someone wanted to just open a connection in
their rust driver and send this text through that session it will (mostly?)
work.

pg_dumpall, though, is fundamentally tied to psql if databases are dumped,
if the resultant script has to be platform independently executable. I’m
open to a patch addressing this more narrowly but I’m still thinking that
we should be telling the user to use defaults instead of enforcing
ourselves.

David J.

#7Yugo Nagata
nagata@sraoss.co.jp
In reply to: David G. Johnston (#6)
Re: Set AUTOCOMMIT to on in script output by pg_dump

On Tue, 8 Oct 2024 21:37:38 -0700
"David G. Johnston" <david.g.johnston@gmail.com> wrote:

On Tuesday, October 8, 2024, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"David G. Johnston" <david.g.johnston@gmail.com> writes:

On Tuesday, October 8, 2024, Yugo Nagata <nagata@sraoss.co.jp> wrote:

On Wed, 09 Oct 2024 11:10:37 +0900
Shinya Kato <Shinya11.Kato@oss.nttdata.com> wrote:

When SQL scripts created with pg_dump/pg_dumpall/pg_restore are

executed

in psql with AUTOCOMMIT turned off, they will not succeed in many

cases.

Agreed. If we aren’t already outputting psql-only stuff I am a strong -1
for making this the first such case.

I really doubt that this is the only way in which you can break a
pg_dump script by executing it in a non-default psql environment.
We'd likely be better advised to spend some documentation effort
recommending that pg_dump scripts be executed under "psql --no-psqlrc".

+1

Reinforcing that our output script basically assumes a default execution
environment seems worth mentioning even if it seems self-evident once it’s
said.

+1

Regards,
Yugo Nagata

--
Yugo Nagata <nagata@sraoss.co.jp>

#8Shinya Kato
Shinya11.Kato@oss.nttdata.com
In reply to: Yugo Nagata (#7)
Re: Set AUTOCOMMIT to on in script output by pg_dump

Thank you all for the comments!

On 2024-10-09 15:15, Yugo Nagata wrote:

On Tue, 8 Oct 2024 21:37:38 -0700
"David G. Johnston" <david.g.johnston@gmail.com> wrote:

On Tuesday, October 8, 2024, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"David G. Johnston" <david.g.johnston@gmail.com> writes:

On Tuesday, October 8, 2024, Yugo Nagata <nagata@sraoss.co.jp> wrote:

On Wed, 09 Oct 2024 11:10:37 +0900
Shinya Kato <Shinya11.Kato@oss.nttdata.com> wrote:

When SQL scripts created with pg_dump/pg_dumpall/pg_restore are

executed

in psql with AUTOCOMMIT turned off, they will not succeed in many

cases.

Agreed. If we aren’t already outputting psql-only stuff I am a strong -1
for making this the first such case.

I really doubt that this is the only way in which you can break a
pg_dump script by executing it in a non-default psql environment.
We'd likely be better advised to spend some documentation effort
recommending that pg_dump scripts be executed under "psql --no-psqlrc".

+1

Reinforcing that our output script basically assumes a default
execution
environment seems worth mentioning even if it seems self-evident once
it’s
said.

+1

While adding to the documentation is sufficient if users use it
correctly, users often behave unexpectedly. My intention was to
implement it in a way that works without issues even if misused.
However, since the prevailing opinion seems to favor simply updating the
documentation, I will proceed with that approach.

A new patch is attached.
I am not a native English, so corrections to the texts are welcome.

--
Regards,
Shinya Kato
NTT DATA GROUP CORPORATION

Attachments:

v2-0001-doc-Add-no-psqlrc-in-pg_dump-pg_dumpall-docs.patchtext/x-diff; name=v2-0001-doc-Add-no-psqlrc-in-pg_dump-pg_dumpall-docs.patchDownload+27-10
#9Shinya Kato
Shinya11.Kato@oss.nttdata.com
In reply to: Shinya Kato (#8)
Re: Set AUTOCOMMIT to on in script output by pg_dump

On 2024-10-10 14:56, Shinya Kato wrote:

A new patch is attached.
I am not a native English, so corrections to the texts are welcome.

I created a commit fest entry.
https://commitfest.postgresql.org/50/5306/

--
Regards,
Shinya Kato
NTT DATA GROUP CORPORATION

#10Robert Treat
xzilla@users.sourceforge.net
In reply to: Shinya Kato (#8)
Re: Set AUTOCOMMIT to on in script output by pg_dump

On Thu, Oct 10, 2024 at 1:56 AM Shinya Kato
<Shinya11.Kato@oss.nttdata.com> wrote:

Thank you all for the comments!

<snip>

While adding to the documentation is sufficient if users use it
correctly, users often behave unexpectedly. My intention was to
implement it in a way that works without issues even if misused.
However, since the prevailing opinion seems to favor simply updating the
documentation, I will proceed with that approach.

A new patch is attached.
I am not a native English, so corrections to the texts are welcome.

This looks pretty good to me. I think there are a couple of minor
grammar changes that could be made, and I think the pg_dumpall section
could use a couple tweaks, specifically 1) not all incantations of
pg_dumpall emit psql meta commands (-g comes to mind quickly) and ISTR
some non-psql clients honor psql meta commands, so I would lessen the
language around incompatibility, and 2) I think adding an explicit -f
for the database name in the pg_dumpall is clearer, and mirrors the
pg_dump example.

suggested diffs attached, let me know if you would like a consolidated patch

Robert Treat
https://xzilla.net

Attachments:

xzilla-pg_dump-no-psqlrc.diffapplication/octet-stream; name=xzilla-pg_dump-no-psqlrc.diffDownload+5-5
#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Treat (#10)
Re: Set AUTOCOMMIT to on in script output by pg_dump

Robert Treat <rob@xzilla.net> writes:

suggested diffs attached, let me know if you would like a consolidated patch

Sadly, the cfbot is now confused since it doesn't understand the idea
of an incremental patch. Somebody please post a consolidated patch.

For myself, I'd suggest writing the examples with -X not --no-psqlrc.
-X is shorter, it's more likely to be what people would actually
type, and it's not jarringly inconsistent with the adjacent use
of -h and other short-form switches. We can write the added
paragraphs like

It is generally recommended to use the <option>-X</option>
(<option>--no-psqlrc</option>) option when restoring a database ...

to provide clarity about what the switch does.

regards, tom lane

#12Shinya Kato
Shinya11.Kato@oss.nttdata.com
In reply to: Tom Lane (#11)
Re: Set AUTOCOMMIT to on in script output by pg_dump

Hi, thank you for the reviews.

On 2025-01-18 00:45, Robert Treat wrote:

This looks pretty good to me. I think there are a couple of minor
grammar changes that could be made, and I think the pg_dumpall section
could use a couple tweaks, specifically 1) not all incantations of
pg_dumpall emit psql meta commands (-g comes to mind quickly) and ISTR
some non-psql clients honor psql meta commands, so I would lessen the
language around incompatibility, and 2) I think adding an explicit -f
for the database name in the pg_dumpall is clearer, and mirrors the
pg_dump example.

Thanks, I had no reason to oppose it, so I reflected that in the patch.

On 2025-01-18 05:11, Tom Lane wrote:

Robert Treat <rob@xzilla.net> writes:

suggested diffs attached, let me know if you would like a consolidated
patch

Sadly, the cfbot is now confused since it doesn't understand the idea
of an incremental patch. Somebody please post a consolidated patch.

I've attached new consolidated patch.

For myself, I'd suggest writing the examples with -X not --no-psqlrc.
-X is shorter, it's more likely to be what people would actually
type, and it's not jarringly inconsistent with the adjacent use
of -h and other short-form switches. We can write the added
paragraphs like

It is generally recommended to use the <option>-X</option>
(<option>--no-psqlrc</option>) option when restoring a database ...

to provide clarity about what the switch does.

I agree to it and fixed the patch.

--
Regards,
Shinya Kato
NTT DATA GROUP CORPORATION

Attachments:

v3-0001-doc-Add-no-psqlrc-in-pg_dump-pg_dumpall-docs.patchtext/x-diff; name=v3-0001-doc-Add-no-psqlrc-in-pg_dump-pg_dumpall-docs.patchDownload+30-11
#13Robert Treat
xzilla@users.sourceforge.net
In reply to: Shinya Kato (#12)
Re: Set AUTOCOMMIT to on in script output by pg_dump

On Wed, Jan 22, 2025 at 8:02 AM Shinya Kato
<Shinya11.Kato@oss.nttdata.com> wrote:

Hi, thank you for the reviews.

On 2025-01-18 00:45, Robert Treat wrote:

This looks pretty good to me. I think there are a couple of minor
grammar changes that could be made, and I think the pg_dumpall section
could use a couple tweaks, specifically 1) not all incantations of
pg_dumpall emit psql meta commands (-g comes to mind quickly) and ISTR
some non-psql clients honor psql meta commands, so I would lessen the
language around incompatibility, and 2) I think adding an explicit -f
for the database name in the pg_dumpall is clearer, and mirrors the
pg_dump example.

Thanks, I had no reason to oppose it, so I reflected that in the patch.

On 2025-01-18 05:11, Tom Lane wrote:

Robert Treat <rob@xzilla.net> writes:

suggested diffs attached, let me know if you would like a consolidated
patch

Sadly, the cfbot is now confused since it doesn't understand the idea
of an incremental patch. Somebody please post a consolidated patch.

I've attached new consolidated patch.

For myself, I'd suggest writing the examples with -X not --no-psqlrc.
-X is shorter, it's more likely to be what people would actually
type, and it's not jarringly inconsistent with the adjacent use
of -h and other short-form switches. We can write the added
paragraphs like

It is generally recommended to use the <option>-X</option>
(<option>--no-psqlrc</option>) option when restoring a database ...

to provide clarity about what the switch does.

I agree to it and fixed the patch.

LGTM

Robert Treat
https://xzilla.net

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Treat (#13)
Re: Set AUTOCOMMIT to on in script output by pg_dump

Robert Treat <rob@xzilla.net> writes:

On Wed, Jan 22, 2025 at 8:02 AM Shinya Kato
<Shinya11.Kato@oss.nttdata.com> wrote:

I agree to it and fixed the patch.

LGTM

LGTM too. Pushed with a couple of very minor tweaks.

regards, tom lane

#15Shinya Kato
Shinya11.Kato@oss.nttdata.com
In reply to: Tom Lane (#14)
Re: Set AUTOCOMMIT to on in script output by pg_dump

On 2025-01-26 02:45, Tom Lane wrote:

Robert Treat <rob@xzilla.net> writes:

On Wed, Jan 22, 2025 at 8:02 AM Shinya Kato
<Shinya11.Kato@oss.nttdata.com> wrote:

I agree to it and fixed the patch.

LGTM

LGTM too. Pushed with a couple of very minor tweaks.

regards, tom lane

Thank you for pushing!

--
Regards,
Shinya Kato
NTT DATA GROUP CORPORATION