pg_dump's results have quite different size

Started by Kaijiang Chen · over 9 years ago · 12 messages · bugs
#1 Kaijiang Chen
chenkaijiang@gmail.com

Hi, I have PostgreSQL 9.4.10 on CentOS 6.5, 64-bit. I got the source code,
built it, and installed it with gmake install.

I run pg_dump on the database three times every day. The weird thing is
that the sizes of the dump files are sometimes quite different.

For example, the file dumped at 7 am is 1.5 GB and the file dumped at
11 am is 0.91 GB -- about a 60% gap! But no massive deletion happened;
there were only some inserts/updates (fewer than 10K) between the two dump
operations. It seems illogical for the sizes to differ that much.

Note that in about 70% of cases the file sizes are similar to each other
and keep increasing a little, which is perfectly logical. But sometimes I
get the weird behavior described above.

Is this a bug? Or something else?

Regards,
Kaijiang

#2 Kaijiang Chen
chenkaijiang@gmail.com
In reply to: Kaijiang Chen (#1)
Re: pg_dump's results have quite different size

I found that it is because I run pg_dump on the standby and it conflicts
with streaming replication. I have enlarged max_standby_streaming_delay;
that should fix it.


#3 Shulgin, Oleksandr
oleksandr.shulgin@zalando.de
In reply to: Kaijiang Chen (#2)
Re: pg_dump's results have quite different size

On Dec 14, 2016 06:01, "Kaijiang Chen" <chenkaijiang@gmail.com> wrote:

I found that it is because I run the pg_dump on standby and it conflicts
with the stream replication. I enlarge the max_standby_streaming_delay. It
should work.

Does the conflict cause a fatal error on the pg_dump side?

--
Alex

#4 Kaijiang Chen
chenkaijiang@gmail.com
In reply to: Shulgin, Oleksandr (#3)
Re: pg_dump's results have quite different size

Yes. pg_dump quits with this message:

pg_dump: Dumping the contents of table "data_histories" failed:
PQgetResult() failed.
pg_dump: Error message from server: ERROR: canceling statement due to
conflict with recovery
DETAIL: User query might have needed to see row versions that must be
removed.
pg_dump: The command was: COPY public.data_histories (id, user_id,
user_name, type, type_id, old_data, action, new_data, created_at,
updated_at) TO stdout;


#5 Shulgin, Oleksandr
oleksandr.shulgin@zalando.de
In reply to: Kaijiang Chen (#4)
Re: pg_dump's results have quite different size

On Wed, Dec 14, 2016 at 9:29 AM, Kaijiang Chen <chenkaijiang@gmail.com>
wrote:

Yes. The pg_dump quits with the message:

pg_dump: Dumping the contents of table "data_histories" failed:
PQgetResult() failed.
pg_dump: Error message from server: ERROR: canceling statement due to
conflict with recovery
DETAIL: User query might have needed to see row versions that must be
removed.
pg_dump: The command was: COPY public.data_histories (id, user_id,
user_name, type, type_id, old_data, action, new_data, created_at,
updated_at) TO stdout;

Ah, then it's just that your backup script is broken: it should have
reported the error.

Please do not top-post.
--
Alex
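The check Oleksandr is referring to can be sketched in a few lines. This is a hypothetical wrapper, not anyone's actual backup script; the pg_dump command line and output path are made up:

```python
# Minimal sketch of the missing check: run pg_dump and fail loudly if it
# exits non-zero, instead of silently keeping a truncated dump.
import subprocess
import sys

def run_dump(cmd=("pg_dump", "-Fc", "-f", "/backups/db.dump", "mydb")):
    result = subprocess.run(cmd)
    if result.returncode != 0:
        # A "canceling statement due to conflict with recovery" error makes
        # pg_dump exit non-zero; surface that instead of swallowing it.
        sys.exit(f"backup FAILED: pg_dump exited with {result.returncode}")
    return result.returncode
```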

#6 Mark Kirkwood
mark.kirkwood@catalyst.net.nz
In reply to: Shulgin, Oleksandr (#5)
Re: pg_dump's results have quite different size

On 14/12/16 22:09, Oleksandr Shulgin wrote:

On Wed, Dec 14, 2016 at 9:29 AM, Kaijiang Chen <chenkaijiang@gmail.com> wrote:

Yes. The pg_dump quits with the message:

pg_dump: Dumping the contents of table "data_histories" failed:
PQgetResult() failed.
pg_dump: Error message from server: ERROR: canceling statement
due to conflict with recovery
DETAIL: User query might have needed to see row versions that
must be removed.
pg_dump: The command was: COPY public.data_histories (id, user_id,
user_name, type, type_id, old_data, action, new_data, created_at,
updated_at) TO stdout;

Ah, then it's just that your backup script is broken: it should have
reported the error.

Please do not top-post.

Oleksandr - how about a helpful response? E.g. suggesting that maybe
increasing max_standby_streaming_delay might help? Goddamn! Folk asking
for help deserve better than just being told 'it is broken, dickhead'...

This open source stuff originally started as a way for us all to *help*
each other; let's not forget that!

regards

Mark

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#7 Jeff Janes
jeff.janes@gmail.com
In reply to: Mark Kirkwood (#6)
Re: pg_dump's results have quite different size

On Thu, Dec 15, 2016 at 11:18 PM, Mark Kirkwood <
mark.kirkwood@catalyst.net.nz> wrote:

On 14/12/16 22:09, Oleksandr Shulgin wrote:

On Wed, Dec 14, 2016 at 9:29 AM, Kaijiang Chen <chenkaijiang@gmail.com> wrote:

Yes. The pg_dump quits with the message:

pg_dump: Dumping the contents of table "data_histories" failed:
PQgetResult() failed.
pg_dump: Error message from server: ERROR: canceling statement
due to conflict with recovery
DETAIL: User query might have needed to see row versions that
must be removed.
pg_dump: The command was: COPY public.data_histories (id, user_id,
user_name, type, type_id, old_data, action, new_data, created_at,
updated_at) TO stdout;

Ah, then it's just that your backup script is broken: it should have
reported the error.

Please do not top-post.

Oleksandr - how about a helpful response? E.g suggesting that maybe
increasing max_standby_streaming_delay might help? Goddamn! folk asking for
help deserve better than just being told 'it is broken dickhead'...

The problem *is* that his backups are broken without his knowing it. Maybe
increasing max_standby_streaming_delay is an answer, but maybe he would
rather have occasional broken backups *which he knows about* than suffer
the consequences of an increased max_standby_streaming_delay.
Maybe hot_standby_feedback would be a better option, or maybe
vacuum_defer_cleanup_age (though that is less likely).

The only thing we actually know with reasonable certainty is that his
backup script is broken, and that this is bad. Randomly changing settings
so that the brokenness is still there but just less obvious is more
dangerous than helpful.

Cheers,

Jeff

#8 David G. Johnston
david.g.johnston@gmail.com
In reply to: Jeff Janes (#7)
Re: pg_dump's results have quite different size

On Fri, Dec 16, 2016 at 1:47 PM, Jeff Janes <jeff.janes@gmail.com> wrote:

The only thing we actually know with reasonable certainty is that his
backup script is broken, and that this is bad. Randomly changing settings
so that the brokenness is still there but just less obvious is more
dangerous than helpful.

It's been suggested before, but pg_dump could be a bit more helpful if it
dumped contents to a temporary location first and then, if the dump
completes successfully, moved it to the user-specified location. It would
be a subtle error if the dump failed silently and the user didn't notice
that the timestamp of an already-existing dump at the named location
hadn't changed -- but if the target location starts out unoccupied, the
absence of a file would be very obvious.

David J.
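David's temp-location idea can also be done today outside of pg_dump itself. A sketch, with hypothetical paths and a hypothetical default command:

```python
# Sketch of dump-to-temp-then-rename: a failed pg_dump run never replaces
# (or masquerades as) the previous good backup.
import os
import subprocess

def safe_dump(target, cmd=None):
    tmp = target + ".in-progress"
    cmd = cmd or ["pg_dump", "-Fc", "-f", tmp, "mydb"]
    try:
        subprocess.run(cmd, check=True)
    except subprocess.CalledProcessError:
        if os.path.exists(tmp):
            os.remove(tmp)       # discard the partial file
        raise                    # leave any previous good dump untouched
    os.replace(tmp, target)      # rename into place only on success
```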

#9 Mark Kirkwood
mark.kirkwood@catalyst.net.nz
In reply to: Jeff Janes (#7)
Re: pg_dump's results have quite different size

On 17/12/16 09:47, Jeff Janes wrote:


The problem *is* that his backups are broken without him knowing it.
Maybe increasing max_standby_streaming_delay is an answer, but maybe
he would rather have occasional broken backups *which he knows about*
then suffer the consequences of an increased
max_standby_streaming_delay. Maybe hot_standby_feedback would be a
better option, or maybe vacuum_defer_cleanup_age (but that is less
likely).

The only thing we actually know with reasonable certainty is that his
backup script is broken, and that this is bad. Randomly changing
settings so that the brokenness is still there but just less obvious
is more dangerous than helpful.

It seems we know quite a lot more, as evidenced by the error above. It also
seems clear that he wants his backups to work, not just to report errors,
surely? That the script should check return codes better (or at all) --
sure, that seems to have been emphasized quite sufficiently.

regards

Mark


#10 Kaijiang Chen
chenkaijiang@gmail.com
In reply to: Mark Kirkwood (#9)
Re: pg_dump's results have quite different size

After I enlarged max_standby_streaming_delay, it has worked well for some
days (I have to keep increasing max_standby_streaming_delay as the data
gets bigger).

Thank you all!

A suggestion: pg_dump from the standby is frequently used in backup tasks,
so it would be much better if we had some option to ensure that pg_dump
finishes -- for example, an option to temporarily stop replication? Or do
we already have some approach?


#11 Stephen Frost
sfrost@snowman.net
In reply to: Kaijiang Chen (#10)
Re: pg_dump's results have quite different size

Greetings,

* Kaijiang Chen (chenkaijiang@gmail.com) wrote:

After I enlarged the max_standby_streaming_delay, it works well for some
days (I have to increase the max_standby_streaming_delay when data is
bigger).

Thank you all!

A suggestion might be: pg_dump from the standby is frequently used in
backup tasks; it'll be much better if we have some options to ensure the
pg_dump to finish; for example, an option to temporary stop the
replication? Or do we already have some approach?

If you don't want long-running transactions on the replica to be killed,
you can enable hot standby feedback, but you'll have to accept that there
may be additional bloat on the master.

Another option might be to consider alternative backup methods, such as
using physical backup tools instead of pg_dump.

Thanks!

Stephen
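For reference, hot standby feedback is a single standby-side setting; a sketch of the postgresql.conf line (the trade-off is the bloat Stephen mentions):

```
# standby's postgresql.conf
hot_standby_feedback = on   # standby reports its oldest running transaction
                            # to the primary, so vacuum won't remove row
                            # versions that standby queries still need
```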

#12 Mark Kirkwood
mark.kirkwood@catalyst.net.nz
In reply to: Kaijiang Chen (#10)
Re: pg_dump's results have quite different size

On 20/12/16 23:38, Kaijiang Chen wrote:

After I enlarged the max_standby_streaming_delay, it works well for some
days (I have to increase the max_standby_streaming_delay when data is
bigger).

Thank you all!

A suggestion might be: pg_dump from the standby is frequently used in
backup tasks; it'll be much better if we have some options to ensure the
pg_dump to finish; for example, an option to temporary stop the
replication? Or do we already have some approach?

You could perhaps pause replay on the standby before the dump [1], and
resume it afterwards. This is a bit clumsy, but it is easy to implement as
part of the dump process.

regards

Mark

[1]: functions pg_xlog_replay_pause() and pg_xlog_replay_resume()
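Mark's idea can be scripted around the dump. A sketch, assuming psql access to the standby; the host and database names are made up, and the pg_xlog_* spellings are the 9.x function names from the footnote:

```python
# Sketch of pausing WAL replay on the standby around a dump.
import subprocess

def run(cmd):
    subprocess.run(cmd, shell=True, check=True)

def dump_with_replay_paused(dump_cmd,
                            psql='psql -h standby-host -d postgres -c'):
    run(psql + ' "SELECT pg_xlog_replay_pause();"')
    try:
        run(dump_cmd)
    finally:
        # Always resume, even if the dump fails; otherwise the standby
        # falls further and further behind the primary.
        run(psql + ' "SELECT pg_xlog_replay_resume();"')
```

While replay is paused the dump cannot be cancelled by a recovery conflict, at the cost of the standby serving increasingly stale data until replay resumes.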
