question: data file update when pg_basebackup in progress

Started by Rui Hai Jiang over 8 years ago, 3 messages
#1 Rui Hai Jiang
ruihaijiang@msn.com

Hello,
I'm looking into how pg_basebackup works and I have a question (maybe this is not actually an issue):

When pg_basebackup is launched, a checkpoint is performed first, and then all files are transferred to the pg_basebackup client. Is it possible for a data page (say page N) in a data file to be changed after the checkpoint and before pg_basebackup has finished?

If this happens, is it possible that only part of the changed page is transferred to the pg_basebackup client, i.e. the client gets page N with part of the old content and part of the new content? How does PostgreSQL handle this kind of data page?

Thanks,
Rui Hai

#2 David G. Johnston
david.g.johnston@gmail.com
In reply to: Rui Hai Jiang (#1)
Re: question: data file update when pg_basebackup in progress

On Tue, Apr 25, 2017 at 9:08 AM, Rui Hai Jiang <ruihaijiang@msn.com> wrote:

> When pg_basebackup is launched, a checkpoint is created first, then all
> files are transferred to the pg_basebackup client. Is it possible that a
> data page(say page-N) in a data file is changed after the checkpoint and
> before the pg_basebackup is finished?

I believe so.

> If this happens, is it possible that only part of the changed page be
> transferred to the pg_basebackup client? i.e. the pg_basebackup client
> gets page-N with part of the old content and part of the new content. How
> does postgreSQL handle this kind of data page?

The first write to a page after a checkpoint is always recorded in the WAL
as a full page write. Every WAL file since the checkpoint must also be
copied to the backed up system. The replay of those WAL files is what
brings the remote and local system into sync with respect to all changes
since the backup checkpoint.
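
To make the mechanism concrete, here is a toy model in Python. It is only a sketch, not PostgreSQL code: the record layout, the `torn_copy` and `replay` helpers, and the page number 7 are all invented for illustration. It shows that a page copied half-old, half-new ends up correct once the full page image from the WAL is replayed over it, with later changes applied on top.

```python
PAGE_SIZE = 8192  # PostgreSQL's default block size


def torn_copy(old: bytes, new: bytes, split: int) -> bytes:
    """Simulate a file copy racing with a write: the first `split` bytes
    come from the new page version, the rest from the old one."""
    return new[:split] + old[split:]


def replay(pages: dict, wal: list) -> dict:
    """Apply toy WAL records in order (hypothetical record format).

    A "full_page_image" record replaces the page wholesale, so whatever
    torn state the backup copied is discarded. A "delta" record patches
    a byte range of the already-restored page.
    """
    result = dict(pages)
    for record in wal:
        if record["kind"] == "full_page_image":
            result[record["page"]] = record["image"]
        elif record["kind"] == "delta":
            page = bytearray(result[record["page"]])
            start = record["offset"]
            page[start:start + len(record["data"])] = record["data"]
            result[record["page"]] = bytes(page)
    return result


# Page 7 before the checkpoint, and after the first write following it.
old_page = b"A" * PAGE_SIZE
new_page = b"B" * PAGE_SIZE

# The backup happened to read the page while it was being rewritten.
torn = torn_copy(old_page, new_page, split=PAGE_SIZE // 2)
backup = {7: torn}

# WAL generated since the checkpoint: the first touch of the page is
# logged as a full image, a later small change only as a delta.
wal = [
    {"kind": "full_page_image", "page": 7, "image": new_page},
    {"kind": "delta", "page": 7, "offset": 100, "data": b"CC"},
]

# State of the page on the live system after both changes.
live_page = new_page[:100] + b"CC" + new_page[102:]

restored = replay(backup, wal)
print(restored[7] == live_page)
```

The delta record alone could not repair the torn copy, since it assumes the rest of the page is already in a known state. That is why the first change after the checkpoint has to carry the whole page.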

David J.

#3 Michael Paquier
michael.paquier@gmail.com
In reply to: David G. Johnston (#2)
Re: question: data file update when pg_basebackup in progress

On Wed, Apr 26, 2017 at 1:45 AM, David G. Johnston
<david.g.johnston@gmail.com> wrote:

> The first write to a page after a checkpoint is always recorded in the WAL
> as a full page write. Every WAL file since the checkpoint must also be
> copied to the backed up system. The replay of those WAL files is what
> brings the remote and local system into sync with respect to all changes
> since the backup checkpoint.

Which brings us to the point that the presence of backup_label in a
backup is critical, as it tells Postgres the WAL position from which
recovery must begin in order to bring the system up to a consistent
state. pg_basebackup also makes sure that the last WAL segment needed is
archived before the backup completes, so that recovery can run to
completion.
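
As a rough illustration of what backup_label carries, the Python sketch below parses a sample file and pulls out the starting WAL position. The sample contents are made up for this example (the LSNs, segment name, and timestamp are not from a real backup), and `parse_backup_label` and `lsn_to_int` are hypothetical helpers, not part of any PostgreSQL tooling.

```python
import re

# Illustrative sample only: these values are invented for the sketch.
SAMPLE_BACKUP_LABEL = """\
START WAL LOCATION: 0/2000028 (file 000000010000000000000002)
CHECKPOINT LOCATION: 0/2000060
BACKUP METHOD: streamed
BACKUP FROM: master
START TIME: 2017-04-26 10:00:00 UTC
LABEL: pg_basebackup base backup
"""


def parse_backup_label(text: str) -> dict:
    """Split each 'KEY: value' line into a dictionary entry."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:  # skip lines that do not look like 'KEY: value'
            fields[key] = value
    return fields


def lsn_to_int(lsn: str) -> int:
    """Convert an LSN written as 'hi/lo' (both hex) to a 64-bit integer."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


fields = parse_backup_label(SAMPLE_BACKUP_LABEL)

# The start location line holds both the LSN and the WAL segment file name.
match = re.match(r"(\S+) \(file (\w+)\)", fields["START WAL LOCATION"])
start_lsn, start_segment = match.group(1), match.group(2)

print("recovery starts at LSN", start_lsn, "in segment", start_segment)
print("checkpoint record at", fields["CHECKPOINT LOCATION"])
```

Without this file, a server started on the copied data directory would have no record of where the backup began and could start recovery from a later point, skipping WAL that the copied pages still need.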
--
Michael
