Proposed doc-patch: Identifying the Current WAL file

bruce@momjian.us

about 20 years ago

In reply to: Richard Huxton (#1)

Re: Proposed doc-patch: Identifying the Current WAL file

Richard Huxton wrote:

Reading Tom's posting here:
http://archives.postgresql.org/pgsql-general/2006-04/msg00499.php

I just realised we don't seem to mention this in the docs anywhere. I
propose adding a short paragraph to 23.3.1
http://www.postgresql.org/docs/8.1/static/backup-online.html#BACKUP-ONLINE-CAVEATS

After "Except in the case of retrying a failure, it will be called only
once for any given file name."

To identify the current, partially-filled WAL segment, sort first by
mtime and second by file name. That is, take the latest mtime among the
properly-named files, breaking ties by taking the higher filename.

I am confused by this. Why do both mtime and file name need to be
checked?

--
Bruce Momjian http://candle.pha.pa.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

tgl@sss.pgh.pa.us

about 20 years ago

In reply to: Bruce Momjian (#2)

Re: Proposed doc-patch: Identifying the Current WAL file

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Richard Huxton wrote:

To identify the current, partially-filled WAL segment, sort first by
mtime and second by file name. That is, take the latest mtime among the
properly-named files, breaking ties by taking the higher filename.

I am confused by this. Why do both mtime and file name need to be
checked?

Because recycled WAL segments are renamed to have higher file names than
the currently-in-use segment. So you can't depend on file name first.
However, shortly after a segment switch two WAL segments could have the
same mtime (to within whatever the mtime granularity is, typ. 1 second).

The proposed rule should be OK as long as checkpoints (and ensuing
renames) can't occur oftener than the mtime granularity. If you're
checkpointing more than once a second, well, you need help ...

regards, tom lane

bruce@momjian.us

about 20 years ago

In reply to: Tom Lane (#3)

Re: Proposed doc-patch: Identifying the Current WAL file

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Richard Huxton wrote:

To identify the current, partially-filled WAL segment, sort first by
mtime and second by file name. That is, take the latest mtime among the
properly-named files, breaking ties by taking the higher filename.

I am confused by this. Why do both mtime and file name need to be
checked?

Because recycled WAL segments are renamed to have higher file names than
the currently-in-use segment. So you can't depend on file name first.
However, shortly after a segment switch two WAL segments could have the
same mtime (to within whatever the mtime granularity is, typ. 1 second).

The proposed rule should be OK as long as checkpoints (and ensuing
renames) can't occur oftener than the mtime granularity. If you're
checkpointing more than once a second, well, you need help ...

I am trying to figure out how this could even be done in a shell script.
'ls -lt' is going to show:

-rw-r--r-- 1 root postgres 0 Apr 15 11:56 x1
-rw-r--r-- 1 root postgres 0 Apr 15 11:56 x2
-rw-r--r-- 1 root postgres 0 Apr 15 11:56 x3

but this might be with second resolution:

-rw-r--r-- 1 root postgres 0 Apr 15 11:56:47 x1
-rw-r--r-- 1 root postgres 0 Apr 15 11:56:47 x2
-rw-r--r-- 1 root postgres 0 Apr 15 11:56:34 x3

or it might be:

-rw-r--r-- 1 root postgres 0 Apr 15 11:56:47 x1
-rw-r--r-- 1 root postgres 0 Apr 15 11:56:34 x2
-rw-r--r-- 1 root postgres 0 Apr 15 11:56:25 x3

In the first case, x2 is current, having be just switched to from x1,
while in the second case, x1 is current. In BSD, you can use ls -ltT to
see the seconds, but in Linux it is something different, and I am sure
there are some operating systems that don't allow you to see the seconds
at all. What general command-line solution can we propose for this
process? And if we can't provide one, should we supply an SQL function
to return the current WAL name?

--
Bruce Momjian http://candle.pha.pa.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

tgl@sss.pgh.pa.us

about 20 years ago

In reply to: Bruce Momjian (#4)

Re: Proposed doc-patch: Identifying the Current WAL file

Bruce Momjian <pgman@candle.pha.pa.us> writes:

In the first case, x2 is current, having be just switched to from x1,
while in the second case, x1 is current. In BSD, you can use ls -ltT to
see the seconds, but in Linux it is something different, and I am sure
there are some operating systems that don't allow you to see the seconds
at all. What general command-line solution can we propose for this
process?

For a command-line solution it's probably sufficient to sort by mtime,
ie
ls -t | head -1

You'll be at worst 1 second behind reality, assuming 1-second
granularity of mtime (and assuming ls sorts by the real mtime not what
it shows you, but that's true everywhere AFAIK).

regards, tom lane

http://archives.postgresql.org/pgsql-admin/2005-10/msg00173.php

jeff@frostconsultingllc.com

about 20 years ago

In reply to: Tom Lane (#5)

Re: Proposed doc-patch: Identifying the Current WAL file

On Sat, 15 Apr 2006, Tom Lane wrote:

For a command-line solution it's probably sufficient to sort by mtime,
ie
ls -t | head -1

A while back when I was trying to work this out on the admin list, I believe
we came up with the following:

ls -tp /pg_xlog/ | grep -v "backup\|/" | head -1

which seems to work fairly well. Looks like that thread is here:

--
Jeff Frost, Owner <jeff@frostconsultingllc.com>
Frost Consulting, LLC http://www.frostconsultingllc.com/
Phone: 650-780-7908 FAX: 650-649-1954

bruce@momjian.us

about 20 years ago

In reply to: Jeff Frost (#6)

Re: Proposed doc-patch: Identifying the Current WAL file

Jeff Frost wrote:

On Sat, 15 Apr 2006, Tom Lane wrote:

For a command-line solution it's probably sufficient to sort by mtime,
ie
ls -t | head -1

A while back when I was trying to work this out on the admin list, I believe
we came up with the following:

ls -tp /pg_xlog/ | grep -v "backup\|/" | head -1

which seems to work fairly well. Looks like that thread is here:

http://archives.postgresql.org/pgsql-admin/2005-10/msg00173.php

What does the -p and \| pipe check do? We don't have named pipes in
that directory, do we?

Also, what happens if the log switch happens, and some data change is
written to the new WAL file in the first second, but nothing happens to
the database after that for a minute? Your test would still show the
old log file.

--
Bruce Momjian http://candle.pha.pa.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

jeff@frostconsultingllc.com

about 20 years ago

In reply to: Bruce Momjian (#7)

Re: Proposed doc-patch: Identifying the Current WAL file

On Sat, 15 Apr 2006, Bruce Momjian wrote:

ls -tp /pg_xlog/ | grep -v "backup\|/" | head -1

which seems to work fairly well. Looks like that thread is here:

http://archives.postgresql.org/pgsql-admin/2005-10/msg00173.php

What does the -p and \| pipe check do? We don't have named pipes in
that directory, do we?

Also, what happens if the log switch happens, and some data change is
written to the new WAL file in the first second, but nothing happens to
the database after that for a minute? Your test would still show the
old log file.

The -p shows forward slashes after directories and the \| acts as an or and
the / following that just lets us filter directories out. I added that
because I used to find the archive_status directory winning the ls -t from
time to time.

Now about your what if question. I don't know if there is a way to get past
the mtime granularity. If I understand your scenario correctly, you indicate
that the previous log is written to and the new log is switched in during the
same mtime second. I did a quick test on linux to see how that operates:

touch AA AB

This yields both files with the same mtime.

ls -tp |head -1

yields AA as you suggested it would.

The following seems to do the trick if we can rely on alphabetizing to
properly decide the winner of a tie:

ls -tp | head -2 | sort -r | head -1

So, with my previous example, it would look like:

--
Jeff Frost, Owner <jeff@frostconsultingllc.com>
Frost Consulting, LLC http://www.frostconsultingllc.com/
Phone: 650-780-7908 FAX: 650-649-1954

tgl@sss.pgh.pa.us

about 20 years ago

In reply to: Jeff Frost (#8)

Re: Proposed doc-patch: Identifying the Current WAL file

Jeff Frost <jeff@frostconsultingllc.com> writes:

ls -tp /pg_xlog/ | grep -v "backup\|/" | head -1

What does the -p and \| pipe check do? We don't have named pipes in
that directory, do we?

The -p shows forward slashes after directories and the \| acts as an or and
the / following that just lets us filter directories out.

This seems both overly cute and wrong, because it fails to filter plain
files that might have a new mtime but aren't WAL files. I'd suggest a
simple test on file name to make sure it looks like a WAL file, ie,
24 hex digits.

ls -t .../pg_xlog | grep '^[0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F]$' | head -1

regards, tom lane

#10

tgl@sss.pgh.pa.us

about 20 years ago

In reply to: Bruce Momjian (#7)

Re: Proposed doc-patch: Identifying the Current WAL file

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Also, what happens if the log switch happens, and some data change is
written to the new WAL file in the first second, but nothing happens to
the database after that for a minute? Your test would still show the
old log file.

You seem to be assuming that ls will sort on the basis of the truncated
mtime that it displays, which is not the actual behavior of ls AFAIK.

regards, tom lane

#11

bruce@momjian.us

about 20 years ago

In reply to: Tom Lane (#10)

Re: Proposed doc-patch: Identifying the Current WAL file

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Also, what happens if the log switch happens, and some data change is
written to the new WAL file in the first second, but nothing happens to
the database after that for a minute? Your test would still show the
old log file.

You seem to be assuming that ls will sort on the basis of the truncated
mtime that it displays, which is not the actual behavior of ls AFAIK.

No, I am not:

$ touch x1 x2; touch x2
$ sleep 2; ls -lt
total 0
-rw-r--r-- 1 root postgres 0 Apr 15 14:04 x1
-rw-r--r-- 1 root postgres 0 Apr 15 14:04 x2

If the write to x2 happens in the first second, but no later writes
happen, you still see x1 as first, even though x2 is the new one and
might have WAL data in it. The point is that the test does not have a
one-second window of showing the wrong answer, meaning I could wait for
60 seconds, and still see the wrong WAL file at the top.

--
Bruce Momjian http://candle.pha.pa.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#12

jeff@frostconsultingllc.com

about 20 years ago

In reply to: Tom Lane (#10)

Re: Proposed doc-patch: Identifying the Current WAL file

On Sat, 15 Apr 2006, Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Also, what happens if the log switch happens, and some data change is
written to the new WAL file in the first second, but nothing happens to
the database after that for a minute? Your test would still show the
old log file.

You seem to be assuming that ls will sort on the basis of the truncated
mtime that it displays, which is not the actual behavior of ls AFAIK.

I believe you're correct, at least with the ls I have here on my linux system.
I created two files quickly with touch. Here is the stat output:

File: `AA'
Size: 0 Blocks: 0 IO Block: 4096 regular empty file
Device: 802h/2050d Inode: 2736263 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 26/postgres) Gid: ( 26/postgres)
Access: 2006-04-15 11:02:46.000000000 -0700
Modify: 2006-04-15 11:02:46.000000000 -0700
Change: 2006-04-15 11:02:46.000000000 -0700
File: `AB'
Size: 0 Blocks: 0 IO Block: 4096 regular empty file
Device: 802h/2050d Inode: 2736264 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 26/postgres) Gid: ( 26/postgres)
Access: 2006-04-15 11:02:48.000000000 -0700
Modify: 2006-04-15 11:02:48.000000000 -0700
Change: 2006-04-15 11:02:48.000000000 -0700

ls -t | head -1
AB

So it looks like the original was correct all along, but with Tom's regex it's
much cleaner. Too bad grep's regex engine doesn't support {24}.

I'll change all my scripts to use the following:

ls -t /pg_xlog/ | grep '^[0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F]$' | head -1

Thanks Tom!

--
Jeff Frost, Owner <jeff@frostconsultingllc.com>
Frost Consulting, LLC http://www.frostconsultingllc.com/
Phone: 650-780-7908 FAX: 650-649-1954

#13

jeff@frostconsultingllc.com

about 20 years ago

In reply to: Bruce Momjian (#11)

Re: Proposed doc-patch: Identifying the Current WAL file

On Sat, 15 Apr 2006, Bruce Momjian wrote:

Tom Lane wrote:

No, I am not:

$ touch x1 x2; touch x2
$ sleep 2; ls -lt
total 0
-rw-r--r-- 1 root postgres 0 Apr 15 14:04 x1
-rw-r--r-- 1 root postgres 0 Apr 15 14:04 x2

If the write to x2 happens in the first second, but no later writes
happen, you still see x1 as first, even though x2 is the new one and
might have WAL data in it. The point is that the test does not have a
one-second window of showing the wrong answer, meaning I could wait for
60 seconds, and still see the wrong WAL file at the top.

Bruce, what does

stat x1 x2

look like on your system? Which OS? Maybe we need caveats for various OSes?

--
Jeff Frost, Owner <jeff@frostconsultingllc.com>
Frost Consulting, LLC http://www.frostconsultingllc.com/
Phone: 650-780-7908 FAX: 650-649-1954

#14

bruce@momjian.us

about 20 years ago

In reply to: Jeff Frost (#13)

Re: Proposed doc-patch: Identifying the Current WAL file

Jeff Frost wrote:

On Sat, 15 Apr 2006, Bruce Momjian wrote:

Tom Lane wrote:

No, I am not:

$ touch x1 x2; touch x2
$ sleep 2; ls -lt
total 0
-rw-r--r-- 1 root postgres 0 Apr 15 14:04 x1
-rw-r--r-- 1 root postgres 0 Apr 15 14:04 x2

If the write to x2 happens in the first second, but no later writes
happen, you still see x1 as first, even though x2 is the new one and
might have WAL data in it. The point is that the test does not have a
one-second window of showing the wrong answer, meaning I could wait for
60 seconds, and still see the wrong WAL file at the top.

Bruce, what does

stat x1 x2

look like on your system? Which OS? Maybe we need caveats for various OSes?

System is BSD/OS:

$ touch x1 x2; stat x1 x2
filename: x1
inode: 4547362
device: 18,7
size: 0
type: regular file
links: 1
mode: 0644(-rw-r--r--)
owner: 0(root)
group: 102(postgres)
access: Sat Apr 15 14:41:12 2006
modification: Sat Apr 15 14:41:12 2006
change: Sat Apr 15 14:41:12 2006

filename: x2
inode: 4547363
device: 18,7
size: 0
type: regular file
links: 1
mode: 0644(-rw-r--r--)
owner: 0(root)
group: 102(postgres)
access: Sat Apr 15 14:41:12 2006
modification: Sat Apr 15 14:41:12 2006
change: Sat Apr 15 14:41:12 2006

And I tried it on Fedora Core 2:

bmomjian@x86-linux2:~$ touch x1 x2 ; stat x1 x2
File: `x1'
Size: 0 Blocks: 0 IO Block: 8192 regular
empty file
Device: 11h/17d Inode: 24707337 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 7078/bmomjian) Gid: ( 100/ users)
Access: 2006-04-15 11:41:53.000000000 -0700
Modify: 2006-04-15 11:41:53.000000000 -0700
Change: 2006-04-15 11:41:53.000000000 -0700
File: `x2'
Size: 0 Blocks: 0 IO Block: 8192 regular
empty file
Device: 11h/17d Inode: 24707338 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 7078/bmomjian) Gid: ( 100/ users)
Access: 2006-04-15 11:41:53.000000000 -0700
Modify: 2006-04-15 11:41:53.000000000 -0700
Change: 2006-04-15 11:41:53.000000000 -0700

--
Bruce Momjian http://candle.pha.pa.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#15

jeff@frostconsultingllc.com

about 20 years ago

In reply to: Bruce Momjian (#14)

Re: Proposed doc-patch: Identifying the Current WAL file

On Sat, 15 Apr 2006, Bruce Momjian wrote:

And I tried it on Fedora Core 2:

Device: 11h/17d Inode: 24707338 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 7078/bmomjian) Gid: ( 100/ users)
Access: 2006-04-15 11:41:53.000000000 -0700
Modify: 2006-04-15 11:41:53.000000000 -0700
Change: 2006-04-15 11:41:53.000000000 -0700

And you know what? I pulled a bonehead maneuver when I read the output of my
stat command. FC3/4 appear to be the same. I wonder if this is filesystem
dependent since stat has all those trailing 0's for some reason. Maybe we are
back to using sort to decide the winner of a tie?

--
Jeff Frost, Owner <jeff@frostconsultingllc.com>
Frost Consulting, LLC http://www.frostconsultingllc.com/
Phone: 650-780-7908 FAX: 650-649-1954

#16

tgl@sss.pgh.pa.us

about 20 years ago

In reply to: Bruce Momjian (#11)

Re: Proposed doc-patch: Identifying the Current WAL file

Bruce Momjian <pgman@candle.pha.pa.us> writes:

The point is that the test does not have a
one-second window of showing the wrong answer, meaning I could wait for
60 seconds, and still see the wrong WAL file at the top.

Oh, I see your point: you can lose at most one second's worth of data,
but that second could be arbitrarily long ago if it was the latest
activity in the database. Yeah, that's a bit unpleasant. So we really
do need both parts of the ordering rule, and there seems no way to do
that with just 'ls'.

I wonder if you could do anything with find(1)'s -newer switch?
It seems to be a '>' condition not a '>=' condition, so it'd be
pretty awkward ... certainly not a one-liner.

I think everyone agrees that adding a SQL function would be a reasonable
thing to do, anyway.

regards, tom lane

#17

Simon Riggs

simon@2ndQuadrant.com

about 20 years ago

In reply to: Bruce Momjian (#4)

Re: Proposed doc-patch: Identifying the Current WAL file

On Sat, 2006-04-15 at 12:24 -0400, Bruce Momjian wrote:

And if we can't provide one, should we supply an SQL function
to return the current WAL name?

I'll do this. Just give me a few days to get my feet under the new desk.
I know its well past time I sorted this and a few other things out.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com/

#18