[PATCH] xlogreader: do not read a file block twice

Started by Arthur Zakirovalmost 7 years ago11 messages
#1Arthur Zakirov
a.zakirov@postgrespro.ru
1 attachment(s)

Hello hackers,

Grigory noticed that one of our utilities has very slow performance when
xlogreader reads zlib archives. We found out that xlogreader sometimes
reads a WAL file block twice.

zlib has slow performance when you read an archive not in sequential
order. I think reading a block twice in same position isn't sequential,
because gzread() moves current position forward and next call gzseek()
to the same position moves it back.

It seems that the attached patch solves the issue. I think when reqLen
== state->readLen the requested block already is in the xlogreader's buffer.

What do you think?

--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

Attachments:

xlogreader-reqlen.patchtext/x-patch; name=xlogreader-reqlen.patchDownload
diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 69b6226f8f..cbc7e4e7ea 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -536,7 +536,7 @@ ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr, int reqLen)
 
 	/* check whether we have all the requested data already */
 	if (targetSegNo == state->readSegNo && targetPageOff == state->readOff &&
-		reqLen < state->readLen)
+		reqLen <= state->readLen)
 		return state->readLen;
 
 	/*
#2Grigory Smolkin
g.smolkin@postgrespro.ru
In reply to: Arthur Zakirov (#1)
Re: [PATCH] xlogreader: do not read a file block twice

Hm, looks like it could speed up PostgreSQL recovery, but is it safe?

On 02/11/2019 07:25 PM, Arthur Zakirov wrote:

Hello hackers,

Grigory noticed that one of our utilities has very slow performance
when xlogreader reads zlib archives. We found out that xlogreader
sometimes reads a WAL file block twice.

zlib has slow performance when you read an archive not in sequential
order. I think reading a block twice in same position isn't
sequential, because gzread() moves current position forward and next
call gzseek() to the same position moves it back.

It seems that the attached patch solves the issue. I think when reqLen
== state->readLen the requested block already is in the xlogreader's
buffer.

What do you think?

--
Grigory Smolkin
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#3Michael Paquier
michael@paquier.xyz
In reply to: Grigory Smolkin (#2)
Re: [PATCH] xlogreader: do not read a file block twice

On Mon, Feb 11, 2019 at 07:32:59PM +0300, Grigory Smolkin wrote:

Hm, looks like it could speed up PostgreSQL recovery, but is it
safe?

(Please avoid top-posting.)

On 02/11/2019 07:25 PM, Arthur Zakirov wrote:

Grigory noticed that one of our utilities has very slow performance when
xlogreader reads zlib archives. We found out that xlogreader sometimes
reads a WAL file block twice.

What do you think?

I think that such things, even if they look simple, need a careful
lookup, and I have not looked at the proposal yet. Could you add it
to the next commit fest so as we don't lose track of it?
https://commitfest.postgresql.org/22/
--
Michael

#4Arthur Zakirov
a.zakirov@postgrespro.ru
In reply to: Michael Paquier (#3)
Re: [PATCH] xlogreader: do not read a file block twice

On 12.02.2019 07:23, Michael Paquier wrote:

On 02/11/2019 07:25 PM, Arthur Zakirov wrote:

Grigory noticed that one of our utilities has very slow performance when
xlogreader reads zlib archives. We found out that xlogreader sometimes
reads a WAL file block twice.

What do you think?

I think that such things, even if they look simple, need a careful
lookup, and I have not looked at the proposal yet. Could you add it
to the next commit fest so as we don't lose track of it?
https://commitfest.postgresql.org/22/

Of course. Agree, it may be a non trivial case. Added as a bug fix:
https://commitfest.postgresql.org/22/1994/

--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

#5Andrey Lepikhov
a.lepikhov@postgrespro.ru
In reply to: Arthur Zakirov (#1)
Re: [PATCH] xlogreader: do not read a file block twice

On 11.02.2019 21:25, Arthur Zakirov wrote:

Hello hackers,

Grigory noticed that one of our utilities has very slow performance when
xlogreader reads zlib archives. We found out that xlogreader sometimes
reads a WAL file block twice.

zlib has slow performance when you read an archive not in sequential
order. I think reading a block twice in same position isn't sequential,
because gzread() moves current position forward and next call gzseek()
to the same position moves it back.

It seems that the attached patch solves the issue. I think when reqLen
== state->readLen the requested block already is in the xlogreader's
buffer.

What do you think?

I looked at the history of the code changes:

---------------------------------------------------------------
7fcbf6a405f (Alvaro Herrera 2013-01-16 16:12:53 -0300 539)
reqLen < state->readLen)

1bb2558046c (Heikki Linnakangas 2010-01-27 15:27:51 +0000 9349)
targetPageOff == readOff && targetRecOff < readLen)

eaef111396e (Tom Lane 2006-04-03 23:35:05 +0000 3842)
len = XLOG_BLCKSZ - RecPtr->xrecoff % XLOG_BLCKSZ;
4d14fe0048c (Tom Lane 2001-03-13 01:17:06 +0000 3843)
if (total_len > len)
---------------------------------------------------------------

In the original code of Tom Lane, condition (total_len > len) caused a
page reread from disk. As I understand it, this is equivalent to your
proposal.
Th code line in commit 1bb2558046c seems tantamount to the corresponding
line in commit 7fcbf6a405f but have another semantics: the targetPageOff
value can't be more or equal XLOG_BLCKSZ, but the reqLen value can be.
It may be a reason of appearance of possible mistake, introduced by
commit 7fcbf6a405f.

--
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company

#6Arthur Zakirov
a.zakirov@postgrespro.ru
In reply to: Andrey Lepikhov (#5)
Re: [PATCH] xlogreader: do not read a file block twice

On 12.02.2019 20:47, Andrey Lepikhov wrote:

I looked at the history of the code changes:

---------------------------------------------------------------
7fcbf6a405f (Alvaro Herrera 2013-01-16 16:12:53 -0300 539) reqLen <
state->readLen)

1bb2558046c (Heikki Linnakangas 2010-01-27 15:27:51 +0000 9349)
targetPageOff == readOff && targetRecOff < readLen)

eaef111396e (Tom Lane 2006-04-03 23:35:05 +0000 3842)
len = XLOG_BLCKSZ - RecPtr->xrecoff % XLOG_BLCKSZ;
4d14fe0048c (Tom Lane 2001-03-13 01:17:06 +0000 3843)
if (total_len > len)
---------------------------------------------------------------

In the original code of Tom Lane, condition (total_len > len) caused a
page reread from disk. As I understand it, this is equivalent to your
proposal.
Th code line in commit 1bb2558046c seems tantamount to the corresponding
line in commit 7fcbf6a405f but have another semantics: the targetPageOff
value can't be more or equal XLOG_BLCKSZ, but the reqLen value can be.
It may be a reason of appearance of possible mistake, introduced by
commit 7fcbf6a405f.

Thank you for your research. Indeed, it makes sense now.

In my case after reading a page both `reqLen` and `state->readLen` equal
to XLOG_BLCKSZ. This leads to a page reread, since `pageptr` is the same
as the previous read. But `targetRecOff` is different in the second case
because we want to read next record, which probably doesn't fit into the
page wholly (that's why `reqLen` is equal to XLOG_BLCKSZ).

--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

#7Michael Paquier
michael@paquier.xyz
In reply to: Arthur Zakirov (#4)
Re: [PATCH] xlogreader: do not read a file block twice

On Tue, Feb 12, 2019 at 11:44:14AM +0300, Arthur Zakirov wrote:

Of course. Agree, it may be a non trivial case. Added as a bug fix:
https://commitfest.postgresql.org/22/1994/

I have been looking at the patch, and I agree that the current coding
is a bit crazy. If the wanted data has already been read, it makes
little sense to require reading it again if the size requested by the
caller of ReadPageInternal() exactly equals what has been read
already, and that's what the code is doing.

Now I don't actually agree that this qualifies as a bug fix. As
things stand, a page may finish by being more than once if what has
been read previously equals what is requested, however this does not
prevent the code to work correctly. The performance gain is also
heavily dependent on the callback reading a page and the way the WAL
reader is used. How do you actually read WAL pages in your own
plugin with compressed data? It begins by reading a full page once,
then it moves on to a per-record read after making sure that the page
has been read?
--
Michael

#8Arthur Zakirov
a.zakirov@postgrespro.ru
In reply to: Michael Paquier (#7)
Re: [PATCH] xlogreader: do not read a file block twice

On 14.02.2019 09:51, Michael Paquier wrote:

Now I don't actually agree that this qualifies as a bug fix. As
things stand, a page may finish by being more than once if what has
been read previously equals what is requested, however this does not
prevent the code to work correctly. The performance gain is also
heavily dependent on the callback reading a page and the way the WAL
reader is used. How do you actually read WAL pages in your own
plugin with compressed data? It begins by reading a full page once,
then it moves on to a per-record read after making sure that the page
has been read?

Yes, an application reads WAL pages wholly at a time. It is done within
SimpleXLogPageRead() (it is a read_page callback passed to
XLogReaderAllocate()). It returns XLOG_BLCKSZ.

Here is the part of the code, not sure that it will be useful though:

SimpleXLogPageRead(...)
{
...
targetPageOff = targetPagePtr % private_data->xlog_seg_size;
...
if (gzseek(private_data->gz_xlogfile, (z_off_t) targetPageOff,
SEEK_SET) == -1)
...
if (gzread(private_data->gz_xlogfile, readBuf, XLOG_BLCKSZ) !=
XLOG_BLCKSZ)
...
return XLOG_BLCKSZ;
}

So we read whole page with size XLOG_BLCKSZ. The full code:
https://github.com/postgrespro/pg_probackup/blob/c052651b8c8864733bcabbc2660c387b792229d8/src/parsexlog.c#L1074

Here is the little optimization I made. Mainly I just add a buffer to
store previous read page:
https://github.com/postgrespro/pg_probackup/blob/c052651b8c8864733bcabbc2660c387b792229d8/src/parsexlog.c#L1046

--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

#9Michael Paquier
michael@paquier.xyz
In reply to: Arthur Zakirov (#8)
Re: [PATCH] xlogreader: do not read a file block twice

On Thu, Feb 14, 2019 at 11:20:56AM +0300, Arthur Zakirov wrote:

So we read whole page with size XLOG_BLCKSZ. The full code:
https://github.com/postgrespro/pg_probackup/blob/c052651b8c8864733bcabbc2660c387b792229d8/src/parsexlog.c#L1074

Here is the little optimization I made. Mainly I just add a buffer to store
previous read page:
https://github.com/postgrespro/pg_probackup/blob/c052651b8c8864733bcabbc2660c387b792229d8/src/parsexlog.c#L1046

Thanks, I see what you have done. I cannot comment if your shortcut
is actually fully correct based on my knowledge of this code, but
things cannot be in the best conditions without having the WAL reader
handle properly the limits. So I am planning to commit what you
propose after an extra pass on it in the next couple of days or so.
--
Michael

#10Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#9)
Re: [PATCH] xlogreader: do not read a file block twice

On Fri, Feb 15, 2019 at 08:06:16AM +0900, Michael Paquier wrote:

Thanks, I see what you have done. I cannot comment if your shortcut
is actually fully correct based on my knowledge of this code, but
things cannot be in the best conditions without having the WAL reader
handle properly the limits. So I am planning to commit what you
propose after an extra pass on it in the next couple of days or so.

And done, after doing and extra pass, doing more testing using by own
plugins, pg_waldump and more fancy stuff with a primary/standby and
pgbench.
--
Michael

#11Arthur Zakirov
a.zakirov@postgrespro.ru
In reply to: Michael Paquier (#10)
Re: [PATCH] xlogreader: do not read a file block twice

On 18.02.2019 04:09, Michael Paquier wrote:

And done, after doing and extra pass, doing more testing using by own
plugins, pg_waldump and more fancy stuff with a primary/standby and
pgbench.

Thank you Michael.

--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company