doPickSplit stack buffer overflow in XLogInsert?
I happened to build in a shell that was still set up for the clang
address sanitizer, and got the attached report. On a rerun it was
repeatable. XLogInsert() seems to read past the end of a variable
allocated on the stack in doPickSplit(). I haven't tried to analyze
it past that, since this part of the code is unfamiliar to me.
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2013-11-26 14:14:38 -0800, Kevin Grittner wrote:
I happened to build in a shell that was still set up for the clang
address sanitizer, and got the attached report.� On a rerun it was
repeatable.� XLogInsert() seems to read past the end of a variable
allocated on the stack in doPickSplit(). I haven't tried to analyze
it past that, since this part of the code is unfamiliar to me.
Yea, I've seen that one before as well and planned to report it at some
point. The reason is the MAXALIGN()s in ACCEPT_RDATA_DATA(). That rounds
up to 8byte boundaries, while we've e.g. only added 2bytes of slop to
toDelete.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Andres Freund <andres@2ndquadrant.com> wrote:
On 2013-11-26 14:14:38 -0800, Kevin Grittner wrote:
I happened to build in a shell that was still set up for the clang
address sanitizer, and got the attached report. On a rerun it was
repeatable. XLogInsert() seems to read past the end of a variable
allocated on the stack in doPickSplit(). I haven't tried to analyze
it past that, since this part of the code is unfamiliar to me.Yea, I've seen that one before as well and planned to report it at some
point. The reason is the MAXALIGN()s in ACCEPT_RDATA_DATA(). That rounds
up to 8byte boundaries, while we've e.g. only added 2bytes of slop to
toDelete.
Have you established whether having the CRC calculation read past
the end of the buffer can cause problems on recovery or standby
systems? Should we try to get this fixed by Monday?
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Nov 27, 2013 at 06:23:38AM -0800, Kevin Grittner wrote:
Andres Freund <andres@2ndquadrant.com> wrote:
On 2013-11-26 14:14:38 -0800, Kevin Grittner wrote:
I happened to build in a shell that was still set up for the clang
address sanitizer, and got the attached report.� On a rerun it was
(Kevin, I saw no attachment.)
repeatable.� XLogInsert() seems to read past the end of a variable
allocated on the stack in doPickSplit(). I haven't tried to analyze
it past that, since this part of the code is unfamiliar to me.Yea, I've seen that one before as well and planned to report it at some
point. The reason is the MAXALIGN()s in ACCEPT_RDATA_DATA(). That rounds
up to 8byte boundaries, while we've e.g. only added 2bytes of slop to
toDelete.Have you established whether having the CRC calculation read past
the end of the buffer can cause problems on recovery or standby
systems?� Should we try to get this fixed by Monday?
The threat is that rounding the read size up to the next MAXALIGN would cross
into an unreadable memory page, resulting in a SIGSEGV. Every palloc chunk
has MAXALIGN'd size under the hood, so the excess read of "toDelete" cannot
cause a SIGSEGV. For a stack variable, it depends on the ABI. I'm not aware
of an ABI where the four bytes past the end of this stack variable could be
unreadable, which is not to claim I'm well-read on the topic. We should fix
this in due course on code hygiene grounds, but I would not back-patch it.
--
Noah Misch
EnterpriseDB http://www.enterprisedb.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Noah Misch <noah@leadboat.com> wrote:
(Kevin, I saw no attachment.)
Apologies. Trying again.
The threat is that rounding the read size up to the next MAXALIGN
would cross into an unreadable memory page, resulting in a
SIGSEGV. Every palloc chunk has MAXALIGN'd size under the hood,
so the excess read of "toDelete" cannot cause a SIGSEGV. For a
stack variable, it depends on the ABI. I'm not aware of an ABI
where the four bytes past the end of this stack variable could be
unreadable, which is not to claim I'm well-read on the topic. We
should fix this in due course on code hygiene grounds, but I
would not back-patch it.
If you're sure. I hadn't worked through the code, but had two
concerns (neither of which was about a SEGSEGV):
(1) That multiple MAXALIGNs of shorter values could push the
structure into overlap with the next thing on the stack, allowing
one or the other to get stepped on.
(2) That the CRC calculation might picking up uninitialized data
which was not actually going to match what was used during
recovery, leading to "end of recovery" on replay.
If you are confident that neither of these is a real risk, I'll
relax about this.
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
doPickSplit-stack-buffer-overrun.txttext/plain; name=doPickSplit-stack-buffer-overrun.txtDownload
On Wed, Nov 27, 2013 at 11:38:23AM -0800, Kevin Grittner wrote:
Noah Misch <noah@leadboat.com> wrote:
The threat is that rounding the read size up to the next MAXALIGN
would cross into an unreadable memory page, resulting in a
SIGSEGV.� Every palloc chunk has MAXALIGN'd size under the hood,
so the excess read of "toDelete" cannot cause a SIGSEGV.� For a
stack variable, it depends on the ABI.� I'm not aware of an ABI
where the four bytes past the end of this stack variable could be
unreadable, which is not to claim I'm well-read on the topic.� We
should fix this in due course on code hygiene grounds, but I
would not back-patch it.If you're sure.� I hadn't worked through the code, but had two
concerns (neither of which was about a SEGSEGV):(1)� That multiple MAXALIGNs of shorter values could push the
structure into overlap with the next thing on the stack, allowing
one or the other to get stepped on.
These are out-of-bounds reads only, not writes. Also, the excess doesn't
accumulate that way; the code reads beyond any particular stack variable or
palloc chunk by no more than 7 bytes.
(2)� That the CRC calculation might picking up uninitialized data
which was not actually going to match what was used during
recovery, leading to "end of recovery" on replay.
The CRC calculation does pick up unspecified bytes, but we copy those same
bytes into the actual WAL record. The effect is similar to that of
unspecified pad bytes in the middle of xlrec structures.
If you are confident that neither of these is a real risk, I'll
relax about this.
If there is a real risk, I'm not seeing it.
--
Noah Misch
EnterpriseDB http://www.enterprisedb.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2013-11-27 15:29:24 -0500, Noah Misch wrote:
If you are confident that neither of these is a real risk, I'll
relax about this.If there is a real risk, I'm not seeing it.
Me neither.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 11/26/13, 5:14 PM, Kevin Grittner wrote:
I happened to build in a shell that was still set up for the clang
address sanitizer, and got the attached report. On a rerun it was
repeatable. XLogInsert() seems to read past the end of a variable
allocated on the stack in doPickSplit(). I haven't tried to analyze
it past that, since this part of the code is unfamiliar to me.
I also see that. It only happens in 64-bit builds.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Nov 27, 2013 at 9:10 AM, Noah Misch <noah@leadboat.com> wrote:
The threat is that rounding the read size up to the next MAXALIGN would cross
into an unreadable memory page, resulting in a SIGSEGV. Every palloc chunk
has MAXALIGN'd size under the hood, so the excess read of "toDelete" cannot
cause a SIGSEGV. For a stack variable, it depends on the ABI. I'm not aware
of an ABI where the four bytes past the end of this stack variable could be
unreadable, which is not to claim I'm well-read on the topic. We should fix
this in due course on code hygiene grounds, but I would not back-patch it.
Attached patch silences the "Invalid read of size n" complaints of
Valgrind. I agree with your general thoughts around backpatching. Note
that the patch addresses a distinct complaint from Kevin's, as
Valgrind doesn't take issue with the invalid reads past the end of
spgxlogPickSplit variables on the stack.
--
Peter Geoghegan
Attachments:
spgsplit_invalid_read.patchtext/x-patch; charset=US-ASCII; name=spgsplit_invalid_read.patchDownload+10-10
On Wed, Mar 26, 2014 at 9:45 PM, Peter Geoghegan <pg@heroku.com> wrote:
On Wed, Nov 27, 2013 at 9:10 AM, Noah Misch <noah@leadboat.com> wrote:
The threat is that rounding the read size up to the next MAXALIGN would cross
into an unreadable memory page, resulting in a SIGSEGV. Every palloc chunk
has MAXALIGN'd size under the hood, so the excess read of "toDelete" cannot
cause a SIGSEGV. For a stack variable, it depends on the ABI. I'm not aware
of an ABI where the four bytes past the end of this stack variable could be
unreadable, which is not to claim I'm well-read on the topic. We should fix
this in due course on code hygiene grounds, but I would not back-patch it.Attached patch silences the "Invalid read of size n" complaints of
Valgrind. I agree with your general thoughts around backpatching. Note
that the patch addresses a distinct complaint from Kevin's, as
Valgrind doesn't take issue with the invalid reads past the end of
spgxlogPickSplit variables on the stack.
Is the needless zeroing this patch introduces apt to cause a
performance problem?
This function is actually pretty wacky. If we're stuffing bytes with
undefined contents into the WAL record, maybe the answer isn't to
force the contents of those bytes to be defined, but rather to elide
them from the WAL record.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi,
We really should fix this one of these days.
On 2014-03-26 18:45:54 -0700, Peter Geoghegan wrote:
Attached patch silences the "Invalid read of size n" complaints of
Valgrind. I agree with your general thoughts around backpatching. Note
that the patch addresses a distinct complaint from Kevin's, as
Valgrind doesn't take issue with the invalid reads past the end of
spgxlogPickSplit variables on the stack.
I don't think that's entirely sufficient. The local spgxlogPickSplit
xlrec;/spgxlogMoveLeafs xlrec; variables are also inserted while
MAXLIGNing their size. That's slightly harder to fix :(. I don't have a
better idea than also allocating them dynamically :(
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 03/31/2014 09:08 PM, Robert Haas wrote:
On Wed, Mar 26, 2014 at 9:45 PM, Peter Geoghegan <pg@heroku.com> wrote:
On Wed, Nov 27, 2013 at 9:10 AM, Noah Misch <noah@leadboat.com> wrote:
The threat is that rounding the read size up to the next MAXALIGN would cross
into an unreadable memory page, resulting in a SIGSEGV. Every palloc chunk
has MAXALIGN'd size under the hood, so the excess read of "toDelete" cannot
cause a SIGSEGV. For a stack variable, it depends on the ABI. I'm not aware
of an ABI where the four bytes past the end of this stack variable could be
unreadable, which is not to claim I'm well-read on the topic. We should fix
this in due course on code hygiene grounds, but I would not back-patch it.Attached patch silences the "Invalid read of size n" complaints of
Valgrind. I agree with your general thoughts around backpatching. Note
that the patch addresses a distinct complaint from Kevin's, as
Valgrind doesn't take issue with the invalid reads past the end of
spgxlogPickSplit variables on the stack.Is the needless zeroing this patch introduces apt to cause a
performance problem?This function is actually pretty wacky. If we're stuffing bytes with
undefined contents into the WAL record, maybe the answer isn't to
force the contents of those bytes to be defined, but rather to elide
them from the WAL record.
Agreed. I propose the attached, which removes the MAXALIGNs. It's not
suitable for backpatching, though, as it changes the format of the WAL
record.
- Heikki
Attachments:
spgist-wal-remove-maxaligns-1.patchtext/x-diff; name=spgist-wal-remove-maxaligns-1.patchDownload+49-38
On 2014-05-06 13:33:01 +0300, Heikki Linnakangas wrote:
On 03/31/2014 09:08 PM, Robert Haas wrote:
On Wed, Mar 26, 2014 at 9:45 PM, Peter Geoghegan <pg@heroku.com> wrote:
On Wed, Nov 27, 2013 at 9:10 AM, Noah Misch <noah@leadboat.com> wrote:
The threat is that rounding the read size up to the next MAXALIGN would cross
into an unreadable memory page, resulting in a SIGSEGV. Every palloc chunk
has MAXALIGN'd size under the hood, so the excess read of "toDelete" cannot
cause a SIGSEGV. For a stack variable, it depends on the ABI. I'm not aware
of an ABI where the four bytes past the end of this stack variable could be
unreadable, which is not to claim I'm well-read on the topic. We should fix
this in due course on code hygiene grounds, but I would not back-patch it.Attached patch silences the "Invalid read of size n" complaints of
Valgrind. I agree with your general thoughts around backpatching. Note
that the patch addresses a distinct complaint from Kevin's, as
Valgrind doesn't take issue with the invalid reads past the end of
spgxlogPickSplit variables on the stack.Is the needless zeroing this patch introduces apt to cause a
performance problem?This function is actually pretty wacky. If we're stuffing bytes with
undefined contents into the WAL record, maybe the answer isn't to
force the contents of those bytes to be defined, but rather to elide
them from the WAL record.Agreed. I propose the attached, which removes the MAXALIGNs. It's not
suitable for backpatching, though, as it changes the format of the WAL
record.
Not knowing anything about spgist this looks sane to me. Do we need a
backpatchable solution given that we seem to agree that these bugs
aren't likely to cause harm?
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 05/06/2014 07:36 PM, Andres Freund wrote:
On 2014-05-06 13:33:01 +0300, Heikki Linnakangas wrote:
On 03/31/2014 09:08 PM, Robert Haas wrote:
On Wed, Mar 26, 2014 at 9:45 PM, Peter Geoghegan <pg@heroku.com> wrote:
On Wed, Nov 27, 2013 at 9:10 AM, Noah Misch <noah@leadboat.com> wrote:
The threat is that rounding the read size up to the next MAXALIGN would cross
into an unreadable memory page, resulting in a SIGSEGV. Every palloc chunk
has MAXALIGN'd size under the hood, so the excess read of "toDelete" cannot
cause a SIGSEGV. For a stack variable, it depends on the ABI. I'm not aware
of an ABI where the four bytes past the end of this stack variable could be
unreadable, which is not to claim I'm well-read on the topic. We should fix
this in due course on code hygiene grounds, but I would not back-patch it.Attached patch silences the "Invalid read of size n" complaints of
Valgrind. I agree with your general thoughts around backpatching. Note
that the patch addresses a distinct complaint from Kevin's, as
Valgrind doesn't take issue with the invalid reads past the end of
spgxlogPickSplit variables on the stack.Is the needless zeroing this patch introduces apt to cause a
performance problem?This function is actually pretty wacky. If we're stuffing bytes with
undefined contents into the WAL record, maybe the answer isn't to
force the contents of those bytes to be defined, but rather to elide
them from the WAL record.Agreed. I propose the attached, which removes the MAXALIGNs. It's not
suitable for backpatching, though, as it changes the format of the WAL
record.y
Not knowing anything about spgist this looks sane to me.
Since we did a catversion bump anyway, I revisited this. Committed an
expanded version of the patch I posted earlier. I went through all the
SP-GiST WAL record types and removed the alignment requirements of
tuples from all of them. Most of them didn't have a problem because the
structs happened to have suitable alignment by accident, but seems best
to not rely on that, for the sake of consistency and robustness if the
structs are modified later.
I ran this through my testing tool that compares page image on master
and standby after every WAL record replay, with full_page_writes on and
off, and found no errors.
Do we need a
backpatchable solution given that we seem to agree that these bugs
aren't likely to cause harm?
I don't think we do.
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers