Long text values destroys logical replication slots

Started by Adam Dratwińskiover 10 years ago10 messagesbugs
Jump to latest
#1Adam Dratwiński
adam.dratwinski@gmail.com

I recently encountered a bug on my production postgres which results in
killing replication slot. It happens when there is an UPDATE statement
called on a record having text column with a very long value (about 1mb).
The problem can be easily reproduced by this query:

https://gist.github.com/adwinsky/f7a6dc4382776fb9ea60

Here I am using decoderbufs decoder https://github.com/krzychk/decoderbufs
but it doesn't seems to be related with this plugin as it also happen on
the postgres test decoder.

#2Andres Freund
andres@anarazel.de
In reply to: Adam Dratwiński (#1)
Re: Long text values destroys logical replication slots

On October 28, 2015 7:51:18 PM GMT+01:00, "Adam Dratwiński" <adam.dratwinski@gmail.com> wrote:

I recently encountered a bug on my production postgres which results in
killing replication slot. It happens when there is an UPDATE statement
called on a record having text column with a very long value (about
1mb).
The problem can be easily reproduced by this query:

https://gist.github.com/adwinsky/f7a6dc4382776fb9ea60

Here I am using decoderbufs decoder
https://github.com/krzychk/decoderbufs
but it doesn't seems to be related with this plugin as it also happen
on
the postgres test decoder.

Did you try the last minor release?

Andres
Hi
--- 
Please excuse brevity and formatting - I am writing this on my mobile phone.

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#3Adam Dratwiński
adam.dratwinski@gmail.com
In reply to: Andres Freund (#2)
Re: Long text values destroys logical replication slots

I managed to reproduce this problem on postgres 9.4.1 and 9.4.5 on osx and
gentoo.

On Wed, Oct 28, 2015 at 9:54 PM, Andres Freund <andres@anarazel.de> wrote:

Show quoted text

On October 28, 2015 7:51:18 PM GMT+01:00, "Adam Dratwiński" <
adam.dratwinski@gmail.com> wrote:

I recently encountered a bug on my production postgres which results in
killing replication slot. It happens when there is an UPDATE statement
called on a record having text column with a very long value (about
1mb).
The problem can be easily reproduced by this query:

https://gist.github.com/adwinsky/f7a6dc4382776fb9ea60

Here I am using decoderbufs decoder
https://github.com/krzychk/decoderbufs
but it doesn't seems to be related with this plugin as it also happen
on
the postgres test decoder.

Did you try the last minor release?

Andres
Hi
---
Please excuse brevity and formatting - I am writing this on my mobile
phone.

#4Andres Freund
andres@anarazel.de
In reply to: Adam Dratwiński (#3)
Re: Long text values destroys logical replication slots
Can you reproduce it with test_decoding as the output plugin?
--- 
Please excuse brevity and formatting - I am writing this on my mobile phone.

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#5Michael Paquier
michael@paquier.xyz
In reply to: Andres Freund (#4)
Re: Long text values destroys logical replication slots

On Wed, Oct 28, 2015 at 10:17 PM, Andres Freund <andres@anarazel.de> wrote:

Can you reproduce it with test_decoding as the output plugin?

You can just use that for example to get an assertion failure:
CREATE TABLE a (b text);
ALTER TABLE ONLY a REPLICA IDENTITY FULL;
SELECT * FROM pg_create_logical_replication_slot('new', 'test_decoding');
INSERT INTO a (b) VALUES (repeat('k', 2000000));
UPDATE a SET b = 'c';
select * from pg_logical_slot_peek_changes('new', NULL, NULL); -- boom

frame #3: 0x0000000100458ca9
postgres`DecodeXLogTuple(data=0x00007fb7e2126046, len=22910,
tuple=0x000000010a32e038) + 137 at decode.c:856
853 int datalen = len - SizeOfHeapHeader;
854
855 Assert(datalen >= 0);
-> 856 Assert(datalen <= MaxHeapTupleSize);
(lldb) p datalen
(int) $0 = 22905
--
Michael

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#6Andres Freund
andres@anarazel.de
In reply to: Michael Paquier (#5)
Re: Long text values destroys logical replication slots

On 2015-10-28 23:00:53 +0100, Michael Paquier wrote:

On Wed, Oct 28, 2015 at 10:17 PM, Andres Freund <andres@anarazel.de> wrote:

Can you reproduce it with test_decoding as the output plugin?

You can just use that for example to get an assertion failure:
CREATE TABLE a (b text);
ALTER TABLE ONLY a REPLICA IDENTITY FULL;
SELECT * FROM pg_create_logical_replication_slot('new', 'test_decoding');
INSERT INTO a (b) VALUES (repeat('k', 2000000));
UPDATE a SET b = 'c';
select * from pg_logical_slot_peek_changes('new', NULL, NULL); -- boom

frame #3: 0x0000000100458ca9
postgres`DecodeXLogTuple(data=0x00007fb7e2126046, len=22910,
tuple=0x000000010a32e038) + 137 at decode.c:856
853 int datalen = len - SizeOfHeapHeader;
854
855 Assert(datalen >= 0);
-> 856 Assert(datalen <= MaxHeapTupleSize);
(lldb) p datalen
(int) $0 = 22905

Ugh, that's some monumental stupidity on my side in the handling of
oldtuple values. Newtuple datums never can be bigger than 8k (all
relevant columns point into toasted datums)- but that's absolutely not
the case for the "old" values. That can't happen for primary keys (due
to the btree limitations), which is why this wasn't noticed so far.
Gotta think about this one.

Greetings,

Andres Freund

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#7Michael Paquier
michael@paquier.xyz
In reply to: Andres Freund (#6)
Re: Long text values destroys logical replication slots

On Thu, Oct 29, 2015 at 7:23 AM, Andres Freund <andres@anarazel.de> wrote:

On 2015-10-28 23:00:53 +0100, Michael Paquier wrote:

On Wed, Oct 28, 2015 at 10:17 PM, Andres Freund <andres@anarazel.de> wrote:

Can you reproduce it with test_decoding as the output plugin?

You can just use that for example to get an assertion failure:
CREATE TABLE a (b text);
ALTER TABLE ONLY a REPLICA IDENTITY FULL;
SELECT * FROM pg_create_logical_replication_slot('new', 'test_decoding');
INSERT INTO a (b) VALUES (repeat('k', 2000000));
UPDATE a SET b = 'c';
select * from pg_logical_slot_peek_changes('new', NULL, NULL); -- boom

frame #3: 0x0000000100458ca9
postgres`DecodeXLogTuple(data=0x00007fb7e2126046, len=22910,
tuple=0x000000010a32e038) + 137 at decode.c:856
853 int datalen = len - SizeOfHeapHeader;
854
855 Assert(datalen >= 0);
-> 856 Assert(datalen <= MaxHeapTupleSize);
(lldb) p datalen
(int) $0 = 22905

Ugh, that's some monumental stupidity on my side in the handling of
oldtuple values. Newtuple datums never can be bigger than 8k (all
relevant columns point into toasted datums)- but that's absolutely not
the case for the "old" values. That can't happen for primary keys (due
to the btree limitations), which is why this wasn't noticed so far.
Gotta think about this one.

FWIW, I am going to play with that..
--
Michael

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#8Andres Freund
andres@anarazel.de
In reply to: Michael Paquier (#7)
Re: Long text values destroys logical replication slots

On January 29, 2016 8:07:29 AM GMT+01:00, Michael Paquier <michael.paquier@gmail.com> wrote:

On Thu, Oct 29, 2015 at 7:23 AM, Andres Freund <andres@anarazel.de>
wrote:

On 2015-10-28 23:00:53 +0100, Michael Paquier wrote:

On Wed, Oct 28, 2015 at 10:17 PM, Andres Freund <andres@anarazel.de>

wrote:

Can you reproduce it with test_decoding as the output plugin?

You can just use that for example to get an assertion failure:
CREATE TABLE a (b text);
ALTER TABLE ONLY a REPLICA IDENTITY FULL;
SELECT * FROM pg_create_logical_replication_slot('new',

'test_decoding');

INSERT INTO a (b) VALUES (repeat('k', 2000000));
UPDATE a SET b = 'c';
select * from pg_logical_slot_peek_changes('new', NULL, NULL); --

boom

frame #3: 0x0000000100458ca9
postgres`DecodeXLogTuple(data=0x00007fb7e2126046, len=22910,
tuple=0x000000010a32e038) + 137 at decode.c:856
853 int datalen = len - SizeOfHeapHeader;
854
855 Assert(datalen >= 0);
-> 856 Assert(datalen <= MaxHeapTupleSize);
(lldb) p datalen
(int) $0 = 22905

Ugh, that's some monumental stupidity on my side in the handling of
oldtuple values. Newtuple datums never can be bigger than 8k (all
relevant columns point into toasted datums)- but that's absolutely

not

the case for the "old" values. That can't happen for primary keys

(due

to the btree limitations), which is why this wasn't noticed so far.
Gotta think about this one.

FWIW, I am going to play with that..

I've a patch for this, in just not yet happy with the API changes...

--- 
Please excuse brevity and formatting - I am writing this on my mobile phone.

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#9Michael Paquier
michael@paquier.xyz
In reply to: Andres Freund (#8)
Re: Long text values destroys logical replication slots

On Fri, Jan 29, 2016 at 4:37 PM, Andres Freund <andres@anarazel.de> wrote:

On January 29, 2016 8:07:29 AM GMT+01:00, Michael Paquier <michael.paquier@gmail.com> wrote:

On Thu, Oct 29, 2015 at 7:23 AM, Andres Freund <andres@anarazel.de>
wrote:

On 2015-10-28 23:00:53 +0100, Michael Paquier wrote:

On Wed, Oct 28, 2015 at 10:17 PM, Andres Freund <andres@anarazel.de>

wrote:

Can you reproduce it with test_decoding as the output plugin?

You can just use that for example to get an assertion failure:
CREATE TABLE a (b text);
ALTER TABLE ONLY a REPLICA IDENTITY FULL;
SELECT * FROM pg_create_logical_replication_slot('new',

'test_decoding');

INSERT INTO a (b) VALUES (repeat('k', 2000000));
UPDATE a SET b = 'c';
select * from pg_logical_slot_peek_changes('new', NULL, NULL); --

boom

frame #3: 0x0000000100458ca9
postgres`DecodeXLogTuple(data=0x00007fb7e2126046, len=22910,
tuple=0x000000010a32e038) + 137 at decode.c:856
853 int datalen = len - SizeOfHeapHeader;
854
855 Assert(datalen >= 0);
-> 856 Assert(datalen <= MaxHeapTupleSize);
(lldb) p datalen
(int) $0 = 22905

Ugh, that's some monumental stupidity on my side in the handling of
oldtuple values. Newtuple datums never can be bigger than 8k (all
relevant columns point into toasted datums)- but that's absolutely

not

the case for the "old" values. That can't happen for primary keys

(due

to the btree limitations), which is why this wasn't noticed so far.
Gotta think about this one.

FWIW, I am going to play with that..

I've a patch for this, in just not yet happy with the API changes...

Oh, OK, good to know. If you want to have extra eyes on it, just ping
me here then.
--
Michael

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#10Andres Freund
andres@anarazel.de
In reply to: Adam Dratwiński (#1)
Re: Long text values destroys logical replication slots

Hi Adam,

On 2015-10-28 19:51:18 +0100, Adam Dratwiński wrote:

I recently encountered a bug on my production postgres which results in
killing replication slot. It happens when there is an UPDATE statement
called on a record having text column with a very long value (about 1mb).
The problem can be easily reproduced by this query:

https://gist.github.com/adwinsky/f7a6dc4382776fb9ea60

Here I am using decoderbufs decoder https://github.com/krzychk/decoderbufs
but it doesn't seems to be related with this plugin as it also happen on
the postgres test decoder.

I pushed a fix for this; it'll be included in the next maintenance
releases which are coming up soon-ish. Sorry that it took so long.

Thanks for the report!

Regards,

Andres

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs