Alter table set logged hanging after writing out all WAL

Started by Jeremy Finzelabout 8 years ago7 messagesgeneral
Jump to latest
#1Jeremy Finzel
finzelj@gmail.com

We are running:

PostgreSQL 10.1 on x86_64-pc-linux-gnu, compiled by gcc (Debian 4.7.2-5)
4.7.2, 64-bit

The table I am setting to logged is 32GB with indexes. I see it writing
WAL files like crazy but after about an hour and a half, it has written out
some 2500 WAL segments, then it just sits and continues to run as "active",
but no more WAL files are being created. There are no locks, and no other
transactions in the system running.

Any ideas? Is it still doing something that I need to wait for?

Thanks,
Jeremy

#2Michael Paquier
michael@paquier.xyz
In reply to: Jeremy Finzel (#1)
Re: Alter table set logged hanging after writing out all WAL

On Tue, Feb 06, 2018 at 12:50:56AM -0600, Jeremy Finzel wrote:

The table I am setting to logged is 32GB with indexes. I see it writing
WAL files like crazy but after about an hour and a half, it has written out
some 2500 WAL segments, then it just sits and continues to run as "active",
but no more WAL files are being created. There are no locks, and no other
transactions in the system running.

Switching an unlogged table to be logged causes an entire image of the
table to be WAL-logged so as the operation is consistent in case of a
crash.

Any ideas? Is it still doing something that I need to wait for?

Do you have a backtrace with the process doing the ALTER TABLE hanging?
How is structured you table with its indexes? It is a bit hard to guess
much without more information.
--
Michael

#3Jeremy Finzel
finzelj@gmail.com
In reply to: Michael Paquier (#2)
Re: Alter table set logged hanging after writing out all WAL

Here is the basic structure - is the gist index significant?:

CREATE UNLOGGED TABLE foo (
as_of_date daterange NOT NULL,
customer_id integer,
bunch_of_fields_here);

ALTER TABLE ONLY foo
ADD CONSTRAINT foo_as_of_date_excl EXCLUDE USING gist (customer_id WITH
=, as_of_date WITH &&);

CREATE UNIQUE INDEX foo_idx1 ON foo USING btree (customer_id) WHERE
(upper(as_of_date) = 'infinity'::date);

CREATE INDEX foo_idx2 ON foo USING btree (customer_id, lower(as_of_date))
WHERE (upper(as_of_date) = 'infinity'::date);

CREATE UNIQUE INDEX foo_idx3 ON foo USING btree (customer_id,
lower(as_of_date));

This is all I see - please help me if there's a better command I can run:

strace -p 6497
read(32, "\347\0\0\0pi\3\222p\371\0\0\254\0H\1\0 \4
\0\0\0\0000\237\240\1X\236\260\1"..., 8192) = 8192
read(32, "\347\0\0\0\340\214\3\222\314x\0\0\254\0H\1\0 \4
\0\0\0\0(\237\260\1P\236\260\1"..., 8192) = 8192
read(32, "\347\0\0\0(\260\3\222\242A\0\0\254\0p\1\0 \4
\0\0\0\0(\237\260\1X\236\240\1"..., 8192) = 8192
read(32, "\347\0\0\0\220\323\3\222Kg\0\0\254\0P\1\0 \4
\0\0\0\0000\237\240\1`\236\240\1"..., 8192) = 8192
read(32, "\347\0\0\0\340\366\3\222\v|\0\0\254\0h\1\0 \4
\0\0\0\0(\237\260\1X\236\240\1"..., 8192) = 8192
read(32, "\347\0\0\0`\32\4\222\230m\0\0\254\0008\1\0 \4
\0\0\0\0000\237\240\1X\236\260\1"..., 8192) = 8192
read(32, "\347\0\0\0\350=\4\222\235\342\0\0\254\0000\1\0 \4
\0\0\0\0000\237\240\1`\236\240\1"..., 8192) = 8192
read(32, "\347\0\0\0 b\4\222\224f\0\0\260\0\270\0\0 \4
\0\0\0\0000\237\240\1`\236\240\1"..., 8192) = 8192
read(32, "\347\0\0\0(\206\4\222\301\373\0\0\260\0\320\0\0 \4
\0\0\0\0(\237\260\1X\236\240\1"..., 8192) = 8192
read(32, "\347\0\0\0\260\251\4\222Hx\0\0\254\0000\1\0 \4
\0\0\0\0(\237\260\1H\236\300\1"..., 8192) = 8192
read(32, "\347\0\0\0\260\315\4\222\254d\0\0\260\0\330\0\0 \4
\0\0\0\0000\237\240\1`\236\240\1"..., 8192) = 8192
read(32, "\347\0\0\0\370\360\4\222\357\235\0\0\254\0p\1\0 \4
\0\0\0\0000\237\240\1`\236\240\1"..., 8192) = 8192
read(32, "\347\0\0\0 \25\5\222\250k\0\0\260\0\260\0\0 \4
\0\0\0\0(\237\260\1X\236\240\1"..., 8192) = 8192
read(32, "\347\0\0\0\0209\5\222\353\1\0\0\260\0\350\0\0 \4
\0\0\0\0000\237\240\1X\236\260\1"..., 8192) = 8192
read(32, "\347\0\0\0\30]\5\222UQ\0\0\260\0\320\0\0 \4
\0\0\0\0000\237\240\1`\236\240\1"..., 8192) = 8192
read(32, "\347\0\0\0P\201\5\222g\226\0\0\260\0\270\0\0 \4
\0\0\0\0000\237\240\1`\236\240\1"..., 8192) = 8192
read(32, "\347\0\0\0\210\244\5\222\332\324\0\0\254\0\200\1\0 \4
\0\0\0\0000\237\240\1`\236\240\1"..., 8192) = 8192
read(32, "\347\0\0\0\310\307\5\222Lr\0\0\254\0x\1\0 \4
\0\0\0\0(\237\260\1`\236\220\1"..., 8192) = 8192
read(32, "\347\0\0\0\0\353\5\2225\346\0\0\254\0\200\1\0 \4
\0\0\0\0(\237\260\1P\236\260\1"..., 8192) = 8192
read(32, "\347\0\0\0p\16\6\222\36w\0\0\254\0H\1\0 \4
\0\0\0\0(\237\260\1P\236\260\1"..., 8192) = 8192
read(32, "\347\0\0\0\2302\6\222I\251\0\0\260\0\260\0\0 \4
\0\0\0\0(\237\260\1X\236\240\1"..., 8192) = 8192
read(32, "\347\0\0\0008V\6\222{N\0\0\254\0\30\1\0 \4
\0\0\0\0(\237\260\1P\236\260\1"..., 8192) = 8192
read(32, "\347\0\0\0py\6\222}t\0\0\254\0\200\1\0 \4
\0\0\0\0(\237\260\1P\236\260\1"..., 8192) = 8192
read(32, "\347\0\0\0\220\235\6\222\1\17\0\0\260\0\270\0\0 \4
\0\0\0\0(\237\260\1X\236\240\1"..., 8192) = 8192
read(32, "\347\0\0\0\10\301\6\222\236\352\0\0\254\0X\1\0 \4 \0\0\0\0
\237\300\1H\236\260\1"..., 8192) = 8192
read(32, "\347\0\0\0\210\344\6\2226%\0\0\254\0008\1\0 \4
\0\0\0\0000\237\240\1X\236\260\1"..., 8192) = 8192
read(32, "\347\0\0\0\330\7\7\222\1\360\0\0\254\0h\1\0 \4
\0\0\0\0000\237\240\1`\236\240\1"..., 8192) = 8192
read(32, "\347\0\0\0\0,\7\222G'\0\0\260\0\260\0\0 \4
\0\0\0\0000\237\240\1`\236\240\1"..., 8192) = 8192

On Tue, Feb 6, 2018 at 1:19 AM, Michael Paquier <michael.paquier@gmail.com>
wrote:

Show quoted text

On Tue, Feb 06, 2018 at 12:50:56AM -0600, Jeremy Finzel wrote:

The table I am setting to logged is 32GB with indexes. I see it writing
WAL files like crazy but after about an hour and a half, it has written

out

some 2500 WAL segments, then it just sits and continues to run as

"active",

but no more WAL files are being created. There are no locks, and no

other

transactions in the system running.

Switching an unlogged table to be logged causes an entire image of the
table to be WAL-logged so as the operation is consistent in case of a
crash.

Any ideas? Is it still doing something that I need to wait for?

Do you have a backtrace with the process doing the ALTER TABLE hanging?
How is structured you table with its indexes? It is a bit hard to guess
much without more information.
--
Michael

#4Michael Paquier
michael@paquier.xyz
In reply to: Jeremy Finzel (#3)
Re: Alter table set logged hanging after writing out all WAL

On Tue, Feb 06, 2018 at 01:36:04AM -0600, Jeremy Finzel wrote:

Here is the basic structure - is the gist index significant?:

CREATE UNLOGGED TABLE foo (
as_of_date daterange NOT NULL,
customer_id integer,
bunch_of_fields_here);

ALTER TABLE ONLY foo
ADD CONSTRAINT foo_as_of_date_excl EXCLUDE USING gist (customer_id WITH
=, as_of_date WITH &&);

CREATE UNIQUE INDEX foo_idx1 ON foo USING btree (customer_id) WHERE
(upper(as_of_date) = 'infinity'::date);

CREATE INDEX foo_idx2 ON foo USING btree (customer_id, lower(as_of_date))
WHERE (upper(as_of_date) = 'infinity'::date);

CREATE UNIQUE INDEX foo_idx3 ON foo USING btree (customer_id,
lower(as_of_date));

I am not sure, but I would think about something related to gist here
when heavy insertions are done on it... I cannot put my finger on the
thread though.

This is all I see - please help me if there's a better command I can
run:

If the process is still running, can you attach gdb to it and then run
the command bt? You may need to install debugging symbols to make the
trace readable.
--
Michael

#5Jeremy Finzel
finzelj@gmail.com
In reply to: Michael Paquier (#4)
Re: Alter table set logged hanging after writing out all WAL

On Tue, Feb 6, 2018 at 6:02 PM Michael Paquier <michael.paquier@gmail.com>
wrote:

On Tue, Feb 06, 2018 at 01:36:04AM -0600, Jeremy Finzel wrote:

Here is the basic structure - is the gist index significant?:

CREATE UNLOGGED TABLE foo (
as_of_date daterange NOT NULL,
customer_id integer,
bunch_of_fields_here);

ALTER TABLE ONLY foo
ADD CONSTRAINT foo_as_of_date_excl EXCLUDE USING gist (customer_id

WITH

=, as_of_date WITH &&);

CREATE UNIQUE INDEX foo_idx1 ON foo USING btree (customer_id) WHERE
(upper(as_of_date) = 'infinity'::date);

CREATE INDEX foo_idx2 ON foo USING btree (customer_id, lower(as_of_date))
WHERE (upper(as_of_date) = 'infinity'::date);

CREATE UNIQUE INDEX foo_idx3 ON foo USING btree (customer_id,
lower(as_of_date));

I am not sure, but I would think about something related to gist here
when heavy insertions are done on it... I cannot put my finger on the
thread though.

This is all I see - please help me if there's a better command I can
run:

If the process is still running, can you attach gdb to it and then run
the command bt? You may need to install debugging symbols to make the
trace readable.
--
Michael

I am trying a few other scenarios to see if I can reproduce. I was able to
set to logged a copy of the table with no indexes. I am now attempting same
with only the gist index. If I can reproduce it on a non production server
I will try gdb.

Thank you much for the follow up.

Jeremy

#6Jeremy Finzel
finzelj@gmail.com
In reply to: Jeremy Finzel (#5)
Re: Alter table set logged hanging after writing out all WAL

On Tue, Feb 6, 2018 at 9:48 PM, Jeremy Finzel <finzelj@gmail.com> wrote:

On Tue, Feb 6, 2018 at 6:02 PM Michael Paquier <michael.paquier@gmail.com>
wrote:

On Tue, Feb 06, 2018 at 01:36:04AM -0600, Jeremy Finzel wrote:

Here is the basic structure - is the gist index significant?:

CREATE UNLOGGED TABLE foo (
as_of_date daterange NOT NULL,
customer_id integer,
bunch_of_fields_here);

ALTER TABLE ONLY foo
ADD CONSTRAINT foo_as_of_date_excl EXCLUDE USING gist (customer_id

WITH

=, as_of_date WITH &&);

CREATE UNIQUE INDEX foo_idx1 ON foo USING btree (customer_id) WHERE
(upper(as_of_date) = 'infinity'::date);

CREATE INDEX foo_idx2 ON foo USING btree (customer_id,

lower(as_of_date))

WHERE (upper(as_of_date) = 'infinity'::date);

CREATE UNIQUE INDEX foo_idx3 ON foo USING btree (customer_id,
lower(as_of_date));

I am not sure, but I would think about something related to gist here
when heavy insertions are done on it... I cannot put my finger on the
thread though.

This is all I see - please help me if there's a better command I can
run:

If the process is still running, can you attach gdb to it and then run
the command bt? You may need to install debugging symbols to make the
trace readable.
--
Michael

I am trying a few other scenarios to see if I can reproduce. I was able to
set to logged a copy of the table with no indexes. I am now attempting same
with only the gist index. If I can reproduce it on a non production server
I will try gdb.

Thank you much for the follow up.

Jeremy

I was able to get it to finish by just waiting awhile. To give you an
idea, the table with no indexes was set logged in 7 minutes. With the gist
index, it took 3 hours but finally finished. It is only writing WAL for
about the first 30 minutes, then it apparently is not writing any more WAL
but takes very long to finish.

Thanks,
Jeremy

#7Michael Paquier
michael@paquier.xyz
In reply to: Jeremy Finzel (#6)
Re: Alter table set logged hanging after writing out all WAL

On Wed, Feb 07, 2018 at 11:35:13AM -0600, Jeremy Finzel wrote:

I was able to get it to finish by just waiting awhile. To give you an
idea, the table with no indexes was set logged in 7 minutes. With the gist
index, it took 3 hours but finally finished. It is only writing WAL for
about the first 30 minutes, then it apparently is not writing any more WAL
but takes very long to finish.

An idea to reduce the activity would be to temporarily run PostgreSQL on
scissors, by for example disabling full_page_writes and wal_level =
minimal, but don't do that if other sessions are running in parallel,
and take proper backups before doing it.
--
Michael