Blocked updates and background writer performance

Started by Cory Tucker · about 10 years ago · 6 messages · general
#1 Cory Tucker
cory.tucker@gmail.com

PG 9.4.4 (RDS)

I'm experiencing an issue when trying to update many rows in a single table
(one row at a time, but parallelized across ~12 connections). The writes
periodically become blocked for several minutes at a time and then pick
back up. After digging through our monitoring stack, I uncovered these
stats, which seem to point to a background writer performance problem:

[image: bg_writer.png]
(apologies for the image)

Our settings for the background writer are pretty standard, out of the box
(I threw in some others that I thought might be helpful, too):

          name           | setting | unit
-------------------------+---------+------
 bgwriter_delay          | 200     | ms
 bgwriter_lru_maxpages   | 100     |
 bgwriter_lru_multiplier | 2       |
 maintenance_work_mem    | 65536   | kB
 max_worker_processes    | 8       |
 work_mem                | 32768   | kB

The table that is being written to contains a jsonb column with a GIN index:

Table "public.ced"
Column | Type | Modifiers
---------------+--------------------------+-----------
id | bigint | not null
created_at | timestamp with time zone |
modified_at | timestamp with time zone |
bean_version | bigint | default 0
account_id | bigint | not null
data | jsonb | not null
Indexes:
"ced_pkey" PRIMARY KEY, btree (id)
"ced_data" gin (data jsonb_path_ops)
"partition_key_idx" btree (account_id, id)

It seems to me that the background writer just can't keep up with the
volume of writes I'm generating, and that it freezes all the updates while
it catches up. What are my options to improve the background writer
performance here?

thanks
--Cory

Attachments:

bg_writer.png (image/png)
#2 Dorian Hoxha
dorian.hoxha@gmail.com
In reply to: Cory Tucker (#1)
Re: Blocked updates and background writer performance

Maybe you're not doing this... but: using a "data" json(b)/hstore column
for all/most/many fields is an antipattern. Use "data" *only* for fields
that you know will be dynamic. That way the static fields go into regular
columns, with no key-name overhead and better-suited types, which means
less data written to disk, etc. (selects will also be faster).
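
Roughly what I mean, as a sketch ("email" and "status" are made-up example
fields, not your schema):

-- fields that are always present become real, typed columns
CREATE TABLE ced (
    id         bigint PRIMARY KEY,
    account_id bigint NOT NULL,
    created_at timestamptz DEFAULT now(),
    email      text,      -- example static field: give it a real column
    status     integer,   -- same
    data       jsonb NOT NULL DEFAULT '{}'  -- only the truly dynamic fields
);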

#3 Cory Tucker
cory.tucker@gmail.com
In reply to: Dorian Hoxha (#2)
Re: Blocked updates and background writer performance

Thanks for the reply, Dorian. For the sake of argument, let's just say I'm
definitely not doing what you mentioned. My question wasn't so much about
modeling JSON storage as about tuning the background writer's performance.

#4 Jeff Janes
jeff.janes@gmail.com
In reply to: Cory Tucker (#1)
Re: Blocked updates and background writer performance

On Tue, Jan 12, 2016 at 10:25 AM, Cory Tucker <cory.tucker@gmail.com> wrote:

PG 9.4.4 (RDS)

I'm experiencing an issue when trying to update many rows in a single
table (one row at a time, but parallelized across ~12 connections). The
writes periodically become blocked for several minutes at a time and then
pick back up. After digging through our monitoring stack, I uncovered
these stats, which seem to point to a background writer performance
problem:

I'm not familiar with your monitoring stack. I assume bgwriter.sync_time
refers to pg_stat_bgwriter.checkpoint_sync_time? Also, most of the stats
shown increase monotonically until the stats are reset, so it looks like
your monitoring stack is either resetting the stats frequently, or is
implicitly taking a delta between consecutive samples for display
purposes. And what are the units? bgwriter.buffers_checkpoint is
presumably buffers in the numerator, but over what period of time in the
denominator?
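
For reference, you can read the raw counters straight from the view; they
are cumulative until explicitly reset:

SELECT checkpoint_sync_time, buffers_checkpoint, stats_reset
  FROM pg_stat_bgwriter;

-- clears the counters, so consecutive readings become clean deltas
SELECT pg_stat_reset_shared('bgwriter');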

Anyway, it looks to me like you have a checkpoint problem. The checkpoint
overwhelms your IO system, and the overwhelmed IO system then backs up into
the bgwriter. What you see in the bgwriter is just a symptom, not the
cause. The background writer is usually not very useful in recent versions
of PostgreSQL anyway. But the same IO problem that is clogging up the
background writer is also clogging up either your buffers_backend writes or
your WAL writes/fsyncs, and both of those will destroy your throughput.
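
A rough way to check is to see who is actually doing the buffer writes; if
backends are doing a big share (or buffers_backend_fsync is ever nonzero),
the IO system is the bottleneck, not the bgwriter:

SELECT buffers_checkpoint,                 -- written by the checkpointer
       buffers_clean,                      -- written by the bgwriter
       buffers_backend,                    -- written by ordinary backends
       buffers_backend_fsync,              -- nonzero here is a bad sign
       round(100.0 * buffers_backend
             / nullif(buffers_checkpoint + buffers_clean + buffers_backend, 0),
             1) AS backend_write_pct
  FROM pg_stat_bgwriter;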

[image: bg_writer.png]
(apologies for the image)

Our settings for the background writer are pretty standard, out of the box
(I threw in some others that I thought might be helpful, too):

          name           | setting | unit
-------------------------+---------+------
 bgwriter_delay          | 200     | ms
 bgwriter_lru_maxpages   | 100     |
 bgwriter_lru_multiplier | 2       |
 maintenance_work_mem    | 65536   | kB
 max_worker_processes    | 8       |
 work_mem                | 32768   | kB

What are your checkpoint settings?

Also, you should turn on log_checkpoints.
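
On 9.4 the relevant knobs are checkpoint_segments, checkpoint_timeout, and
checkpoint_completion_target; something like this will show where they all
stand (on RDS you change them through the parameter group):

SELECT name, setting, unit
  FROM pg_settings
 WHERE name IN ('checkpoint_segments', 'checkpoint_timeout',
                'checkpoint_completion_target', 'log_checkpoints');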

It seems to me that the background writer just can't keep up with the
volume of writes I'm generating, and that it freezes all the updates while
it catches up. What are my options to improve the background writer
performance here?

You probably just need more IO throughput in general.

Cheers,

Jeff

#5 Vick Khera
vivek@khera.org
In reply to: Jeff Janes (#4)
Re: Blocked updates and background writer performance

On Wed, Jan 13, 2016 at 11:51 AM, Jeff Janes <jeff.janes@gmail.com> wrote:

Anyway, it looks to me like you have a checkpoint problem. The checkpoint
overwhelms your IO system, and the overwhelmed IO system then backs up into
the bgwriter. What you see in the bgwriter is just a symptom, not the
cause. The background writer is usually not very useful in recent versions
of PostgreSQL anyway. But the same IO problem that is clogging up the
background writer is also clogging up either your buffers_backend writes or
your WAL writes/fsyncs, and both of those will destroy your throughput.

That was my intuition too. Not enough I/O available from the hardware for
the workload requested.

As recommended, log your checkpoints and try tuning them to spread the load.
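
As a rough starting point, something along these lines; the values here are
illustrative only, and the right ones depend on your hardware and WAL
volume:

# spread each checkpoint's writes across more of the interval
checkpoint_completion_target = 0.9   # default 0.5
# checkpoint less often, so repeatedly-dirtied pages are written once
checkpoint_timeout = 15min           # default 5min
checkpoint_segments = 64             # 9.4-era setting; default is 3
log_checkpoints = on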

#6 Cory Tucker
cory.tucker@gmail.com
In reply to: Vick Khera (#5)
Re: Blocked updates and background writer performance

On Wed, Jan 13, 2016 at 9:48 AM Vick Khera <vivek@khera.org> wrote:

That was my intuition too. Not enough I/O available from the hardware for
the workload requested.

As recommended, log your checkpoints and try tuning them to spread the
load.

Thanks guys, will turn on checkpoint logging and try to sniff this out
further.
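
Since this is RDS there's no postgresql.conf to edit, so I believe it has
to go through the parameter group; something like this, with our group name
substituted in:

aws rds modify-db-parameter-group \
    --db-parameter-group-name my-pg94-params \
    --parameters "ParameterName=log_checkpoints,ParameterValue=1,ApplyMethod=immediate"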