disk writes within a transaction

Started by Jonathan Vanascoabout 9 years ago4 messagesgeneral
Jump to latest
#1Jonathan Vanasco
postgres@2xlp.com

Can someone enlighten me to how postgres handles disk writing? I've read some generic remarks about buffers, but that's about it.

We have a chunk of code that calls Postgres in a less-than-optimal way within a transaction block. I'm wondering where to prioritize fixing it, as the traffic on the wire isn't an issue.

Basically the code looks like this:

begin;
update foo set foo.a='1' where foo.bar = 1;
...
update foo set foo.b='2' where foo.bar = 1;
...
update foo set foo.c='3' where foo.bar = 1;
commit;

If the updates are likely to be a memory based operation, consolidating them can wait. If they are likely to hit the disk, I should schedule refactoring this code sooner than later.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#2Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Jonathan Vanasco (#1)
Re: disk writes within a transaction

On 02/16/2017 11:33 AM, 2xlp - ListSubscriptions wrote:

Can someone enlighten me to how postgres handles disk writing? I've read some generic remarks about buffers, but that's about it.

We have a chunk of code that calls Postgres in a less-than-optimal way within a transaction block. I'm wondering where to prioritize fixing it, as the traffic on the wire isn't an issue.

Basically the code looks like this:

begin;
update foo set foo.a='1' where foo.bar = 1;
...
update foo set foo.b='2' where foo.bar = 1;
...
update foo set foo.c='3' where foo.bar = 1;
commit;

If the updates are likely to be a memory based operation, consolidating them can wait. If they are likely to hit the disk, I should schedule refactoring this code sooner than later.

I would suggest taking a look at:

https://www.postgresql.org/docs/9.6/static/wal-configuration.html

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3Jeff Janes
jeff.janes@gmail.com
In reply to: Jonathan Vanasco (#1)
Re: disk writes within a transaction

On Thu, Feb 16, 2017 at 11:33 AM, 2xlp - ListSubscriptions <
postgres@2xlp.com> wrote:

Can someone enlighten me to how postgres handles disk writing? I've read
some generic remarks about buffers, but that's about it.

We have a chunk of code that calls Postgres in a less-than-optimal way
within a transaction block. I'm wondering where to prioritize fixing it,
as the traffic on the wire isn't an issue.

Basically the code looks like this:

begin;
update foo set foo.a='1' where foo.bar = 1;
...
update foo set foo.b='2' where foo.bar = 1;
...
update foo set foo.c='3' where foo.bar = 1;
commit;

If the updates are likely to be a memory based operation, consolidating
them can wait. If they are likely to hit the disk, I should schedule
refactoring this code sooner than later.

You are going to generate more volume of WAL data, which has to reach disk
eventually. Although it is likely they will all be consolidated into about
the same number of physical writes and syncs.

You are also likely to inhibit the Heap-only-tuple mechanism, because you
will end up with 4 copies of the row which all have to fit in the same
block. If they don't, it has to migrate some of them to a different block
plus do index maintenance, so you will generate more dirty blocks that
way. How many more depends on how many indexes you have, and whether the
columns being updated are themselves included in indexes.

There is also a CPU issue when the same tuple is updated repeatedly in a
single transaction. Each update has to wade through all the previous row
versions, so it is an N^2 operation in the number of updates.

It will probably be easier to refactor the code than to quantify just how
much damage it does.

cheers,

Jeff

#4Jonathan Vanasco
postgres@2xlp.com
In reply to: Jeff Janes (#3)
Re: disk writes within a transaction

On Feb 17, 2017, at 4:05 PM, Jeff Janes wrote:

It will probably be easier to refactor the code than to quantify just how much damage it does.

Thanks for all the info. It looks like this is something worth prioritizing because of the effects on indexes.

We had discussed a fix and pointed it; rewriting the code that causes this is pretty massive, and will require blocking out a resource FT for 2 weeks on rewrites and testing. We don't really have time to spare any of those devs, so time to make product tradeoffs ;(

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general