How to avoid transaction ID wrap

Started by Mark Woodward over 19 years ago · 33 messages
#1Mark Woodward
pgsql@mohawksoft.com

OK, here's my problem: I have a nature study where we have about 10 video
cameras taking 15 frames per second.
For each frame we make a few transactions on a PostgreSQL database.
We want to keep about a year's worth of data at any specific time.
We have triggers that fire if something interesting is found on insert.
We want this thing to run for a long time.

From the numbers, you can see the PostgreSQL database is VERY loaded.

Running VACUUM may not always be possible without losing data.
The numbers I have amount to 466,560,000 transactions per month, lasting a
maximum of about 9 months until XID wrap.

I am thinking about a few workarounds, BEGIN/COMMIT to reduce the number
of transactions, COPY, etc., so I'm not dead in the water, but I would be
interested in any observations you may have.

#2Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Mark Woodward (#1)
Re: How to avoid transaction ID wrap

On Tue, 6 Jun 2006, Mark Woodward wrote:

OK, here's my problem, I have a nature study where we have about 10 video
cameras taking 15 frames per second.
For each frame we make a few transactions on a PostgreSQL database.

I would suggest doing all the inserts of one frame in one transaction.
Maybe even multiple frames in one transaction. That should bring down the
number of transactions significantly.
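Something like this (just a sketch; the table and column names are
invented):

    BEGIN;
    -- all the work for one frame, or for a whole batch of frames
    INSERT INTO frames (camera_id, captured_at, features)
        VALUES (3, '2006-06-06 08:42:00.066', '...');
    INSERT INTO frames (camera_id, captured_at, features)
        VALUES (3, '2006-06-06 08:42:00.133', '...');
    -- ... more inserts ...
    COMMIT;

A transaction consumes one XID no matter how many statements it
contains, so batching, say, one second's worth of frames cuts XID
consumption by a couple of orders of magnitude.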

We want to keep about a year's worth of data at any specific time.
We have triggers that fire if something interesting is found on insert.
We want this thing to run for a long time.
From the numbers, you can see the PostgreSQL database is VERY loaded.

Running VACUUM may not always be possible without losing data.

Why not?

The numbers I have amount to 466,560,000 transactions per month, lasting a
maximum of about 9 months until XID wrap.

If you can get that maximum up above one year (which is how long you want
to keep the data), you won't need to freeze the records to
avoid ID wraparound.

- Heikki

#3Hannu Krosing
hannu@skype.net
In reply to: Mark Woodward (#1)
Re: How to avoid transaction ID wrap

On one fine day, Tue, 2006-06-06 at 08:42, Mark Woodward wrote:

OK, here's my problem: I have a nature study where we have about 10 video
cameras taking 15 frames per second.
For each frame we make a few transactions on a PostgreSQL database.
We want to keep about a year's worth of data at any specific time.

Partition by month; then you have a better chance of removing old data
without causing overload/data loss.
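A sketch with 8.1-style inheritance partitioning (table and column names
invented):

    CREATE TABLE frames (
        camera_id   integer,
        captured_at timestamptz,
        features    bytea
    );

    CREATE TABLE frames_2006_06 (
        CHECK (captured_at >= '2006-06-01'
           AND captured_at <  '2006-07-01')
    ) INHERITS (frames);

    -- insert into the current month's child table; with
    -- constraint_exclusion = on, queries against frames skip the
    -- irrelevant months.  retiring a month is then cheap:
    DROP TABLE frames_2005_06;   -- or TRUNCATE it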

We have triggers that fire if something interesting is found on insert.
We want this thing to run for a long time.
From the numbers, you can see the PostgreSQL database is VERY loaded.
Running VACUUM may not always be possible without losing data.

Why? Just run it with very friendly delay settings.
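For example (the numbers are only illustrative and need tuning per
system):

    -- per-session before a manual VACUUM, or in postgresql.conf:
    SET vacuum_cost_delay = 20;    -- sleep 20 ms each time the cost limit is hit
    SET vacuum_cost_limit = 200;   -- accumulated page-cost that triggers a sleep
    VACUUM frames;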

The numbers I have amount to 466,560,000 transactions per month, lasting a
maximum of about 9 months until XID wrap.

actually 4.5 months as you will start having problems at 2G xacts.
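(2,147,483,648 / 466,560,000 per month is about 4.6 months; the 9-month
figure corresponds to the full 2^32 range.)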

I am thinking about a few workarounds, BEGIN/COMMIT to reduce the number
of transactions, COPY, etc., so I'm not dead in the water, but I would be
interested in any observations you may have.

--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com

#4Alvaro Herrera
alvherre@commandprompt.com
In reply to: Mark Woodward (#1)
Re: How to avoid transaction ID wrap

Mark Woodward wrote:

OK, here's my problem, I have a nature study where we have about 10 video
cameras taking 15 frames per second.
For each frame we make a few transactions on a PostgreSQL database.

Maybe if you grouped multiple operations into bigger transactions, the I/O
savings could be enough to buy you the ability to vacuum once in a
while. Or consider buffering somehow -- save the data elsewhere, and
have some sort of daemon to put it into the database. This would allow
you to cope with the I/O increase during vacuum.
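As a sketch of the buffering idea (the file path and table name are
invented): have the capture processes append rows to a flat file, and
have the loader daemon periodically do

    BEGIN;
    COPY frames (camera_id, captured_at, features)
        FROM '/var/spool/frames/batch-0001.csv' WITH CSV;
    COMMIT;

which costs one transaction per batch instead of several per frame, and
can be paused while vacuum runs.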

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#5Christopher Browne
cbbrowne@acm.org
In reply to: Mark Woodward (#1)
Re: How to avoid transaction ID wrap

Clinging to sanity, hannu@skype.net (Hannu Krosing) mumbled into her beard:

On one fine day, Tue, 2006-06-06 at 08:42, Mark Woodward wrote:

OK, here's my problem: I have a nature study where we have about 10 video
cameras taking 15 frames per second.
For each frame we make a few transactions on a PostgreSQL database.
We want to keep about a year's worth of data at any specific time.

Partition by month; then you have a better chance of removing old data
without causing overload/data loss.

It's certainly worth something to be able to TRUNCATE an elderly
partition; that cleans things out very nicely...

We have triggers that fire if something interesting is found on insert.
We want this thing to run for a long time.
From the numbers, you can see the PostgreSQL database is VERY loaded.
Running VACUUM may not always be possible without losing data.

Why? Just run it with very friendly delay settings.

"Friendly delay settings" can have adverse effects; it is likely to
make vacuum run on the order of 3x as long, which means that if you
have a very large table that takes 12h to VACUUM, "vacuum delay" will
increase that to 36h, which means you'll have a transaction open for
36h.

That'll be very evil, to be sure...

The numbers I have amount to 466,560,000 transactions per month, lasting a
maximum of about 9 months until XID wrap.

actually 4.5 months as you will start having problems at 2G xacts.

Right.

I am thinking about a few workarounds, BEGIN/COMMIT to reduce the number
of transactions, COPY, etc., so I'm not dead in the water, but I would be
interested in any observations you may have.

Grouping work together to diminish numbers of transactions is almost
always something of a win...
--
select 'cbbrowne' || '@' || 'gmail.com';
http://linuxdatabases.info/info/rdbms.html
Roses are red,
Violets are blue,
I'm schizophrenic...
And I am too.

#6Rod Taylor
pg@rbt.ca
In reply to: Christopher Browne (#5)
Re: How to avoid transaction ID wrap

On Tue, 2006-06-06 at 13:53 -0400, Christopher Browne wrote:

Clinging to sanity, hannu@skype.net (Hannu Krosing) mumbled into her beard:

On one fine day, Tue, 2006-06-06 at 08:42, Mark Woodward wrote:

OK, here's my problem: I have a nature study where we have about 10 video
cameras taking 15 frames per second.
For each frame we make a few transactions on a PostgreSQL database.
We want to keep about a year's worth of data at any specific time.

Partition by month; then you have a better chance of removing old data
without causing overload/data loss.

It's certainly worth something to be able to TRUNCATE an elderly
partition; that cleans things out very nicely...

With one potential snafu -- it blocks new SELECTs against the parent
table while truncate runs on the child (happens with constraint
exclusion as well).

If your transactions are short then it won't be an issue. If you have
mixed-length transactions (many short, with the occasional long select)
then it becomes tricky, since those short transactions will be blocked.

We have triggers that fire if something interesting is found on insert.
We want this thing to run for a long time.
From the numbers, you can see the PostgreSQL database is VERY loaded.
Running VACUUM may not always be possible without losing data.

Why? Just run it with very friendly delay settings.

"Friendly delay settings" can have adverse effects; it is likely to
make vacuum run on the order of 3x as long, which means that if you
have a very large table that takes 12h to VACUUM, "vacuum delay" will
increase that to 36h, which means you'll have a transaction open for
36h.

Sounds like this is almost strictly inserts and selects though. If there
is limited garbage collection (updates, deletes, rollbacks of inserts)
required then it isn't all that bad.

--

#7Alvaro Herrera
alvherre@commandprompt.com
In reply to: Rod Taylor (#6)
Re: How to avoid transaction ID wrap

Rod Taylor wrote:

With one potential snafu -- it blocks new SELECTs against the parent
table while truncate runs on the child (happens with constraint
exclusion as well).

If your transactions are short then it won't be an issue. If you have
mixed-length transactions (many short, with the occasional long select)
then it becomes tricky, since those short transactions will be blocked.

One idea is to try to acquire the lock before issuing the TRUNCATE
itself. If the LOCK TABLE times out, you know you should wait for a
long-running transaction ...
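Roughly like this (untested; statement_timeout is used as the timeout
mechanism, and the partition name is invented):

    BEGIN;
    SET LOCAL statement_timeout = 5000;   -- give up after 5 seconds
    LOCK TABLE frames_2005_06 IN ACCESS EXCLUSIVE MODE;
    SET LOCAL statement_timeout = 0;
    TRUNCATE frames_2005_06;
    COMMIT;

If the LOCK TABLE gets cancelled by the timeout, roll back and retry
later, rather than leaving a lock request queued where it blocks all the
short transactions behind it.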

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#8Mark Woodward
pgsql@mohawksoft.com
In reply to: Alvaro Herrera (#4)
Re: How to avoid transaction ID wrap

Mark Woodward wrote:

OK, here's my problem: I have a nature study where we have about 10 video
cameras taking 15 frames per second.
For each frame we make a few transactions on a PostgreSQL database.

Maybe if you grouped multiple operations into bigger transactions, the I/O
savings could be enough to buy you the ability to vacuum once in a
while. Or consider buffering somehow -- save the data elsewhere, and
have some sort of daemon to put it into the database. This would allow
you to cope with the I/O increase during vacuum.

The problem is sufficiently large that any minor modification can easily
hide the problem for a predictable amount of time. My hope was that someone
would have a real "long term" workaround.

#9Koichi Suzuki
suzuki.koichi@oss.ntt.co.jp
In reply to: Mark Woodward (#8)
Re: How to avoid transaction ID wrap

I once proposed a patch for 64-bit transaction IDs, but it adds some
overhead to each tuple (XMIN and XMAX). Pgbench with 64-bit transaction
IDs pays about a couple of percent in performance. If a 64-bit
transaction ID is a reasonable fix, I've already posted the patch;
anyone can apply it to later versions.

Mark Woodward wrote:

Mark Woodward wrote:

OK, here's my problem: I have a nature study where we have about 10 video
cameras taking 15 frames per second.
For each frame we make a few transactions on a PostgreSQL database.

Maybe if you grouped multiple operations into bigger transactions, the I/O
savings could be enough to buy you the ability to vacuum once in a
while. Or consider buffering somehow -- save the data elsewhere, and
have some sort of daemon to put it into the database. This would allow
you to cope with the I/O increase during vacuum.

The problem is sufficiently large that any minor modification can easily
hide the problem for a predictable amount of time. My hope was that someone
would have a real "long term" workaround.


--
Koichi Suzuki

#10Zdenek Kotala
Zdenek.Kotala@Sun.COM
In reply to: Koichi Suzuki (#9)
Re: How to avoid transaction ID wrap

Koichi Suzuki wrote:

I once proposed a patch for 64-bit transaction IDs, but it adds some
overhead to each tuple (XMIN and XMAX). Pgbench with 64-bit transaction
IDs pays about a couple of percent in performance. If a 64-bit
transaction ID is a reasonable fix, I've already posted the patch;
anyone can apply it to later versions.

Did you check performance on 32-bit or 64-bit systems, with a 64-bit
binary of PostgreSQL? Having a 64-bit architecture is no problem today,
and 64-bit IDs should increase the scalability of Postgres.

Zdenek

#11Alvaro Herrera
alvherre@commandprompt.com
In reply to: Koichi Suzuki (#9)
Re: How to avoid transaction ID wrap

Koichi Suzuki wrote:

I once proposed a patch for 64-bit transaction IDs, but it adds some
overhead to each tuple (XMIN and XMAX). Pgbench with 64-bit transaction
IDs pays about a couple of percent in performance. If a 64-bit
transaction ID is a reasonable fix, I've already posted the patch;
anyone can apply it to later versions.

Be careful, the pg_multixact stuff assumes that a MultiXactId is the
same size as TransactionId, so you have to change that too. I don't
recall offhand if it was defined in a way that would make it "just work"
automatically. (You'd also have to be careful about it not overrunning
the SLRU files when it's close to the end of them.)

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#12Martijn van Oosterhout
kleptog@svana.org
In reply to: Zdenek Kotala (#10)
Re: How to avoid transaction ID wrap

On Wed, Jun 07, 2006 at 01:48:50PM +0200, Zdenek Kotala wrote:

Koichi Suzuki wrote:

I once proposed a patch for 64-bit transaction IDs, but it adds some
overhead to each tuple (XMIN and XMAX). Pgbench with 64-bit transaction
IDs pays about a couple of percent in performance. If a 64-bit
transaction ID is a reasonable fix, I've already posted the patch;
anyone can apply it to later versions.

Did you check performance on 32-bit or 64-bit systems, with a 64-bit
binary of PostgreSQL? Having a 64-bit architecture is no problem today,
and 64-bit IDs should increase the scalability of Postgres.

I doubt performance is the issue directly. Increasing the size of the
transaction counter would increase the size of narrow tables by maybe
30%. That's 30% more disk space and 30% more memory usage in some
places. Maybe at some point it'll be worth it, but right now I don't
think those commodities are cheap enough to spend like this for fairly
marginal benefits.

Besides, memory bandwidth hasn't grown anywhere near as fast as memory
space, so it's always a good idea to use as little memory as possible.
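(Back-of-the-envelope: xmin and xmax would grow from 4 to 8 bytes each,
i.e. 8 extra bytes per tuple. With a roughly 28-byte tuple header and
only a few bytes of user data per row, that is indeed in the rough
20-30% range.)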

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/


From each according to his ability. To each according to his ability to litigate.

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Zdenek Kotala (#10)
Re: How to avoid transaction ID wrap

Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:

Koichi Suzuki wrote:

I once proposed a patch for 64-bit transaction IDs, but it adds some
overhead to each tuple (XMIN and XMAX).

Did you check performance on 32-bit or 64-bit systems, with a 64-bit
binary of PostgreSQL? Having a 64-bit architecture is no problem today,
and 64-bit IDs should increase the scalability of Postgres.

The percentage increase in I/O demand is the main reason the patch was
rejected, not so much the arithmetic.

regards, tom lane

#14Jim C. Nasby
jnasby@pervasive.com
In reply to: Tom Lane (#13)
Re: How to avoid transaction ID wrap

On Wed, Jun 07, 2006 at 11:47:45AM -0400, Tom Lane wrote:

Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:

Koichi Suzuki wrote:

I once proposed a patch for 64-bit transaction IDs, but it adds some
overhead to each tuple (XMIN and XMAX).

Did you check performance on 32-bit or 64-bit systems, with a 64-bit
binary of PostgreSQL? Having a 64-bit architecture is no problem today,
and 64-bit IDs should increase the scalability of Postgres.

The percentage increase in I/O demand is the main reason the patch was
rejected, not so much the arithmetic.

Before considering 64 bit XIDs, it'd be very helpful to know why Mark
can't vacuum frequently enough to handle rollover...
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

#15Hannu Krosing
hannu@skype.net
In reply to: Christopher Browne (#5)
Re: How to avoid transaction ID wrap

On one fine day, Tue, 2006-06-06 at 13:53, Christopher Browne wrote:

We have triggers that fire if something interesting is found on insert.
We want this thing to run for a long time.
From the numbers, you can see the PostgreSQL database is VERY loaded.
Running VACUUM may not always be possible without losing data.

Why? Just run it with very friendly delay settings.

"Friendly delay settings" can have adverse effects; it is likely to
make vacuum run on the order of 3x as long, which means that if you
have a very large table that takes 12h to VACUUM, "vacuum delay" will
increase that to 36h, which means you'll have a transaction open for
36h.

That'll be very evil, to be sure...

Not always. I know that it is evil in a slony1 context, but often it *is*
possible to design your system in a way where a superlong transaction is
almost unnoticeable.

Long transactions are evil when they cause some fast-changing table to
grow its storage size by several orders of magnitude, but if that is not
the case then they just run there in the background with no ill effects,
especially do-nothing transactions like vacuum.

--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com

#16Jim C. Nasby
jnasby@pervasive.com
In reply to: Hannu Krosing (#15)
Re: How to avoid transaction ID wrap

On Thu, Jun 08, 2006 at 01:30:42AM +0300, Hannu Krosing wrote:

On one fine day, Tue, 2006-06-06 at 13:53, Christopher Browne wrote:

We have triggers that fire if something interesting is found on insert.
We want this thing to run for a long time.
From the numbers, you can see the PostgreSQL database is VERY loaded.
Running VACUUM may not always be possible without losing data.

Why? Just run it with very friendly delay settings.

"Friendly delay settings" can have adverse effects; it is likely to
make vacuum run on the order of 3x as long, which means that if you
have a very large table that takes 12h to VACUUM, "vacuum delay" will
increase that to 36h, which means you'll have a transaction open for
36h.

That'll be very evil, to be sure...

Not always. I know that it is evil in a slony1 context, but often it *is*
possible to design your system in a way where a superlong transaction is
almost unnoticeable.

Long transactions are evil when they cause some fast-changing table to
grow its storage size by several orders of magnitude, but if that is not
the case then they just run there in the background with no ill effects,
especially do-nothing transactions like vacuum.

Plus, if the only issue here is in fact the long-running transaction for
vacuum, there's other ways to address that which would be a lot less
intrusive than doing something like going to 64 bit XIDs. IIRC, in 8.2
vacuum will start a new transaction every time it fills up
maintenance_work_mem, so just setting that low could solve the problem
(at the expense of a heck of a lot of extra IO).
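Something like this, per session (a sketch; the value is in kilobytes
here, and the right number depends entirely on the table):

    SET maintenance_work_mem = 16384;   -- 16 MB: small batches of dead TIDs
    VACUUM frames;

Each index-cleanup pass then covers fewer dead tuples, so no single
vacuum transaction stays open as long, at the cost of scanning the
indexes more often.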
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

#17Jim C. Nasby
jnasby@pervasive.com
In reply to: Mark Woodward (#1)
Re: How to avoid transaction ID wrap

On Wed, Jun 07, 2006 at 07:07:55PM -0400, Mark Woodward wrote:

I guess what I am saying is that PostgreSQL isn't "smooth"; between
checkpoints and vacuum, it is near impossible to make a product that
performs consistently under high load.

Have you tuned the bgwriter and all the vacuum_cost stuff? I've yet to
find a case where I couldn't smooth out the IO load so that it wasn't an
issue.
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

#18Mark Woodward
pgsql@mohawksoft.com
In reply to: Jim C. Nasby (#14)
Re: How to avoid transaction ID wrap

On Wed, Jun 07, 2006 at 11:47:45AM -0400, Tom Lane wrote:

Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:

Koichi Suzuki wrote:

I once proposed a patch for 64-bit transaction IDs, but it adds some
overhead to each tuple (XMIN and XMAX).

Did you check performance on 32-bit or 64-bit systems, with a 64-bit
binary of PostgreSQL? Having a 64-bit architecture is no problem today,
and 64-bit IDs should increase the scalability of Postgres.

The percentage increase in I/O demand is the main reason the patch was
rejected, not so much the arithmetic.

Before considering 64 bit XIDs, it'd be very helpful to know why Mark
can't vacuum frequently enough to handle rollover...

The system is under heavy load, and while there are tricks that can be
done, vacuum is a process which puts extra load on the system while it is
running. It is a sliding scale, as always: you may get the system to the
point where it can vacuum AND perform as needed, but the database is
growing constantly. Eventually you will get to the point where you can't
run vacuum *and* keep up with the data stream.

I guess what I am saying is that PostgreSQL isn't "smooth"; between
checkpoints and vacuum, it is near impossible to make a product that
performs consistently under high load.

Now don't flame me; I really do love PostgreSQL. It is just that I bump up
against these issues from time to time, and it would be nice if there were
some way to work around them.

#19Koichi Suzuki
suzuki.koichi@oss.ntt.co.jp
In reply to: Tom Lane (#13)
Re: How to avoid transaction ID wrap

Tom Lane wrote:

Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:

Koichi Suzuki wrote:

I once proposed a patch for 64-bit transaction IDs, but it adds some
overhead to each tuple (XMIN and XMAX).

Did you check performance on 32-bit or 64-bit systems, with a 64-bit
binary of PostgreSQL? Having a 64-bit architecture is no problem today,
and 64-bit IDs should increase the scalability of Postgres.

I checked the performance on a 64-bit system with a 64-bit binary.

The percentage increase in I/O demand is the main reason the patch was
rejected, not so much the arithmetic.

That's right. I've also observed the I/O demand increase. I remember we
had to pay a three-to-five-percent performance decrease in pgbench, so I
don't think we should apply this patch without further justification.
I'm looking for other reasons to want larger transaction IDs.

regards, tom lane


--
Koichi Suzuki

#20Hannu Krosing
hannu@skype.net
In reply to: Jim C. Nasby (#16)
Re: How to avoid transaction ID wrap

On one fine day, Wed, 2006-06-07 at 17:45, Jim C. Nasby wrote:

Plus, if the only issue here is in fact the long-running transaction for
vacuum, there's other ways to address that which would be a lot less
intrusive than doing something like going to 64 bit XIDs. IIRC, in 8.2
vacuum will start a new transaction every time it fills up
maintenance_work_mem, so just setting that low could solve the problem
(at the expense of a heck of a lot of extra IO).

If the aim is *only* to avoid transaction wraparound, then maybe we
could introduce VACUUM FREEZE ONLY, which would never remove any old
tuples, but instead just mark them by setting xmin=xmax, in addition
to its freezing of live-and-visible-to-all tuples.

This would avoid touching the indexes at all, and may well be what is
desired for tables with only very few updates/deletes.

--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com

#21Mark Woodward
pgsql@mohawksoft.com
In reply to: Jim C. Nasby (#17)
Re: How to avoid transaction ID wrap

On Wed, Jun 07, 2006 at 07:07:55PM -0400, Mark Woodward wrote:

I guess what I am saying is that PostgreSQL isn't "smooth"; between
checkpoints and vacuum, it is near impossible to make a product that
performs consistently under high load.

Have you tuned the bgwriter and all the vacuum_cost stuff? I've yet to
find a case where I couldn't smooth out the IO load so that it wasn't an
issue.

In several projects that I have been involved with, PostgreSQL had most of
the important features we needed, but in one project checkpoints caused
us to time out under load. In the current project I am researching, I
know that vacuum may be an issue. The load is brutally constant.

#22Joshua D. Drake
jd@commandprompt.com
In reply to: Mark Woodward (#21)
Re: How to avoid transaction ID wrap

Mark Woodward wrote:

On Wed, Jun 07, 2006 at 07:07:55PM -0400, Mark Woodward wrote:

I guess what I am saying is that PostgreSQL isn't "smooth"; between
checkpoints and vacuum, it is near impossible to make a product that
performs consistently under high load.

Have you tuned the bgwriter and all the vacuum_cost stuff? I've yet to
find a case where I couldn't smooth out the IO load so that it wasn't an
issue.

In several projects that I have been involved with, PostgreSQL had most of
the important features we needed, but in one project checkpoints caused
us to time out under load. In the current project I am researching, I
know that vacuum may be an issue. The load is brutally constant.

I was recently involved in a project where we had to decrease
checkpoint_timeout. The problem was that the database was performing
so many transactions that if we waited for 5 minutes, the checkpoint would
take entirely too long.

We ended up doing checkpoints every two minutes, which, with the increase
in checkpoint_segments and adjustment of the bgwriter settings, leveled
out the load.
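In the same spirit (illustrative values only, using the 8.1-era
parameter names, not our actual production settings):

    # postgresql.conf
    checkpoint_timeout    = 120    # seconds; checkpoint every two minutes
    checkpoint_segments   = 64     # enough WAL headroom between checkpoints
    bgwriter_delay        = 200    # ms between bgwriter rounds
    bgwriter_all_percent  = 5      # scan more of the buffer pool each round
    bgwriter_all_maxpages = 600    # ... and allow more writes per round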

Sincerely,

Joshua D. Drake


--

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

#23Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joshua D. Drake (#22)
Re: How to avoid transaction ID wrap

"Joshua D. Drake" <jd@commandprompt.com> writes:

I was recently involved in a project where we had to decrease
checkpoint_timeout. The problem was that the database was performing
so many transactions that if we waited for 5 minutes, the checkpoint would
take entirely too long.

Seems like the correct fix for that is to make the bgwriter more
aggressive. Narrowing the checkpoint spacing is a pretty horrid answer
because of the resulting increase in full-page-image WAL traffic.

regards, tom lane

#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hannu Krosing (#20)
Re: How to avoid transaction ID wrap

Hannu Krosing <hannu@skype.net> writes:

If the aim is *only* to avoid transaction wraparound, then maybe we
could introduce VACUUM FREEZE ONLY, which would never remove any old
tuples, but instead just mark them by setting xmin=xmax, in addition
to its freezing of live-and-visible-to-all tuples.

This would avoid touching the indexes at all, and may well be what is
desired for tables with only very few updates/deletes.

Seems like useless complexity. If there are so few dead tuples that you
can afford to not reclaim them, then there are so few that reclaiming
them isn't really going to cost much either ...

regards, tom lane

#25Joshua D. Drake
jd@commandprompt.com
In reply to: Tom Lane (#23)
Re: How to avoid transaction ID wrap

Tom Lane wrote:

"Joshua D. Drake" <jd@commandprompt.com> writes:

I was recently involved in a project where we had to decrease
checkpoint_timeout. The problem was that the database was performing
so many transactions that if we waited for 5 minutes, the checkpoint would
take entirely too long.

Seems like the correct fix for that is to make the bgwriter more
aggressive. Narrowing the checkpoint spacing is a pretty horrid answer
because of the resulting increase in full-page-image WAL traffic.

Well, we did that as well. Here are the basic symptoms:

During normal processing, which involved about 250 connections,
everything was fine. A checkpoint would start, and connections would
start piling up, sometimes breaking 1000.

We narrowed that down to users having to wait longer for query execution,
so instead of just reusing connections, new connections had to be
initiated because the existing connections were busy.

We tried many different parameters, and the bgwriter did significantly help,
but the only "solution" was to make checkpoints happen at a much more
aggressive time frame.

Modifying the bgwriter settings and the checkpoint spacing increased our
velocity by about 70% by the time we were done. The bgwriter was definitely
the largest single chunk of that, although the other parameters combined
outweighed it (effective_cache_size, shared_buffers, etc.).

Sincerely,

Joshua D. Drake

regards, tom lane


--

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

#26Jim C. Nasby
jnasby@pervasive.com
In reply to: Joshua D. Drake (#25)
Re: How to avoid transaction ID wrap

On Thu, Jun 08, 2006 at 09:13:33AM -0700, Joshua D. Drake wrote:

Tom Lane wrote:

"Joshua D. Drake" <jd@commandprompt.com> writes:

I was recently involved in a project where we had to decrease
checkpoint_timeout. The problem was that the database was performing
so many transactions that if we waited for 5 minutes, the checkpoint would
take entirely too long.

Seems like the correct fix for that is to make the bgwriter more
aggressive. Narrowing the checkpoint spacing is a pretty horrid answer
because of the resulting increase in full-page-image WAL traffic.

Well, we did that as well. Here are the basic symptoms:

During normal processing, which involved about 250 connections,
everything was fine. A checkpoint would start, and connections would
start piling up, sometimes breaking 1000.

We narrowed that down to users having to wait longer for query execution,
so instead of just reusing connections, new connections had to be
initiated because the existing connections were busy.

We tried many different parameters, and the bgwriter did significantly help,
but the only "solution" was to make checkpoints happen at a much more
aggressive time frame.

Modifying the bgwriter settings and the checkpoint spacing increased our
velocity by about 70% by the time we were done. The bgwriter was definitely
the largest single chunk of that, although the other parameters combined
outweighed it (effective_cache_size, shared_buffers, etc.).

Did you try increasing the checkpoint interval, in the hope that it
would allow the bgwriter enough extra time to get everything pushed
out?
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

#27Hannu Krosing
hannu@skype.net
In reply to: Tom Lane (#24)
Re: How to avoid transaction ID wrap

On one fine day, Thu, 2006-06-08 at 12:09, Tom Lane wrote:

Hannu Krosing <hannu@skype.net> writes:

If the aim is *only* to avoid transaction wraparound, then maybe we
could introduce VACUUM FREEZE ONLY, which would never remove any old
tuples, but instead just mark them by setting xmin=xmax, in addition
to its freezing of live-and-visible-to-all tuples.

This would avoid touching the indexes at all, and may well be what is
desired for tables with only very few updates/deletes.

Seems like useless complexity. If there are so few dead tuples that you
can afford to not reclaim them, then there are so few that reclaiming
them isn't really going to cost much either ...

It will cost one full scan per index, which can be quite a lot of disk
read traffic if the indexes are not otherwise in use, say when most
access is local to some hotspot.

regards, tom lane


--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com

#28Trent Shipley
tshipley@deru.com
In reply to: Mark Woodward (#8)
Re: How to avoid transaction ID wrap

On Tuesday 2006-06-06 20:11, Mark Woodward wrote:

Mark Woodward wrote:

OK, here's my problem: I have a nature study where we have about 10 video
cameras taking 15 frames per second.
For each frame we make a few transactions on a PostgreSQL database.

Maybe if you grouped multiple operations into bigger transactions, the I/O
savings could be enough to buy you the ability to vacuum once in a
while. Or consider buffering somehow -- save the data elsewhere, and
have some sort of daemon to put it into the database. This would allow
you to cope with the I/O increase during vacuum.

The problem is sufficiently large that any minor modification can easily
hide the problem for a predictable amount of time. My hope was that someone
would have a real "long term" workaround.

I'm not certain that I understand the original problem correctly, so I am
going to restate it.

VACUUM needs to be run for two reasons:
1) To recover the transaction counter.
2) To recover records marked for deletion.

VACUUM needs to be run over the entire database. If the data in the database
is N, then VACUUM is O(N). Roughly, VACUUM scales linearly with the size of
the database.

In the digital video problem:

Data is stored indefinitely online. (It is not archived.)
(Virtually no records need to be recovered from deletion.)
Data comes in at a constant rate (frames per unit time).
The database therefore grows at frames/time (D).

It follows that no matter how much tuning is done, given constant hardware,
VACUUM grows to consume so many resources that it is no longer possible to
process frames/time[m] before frames/time[m+1] arrives.

Ideally, the transaction management system would be proportional to the
marginal change in size of the database rather than the gross size of the
database. That is, VACUUM's O(N) behavior should be replaced (or there
should be an optional alternative) by something that scales with D, where
any constant-factor overhead k > 1 is a tradeoff against plain VACUUM.

#29Jim C. Nasby
jnasby@pervasive.com
In reply to: Trent Shipley (#28)
Re: How to avoid transaction ID wrap

On Fri, Jun 09, 2006 at 06:20:21PM -0700, Trent Shipley wrote:

Ideally, the transaction management system would be proportional to the
marginal change in size of the database rather than the gross size of the
database. That is, VACUUM's O(N) behavior should be replaced (or there
should be an optional alternative) by something that scales with D, where
any constant-factor overhead k > 1 is a tradeoff against plain VACUUM.

That's something that's been discussed quite a bit; search the archives
for 'dead space map'. Granted, that wasn't targeted so much at the need
to VACUUM FREEZE, but there's no reason it wouldn't be possible to make
it handle that as well. In the meantime, if you partition the table on
date, you won't need to vacuum the entire database to handle XID
wrap.

BTW, you're also ignoring the fact that technology keeps improving the
hardware that's out there. It's possible that hardware advances alone
would keep pace with your vacuuming needs.
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

#30Martijn van Oosterhout
kleptog@svana.org
In reply to: Trent Shipley (#28)
Re: How to avoid transaction ID wrap

On Fri, Jun 09, 2006 at 06:20:21PM -0700, Trent Shipley wrote:

VACUUM needs to be run for two reasons:
1) To recover the transaction counter.
2) To recover records marked for deletion.

VACUUM needs to be run over the entire database. If the data in the database
is N, then VACUUM is O(N). Roughly, VACUUM scales linearly with the size of
the database.

Well, you only need to vacuum the entire database once every billion
transactions.

Secondly, you can vacuum table by table. If you know a table will never
be modified, you can VACUUM FREEZE it; then it will never need to be
vacuumed again, ever (until you make changes, of course).
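For example, once a month's partition has stopped receiving inserts
(invented name again):

    VACUUM FREEZE frames_2006_05;

After that the table contributes nothing to wraparound until someone
modifies it.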

Ideally, the transaction management system would be proportional to the
marginal change in size of the database rather than the gross size of the
database. That is, VACUUM's O(N) behavior should be replaced (or there
should be an optional alternative) by something that scales with D, where
any constant-factor overhead k > 1 is a tradeoff against plain VACUUM.

That's why people suggest partitions. Then you only vacuum the
partitions that are new and the old ones never need to be touched...

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/


From each according to his ability. To each according to his ability to litigate.

#31Tom Lane
tgl@sss.pgh.pa.us
In reply to: Martijn van Oosterhout (#30)
Re: How to avoid transaction ID wrap

Martijn van Oosterhout <kleptog@svana.org> writes:

That's why people suggest partitions. Then you only vacuum the
partitions that are new and the old ones never need to be touched...

This will all work a lot better once we track XID wraparound risk on a
per-table rather than per-database basis. I hope that will be done in
time for 8.2.

regards, tom lane

#32Alvaro Herrera
alvherre@commandprompt.com
In reply to: Jim C. Nasby (#29)
Re: How to avoid transaction ID wrap

Jim C. Nasby wrote:

On Fri, Jun 09, 2006 at 06:20:21PM -0700, Trent Shipley wrote:

Ideally, the transaction management system would be proportional to the
marginal change in size of the database rather than the gross size of the
database. That is, VACUUM's O(N) behavior should be replaced (or there
should be an optional alternative) by something that scales with D, where
any constant-factor overhead k > 1 is a tradeoff against plain VACUUM.

That's something that's been discussed quite a bit; search the archives
for 'dead space map'. Granted, that wasn't targeted so much at the need
to VACUUM FREEZE, but there's no reason it wouldn't be possible to make
it handle that as well. In the meantime, if you partition the table on
date, you won't need to vacuum the entire database to handle XID
wrap.

FWIW my patch for vacuum fixes some of these issues. First, because you
can freeze a table and will never need to vacuum it again, database-wide
vacuums will not be necessary. And secondly, as soon as a table is
frozen (either because you VACUUM FREEZE'd it, or because regular
vacuuming froze all the tuples on it completely), you don't need to
vacuum it again, and indeed (non-full) VACUUM turns into a no-op.

Mix this with partitioned tables.

#33Alvaro Herrera
alvherre@commandprompt.com
In reply to: Tom Lane (#31)
Re: How to avoid transaction ID wrap

Tom Lane wrote:

Martijn van Oosterhout <kleptog@svana.org> writes:

That's why people suggest partitions. Then you only vacuum the
partitions that are new and the old ones never need to be touched...

This will all work a lot better once we track XID wraparound risk on a
per-table rather than per-database basis. I hope that will be done in
time for 8.2.

FWIW I posted the patch for the non-transactional pg_class stuff in -patches
a while back, so it's pending review ;-) I'll repost it (today I expect,
or tomorrow at the latest) with some minor corrections, along with the
corresponding relminxid patch. I intend to commit both during next week
(or this week, for those whose weeks start on Sundays), barring
objections.