How to avoid transaction ID wrap

Started by Mark Woodward almost 20 years ago. 33 messages. pgsql-hackers.
#1Mark Woodward
pgsql@mohawksoft.com

OK, here's my problem, I have a nature study where we have about 10 video
cameras taking 15 frames per second.
For each frame we make a few transactions on a PostgreSQL database.
We want to keep about a year's worth of data at any specific time.
We have triggers that fire if something interesting is found on insert.
We want this thing to run for a long time.

From the numbers, you can see the PostgreSQL database is VERY loaded.

Running VACUUM may not always be possible without losing data.
The numbers I have amount to 466,560,000 transactions per month, lasting a
maximum of about 9 months until XID wrap.

I am thinking about a few workarounds (BEGIN/COMMIT to reduce the number
of transactions, COPY, etc.) so I'm not dead in the water, but I would be
interested in any observations you may have.

#2Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Mark Woodward (#1)
Re: How to avoid transaction ID wrap

On Tue, 6 Jun 2006, Mark Woodward wrote:

OK, here's my problem, I have a nature study where we have about 10 video
cameras taking 15 frames per second.
For each frame we make a few transactions on a PostgreSQL database.

I would suggest doing all the inserts of one frame in one transaction.
Maybe even multiple frames in one transaction. That should bring down the
number of transactions significantly.
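Back-of-envelope arithmetic (a sketch in Python; the grouping factors are illustrative assumptions, not figures from the thread) shows how much batching shrinks the transaction count:

```python
# Rough effect of batching on the monthly transaction count.
# The camera/frame figures come from Mark's description; the baseline of
# one transaction per frame is an illustrative assumption.
CAMERAS = 10
FPS = 15
SECONDS_PER_MONTH = 30 * 24 * 3600

def tx_per_month(frames_per_tx):
    """Transactions per month if each transaction covers frames_per_tx frames."""
    frames_per_month = CAMERAS * FPS * SECONDS_PER_MONTH
    return frames_per_month // frames_per_tx

print(tx_per_month(1))    # one transaction per frame:       388800000
print(tx_per_month(150))  # one second of frames per txn:      2592000
```

Grouping even one second's worth of frames into a single transaction cuts the count by two orders of magnitude.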

We want to keep about a year's worth of data at any specific time.
We have triggers that fire if something interesting is found on insert.
We want this thing to run for a long time.
From the numbers, you can see the PostgreSQL database is VERY loaded.

Running VACUUM may not always be possible without losing data.

Why not?

The numbers I have amount to 466,560,000 transactions per month, lasting a
maximum of about 9 months until XID wrap.

If you can get that maximum up above one year (which was how long you want
to keep the data), you won't need to freeze the records to
avoid ID wraparound.

- Heikki

#3Hannu Krosing
hannu@tm.ee
In reply to: Mark Woodward (#1)
Re: How to avoid transaction ID wrap

One fine day, Tue, 2006-06-06 at 08:42, Mark Woodward wrote:

OK, here's my problem, I have a nature study where we have about 10 video
cameras taking 15 frames per second.
For each frame we make a few transactions on a PostgreSQL database.
We want to keep about a year's worth of data at any specific time.

partition by month, then you have better chances of removing old data
without causing overload/data loss;
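In the 8.1-era scheme Hannu is alluding to, monthly partitions are child tables created with INHERITS plus a CHECK constraint for constraint exclusion. A small helper sketching that DDL (the `frames` table and `captured_at` column are hypothetical names, not from the thread):

```python
# Generate 2006-era inheritance-partition DDL for one monthly child table.
# Table and column names are made up for illustration.
def month_partition_ddl(parent, year, month):
    child = f"{parent}_y{year}m{month:02d}"
    next_y, next_m = (year + 1, 1) if month == 12 else (year, month + 1)
    return (
        f"CREATE TABLE {child} (\n"
        f"    CHECK (captured_at >= '{year}-{month:02d}-01'\n"
        f"       AND captured_at <  '{next_y}-{next_m:02d}-01')\n"
        f") INHERITS ({parent});"
    )

print(month_partition_ddl("frames", 2006, 6))
```

Dropping a whole month then becomes a cheap `DROP TABLE` or `TRUNCATE` of the child rather than a mass DELETE plus vacuum.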

We have triggers that fire if something interesting is found on insert.
We want this thing to run for a long time.
From the numbers, you can see the PostgreSQL database is VERY loaded.
Running VACUUM may not always be possible without losing data.

Why? Just run it with very friendly delay settings.

The numbers I have amount to 466,560,000 transactions per month, lasting a
maximum of about 9 months until XID wrap.

actually 4.5 months as you will start having problems at 2G xacts.
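The arithmetic behind both estimates, as a quick check (wraparound trouble starts around 2^31 transactions; Mark's 9-month figure corresponds to the full 2^32 space):

```python
# Months until XID trouble at Mark's stated transaction rate.
TX_PER_MONTH = 466_560_000

months_at_2g = 2**31 / TX_PER_MONTH  # problems begin around 2G xacts, per Hannu
months_at_4g = 2**32 / TX_PER_MONTH  # the full 32-bit XID space

print(round(months_at_2g, 1))  # 4.6
print(round(months_at_4g, 1))  # 9.2
```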

I am thinking about a few workarounds (BEGIN/COMMIT to reduce the number
of transactions, COPY, etc.) so I'm not dead in the water, but I would be
interested in any observations you may have.

--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com

#4Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Mark Woodward (#1)
Re: How to avoid transaction ID wrap

Mark Woodward wrote:

OK, here's my problem, I have a nature study where we have about 10 video
cameras taking 15 frames per second.
For each frame we make a few transactions on a PostgreSQL database.

Maybe if you grouped multiple operations into bigger transactions, the I/O
savings could be enough to buy you the ability to vacuum once in a
while. Or consider buffering somehow: save the data elsewhere, and
have some sort of daemon to put it into the database. This would allow
you to cope with the I/O increase during vacuum.
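A minimal sketch of the buffering idea: accumulate rows in memory and hand each full batch to a flush callback, which in the real system would perform one big multi-row INSERT or COPY (all names here are hypothetical):

```python
# Minimal sketch of the buffer-and-flush daemon idea: rows accumulate and
# are handed to a flush callback in one batch instead of one txn per row.
class FrameBuffer:
    def __init__(self, flush, batch_size=1000):
        self.flush = flush            # callable taking a list of rows
        self.batch_size = batch_size
        self.rows = []

    def add(self, row):
        self.rows.append(row)
        if len(self.rows) >= self.batch_size:
            self.flush(self.rows)     # one transaction per batch
            self.rows = []

batches = []
buf = FrameBuffer(batches.append, batch_size=150)  # ~one second of frames
for i in range(450):
    buf.add(("camera-1", i))

print(len(batches))  # 3
```

During a vacuum-induced I/O spike the buffer simply grows for a while, decoupling the camera stream from database write latency.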

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#5Chris Browne
cbbrowne@acm.org
In reply to: Mark Woodward (#1)
Re: How to avoid transaction ID wrap

Clinging to sanity, hannu@skype.net (Hannu Krosing) mumbled into her beard:

One fine day, Tue, 2006-06-06 at 08:42, Mark Woodward wrote:

OK, here's my problem, I have a nature study where we have about 10 video
cameras taking 15 frames per second.
For each frame we make a few transactions on a PostgreSQL database.
We want to keep about a year's worth of data at any specific time.

partition by month, then you have better chances of removing old data
without causing overload/data loss;

It's certainly worth something to be able to TRUNCATE an elderly
partition; that cleans things out very nicely...

We have triggers that fire if something interesting is found on insert.
We want this thing to run for a long time.
From the numbers, you can see the PostgreSQL database is VERY loaded.
Running VACUUM may not always be possible without losing data.

Why? Just run it with very friendly delay settings.

"Friendly delay settings" can have adverse effects; it is likely to
make vacuum run on the order of 3x as long, which means that if you
have a very large table that takes 12h to VACUUM, "vacuum delay" will
increase that to 36h, which means you'll have a transaction open for
36h.

That'll be very evil, to be sure...
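The 3x elongation can be modeled crudely: cost-based vacuum sleeps `vacuum_cost_delay` milliseconds each time it accrues `vacuum_cost_limit` worth of page costs. A rough model (the table size, and the assumption that every page costs 10, i.e. a cache miss, are illustrative):

```python
# Crude model of cost-based vacuum delay: vacuum sleeps delay_ms every time
# the accumulated page cost reaches cost_limit. All values illustrative.
def vacuum_hours(pages, page_cost=10, cost_limit=200, delay_ms=10, io_hours=12.0):
    sleeps = pages * page_cost / cost_limit        # number of sleep pauses
    sleep_hours = sleeps * delay_ms / 1000 / 3600  # total time spent sleeping
    return io_hours + sleep_hours

# A hypothetical ~1.3 TB table (172.8M 8KB pages) that takes 12h without delay:
print(vacuum_hours(172_800_000))  # 36.0 -- the 12h -> 36h blow-up described above
```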

The numbers I have amount to 466,560,000 transactions per month, lasting a
maximum of about 9 months until XID wrap.

actually 4.5 months as you will start having problems at 2G xacts.

Right.

I am thinking about a few workarounds (BEGIN/COMMIT to reduce the number
of transactions, COPY, etc.) so I'm not dead in the water, but I would be
interested in any observations you may have.

Grouping work together to diminish numbers of transactions is almost
always something of a win...
--
select 'cbbrowne' || '@' || 'gmail.com';
http://linuxdatabases.info/info/rdbms.html
Roses are red,
Violets are blue,
I'm schizophrenic...
And I am too.

#6Rod Taylor
rbt@rbt.ca
In reply to: Chris Browne (#5)
Re: How to avoid transaction ID wrap

On Tue, 2006-06-06 at 13:53 -0400, Christopher Browne wrote:

Clinging to sanity, hannu@skype.net (Hannu Krosing) mumbled into her beard:

One fine day, Tue, 2006-06-06 at 08:42, Mark Woodward wrote:

OK, here's my problem, I have a nature study where we have about 10 video
cameras taking 15 frames per second.
For each frame we make a few transactions on a PostgreSQL database.
We want to keep about a year's worth of data at any specific time.

partition by month, then you have better chances of removing old data
without causing overload/data loss;

It's certainly worth something to be able to TRUNCATE an elderly
partition; that cleans things out very nicely...

With one potential snafu -- it blocks new SELECTs against the parent
table while truncate runs on the child (happens with constraint
exclusion as well).

If your transactions are short then it won't be an issue. If you have
mixed-length transactions (many short ones with the occasional long select)
then it becomes tricky, since those short transactions will be blocked.

We have triggers that fire if something interesting is found on insert.
We want this thing to run for a long time.
From the numbers, you can see the PostgreSQL database is VERY loaded.
Running VACUUM may not always be possible without losing data.

Why? Just run it with very friendly delay settings.

"Friendly delay settings" can have adverse effects; it is likely to
make vacuum run on the order of 3x as long, which means that if you
have a very large table that takes 12h to VACUUM, "vacuum delay" will
increase that to 36h, which means you'll have a transaction open for
36h.

Sounds like this is almost strictly inserts and selects though. If there
is limited garbage collection (updates, deletes, rollbacks of inserts)
required then it isn't all that bad.

--

#7Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Rod Taylor (#6)
Re: How to avoid transaction ID wrap

Rod Taylor wrote:

With one potential snafu -- it blocks new SELECTs against the parent
table while truncate runs on the child (happens with constraint
exclusion as well).

If your transactions are short then it won't be an issue. If you have
mixed-length transactions (many short ones with the occasional long select)
then it becomes tricky, since those short transactions will be blocked.

One idea is to try to acquire the lock before issuing the TRUNCATE
itself. If the LOCK TABLE times out, you know you should wait for a
long-running transaction ...
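That retry dance might look like the following sketch, where `try_lock` stands in for running `SET statement_timeout = ...` followed by `LOCK TABLE old_partition IN ACCESS EXCLUSIVE MODE` (the table names and retry policy are hypothetical):

```python
import time

# Sketch of the lock-before-TRUNCATE idea: only truncate once the exclusive
# lock has actually been acquired, backing off while a long transaction
# holds it. try_lock / truncate stand in for the real SQL commands.
def truncate_when_free(try_lock, truncate, attempts=5, wait_s=0.01):
    for _ in range(attempts):
        if try_lock():          # LOCK TABLE ... under statement_timeout
            truncate()          # safe now: we hold the exclusive lock
            return True
        time.sleep(wait_s)      # someone holds the lock; back off and retry
    return False

calls = []
locked = iter([False, False, True])   # lock becomes free on the third try
ok = truncate_when_free(lambda: next(locked), lambda: calls.append("truncate"))
print(ok, calls)  # True ['truncate']
```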

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#8Mark Woodward
pgsql@mohawksoft.com
In reply to: Alvaro Herrera (#4)
Re: How to avoid transaction ID wrap

Mark Woodward wrote:

OK, here's my problem, I have a nature study where we have about 10 video
cameras taking 15 frames per second.
For each frame we make a few transactions on a PostgreSQL database.

Maybe if you grouped multiple operations on bigger transactions, the I/O
savings could be enough to buy you the ability to vacuum once in a
while. Or consider buffering somehow -- save the data elsewhere, and
have some sort of daemon to put it into the database. This would allow
to cope with the I/O increase during vacuum.

The problem is sufficiently large that any minor modification can easily
hide the problem for a predictable amount of time. My hope was that someone
would have a real "long term" workaround.

#9Koichi Suzuki
koichi.szk@gmail.com
In reply to: Mark Woodward (#8)
Re: How to avoid transaction ID wrap

I once proposed a patch for a 64-bit transaction ID, but this causes
some overhead to each tuple (XMIN and XMAX). Pgbench with a 64-bit
transaction ID pays a performance penalty of about a couple of percent. If
a 64-bit transaction ID is a reasonable fix, I have already posted this
patch; anyone can apply it to later versions.

Mark Woodward wrote:

Mark Woodward wrote:

OK, here's my problem, I have a nature study where we have about 10 video
cameras taking 15 frames per second.
For each frame we make a few transactions on a PostgreSQL database.

Maybe if you grouped multiple operations on bigger transactions, the I/O
savings could be enough to buy you the ability to vacuum once in a
while. Or consider buffering somehow -- save the data elsewhere, and
have some sort of daemon to put it into the database. This would allow
to cope with the I/O increase during vacuum.

The problem is sufficiently large that any minor modification can easily
hide the problem for a predictable amount of time. My hope was that someone
would have a real "long term" workaround.


--
Koichi Suzuki

#10Zdenek Kotala
Zdenek.Kotala@Sun.COM
In reply to: Koichi Suzuki (#9)
Re: How to avoid transaction ID wrap

Koichi Suzuki wrote:

I once proposed a patch for a 64-bit transaction ID, but this causes
some overhead to each tuple (XMIN and XMAX). Pgbench with a 64-bit
transaction ID pays a performance penalty of about a couple of percent. If
a 64-bit transaction ID is a reasonable fix, I have already posted this
patch; anyone can apply it to later versions.

Did you check the performance on 32-bit and 64-bit systems with a 64-bit
binary of PGSQL? I think that today it is not a problem to have a 64-bit
architecture, and a 64-bit ID should increase the scalability of Postgres.

Zdenek

#11Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Koichi Suzuki (#9)
Re: How to avoid transaction ID wrap

Koichi Suzuki wrote:

I once proposed a patch for a 64-bit transaction ID, but this causes
some overhead to each tuple (XMIN and XMAX). Pgbench with a 64-bit
transaction ID pays a performance penalty of about a couple of percent. If
a 64-bit transaction ID is a reasonable fix, I have already posted this
patch; anyone can apply it to later versions.

Be careful, the pg_multixact stuff assumes that a MultiXactId is the
same size as TransactionId, so you have to change that too. I don't
recall offhand if it was defined in a way that would make it "just work"
automatically. (You'd also have to be careful about it not overrunning
the SLRU files when it's close to the end of them.)

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#12Martijn van Oosterhout
kleptog@svana.org
In reply to: Zdenek Kotala (#10)
Re: How to avoid transaction ID wrap

On Wed, Jun 07, 2006 at 01:48:50PM +0200, Zdenek Kotala wrote:

Koichi Suzuki wrote:

I once proposed a patch for a 64-bit transaction ID, but this causes
some overhead to each tuple (XMIN and XMAX). Pgbench with a 64-bit
transaction ID pays a performance penalty of about a couple of percent. If
a 64-bit transaction ID is a reasonable fix, I have already posted this
patch; anyone can apply it to later versions.

Did you check the performance on 32-bit and 64-bit systems with a 64-bit
binary of PGSQL? I think that today it is not a problem to have a 64-bit
architecture, and a 64-bit ID should increase the scalability of Postgres.

I doubt performance is the issue directly. Increasing the size of the
transaction counter would increase the size of narrow tables by maybe
30%. That's 30% more disk space and 30% more memory usage in some
places. Maybe at some point it'll be worth it, but right now I don't
think those commodities are cheap enough to use like this for fairly
marginal benefits.

Besides, memory bandwidth hasn't grown anywhere near as fast as memory
space, so it's always a good idea to use as little memory as possible.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/


From each according to his ability. To each according to his ability to litigate.

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Zdenek Kotala (#10)
Re: How to avoid transaction ID wrap

Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:

Koichi Suzuki wrote:

I once proposed a patch for a 64-bit transaction ID, but this causes
some overhead to each tuple (XMIN and XMAX).

Did you check the performance on 32-bit and 64-bit systems with a 64-bit
binary of PGSQL? I think that today it is not a problem to have a 64-bit
architecture, and a 64-bit ID should increase the scalability of Postgres.

The percentage increase in I/O demand is the main reason the patch was
rejected, not so much the arithmetic.

regards, tom lane

#14Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Tom Lane (#13)
Re: How to avoid transaction ID wrap

On Wed, Jun 07, 2006 at 11:47:45AM -0400, Tom Lane wrote:

Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:

Koichi Suzuki wrote:

I once proposed a patch for a 64-bit transaction ID, but this causes
some overhead to each tuple (XMIN and XMAX).

Did you check the performance on 32-bit and 64-bit systems with a 64-bit
binary of PGSQL? I think that today it is not a problem to have a 64-bit
architecture, and a 64-bit ID should increase the scalability of Postgres.

The percentage increase in I/O demand is the main reason the patch was
rejected, not so much the arithmetic.

Before considering 64 bit XIDs, it'd be very helpful to know why Mark
can't vacuum frequently enough to handle rollover...
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

#15Hannu Krosing
hannu@tm.ee
In reply to: Chris Browne (#5)
Re: How to avoid transaction ID wrap

One fine day, Tue, 2006-06-06 at 13:53, Christopher Browne wrote:

We have triggers that fire if something interesting is found on insert.
We want this thing to run for a long time.
From the numbers, you can see the PostgreSQL database is VERY loaded.
Running VACUUM may not always be possible without losing data.

Why? Just run it with very friendly delay settings.

"Friendly delay settings" can have adverse effects; it is likely to
make vacuum run on the order of 3x as long, which means that if you
have a very large table that takes 12h to VACUUM, "vacuum delay" will
increase that to 36h, which means you'll have a transaction open for
36h.

That'll be very evil, to be sure...

Not always. I know that it is evil in the slony1 context, but often it *is*
possible to design your system in a way where a superlong transaction is
almost unnoticeable.

Long transactions are evil when they cause some fast-changing table
to grow its storage size by several orders of magnitude, but if that is not
the case then they just run there in the background with no ill effects,
especially do-nothing transactions like vacuum.

--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com

#16Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Hannu Krosing (#15)
Re: How to avoid transaction ID wrap

On Thu, Jun 08, 2006 at 01:30:42AM +0300, Hannu Krosing wrote:

One fine day, Tue, 2006-06-06 at 13:53, Christopher Browne wrote:

We have triggers that fire if something interesting is found on insert.
We want this thing to run for a long time.
From the numbers, you can see the PostgreSQL database is VERY loaded.
Running VACUUM may not always be possible without losing data.

Why? Just run it with very friendly delay settings.

"Friendly delay settings" can have adverse effects; it is likely to
make vacuum run on the order of 3x as long, which means that if you
have a very large table that takes 12h to VACUUM, "vacuum delay" will
increase that to 36h, which means you'll have a transaction open for
36h.

That'll be very evil, to be sure...

Not always. I know that it is evil in the slony1 context, but often it *is*
possible to design your system in a way where a superlong transaction is
almost unnoticeable.

Long transactions are evil when they cause some fast-changing table
to grow its storage size by several orders of magnitude, but if that is not
the case then they just run there in the background with no ill effects,
especially do-nothing transactions like vacuum.

Plus, if the only issue here is in fact the long-running transaction for
vacuum, there's other ways to address that which would be a lot less
intrusive than doing something like going to 64 bit XIDs. IIRC, in 8.2
vacuum will start a new transaction every time it fills up
maintenance_work_mem, so just setting that low could solve the problem
(at the expense of a heck of a lot of extra IO).
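The trade-off can be estimated: vacuum remembers dead tuple pointers at roughly 6 bytes apiece and must scan all indexes once each time the list fills, so the number of index-scan passes grows as maintenance_work_mem shrinks. A rough sketch (the dead-tuple count is a made-up example):

```python
import math

# Rough estimate of vacuum index-scan passes: dead tuple TIDs are stored at
# ~6 bytes apiece, and the indexes are scanned each time the list fills.
def index_scan_passes(dead_tuples, maintenance_work_mem_mb):
    tids_per_pass = maintenance_work_mem_mb * 1024 * 1024 // 6
    return math.ceil(dead_tuples / tids_per_pass)

print(index_scan_passes(100_000_000, 64))    # 9
print(index_scan_passes(100_000_000, 1024))  # 1
```

A small setting bounds the transaction length per pass, at the cost of rescanning every index many times over.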
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

#17Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Mark Woodward (#1)
Re: How to avoid transaction ID wrap

On Wed, Jun 07, 2006 at 07:07:55PM -0400, Mark Woodward wrote:

I guess what I am saying is that PostgreSQL isn't "smooth"; between
checkpoints and vacuum, it is nearly impossible to make a product that
performs consistently under high load.

Have you tuned the bgwriter and all the vacuum_cost stuff? I've yet to
find a case where I couldn't smooth out the I/O load so that it wasn't an
issue.
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

#18Mark Woodward
pgsql@mohawksoft.com
In reply to: Jim Nasby (#14)
Re: How to avoid transaction ID wrap

On Wed, Jun 07, 2006 at 11:47:45AM -0400, Tom Lane wrote:

Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:

Koichi Suzuki wrote:

I once proposed a patch for a 64-bit transaction ID, but this causes
some overhead to each tuple (XMIN and XMAX).

Did you check the performance on 32-bit and 64-bit systems with a 64-bit
binary of PGSQL? I think that today it is not a problem to have a 64-bit
architecture, and a 64-bit ID should increase the scalability of Postgres.

The percentage increase in I/O demand is the main reason the patch was
rejected, not so much the arithmetic.

Before considering 64 bit XIDs, it'd be very helpful to know why Mark
can't vacuum frequently enough to handle rollover...

The system is under heavy load, and while there are tricks that can be
done, vacuum is a process which adds extra load to the system while it is
running. It is a sliding scale, as always; you may get the system to the
point where it can vacuum AND perform as needed, but the database is
growing constantly. Eventually you will get to the point where you can't
run vacuum *and* keep up with the data stream.

I guess what I am saying is that PostgreSQL isn't "smooth"; between
checkpoints and vacuum, it is nearly impossible to make a product that
performs consistently under high load.

Now don't flame me, I really do love PostgreSQL, it is just that I bump up
against these issues from time to time and it would be nice if there were
some way to work around them.

#19Koichi Suzuki
koichi.szk@gmail.com
In reply to: Tom Lane (#13)
Re: How to avoid transaction ID wrap

Tom Lane wrote:

Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:

Koichi Suzuki wrote:

I once proposed a patch for a 64-bit transaction ID, but this causes
some overhead to each tuple (XMIN and XMAX).

Did you check the performance on 32-bit and 64-bit systems with a 64-bit
binary of PGSQL? I think that today it is not a problem to have a 64-bit
architecture, and a 64-bit ID should increase the scalability of Postgres.

I checked the performance on a 64-bit system with a 64-bit binary.

The percentage increase in I/O demand is the main reason the patch was
rejected, not so much the arithmetic.

That's right. I've also observed an I/O demand increase. I remember we
had to pay a three to five percent performance decrease in pgbench, so I
don't think we should apply this patch without further justification.
I'm looking for other reasons for a larger transaction ID.

regards, tom lane


--
Koichi Suzuki

#20Hannu Krosing
hannu@tm.ee
In reply to: Jim Nasby (#16)
Re: How to avoid transaction ID wrap

One fine day, Wed, 2006-06-07 at 17:45, Jim C. Nasby wrote:

Plus, if the only issue here is in fact the long-running transaction for
vacuum, there's other ways to address that which would be a lot less
intrusive than doing something like going to 64 bit XIDs. IIRC, in 8.2
vacuum will start a new transaction every time it fills up
maintenance_work_mem, so just setting that low could solve the problem
(at the expense of a heck of a lot of extra IO).

If the aim is *only* to avoid transaction wraparound, then maybe we
could introduce VACUUM FREEZE ONLY, which never removes any old tuples
but instead just marks them by setting xmin=xmax, in addition to its
freezing of live-and-visible-to-all tuples.

This would avoid touching the indexes at all, and may well be what is
desired for tables with very few updates/deletes.

--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com

#21Mark Woodward
pgsql@mohawksoft.com
In reply to: Jim Nasby (#17)
#22Joshua D. Drake
jd@commandprompt.com
In reply to: Mark Woodward (#21)
#23Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joshua D. Drake (#22)
#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hannu Krosing (#20)
#25Joshua D. Drake
jd@commandprompt.com
In reply to: Tom Lane (#23)
#26Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Joshua D. Drake (#25)
#27Hannu Krosing
hannu@tm.ee
In reply to: Tom Lane (#24)
#28Trent Shipley
tshipley@deru.com
In reply to: Mark Woodward (#8)
#29Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Trent Shipley (#28)
#30Martijn van Oosterhout
kleptog@svana.org
In reply to: Trent Shipley (#28)
#31Tom Lane
tgl@sss.pgh.pa.us
In reply to: Martijn van Oosterhout (#30)
#32Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Jim Nasby (#29)
#33Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#31)