Possible Commit Syntax Change for Improved TPS
Hi,
I have been studying the basic limit on the number of committed
transactions per second possible in a relational database. Since
each transaction requires at least its write-ahead log data to be flushed
to disk, the upper bound on transactions per second equals the
number of independent disk writes possible per second. Most of what I
know is from the performance docs of PostgreSQL and MySQL.
It's often possible to increase the total transaction processing speed
by turning off the compulsory disk syncing at each commit, which means
that in the case of system failure some transactions may be lost *but*
the database will still be consistent if we are careful to make sure
the log is always written first.
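To make that upper bound concrete, here is a back-of-envelope calculation (my own illustrative numbers, not figures from the PostgreSQL or MySQL docs):

```python
# Illustrative bound, not a measurement: a 7200 rpm disk completes
# about 120 rotations per second, so a workload that must wait for
# one independent synced log write per commit can sustain at most
# roughly 120 committed transactions per second.
rpm = 7200
sync_writes_per_sec = rpm / 60   # about one independent write per rotation
max_tps = sync_writes_per_sec
print(max_tps)  # 120.0
```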
I observed that in many applications some transactions are more
critical than others. I may have the same database
instance managing website visitor accounting and financial
transactions. I could tolerate the loss of a few transactions whose
only job is to tell me a user has clicked a page on my website, but
would not dare risk this for any of the "real" financial work my
web-based app is doing.
Also, in the case of bulk inserts or in some other special cases, I
might be able to code around the need for guaranteed *durability* on
transaction commit as long as the database stays consistent.
So I want to ask: what if databases had a 'COMMIT NOSYNC;' option?
Then we could really improve transactions-per-second performance for a
database that has lots of non-critical transactions while not
jeopardising the durability of critical transactions in the
(relatively unlikely) case of system failure, primarily by
combining the log updates for several non-critical transactions.
COMMIT; --> COMMIT SYNC; (guarantees an atomic, consistent, durable
write)
COMMIT NOSYNC; --> (sacrifices the durability of a non-critical
transaction for overall speed)
So, the question is what people, especially those
who have done DBMS work, think about this!
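A hypothetical sketch of the proposed semantics (the class and names below are invented for illustration, not from any real DBMS): COMMIT NOSYNC appends the transaction's log record but skips the fsync(); the next plain COMMIT (= COMMIT SYNC) forces the log once, which also makes every earlier NOSYNC commit durable at no extra cost.

```python
import os

# Invented illustration of COMMIT SYNC vs. COMMIT NOSYNC: a NOSYNC
# commit is handed to the OS cache but not forced to disk, so it may
# be lost on a crash; the next SYNC commit's single fsync() covers all
# pending NOSYNC records as well, since they precede it in the log.
class CommitLog:
    def __init__(self, path):
        self.log = open(path, "ab")
        self.pending_nosync = 0   # commits whose durability is deferred

    def commit(self, record: bytes, nosync: bool = False):
        self.log.write(record + b"\n")   # log record written first
        self.log.flush()                 # visible to the OS, not yet durable
        if nosync:
            self.pending_nosync += 1     # may be lost on a crash
        else:
            os.fsync(self.log.fileno())  # one sync covers everything so far
            self.pending_nosync = 0
```

For example, a page-click transaction could use `nosync=True` while a payment uses the default, and the payment's sync would also harden the earlier clicks.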
Seun Osewa.
seunosewa@inaira.com (Seun Osewa) wrote:
COMMIT; --> COMMIT SYNC; (guarantees atomic, consistent, durable
write)
COMMIT NOSYNC; --> (sacrifice durability of non-critical transaction
for overall speed). So, the question is what people, especially those
who have done DBMS work, think about this!
I think that whenever my organization cares THAT much about
performance, I'll probably be able to get enough budget to pay for a
SCSI RAID card that has battery backed cache that makes that issue go
away, as it allows the fsync() to become _nearly_ as fast as a no-op.
The case you suggest, where there are a lot of 'unimportant'
transactions, seems of dubious likelihood. If some updates "actually
commit," why shouldn't others? And if the users know they can't
really trust the "COMMIT NOSYNC" updates, won't it be tough to
convince them to trust the "really committed" stuff?
The battery backed cache idea winds up helping out _all_ updates, in a
HUGE way. That seems the way to go. At least in part because having
universal answers (e.g. - that helps ALL transactions) is likely to be
simpler than having everything be a special case.
--
(reverse (concatenate 'string "gro.gultn" "@" "enworbbc"))
http://www.ntlug.org/~cbbrowne/spiritual.html
This is Linux country. On a quiet night, you can hear NT re-boot.
In the last exciting episode, seunosewa@inaira.com (Seun Osewa) wrote:
So I want to ask, "what if databases had a 'COMMIT NOSYNC;' option?"
Then we can really improve "transaction-per-second" performance for a
database that has lots of non-critical transactions while not
jeopardising the durability of critical transactions in the
(relatively unlikely) case of system failure. Primarily through
combining the log updates for several non-critical transactions.
Another possibility in this would be to have not one, but TWO
backends.
One database, on one port, is running in FSYNC mode, so that the
"really vital" stuff is sure to get committed quickly. The other, on
another port, has FSYNC turned off in its postgresql.conf file, and
the set of "untrusted" files go there.
That has the added merit that you can do other tuning that
distinguishes between the "important" and "unimportant" data. For
instance, if the "unimportant" stuff is a set of logs that don't get
directly referred to, you might set caching real low on that backend
so that cache isn't being wasted on unimportant data.
So if you really want this, you can have it right now without anyone
doing any implementation work.
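Concretely, the two-cluster setup might look something like this (the paths, ports, and buffer size are illustrative guesses, not from the post):

```ini
# postgresql.conf for the "important" cluster, e.g. under /data/critical/
port = 5432
fsync = on           # every commit forces its WAL record to disk

# postgresql.conf for the "unimportant" cluster, e.g. under /data/sessions/
port = 5433
fsync = off          # faster commits; recent transactions can be lost
shared_buffers = 64  # keep the cache small for rarely-read log data
```

The application then simply opens two connections, one per port, and routes each transaction to the cluster that matches its importance.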
--
let name="aa454" and tld="freenet.carleton.ca" in String.concat "@" [name;tld];;
http://www.ntlug.org/~cbbrowne/internet.html
God is real unless declared integer.
Christopher Browne <cbbrowne@acm.org> writes:
In the last exciting episode, seunosewa@inaira.com (Seun Osewa) wrote:
So I want to ask, "what if databases had a 'COMMIT NOSYNC;' option?"
Another possibility in this would be to have not one, but TWO
backends.
One database, on one port, is running in FSYNC mode, so that the
"really vital" stuff is sure to get committed quickly. The other, on
another port, has FSYNC turned off in its postgresql.conf file, and
the set of "untrusted" files go there.
They would have in fact to be two separate installations (not two
databases under one postmaster). There is no way to make some
transactions less safe than others in a single installation, because
they're all hitting the same WAL log, and potentially modifying the
same disk buffers to boot. Anyone's WAL sync therefore syncs everyone's
changes-so-far.
regards, tom lane
Hi Christopher,
Just to go through your points.
COMMIT NOSYNC; --> (sacrifice durability of non-critical transaction
for overall speed). So, the question is what people, especially those
who have done DBMS work, think about this!
I think that whenever my organization cares THAT much about
performance, I'll probably be able to get enough budget to pay for a
SCSI RAID card that has battery backed cache that makes that issue go
away, as it allows the fsync() to become _nearly_ as fast as a no-op.
I agree, but I would not want to throw hardware at something that can be
easily implemented with software. I think the functionality is in
about every RDBMS today, just not under the database users' control.
The case you suggest, where there are a lot of 'unimportant'
transactions, seems of dubious likelihood. If some updates "actually
commit," why shouldn't others?
I feel that if people have the choice they would feel free to use the
DBMS for some functions they don't use it for now because of the
limited update speeds without battery backup. For example, the
Microsoft ASP.NET docs repeat that it's slower to use a database to
manage visitor sessions. In many cases I can afford to risk
"forgetting" information about the activity of a user (out of
thousands) who visited a shopping site without ordering anything. The
ASP.NET script would get to choose which COMMIT to use depending on a
number of factors.
And if the users know they can't
really trust the "COMMIT NOSYNC" updates, won't it be tough to
convince them to trust the "really commited" stuff?
Actually, I see it the other way round. The existence of
COMMIT NOSYNC (faster, not durable in case of a crash)
should remind users that the other COMMIT [SYNC], though
slower, is durable.
The battery backed cache idea winds up helping out _all_ updates, in a
HUGE way. That seems the way to go. At least in part because having
universal answers (e.g. - that helps ALL transactions) is likely to be
simpler than having everything be a special case.
I think that if database programmers have it,
they will use it to optimize their applications.
Aside from increased speed there is the possibility that people
will get to do some things they have simply not been
doing. I think it's a nice concept, which can be exploited
for performance if implemented in an RDBMS.
Seun Osewa.
tgl@sss.pgh.pa.us (Tom Lane) wrote in message news:<19310.1064932466@sss.pgh.pa.us>...
Christopher Browne <cbbrowne@acm.org> writes:
In the last exciting episode, seunosewa@inaira.com (Seun Osewa) wrote:
So I want to ask, "what if databases had a 'COMMIT NOSYNC;' option?"
Another possibility in this would be to have not one, but TWO
backends.
One database, on one port, is running in FSYNC mode, so that the
"really vital" stuff is sure to get committed quickly. The other, on
another port, has FSYNC turned off in its postgresql.conf file, and
the set of "untrusted" files go there.
They would have in fact to be two separate installations (not two
databases under one postmaster). There is no way to make some
transactions less safe than others in a single installation, because
they're all hitting the same WAL log, and potentially modifying the
same disk buffers to boot. Anyone's WAL sync therefore syncs everyone's
changes-so-far.
The beauty of the scheme is that the WAL syncs which "sync everyone's
changes so far" would cost about the same as the WAL sync for just
one transaction being committed. But when there are so many
transactions we would not have to sync the WAL so often.
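Rough arithmetic for that amortization (my own illustrative numbers): if each physical sync can carry the log records of many batched commits, throughput scales with the batch size.

```python
# Illustrative, not measured: ~120 WAL syncs/sec from the disk, and
# each sync carries the log records of 50 batched non-critical commits,
# so throughput rises from ~120 tps to ~6000 tps while the log-first
# ordering (and hence consistency) is unchanged.
wal_syncs_per_sec = 120
commits_per_sync = 50
batched_tps = wal_syncs_per_sec * commits_per_sync
print(batched_tps)  # 6000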
Seun Osewa
Seun Osewa wrote:
I observed that in many applications there are some transactions
that are more critical than others. I may have the same database
instance managing website visitor accounting and financial
transactions. I could tolerate the loss of a few transactions whose
only job is to tell me a user has clicked a page on my website but
would not dare risk this for any of the "real" financials work my
web-based app is doing.
It is possible to split the data over 2 database clusters:
one which contains "important" data (this cluster will be configured
with fsync enabled), and a second one that contains the less
important data (configured with fsync=off for speed reasons).
Cheers,
Adrian Maier
(am@fx.ro)
On Thu, Oct 02, 2003 at 05:31:52AM -0700, Seun Osewa wrote:
The beauty of the scheme is that the WAL syncs which "sync everyone's
changes so far" would cost about the same as the WAL syncs for just
one transaction being committed. But when there are so many
transactions we would not have to sync the WAL so often.
In that case, why not go to a "lazy" policy in high-load situations,
where subsequent commits are bundled up into a single physical write?
Just hold up a commit until either there's a full buffer's worth of
commits waiting to be written, or some timer says it's time to flush
so the client doesn't wait too long.
It would increase per-client latency when viewed in isolation, but if
it really improves throughput that much you might end up getting a
faster response after all.
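A sketch of that lazy policy (hypothetical code, not the PostgreSQL implementation, and single-threaded for clarity): commits accumulate in a buffer, and one physical write plus one fsync() covers the whole batch, forced out either when the buffer fills or when the oldest waiting commit has been held too long.

```python
import os
import time

# Invented group-commit sketch: buffer commit records and flush them
# with a single write + fsync() when the batch is full, or when the
# oldest buffered commit has waited longer than max_wait seconds.
class GroupCommitter:
    def __init__(self, path, batch_size=8, max_wait=0.01):
        self.log = open(path, "ab")
        self.batch_size = batch_size
        self.max_wait = max_wait   # longest a commit may sit in the buffer
        self.buffer = []
        self.oldest = None         # arrival time of the first buffered commit

    def commit(self, record: bytes):
        if not self.buffer:
            self.oldest = time.monotonic()
        self.buffer.append(record)
        full = len(self.buffer) >= self.batch_size
        stale = time.monotonic() - self.oldest >= self.max_wait
        if full or stale:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        self.log.write(b"".join(r + b"\n" for r in self.buffer))
        self.log.flush()
        os.fsync(self.log.fileno())   # one sync durably commits the batch
        self.buffer.clear()
```

In a real server the clients whose commits sit in the buffer would block until the shared flush completes, which is exactly the per-client latency trade-off described above.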
(BTW I haven't looked at the code involved so this may be completely
wrong, impossible, and/or how it works already)
Jeroen