unlogged tables
I am attempting to test this
https://commitfest.postgresql.org/action/patch_view?id=424
but I'm not sure which version of PG this should be applied to. (it would be really neat, on here:
https://commitfest.postgresql.org/action/commitfest_view?id=8
if there was a note that said, this test this stuff against git tag X or branch Y or whatever)
I got the git:
git clone git://git.postgresql.org/git/postgresql.git
downloaded the patches, and applied them ok. then did ./configure and make
after much spewage I got:
bufmgr.c: In function 'PrefetchBuffer':
bufmgr.c:126:10: error: 'struct RelationData' has no member named 'rd_istemp'
make[4]: *** [bufmgr.o] Error 1
Just to make sure everything was ok with the original, I reset:
git reset --hard HEAD^
./configure
make
and all was well.
so I tried again:
make clean
make maintainer-clean
patch -p1 < relpersistence-v1.patch
.. ok ..
but then...
$ patch -p1 < unlogged-tables-v1.patch
patching file doc/src/sgml/indexam.sgml
patching file doc/src/sgml/ref/create_table.sgml
patching file doc/src/sgml/ref/create_table_as.sgml
patching file src/backend/access/gin/gininsert.c
patching file src/backend/access/gist/gist.c
patching file src/backend/access/hash/hash.c
patching file src/backend/access/nbtree/nbtree.c
patching file src/backend/access/transam/xlog.c
patching file src/backend/catalog/catalog.c
patching file src/backend/catalog/heap.c
patching file src/backend/catalog/index.c
patching file src/backend/catalog/storage.c
patching file src/backend/parser/gram.y
patching file src/backend/storage/file/Makefile
patching file src/backend/storage/file/copydir.c
patching file src/backend/storage/file/fd.c
The next patch would create the file src/backend/storage/file/reinit.c,
which already exists! Assume -R? [n]
That didnt happen the first time... I'm almost positive.
Not sure what I should do now.
-Andy
On Mon, Nov 15, 2010 at 8:56 PM, Andy Colson <andy@squeakycode.net> wrote:
I am attempting to test this
https://commitfest.postgresql.org/action/patch_view?id=424
but I'm not sure which version of PG this should be applied to. (it would
be really neat, on here:
https://commitfest.postgresql.org/action/commitfest_view?id=8
if there was a note that said, this test this stuff against git tag X or
branch Y or whatever)
They're pretty much all against the master branch.
I got the git:
git clone git://git.postgresql.org/git/postgresql.git
downloaded the patches, and applied them ok. then did ./configure and make
after much spewage I got:
bufmgr.c: In function 'PrefetchBuffer':
bufmgr.c:126:10: error: 'struct RelationData' has no member named
'rd_istemp'
make[4]: *** [bufmgr.o] Error 1
Woops. Good catch. I guess USE_PREFETCH isn't defined on my system.
That line needs to be changed to say RelationUsesLocalBuffers(reln)
rather than reln->rd_istemp. Updated patches attached.
That didnt happen the first time... I'm almost positive.
When you applied the patches the first time, it created that file; but
git reset --hard doesn't remove untracked files.
Not sure what I should do now.
git clean -dfx
git reset --hard
git pull
Apply attached patches.
configure
make
make install
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
I was able to apply and compile and run ok, creating unlogged tables
seems to work as well.
I patched up pgbench to optionally create unlogged tables, and ran it
both ways. I get ~80tps normally, and ~1,500tps with unlogged. (Thats
from memory, was playing with it last night at home)
I also have a "real world" test I can try (import apache logs and run a
few stats).
What other things would be good to test:
indexes?
analyze/stats/plans?
dump/restore?
Is "create temp unlogged table stuff(...)" an option?
-Andy
Andy Colson <andy@squeakycode.net> writes:
Is "create temp unlogged table stuff(...)" an option?
temp tables are unlogged already.
regards, tom lane
On Tue, Nov 16, 2010 at 1:09 PM, Andy Colson <andy@squeakycode.net> wrote:
I was able to apply and compile and run ok, creating unlogged tables seems
to work as well.I patched up pgbench to optionally create unlogged tables, and ran it both
ways. I get ~80tps normally, and ~1,500tps with unlogged. (Thats from
memory, was playing with it last night at home)
What do you get with normal tables but with fsync, full_page_writes,
and synchronous_commits turned off?
What do you get with normal tables but with sychronous_commit (only) off?
Can you detect any performance regression on normal tables with the
patch vs. without the patch?
I also have a "real world" test I can try (import apache logs and run a few
stats).
That would be great.
What other things would be good to test:
indexes?
analyze/stats/plans?
dump/restore?
All of those. I guess there's a question of what pg_dump should emit
for an unlogged table. Clearly, we need to dump a CREATE UNLOGGED
TABLE statement (which we do), and right now we also dump the table
contents - which seems reasonable, but arguably someone could say that
we ought not to dump the contents of anything less than a
full-fledged, permanent table.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Excerpts from Robert Haas's message of mar nov 16 15:34:55 -0300 2010:
On Tue, Nov 16, 2010 at 1:09 PM, Andy Colson <andy@squeakycode.net> wrote:
dump/restore?
All of those. I guess there's a question of what pg_dump should emit
for an unlogged table. Clearly, we need to dump a CREATE UNLOGGED
TABLE statement (which we do), and right now we also dump the table
contents - which seems reasonable, but arguably someone could say that
we ought not to dump the contents of anything less than a
full-fledged, permanent table.
I think if you do a regular backup of the complete database, unlogged
tables should come out empty, but if you specifically request a dump of
it, it shouldn't.
--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Tue, Nov 16, 2010 at 1:58 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
Excerpts from Robert Haas's message of mar nov 16 15:34:55 -0300 2010:
On Tue, Nov 16, 2010 at 1:09 PM, Andy Colson <andy@squeakycode.net> wrote:
dump/restore?
All of those. I guess there's a question of what pg_dump should emit
for an unlogged table. Clearly, we need to dump a CREATE UNLOGGED
TABLE statement (which we do), and right now we also dump the table
contents - which seems reasonable, but arguably someone could say that
we ought not to dump the contents of anything less than a
full-fledged, permanent table.I think if you do a regular backup of the complete database, unlogged
tables should come out empty, but if you specifically request a dump of
it, it shouldn't.
Oh, wow. That seems confusing.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
On Tue, Nov 16, 2010 at 1:58 PM, Alvaro Herrera
I think if you do a regular backup of the complete database, unlogged
tables should come out empty, but if you specifically request a dump of
it, it shouldn't.
Oh, wow. That seems confusing.
I don't like it either.
I think allowing pg_dump to dump the data in an unlogged table is not
only reasonable, but essential. Imagine that someone determines that
his reliability needs will be adequately served by unlogged tables plus
hourly backups. Now you're going to tell him that that doesn't work
because pg_dump arbitrarily excludes the data in unlogged tables?
regards, tom lane
On 11/16/2010 02:06 PM, Robert Haas wrote:
On Tue, Nov 16, 2010 at 1:58 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:I think if you do a regular backup of the complete database, unlogged
tables should come out empty, but if you specifically request a dump of
it, it shouldn't.Oh, wow. That seems confusing.
Yeah. And unnecessary. If you want it excluded we already have a switch
for that.
cheers
andrew
On Tue, Nov 16, 2010 at 3:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Tue, Nov 16, 2010 at 1:58 PM, Alvaro Herrera
I think if you do a regular backup of the complete database, unlogged
tables should come out empty, but if you specifically request a dump of
it, it shouldn't.Oh, wow. That seems confusing.
I don't like it either.
I think allowing pg_dump to dump the data in an unlogged table is not
only reasonable, but essential. Imagine that someone determines that
his reliability needs will be adequately served by unlogged tables plus
hourly backups. Now you're going to tell him that that doesn't work
because pg_dump arbitrarily excludes the data in unlogged tables?
Yeah, you'd have to allow a flag to control the behavior. And in that
case I'd rather the flag have a single default rather than different
defaults depending on whether or not individual tables were selected.
Something like --omit-unlogged-data.
Incidentally, unlogged tables plus hourly backups is not dissimilar to
what some NoSQL products are offering for reliability. Except with
PG, you can (or soon will be able to, hopefully) selectively apply
that lowered degree of reliability to a subset of your data for which
you determine it's appropriate, while maintaining full reliability
guarantees for other data. I am not aware of any other product which
offers that level of fine-grained control over durability.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Yeah, you'd have to allow a flag to control the behavior. And in that
case I'd rather the flag have a single default rather than different
defaults depending on whether or not individual tables were selected.
Something like --omit-unlogged-data.
Are you sure we don't want to default the other way? It seems to me
that most people using unlogged tables won't want to back them up ...
especially since the share lock for pgdump will add overhead for the
kinds of high-volume updates people want to do with unlogged tables.
--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
On Tue, Nov 16, 2010 at 02:00:33PM -0800, Josh Berkus wrote:
Yeah, you'd have to allow a flag to control the behavior. And in
that case I'd rather the flag have a single default rather than
different defaults depending on whether or not individual tables
were selected. Something like --omit-unlogged-data.Are you sure we don't want to default the other way?
+1 for defaulting the other way.
Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
On tis, 2010-11-16 at 14:00 -0800, Josh Berkus wrote:
It seems to me
that most people using unlogged tables won't want to back them up ...
especially since the share lock for pgdump will add overhead for the
kinds of high-volume updates people want to do with unlogged tables.
Or perhaps most people will want them backed up, because them being
unlogged the backup is the only way to get them back in case of a crash?
On 11/16/10 2:08 PM, Peter Eisentraut wrote:
On tis, 2010-11-16 at 14:00 -0800, Josh Berkus wrote:
It seems to me
that most people using unlogged tables won't want to back them up ...
especially since the share lock for pgdump will add overhead for the
kinds of high-volume updates people want to do with unlogged tables.Or perhaps most people will want them backed up, because them being
unlogged the backup is the only way to get them back in case of a crash?
Yeah, hard to tell, really. Which default is less likely to become a
foot-gun?
Maybe it's time for a survey on -general.
--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
On Tue, 2010-11-16 at 14:00 -0800, Josh Berkus wrote:
Yeah, you'd have to allow a flag to control the behavior. And in that
case I'd rather the flag have a single default rather than different
defaults depending on whether or not individual tables were selected.
Something like --omit-unlogged-data.Are you sure we don't want to default the other way? It seems to me
that most people using unlogged tables won't want to back them up ...
+1
JD
--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt
On 11/16/2010 05:12 PM, Josh Berkus wrote:
On 11/16/10 2:08 PM, Peter Eisentraut wrote:
On tis, 2010-11-16 at 14:00 -0800, Josh Berkus wrote:
It seems to me
that most people using unlogged tables won't want to back them up ...
especially since the share lock for pgdump will add overhead for the
kinds of high-volume updates people want to do with unlogged tables.Or perhaps most people will want them backed up, because them being
unlogged the backup is the only way to get them back in case of a crash?Yeah, hard to tell, really. Which default is less likely to become a
foot-gun?Maybe it's time for a survey on -general.
I would argue pretty strongly that backing something up is much less
likely to be a foot-gun than not backing it up, and treating unlogged
tables the same as logged tables for this purpose is also much less
likely to be a foot-gun. As I pointed out upthread, we already have a
mechanism for not backing up selected objects. I'd much rather have a
rule that says "everything gets backed up by default" than one that says
"everything gets backed up by default except unlogged tables".
cheers
andrew
Robert Haas <robertmhaas@gmail.com> writes:
On Tue, Nov 16, 2010 at 3:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I think allowing pg_dump to dump the data in an unlogged table is not
only reasonable, but essential.
Yeah, you'd have to allow a flag to control the behavior. And in that
case I'd rather the flag have a single default rather than different
defaults depending on whether or not individual tables were selected.
Something like --omit-unlogged-data.
As long as the default is to include the data, I wouldn't object to
having such a flag. A default that drops data seems way too
foot-gun-like.
regards, tom lane
On Wed, 2010-11-17 at 00:08 +0200, Peter Eisentraut wrote:
On tis, 2010-11-16 at 14:00 -0800, Josh Berkus wrote:
It seems to me
that most people using unlogged tables won't want to back them up ...
especially since the share lock for pgdump will add overhead for the
kinds of high-volume updates people want to do with unlogged tables.Or perhaps most people will want them backed up, because them being
unlogged the backup is the only way to get them back in case of a crash?
To me, the use of unlogged tables is going to be for dynamic, volatile
data that can be rebuilt from an integrity set on a crash. Session
tables, metadata tables, dynamic updates that are batched to logged
tables every 10 minutes, that type of thing.
I think Berkus has a good idea on asking general.
JD
--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt
On Tuesday 16 November 2010 23:12:10 Josh Berkus wrote:
On 11/16/10 2:08 PM, Peter Eisentraut wrote:
On tis, 2010-11-16 at 14:00 -0800, Josh Berkus wrote:
It seems to me
that most people using unlogged tables won't want to back them up ...
especially since the share lock for pgdump will add overhead for the
kinds of high-volume updates people want to do with unlogged tables.Or perhaps most people will want them backed up, because them being
unlogged the backup is the only way to get them back in case of a crash?Yeah, hard to tell, really. Which default is less likely to become a
foot-gun?
Well. Maybe both possibilities are just propable(which I think is unlikely),
but the different impact is pretty clear.
One way your backup runs too long and too much data changes, the other way
round you loose the data which you assumed safely backuped.
Isn't that a *really* easy decision?
Andres
Andres Freund <andres@anarazel.de> wrote:
One way your backup runs too long and too much data changes, the
other way round you loose the data which you assumed safely
backuped.Isn't that a *really* easy decision?
Yeah. Count me in the camp which wants the default behavior to be
that pg_dump backs up all permanent tables, even those which aren't
WAL-logged (and therefore aren't kept up in PITR backups, hot/warm
standbys, or streaming replication).
-Kevin