unlogged tables

Started by Andy Colsonover 15 years ago69 messageshackers
Jump to latest
#1Andy Colson
andy@squeakycode.net

I am attempting to test this

https://commitfest.postgresql.org/action/patch_view?id=424

but I'm not sure which version of PG this should be applied to. (it would be really neat, on here:
https://commitfest.postgresql.org/action/commitfest_view?id=8
if there was a note that said, this test this stuff against git tag X or branch Y or whatever)

I got the git:

git clone git://git.postgresql.org/git/postgresql.git

downloaded the patches, and applied them ok. then did ./configure and make

after much spewage I got:

bufmgr.c: In function 'PrefetchBuffer':
bufmgr.c:126:10: error: 'struct RelationData' has no member named 'rd_istemp'
make[4]: *** [bufmgr.o] Error 1

Just to make sure everything was ok with the original, I reset:

git reset --hard HEAD^
./configure
make
and all was well.

so I tried again:
make clean
make maintainer-clean

patch -p1 < relpersistence-v1.patch
.. ok ..

but then...

$ patch -p1 < unlogged-tables-v1.patch
patching file doc/src/sgml/indexam.sgml
patching file doc/src/sgml/ref/create_table.sgml
patching file doc/src/sgml/ref/create_table_as.sgml
patching file src/backend/access/gin/gininsert.c
patching file src/backend/access/gist/gist.c
patching file src/backend/access/hash/hash.c
patching file src/backend/access/nbtree/nbtree.c
patching file src/backend/access/transam/xlog.c
patching file src/backend/catalog/catalog.c
patching file src/backend/catalog/heap.c
patching file src/backend/catalog/index.c
patching file src/backend/catalog/storage.c
patching file src/backend/parser/gram.y
patching file src/backend/storage/file/Makefile
patching file src/backend/storage/file/copydir.c
patching file src/backend/storage/file/fd.c
The next patch would create the file src/backend/storage/file/reinit.c,
which already exists! Assume -R? [n]

That didnt happen the first time... I'm almost positive.

Not sure what I should do now.

-Andy

#2Robert Haas
robertmhaas@gmail.com
In reply to: Andy Colson (#1)
Re: unlogged tables

On Mon, Nov 15, 2010 at 8:56 PM, Andy Colson <andy@squeakycode.net> wrote:

I am attempting to test this

https://commitfest.postgresql.org/action/patch_view?id=424

but I'm not sure which version of PG this should be applied to.  (it would
be really neat, on here:
https://commitfest.postgresql.org/action/commitfest_view?id=8
if there was a note that said, this test this stuff against git tag X or
branch Y or whatever)

They're pretty much all against the master branch.

I got the git:

git clone git://git.postgresql.org/git/postgresql.git

downloaded the patches, and applied them ok.  then did ./configure and make

after much spewage I got:

bufmgr.c: In function 'PrefetchBuffer':
bufmgr.c:126:10: error: 'struct RelationData' has no member named
'rd_istemp'
make[4]: *** [bufmgr.o] Error 1

Woops. Good catch. I guess USE_PREFETCH isn't defined on my system.
That line needs to be changed to say RelationUsesLocalBuffers(reln)
rather than reln->rd_istemp. Updated patches attached.

That didnt happen the first time... I'm almost positive.

When you applied the patches the first time, it created that file; but
git reset --hard doesn't remove untracked files.

Not sure what I should do now.

git clean -dfx
git reset --hard
git pull

Apply attached patches.

configure
make
make install

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

unlogged-tables-v2.patchapplication/octet-stream; name=unlogged-tables-v2.patchDownload+748-43
relpersistence-v2.patchapplication/octet-stream; name=relpersistence-v2.patchDownload+323-220
relax-sync-commit-v1.patchapplication/octet-stream; name=relax-sync-commit-v1.patchDownload+17-10
#3Andy Colson
andy@squeakycode.net
In reply to: Robert Haas (#2)
Re: unlogged tables

I was able to apply and compile and run ok, creating unlogged tables
seems to work as well.

I patched up pgbench to optionally create unlogged tables, and ran it
both ways. I get ~80tps normally, and ~1,500tps with unlogged. (Thats
from memory, was playing with it last night at home)

I also have a "real world" test I can try (import apache logs and run a
few stats).

What other things would be good to test:
indexes?
analyze/stats/plans?
dump/restore?

Is "create temp unlogged table stuff(...)" an option?

-Andy

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andy Colson (#3)
Re: unlogged tables

Andy Colson <andy@squeakycode.net> writes:

Is "create temp unlogged table stuff(...)" an option?

temp tables are unlogged already.

regards, tom lane

#5Robert Haas
robertmhaas@gmail.com
In reply to: Andy Colson (#3)
Re: unlogged tables

On Tue, Nov 16, 2010 at 1:09 PM, Andy Colson <andy@squeakycode.net> wrote:

I was able to apply and compile and run ok, creating unlogged tables seems
to work as well.

I patched up pgbench to optionally create unlogged tables, and ran it both
ways.  I get ~80tps normally, and ~1,500tps with unlogged.  (Thats from
memory, was playing with it last night at home)

What do you get with normal tables but with fsync, full_page_writes,
and synchronous_commits turned off?

What do you get with normal tables but with sychronous_commit (only) off?

Can you detect any performance regression on normal tables with the
patch vs. without the patch?

I also have a "real world" test I can try (import apache logs and run a few
stats).

That would be great.

What other things would be good to test:
indexes?
analyze/stats/plans?
dump/restore?

All of those. I guess there's a question of what pg_dump should emit
for an unlogged table. Clearly, we need to dump a CREATE UNLOGGED
TABLE statement (which we do), and right now we also dump the table
contents - which seems reasonable, but arguably someone could say that
we ought not to dump the contents of anything less than a
full-fledged, permanent table.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#6Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Robert Haas (#5)
Re: unlogged tables

Excerpts from Robert Haas's message of mar nov 16 15:34:55 -0300 2010:

On Tue, Nov 16, 2010 at 1:09 PM, Andy Colson <andy@squeakycode.net> wrote:

dump/restore?

All of those. I guess there's a question of what pg_dump should emit
for an unlogged table. Clearly, we need to dump a CREATE UNLOGGED
TABLE statement (which we do), and right now we also dump the table
contents - which seems reasonable, but arguably someone could say that
we ought not to dump the contents of anything less than a
full-fledged, permanent table.

I think if you do a regular backup of the complete database, unlogged
tables should come out empty, but if you specifically request a dump of
it, it shouldn't.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#7Robert Haas
robertmhaas@gmail.com
In reply to: Alvaro Herrera (#6)
Re: unlogged tables

On Tue, Nov 16, 2010 at 1:58 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:

Excerpts from Robert Haas's message of mar nov 16 15:34:55 -0300 2010:

On Tue, Nov 16, 2010 at 1:09 PM, Andy Colson <andy@squeakycode.net> wrote:

dump/restore?

All of those.  I guess there's a question of what pg_dump should emit
for an unlogged table.  Clearly, we need to dump a CREATE UNLOGGED
TABLE statement (which we do), and right now we also dump the table
contents - which seems reasonable, but arguably someone could say that
we ought not to dump the contents of anything less than a
full-fledged, permanent table.

I think if you do a regular backup of the complete database, unlogged
tables should come out empty, but if you specifically request a dump of
it, it shouldn't.

Oh, wow. That seems confusing.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#7)
Re: unlogged tables

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Nov 16, 2010 at 1:58 PM, Alvaro Herrera

I think if you do a regular backup of the complete database, unlogged
tables should come out empty, but if you specifically request a dump of
it, it shouldn't.

Oh, wow. That seems confusing.

I don't like it either.

I think allowing pg_dump to dump the data in an unlogged table is not
only reasonable, but essential. Imagine that someone determines that
his reliability needs will be adequately served by unlogged tables plus
hourly backups. Now you're going to tell him that that doesn't work
because pg_dump arbitrarily excludes the data in unlogged tables?

regards, tom lane

#9Andrew Dunstan
andrew@dunslane.net
In reply to: Robert Haas (#7)
Re: unlogged tables

On 11/16/2010 02:06 PM, Robert Haas wrote:

On Tue, Nov 16, 2010 at 1:58 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:

I think if you do a regular backup of the complete database, unlogged
tables should come out empty, but if you specifically request a dump of
it, it shouldn't.

Oh, wow. That seems confusing.

Yeah. And unnecessary. If you want it excluded we already have a switch
for that.

cheers

andrew

#10Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#8)
Re: unlogged tables

On Tue, Nov 16, 2010 at 3:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Nov 16, 2010 at 1:58 PM, Alvaro Herrera

I think if you do a regular backup of the complete database, unlogged
tables should come out empty, but if you specifically request a dump of
it, it shouldn't.

Oh, wow.  That seems confusing.

I don't like it either.

I think allowing pg_dump to dump the data in an unlogged table is not
only reasonable, but essential.  Imagine that someone determines that
his reliability needs will be adequately served by unlogged tables plus
hourly backups.  Now you're going to tell him that that doesn't work
because pg_dump arbitrarily excludes the data in unlogged tables?

Yeah, you'd have to allow a flag to control the behavior. And in that
case I'd rather the flag have a single default rather than different
defaults depending on whether or not individual tables were selected.
Something like --omit-unlogged-data.

Incidentally, unlogged tables plus hourly backups is not dissimilar to
what some NoSQL products are offering for reliability. Except with
PG, you can (or soon will be able to, hopefully) selectively apply
that lowered degree of reliability to a subset of your data for which
you determine it's appropriate, while maintaining full reliability
guarantees for other data. I am not aware of any other product which
offers that level of fine-grained control over durability.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#11Josh Berkus
josh@agliodbs.com
In reply to: Robert Haas (#10)
Re: unlogged tables

Yeah, you'd have to allow a flag to control the behavior. And in that
case I'd rather the flag have a single default rather than different
defaults depending on whether or not individual tables were selected.
Something like --omit-unlogged-data.

Are you sure we don't want to default the other way? It seems to me
that most people using unlogged tables won't want to back them up ...
especially since the share lock for pgdump will add overhead for the
kinds of high-volume updates people want to do with unlogged tables.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

#12David Fetter
david@fetter.org
In reply to: Josh Berkus (#11)
Re: unlogged tables

On Tue, Nov 16, 2010 at 02:00:33PM -0800, Josh Berkus wrote:

Yeah, you'd have to allow a flag to control the behavior. And in
that case I'd rather the flag have a single default rather than
different defaults depending on whether or not individual tables
were selected. Something like --omit-unlogged-data.

Are you sure we don't want to default the other way?

+1 for defaulting the other way.

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#13Peter Eisentraut
peter_e@gmx.net
In reply to: Josh Berkus (#11)
Re: unlogged tables

On tis, 2010-11-16 at 14:00 -0800, Josh Berkus wrote:

It seems to me
that most people using unlogged tables won't want to back them up ...
especially since the share lock for pgdump will add overhead for the
kinds of high-volume updates people want to do with unlogged tables.

Or perhaps most people will want them backed up, because them being
unlogged the backup is the only way to get them back in case of a crash?

#14Josh Berkus
josh@agliodbs.com
In reply to: Peter Eisentraut (#13)
Re: unlogged tables

On 11/16/10 2:08 PM, Peter Eisentraut wrote:

On tis, 2010-11-16 at 14:00 -0800, Josh Berkus wrote:

It seems to me
that most people using unlogged tables won't want to back them up ...
especially since the share lock for pgdump will add overhead for the
kinds of high-volume updates people want to do with unlogged tables.

Or perhaps most people will want them backed up, because them being
unlogged the backup is the only way to get them back in case of a crash?

Yeah, hard to tell, really. Which default is less likely to become a
foot-gun?

Maybe it's time for a survey on -general.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

#15Joshua D. Drake
jd@commandprompt.com
In reply to: Josh Berkus (#11)
Re: unlogged tables

On Tue, 2010-11-16 at 14:00 -0800, Josh Berkus wrote:

Yeah, you'd have to allow a flag to control the behavior. And in that
case I'd rather the flag have a single default rather than different
defaults depending on whether or not individual tables were selected.
Something like --omit-unlogged-data.

Are you sure we don't want to default the other way? It seems to me
that most people using unlogged tables won't want to back them up ...

+1

JD

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt

#16Andrew Dunstan
andrew@dunslane.net
In reply to: Josh Berkus (#14)
Re: unlogged tables

On 11/16/2010 05:12 PM, Josh Berkus wrote:

On 11/16/10 2:08 PM, Peter Eisentraut wrote:

On tis, 2010-11-16 at 14:00 -0800, Josh Berkus wrote:

It seems to me
that most people using unlogged tables won't want to back them up ...
especially since the share lock for pgdump will add overhead for the
kinds of high-volume updates people want to do with unlogged tables.

Or perhaps most people will want them backed up, because them being
unlogged the backup is the only way to get them back in case of a crash?

Yeah, hard to tell, really. Which default is less likely to become a
foot-gun?

Maybe it's time for a survey on -general.

I would argue pretty strongly that backing something up is much less
likely to be a foot-gun than not backing it up, and treating unlogged
tables the same as logged tables for this purpose is also much less
likely to be a foot-gun. As I pointed out upthread, we already have a
mechanism for not backing up selected objects. I'd much rather have a
rule that says "everything gets backed up by default" than one that says
"everything gets backed up by default except unlogged tables".

cheers

andrew

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#10)
Re: unlogged tables

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Nov 16, 2010 at 3:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I think allowing pg_dump to dump the data in an unlogged table is not
only reasonable, but essential.

Yeah, you'd have to allow a flag to control the behavior. And in that
case I'd rather the flag have a single default rather than different
defaults depending on whether or not individual tables were selected.
Something like --omit-unlogged-data.

As long as the default is to include the data, I wouldn't object to
having such a flag. A default that drops data seems way too
foot-gun-like.

regards, tom lane

#18Joshua D. Drake
jd@commandprompt.com
In reply to: Peter Eisentraut (#13)
Re: unlogged tables

On Wed, 2010-11-17 at 00:08 +0200, Peter Eisentraut wrote:

On tis, 2010-11-16 at 14:00 -0800, Josh Berkus wrote:

It seems to me
that most people using unlogged tables won't want to back them up ...
especially since the share lock for pgdump will add overhead for the
kinds of high-volume updates people want to do with unlogged tables.

Or perhaps most people will want them backed up, because them being
unlogged the backup is the only way to get them back in case of a crash?

To me, the use of unlogged tables is going to be for dynamic, volatile
data that can be rebuilt from an integrity set on a crash. Session
tables, metadata tables, dynamic updates that are batched to logged
tables every 10 minutes, that type of thing.

I think Berkus has a good idea on asking general.

JD

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt

#19Andres Freund
andres@anarazel.de
In reply to: Josh Berkus (#14)
Re: unlogged tables

On Tuesday 16 November 2010 23:12:10 Josh Berkus wrote:

On 11/16/10 2:08 PM, Peter Eisentraut wrote:

On tis, 2010-11-16 at 14:00 -0800, Josh Berkus wrote:

It seems to me
that most people using unlogged tables won't want to back them up ...
especially since the share lock for pgdump will add overhead for the
kinds of high-volume updates people want to do with unlogged tables.

Or perhaps most people will want them backed up, because them being
unlogged the backup is the only way to get them back in case of a crash?

Yeah, hard to tell, really. Which default is less likely to become a
foot-gun?

Well. Maybe both possibilities are just propable(which I think is unlikely),
but the different impact is pretty clear.

One way your backup runs too long and too much data changes, the other way
round you loose the data which you assumed safely backuped.

Isn't that a *really* easy decision?

Andres

#20Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Andres Freund (#19)
Re: unlogged tables

Andres Freund <andres@anarazel.de> wrote:

One way your backup runs too long and too much data changes, the
other way round you loose the data which you assumed safely
backuped.

Isn't that a *really* easy decision?

Yeah. Count me in the camp which wants the default behavior to be
that pg_dump backs up all permanent tables, even those which aren't
WAL-logged (and therefore aren't kept up in PITR backups, hot/warm
standbys, or streaming replication).

-Kevin

#21Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#19)
#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#11)
#23Josh Berkus
josh@agliodbs.com
In reply to: Tom Lane (#22)
#24Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#19)
#25marcin mank
marcin.mank@gmail.com
In reply to: Robert Haas (#2)
#26Robert Haas
robertmhaas@gmail.com
In reply to: marcin mank (#25)
#27Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#26)
#28Josh Berkus
josh@agliodbs.com
In reply to: Robert Haas (#26)
#29Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#27)
#30David Fetter
david@fetter.org
In reply to: David Fetter (#12)
#31Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Robert Haas (#29)
#32Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#31)
#33Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#32)
#34Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Bruce Momjian (#33)
#35Robert Haas
robertmhaas@gmail.com
In reply to: Bruce Momjian (#33)
#36Josh Berkus
josh@agliodbs.com
In reply to: Robert Haas (#35)
#37Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#32)
#38Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#37)
#39Robert Haas
robertmhaas@gmail.com
In reply to: Josh Berkus (#36)
#40Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#38)
#41Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#40)
In reply to: Tom Lane (#41)
#43Andrew Dunstan
andrew@dunslane.net
In reply to: Kenneth Marshall (#42)
#44Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Robert Haas (#39)
#45Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#43)
#46Josh Berkus
josh@agliodbs.com
In reply to: Tom Lane (#45)
#47Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#43)
#48Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#47)
#49Robert Haas
robertmhaas@gmail.com
In reply to: Andrew Dunstan (#43)
#50Robert Haas
robertmhaas@gmail.com
In reply to: Josh Berkus (#46)
#51Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#49)
#52Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#45)
#53Andrew Dunstan
andrew@dunslane.net
In reply to: Robert Haas (#52)
#54Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#51)
#55David Fetter
david@fetter.org
In reply to: Andrew Dunstan (#53)
#56Steve Crawford
scrawford@pinpointresearch.com
In reply to: Andrew Dunstan (#53)
#57Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Robert Haas (#52)
#58A.M.
agentm@themactionfaction.com
In reply to: Kevin Grittner (#57)
#59Robert Haas
robertmhaas@gmail.com
In reply to: Kevin Grittner (#57)
#60Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Robert Haas (#54)
#61Andrew Dunstan
andrew@dunslane.net
In reply to: Kevin Grittner (#57)
#62David Fetter
david@fetter.org
In reply to: Robert Haas (#59)
#63Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Robert Haas (#59)
#64Steve Crawford
scrawford@pinpointresearch.com
In reply to: Tom Lane (#47)
#65Joshua D. Drake
jd@commandprompt.com
In reply to: David Fetter (#62)
#66Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Robert Haas (#59)
#67Robert Haas
robertmhaas@gmail.com
In reply to: Dimitri Fontaine (#66)
#68Andy Colson
andy@squeakycode.net
In reply to: Robert Haas (#5)
#69Robert Haas
robertmhaas@gmail.com
In reply to: Andy Colson (#68)