remove flatfiles.c

Started by Alvaro Herrera · over 16 years ago · 69 messages · pgsql-hackers
#1Alvaro Herrera
alvherre@2ndquadrant.com

This patch removes flatfiles.c for good.

It doesn't change the keeping of locks in dbcommands.c and user.c,
because at least some of them are still required.

Regarding sync commits that previously happen and now won't, I think the
only case worth worrying about is the one in vacuum.c. Do we need a
ForceSyncCommit() in there? I'm not sure if vacuum itself already
forces sync commit.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Attachments:

remove-flatfiles.patch (text/x-diff; charset=us-ascii, +64/-205)
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#1)
Re: remove flatfiles.c

Alvaro Herrera <alvherre@commandprompt.com> writes:

This patch removes flatfiles.c for good.

Aw, you beat me to it.

Regarding sync commits that previously happen and now won't, I think the
only case worth worrying about is the one in vacuum.c. Do we need a
ForceSyncCommit() in there? I'm not sure if vacuum itself already
forces sync commit.

Hmm, I had been assuming we wouldn't need that anymore.

regards, tom lane

#3Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#2)
Re: remove flatfiles.c

Tom Lane wrote:

Alvaro Herrera <alvherre@commandprompt.com> writes:

Regarding sync commits that previously happen and now won't, I think the
only case worth worrying about is the one in vacuum.c. Do we need a
ForceSyncCommit() in there? I'm not sure if vacuum itself already
forces sync commit.

Hmm, I had been assuming we wouldn't need that anymore.

The comment in user.c and dbcommands.c says

/*
* Force synchronous commit, thus minimizing the window between
* creation of the database files and commital of the transaction. If
* we crash before committing, we'll have a DB that's taking up disk
* space but is not in pg_database, which is not good.
*/
ForceSyncCommit();

so I think those ones are still necessary. There's another call in
RenameDatabase() which I don't think needs a sync commit (because it
won't change the dir name), and one in vacuum.c:

/*
! * If we were able to advance datfrozenxid, mark the flat-file copy of
! * pg_database for update at commit, and see if we can truncate pg_clog.
! * Also force update if the shared XID-wrap-limit info is stale.
*/
if (dirty || !TransactionIdLimitIsValid())
- {
- database_file_update_needed();
vac_truncate_clog(newFrozenXid);
- }
}

AFAICT this doesn't need a sync commit. (Right now, VACUUM FULL forces
one, but lazy vacuum doesn't).

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#3)
Re: remove flatfiles.c

Alvaro Herrera <alvherre@commandprompt.com> writes:

Tom Lane wrote:

Hmm, I had been assuming we wouldn't need that anymore.

The comment in user.c and dbcommands.c says [...]
so I think those ones are still necessary.

Yeah, after a look through the code I think you can trust the associated
comments: if it says it needs sync commit, put in ForceSyncCommit, else
we don't need it.

regards, tom lane

#5Simon Riggs
simon@2ndQuadrant.com
In reply to: Alvaro Herrera (#1)
Re: remove flatfiles.c

On Mon, 2009-08-31 at 18:53 -0400, Alvaro Herrera wrote:

Regarding sync commits that previously happen and now won't, I think the
only case worth worrying about is the one in vacuum.c. Do we need a
ForceSyncCommit() in there? I'm not sure if vacuum itself already
forces sync commit.

VACUUM FULL requires ForceSyncCommit().

Not sure why removing them elsewhere is important? Getting robustness
wrong is a big, bad thing and this opens us to future error. We already
tuned VACUUM so it does very little if it has no work to do, why would
one extra I/O improve things so much? If it ain't broke...

VACUUM does so many things that I'd rather have it all safely on disk.
I'd feel happier with the rule "VACUUM always sync commits", so we all
remember it and can rely upon it to be the same from release to release.

--
Simon Riggs www.2ndQuadrant.com

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#5)
Re: remove flatfiles.c

Simon Riggs <simon@2ndQuadrant.com> writes:

VACUUM does so many things that I'd rather have it all safely on disk.
I'd feel happier with the rule "VACUUM always sync commits", so we all
remember it and can rely upon it to be the same from release to release.

Non-FULL vacuum has *never* done a sync commit, except in the unusual
corner case that it moves the database's datfrozenxid, which is a corner
case that didn't even exist until fairly recently. I think the argument
that we should have it force sync for no reason whatsoever is silly.
We get beat up on a regular basis about "spikes" in response time;
why would you want to have vacuum creating one when it doesn't need to?

As for the FULL case, the sync commit is to try to protect a horribly
unsafe kluge that should go away entirely (if vacuum full itself doesn't
go away entirely). That's hardly something I want to institutionalize
either.

regards, tom lane

#7Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#6)
Re: remove flatfiles.c

On Tue, 2009-09-01 at 09:58 -0400, Tom Lane wrote:

We get beat up on a regular basis about "spikes" in response time;
why would you want to have vacuum creating one when it doesn't need
to?

If one I/O on a background utility can cause such a spike, we are in
serious shitake. I would be more comfortable if the various important
things VACUUM does were protected by sync commit. I see no reason to
optimise away one I/O just because we might theoretically do so. Any
mistake in the theory and we are exposed. Why take the risk? We do many
things to check and secure our data, why not this one? If this were
suggested separately as an optimisation you'd laugh and say why
bother?

--
Simon Riggs www.2ndQuadrant.com

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#7)
Re: remove flatfiles.c

Simon Riggs <simon@2ndQuadrant.com> writes:

On Tue, 2009-09-01 at 09:58 -0400, Tom Lane wrote:

We get beat up on a regular basis about "spikes" in response time;
why would you want to have vacuum creating one when it doesn't need
to?

If one I/O on a background utility can cause such a spike, we are in
serious shitake. I would be more comfortable if the various important
things VACUUM does were protected by sync commit. I see no reason to
optimise away one I/O just because we might theoretically do so. Any
mistake in the theory and we are exposed. Why take the risk?

*WHAT* risk? Most vacuums do not do a sync commit, and never have.

regards, tom lane

#9Greg Stark
gsstark@mit.edu
In reply to: Tom Lane (#6)
Re: remove flatfiles.c

On Tue, Sep 1, 2009 at 2:58 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:

We get beat up on a regular basis about "spikes" in response time;
why would you want to have vacuum creating one when it doesn't need to?

Isn't this sync commit just going to do the same thing that the wal
writer is going to do in at most 200ms anyways?

As for the FULL case, the sync commit is to try to protect a horribly
unsafe kluge that should go away entirely (if vacuum full itself doesn't
go away entirely).

I'm all for throwing away VACUUM FULL btw. I was thinking of proposing
that we replace it with something like CLUSTER which just rewrites the
tuples in the order it finds them.

The use cases where VACUUM FULL wins currently are where storing two
copies of the table and its indexes concurrently just isn't practical.
Also perhaps tables where there are too many large indexes to make
rebuilding them all in one maintenance window practical.

I don't see any way to address these problems without something as
complex as xvac and moved_in/moved_off and without the index bloat
problems. I think we could improve the i/o access patterns we have
currently which make vacuum full so slow, but the fundamental problems
would remain.

So the question is whether those use cases are worth keeping our
existing vacuum full for or whether we could do without it and just
recommend partitioning for people with tables large enough to make
table rewrites impractical.

--
greg
http://mit.edu/~gsstark/resume.pdf

#10Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Greg Stark (#9)
Re: remove flatfiles.c

Greg Stark wrote:

The use cases where VACUUM FULL wins currently are where storing two
copies of the table and its indexes concurrently just isn't practical.

Yeah, but then do you really need to use VACUUM FULL? If that's really
a problem then there ain't that many dead tuples around.

Also perhaps tables where there are too many large indexes to make
rebuilding them all in one maintenance window practical.

If that's the concern maybe we oughta do something about concurrently
re-creating those indexes somehow. Plain REINDEX doesn't work of
course, but maybe we can do some trick with creating a new index and
dropping the original one afterwards.
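
A minimal sketch of that trick, assuming a table `bigtable` with a bloated
index `bigtable_pk_idx` on `pk` (all names hypothetical): plain REINDEX
locks out writes for the duration, but CREATE INDEX CONCURRENTLY does not,
so the replacement can be built first and swapped in afterwards.

```sql
-- Hypothetical names: bigtable, bigtable_pk_idx.
-- Build a replacement index without a long write-blocking lock,
-- then swap it for the bloated original.
CREATE INDEX CONCURRENTLY bigtable_pk_idx_new ON bigtable (pk);
DROP INDEX bigtable_pk_idx;                        -- brief exclusive lock
ALTER INDEX bigtable_pk_idx_new RENAME TO bigtable_pk_idx;
```

This only works for plain indexes; an index backing a constraint (e.g. a
primary key) can't simply be dropped out from under it this way.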

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#11Greg Stark
gsstark@mit.edu
In reply to: Alvaro Herrera (#10)
Re: remove flatfiles.c

On Wed, Sep 2, 2009 at 12:01 AM, Alvaro
Herrera<alvherre@commandprompt.com> wrote:

The use cases where VACUUM FULL wins currently are where storing two
copies of the table and its indexes concurrently just isn't practical.

Yeah, but then do you really need to use VACUUM FULL?  If that's really
a problem then there ain't that many dead tuples around.

That's what I want to believe. But picture if you have, say a
1-terabyte table which is 50% dead tuples and you don't have a spare
1-terabytes to rewrite the whole table.

Also perhaps tables where there are too many large indexes to make
rebuilding them all in one maintenance window practical.

If that's the concern maybe we oughta do something about concurrently
re-creating those indexes somehow.  Plain REINDEX doesn't work of
course, but maybe we can do some trick with creating a new index and
dropping the original one afterwards.

Well that doesn't really work if you want to rewrite the table.
CLUSTER has to rebuild all the indexes when it's done.

I think the solution for both of these is actually partitioning. The
bottom line is that having a single table which contains very large
amounts of data is awkward to maintain.

--
greg
http://mit.edu/~gsstark/resume.pdf

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Greg Stark (#11)
Re: remove flatfiles.c

Greg Stark <gsstark@mit.edu> writes:

On Wed, Sep 2, 2009 at 12:01 AM, Alvaro
Herrera<alvherre@commandprompt.com> wrote:

The use cases where VACUUM FULL wins currently are where storing two
copies of the table and its indexes concurrently just isn't practical.

Yeah, but then do you really need to use VACUUM FULL? If that's really
a problem then there ain't that many dead tuples around.

That's what I want to believe. But picture if you have, say a
1-terabyte table which is 50% dead tuples and you don't have a spare
1-terabytes to rewrite the whole table.

But trying to VACUUM FULL that table is going to be horridly painful
too, and you'll still have bloated indexes afterwards. You might as
well just live with the 50% waste, especially since if you did a
full-table update once you'll probably do it again sometime.

I'm having a hard time believing that VACUUM FULL really has any
interesting use-case anymore.

regards, tom lane

#13Ron Mayer
rm_pg@cheapcomplexdevices.com
In reply to: Greg Stark (#11)
Re: remove flatfiles.c

Greg Stark wrote:

That's what I want to believe. But picture if you have, say a
1-terabyte table which is 50% dead tuples and you don't have a spare
1-terabytes to rewrite the whole table.

Could one hypothetically do
update bigtable set pk = pk where ctid in (select ctid from bigtable order by ctid desc limit 100);
vacuum;
and repeat until max(ctid) is small enough?

Sure, it'll take longer than vacuum full; but at first glance
it seems lightweight enough to do even on a live, heavily accessed
table.

IIRC I tried something like this once, and it worked to some extent,
but after a few loops didn't shrink the table as much as I had expected.
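
For what it's worth, one iteration of the loop above might look like this
(a sketch only; `bigtable` and its primary key `pk` are the hypothetical
names from the example, and the batch size of 100 is arbitrary):

```sql
-- Sketch of the incremental shrink idea: re-UPDATE the rows in the
-- last heap pages so their new versions are (hopefully) placed in free
-- space earlier in the table, then let VACUUM truncate the empty tail.
UPDATE bigtable SET pk = pk
 WHERE ctid IN (SELECT ctid FROM bigtable
                ORDER BY ctid DESC LIMIT 100);
VACUUM bigtable;
SELECT pg_relation_size('bigtable');  -- repeat while this keeps dropping
```

One caveat that may explain the diminishing returns: there is no guarantee
the new tuple versions land in earlier pages, since the free space map may
hand back space near the end of the table, and VACUUM can only truncate
pages at the very end of the file that are completely empty.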

#14Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Ron Mayer (#13)
Re: remove flatfiles.c

Ron Mayer wrote:

Greg Stark wrote:

That's what I want to believe. But picture if you have, say a
1-terabyte table which is 50% dead tuples and you don't have a spare
1-terabytes to rewrite the whole table.

Could one hypothetically do
update bigtable set pk = pk where ctid in (select ctid from bigtable order by ctid desc limit 100);
vacuum;
and repeat until max(ctid) is small enough?

I remember Hannu Krosing said they used something like that to shrink
really bloated tables. Maybe we should try to explicitly support a
mechanism that worked in that fashion. I think I tried it at some point
and found that the problem with it was that ctid was too limited in what
it was able to do.

The neat thing is that now that we have the visibility fork, each vacuum
needn't scan the whole table each time.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#15Rod Taylor
rbt@rbt.ca
In reply to: Greg Stark (#11)
Re: remove flatfiles.c

On Tue, Sep 1, 2009 at 19:34, Greg Stark <gsstark@mit.edu> wrote:

On Wed, Sep 2, 2009 at 12:01 AM, Alvaro
Herrera<alvherre@commandprompt.com> wrote:

The use cases where VACUUM FULL wins currently are where storing two
copies of the table and its indexes concurrently just isn't practical.

Yeah, but then do you really need to use VACUUM FULL? If that's really
a problem then there ain't that many dead tuples around.

That's what I want to believe. But picture if you have, say a
1-terabyte table which is 50% dead tuples and you don't have a spare
1-terabytes to rewrite the whole table.

It would be interesting if there was something between VACUUM FULL and
CLUSTER which could, say, work on a single 1GB segment at a time in a manner
similar to cluster.

You would still end up with index bloat like vacuum full, though perhaps not
as bad, but shuffling around the tuples should be faster.

The idea here is that the files can be truncated individually. Two 500MB
files is pretty much the same as a single 1GB file on disk.

Of course, I'm hand waving and don't have the technical expertise to figure
out if it can be done easily within PostgreSQL.

#16Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#12)
Re: remove flatfiles.c

On Tue, Sep 1, 2009 at 7:42 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:

Greg Stark <gsstark@mit.edu> writes:

On Wed, Sep 2, 2009 at 12:01 AM, Alvaro
Herrera<alvherre@commandprompt.com> wrote:

The use cases where VACUUM FULL wins currently are where storing two
copies of the table and its indexes concurrently just isn't practical.

Yeah, but then do you really need to use VACUUM FULL?  If that's really
a problem then there ain't that many dead tuples around.

That's what I want to believe. But picture if you have, say a
1-terabyte table which is 50% dead tuples and you don't have a spare
1-terabytes to rewrite the whole table.

But trying to VACUUM FULL that table is going to be horridly painful
too, and you'll still have bloated indexes afterwards.  You might as
well just live with the 50% waste, especially since if you did a
full-table update once you'll probably do it again sometime.

I'm having a hard time believing that VACUUM FULL really has any
interesting use-case anymore.

What if your large table doesn't have an index? Then there's no way to cluster.

I'm a bit skeptical about partitioning as a solution, too. The
planner is just not clever enough with partitioned tables, yet.

...Robert

#17Robert Haas
robertmhaas@gmail.com
In reply to: Alvaro Herrera (#14)
Re: remove flatfiles.c

On Tue, Sep 1, 2009 at 9:29 PM, Alvaro
Herrera<alvherre@commandprompt.com> wrote:

Ron Mayer wrote:

Greg Stark wrote:

That's what I want to believe. But picture if you have, say a
1-terabyte table which is 50% dead tuples and you don't have a spare
1-terabytes to rewrite the whole table.

Could one hypothetically do
   update bigtable set pk = pk where ctid in (select ctid from bigtable order by ctid desc limit 100);
   vacuum;
and repeat until max(ctid) is small enough?

I remember Hannu Krosing said they used something like that to shrink
really bloated tables.  Maybe we should try to explicitly support a
mechanism that worked in that fashion.  I think I tried it at some point
and found that the problem with it was that ctid was too limited in what
it was able to do.

I think a way to incrementally shrink large tables would be enormously
beneficial. Maybe vacuum could try to do a bit of that each time it
runs.

...Robert

#18Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Robert Haas (#16)
Re: remove flatfiles.c

Robert Haas escribió:

On Tue, Sep 1, 2009 at 7:42 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:

But trying to VACUUM FULL that table is going to be horridly painful
too, and you'll still have bloated indexes afterwards. You might as
well just live with the 50% waste, especially since if you did a
full-table update once you'll probably do it again sometime.

I'm having a hard time believing that VACUUM FULL really has any
interesting use-case anymore.

What if your large table doesn't have an index? Then there's no way to cluster.

But there's nothing saying we cannot provide a version of CLUSTER that
does not follow any index and just copies the live tuples.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#19Robert Haas
robertmhaas@gmail.com
In reply to: Alvaro Herrera (#18)
Re: remove flatfiles.c

On Tue, Sep 1, 2009 at 10:58 PM, Alvaro
Herrera<alvherre@commandprompt.com> wrote:

Robert Haas escribió:

On Tue, Sep 1, 2009 at 7:42 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:

But trying to VACUUM FULL that table is going to be horridly painful
too, and you'll still have bloated indexes afterwards.  You might as
well just live with the 50% waste, especially since if you did a
full-table update once you'll probably do it again sometime.

I'm having a hard time believing that VACUUM FULL really has any
interesting use-case anymore.

What if your large table doesn't have an index?  Then there's no way to cluster.

But there's nothing saying we cannot provide a version of CLUSTER that
does not follow any index and just copies the live tuples.

Agreed.

...Robert

#20Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Robert Haas (#16)
Re: remove flatfiles.c

On Tue, Sep 1, 2009 at 9:55 PM, Robert Haas<robertmhaas@gmail.com> wrote:

I'm a bit skeptical about partitioning as a solution, too.  The
planner is just not clever enough with partitioned tables, yet.

analyzing and vacuuming a *very* big table, and even scanning a huge
index, is no joke either...
and yes, the planner is not very clever about partitioning, and
certainly that is something we need to fix, not something we have to
live with... not that it will be easy, but hey! we have very brilliant
people here (you being one of them)

--
Sincerely,
Jaime Casanova
PostgreSQL support and training
Systems consulting and development
Guayaquil - Ecuador
Cel. +59387171157

#21Bruce Momjian
bruce@momjian.us
In reply to: Jaime Casanova (#20)
#22Josh Berkus
josh@agliodbs.com
In reply to: Bruce Momjian (#21)
#23Joshua D. Drake
jd@commandprompt.com
In reply to: Josh Berkus (#22)
#24Bruce Momjian
bruce@momjian.us
In reply to: Josh Berkus (#22)
#25Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Bruce Momjian (#24)
#26Josh Berkus
josh@agliodbs.com
In reply to: Bruce Momjian (#24)
#27Joshua D. Drake
jd@commandprompt.com
In reply to: Josh Berkus (#26)
#28Robert Haas
robertmhaas@gmail.com
In reply to: Bruce Momjian (#24)
#29Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#24)
#30Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#28)
#31Bruce Momjian
bruce@momjian.us
In reply to: Kevin Grittner (#25)
#32Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#29)
#33Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#30)
#34Bruce Momjian
bruce@momjian.us
In reply to: Robert Haas (#33)
#35Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#34)
#36Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#34)
#37Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#36)
#38Robert Haas
robertmhaas@gmail.com
In reply to: Bruce Momjian (#34)
#39Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#37)
#40Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#39)
#41Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#40)
#42Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#41)
#43Ron Mayer
rm_pg@cheapcomplexdevices.com
In reply to: Robert Haas (#17)
#44Bruce Momjian
bruce@momjian.us
In reply to: Ron Mayer (#43)
#45Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#35)
#46daveg
daveg@sonic.net
In reply to: Tom Lane (#12)
#47Andrew Dunstan
andrew@dunslane.net
In reply to: daveg (#46)
#48daveg
daveg@sonic.net
In reply to: Andrew Dunstan (#47)
#49Josh Berkus
josh@agliodbs.com
In reply to: daveg (#48)
#50Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#49)
#51Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#50)
#52Boszormenyi Zoltan
zb@cybertec.at
In reply to: Tom Lane (#50)
#53Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#51)
#54Joshua D. Drake
jd@commandprompt.com
In reply to: Tom Lane (#53)
#55Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Joshua D. Drake (#54)
#56Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#55)
#57Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#56)
#58Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Tom Lane (#56)
#59Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Dimitri Fontaine (#58)
#60Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Alvaro Herrera (#59)
#61Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Dimitri Fontaine (#60)
#62Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dimitri Fontaine (#60)
#63Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Alvaro Herrera (#61)
#64Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#57)
#65Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#64)
#66Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#65)
#67Andrew McNamara
andrewm@object-craft.com.au
In reply to: Tom Lane (#12)
#68Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#57)
#69Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#68)