tablespaces a priority for 7.5?

Started by Brian Maguireabout 22 years ago22 messagesgeneral
Jump to latest
#1Brian Maguire
bmaguire@vantage.com

Is support for tablespaces a priority feature for 7.5? I believe there
has been significant development in this area and it seems that
postgres' file structure opens it up nicely to support it. What are the
chances this will be completed?

In my opinion, it really is a critical feature to support and administer
enterprise databases. All the major databases currently support this
and it is a compelling enough reason drive big users from away from
using postgres for their enterprise/large databases. It really is a
database administrator's feature.

Brian

#2Noname
lnd@hnit.is
In reply to: Brian Maguire (#1)
Re: tablespaces a priority for 7.5?

In RAID era tablespaces are not such important regarding performance.

But for backup/restore - the ability to backup/restore selected tablespaces
while leaving other tablespaces is a big thing.
The whole point here is: it is assumed that backup/restore of tablespaces can
hapen quite quickly and as simple as to copy tablespace files from one
location to another(even while database is on - WAL can be used to handle
this) - this is compared to dump.

For example, index, tempoarary data tablespaces can be lost - not a big deal.

Undo(rollback) tablespaces - in a way can be lost as well.
While system data tablespace (table structure, stored procedures, etc) - at
no cost should be lost.
The same way application can be devided in "critical" and "not critical"
tablespaces and their backups maintained accordingly. For example, it may not
be a big deal to lose year 1996 tables while year 2004 tables should be
online.

Laimis

-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Brian Maguire
Sent: 21. janúar 2004 16:06
To: pgsql-general@postgresql.org
Subject: [GENERAL] tablespaces a priority for 7.5?

Is support for tablespaces a priority feature for 7.5? I
believe there has been significant development in this area
and it seems that postgres' file structure opens it up nicely
to support it. What are the chances this will be completed?

In my opinion, it really is a critical feature to support and
administer enterprise databases. All the major databases
currently support this and it is a compelling enough reason
drive big users from away from using postgres for their
enterprise/large databases. It really is a database
administrator's feature.

Brian

---------------------------(end of
broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

#3Brian Maguire
bmaguire@vantage.com
In reply to: Noname (#2)
Re: tablespaces a priority for 7.5?

I agree that RAID provides similar performance benifits especially with striping io benifits, however it is powerful and ideal to have both options. For example you may have a set of tables that are read-only for reporting and another set mostly write only. You could have they resting on different raid configurations ideal for each situtation.

I also agree there are several admin benifits in the areas of backup. You can also more easily create different frequency/schedules of backup for certain critical tables to a different schedule than other less important tables. The flexibility of easily growing your database beyond the current disk because of size limitations can be a life savior.

brian

-----Original Message-----
From: lnd@hnit.is [mailto:lnd@hnit.is]
Sent: Wed 1/21/2004 12:22 PM
To: Brian Maguire
Cc:
Subject: RE: [GENERAL] tablespaces a priority for 7.5?

In RAID era tablespaces are not such important regarding performance.

But for backup/restore - the ability to backup/restore selected tablespaces
while leaving other tablespaces is a big thing.
The whole point here is: it is assumed that backup/restore of tablespaces can
hapen quite quickly and as simple as to copy tablespace files from one
location to another(even while database is on - WAL can be used to handle
this) - this is compared to dump.

For example, index, tempoarary data tablespaces can be lost - not a big deal.

Undo(rollback) tablespaces - in a way can be lost as well.
While system data tablespace (table structure, stored procedures, etc) - at
no cost should be lost.
The same way application can be devided in "critical" and "not critical"
tablespaces and their backups maintained accordingly. For example, it may not
be a big deal to lose year 1996 tables while year 2004 tables should be
online.

Laimis

-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Brian Maguire
Sent: 21. janúar 2004 16:06
To: pgsql-general@postgresql.org
Subject: [GENERAL] tablespaces a priority for 7.5?

Is support for tablespaces a priority feature for 7.5? I
believe there has been significant development in this area
and it seems that postgres' file structure opens it up nicely
to support it. What are the chances this will be completed?

In my opinion, it really is a critical feature to support and
administer enterprise databases. All the major databases
currently support this and it is a compelling enough reason
drive big users from away from using postgres for their
enterprise/large databases. It really is a database
administrator's feature.

Brian

---------------------------(end of
broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Brian Maguire (#3)
Re: tablespaces a priority for 7.5?

The whole point here is: it is assumed that backup/restore of tablespaces can
hapen quite quickly and as simple as to copy tablespace files from one
location to another(even while database is on - WAL can be used to handle
this) - this is compared to dump.

This is not going to happen. We intend to provide tablespaces in the
form of a simple management scheme for table files that are scattered
across multiple directories (typically on different filesystems).
That doesn't make it any safer to copy files behind the database's back.

regards, tom lane

#5Cott Lang
cott@internetstaff.com
In reply to: Brian Maguire (#1)
Re: tablespaces a priority for 7.5?

On Wed, 2004-01-21 at 09:05, Brian Maguire wrote:

In my opinion, it really is a critical feature to support and administer
enterprise databases. All the major databases currently support this
and it is a compelling enough reason drive big users from away from
using postgres for their enterprise/large databases. It really is a
database administrator's feature.

It seems to me that the lack of point-in-time recovery is a much bigger
roadblock against big users. :(

#6Noname
lnd@hnit.is
In reply to: Cott Lang (#5)
Re: tablespaces a priority for 7.5?

On Behalf Of Cott Lang
It seems to me that the lack of point-in-time recovery is a
much bigger roadblock against big users. :(

Meaning incremental (hot)-backups?
Or as protection against DROP/TRUNCATE/DELETE ALL TABLE/SCHEMA/DATABASE?

With a WAL it should be doable in some 7.x version, all ingredients are
there.

Possibly someone suceeded in doing it already? Having a baseline backup and
saved WAL logs, shouldn't it be possible to recover?

Laimis

#7Cott Lang
cott@internetstaff.com
In reply to: Noname (#6)
Re: tablespaces a priority for 7.5?

On Thu, 2004-01-22 at 07:38, lnd@hnit.is wrote:

Meaning incremental (hot)-backups?
Or as protection against DROP/TRUNCATE/DELETE ALL TABLE/SCHEMA/DATABASE?

With a WAL it should be doable in some 7.x version, all ingredients are
there.

Possibly someone suceeded in doing it already? Having a baseline backup and
saved WAL logs, shouldn't it be possible to recover?

Incremental pg_dumps would be a huge step in the right direction!

However, unless I am mistaken, a baseline backup would need to be taken
cold because you cannot take a consistent online backup of the data
files without using file system snapshots or split mirrors, and even
that's questionable.

#8Noname
lnd@hnit.is
In reply to: Cott Lang (#7)
Re: tablespaces a priority for 7.5?

Incremental pg_dumps would be a huge step in the right direction!

For big people (meaning bid databases) - not shure if pg_dump is the right
direction. Pg_dump must be quite slow also compact. Raw file copy is the way
to go: quick and simple, virtually no configuration is required, no possible
pg_dump bugs - the latest quite important(of course, no DB version and OS
changes are possible, but not really needed - this is recovery, not
migration).

However, unless I am mistaken, a baseline backup would need
to be taken cold because you cannot take a consistent online
backup of the data files without using file system snapshots
or split mirrors, and even that's questionable.

That's doable: depends what you have in WAL logs.
In short: a baseline full hot database backup is taken while database is
running, then when recovering WAL logs are put on top of this baseline
backup. WAL logs must actually account for a lot: table, index changes, etc.

From my understanding, this is done not at SQL level, but at pages level. as

far as I remember the method - a full hot database backup must be taken at
page level as well, i.e. simple OS file copy-utility may not suite for the
purpose.

Regards, Laimis

Show quoted text

-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Cott Lang
Sent: 22. janúar 2004 15:00
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] tablespaces a priority for 7.5?

On Thu, 2004-01-22 at 07:38, lnd@hnit.is wrote:

Meaning incremental (hot)-backups?
Or as protection against DROP/TRUNCATE/DELETE ALL
TABLE/SCHEMA/DATABASE?

With a WAL it should be doable in some 7.x version, all ingredients
are there.

Possibly someone suceeded in doing it already? Having a baseline
backup and saved WAL logs, shouldn't it be possible to recover?

Incremental pg_dumps would be a huge step in the right direction!

However, unless I am mistaken, a baseline backup would need
to be taken cold because you cannot take a consistent online
backup of the data files without using file system snapshots
or split mirrors, and even that's questionable.

---------------------------(end of
broadcast)---------------------------
TIP 8: explain analyze is your friend

#9Jeff
threshar@torgo.978.org
In reply to: Cott Lang (#7)
Re: tablespaces a priority for 7.5?

On Jan 22, 2004, at 10:00 AM, Cott Lang wrote:

On Thu, 2004-01-22 at 07:38, lnd@hnit.is wrote:

Meaning incremental (hot)-backups?
Or as protection against DROP/TRUNCATE/DELETE ALL
TABLE/SCHEMA/DATABASE?

With a WAL it should be doable in some 7.x version, all ingredients
are
there.

Possibly someone suceeded in doing it already? Having a baseline
backup and
saved WAL logs, shouldn't it be possible to recover?

Incremental pg_dumps would be a huge step in the right direction!

However, unless I am mistaken, a baseline backup would need to be taken
cold because you cannot take a consistent online backup of the data
files without using file system snapshots or split mirrors, and even
that's questionable.

pg_dump always takes a consistent dump - things won't change underneath
it.
So you're backup won't have anything that changed after pg_dump
started.. thus where incremental would come in.

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Cott Lang (#7)
Re: tablespaces a priority for 7.5?

Cott Lang <cott@internetstaff.com> writes:

However, unless I am mistaken, a baseline backup would need to be taken
cold because you cannot take a consistent online backup of the data
files without using file system snapshots or split mirrors, and even
that's questionable.

No, it wouldn't. All you need is to archive WAL beginning from the last
checkpoint record before you begin to take the baseline backup. The
baseline backup would not be consistent in itself --- but replaying WAL
from the previous checkpoint to any time later than the end of the
backup would bring it to a consistent state. Missed updates in the
backup are essentially damage that would get repaired by replay.

There are a few trivial things that still have to be done before this
can be a reality (one I can think of is that WAL really needs to have
entries for file creation/deletion), but it's not that far off in terms
of the base mechanisms. Writing the management software is the main
task.

There is a lot more info available in the pghackers archives.

regards, tom lane

#11Cott Lang
cott@internetstaff.com
In reply to: Jeff (#9)
Re: tablespaces a priority for 7.5?

However, unless I am mistaken, a baseline backup would need to be taken
cold because you cannot take a consistent online backup of the data
files without using file system snapshots or split mirrors, and even
that's questionable.

pg_dump always takes a consistent dump - things won't change underneath
it.
So you're backup won't have anything that changed after pg_dump
started.. thus where incremental would come in.

Sorry, I'm referring to two entirely different things there. :)

1. Being able to do incremental pg_dumps would be a big plus, because
you could take much quicker dumps and thus do it more regularly. I have
a 50GB database I dump every 3 hours that takes 35 minutes to dump. :(

2. The rest was my hypothesizing on what might be necessary for
point-in-time recovery, which pg_dump isn't going to allow even if you
can perform incremental dumps.

#12Cott Lang
cott@internetstaff.com
In reply to: Tom Lane (#10)
Re: tablespaces a priority for 7.5?

On Thu, 2004-01-22 at 09:04, Tom Lane wrote:

No, it wouldn't. All you need is to archive WAL beginning from the last
checkpoint record before you begin to take the baseline backup. The
baseline backup would not be consistent in itself --- but replaying WAL
from the previous checkpoint to any time later than the end of the
backup would bring it to a consistent state. Missed updates in the
backup are essentially damage that would get repaired by replay.

I will experiment with this. I have plenty of databases to wreak havoc
on. :)

There are a few trivial things that still have to be done before this
can be a reality (one I can think of is that WAL really needs to have
entries for file creation/deletion), but it's not that far off in terms
of the base mechanisms. Writing the management software is the main
task.

Being able to write WAL logs to two locations would be handy. Is it
currently possible to have the logs not be recycled and occasionally
deleted?

Is there any hope for support for all of this in 7.5? I dread knowing
that at some point, I may be forced kicking and screaming back to Oracle
because of this. :)

thanks!

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Cott Lang (#11)
Re: tablespaces a priority for 7.5?

Cott Lang <cott@internetstaff.com> writes:

2. The rest was my hypothesizing on what might be necessary for
point-in-time recovery, which pg_dump isn't going to allow even if you
can perform incremental dumps.

Right. There seems to be some confusion about that in this thread,
so just for the record: pg_dump has nothing to do with our plans for
point-in-time recovery (or incremental backup, which is essentially the
same thing). To do these, you would take a *physical* dump of the
database directory as a baseline, and thereafter copy WAL segments off
to tape or whatever you are using as archive media. Recovery would
consist of restoring the physical baseline dump, and then replaying WAL
against it up to whatever point in time you wanted to recover to. You
would, therefore, need to keep a continuous sequence of WAL files back
to the time of your most recent baseline backup.

regards, tom lane

#14John Sidney-Woollett
johnsw@wardbrook.com
In reply to: Tom Lane (#13)
Re: tablespaces a priority for 7.5?

Tom Lane said:

To do these, you would take a *physical* dump of the
database directory as a baseline, and thereafter copy WAL segments off
to tape or whatever you are using as archive media. Recovery would
consist of restoring the physical baseline dump, and then replaying WAL
against it up to whatever point in time you wanted to recover to. You
would, therefore, need to keep a continuous sequence of WAL files back
to the time of your most recent baseline backup.

This is similar to Oracle...

With Oracle you have the option of EITHER exporting the database
(equivalent to doing pg_dump) OR taking the database offline and carrying
out a file system level copy of the database files.

Recovery can be accomplished by either restoring the backup database files
or creating a new database by importing the most recent export/dump file.

After that you apply the archive logs (equivalent to WAL segments) to
bring your database up to date.

One caveat for Oracle (at least in 8.1.x) is that DDL statements are not
recorded in the archive logs, and can screw things up. Best to take a new
export/dump after making DDL changes!

The difference between Oracle and Postgres appears to be that posgres
requires a file system level copy of the database instead of being able to
make use of a dump file for this type of recovery. Is that correct?

John Sidney-Woollett

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Cott Lang (#12)
Re: tablespaces a priority for 7.5?

Cott Lang <cott@internetstaff.com> writes:

Being able to write WAL logs to two locations would be handy. Is it
currently possible to have the logs not be recycled and occasionally
deleted?

The current thought is to add some sort of hook that allows a
user-defined action to be taken at the point where a WAL file would
normally get recycled. Presumably this action would involve copying the
WAL file to someplace else (eg a tape), after which it could get
recycled. If you've got any thoughts on exactly how to manage this,
let's take up a discussion on pgsql-hackers.

Is there any hope for support for all of this in 7.5?

People are thinking about it, but I'm not seeing a lot of work getting
done. (Partly my fault, since this is one of the things I'm supposed to
be working on...)

regards, tom lane

#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: John Sidney-Woollett (#14)
Re: tablespaces a priority for 7.5?

"John Sidney-Woollett" <johnsw@wardbrook.com> writes:

With Oracle you have the option of EITHER exporting the database
(equivalent to doing pg_dump) OR taking the database offline and carrying
out a file system level copy of the database files.

The planned PITR feature would not require you to take anything offline.
The whole concept of an "offline" database is an Oracle-ism that I see
no value in emulating.

One caveat for Oracle (at least in 8.1.x) is that DDL statements are not
recorded in the archive logs, and can screw things up. Best to take a new
export/dump after making DDL changes!

Well, we're ahead of them on that...

The difference between Oracle and Postgres appears to be that posgres
requires a file system level copy of the database instead of being able to
make use of a dump file for this type of recovery. Is that correct?

I dunno what a "dump file" would equate to in Postgres terms, but yeah,
we're envisioning using ordinary filesystem tools (tar, say) as the
mechanism for handling a baseline backup.

regards, tom lane

#17John Sidney-Woollett
johnsw@wardbrook.com
In reply to: Tom Lane (#16)
Re: tablespaces a priority for 7.5?

Tom Lane said:

The planned PITR feature would not require you to take anything offline.
The whole concept of an "offline" database is an Oracle-ism that I see
no value in emulating.

I dunno what a "dump file" would equate to in Postgres terms, but yeah,
we're envisioning using ordinary filesystem tools (tar, say) as the
mechanism for handling a baseline backup.

(I think) that was the point I was trying to make. Oracle lets you backup
while up and running (using export) OR when offline (copying db files)...

However, it looks like postgres will require a file system level backup to
recover from. Or will you be able to build a new db from the dump file,
and then apply the WAL segments?

John Sidney-Woollett

#18Cott Lang
cott@internetstaff.com
In reply to: John Sidney-Woollett (#14)
Re: tablespaces a priority for 7.5?

With Oracle you have the option of EITHER exporting the database
(equivalent to doing pg_dump) OR taking the database offline and carrying
out a file system level copy of the database files.

OR putting the tablespaces in hot backup mode and carrying out a file
system level copy of the database files, which is the option I certainly
prefer. :)

Recovery can be accomplished by either restoring the backup
database files or creating a new database by importing the most
recent export/dump file.

After that you apply the archive logs (equivalent to WAL segments)
to bring your database up to date.

You can't do that with an export in Oracle. Archive logs are only useful
to apply to a cold or hot backup of the data files.

Export with Postgres works about the same as with Oracle. It's the lack
of well-supported hot backups in Postgres that are the limitation.

One caveat for Oracle (at least in 8.1.x) is that DDL statements
are not recorded in the archive logs, and can screw things up.
Best to take a new export/dump after making DDL changes!

I've not heard that one. That would make hot standby databases a serious
maintenance problem. :)

#19Noname
lnd@hnit.is
In reply to: Cott Lang (#18)
Re: tablespaces a priority for 7.5?

It's interesting to know what pg WAL's are:

If they are page level logs, then you theoretically can have a file system
backup - which is fast, reliable, no issues like pg_dump has, i.e. who is
first/last regarding stored procedures/views/triggers, to have users
precreated, etc, etc.

If the are SQL statements which are REPLAYD after pg_dump restore: well, then
file system backup has no chance. The issue is that you are missing index
pages and probably much more.

Regards, Laimis

Show quoted text

-----Original Message-----
From: John Sidney-Woollett [mailto:johnsw@wardbrook.com]
Sent: 22. janúar 2004 17:53
To: Laimutis Nedzinskas
Cc: johnsw@wardbrook.com
Subject: RE: [GENERAL] tablespaces a priority for 7.5?

lnd@hnit.is said:

(I think) that was the point I was trying to make. Oracle lets you
backup while up and running (using export) OR when offline

(copying

db files)...

Not neccesseraly offline. In oracle you take tablespace in

backup mode

and just use OCOPY(I believe you must watch out that file copy is
performed at OS block sizes not smaller than db page size -

OCOPY does

that) to copy files.
REDO logs (i.e. WAL logs) will bring copied files in synch.
Or one can use RMAN tool which is the same as to copy files

just it makes

book-keeping for backups and discards empty pages from

backuped file.

You obviously know Oracle better than me - I stand corrected! :)

Actually I was trying to figure out whether the WAL segments
could be applied to a database rebuilt using a file generated
by pg_dump. I suspect not, and doesn't that mean that the
postmaster needs to be stopped to carry out the file system
level copy of the pg files?

John

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: John Sidney-Woollett (#17)
Re: tablespaces a priority for 7.5?

"John Sidney-Woollett" <johnsw@wardbrook.com> writes:

Tom Lane said:

I dunno what a "dump file" would equate to in Postgres terms, but yeah,
we're envisioning using ordinary filesystem tools (tar, say) as the
mechanism for handling a baseline backup.

However, it looks like postgres will require a file system level backup to
recover from. Or will you be able to build a new db from the dump file,
and then apply the WAL segments?

What dump file? I'm trying to say that we have no such concept and no
intention of inventing one.

regards, tom lane

#21Rick Gigger
rick@alpinenetworking.com
In reply to: Brian Maguire (#1)
#22Julian North
jnorth@lastminute.com
In reply to: Rick Gigger (#21)