PATCH: Warn users about tablespace abuse data loss risk

Started by Craig Ringerabout 12 years ago7 messagesdocs
Jump to latest
#1Craig Ringer
craig@2ndquadrant.com

Hi all

I've just seen another case of data loss due to misuse of /
misunderstanding of tablespaces:

http://dba.stackexchange.com/questions/58704/how-do-i-access-a-old-saved-tablespace-after-reinstalling-postgres

and it's prompted me to write some docs amendments to make it more
obvious that *you shouldn't do that*.

Not that it'll stop people, but it'll at least mean they can't say we
didn't warn them.

This is actually quite important, because many users are used to MySQL's
MyISAM, where each table contains its own metadata and is readable by
simply copying the table into a different MySQL install's data
directory. It doesn't even have to be the same version! Users are
clearly surprised that PostgreSQL tablespaces don't have the same
properties.

Thoughts?

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

0001-Warn-about-risks-of-tablespace-misuse.patchtext/x-patch; name=0001-Warn-about-risks-of-tablespace-misuse.patchDownload+27-1
#2Ian Lawrence Barwick
barwick@gmail.com
In reply to: Craig Ringer (#1)
Re: PATCH: Warn users about tablespace abuse data loss risk

2014-02-12 12:52 GMT+09:00 Craig Ringer <craig@2ndquadrant.com>:

Hi all

I've just seen another case of data loss due to misuse of /
misunderstanding of tablespaces:

http://dba.stackexchange.com/questions/58704/how-do-i-access-a-old-saved-tablespace-after-reinstalling-postgres

and it's prompted me to write some docs amendments to make it more
obvious that *you shouldn't do that*.

Not that it'll stop people, but it'll at least mean they can't say we
didn't warn them.

This is actually quite important, because many users are used to MySQL's
MyISAM, where each table contains its own metadata and is readable by
simply copying the table into a different MySQL install's data
directory. It doesn't even have to be the same version! Users are
clearly surprised that PostgreSQL tablespaces don't have the same
properties.

Thoughts?

People still use MyISAM!?

I had a similar issue pop up at work a while back, having something
explicit to point to is definitely a good idea.

Suggestion for the first paragraph of the patch (sorry I can't provide it in
patch form right now):

Even if they are located outside the main PostgreSQL data directory,
tablespaces
are an integral part of the database cluster and
<emphasis>cannot</emphasis> be
treated as an autonomous collection of data files. They rely on
metadata contained
in the main data directory, without which they are useless. In
particular, tablespaces
cannot be reattached to a different database cluster, and backing up
individual
tablespaces makes no sense as a backup/redundancy method. Similarly,
if you lose a
tablespace (file deletion, disk failure, etc) the main database may
become unreadable
or fail to start.

Regards

Ian Barwick

--
Sent via pgsql-docs mailing list (pgsql-docs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-docs

#3David G. Johnston
david.g.johnston@gmail.com
In reply to: Ian Lawrence Barwick (#2)
Re: PATCH: Warn users about tablespace abuse data loss risk

Ian Lawrence Barwick wrote

2014-02-12 12:52 GMT+09:00 Craig Ringer &lt;

craig@

&gt;:

Hi all

I've just seen another case of data loss due to misuse of /
misunderstanding of tablespaces:

http://dba.stackexchange.com/questions/58704/how-do-i-access-a-old-saved-tablespace-after-reinstalling-postgres

and it's prompted me to write some docs amendments to make it more
obvious that *you shouldn't do that*.

Not that it'll stop people, but it'll at least mean they can't say we
didn't warn them.

This is actually quite important, because many users are used to MySQL's
MyISAM, where each table contains its own metadata and is readable by
simply copying the table into a different MySQL install's data
directory. It doesn't even have to be the same version! Users are
clearly surprised that PostgreSQL tablespaces don't have the same
properties.

Thoughts?

People still use MyISAM!?

I had a similar issue pop up at work a while back, having something
explicit to point to is definitely a good idea.

Suggestion for the first paragraph of the patch (sorry I can't provide it
in
patch form right now):

Even if they are located outside the main PostgreSQL data directory,
tablespaces
are an integral part of the database cluster and
<emphasis>
cannot
</emphasis>
be
treated as an autonomous collection of data files. They rely on
metadata contained
in the main data directory, without which they are useless. In
particular, tablespaces
cannot be reattached to a different database cluster, and backing up
individual
tablespaces makes no sense as a backup/redundancy method. Similarly,
if you lose a
tablespace (file deletion, disk failure, etc) the main database may
become unreadable
or fail to start.

Regards

Ian Barwick

While providing additional warnings is good and necessary it may also help
to be more descriptive as to in what situations tablespaces are appropriate
and/or necessary so that people leave with a better understanding of why the
feature exists and not just trying to know what not to use it for. It goes
against the more prescriptive tone of the documentation generally but both
approaches work well together to tackle the knowledge/understanding gap some
users seem to have.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/PATCH-Warn-users-about-tablespace-abuse-data-loss-risk-tp5791537p5791542.html
Sent from the PostgreSQL - docs mailing list archive at Nabble.com.

--
Sent via pgsql-docs mailing list (pgsql-docs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-docs

#4Ian Lawrence Barwick
barwick@gmail.com
In reply to: David G. Johnston (#3)
Re: PATCH: Warn users about tablespace abuse data loss risk

2014-02-12 14:06 GMT+09:00 David Johnston <polobo@yahoo.com>:

Ian Lawrence Barwick wrote

2014-02-12 12:52 GMT+09:00 Craig Ringer <

craig@

:
Hi all

I've just seen another case of data loss due to misuse of /
misunderstanding of tablespaces:

http://dba.stackexchange.com/questions/58704/how-do-i-access-a-old-saved-tablespace-after-reinstalling-postgres

and it's prompted me to write some docs amendments to make it more
obvious that *you shouldn't do that*.

Not that it'll stop people, but it'll at least mean they can't say we
didn't warn them.

This is actually quite important, because many users are used to MySQL's
MyISAM, where each table contains its own metadata and is readable by
simply copying the table into a different MySQL install's data
directory. It doesn't even have to be the same version! Users are
clearly surprised that PostgreSQL tablespaces don't have the same
properties.

Thoughts?

People still use MyISAM!?

I had a similar issue pop up at work a while back, having something
explicit to point to is definitely a good idea.

Suggestion for the first paragraph of the patch (sorry I can't provide it
in
patch form right now):

Even if they are located outside the main PostgreSQL data directory,
tablespaces
are an integral part of the database cluster and
<emphasis>
cannot
</emphasis>
be
treated as an autonomous collection of data files. They rely on
metadata contained
in the main data directory, without which they are useless. In
particular, tablespaces
cannot be reattached to a different database cluster, and backing up
individual
tablespaces makes no sense as a backup/redundancy method. Similarly,
if you lose a
tablespace (file deletion, disk failure, etc) the main database may
become unreadable
or fail to start.

While providing additional warnings is good and necessary it may also help
to be more descriptive as to in what situations tablespaces are appropriate
and/or necessary so that people leave with a better understanding of why the
feature exists and not just trying to know what not to use it for. It goes
against the more prescriptive tone of the documentation generally but both
approaches work well together to tackle the knowledge/understanding gap some
users seem to have.

The warning would appear on this page:

http://www.postgresql.org/docs/current/static/manage-ag-tablespaces.html

which describes what tablespaces *can* do, but unless you're familiar with the
structure of the PostgreSQL data directories, it's not obvious what you *can't*
do. I recall reading a blog post a while back about tablespaces being "archived"
to the cloud with disastrous results, and a quick search pulls up
stuff like this:

http://stackoverflow.com/questions/3534415/moving-postgres-tablespaces-and-tables-across-ec2-instance

so it's definitely not a niche issue. Something "official" to link to
would be very
useful in this kind of situation. That doesn't preclude the general
documentation
being improved of course.

Regards

Ian Barwick

--
Sent via pgsql-docs mailing list (pgsql-docs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-docs

#5Ian Lawrence Barwick
barwick@gmail.com
In reply to: Ian Lawrence Barwick (#4)
Re: PATCH: Warn users about tablespace abuse data loss risk

2014-02-12 15:16 GMT+09:00 Ian Lawrence Barwick <barwick@gmail.com>:

2014-02-12 14:06 GMT+09:00 David Johnston <polobo@yahoo.com>:

Ian Lawrence Barwick wrote

2014-02-12 12:52 GMT+09:00 Craig Ringer <

craig@

:
Hi all

I've just seen another case of data loss due to misuse of /
misunderstanding of tablespaces:

http://dba.stackexchange.com/questions/58704/how-do-i-access-a-old-saved-tablespace-after-reinstalling-postgres

and it's prompted me to write some docs amendments to make it more
obvious that *you shouldn't do that*.

Not that it'll stop people, but it'll at least mean they can't say we
didn't warn them.

This is actually quite important, because many users are used to MySQL's
MyISAM, where each table contains its own metadata and is readable by
simply copying the table into a different MySQL install's data
directory. It doesn't even have to be the same version! Users are
clearly surprised that PostgreSQL tablespaces don't have the same
properties.

Thoughts?

People still use MyISAM!?

I had a similar issue pop up at work a while back, having something
explicit to point to is definitely a good idea.

Suggestion for the first paragraph of the patch (sorry I can't provide it
in
patch form right now):

Even if they are located outside the main PostgreSQL data directory,
tablespaces
are an integral part of the database cluster and
<emphasis>
cannot
</emphasis>
be
treated as an autonomous collection of data files. They rely on
metadata contained
in the main data directory, without which they are useless. In
particular, tablespaces
cannot be reattached to a different database cluster, and backing up
individual
tablespaces makes no sense as a backup/redundancy method. Similarly,
if you lose a
tablespace (file deletion, disk failure, etc) the main database may
become unreadable
or fail to start.

While providing additional warnings is good and necessary it may also help
to be more descriptive as to in what situations tablespaces are appropriate
and/or necessary so that people leave with a better understanding of why the
feature exists and not just trying to know what not to use it for. It goes
against the more prescriptive tone of the documentation generally but both
approaches work well together to tackle the knowledge/understanding gap some
users seem to have.

The warning would appear on this page:

http://www.postgresql.org/docs/current/static/manage-ag-tablespaces.html

which describes what tablespaces *can* do, but unless you're familiar with the
structure of the PostgreSQL data directories, it's not obvious what you *can't*
do. I recall reading a blog post a while back about tablespaces being "archived"
to the cloud with disastrous results, and a quick search pulls up
stuff like this:

http://stackoverflow.com/questions/3534415/moving-postgres-tablespaces-and-tables-across-ec2-instance

so it's definitely not a niche issue. Something "official" to link to
would be very useful in this kind of situation. That doesn't preclude the general
documentation being improved of course.

And taking a look at the page in question I see this prominent example:

CREATE TABLESPACE fastspace LOCATION '/mnt/sda1/postgresql/data';

As '/mnt' is usually a temporary mount point for detachable media and
all that [*]
maybe it's not the best impression to give for a suitable tablespace location.

[*] http://www.pathname.com/fhs/pub/fhs-2.3.html#MNTMOUNTPOINTFORATEMPORARILYMOUNT
"This directory is provided so that the system administrator may
temporarily mount a
filesystem as needed. "

Ian Barwick

--
Sent via pgsql-docs mailing list (pgsql-docs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-docs

#6Craig Ringer
craig@2ndquadrant.com
In reply to: Ian Lawrence Barwick (#5)
Re: PATCH: Warn users about tablespace abuse data loss risk

On 02/12/2014 02:35 PM, Ian Lawrence Barwick wrote:

And taking a look at the page in question I see this prominent example:

CREATE TABLESPACE fastspace LOCATION '/mnt/sda1/postgresql/data';

Here's an updated patch, adopting your wording for the first par and
tweaking the example.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

0001-Warn-about-risks-of-tablespace-misuse.patchtext/x-patch; name=0001-Warn-about-risks-of-tablespace-misuse.patchDownload+37-2
#7Bruce Momjian
bruce@momjian.us
In reply to: Craig Ringer (#6)
Re: PATCH: Warn users about tablespace abuse data loss risk

On Thu, Feb 13, 2014 at 08:15:59AM +0800, Craig Ringer wrote:

On 02/12/2014 02:35 PM, Ian Lawrence Barwick wrote:

And taking a look at the page in question I see this prominent example:

CREATE TABLESPACE fastspace LOCATION '/mnt/sda1/postgresql/data';

Here's an updated patch, adopting your wording for the first par and
tweaking the example.

I have applied an adjusted version of this patch that is more compact.
Thanks.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +

Attachments:

tablespace.difftext/x-diff; charset=us-asciiDownload+29-6