PG over NFS

Started by Yang, about 19 years ago, 7 messages, general
#1Yang
jkfe7q002@sneakemail.com

Hi all,

This has been discussed before (some URLs below), but the threads have
unfortunately been rather free of (precise) information. I am
interested in getting PG running over NFS. However, I am primarily
concerned with safety/recoverability (on sudden power loss); I care
very, very little about the performance. Hence, I think this is a
substantially simpler question to answer definitively (must also
assume the NFS server may lose power). The particular NFS client and
server implementations I'm using are the Linux NFS implementation
(using kernel 2.6).

If PG is unsuitable for this task, can any (preferably open-source)
alternatives be recommended? (Just for curiosity, consider any storage
system supporting transactions and recovery, not necessarily the
relational model or high performance.)

BTW I've included some correspondence from my colleagues; I would also
appreciate it if any corrections are offered to their statements (if
necessary). From querying #postgresql on FreeNode, I gathered that as
long as fsync works properly (flushes data to the server), there are
no other concerns (and that there is in fact no file locking, except
perhaps on the pid file).
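As a rough illustration of what "fsync works properly" means from the client side, here is a minimal Python sketch (not PG code; the function name `durable_write` is made up for this example). It writes a file and returns only after fsync has reported success for both the file and its parent directory. Note that a successful return only tells you the client was told the data is safe; whether it really reached stable storage depends on the NFS server and its export options (for example, an `async` export may acknowledge writes early), so it is no substitute for pulling the plug and checking.

```python
import os


def durable_write(path: str, data: bytes) -> None:
    """Write data to path and return only after fsync reports success."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Ask the kernel to push the file contents to stable storage.
        os.fsync(fd)
    finally:
        os.close(fd)

    # Also fsync the parent directory so the directory entry is flushed.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Any OSError raised by `os.fsync` should be treated as "the data may not be on disk", not retried blindly.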

Thanks a lot!

Yang

http://archives.postgresql.org/pgsql-performance/2004-06/msg00215.php
http://www.thescripts.com/forum/thread422520.html
http://lists.freebsd.org/pipermail/freebsd-database/2006-January/000372.html

Hmm... that's bad. The memory mapped IO we can live without (sure,
it's a performance issue, but at least it'll probably still work).
Mutexes and locks... I wonder if we could put them in a ramdrive?

It is not recommended to place database storage on NFS. This seems to
be true for PostgreSQL, SQLite, and Berkeley DB, so it appears to be a
general problem. The reasons are:
* DBs use memory mapped IO
* DBs use mutexes and locks

Presumably, neither of these works over NFS. I don't have any hard
evidence; I just read a bunch of FAQs and messages on mailing lists.
If this is indeed the case, that pretty much kills our plan for
hosting a DB on the slave. The "no-no" alternative is to place the DB
server on the master... and that would only work out of the box for
non-embedded databases.
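Since the claims above about locks and memory-mapped I/O come without hard evidence, one option is to probe the mount directly. The following is a minimal Python sketch, assuming a POSIX client; the function name `probe_lock_and_mmap` and the scratch file name are invented for this example. It only checks that a single client can take an fcntl-style advisory lock and write through a shared mapping in the given directory. It says nothing about coherence between two hosts, which is where NFS is usually suspected of misbehaving.

```python
import fcntl
import mmap
import os


def probe_lock_and_mmap(directory: str) -> dict:
    """Try an advisory lock and a shared mmap write on a scratch file."""
    path = os.path.join(directory, "nfs_probe.tmp")  # hypothetical scratch file
    results = {"lock": False, "mmap": False}
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # The file must be at least one page long before it can be mapped.
        os.ftruncate(fd, mmap.PAGESIZE)
        try:
            # Non-blocking exclusive lock; raises OSError if unsupported or held.
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            fcntl.lockf(fd, fcntl.LOCK_UN)
            results["lock"] = True
        except OSError:
            pass
        try:
            # Write through a shared mapping, then read back via the descriptor.
            with mmap.mmap(fd, mmap.PAGESIZE) as mapping:
                mapping[:5] = b"probe"
                mapping.flush()
            os.lseek(fd, 0, os.SEEK_SET)
            results["mmap"] = os.read(fd, 5) == b"probe"
        except (OSError, ValueError):
            pass
    finally:
        os.close(fd)
        os.unlink(path)
    return results
```

Calling `probe_lock_and_mmap("/path/to/nfs/mount")` (path is a placeholder) returns a dict of booleans; a `False` is informative, a `True` is only a necessary condition.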

#2Hannes Dorbath
light@theendofthetunnel.de
In reply to: Yang (#1)
Re: PG over NFS

There is GFS2, OCFS, DRBD, ENBD, iSCSI, AoE and a ton of other
technologies. What on earth is the point in trying to use a DBMS over
NFS? :)

In case it's just for the fun of it, maybe consider:
- davfs2
- curlftpfs

However, I am primarily concerned with safety/recoverability (on sudden power loss);

Well then.. forget about NFS :) What about various replication solutions
like slony, 8.2 warm standby log shipping, mammoth replicator?

must also assume the NFS server may lose power

A RAID controller with battery-backed cache and/or a UPS might be a
good start. If that's not an option, disable all write caches or use a
filesystem that supports write barriers.

--
Best regards,
Hannes Dorbath

#3Peter Eisentraut
peter_e@gmx.net
In reply to: Yang (#1)
Re: PG over NFS

Yang wrote:

This has been discussed before (some URLs below), but the threads
have unfortunately been rather free of (precise) information. I am
interested in getting PG running over NFS. However, I am primarily
concerned with safety/recoverability (on sudden power loss); I care
very, very little about the performance.

In my experience PG over NFS works fine, except that the kernel will
occasionally lock up under load. But that is a general NFS problem.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#4Yang
jkfe7q002@sneakemail.com
In reply to: Hannes Dorbath (#2)
Re: PG over NFS

On 3/26/07, Hannes Dorbath light-at-theendofthetunnel.de |postgresql|
<...> wrote:

There is GFS2, OCFS, DRBD, ENBD, iSCSI, AoE and a ton of other
technologies. What on earth is the point in trying to use a DBMS over
NFS? :)

In case it's just for the fun of it, maybe consider:
- davfs2
- curlftpfs

However, I am primarily concerned with safety/recoverability (on sudden power loss);

Well then.. forget about NFS :)

Could you offer any explanation as to why?

What about various replication solutions
like slony, 8.2 warm standby log shipping, mammoth replicator?

The environments involve two small devices - one with a flash disk
(the NFS server), and a slave which network-boots off that. Hence
these suggestions don't address the problem. (Would all the
alternative protocols listed at the top be able to coexist with the
described environment? Both devices must be able to boot into Linux.)

must also assume the NFS server may lose power

A RAID controller with battery-backed cache and/or a UPS might be a
good start. If that's not an option, disable all write caches or use a
filesystem that supports write barriers.

On 3/26/07, Peter Eisentraut peter_e-at-gmx.net |postgresql|
<...> wrote:

Yang wrote:

This has been discussed before (some URLs below), but the threads
have unfortunately been rather free of (precise) information. I am
interested in getting PG running over NFS. However, I am primarily
concerned with safety/recoverability (on sudden power loss); I care
very, very little about the performance.

In my experience PG over NFS works fine, except that the kernel will
occasionally lock up under load. But that is a general NFS problem.

We've also been using PG over NFS but had not yet been concerned with
reliability in the face of failures.

Thanks,

Yang

#5Yang
jkfe7q002@sneakemail.com
In reply to: Yang (#1)
Re: PG over NFS

On 3/26/07, A.M. agentm-at-themactionfaction.com |postgresql|
<...> wrote:

On Mar 26, 2007, at 19:29 , Yang wrote:

On 3/26/07, Hannes Dorbath light-at-theendofthetunnel.de |postgresql|
<...> wrote:

There is GFS2, OCFS, DRBD, ENBD, iSCSI, AoE and a ton of other
technologies. What on earth is the point in trying to use a DBMS over
NFS? :)

In case it's just for the fun of it, maybe consider:
- davfs2
- curlftpfs

However, I am primarily concerned with safety/recoverability (on sudden power loss);

Well then.. forget about NFS :)

Could you offer any explanation as to why?

What about various replication solutions
like slony, 8.2 warm standby log shipping, mammoth replicator?

The environments involve two small devices - one with a flash disk
(the NFS server), and a slave which network-boots off that. Hence
these suggestions don't address the problem. (Would all the
alternative protocols listed at the top be able to coexist with the
described environment? Both devices must be able to boot into Linux.)

Since you're booting from the NFS server, it would make more sense to
have your boot process start a postgresql instance from a copy of the
data directory instead of over NFS, no? Certainly, that way, you can
have multiple instances booted and running. Do you need to sync back
to the NFS server?

The second device has no non-volatile storage. (Sorry I should've
explicitly stated this.)

Yang

Cheers,
M

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Yang (#4)
Re: PG over NFS

"Yang" <jkfe7q002@sneakemail.com> writes:

On 3/26/07, Hannes Dorbath light-at-theendofthetunnel.de |postgresql|

However, I am primarily concerned with safety/recoverability (on sudden power loss);

Well then.. forget about NFS :)

Could you offer any explanation as to why?

Basically, the problem with NFS is that it adds new failure modes that
are not present for local storage. Sure, if all works according to
spec then it's fine, but you're concerned about things going wrong, no?

One of the nastier problems that we've seen involved an NFS-mounted DB
where the mount didn't come up till after the postmaster started.
(Searching in the PG archives should uncover that horror story among
others.) You can prevent some problems by being sure to mount the NFS
server hard, not soft, but that's far from being a panacea.
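To make the "mount came up after the postmaster" hazard concrete, here is a small Python sketch of a pre-start check that a startup script could run. It is only an illustration under assumptions: the function names `find_mount` and `safe_to_start` are invented here, the input is text in the Linux /proc/mounts format, and paths containing spaces (which /proc/mounts escapes) are not handled. It refuses to proceed unless the data directory sits on an NFS mount that is not mounted soft.

```python
import os


def find_mount(path, mounts_text):
    """Return (mount_point, fstype, options) for the mount containing path."""
    path = os.path.abspath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        _device, mount_point, fstype, opts = fields[:4]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Prefer the longest (most specific) mount point; on ties,
            # the later entry wins because it was mounted on top.
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, fstype, set(opts.split(",")))
    return best


def safe_to_start(data_dir, mounts_text):
    """True only if data_dir is on an NFS mount that is not soft-mounted."""
    mount = find_mount(data_dir, mounts_text)
    if mount is None:
        return False
    _mount_point, fstype, opts = mount
    # fstype is "nfs" or "nfs4" on Linux; anything else means the NFS
    # mount is not up yet and the path is just a local directory.
    return fstype.startswith("nfs") and "soft" not in opts
```

In a real startup script one would read the text with `open("/proc/mounts").read()` and start the postmaster only when `safe_to_start` returns True. This guards against one failure mode only; it does nothing for the others.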

AFAIK these issues are not PG-specific in the slightest, but are a
hazard for any DBMS run atop NFS.

regards, tom lane

#7Steve Atkins
steve@blighty.com
In reply to: Yang (#5)
Re: PG over NFS

On Mar 26, 2007, at 5:19 PM, Yang wrote:

On 3/26/07, A.M. agentm-at-themactionfaction.com |postgresql|
<...> wrote:

On Mar 26, 2007, at 19:29 , Yang wrote:

The environments involve two small devices - one with a flash disk
(the NFS server), and a slave which network-boots off that. Hence
these suggestions don't address the problem. (Would all the
alternative protocols listed at the top be able to coexist with the
described environment? Both devices must be able to boot into Linux.)

Since you're booting from the NFS server, it would make more sense to
have your boot process start a postgresql instance from a copy of the
data directory instead of over NFS, no? Certainly, that way, you can
have multiple instances booted and running. Do you need to sync back
to the NFS server?

The second device has no non-volatile storage. (Sorry I should've
explicitly stated this.)

I might have missed something in the thread, but is there any
reason why you can't run the database on the first device, using
the local storage, and then connect to it from the second?

Cheers,
Steve