Postgres 9.1 issues running data directory from VMware shared folder

Started by Arze, Cesar · over 11 years ago · 8 messages · general
#1 Arze, Cesar
CArze@som.umaryland.edu

Hi,

I’ve recently run into an issue running Postgres (both 8.4 and 9.1) on a VMware VM with Ubuntu 10.04 LTS as the guest OS and the data directory located in a VMware shared folder. This previously worked for me on 8.4, but after upgrading VMware and rebuilding my VM I started to hit this issue. The problem occurs when I run initdb, which fails with the following error:

# sudo -u postgres /usr/lib/postgresql/9.1/bin/initdb --noclean -D /mnt/pg_data/
Running in noclean mode. Mistakes will not be cleaned up.
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale en_US.UTF-8.
The default database encoding has accordingly been set to UTF8.
The default text search configuration will be set to "english".

fixing permissions on existing directory /mnt/pg_data ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 32MB
creating configuration files ... ok
creating template1 database in /mnt/pg_data/base/1 ... FATAL: could not open file "pg_xlog/000000010000000000000001" (log file 0, segment 1): No such file or directory

child process exited with exit code 1
initdb: data directory "/mnt/pg_data" not removed at user's request
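
For readers decoding the error, here is a minimal sketch of how the file name in the FATAL message maps onto the "(log file 0, segment 1)" part. It assumes the pre-9.3 naming scheme of three 8-hex-digit fields (timeline, log file, segment); the helper name parse_wal_segment_name is made up for illustration and is not a PostgreSQL API.

```python
def parse_wal_segment_name(name):
    """Split a 24-hex-digit WAL file name into (timeline, log, segment).

    Assumption: the name follows the timeline/log/segment layout,
    8 hex digits per field, as the 9.1 error message above suggests.
    """
    if len(name) != 24:
        raise ValueError("expected 24 hex digits, got %r" % name)
    timeline = int(name[0:8], 16)   # first 8 digits: timeline ID
    log = int(name[8:16], 16)       # middle 8 digits: log file number
    segment = int(name[16:24], 16)  # last 8 digits: segment number
    return timeline, log, segment


# The file named in the FATAL message: timeline 1, log file 0, segment 1.
print(parse_wal_segment_name("000000010000000000000001"))
```

In other words, the file initdb cannot open is the very first WAL segment of a new cluster, so the failure happens while the bootstrap backend creates or reopens it.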

Here is a snippet of an strace around where the error occurs:

write(4, "insert OID = 767 ( lo_import 11 "..., 141) = 141
write(4, "insert OID = 765 ( lo_export 11 "..., 132) = 132
write(4, "insert OID = 766 ( int4inc 11 10"..., 125) = 125
write(4, "insert OID = 768 ( int4larger 11"..., 134) = 134
write(4, "insert OID = 769 ( int4smaller 1"..., 136) = 136
write(4, "insert OID = 770 ( int2larger 11"..., 134) = 134
write(4, "insert OID = 771 ( int2smaller 1"..., 136) = 136
write(4, "insert OID = 774 ( gistgettuple "..., 142) = 142
write(4, "insert OID = 638 ( gistgetbitmap"..., 144) = 144
write(4, "insert OID = 775 ( gistinsert 11"..., 158) = 158
write(4, "insert OID = 777 ( gistbeginscan"..., 151) = 151
write(4, "insert OID = 778 ( gistrescan 11"..., 155) = 155
write(4, "insert OID = 779 ( gistendscan 1"..., 137) = 137
write(4, "insert OID = 780 ( gistmarkpos 1"..., 137) = 137
write(4, "insert OID = 781 ( gistrestrpos "..., 139) = 139
write(4, "insert OID = 782 ( gistbuild 11 "..., 143) = 143
write(4, "insert OID = 326 ( gistbuildempt"..., 143) = 143
write(4, "insert OID = 776 ( gistbulkdelet"..., 158) = 158
write(4, "insert OID = 2561 ( gistvacuumcl"..., 155) = 155
write(4, "insert OID = 772 ( gistcostestim"..., 187) = 187
write(4, "insert OID = 2787 ( gistoptions "..., 139) = 139
write(4, "insert OID = 784 ( tintervaleq 1"..., 138) = 138
write(4, "insert OID = 785 ( tintervalne 1"..., 138) = 138
write(4, "insert OID = 786 ( tintervallt 1"..., 138) = 138
write(4, "insert OID = 787 ( tintervalgt 1"..., 138) = 138
write(4, "insert OID = 788 ( tintervalle 1"..., 138FATAL: could not open file "pg_xlog/000000010000000000000001" (log file 0, segment 1): No such file or directory
) = -1 EPIPE (Broken pipe)
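
A note on reading the strace: the EPIPE on the last write is most likely a symptom rather than the cause. The trace suggests initdb is feeding bootstrap commands to a child process over a pipe (an assumption drawn from the write(4, ...) calls, not confirmed in this thread); once the child exits with the FATAL error, the next write to that pipe fails. A minimal sketch of the same effect, with a made-up helper name:

```python
import errno
import os


def write_to_closed_pipe():
    """Write to a pipe whose read end is already closed; return the errno."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # stands in for the reading process having exited
    try:
        os.write(write_fd, b"insert OID = 788 ...\n")
    except BrokenPipeError as exc:
        # Python ignores SIGPIPE by default, so the failure surfaces
        # as an exception carrying errno EPIPE.
        return exc.errno
    finally:
        os.close(write_fd)
    return None


print(write_to_closed_pipe() == errno.EPIPE)
```

So the interesting failure is the "could not open file" inside the child; the broken pipe only says that the child is gone.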

I probably should be posting to the VMware mailing list with this question but I wanted to see if anyone had any insight or suggestions here. I’ve seen many similar issues but none of the solutions proposed there worked for me.

Thanks for any help,

Cesar

#2 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Arze, Cesar (#1)
Re: Postgres 9.1 issues running data directory from VMware shared folder

On 08/26/2014 03:08 PM, Arze, Cesar wrote:

Hi,

I’ve recently encountered an issue running Postgres (both 8.4 and 9.1)
on a VMware VM running Ubuntu 10.04 LTS as the guest OS with the data
directory running out of a VMware shared folder. Previously on 8.4 this
had worked out for me but after upgrading VMware and re-building my VM
I’ve started to encounter this issue. It seems like the problem occurs
when I run initdb, I get the following error:

So what is the host OS?

Where is the shared directory located, which OS?

What file system is the directory located on?

# sudo -u postgres /usr/lib/postgresql/9.1/bin/initdb --noclean -D
/mnt/pg_data/

creating template1 database in /mnt/pg_data/base/1 ... FATAL: could not
open file "pg_xlog/000000010000000000000001" (log file 0, segment 1): No
such file or directory

child process exited with exit code 1
initdb: data directory "/mnt/pg_data" not removed at user's request

What is in /mnt/pg_data after the error?

Thanks for any help,

Cesar

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3 Steve Atkins
steve@blighty.com
In reply to: Arze, Cesar (#1)
Re: Postgres 9.1 issues running data directory from VMware shared folder

On Aug 26, 2014, at 3:08 PM, Arze, Cesar <CArze@som.umaryland.edu> wrote:

I probably should be posting to the VMware mailing list with this question but I wanted to see if anyone had any insight or suggestions here. I’ve seen many similar issues but none of the solutions proposed there worked for me.

This might not be what you're seeing, but there was a hideous bug in the shared folder (hgfs) driver for Linux guest OSes that will silently corrupt your filesystem if it's accessed via more than one filehandle (e.g. multiple opens, multiple processes, ...).

If you're using the VMware Tools bundled with Workstation 10.0.1 or Fusion 6.0.2, you have that bug and cannot safely use hgfs mounts for any files, let alone PostgreSQL. (There was a different bug, with similar results, in earlier versions too, including at least Fusion 5.0.1.) VMware claims it's fixed in the tools bundled with 10.0.2 / 6.0.3 (I've not tested it). If you're not running the very latest VMware, upgrade to it and install the latest tools (or avoid using hgfs).

Cheers,
Steve
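
To check whether a data directory actually sits on an hgfs mount, something along these lines may help. It is only a sketch: it parses text in /proc/mounts format, assumes the shared-folder driver reports its filesystem type as "vmhgfs", and uses made-up sample data and a made-up helper name (find_mount). It also ignores the octal escapes /proc/mounts uses for spaces in paths.

```python
import os


def find_mount(path, mounts_text):
    """Return (mount_point, fs_type) of the longest mount point containing path."""
    path = os.path.normpath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point = os.path.normpath(fields[1])
        fs_type = fields[2]
        inside = (
            mount_point == "/"
            or path == mount_point
            or path.startswith(mount_point + "/")
        )
        # ">=" lets a later mount on the same point win, as it shadows the earlier one.
        if inside and (best is None or len(mount_point) >= len(best[0])):
            best = (mount_point, fs_type)
    return best


# Hypothetical mount table, for illustration only.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext4 rw,errors=remount-ro 0 0
.host:/pg_data /mnt/pg_data vmhgfs rw,relatime 0 0
"""

mount_point, fs_type = find_mount("/mnt/pg_data/", SAMPLE_MOUNTS)
print(mount_point, fs_type, fs_type == "vmhgfs")
```

On a real guest the second argument would be the contents of /proc/mounts, e.g. find_mount("/mnt/pg_data", open("/proc/mounts").read()).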

#4 Arze, Cesar
CArze@som.umaryland.edu
In reply to: Adrian Klaver (#2)
Re: Postgres 9.1 issues running data directory from VMware shared folder

So what is the host OS?

Where is the shared directory located, which OS?

What file system is the directory located on?

The host OS is Red Hat 5.4.

The shared directory lives on the host (Red Hat 5.4), on a local drive of the desktop.

The directory is on an ext3 filesystem.

What is in /mnt/pg_data after the error?

drwx------ 1 postgres postgres 4096 2014-08-26 21:58 base
drwx------ 1 postgres postgres 4096 2014-08-26 21:58 global
drwx------ 1 postgres postgres 4096 2014-08-26 21:58 pg_clog
-rw------- 1 postgres postgres 4476 2014-08-26 21:58 pg_hba.conf
-rw------- 1 postgres postgres 1636 2014-08-26 21:58 pg_ident.conf
drwx------ 1 postgres postgres 4096 2014-08-26 21:58 pg_multixact
drwx------ 1 postgres postgres 4096 2014-08-26 21:58 pg_notify
drwx------ 1 postgres postgres 4096 2014-08-26 21:58 pg_serial
drwx------ 1 postgres postgres 4096 2014-08-26 21:58 pg_stat_tmp
drwx------ 1 postgres postgres 4096 2014-08-26 21:58 pg_subtrans
drwx------ 1 postgres postgres 4096 2014-08-26 21:58 pg_tblspc
drwx------ 1 postgres postgres 4096 2014-08-26 21:58 pg_twophase
-rw------- 1 postgres postgres 4 2014-08-26 21:58 PG_VERSION
drwx------ 1 postgres postgres 4096 2014-08-26 21:58 pg_xlog
-rw------- 1 postgres postgres 19169 2014-08-26 21:58 postgresql.conf
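
The listing shows that the pg_xlog directory exists, but not whether the segment file named in the error was ever created inside it. A small sketch for looking one level deeper; the helper name list_wal_files is made up, and the only layout assumed is the pg_xlog subdirectory visible above. In a default build a complete segment is 16 MB, so a missing or short file would be worth noting.

```python
import os


def list_wal_files(data_dir):
    """Return sorted (name, size_in_bytes) pairs for regular files in data_dir/pg_xlog."""
    wal_dir = os.path.join(data_dir, "pg_xlog")
    entries = []
    for name in sorted(os.listdir(wal_dir)):
        full = os.path.join(wal_dir, name)
        if os.path.isfile(full):  # skip subdirectories such as archive_status
            entries.append((name, os.path.getsize(full)))
    return entries
```

Run as the postgres user (the directories are mode 0700), for example list_wal_files("/mnt/pg_data"), and compare the result with what the host sees in the same folder.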

#5 Arze, Cesar
CArze@som.umaryland.edu
In reply to: Steve Atkins (#3)
Re: Postgres 9.1 issues running data directory from VMware shared folder

Thanks for the info. I’ll check which version of Workstation I’m running (I think it’s 9.0), try to get an upgraded copy, and see whether that alleviates the issue.

#6 John R Pierce
pierce@hogranch.com
In reply to: Arze, Cesar (#5)
Re: Postgres 9.1 issues running data directory from VMware shared folder

On 8/26/2014 5:51 PM, Arze, Cesar wrote:

Thanks for the info, will look into what version of Workstation I am
running (think I have 9.0) and will see if I can’t get an upgraded
copy and see if it alleviates the issue.

Also, there have been several years of patches since RHEL 5.4 was released; I
think it's up to 5.9.

--
john r pierce 37N 122W
somewhere on the middle of the left coast

#7 Arze, Cesar
CArze@som.umaryland.edu
In reply to: John R Pierce (#6)
Re: Postgres 9.1 issues running data directory from VMware shared folder

My mistake, the host OS is RHEL 5.9

In reply to: Arze, Cesar (#1)
Re: Postgres 9.1 issues running data directory from VMware shared folder

"Arze, Cesar" <CArze@som.umaryland.edu> writes:

creating template1 database in /mnt/pg_data/base/1 ... FATAL: could
not open file "pg_xlog/000000010000000000000001" (log file 0, segment
1): No such file or directory

We've seen something slightly similar when running PostgreSQL in a Linux
container. See this thread for more details:
/messages/by-id/spamdrop+87ha31kxrc.fsf@atom.bunk.cc

We have not solved this problem yet, but currently I'm leaning towards
blaming the container layer, so next time we experience problems I think
we'll try to remove the virtualization.

Best regards

Jacob
