fsync bug faq for publication?

Started by Josh Berkus, over 10 years ago, 6 messages
#1Josh Berkus
josh@agliodbs.com

Hackers,

We need to get a notice out to our users who might update their servers
and get stuck behind the fsync bug. As such, I've prepared a FAQ.
Please read, correct and improve this FAQ so that it's fit for us to
announce to users as soon as possible:

https://wiki.postgresql.org/wiki/May_2015_Fsync_Permissions_Bug

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#1)
Re: fsync bug faq for publication?

Josh Berkus <josh@agliodbs.com> writes:

We need to get a notice out to our users who might update their servers
and get stuck behind the fsync bug. As such, I've prepared a FAQ.
Please read, correct and improve this FAQ so that it's fit for us to
announce to users as soon as possible:

https://wiki.postgresql.org/wiki/May_2015_Fsync_Permissions_Bug

Judging by Ross Boylan's report at
/messages/by-id/F1F13E14A610474196571953929C02096D0E97@ex08.net.ucsf.edu
it's not sufficient to just recommend "changing permissions" on the
problematic files. It's not entirely clear from here whether there is a
solution that both allows fsync on referenced files and keeps OpenSSL
happy; but if there is, it probably requires making the cert files be
owned by the postgres user, as well as adjusting their permissions to
be 0640 or thereabouts. I'm worried about whether that breaks other
services using the same cert files.

regards, tom lane


#3Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#2)
Re: fsync bug faq for publication?

On May 26, 2015 07:31, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

Josh Berkus <josh@agliodbs.com> writes:

We need to get a notice out to our users who might update their servers
and get stuck behind the fsync bug. As such, I've prepared a FAQ.
Please read, correct and improve this FAQ so that it's fit for us to
announce to users as soon as possible:

https://wiki.postgresql.org/wiki/May_2015_Fsync_Permissions_Bug

Judging by Ross Boylan's report at

/messages/by-id/F1F13E14A610474196571953929C02096D0E97@ex08.net.ucsf.edu

it's not sufficient to just recommend "changing permissions" on the
problematic files. It's not entirely clear from here whether there is a
solution that both allows fsync on referenced files and keeps OpenSSL
happy; but if there is, it probably requires making the cert files be
owned by the postgres user, as well as adjusting their permissions to
be 0640 or thereabouts. I'm worried about whether that breaks other
services using the same cert files.

It almost certainly will.

I think the recommendation has to be that if it's a symlink, it should be
replaced with a copy of the file, and that copy should be given the right
ownership (chown) and permissions (chmod).
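
A minimal sketch of that recommendation, under stated assumptions: the helper name replace_symlink_with_copy and its arguments are hypothetical and not part of PostgreSQL or any distribution's tooling, and the owner and mode are left to the caller because the thread does not settle on exact values.

```shell
#!/bin/sh
# Hypothetical helper (an illustration, not an official tool):
# replace a symlink with a private copy of the file it points to.
# usage: replace_symlink_with_copy LINK OWNER MODE
replace_symlink_with_copy() {
    link=$1
    owner=$2
    mode=$3

    # Only act on symlinks; regular files are left untouched.
    if [ ! -L "$link" ]; then
        echo "not a symlink: $link" >&2
        return 1
    fi

    tmp="$link.copy.$$"
    # cp -L dereferences the link, so the target's contents are copied.
    cp -L "$link" "$tmp" || return 1
    chmod "$mode" "$tmp" || { rm -f "$tmp"; return 1; }
    chown "$owner" "$tmp" || { rm -f "$tmp"; return 1; }
    # Rename over the symlink itself; the shared target is not modified.
    mv -f "$tmp" "$link"
}
```

It might be called as root along the lines of `replace_symlink_with_copy "$PGDATA/server.key" postgres 0600`, where the file name and mode are illustrative assumptions. Because only the copy changes, other services that read the original file should be unaffected.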

/Magnus

#4Josh Berkus
josh@agliodbs.com
In reply to: Josh Berkus (#1)
Re: fsync bug faq for publication?

On 05/25/2015 11:09 PM, Magnus Hagander wrote:

On May 26, 2015 07:31, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

Josh Berkus <josh@agliodbs.com> writes:

We need to get a notice out to our users who might update their servers
and get stuck behind the fsync bug. As such, I've prepared a FAQ.
Please read, correct and improve this FAQ so that it's fit for us to
announce to users as soon as possible:

https://wiki.postgresql.org/wiki/May_2015_Fsync_Permissions_Bug

Judging by Ross Boylan's report at

/messages/by-id/F1F13E14A610474196571953929C02096D0E97@ex08.net.ucsf.edu

it's not sufficient to just recommend "changing permissions" on the
problematic files. It's not entirely clear from here whether there is a
solution that both allows fsync on referenced files and keeps OpenSSL
happy; but if there is, it probably requires making the cert files be
owned by the postgres user, as well as adjusting their permissions to
be 0640 or thereabouts. I'm worried about whether that breaks other
services using the same cert files.

It almost certainly will.

I think the recommendation has to be that if it's a symlink, it should
be replaced with a copy of the file, and that copy should be given the
right ownership (chown) and permissions (chmod).

Where did we get the idea that this issue only affects symlinked files?
On testing, any file which "postgres" doesn't have write permissions on
is affected:

root@d623471b11ee:/var/lib/postgresql/9.3/main# touch root_file.txt
root@d623471b11ee:/var/lib/postgresql/9.3/main# ls -l
total 60
-rw------- 1 postgres postgres 4 May 26 17:46 PG_VERSION
drwx------ 5 postgres postgres 4096 May 26 17:46 base
drwx------ 2 postgres postgres 4096 May 26 17:46 global
drwx------ 2 postgres postgres 4096 May 26 17:46 pg_clog
drwx------ 4 postgres postgres 4096 May 26 17:46 pg_multixact
drwx------ 2 postgres postgres 4096 May 26 17:46 pg_notify
drwx------ 2 postgres postgres 4096 May 26 17:46 pg_serial
drwx------ 2 postgres postgres 4096 May 26 17:46 pg_snapshots
drwx------ 2 postgres postgres 4096 May 26 17:47 pg_stat
drwx------ 2 postgres postgres 4096 May 26 17:46 pg_stat_tmp
drwx------ 2 postgres postgres 4096 May 26 17:46 pg_subtrans
drwx------ 2 postgres postgres 4096 May 26 17:46 pg_tblspc
drwx------ 2 postgres postgres 4096 May 26 17:46 pg_twophase
drwx------ 3 postgres postgres 4096 May 26 17:46 pg_xlog
-rw------- 1 postgres postgres 133 May 26 17:46 postmaster.opts
-rw-r--r-- 1 root root 0 May 26 17:49 root_file.txt
root@d623471b11ee:/var/lib/postgresql/9.3/main# service postgresql start
* Starting PostgreSQL 9.3 database server [ OK ]
root@d623471b11ee:/var/lib/postgresql/9.3/main# ps aux | grep postgres
postgres 4627 0.2 0.4 244880 16100 ? S 17:49 0:00 /usr/lib/postgresql/9.3/bin/postgres -D /var/lib/postgresql/9.3/main -c config_file=/etc/postgresql/9.3/main/postgresql.conf
postgres 4629 0.0 0.0 244880 1868 ? Ss 17:49 0:00 postgres: checkpointer process
postgres 4630 0.0 0.0 244880 1872 ? Ss 17:49 0:00 postgres: writer process
postgres 4631 0.0 0.0 244880 1648 ? Ss 17:49 0:00 postgres: wal writer process
postgres 4632 0.0 0.0 245632 2956 ? Ss 17:49 0:00 postgres: autovacuum launcher process
postgres 4633 0.0 0.0 100556 1768 ? Ss 17:49 0:00 postgres: stats collector process
root 4647 0.0 0.0 8860 648 ? S+ 17:49 0:00 grep --color=auto postgres
root@d623471b11ee:/var/lib/postgresql/9.3/main# kill -9 4627
root@d623471b11ee:/var/lib/postgresql/9.3/main# service postgresql start
* Starting PostgreSQL 9.3 database server
* Removed stale pid file.
The PostgreSQL server failed to start. Please check the log output:
2015-05-26 17:49:36 UTC [4676-1] LOG: database system was interrupted; last known up at 2015-05-26 17:49:16 UTC
2015-05-26 17:49:36 UTC [4676-2] FATAL: could not open file "/var/lib/postgresql/9.3/main/root_file.txt": Permission denied
2015-05-26 17:49:36 UTC [4675-1] LOG: startup process (PID 4676) exited with exit code 1
2015-05-26 17:49:36 UTC [4675-2] LOG: aborting startup due to startup process failure
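
Given the failure above, it may be useful to look for such files before a restart. The following is a rough sketch only: list_suspect_files is a hypothetical name, and the test it applies (not owned by the given user, or missing the owner-write bit) is a heuristic assumption rather than the exact check the server performs. Group permissions or ACLs could still grant write access to a file it reports.

```shell
#!/bin/sh
# Hypothetical helper (a heuristic, not the server's own check):
# list entries under a data directory that the given owner probably
# cannot open for writing.
# usage: list_suspect_files DATADIR OWNER
list_suspect_files() {
    datadir=$1
    owner=$2
    # -L follows symlinks, so a link to a root-owned file is judged
    # by its target rather than by the link itself.
    # Report anything not owned by OWNER or lacking the owner-write bit.
    find -L "$datadir" \( ! -user "$owner" -o ! -perm -200 \) -print
}
```

For the directory in the transcript this could be run as `list_suspect_files /var/lib/postgresql/9.3/main postgres`, which would be expected to report root_file.txt since it is owned by root.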

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#4)
Re: fsync bug faq for publication?

Josh Berkus <josh@agliodbs.com> writes:

Where did we get the idea that this issue only affects symlinked files?

Nobody said any such thing. My point was that permissions and ownership
both have to be looked at. The Debian situation is that there are symlinks
in $PGDATA pointing at root-owned files, and those files are (we think)
also used by other services; so Magnus' point was that you'd probably
better copy those files rather than modify their ownership/permissions in situ.

regards, tom lane


#6Josh Berkus
josh@agliodbs.com
In reply to: Josh Berkus (#1)
Re: fsync bug faq for publication?

On 05/26/2015 10:57 AM, Tom Lane wrote:

Josh Berkus <josh@agliodbs.com> writes:

Where did we get the idea that this issue only affects symlinked files?

Nobody said any such thing. My point was that permissions and ownership
both have to be looked at. The Debian situation is that there are symlinks
in $PGDATA pointing at root-owned files, and those files are (we think)
also used by other services; so Magnus' point was that you'd probably
better copy those files rather than modify their ownership/permissions in situ.

Updated, please make further corrections so I can get an announcement
out ASAP. Thanks!

https://wiki.postgresql.org/wiki/May_2015_Fsync_Permissions_Bug

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
