Hot standby v5 patch - Databases created post backup remain inaccessible + replica SIGSEGV when coming out of standby

Started by Mark Kirkwood about 17 years ago, 5 messages
#1Mark Kirkwood
markir@paradise.net.nz

Another corner case:

1/ Setup master and replica with replica using pg_standby
2/ Create a new database (I used 'bench')
3/ Initialize the pgbench schema of size 100 in 'bench' (just to ensure
the logs with the db creation get archived)
4/ Attempt to connect to 'bench' on the replica
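
The steps after the initial master/replica setup could be scripted roughly as below. This is only a sketch: the port numbers are assumptions (the thread does not say how the two clusters were addressed; 5439 is a guess based on the trigger file name), and the helper name `repro_commands` is made up for illustration. It only builds the command lines for `createdb`, `pgbench` and `psql`, so it can be inspected without a running cluster.

```python
import shlex

# Hypothetical ports: not stated anywhere in the report.
MASTER_PORT = 5432
REPLICA_PORT = 5439


def repro_commands(dbname="bench", scale=100,
                   master_port=MASTER_PORT, replica_port=REPLICA_PORT):
    """Return the reproduction commands as argument lists.

    Setting up the master and the pg_standby replica is not covered here.
    """
    return [
        # Create the new database on the master.
        ["createdb", "-p", str(master_port), dbname],
        # Initialize the pgbench schema so enough WAL is generated for the
        # segments containing the database creation to be archived.
        ["pgbench", "-i", "-s", str(scale), "-p", str(master_port), dbname],
        # Attempt to connect to the new database on the replica.
        ["psql", "-p", str(replica_port), "-d", dbname, "-c", "SELECT 1"],
    ]


if __name__ == "__main__":
    # Print the commands instead of running them.
    for argv in repro_commands():
        print(shlex.join(argv))
```

Each list can be passed to `subprocess.run(argv, check=True)` once the ports match the local setup.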

HEAD from 2nd Nov with the v5 patch applied, on FreeBSD 7.1-Prerelease as
usual.

postgres=# \l
                                  List of databases
   Name    |  Owner   | Encoding  | Collation | Ctype |          Access Privileges
-----------+----------+-----------+-----------+-------+-------------------------------------
 bench     | postgres | SQL_ASCII | C         | C     |
 postgres  | postgres | SQL_ASCII | C         | C     |
 template0 | postgres | SQL_ASCII | C         | C     | {=c/postgres,postgres=CTc/postgres}
 template1 | postgres | SQL_ASCII | C         | C     | {=c/postgres,postgres=CTc/postgres}
(4 rows)

postgres=# \c bench
FATAL: database "bench" does not exist
Previous connection kept

Not sure if this is related at all, but if the replica is then
instructed to finish recovery by touching its trigger file, we get:

DEBUG: executing restore command "pg_standby -l -d -s 2 -t /tmp/pgsql.trigger.5439 /data0/pgarchive/8.4 00000001.history pg_xlog/RECOVERYHISTORY 000000000000000000000000 2>>standby.log"
DEBUG: could not restore file "00000001.history" from archive: return code 0
DEBUG: moving last restored xlog to "pg_xlog/000000020000000000000068"
LOG: archive recovery complete
DEBUG: Clear UnobservedXids
LOG: clearing recovery locks
DEBUG: reaping dead processes
LOG: startup process (PID 4254) was terminated by signal 11: Segmentation fault
LOG: aborting startup due to startup process failure
DEBUG: proc_exit(1)
DEBUG: shmem_exit(1)
DEBUG: exit(1)

Using gdb:
#0 RelationClearRecoveryLocks () at inval.c:1702
1702 xl_rel_lock *lock = (xl_rel_lock *) lfirst(l);
(gdb) bt
#0 RelationClearRecoveryLocks () at inval.c:1702
#1 0x080d3849 in StartupXLOG () at xlog.c:5959
#2 0x080f1680 in AuxiliaryProcessMain (argc=2, argv=0xbfbfe6e8) at bootstrap.c:421
#3 0x08214d4d in StartChildProcess (type=StartupProcess) at postmaster.c:4104
#4 0x0821725b in PostmasterMain (argc=1, argv=0xbfbfec50) at postmaster.c:1034
#5 0x081bfa7b in main (argc=1, argv=0xbfbfec50) at main.c:188

regards

Mark

#2Simon Riggs
simon@2ndQuadrant.com
In reply to: Mark Kirkwood (#1)
Re: Hot standby v5 patch - Databases created post backup remain inaccessible + replica SIGSEGV when coming out of standby

On Tue, 2008-11-04 at 18:33 +1300, Mark Kirkwood wrote:

> Another corner case:
>
> 1/ Setup master and replica with replica using pg_standby
> 2/ Create a new database (I used 'bench')
> 3/ Initialize the pgbench schema of size 100 in 'bench' (just to ensure
> the logs with the db creation get archived)
> 4/ Attempt to connect to 'bench' on the replica
>
> HEAD from 2nd Nov with the v5 patch applied, on FreeBSD 7.1-Prerelease as
> usual.

Case acknowledged.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#3Simon Riggs
simon@2ndQuadrant.com
In reply to: Mark Kirkwood (#1)
Re: Hot standby v5 patch - Databases created post backup remain inaccessible + replica SIGSEGV when coming out of standby

On Tue, 2008-11-04 at 18:33 +1300, Mark Kirkwood wrote:

> postgres=# \l
>                                   List of databases
>    Name    |  Owner   | Encoding  | Collation | Ctype |          Access Privileges
> -----------+----------+-----------+-----------+-------+-------------------------------------
>  bench     | postgres | SQL_ASCII | C         | C     |
>  postgres  | postgres | SQL_ASCII | C         | C     |
>  template0 | postgres | SQL_ASCII | C         | C     | {=c/postgres,postgres=CTc/postgres}
>  template1 | postgres | SQL_ASCII | C         | C     | {=c/postgres,postgres=CTc/postgres}
> (4 rows)
>
> postgres=# \c bench
> FATAL: database "bench" does not exist
> Previous connection kept

CREATE DATABASE didn't trigger the db flat file update. The code for it
existed and was triggered in the cases where a transaction would normally
rebuild the flat files. Simple fix, but stupid oversight.

Spotted another problem: the flat files written by BuildFlatFile may not
be built consistently if a rebuild is triggered before we reach the
recovery consistency point. This is fixed by forcing a rebuild of the
flat files when we hit that consistency point.

Both are one-line changes, but I'll go looking for other issues there.
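
To make the two fixes concrete, here is a toy model of the behaviour described above. It is not the patch code: the class `ToyStandby` and its flags are invented for illustration, and it assumes (as the symptoms suggest, though the thread does not spell it out) that `\l` reads the database catalog while new connections look the database up in the flat file. It does not simulate inconsistent catalog reads before the consistency point, only the forced rebuild once that point is reached.

```python
class ToyStandby:
    """Toy model of a standby replaying WAL. Not PostgreSQL code."""

    def __init__(self, databases, rebuild_on_create_db, rebuild_at_consistency):
        self.catalog = set(databases)    # what \l would list (assumption)
        self.flat_file = set(databases)  # what new connections consult (assumption)
        self.rebuild_on_create_db = rebuild_on_create_db
        self.rebuild_at_consistency = rebuild_at_consistency

    def rebuild_flat_file(self):
        # Regenerate the flat file from the current catalog contents.
        self.flat_file = set(self.catalog)

    def replay_create_database(self, name):
        # Replaying the record always updates the catalog ...
        self.catalog.add(name)
        # ... but the flat file only follows if replay triggers a rebuild.
        if self.rebuild_on_create_db:
            self.rebuild_flat_file()

    def reach_consistency_point(self):
        # Second fix: force a rebuild once recovery is consistent.
        if self.rebuild_at_consistency:
            self.rebuild_flat_file()

    def can_connect(self, name):
        return name in self.flat_file


if __name__ == "__main__":
    base = ["postgres", "template0", "template1"]

    buggy = ToyStandby(base, rebuild_on_create_db=False,
                       rebuild_at_consistency=False)
    buggy.replay_create_database("bench")
    # Listed in the catalog, yet not connectable: the reported symptom.
    print("bench" in buggy.catalog, buggy.can_connect("bench"))

    fixed = ToyStandby(base, rebuild_on_create_db=True,
                       rebuild_at_consistency=True)
    fixed.replay_create_database("bench")
    print("bench" in fixed.catalog, fixed.can_connect("bench"))
```

In this model the first flag corresponds to the CREATE DATABASE fix and the second to the forced rebuild at the consistency point; either one alone is enough to refresh a stale flat file here, which is a simplification of the real recovery code.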


#4Simon Riggs
simon@2ndQuadrant.com
In reply to: Simon Riggs (#3)
Re: Re: Hot standby v5 patch - Databases created post backup remain inaccessible + replica SIGSEGV when coming out of standby

On Tue, 2008-11-04 at 09:52 +0000, Simon Riggs wrote:

> > postgres=# \c bench
> > FATAL: database "bench" does not exist
> > Previous connection kept
>
> CREATE DATABASE didn't trigger the db flat file update. The code for it
> existed and was triggered in the cases where a transaction would normally
> rebuild the flat files. Simple fix, but stupid oversight.

Issue resolved.

> Spotted another problem: the flat files written by BuildFlatFile may not
> be built consistently if a rebuild is triggered before we reach the
> recovery consistency point. This is fixed by forcing a rebuild of the
> flat files when we hit that consistency point.

Issue resolved.

> Both are one-line changes, but I'll go looking for other issues there.

I also mentioned previously that I hadn't yet implemented locking during
flat file updates. After spending longer looking at the code around this,
I no longer think it is required.

These changes will be rolled into the next patch version, soon.


#5Mark Kirkwood
markir@paradise.net.nz
In reply to: Simon Riggs (#3)
Re: Re: Hot standby v5 patch - Databases created post backup remain inaccessible + replica SIGSEGV when coming out of standby

Simon Riggs wrote:

> On Tue, 2008-11-04 at 18:33 +1300, Mark Kirkwood wrote:
>
> > postgres=# \l
> >                                   List of databases
> >    Name    |  Owner   | Encoding  | Collation | Ctype |          Access Privileges
> > -----------+----------+-----------+-----------+-------+-------------------------------------
> >  bench     | postgres | SQL_ASCII | C         | C     |
> >  postgres  | postgres | SQL_ASCII | C         | C     |
> >  template0 | postgres | SQL_ASCII | C         | C     | {=c/postgres,postgres=CTc/postgres}
> >  template1 | postgres | SQL_ASCII | C         | C     | {=c/postgres,postgres=CTc/postgres}
> > (4 rows)
> >
> > postgres=# \c bench
> > FATAL: database "bench" does not exist
> > Previous connection kept
>
> CREATE DATABASE didn't trigger the db flat file update. The code for it
> existed and was triggered in the cases where a transaction would normally
> rebuild the flat files. Simple fix, but stupid oversight.
>
> Spotted another problem: the flat files written by BuildFlatFile may not
> be built consistently if a rebuild is triggered before we reach the
> recovery consistency point. This is fixed by forcing a rebuild of the
> flat files when we hit that consistency point.
>
> Both are one-line changes, but I'll go looking for other issues there.

Patching with v5d lets me access the newly created database. Another one
down!

Cheers

Mark