BUG #16540: Possible corrupted file?

Started by PG Bug reporting formalmost 6 years ago3 messagesbugs
Jump to latest
#1PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 16540
Logged by: Paul Hatcher
Email address: paul.hatcher@sentinelpartners.co.uk
PostgreSQL version: 11.8
Operating system: Windows Server 2012
Description:

We had a situation where a query that normally ran in 10 seconds instead
never completed. No conflicts or locks could be seen. All indexes were
valid and in place. A full shutdown and restart of the server was done with
no effect.
To test we took a full backup of this system and restored to an identical
alternate system. Having done this, the alternate system ran the query in
10 seconds.
Final test was to backup and restore back into the production system -
having done this the query returned to running in 10 seconds. Do you have
any idea what could have caused this?

Thanks
Paul

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: PG Bug reporting form (#1)
Re: BUG #16540: Possible corrupted file?

PG Bug reporting form <noreply@postgresql.org> writes:

We had a situation where a query that normally ran in 10 seconds instead
never completed. No conflicts or locks could be seen. All indexes were
valid and in place. A full shutdown and restart of the server was done with
no effect.
To test we took a full backup of this system and restored to an identical
alternate system. Having done this, the alternate system ran the query in
10 seconds.
Final test was to backup and restore back into the production system -
having done this the query returned to running in 10 seconds. Do you have
any idea what could have caused this?

The most obvious theory is a change of query plan, perhaps due to having
up-to-date ANALYZE statistics in one case and not the other. I don't
suppose you captured EXPLAIN output for the non-working state?

regards, tom lane

#3Paul Hatcher
paul.hatcher@sentinelpartners.co.uk
In reply to: Tom Lane (#2)
Re: BUG #16540: Possible corrupted file?

Hi Tom

Many thanks for the feedback.

I did do an explain on both the 10second and the never ending. The actual structure of the plan was identical - there were a couple of very minor differences in costs and widths but we're talking cost of 200,000 vs 199,500 kind of differences.

Thanks

Paul Hatcher
Senior Consultant
Sentinel Partners Limited
W: sentinelpartners.co.uk
M: 07788 673898
T: 0800 612 2116

On 14/07/2020, 15:01, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

PG Bug reporting form <noreply@postgresql.org> writes:

We had a situation where a query that normally ran in 10 seconds instead
never completed. No conflicts or locks could be seen. All indexes were
valid and in place. A full shutdown and restart of the server was done with
no effect.
To test we took a full backup of this system and restored to an identical
alternate system. Having done this, the alternate system ran the query in
10 seconds.
Final test was to backup and restore back into the production system -
having done this the query returned to running in 10 seconds. Do you have
any idea what could have caused this?

The most obvious theory is a change of query plan, perhaps due to having
up-to-date ANALYZE statistics in one case and not the other. I don't
suppose you captured EXPLAIN output for the non-working state?

regards, tom lane