Instability in test/regress/sql/portals.sql

Started by Matthias van de Meent, about 1 month ago, 2 messages, hackers
#1 Matthias van de Meent
boekewurm+postgres@gmail.com

Hi,

Internally at Databricks we've seen rare regression failures in the
portals.sql test, where the regression diff looks something like the
one attached in data_attachments_ed2b37649a9393b5.diffs.

It seems like this was caused by synchronized seqscans, which caused
the foo25ns cursor to start its seqscan not at the start of the table,
but instead with an offset into the table. This changed the output,
because that relied on the seqscan starting at the first page of the
table.

To stabilize this test, let's add SET synchronize_seqscans = off, as attached.
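
For illustration only, a minimal sketch of what such a setting looks like around a cursor that reads a table without ORDER BY. This is not the attached patch; the table demo_tbl and the cursor demo_cur are hypothetical names, and the real test uses its own objects:

```sql
-- Hypothetical sketch, not the contents of the attached patch.
-- demo_tbl and demo_cur are made-up names used only for this example.
CREATE TABLE demo_tbl AS
    SELECT g AS id FROM generate_series(1, 1000) AS g;

-- With synchronized scans off, every sequential scan starts at the
-- first page of the table instead of possibly joining a scan that is
-- already in progress somewhere in the middle.
SET synchronize_seqscans = off;

BEGIN;
-- No ORDER BY: the row order is whatever the sequential scan produces.
DECLARE demo_cur NO SCROLL CURSOR FOR SELECT * FROM demo_tbl;
FETCH 3 FROM demo_cur;
CLOSE demo_cur;
COMMIT;

-- Restore the default so later statements are not affected.
RESET synchronize_seqscans;
DROP TABLE demo_tbl;
```

A table this small would likely never trigger a synchronized scan in the first place; the sketch only shows where the SET and RESET would sit relative to the cursor.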

Kind regards,

Matthias van de Meent
Databricks (https://www.databricks.com)

Attachments:

data_attachments_ed2b37649a9393b5.diffs (application/octet-stream, +4 -4)
v1-0001-Stabilize-syncscan-issue-in-pg_regress-portals.ou.patch (application/octet-stream, +5 -1)

#2 Michael Paquier
michael@paquier.xyz
In reply to: Matthias van de Meent (#1)
Re: Instability in test/regress/sql/portals.sql

On Wed, Mar 11, 2026 at 03:11:36PM +0100, Matthias van de Meent wrote:

> Internally at Databricks we've seen rare regression failures in the
> portals.sql test, where the regression diff looks something like the
> one attached in data_attachments_ed2b37649a9393b5.diffs.
>
> It seems like this was caused by synchronized seqscans, which caused
> the foo25ns cursor to start its seqscan not at the start of the table,
> but instead with an offset into the table. This changed the output,
> because that relied on the seqscan starting at the first page of the
> table.
>
> To stabilize this test, let's add SET synchronize_seqscans = off, as attached.

One could be reminded about cbf4177f2ca0, as well, when using a low
number of shared buffers. My question would be why only this test?
And isn't there some benefit in running this part of the test suite
with the parameter enabled?

Running with sync seqscans disabled is the pre-8.3 behavior, and I'd be
tempted to suggest that we drop the GUC while making the recovery test
027 use more shared buffers, but we tend to be really conservative with
these tests, so I doubt that this is going to happen. ;)

Or you could just force that on your end with a custom .conf file? We
have plenty of configuration that can influence the outcome of the
tests, and it is not obvious why this particular case is worth caring
about when it comes to the upstream core tests.
--
Michael