Test cluster with high OIDs above the signed-int limit (2B+)
Hi. A few weeks ago, one of our clusters, with high DDL churn from
UTs, crossed the 2B mark for OIDs, which exposed a bug in our code.
I'm moving toward creating clusters on-the-fly for testing, and would
like to force that situation, to avoid a future silent regression:
crossing that threshold naturally takes a long time, and since we do
move up in major versions, the over-the-threshold cluster will
eventually be abandoned. How can I achieve that? A quick AI query
yielded nothing, but this is unusual enough that there's little to no
material from which to get good answers. Can PostgreSQL
experts/hackers weigh in on this, please? If it's not possible now,
could it be supported in the future? --DD
On Mon, Apr 20, 2026 at 5:45 AM Dominique Devienne <ddevienne@gmail.com>
wrote:
Hi. A few weeks ago, one of our clusters, with high DDL churn from
UTs, crossed the 2B mark for OIDs, which exposed a bug in our code.
Because you track and remember OIDs?
--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!
On Mon, Apr 20, 2026 at 2:45 PM Ron Johnson <ronljohnsonjr@gmail.com> wrote:
On Mon, Apr 20, 2026 at 5:45 AM Dominique Devienne <ddevienne@gmail.com> wrote:
Hi. A few weeks ago, one of our clusters, with high DDL churn from
UTs, crossed the 2B mark for OIDs, which exposed a bug in our code.
Because you track and remember OIDs?
No. I don't even remember the exact bug, and we've lost networking to
our SCM right now, so I can't even look it up (obviously it's not a
decentralized SCM). But signed vs. unsigned at the 2B+ boundary is a
classic bug, worth testing for, except it's impractical to reach such
high OIDs on demand.
demand. Given there's a cluster-wide OID counter, surely there's a
way, even hackish, to influence that counter, no? PostgreSQL itself
has mitigation strategies when running out of OIDs, doesn't it? It's a
different use-case, but that implies also reaching large OIDs, and I
suspect this is unit tested, no?
On Mon, Apr 20, 2026 at 2:59 PM Dominique Devienne <ddevienne@gmail.com> wrote:
No. I don't even remember the exact bug
Was an old test using lo_creat(-1) RETURNING the OID, and code doing
`std::stoi(PQgetvalue(...))`. In production we don't use LOs and we
use the binary protocol, so no such issue there, but my original point
remains. We process OIDs in several places, and making sure our test
suite works with high OIDs would be better. Since I fully control the
cluster, which is created on-the-fly specifically for the test run,
I'd like to be able to simulate high OIDs "instantly".
On Mon, Apr 20, 2026 at 9:08 AM Dominique Devienne <ddevienne@gmail.com>
wrote:
On Mon, Apr 20, 2026 at 2:59 PM Dominique Devienne <ddevienne@gmail.com>
wrote:
No. I don't even remember the exact bug
Was an old test using lo_creat(-1) RETURNING the OID, and code doing
`std::stoi(PQgetvalue(...))`. In production we don't use LOs and we
use the binary protocol, so no such issue there, but my original point
remains. We process OIDs in several places, and making sure our test
suite works with high OIDs would be better. Since I fully control the
cluster, which is created on-the-fly specifically for the test run,
I'd like to be able to simulate high OIDs "instantly".
It's an unsigned integer, so I'd say don't use signed ints when
processing OIDs.
It's a valid question, though, what happens when the OID counter wraps
around and hits a duplicate.
Dominique Devienne <ddevienne@gmail.com> writes:
Hi. A few weeks ago, one of our clusters, with high DDL churn from
UTs, crossed the 2B mark for OIDs, which exposed a bug in our code.
I'm moving toward creating clusters on-the-fly for testing, and would
like to force that situation, to avoid a future silent regression:
crossing that threshold naturally takes a long time, and since we do
move up in major versions, the over-the-threshold cluster will
eventually be abandoned. How can I achieve that?
See pg_resetwal --next-oid. Don't recall what else you need to say
to avoid breaking the cluster in other ways.
regards, tom lane
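Tom's suggestion can be sketched as the following sequence. The data
directory path, the target OID, and the final verification query are
illustrative assumptions, not prescribed values; pg_resetwal must run
against a cleanly shut-down cluster, which is the state right after
initdb:

```shell
# Initialize a throwaway cluster (path is a placeholder).
initdb -D /tmp/hioid-cluster

# Push the OID counter past the signed 32-bit boundary.
pg_resetwal --next-oid=3000000000 -D /tmp/hioid-cluster

# Start the cluster; newly assigned OIDs are now in high territory.
pg_ctl -D /tmp/hioid-cluster -l /tmp/hioid-cluster.log start

# Optional sanity check: a fresh large object should get a high OID.
psql -d postgres -c "SELECT lo_unlink(lo_creat(-1)), 'done';"
```

Per the later exchange in this thread, other pg_resetwal options may
also be needed to keep the cluster consistent; treat this as the game
plan rather than a complete recipe.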
You can change the define for FirstNormalObjectId
in include/access/transam.h to a very large number and recompile Postgres.
I don't know an easy way to increment that for an existing cluster other
than creating/removing an object in a client loop.
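The client-loop alternative mentioned above could look like the sketch
below, using large objects since each lo_creat() consumes one OID from
the cluster-wide counter. The iteration count is a placeholder; burning
billions of OIDs this way is impractical, which is the point being made:

```shell
# Advance the OID counter by creating and immediately removing
# large objects. Each lo_creat(-1) consumes one OID.
# (Illustrative only: far too slow to actually reach the 2B mark.)
for i in $(seq 1 1000); do
  psql -d postgres -qc "SELECT lo_unlink(lo_creat(-1));" >/dev/null
done
```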
On Mon, Apr 20, 2026 at 3:23 PM Ron Johnson <ronljohnsonjr@gmail.com> wrote:
It's an unsigned integer, so I'd say don't use signed ints when processing OIDs.
Well duh, that's why it's a bug.
But it's a sneaky bug, because clusters rarely enter that high-OID territory.
That's precisely why I'd like a way to provoke it.
It's a valid question, though, what happens when the OID counter wraps around and hits a duplicate.
Again, I'm NOT interested in OID wrap-around. But the "second-half" of
the OID space.
On Mon, Apr 20, 2026 at 3:29 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Dominique Devienne <ddevienne@gmail.com> writes:
See pg_resetwal --next-oid. Don't recall what else you need to say
to avoid breaking the cluster in other ways.
Great, thanks Tom.
So I just initdb, run the above, then start the cluster? That's it?
Dominique Devienne <ddevienne@gmail.com> writes:
On Mon, Apr 20, 2026 at 3:29 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
See pg_resetwal --next-oid. Don't recall what else you need to say
to avoid breaking the cluster in other ways.
So I just initdb, run the above, then start the cluster? That's it?
Right. As I said, I don't recall what other options you might
need, but that's the game plan.
regards, tom lane