BUG #18084: Count Mismatch Challenges During PostgreSQL Database Migration: Causes and Solutions

Started by PG Bug reporting formover 2 years ago8 messagesbugs
Jump to latest
#1PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 18084
Logged by: Git Queries
Email address: gitqueries0@gmail.com
PostgreSQL version: Unsupported/Unknown
Operating system: Microsoft Windows Server 2019 Standard
Description:

Version - PostgreSQL 10.20, compiled by Visual C++ build 1800, 64-bit

Platform - Windows

Regarding PostgreSQL, during the migration from PG 10.20 to 14.7, we will
perform a sanity check on the migrated tables using source and destination
row counts. Occasionally, we encounter count mismatches where the source
database has fewer rows, resulting in a higher count, or more rows,
resulting in a lower count.

Note: There are no migration issues, as all data rows have been successfully
migrated without data loss.

ProdDB=> select count(*) from PdtDetailsTable;
count
---------
1297324
(1 row)

ProdDB=> reindex table PdtDetailsTable;
REINDEX
ProdDB=> select count(*) from PdtDetailsTable;
count
---------
1297322
(1 row)

ProdDB=>
Upon reindexing, the count(*) query returns the correct results in the
source DB.

In the pg_log, we couldn't find any traces related to this behavior. Is
there a reason for such behavior, and how can this be addressed to prevent
future issues?

#2David Rowley
dgrowleyml@gmail.com
In reply to: PG Bug reporting form (#1)
Re: BUG #18084: Count Mismatch Challenges During PostgreSQL Database Migration: Causes and Solutions

On Tue, 5 Sept 2023 at 06:19, PG Bug reporting form
<noreply@postgresql.org> wrote:

Regarding PostgreSQL, during the migration from PG 10.20 to 14.7, we will
perform a sanity check on the migrated tables using source and destination
row counts. Occasionally, we encounter count mismatches where the source
database has fewer rows, resulting in a higher count, or more rows,
resulting in a lower count.

Can we assume there are no concurrent changes being made?

Upon reindexing, the count(*) query returns the correct results in the
source DB.

Is it possible that the previous index was created with the
CONCURRENTLY option? There have been a few bugs in that area. I've
not looked exhaustively, but see [1]https://www.postgresql.org/docs/release/10.16/ and [2]https://www.postgresql.org/docs/release/10.19/. Search for "reindex".

In the pg_log, we couldn't find any traces related to this behavior. Is
there a reason for such behavior, and how can this be addressed to prevent
future issues?

There have been a number of bugs fixed since 10.0 that could have led
to this. The best thing to do to help prevent this is always upgrade
when a minor version is released and follow any relevant instructions
given in the release notes. If you skip minor versions, then you'll
need to look at all the release notes from the minor version you're
going from, up to the version you're upgrading to.

It looks like this practice wasn't being followed as you're on 10.20,
which was released 1.5 years ago. Minor versions are released to
address/fix bugs that are found. If you don't upgrade minor versions
then you don't receive bug fixes.

David

[1]: https://www.postgresql.org/docs/release/10.16/
[2]: https://www.postgresql.org/docs/release/10.19/

#3Bruce Momjian
bruce@momjian.us
In reply to: PG Bug reporting form (#1)
Re: BUG #18084: Count Mismatch Challenges During PostgreSQL Database Migration: Causes and Solutions

On Mon, Sep 4, 2023 at 05:55:51PM +0000, PG Bug reporting form wrote:

The following bug has been logged on the website:

Bug reference: 18084
Logged by: Git Queries
Email address: gitqueries0@gmail.com
PostgreSQL version: Unsupported/Unknown
Operating system: Microsoft Windows Server 2019 Standard
Description:

Version - PostgreSQL 10.20, compiled by Visual C++ build 1800, 64-bit

Platform - Windows

Regarding PostgreSQL, during the migration from PG 10.20 to 14.7, we will
perform a sanity check on the migrated tables using source and destination
row counts. Occasionally, we encounter count mismatches where the source
database has fewer rows, resulting in a higher count, or more rows,
resulting in a lower count.

Note: There are no migration issues, as all data rows have been successfully
migrated without data loss.

And how did you perform this migration? pg_dump, pg_upgrade, logical
replication?

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Only you can decide what is important to you.

#4Git Queries
gitqueries0@gmail.com
In reply to: David Rowley (#2)
Re: BUG #18084: Count Mismatch Challenges During PostgreSQL Database Migration: Causes and Solutions

Can we assume there are no concurrent changes being made?

Yes there are no concurrent changes being made during the migration.

Is it possible that the previous index was created with the
CONCURRENTLY option? There have been a few bugs in that area. I've
not looked exhaustively, but see [1] and [2]. Search for "reindex".

No, indexes are not created with the CONCURRENTLY option.

It looks like this practice wasn't being followed as you're on 10.20,
which was released 1.5 years ago. Minor versions are released to
address/fix bugs that are found. If you don't upgrade minor versions
then you don't receive bug fixes.

Sure, we will take this into account.

On Tue, Sep 5, 2023 at 2:43 AM David Rowley <dgrowleyml@gmail.com> wrote:

Show quoted text

On Tue, 5 Sept 2023 at 06:19, PG Bug reporting form
<noreply@postgresql.org> wrote:

Regarding PostgreSQL, during the migration from PG 10.20 to 14.7, we will
perform a sanity check on the migrated tables using source and

destination

row counts. Occasionally, we encounter count mismatches where the source
database has fewer rows, resulting in a higher count, or more rows,
resulting in a lower count.

Can we assume there are no concurrent changes being made?

Upon reindexing, the count(*) query returns the correct results in the
source DB.

Is it possible that the previous index was created with the
CONCURRENTLY option? There have been a few bugs in that area. I've
not looked exhaustively, but see [1] and [2]. Search for "reindex".

In the pg_log, we couldn't find any traces related to this behavior. Is
there a reason for such behavior, and how can this be addressed to

prevent

future issues?

There have been a number of bugs fixed since 10.0 that could have led
to this. The best thing to do to help prevent this is always upgrade
when a minor version is released and follow any relevant instructions
given in the release notes. If you skip minor versions, then you'll
need to look at all the release notes from the minor version you're
going from, up to the version you're upgrading to.

It looks like this practice wasn't being followed as you're on 10.20,
which was released 1.5 years ago. Minor versions are released to
address/fix bugs that are found. If you don't upgrade minor versions
then you don't receive bug fixes.

David

[1] https://www.postgresql.org/docs/release/10.16/
[2] https://www.postgresql.org/docs/release/10.19/

#5Git Queries
gitqueries0@gmail.com
In reply to: Bruce Momjian (#3)
Re: BUG #18084: Count Mismatch Challenges During PostgreSQL Database Migration: Causes and Solutions

And how did you perform this migration? pg_dump, pg_upgrade, logical
replication?

Migration was performed using pg_dump.

On Tue, Sep 5, 2023 at 8:48 PM Bruce Momjian <bruce@momjian.us> wrote:

Show quoted text

On Mon, Sep 4, 2023 at 05:55:51PM +0000, PG Bug reporting form wrote:

The following bug has been logged on the website:

Bug reference: 18084
Logged by: Git Queries
Email address: gitqueries0@gmail.com
PostgreSQL version: Unsupported/Unknown
Operating system: Microsoft Windows Server 2019 Standard
Description:

Version - PostgreSQL 10.20, compiled by Visual C++ build 1800, 64-bit

Platform - Windows

Regarding PostgreSQL, during the migration from PG 10.20 to 14.7, we will
perform a sanity check on the migrated tables using source and

destination

row counts. Occasionally, we encounter count mismatches where the source
database has fewer rows, resulting in a higher count, or more rows,
resulting in a lower count.

Note: There are no migration issues, as all data rows have been

successfully

migrated without data loss.

And how did you perform this migration? pg_dump, pg_upgrade, logical
replication?

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Only you can decide what is important to you.

#6Bruce Momjian
bruce@momjian.us
In reply to: Git Queries (#5)
Re: BUG #18084: Count Mismatch Challenges During PostgreSQL Database Migration: Causes and Solutions

On Wed, Sep 6, 2023 at 11:11:05PM +0530, Git Queries wrote:

And how did you perform this migration?  pg_dump, pg_upgrade, logical
replication?

Migration was performed using pg_dump.

Uh, that is kind of a surprise since pg_dump/restore does a logical dump
--- none of the binary files are transfered, so it should always be
accurate.

---------------------------------------------------------------------------

On Tue, Sep 5, 2023 at 8:48 PM Bruce Momjian <bruce@momjian.us> wrote:

On Mon, Sep  4, 2023 at 05:55:51PM +0000, PG Bug reporting form wrote:

The following bug has been logged on the website:

Bug reference:      18084
Logged by:          Git Queries
Email address:      gitqueries0@gmail.com
PostgreSQL version: Unsupported/Unknown
Operating system:   Microsoft Windows Server 2019 Standard
Description:       

Version - PostgreSQL 10.20, compiled by Visual C++ build 1800, 64-bit

Platform - Windows

Regarding PostgreSQL, during the migration from PG 10.20 to 14.7, we will
perform a sanity check on the migrated tables using source and

destination

row counts. Occasionally, we encounter count mismatches where the source
database has fewer rows, resulting in a higher count, or more rows,
resulting in a lower count.

Note: There are no migration issues, as all data rows have been

successfully

migrated without data loss.

And how did you perform this migration?  pg_dump, pg_upgrade, logical
replication?

--
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Only you can decide what is important to you.

#7David Rowley
dgrowleyml@gmail.com
In reply to: Bruce Momjian (#6)
Re: BUG #18084: Count Mismatch Challenges During PostgreSQL Database Migration: Causes and Solutions

On Thu, 7 Sept 2023 at 11:17, Bruce Momjian <bruce@momjian.us> wrote:

On Wed, Sep 6, 2023 at 11:11:05PM +0530, Git Queries wrote:

Migration was performed using pg_dump.

Uh, that is kind of a surprise since pg_dump/restore does a logical dump
--- none of the binary files are transfered, so it should always be
accurate.

I can't be certain, but I think the likely reason here is that the
newly restored instance is fine and all indexes match the heap, but
it's the source instance that has indexes with missing entries. Since
the pg_dump queries are most likely to result in seqscans, then all
rows will be returned and the new instance has no missing rows.

David

#8Bruce Momjian
bruce@momjian.us
In reply to: David Rowley (#7)
Re: BUG #18084: Count Mismatch Challenges During PostgreSQL Database Migration: Causes and Solutions

On Thu, Sep 7, 2023 at 12:26:53PM +1200, David Rowley wrote:

On Thu, 7 Sept 2023 at 11:17, Bruce Momjian <bruce@momjian.us> wrote:

On Wed, Sep 6, 2023 at 11:11:05PM +0530, Git Queries wrote:

Migration was performed using pg_dump.

Uh, that is kind of a surprise since pg_dump/restore does a logical dump
--- none of the binary files are transfered, so it should always be
accurate.

I can't be certain, but I think the likely reason here is that the
newly restored instance is fine and all indexes match the heap, but
it's the source instance that has indexes with missing entries. Since
the pg_dump queries are most likely to result in seqscans, then all
rows will be returned and the new instance has no missing rows.

Yes, very good point.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Only you can decide what is important to you.