Replica string comparsion issue

Started by Andrey Lizenkoalmost 8 years ago5 messagesgeneral
Jump to latest
#1Andrey Lizenko
lizenko79@gmail.com

Hello,
I'm observing strange behaviour on comparing ::text field with string while
quering replica.

Here is the table structure:

=# \dS+ raw.symbols_aggregates

Table
"raw.symbols_aggregates"
Column | Type | Collation | Nullable |
Default | Storage | Stats target | Description

--------------------+---------+-----------+----------+------------------------------------------------+----------+--------------+-------------
id | bigint | | not null |
nextval('symbols_aggregates_id_seq'::regclass) | plain | |
symbol | text | | not null |
| extended | |
id_type | bigint | | not null |
| plain | |
id_aggregates_list | integer | | not null |
| plain | |
id_regionals_list | integer | | not null |
| plain | |
date_started | date | | not null |
| plain | |
Indexes:
"symbols_aggregates_pkey" PRIMARY KEY, btree (id)
"symbols_aggregates_uniq" UNIQUE CONSTRAINT, btree (symbol, id_type,
id_aggregates_list, id_regionals_list)
Foreign-key constraints:
"symbols_aggregates_id_aggregates_list_fkey" FOREIGN KEY
(id_aggregates_list) REFERENCES aggregates_list(id)
"symbols_aggregates_id_regionals_list_fkey" FOREIGN KEY
(id_regionals_list) REFERENCES regionals_list(id)
"symbols_aggregates_id_type_fkey" FOREIGN KEY (id_type) REFERENCES
types_list(id)

Simple query on master works as expected:

=# select symbol from raw.symbols_aggregates where symbol='RUT';
symbol
--------
RUT
RUT
RUT
(3 rows)

The same query on replica works only after casting to varchar, using
trim(), or something else:
(3 rows expected here)

=# select symbol from raw.symbols_aggregates where symbol='RUT';
symbol
--------
(0 rows)

with ::varchar(50) result looks correct:

=# select symbol from raw.symbols_aggregates where

symbol::varchar(50)='RUT';
symbol
--------
RUT
RUT
RUT

There is no hidden characters, it looks identical with ::bytea

=# select symbol::bytea, 'RUT'::bytea, (symbol::varchar(50))::bytea from

raw.symbols_aggregates where symbol::varchar(50)='RUT';
symbol | bytea | symbol
----------+----------+----------
\x525554 | \x525554 | \x525554
\x525554 | \x525554 | \x525554
\x525554 | \x525554 | \x525554

PostgreSQL server version is 10.3. It might be important, that master is
running on Ubuntu 16.04.4 LTS and replica on Solaris 11.3.
PostgreSQL installed from binaries, not from sources. All locales
are en_US.UTF-8

My suggestion it is a kind of collation issue, but I've no idea why :text
is not working in this case.

--
Regards, Andrei Lizenko

In reply to: Andrey Lizenko (#1)
Re: Replica string comparsion issue

On Wed, Jun 13, 2018 at 1:10 PM, Andrey Lizenko <lizenko79@gmail.com> wrote:

I'm observing strange behaviour on comparing ::text field with string while
quering replica.

The first thing to do here is to install amcheck on the replica, and
verify that the indexes are consistent with the replica's own notion
of how text sorts for the collation that is implicitly in use (the
database default):

https://www.postgresql.org/docs/current/static/amcheck.html

You can modify the example query to check the indexes that you're
interested in. I think that there is a very strong chance that the
replica has incompatible collation rules, given that it uses a totally
different OS.

--
Peter Geoghegan

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrey Lizenko (#1)
Re: Replica string comparsion issue

Andrey Lizenko <lizenko79@gmail.com> writes:

I'm observing strange behaviour on comparing ::text field with string while
quering replica.

These symptoms seem consistent with the theory that the replica's index
for that column (symbols_aggregates_uniq) is corrupt. I think your
casts etc are just serving to defeat selection of an indexscan.

Why it's corrupt, I dunno, but you might try forcing a reindex on the
master.

regards, tom lane

#4Andrey Lizenko
lizenko79@gmail.com
In reply to: Tom Lane (#3)
Re: Replica string comparsion issue

Re-indexing didn't help. Will check via amcheck.

Another point regarding collations: query with ORDER BY returns different
record set on master and on replica if more than one column used:

Master:

=# select date_started, symbol from raw.symbols_aggregates order by 1,2
limit 5;
date_started | symbol
--------------+---------
1970-01-01 | AADR.EU
1970-01-01 | AADR.IV
1970-01-01 | AADR.NV
1970-01-01 | AADR.SO
1970-01-01 | AADR.TC

Replica:

=# select date_started, symbol from raw.symbols_aggregates order by 1,2
limit 5;
date_started | symbol
--------------+--------------
1970-01-01 | /EPRH18:XMON
1970-01-01 | /EPRM18:XMON
1970-01-01 | /FCEF18:XMON
1970-01-01 | /FCEG18:XMON
1970-01-01 | /FCEH18:XMON

On Wed, Jun 13, 2018 at 10:38 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Andrey Lizenko <lizenko79@gmail.com> writes:

I'm observing strange behaviour on comparing ::text field with string

while

quering replica.

These symptoms seem consistent with the theory that the replica's index
for that column (symbols_aggregates_uniq) is corrupt. I think your
casts etc are just serving to defeat selection of an indexscan.

Why it's corrupt, I dunno, but you might try forcing a reindex on the
master.

regards, tom lane

--
Regards, Andrei Lizenko

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrey Lizenko (#4)
Re: Replica string comparsion issue

Andrey Lizenko <lizenko79@gmail.com> writes:

Another point regarding collations: query with ORDER BY returns different
record set on master and on replica if more than one column used:

When I saw Peter's reply, I figured he'd nailed it, and this seems to
be more evidence pointing in that direction. If the standby's OS
sorts strings differently than the master does, the index will seem
corrupt to it, even if it's fine on the master.

regards, tom lane