Obvious data mismatch in View2 which basically SELECT * from View1
Dear list,
Recently I have been getting feedback that the data in my analytic report is
not repeatable: from time to time, users get different data for the same time
span (IIRC it used to be OK). I therefore started debugging the view
chain for that report and ran into the following issue.
In an oversimplified version:
CREATE VIEW VIEW2 AS SELECT * FROM VIEW1;
SELECT col1 FROM VIEW2 WHERE cond1=True;
SELECT col1 FROM VIEW1 WHERE cond1=True;
Now col1 looks different in the two views, and I don't know where to start
to solve this problem.
The actual situation is a bit more involved; the following is the
actual query:
-- trying to audit utlog weighed stat
with t as (
    select '2020-07-01 00:00:00'::timestamp t0,
           '2020-07-02 00:00:00'::timestamp t1
)
--select * from t;
select *
-- from utlog.cache_stats_per_shift_per_reason_weighed_stats
-- from utlog.stats_per_shift_filtered_per_reason
from utlog.stats_per_shift_filtered            -- let's call it #View2 for short
-- from utlog.stats_per_shift_filtered_b0206   (let's call it #View1 for short)
-- from utlog.stats_per_shift
cross join t
where wline = 'F02' and wts >= t.t0 and wts < t.t1 and wsft = 'D'
limit 100
;
The Result for #View2
wts                 | wsft | wspan  | wstate | wline | rcodes
--------------------+------+--------+--------+-------+--------
2020-07-01 08:00:00 | D    |      0 | S00    | F02   | {PDCB}
2020-07-01 09:50:01 | D    | 12.533 | S00    | F02   | {PDCB}
2020-07-01 11:35:46 | D    | 12.217 | S00    | F02   | {CDSO}
2020-07-01 13:22:58 | D    |   5.15 | S00    | F02   | {PDCB}
2020-07-01 14:57:38 | D    |    6.8 | S00    | F02   | {PDCB}

INDEX | COLUMN_NAME | DATA_TYPE
------+-------------+------------
1     | wts         | timestamptz
3     | wsft        | varchar
4     | wspan       | float8
5     | wstate      | varchar
6     | wline       | varchar
7     | rcodes      | text[]
Same query, the Result for #View1
wts                 | wsft | wspan | wstate | wline | rcodes
--------------------+------+-------+--------+-------+--------
2020-07-01 08:00:00 | D    |     5 | S00    | F02   | {PDCB}
2020-07-01 09:50:01 | D    |    13 | S00    | F02   | {PDCB}
2020-07-01 11:35:46 | D    |    12 | S00    | F02   | {CDSO}
2020-07-01 13:22:58 | D    |     5 | S00    | F02   | {PDCB}
2020-07-01 14:57:38 | D    |     7 | S00    | F02   | {PDCB}

INDEX | COLUMN_NAME | DATA_TYPE
------+-------------+------------
1     | wts         | timestamptz
3     | wsft        | varchar
4     | wspan       | float8
5     | wstate      | varchar
6     | wline       | varchar
7     | rcodes      | varchar[]
The results in the `wspan` column differ even though both columns are float8.
The weirdest thing is the change from 5 to 0 in row 1.
The `_b0206`(#View1) is just a version of
`stats_per_shift_filtered`(#View2) from past revisions.
I am sure the original CREATE statement for (#View2) is `CREATE VIEW ...
AS SELECT * FROM ...._b0206`
Definition of View2 in SQLWorkbench/J generated schema:
CREATE OR REPLACE VIEW utlog.stats_per_shift_filtered  -- #View2
(
    wts,
    wdate,
    wsft,
    wspan,
    wstate,
    wline,
    rcodes
)
AS
SELECT stats_per_shift_filtered_u0206.wts,
       stats_per_shift_filtered_u0206.wsft::character varying AS wsft,
       stats_per_shift_filtered_u0206.wspan,
       stats_per_shift_filtered_u0206.wstate,
       stats_per_shift_filtered_u0206.wline,
       stats_per_shift_filtered_u0206.rcodes
FROM utlog.stats_per_shift_filtered_u0206;  -- referred to as #View1 in this post
It feels like the utlog.stats_per_shift_filtered_u0206 in
utlog.stats_per_shift_filtered definition is a different object from
utlog.stats_per_shift_filtered_u0206?
I am totally out of clues. Any help would be appreciated. Thanks.
Regards,
Ben
On Wed, Sep 16, 2020 at 9:26 AM Ben <bentenzha@outlook.com> wrote:
> [...]
> select *
> from utlog.stats_per_shift_filtered
> cross join t
> where wline = 'F02' and wts >= t.t0 and wts < t.t1 and wsft ='D'
> limit 100
> ;
Not sure if it might be something lost in your simplification here, but you
have a LIMIT with no ORDER BY there. That basically means "give me 100
random rows" (but not with a very good random level). It does not return
rows in a consistent/predictable order. So as long as that query is part of
what you're doing, you should not be surprised if you get the rows in an
inconsistent/unpredictable order, with whatever follow-on effects that
might have. (And it can lead to weird follow-on effects like the ones
you're talking about when used in larger query structures)
--
Magnus Hagander
Me: https://www.hagander.net/ <http://www.hagander.net/>
Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/>
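Magnus's point about LIMIT without ORDER BY can be illustrated with a minimal sketch. It reuses the filter from the posted query and adds an explicit ordering. Ordering by wts is an assumption made for illustration; if wts is not unique within the filtered rows, a tie-breaker column would be needed to make the order fully deterministic.

```sql
-- Sketch only: same filter as the posted query, plus an explicit ORDER BY
-- so that LIMIT is applied to rows in a well-defined order.
WITH t AS (
    SELECT '2020-07-01 00:00:00'::timestamp AS t0,
           '2020-07-02 00:00:00'::timestamp AS t1
)
SELECT v.*
FROM utlog.stats_per_shift_filtered AS v
CROSS JOIN t
WHERE v.wline = 'F02'
  AND v.wts >= t.t0
  AND v.wts <  t.t1
  AND v.wsft = 'D'
ORDER BY v.wts   -- assumed ordering column, not taken from the thread
LIMIT 100;
```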
On 9/15/20 10:40 PM, Ben wrote:
> [...]
> In an oversimplified version:
> CREATE VIEW VIEW2 AS SELECT * FROM VIEW1;
> SELECT col1 FROM VIEW2 WHERE cond1=True;
> SELECT col1 FROM VIEW1 WHERE cond1=True;
> Now col1 from both views looks different.
Try the queries in a serializable read only transaction. That should rule out
any possible changes in the underlying data.
START TRANSACTION ISOLATION LEVEL SERIALIZABLE READ ONLY;
SELECT col1 FROM VIEW2 WHERE cond1=True;
SELECT col1 FROM VIEW1 WHERE cond1=True;
COMMIT;
--
Angular momentum makes the world go 'round.
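The same idea could be applied to the real views from the original post. The following is only a sketch: it assumes the four compared columns have compatible types in both views, and it leaves out rcodes because the two views report different array types for it. Inside one read-only transaction, EXCEPT lists the rows of #View2 that have no exact match in #View1.

```sql
START TRANSACTION ISOLATION LEVEL SERIALIZABLE READ ONLY;

-- Rows returned by #View2 that have no identical row in #View1.
SELECT wts, wsft, wline, wspan
FROM utlog.stats_per_shift_filtered
WHERE wline = 'F02' AND wsft = 'D'
  AND wts >= '2020-07-01 00:00:00' AND wts < '2020-07-02 00:00:00'
EXCEPT
SELECT wts, wsft, wline, wspan
FROM utlog.stats_per_shift_filtered_b0206
WHERE wline = 'F02' AND wsft = 'D'
  AND wts >= '2020-07-01 00:00:00' AND wts < '2020-07-02 00:00:00'
ORDER BY wts;

COMMIT;
```

An empty result would mean that every #View2 row also appears in #View1 for these columns; swapping the two halves checks the other direction.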
Hi Magnus,
Thanks for the heads up.
For the data posted in this email, the listed result is the full result
set for that query.
I have checked the data in view2 and view1 under various conditions.
The two views differ in many rows in column wspan (float8).
The condition in the query shown is only intended to pick out a small
sample of them.
But you are right, I should be more cautious. Thanks again.
Regards,
Ben
Dear List,
Some further investigation.
Creating a fresh View3 on View1 gives exactly the same result as View1.
View1 and View2 are both years old and live in a production database that has been in use for quite some time. (The database is on production duty but is not hosted in a server room with a UPS; it is more like an edge PC in industrial monitoring. I am now more concerned about its data integrity.)
The problem with the final report was reported only recently. I am not sure what is broken in the database.
I haven't replaced the broken View2 yet. I hope someone can point me toward further investigation.
My concern is that other views in that database may have a similar integrity issue. How can I find them all (if any)?
It's beyond my regular SQL ability; I guess I really need help from people with maintenance experience.
Any help will be appreciated. Thanks in advance.
Ben
> I haven't replaced the broken View2 yet. Hope someone can point me to some further investigation.
Did you look at the actual definition of view2? Like other RDBMSs, PG stores the view definition as it was resolved at creation time (a `SELECT *` is expanded into an explicit column list) and inlines it as SQL when the view is queried.
It may be different from what you think.
Ben <bentenzha@outlook.com> writes:
> [...]
> Creating a fresh View3 on View1 gives exactly the same result as View1.
> [...]
> I haven't replaced the broken View2 yet. Hope someone can point me to
> some further investigation.
Already mentioned downthread, but have a look at the view definitions
by...
select pg_get_viewdef('$your-view');
Or...
pg_dump --table $your-view
Expect to see something different if you repeat the above for the old
view giving undesired results and the new correct version.
HTH
--
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consulting@comcast.net
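For the question of how to review every view in the database rather than one at a time, a possible starting point is the pg_views system view, which exposes the stored definition text of each view. This is a sketch, not a complete integrity check: the schema name utlog is taken from the thread, and what counts as a suspicious definition still has to be judged by reading the output.

```sql
-- List the stored definition of every view in the utlog schema
-- so they can be read, saved, or diffed against expectations.
SELECT schemaname,
       viewname,
       definition
FROM pg_views
WHERE schemaname = 'utlog'
ORDER BY viewname;
```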
Hi Sievers, Krishna,
You are right: none of them is what I originally used to create the views,
which should have been
`select * from utlog.stats_per_shift_filtered_b0206`. The two definitions
do look different, though.
Maybe the previous version was created before a PG update? b0206 could be
20190206 or 20180206...
The output:
lets_db=# select pg_get_viewdef('utlog.stats_per_shift_filtered');
pg_get_viewdef
-----------------------------------------------------------------------
SELECT stats_per_shift_filtered_u0206.wts, +
stats_per_shift_filtered_u0206.wdate, +
(stats_per_shift_filtered_u0206.wsft)::character varying AS wsft,+
stats_per_shift_filtered_u0206.wspan, +
stats_per_shift_filtered_u0206.wstate, +
stats_per_shift_filtered_u0206.wline, +
stats_per_shift_filtered_u0206.rcodes +
FROM utlog.stats_per_shift_filtered_u0206;
(1 row)
(result for the same query; wspan is the column in question:
 wts                    | wdate      | wsft | wspan            | wstate | wline
------------------------+------------+------+------------------+--------+-------
 2020-07-01 14:57:38+08 | 2020-07-01 | D    |              6.8 | S00    | F02
 2020-07-01 13:22:58+08 | 2020-07-01 | D    |             5.15 | S00    | F02
 2020-07-01 11:35:46+08 | 2020-07-01 | D    | 12.2166666666667 | S00    | F02
 2020-07-01 09:50:01+08 | 2020-07-01 | D    | 12.5333333333333 | S00    | F02
 2020-07-01 08:00:00+08 | 2020-07-01 | D    |                0 | S00    | F02
(5 rows)
)
lets_db=# select pg_get_viewdef('utlog.view_test1');
pg_get_viewdef
-----------------------------------------------
SELECT stats_per_shift_filtered_b0206.wts, +
stats_per_shift_filtered_b0206.wdate, +
stats_per_shift_filtered_b0206.wsft, +
stats_per_shift_filtered_b0206.wspan, +
stats_per_shift_filtered_b0206.wstate, +
stats_per_shift_filtered_b0206.wline, +
stats_per_shift_filtered_b0206.rcodes +
FROM utlog.stats_per_shift_filtered_b0206;
(1 row)
(the result for the same query:
 wts                    | wdate      | wsft | wspan | wstate | wline | rcodes
------------------------+------------+------+-------+--------+-------+--------
 2020-07-01 09:50:01+08 | 2020-07-01 | D    |    13 | S00    | F02   | {PDCB}
 2020-07-01 11:35:46+08 | 2020-07-01 | D    |    12 | S00    | F02   | {CDSO}
 2020-07-01 14:57:38+08 | 2020-07-01 | D    |     7 | S00    | F02   | {PDCB}
 2020-07-01 08:00:00+08 | 2020-07-01 | D    |     5 | S00    | F02   | {PDCB}
 2020-07-01 13:22:58+08 | 2020-07-01 | D    |     5 | S00    | F02   | {PDCB}
(5 rows)
)
The values returned look different, but the
definition of the column in question (wspan::float8) looks identical in
both cases.
Regards,
Ben
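The two definitions shown above read from differently named relations (`_u0206` and `_b0206`), so one further check is to look at those relations themselves. The sketch below assumes both names exist in the utlog schema; whether they are views or tables is not stated in the thread, which is why relkind is checked first ('v' is a view, 'r' an ordinary table, 'm' a materialized view).

```sql
-- What kind of object is each underlying relation?
SELECT c.oid::regclass AS relation,
       c.relkind
FROM pg_class AS c
WHERE c.oid IN ('utlog.stats_per_shift_filtered_u0206'::regclass,
                'utlog.stats_per_shift_filtered_b0206'::regclass);

-- If both are views, print their stored definitions for comparison.
SELECT pg_get_viewdef('utlog.stats_per_shift_filtered_u0206'::regclass, true);
SELECT pg_get_viewdef('utlog.stats_per_shift_filtered_b0206'::regclass, true);
```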
Ben <bentenzha@outlook.com> writes:
> You are right none of them is what I originally used to create them
> which should be
> `select * from utlog.stats_per_shift_filtered_b0206`, but they do look
> different,
This discrepancy could explain a lot:
(stats_per_shift_filtered_u0206.wsft)::character varying AS wsft,
vs
stats_per_shift_filtered_b0206.wsft,
Your problem query has a WHERE clause testing wsft, so this difference
is subtly affecting what that WHERE clause means. That could result in
a plan change, thereby changing the row output order, which as already
noted would be sufficient to explain the inconsistency.
If I had to bet I'd suspect that the extra cast is blocking use of an
index on wsft.
regards, tom lane
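Tom's hypothesis about a plan change could be checked by comparing the plans for the same filter on both views. This is a minimal sketch using the view names and predicates from the thread. Plain EXPLAIN only displays the plan and does not execute the query; whether an index on wsft exists at all is not known from the thread.

```sql
-- Plan for the view whose definition casts wsft (#View2).
EXPLAIN
SELECT *
FROM utlog.stats_per_shift_filtered
WHERE wline = 'F02' AND wsft = 'D';

-- Plan for the older view without the cast.
EXPLAIN
SELECT *
FROM utlog.stats_per_shift_filtered_b0206
WHERE wline = 'F02' AND wsft = 'D';
```

Differences in scan type or in how the wsft condition is applied between the two plans would support the explanation given above.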