Re: Obvious data mismatch in View2 which basically SELECT * from View1 - Mailing list pgsql-general
From | Ben |
---|---|
Subject | Re: Obvious data mismatch in View2 which basically SELECT * from View1 |
Date | |
Msg-id | MWHPR06MB24005D602C007214E610B44FB93E0@MWHPR06MB2400.namprd06.prod.outlook.com Whole thread Raw |
In response to | Obvious data mismatch in View2 which basically SELECT * from View1 (Ben <bentenzha@outlook.com>) |
List | pgsql-general |
Dear List,
Some further investigation.
Creating a fresh View3 on View1 gives exactly the same result as View1.
The View1 View2 are both years old in a production database, in use for quite some time. (The database is production duty but not hosted in server room with UPS. It's like a edge PC in industry monitoring. Now am more concerned with its data integrity)
The problem with the final report is reported recently. I am not sure what's broken in the database.
I haven't replaced the broken View2 yet. Hope someone can point me to some further investigation.
My concern is that if there are other views inside that database having similar integrity issue, how can I find them all (if any).
It's beyond my regular SQL ability. I guess I really need help from people with maintenance experience.
Any help will be appreciated, thanks in advance.
Ben
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
Some further investigation.
Creating a fresh View3 on View1 gives exactly the same result as View1.
The View1 View2 are both years old in a production database, in use for quite some time. (The database is production duty but not hosted in server room with UPS. It's like a edge PC in industry monitoring. Now am more concerned with its data integrity)
The problem with the final report is reported recently. I am not sure what's broken in the database.
I haven't replaced the broken View2 yet. Hope someone can point me to some further investigation.
My concern is that if there are other views inside that database having similar integrity issue, how can I find them all (if any).
It's beyond my regular SQL ability. I guess I really need help from people with maintenance experience.
Any help will be appreciated, thanks in advance.
Ben
On September 16, 2020 3:40:34 AM UTC, Ben <bentenzha@outlook.com> wrote:
Dear list,
Recently I am getting feedback, data in my analytic report is not
repeatable. From time to time they get different data for the same time
span.
(but IIRC previously it was OK). Therefore I started debuging the View
chain for that report, during which I bumped into this issue/phenomenon.
In a over -simplified version:
CREATE VIEW2 AS SELECT * FROM VIEW1;
SELECT col1 FROM VIEW2 WHERE cond1=True;
SELECT col1 FROM VIEW1 WHERE cond1=True;
Now col1 from both views looks different. I don't know where to start to
solve this problem.
The actual situation is a bit more than that, the following is the
actual query:
-- trying to audit utlog weighed stat
with t as (
select '2020-07-01 00:00:00'::timestamp t0, '2020--07-02
0:0:0'::timestamp t1
)
--select * from t;
select *
-- from utlog.cache_stats_per_shift_per_reason_weighed_stats
-- from utlog.stats_per_shift_filtered_per_reason
from utlog.stats_per_shift_filtered (let's call
it #View2 for short)
-- from utlog.stats_per_shift_filtered_b0206 (let's call it
#View1 for short)
-- from utlog.stats_per_shift
cross join t
where wline = 'F02' and wts >= t.t0 and wts < t.t1 and wsft ='D'
limit 100
;
The Result for #View2
wts | wsft | wspan | wstate | wline | rcodes
--------------------+------+--------+--------+-------+-------
2020-07-01 08:00:00 | D | 0 | S00 | F02 | {PDCB}
2020-07-01 09:50:01 | D | 12.533 | S00 | F02 | {PDCB}
2020-07-01 11:35:46 | D | 12.217 | S00 | F02 | {CDSO}
2020-07-01 13:22:58 | D | 5.15 | S00 | F02 | {PDCB}
2020-07-01 14:57:38 | D | 6.8 | S00 | F02 | {PDCB}
INDEX | COLUMN_NAME | DATA_TYPE
------+-------------+------------
1 | wts | timestamptz
3 | wsft | varchar
4 | wspan | float8
5 | wstate | varchar
6 | wline | varchar
7 | rcodes | text[]
Same query, the Result for #View1
wts | wsft | wspan | wstate | wline | rcodes
--------------------+------+-------+--------+-------+-------
2020-07-01 08:00:00 | D | 5 | S00 | F02 | {PDCB}
2020-07-01 09:50:01 | D | 13 | S00 | F02 | {PDCB}
2020-07-01 11:35:46 | D | 12 | S00 | F02 | {CDSO}
2020-07-01 13:22:58 | D | 5 | S00 | F02 | {PDCB}
2020-07-01 14:57:38 | D | 7 | S00 | F02 | {PDCB}
INDEX | COLUMN_NAME | DATA_TYPE
------+-------------+------------
1 | wts | timestamptz
3 | wsft | varchar
4 | wspan | float8
5 | wstate | varchar
6 | wline | varchar
7 | rcodes | varchar[]
Reuslts in `wspan` column is inaccurate while both type are float8. Most
weird thing is the 5 to 0 change. for Row 1.
The `_b0206`(#View1) is just a version of
`stats_per_shift_filtered`(#View2) from past revisions.
I am sure the original CREATE statement for (#View2) is `CREATE VIEW ...
AS SELECT * FROM ...._b0206`
Definition of View2 in SQLWorkbench/J generated schema:
CREATE OR REPLACE VIEW utlog.stats_per_shift_filtered (#View2)
(
wts,
wdate,
wsft,
wspan,
wstate,
wline,
rcodes
)
AS
SELECT stats_per_shift_filtered_u0206.wts,
stats_per_shift_filtered_u0206.wsft::character varying AS wsft,
stats_per_shift_filtered_u0206.wspan,
stats_per_shift_filtered_u0206.wstate,
stats_per_shift_filtered_u0206.wline,
stats_per_shift_filtered_u0206.rcodes
FROM utlog.stats_per_shift_filtered_u0206; (as #View1 in this post)
It feels like the utlog.stats_per_shift_filtered_u0206 in
utlog.stats_per_shift_filtered definition is a different object from
utlog.stats_per_shift_filtered_u0206?
I am totally out of clues. Any help would be appreciated. Thanks.
Regards,
Ben
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
pgsql-general by date: