Re: Obvious data mismatch in View2 which basically SELECT * from View1 - Mailing list pgsql-general

From Ben
Subject Re: Obvious data mismatch in View2 which basically SELECT * from View1
Msg-id MWHPR06MB24005D602C007214E610B44FB93E0@MWHPR06MB2400.namprd06.prod.outlook.com
In response to Obvious data mismatch in View2 which basically SELECT * from View1  (Ben <bentenzha@outlook.com>)
List pgsql-general
Dear List,

Some further investigation.

Creating a fresh View3 on View1 gives exactly the same result as View1.

View1 and View2 are both years old and have been in use in a production database for quite some time. (The database does production duty but is not hosted in a server room with a UPS; it runs on something like an edge PC for industrial monitoring. I am now more concerned about its data integrity.)

The problem with the final report was only reported recently. I am not sure what is broken in the database.

I haven't replaced the broken View2 yet. I hope someone can point me toward what to investigate next.

My concern is that other views in that database may have a similar integrity issue. If so, how can I find them all?
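
One possible starting point, offered as an untested sketch rather than a known fix: list every view together with the relations its stored rewrite rule depends on, so each view-on-view link can be checked by hand. It uses only the standard catalogs pg_depend, pg_rewrite, pg_class and pg_namespace; the output column aliases are made up for illustration.

```sql
-- Sketch: for every view, show which tables/views its stored definition references.
-- Output column aliases (view_schema, source_name, ...) are illustrative only.
SELECT DISTINCT
    dep_ns.nspname AS view_schema,
    dep.relname    AS view_name,
    src_ns.nspname AS source_schema,
    src.relname    AS source_name,
    src.relkind    AS source_kind      -- 'r' = table, 'v' = view, 'm' = materialized view
FROM pg_depend d
JOIN pg_rewrite   r      ON r.oid = d.objid
JOIN pg_class     dep    ON dep.oid = r.ev_class
JOIN pg_class     src    ON src.oid = d.refobjid
JOIN pg_namespace dep_ns ON dep_ns.oid = dep.relnamespace
JOIN pg_namespace src_ns ON src_ns.oid = src.relnamespace
WHERE d.classid    = 'pg_rewrite'::regclass
  AND d.refclassid = 'pg_class'::regclass
  AND dep.relkind IN ('v', 'm')
  AND dep.oid <> src.oid               -- skip the rule's link to its own view
ORDER BY view_schema, view_name, source_schema, source_name;
```

Rows whose source_kind is 'v' are the view-on-view links; each of those pairs could then be compared the same way View1 and View2 were compared here.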

It's beyond my regular SQL ability. I guess I really need help from people with maintenance experience.

Any help will be appreciated, thanks in advance.

Ben



On September 16, 2020 3:40:34 AM UTC, Ben <bentenzha@outlook.com> wrote:
Dear list,

Recently I have been getting feedback that the data in my analytic report is not
repeatable: from time to time users get different data for the same time span
(although IIRC it was OK previously). I therefore started debugging the view
chain behind that report, and in the process ran into this issue.

In an oversimplified version:

CREATE VIEW VIEW2 AS SELECT * FROM VIEW1;
SELECT  col1 FROM VIEW2 WHERE cond1=True;
SELECT  col1 FROM VIEW1 WHERE cond1=True;

Now col1 looks different in the two views. I don't know where to start to
solve this problem.

The actual situation is a bit more involved. The following is the
actual query:


    -- trying to audit utlog weighed stat
    with t as (
    select '2020-07-01 00:00:00'::timestamp t0, '2020-07-02 0:0:0'::timestamp t1
    )
    --select * from t;
    select *
    -- from utlog.cache_stats_per_shift_per_reason_weighed_stats
    -- from utlog.stats_per_shift_filtered_per_reason
    from utlog.stats_per_shift_filtered              -- let's call it #View2 for short
    -- from utlog.stats_per_shift_filtered_b0206     (let's call it #View1 for short)
    -- from utlog.stats_per_shift
    cross join t
    where wline = 'F02'  and wts >= t.t0 and wts < t.t1 and wsft ='D'
    limit 100
    ;

The Result for #View2

    wts                 | wsft | wspan  | wstate | wline | rcodes
    --------------------+------+--------+--------+-------+-------
    2020-07-01 08:00:00 | D    |      0 | S00    | F02   | {PDCB}
    2020-07-01 09:50:01 | D    | 12.533 | S00    | F02   | {PDCB}
    2020-07-01 11:35:46 | D    | 12.217 | S00    | F02   | {CDSO}
    2020-07-01 13:22:58 | D    |   5.15 | S00    | F02   | {PDCB}
    2020-07-01 14:57:38 | D    |    6.8 | S00    | F02   | {PDCB}

    INDEX | COLUMN_NAME | DATA_TYPE
    ------+-------------+------------
    1     | wts         | timestamptz
    3     | wsft        | varchar
    4     | wspan       | float8
    5     | wstate      | varchar
    6     | wline       | varchar
    7     | rcodes      | text[]


Same query, the Result for #View1

    wts                 | wsft | wspan | wstate | wline | rcodes
    --------------------+------+-------+--------+-------+-------
    2020-07-01 08:00:00 | D    |     5 | S00    | F02   | {PDCB}
    2020-07-01 09:50:01 | D    |    13 | S00    | F02   | {PDCB}
    2020-07-01 11:35:46 | D    |    12 | S00    | F02   | {CDSO}
    2020-07-01 13:22:58 | D    |     5 | S00    | F02   | {PDCB}
    2020-07-01 14:57:38 | D    |     7 | S00    | F02   | {PDCB}

    INDEX | COLUMN_NAME | DATA_TYPE
    ------+-------------+------------
    1     | wts         | timestamptz
    3     | wsft        | varchar
    4     | wspan       | float8
    5     | wstate      | varchar
    6     | wline       | varchar
    7     | rcodes      | varchar[]

The results in the `wspan` column do not match, although both columns are of
type float8. The weirdest part is the change from 5 to 0 in row 1.
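
To see every differing row rather than eyeballing the first few, the two views could be diffed with EXCEPT. This is only a sketch: it reuses the filter from the query above, compares just the columns whose types match in both views (rcodes is left out because it is text[] in one and varchar[] in the other), and the "side" label is made up for illustration.

```sql
-- Sketch: rows present in one view but not in the other, for the same filter.
(SELECT 'View1 only' AS side, v1.*
 FROM (SELECT wts, wsft, wspan, wstate, wline
       FROM utlog.stats_per_shift_filtered_b0206
       WHERE wline = 'F02' AND wsft = 'D'
         AND wts >= '2020-07-01 00:00:00' AND wts < '2020-07-02 00:00:00'
       EXCEPT
       SELECT wts, wsft, wspan, wstate, wline
       FROM utlog.stats_per_shift_filtered
       WHERE wline = 'F02' AND wsft = 'D'
         AND wts >= '2020-07-01 00:00:00' AND wts < '2020-07-02 00:00:00') v1)
UNION ALL
(SELECT 'View2 only' AS side, v2.*
 FROM (SELECT wts, wsft, wspan, wstate, wline
       FROM utlog.stats_per_shift_filtered
       WHERE wline = 'F02' AND wsft = 'D'
         AND wts >= '2020-07-01 00:00:00' AND wts < '2020-07-02 00:00:00'
       EXCEPT
       SELECT wts, wsft, wspan, wstate, wline
       FROM utlog.stats_per_shift_filtered_b0206
       WHERE wline = 'F02' AND wsft = 'D'
         AND wts >= '2020-07-01 00:00:00' AND wts < '2020-07-02 00:00:00') v2)
ORDER BY wts, side;
```

An empty result would mean the two views agree on these columns for that day; any rows returned show exactly where they diverge.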

`_b0206` (#View1) is just a version of
`stats_per_shift_filtered` (#View2) from a past revision.
I am sure the original CREATE statement for #View2 was `CREATE VIEW ...
AS SELECT * FROM ...._b0206`.

Definition of View2 in the schema generated by SQL Workbench/J:


    CREATE OR REPLACE VIEW utlog.stats_per_shift_filtered   -- (#View2)
    (
    wts,
    wdate,
    wsft,
    wspan,
    wstate,
    wline,
    rcodes
    )
    AS
    SELECT stats_per_shift_filtered_u0206.wts,
        stats_per_shift_filtered_u0206.wsft::character varying AS wsft,
        stats_per_shift_filtered_u0206.wspan,
        stats_per_shift_filtered_u0206.wstate,
        stats_per_shift_filtered_u0206.wline,
        stats_per_shift_filtered_u0206.rcodes
    FROM utlog.stats_per_shift_filtered_u0206;   -- (as #View1 in this post)


It feels as if the utlog.stats_per_shift_filtered_u0206 referenced inside the
utlog.stats_per_shift_filtered definition is a different object from the
utlog.stats_per_shift_filtered_u0206 that I query directly?
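
One way this suspicion might be checked, as a sketch only: ask the server for the definition it actually has stored for #View2 (instead of the tool-generated DDL), and list every relation in the schema with a similar name along with its OID. Both the `_b0206` and `_u0206` suffixes appear in this post, so the listing would also show whether those are two distinct objects. The LIKE pattern is an assumption about the naming.

```sql
-- Sketch: the definition of #View2 as stored by the server.
SELECT pg_get_viewdef('utlog.stats_per_shift_filtered'::regclass, true);

-- Sketch: every similarly named relation in schema utlog, with its OID and kind.
-- The LIKE pattern is an assumption about how the revisions are named.
SELECT c.oid, n.nspname, c.relname, c.relkind
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'utlog'
  AND c.relname LIKE 'stats_per_shift_filtered%'
ORDER BY c.relname;
```

A view keeps referring to its source relation by OID, so if the source was renamed or replaced after the view was created, the name shown in the stored definition may not be the object one expects.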

I am totally out of clues. Any help would be appreciated. Thanks.


Regards,

Ben



--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
