Re: Obvious data mismatch in View2 which basically SELECT * from View1 - Mailing list pgsql-general

From Ben
Subject Re: Obvious data mismatch in View2 which basically SELECT * from View1
Date
Msg-id MWHPR06MB2400E1FA4AAD12440EA96B79B93E0@MWHPR06MB2400.namprd06.prod.outlook.com
Whole thread Raw
In response to Re: Obvious data mismatch in View2 which basically SELECT * from View1  (Jerry Sievers <gsievers19@comcast.net>)
Responses Re: Obvious data mismatch in View2 which basically SELECT * from View1
List pgsql-general
Hi Sievers,  Krishna


You are right none of them is what I originally used to create them 
which should be
`select * from utlog.stats_per_shift_filtered_b0206`, but they do look 
different,
maybe the prev version is created before a pg update ? b0206 could be 
20190206 or 20180206...

the output:


lets_db=#     select pg_get_viewdef('utlog.stats_per_shift_filtered');
                             pg_get_viewdef
-----------------------------------------------------------------------
   SELECT stats_per_shift_filtered_u0206.wts,                          +
stats_per_shift_filtered_u0206.wdate,                            +
      (stats_per_shift_filtered_u0206.wsft)::character varying AS wsft,+
stats_per_shift_filtered_u0206.wspan,                            +
stats_per_shift_filtered_u0206.wstate,                           +
stats_per_shift_filtered_u0206.wline,                            +
stats_per_shift_filtered_u0206.rcodes                            +
     FROM utlog.stats_per_shift_filtered_u0206;
(1 row)

(result for the same query, wspan is the column in question:

  wts           |   wdate    | wsft |      wspan       | wstate | wlin
------------------------+------------+------+------------------+--------+-----
2020-07-01 14:57:38+08 | 2020-07-01 | D    |              6.8 | S00    | F02
  2020-07-01 13:22:58+08 | 2020-07-01 | D    |             5.15 | S00    
| F02
2020-07-01 11:35:46+08 | 2020-07-01 | D    | 12.2166666666667 | S00    | F02
2020-07-01 09:50:01+08 | 2020-07-01 | D    | 12.5333333333333 | S00    | F02
2020-07-01 08:00:00+08 | 2020-07-01 | D    |                0 | S00    | F02
(5 rows) )

lets_db=#     select pg_get_viewdef('utlog.view_test1');
                 pg_get_viewdef
-----------------------------------------------
   SELECT stats_per_shift_filtered_b0206.wts,  +
      stats_per_shift_filtered_b0206.wdate,    +
      stats_per_shift_filtered_b0206.wsft,     +
      stats_per_shift_filtered_b0206.wspan,    +
      stats_per_shift_filtered_b0206.wstate,   +
      stats_per_shift_filtered_b0206.wline,    +
      stats_per_shift_filtered_b0206.rcodes    +
     FROM utlog.stats_per_shift_filtered_b0206;
(1 row)

(the result for the same query:

wts |   wdate    | wsft | wspan | wstate | wline | rcodes
------------------------+------------+------+-------+--------+-------+----------------+---------------------
  2020-07-01 09:50:01+08 | 2020-07-01 | D    |    13 | S00    | F02   | 
{PDCB}
2020-07-01 11:35:46+08 | 2020-07-01 | D    |    12 | S00    | F02   | {CDSO}
2020-07-01 14:57:38+08 | 2020-07-01 | D    |     7 | S00    | F02   | {PDCB}
2020-07-01 08:00:00+08 | 2020-07-01 | D    |     5 | S00    | F02   | {PDCB}
2020-07-01 13:22:58+08 | 2020-07-01 | D    |     5 | S00    | F02   | {PDCB}
(5 rows)
)

The result in returned column looks different but

definition of the column in question (wspan::float8) looks identical in 
both case.


Regards,

Ben


On 9/17/20 10:41 PM, Jerry Sievers wrote:
> Ben <bentenzha@outlook.com> writes:
>
>> Dear List,
>>
>> Some further investigation.
>>
>> Creating a fresh View3 on View1 gives exactly the same result as
>> View1.
>>
>> The View1 View2 are both years old in a production database, in use
>> for quite some time. (The database is production duty but not hosted
>> in server room with UPS. It's like a edge PC in industry monitoring.
>> Now am more concerned with its data integrity)
>>
>> The problem with the final report is reported recently. I am not sure
>> what's broken in the database.
>>
>> I haven't replaced the broken View2 yet. Hope someone can point me to
>> some further investigation.
> Already mentioned downthread, but have a look at the view definitions
> by...
>
> select pg_get_viewdef('$your-view');
>
> Or...
>
> pg_dump --table $your-view
>
> Expect to see something different if you repeat the above for the old
> view giving undesired results and the new correct version.
>
> HTH
>
>
>
>> My concern is that if there are other views inside that database
>> having similar integrity issue, how can I find them all (if any).
>>
>> It's beyond my regular SQL ability. I guess I really need help from
>> people with maintenance experience.
>>
>> Any help will be appreciated, thanks in advance.
>>
>> Ben
>>
>>
>>
>> On September 16, 2020 3:40:34 AM UTC, Ben <bentenzha@outlook.com>
>> wrote:
>>
>>      Dear list,
>>      
>>      Recently I am getting feedback, data in my analytic report is not
>>      repeatable. From time to time they get different data for the same time
>>      span.
>>      (but IIRC previously it was OK). Therefore I started debuging the View
>>      chain for that report, during which I bumped into this issue/phenomenon.
>>      
>>      In a over -simplified version:
>>      
>>      CREATE VIEW2 AS SELECT * FROM VIEW1;
>>      SELECT  col1 FROM VIEW2 WHERE cond1=True;
>>      SELECT  col1 FROM VIEW1 WHERE cond1=True;
>>      
>>      Now col1 from both views looks different. I don't know where to start to
>>      solve this problem.
>>      
>>      The actual situation is a bit more than that, the following is the
>>      actual query:
>>      
>>      
>>           -- trying to audit utlog weighed stat
>>           with t as (
>>           select '2020-07-01 00:00:00'::timestamp t0, '2020--07-02
>>      0:0:0'::timestamp t1
>>           )
>>           --select * from t;
>>           select *
>>           -- from utlog.cache_stats_per_shift_per_reason_weighed_stats
>>           -- from utlog.stats_per_shift_filtered_per_reason
>>           from utlog.stats_per_shift_filtered                     (let's call
>>      it #View2 for short)
>>           -- from utlog.stats_per_shift_filtered_b0206      (let's call it
>>      #View1 for short)
>>           -- from utlog.stats_per_shift
>>           cross join t
>>           where wline = 'F02'  and wts >= t.t0 and wts < t.t1 and wsft ='D'
>>           limit 100
>>           ;
>>      
>>      The Result for #View2
>>      
>>           wts                 | wsft | wspan  | wstate | wline | rcodes
>>           --------------------+------+--------+--------+-------+-------
>>           2020-07-01 08:00:00 | D    |      0 | S00    | F02   | {PDCB}
>>           2020-07-01 09:50:01 | D    | 12.533 | S00    | F02   | {PDCB}
>>           2020-07-01 11:35:46 | D    | 12.217 | S00    | F02   | {CDSO}
>>           2020-07-01 13:22:58 | D    |   5.15 | S00    | F02   | {PDCB}
>>           2020-07-01 14:57:38 | D    |    6.8 | S00    | F02   | {PDCB}
>>      
>>           INDEX | COLUMN_NAME | DATA_TYPE
>>           ------+-------------+------------
>>           1     | wts         | timestamptz
>>           3     | wsft        | varchar
>>           4     | wspan       | float8
>>           5     | wstate      | varchar
>>           6     | wline       | varchar
>>           7     | rcodes      | text[]
>>      
>>      
>>      Same query, the Result for #View1
>>      
>>           wts                 | wsft | wspan | wstate | wline | rcodes
>>           --------------------+------+-------+--------+-------+-------
>>           2020-07-01 08:00:00 | D    |     5 | S00    | F02   | {PDCB}
>>           2020-07-01 09:50:01 | D    |    13 | S00    | F02   | {PDCB}
>>           2020-07-01 11:35:46 | D    |    12 | S00    | F02   | {CDSO}
>>           2020-07-01 13:22:58 | D    |     5 | S00    | F02   | {PDCB}
>>           2020-07-01 14:57:38 | D    |     7 | S00    | F02   | {PDCB}
>>      
>>           INDEX | COLUMN_NAME | DATA_TYPE
>>           ------+-------------+------------
>>           1     | wts         | timestamptz
>>           3     | wsft        | varchar
>>           4     | wspan       | float8
>>           5     | wstate      | varchar
>>           6     | wline       | varchar
>>           7     | rcodes      | varchar[]
>>      
>>      Reuslts in `wspan` column is inaccurate while both type are float8. Most
>>      weird thing is the 5 to 0 change. for Row 1.
>>      
>>      The `_b0206`(#View1) is just a version of
>>      `stats_per_shift_filtered`(#View2) from past revisions.
>>      I am sure the original CREATE statement for (#View2) is `CREATE VIEW ...
>>      AS SELECT * FROM ...._b0206`
>>      
>>      Definition of View2 in SQLWorkbench/J generated schema:
>>      
>>      
>>           CREATE OR REPLACE VIEW utlog.stats_per_shift_filtered (#View2)
>>           (
>>           wts,
>>           wdate,
>>           wsft,
>>           wspan,
>>           wstate,
>>           wline,
>>           rcodes
>>           )
>>           AS
>>           SELECT stats_per_shift_filtered_u0206.wts,
>>               stats_per_shift_filtered_u0206.wsft::character varying AS wsft,
>>               stats_per_shift_filtered_u0206.wspan,
>>               stats_per_shift_filtered_u0206.wstate,
>>               stats_per_shift_filtered_u0206.wline,
>>               stats_per_shift_filtered_u0206.rcodes
>>           FROM utlog.stats_per_shift_filtered_u0206;  (as #View1 in this post)
>>      
>>      
>>      It feels like the utlog.stats_per_shift_filtered_u0206 in
>>      utlog.stats_per_shift_filtered definition is a different object from
>>      utlog.stats_per_shift_filtered_u0206?
>>      
>>      I am totally out of clues. Any help would be appreciated. Thanks.
>>      
>>      
>>      Regards,
>>      
>>      Ben
>>      
>>      
>>      
>>
>> --
>> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>>
>>



pgsql-general by date:

Previous
From: "Peter J. Holzer"
Date:
Subject: Re: PostgreSQL processes use large amount of private memory on Windows
Next
From: Tom Lane
Date:
Subject: Re: Obvious data mismatch in View2 which basically SELECT * from View1