bad result from too large number of rows in imbricated requests - Mailing list pgsql-bugs

From Éric Quinton
Subject bad result from too large number of rows in imbricated requests
Date
Msg-id 5450A9A9.4000908@irstea.fr
Whole thread Raw
List pgsql-bugs
Hello,

We have data from oxygen level in an estuary. Each 10', a value is recorded.

Our table :
CREATE TABLE mesure
(
   mesure_id            serial       NOT NULL,
   station_id           integer      NOT NULL,
   mesure_date          timestamp    NOT NULL,
   temperature          float4,
   turbidite            float4,
   oxygene              float4,
   saturation_o2        float4,
   salinite             float4,
   hauteur_eau          float4,
   conductivite         float4,
   oxygene_meth2        float4,
   saturation_o2_meth2  float4
);

We want to calculate the duration of an event when level of oxygen is under a value, like < 4, for a station.

I've write this code :

with minmax as (
select
m.mesure_date as date_debut,
(select max (m1.mesure_date) from mesure m1
    where m1.mesure_date >= m.mesure_date and m1.mesure_date < (
        select min(m2.mesure_date) from mesure m2
        where m2.mesure_date > m.mesure_date and m2.oxygene >= 4 and m2.station_id = m.station_id)
        ) as date_last

from mesure m
where m.station_id = 2
and m.oxygene < 4 and m.oxygene > 0
order by 1
)
select distinct on (date_last) date_debut, date_last,
(extract (minute from (date_last - date_debut))
+ extract(hour from (date_last - date_debut)) * 60
+ extract(day from (date_last - date_debut)) * 1440)
as nb_minute
from minmax
;

This request work on 11928 records (select count(*) from mesure m
where m.station_id = 2
and m.oxygene < 4 and m.oxygene > 0;)

But...
some records are missing :

date_debut    date_last    nb_minute
2006-07-12 22:07:00    2006-07-13 07:20:00    553
2006-07-13 07:32:00    2006-07-13 07:50:00    18
2006-07-13 08:54:00    2006-07-13 09:10:00    16
2006-07-13 09:24:00    2006-07-13 09:30:00    6
2006-07-13 10:04:00    2006-07-13 20:01:00    597
2006-07-13 20:57:00    2006-07-13 21:42:00    45
2006-07-21 07:33:00    2006-07-24 01:52:00    3979
2006-07-24 04:52:00    2006-07-24 06:44:00    112
2006-07-24 16:00:00    2006-07-24 18:07:00    127


Yet, with this request :

select
m.mesure_date as date_debut,
(select max (m1.mesure_date) from mesure m1
    where m1.mesure_date >= m.mesure_date and m1.mesure_date < (
        select min(m2.mesure_date) from mesure m2
        where m2.mesure_date > m.mesure_date and m2.oxygene >= 4 and m2.station_id = m.station_id)
        ) as date_last

from mesure m
where m.station_id = 2
and m.oxygene < 4 and m.oxygene > 0
and m.mesure_date > '2006-01-01'
order by 1
)
select distinct on (date_last) date_debut, date_last,
(extract (minute from (date_last - date_debut))
+ extract(hour from (date_last - date_debut)) * 60
+ extract(day from (date_last - date_debut)) * 1440)
as nb_minute
from minmax
;

date_debut    date_last    nb_minute
2006-07-12 08:14:00    2006-07-12 08:23:00    9
2006-07-12 10:04:00    2006-07-12 18:39:00    515
2006-07-12 22:07:00    2006-07-13 07:20:00    553
2006-07-13 07:32:00    2006-07-13 07:50:00    18
2006-07-13 08:54:00    2006-07-13 09:10:00    16
2006-07-13 09:24:00    2006-07-13 09:30:00    6
2006-07-13 10:04:00    2006-07-13 20:01:00    597
2006-07-13 20:57:00    2006-07-13 21:42:00    45
2006-07-13 21:57:00    2006-07-24 01:52:00    14635
2006-07-24 02:12:00    2006-07-24 06:44:00    272
2006-07-24 08:48:00    2006-07-24 13:56:00    308
2006-07-24 14:20:00    2006-07-24 14:36:00    16
2006-07-24 15:20:00    2006-07-24 18:07:00    167

the data are good.

The number of records is, in this last request, 9645.

In complement, for this request  :
select mesure_id, mesure_date, station_id, oxygene
from mesure
where station_id = 2
and mesure_date >= '2006-07-12' and mesure_date <= '2006-07-24'
order by mesure_date;

2 files : raw_data.txt, and insert_query.txt.

Requests generate with SQL Workbench/J Build 115 (2013-09-01 11:26)
Server :
 cat /etc/issue : Debian GNU/Linux 7
 uname -r : 3.2.0-4-amd64
 psql --version : psql (PostgreSQL) 9.3.5

Tanks,

--
Éric Quinton
Ingénieur en bases de données
Unité de recherche "Écosystèmes aquatiques et changements globaux" (EABX)
IRSTEA
50, avenue de Verdun-Gazinet
F-33612 Cestas Cedex

+33 (0)5 57 89 27 10
www.irstea.fr

Pour mieux affirmer ses missions,
le Cemagref devient Irstea
Attachment

pgsql-bugs by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: BUG #11455: PQerrorMessage not reset after PQreset
Next
From: djlu126@126.com
Date:
Subject: BUG #11821: The delete rule problem