bad result from too large number of rows in imbricated requests - Mailing list pgsql-bugs
From | Éric Quinton |
---|---|
Subject | bad result from too large number of rows in imbricated requests |
Date | |
Msg-id | 5450A9A9.4000908@irstea.fr Whole thread Raw |
List | pgsql-bugs |
Hello,
We have data from oxygen level in an estuary. Each 10', a value is recorded.
Our table :
CREATE TABLE mesure
(
mesure_id serial NOT NULL,
station_id integer NOT NULL,
mesure_date timestamp NOT NULL,
temperature float4,
turbidite float4,
oxygene float4,
saturation_o2 float4,
salinite float4,
hauteur_eau float4,
conductivite float4,
oxygene_meth2 float4,
saturation_o2_meth2 float4
);
We want to calculate the duration of an event when level of oxygen is under a value, like < 4, for a station.
I've write this code :
with minmax as (
select
m.mesure_date as date_debut,
(select max (m1.mesure_date) from mesure m1
where m1.mesure_date >= m.mesure_date and m1.mesure_date < (
select min(m2.mesure_date) from mesure m2
where m2.mesure_date > m.mesure_date and m2.oxygene >= 4 and m2.station_id = m.station_id)
) as date_last
from mesure m
where m.station_id = 2
and m.oxygene < 4 and m.oxygene > 0
order by 1
)
select distinct on (date_last) date_debut, date_last,
(extract (minute from (date_last - date_debut))
+ extract(hour from (date_last - date_debut)) * 60
+ extract(day from (date_last - date_debut)) * 1440)
as nb_minute
from minmax
;
This request work on 11928 records (select count(*) from mesure m
where m.station_id = 2
and m.oxygene < 4 and m.oxygene > 0;)
But...
some records are missing :
date_debut date_last nb_minute
2006-07-12 22:07:00 2006-07-13 07:20:00 553
2006-07-13 07:32:00 2006-07-13 07:50:00 18
2006-07-13 08:54:00 2006-07-13 09:10:00 16
2006-07-13 09:24:00 2006-07-13 09:30:00 6
2006-07-13 10:04:00 2006-07-13 20:01:00 597
2006-07-13 20:57:00 2006-07-13 21:42:00 45
2006-07-21 07:33:00 2006-07-24 01:52:00 3979
2006-07-24 04:52:00 2006-07-24 06:44:00 112
2006-07-24 16:00:00 2006-07-24 18:07:00 127
Yet, with this request :
select
m.mesure_date as date_debut,
(select max (m1.mesure_date) from mesure m1
where m1.mesure_date >= m.mesure_date and m1.mesure_date < (
select min(m2.mesure_date) from mesure m2
where m2.mesure_date > m.mesure_date and m2.oxygene >= 4 and m2.station_id = m.station_id)
) as date_last
from mesure m
where m.station_id = 2
and m.oxygene < 4 and m.oxygene > 0
and m.mesure_date > '2006-01-01'
order by 1
)
select distinct on (date_last) date_debut, date_last,
(extract (minute from (date_last - date_debut))
+ extract(hour from (date_last - date_debut)) * 60
+ extract(day from (date_last - date_debut)) * 1440)
as nb_minute
from minmax
;
date_debut date_last nb_minute
2006-07-12 08:14:00 2006-07-12 08:23:00 9
2006-07-12 10:04:00 2006-07-12 18:39:00 515
2006-07-12 22:07:00 2006-07-13 07:20:00 553
2006-07-13 07:32:00 2006-07-13 07:50:00 18
2006-07-13 08:54:00 2006-07-13 09:10:00 16
2006-07-13 09:24:00 2006-07-13 09:30:00 6
2006-07-13 10:04:00 2006-07-13 20:01:00 597
2006-07-13 20:57:00 2006-07-13 21:42:00 45
2006-07-13 21:57:00 2006-07-24 01:52:00 14635
2006-07-24 02:12:00 2006-07-24 06:44:00 272
2006-07-24 08:48:00 2006-07-24 13:56:00 308
2006-07-24 14:20:00 2006-07-24 14:36:00 16
2006-07-24 15:20:00 2006-07-24 18:07:00 167
the data are good.
The number of records is, in this last request, 9645.
In complement, for this request :
select mesure_id, mesure_date, station_id, oxygene
from mesure
where station_id = 2
and mesure_date >= '2006-07-12' and mesure_date <= '2006-07-24'
order by mesure_date;
2 files : raw_data.txt, and insert_query.txt.
Requests generate with SQL Workbench/J Build 115 (2013-09-01 11:26)
Server :
cat /etc/issue : Debian GNU/Linux 7
uname -r : 3.2.0-4-amd64
psql --version : psql (PostgreSQL) 9.3.5
Tanks,
We have data from oxygen level in an estuary. Each 10', a value is recorded.
Our table :
CREATE TABLE mesure
(
mesure_id serial NOT NULL,
station_id integer NOT NULL,
mesure_date timestamp NOT NULL,
temperature float4,
turbidite float4,
oxygene float4,
saturation_o2 float4,
salinite float4,
hauteur_eau float4,
conductivite float4,
oxygene_meth2 float4,
saturation_o2_meth2 float4
);
We want to calculate the duration of an event when level of oxygen is under a value, like < 4, for a station.
I've write this code :
with minmax as (
select
m.mesure_date as date_debut,
(select max (m1.mesure_date) from mesure m1
where m1.mesure_date >= m.mesure_date and m1.mesure_date < (
select min(m2.mesure_date) from mesure m2
where m2.mesure_date > m.mesure_date and m2.oxygene >= 4 and m2.station_id = m.station_id)
) as date_last
from mesure m
where m.station_id = 2
and m.oxygene < 4 and m.oxygene > 0
order by 1
)
select distinct on (date_last) date_debut, date_last,
(extract (minute from (date_last - date_debut))
+ extract(hour from (date_last - date_debut)) * 60
+ extract(day from (date_last - date_debut)) * 1440)
as nb_minute
from minmax
;
This request work on 11928 records (select count(*) from mesure m
where m.station_id = 2
and m.oxygene < 4 and m.oxygene > 0;)
But...
some records are missing :
date_debut date_last nb_minute
2006-07-12 22:07:00 2006-07-13 07:20:00 553
2006-07-13 07:32:00 2006-07-13 07:50:00 18
2006-07-13 08:54:00 2006-07-13 09:10:00 16
2006-07-13 09:24:00 2006-07-13 09:30:00 6
2006-07-13 10:04:00 2006-07-13 20:01:00 597
2006-07-13 20:57:00 2006-07-13 21:42:00 45
2006-07-21 07:33:00 2006-07-24 01:52:00 3979
2006-07-24 04:52:00 2006-07-24 06:44:00 112
2006-07-24 16:00:00 2006-07-24 18:07:00 127
Yet, with this request :
select
m.mesure_date as date_debut,
(select max (m1.mesure_date) from mesure m1
where m1.mesure_date >= m.mesure_date and m1.mesure_date < (
select min(m2.mesure_date) from mesure m2
where m2.mesure_date > m.mesure_date and m2.oxygene >= 4 and m2.station_id = m.station_id)
) as date_last
from mesure m
where m.station_id = 2
and m.oxygene < 4 and m.oxygene > 0
and m.mesure_date > '2006-01-01'
order by 1
)
select distinct on (date_last) date_debut, date_last,
(extract (minute from (date_last - date_debut))
+ extract(hour from (date_last - date_debut)) * 60
+ extract(day from (date_last - date_debut)) * 1440)
as nb_minute
from minmax
;
date_debut date_last nb_minute
2006-07-12 08:14:00 2006-07-12 08:23:00 9
2006-07-12 10:04:00 2006-07-12 18:39:00 515
2006-07-12 22:07:00 2006-07-13 07:20:00 553
2006-07-13 07:32:00 2006-07-13 07:50:00 18
2006-07-13 08:54:00 2006-07-13 09:10:00 16
2006-07-13 09:24:00 2006-07-13 09:30:00 6
2006-07-13 10:04:00 2006-07-13 20:01:00 597
2006-07-13 20:57:00 2006-07-13 21:42:00 45
2006-07-13 21:57:00 2006-07-24 01:52:00 14635
2006-07-24 02:12:00 2006-07-24 06:44:00 272
2006-07-24 08:48:00 2006-07-24 13:56:00 308
2006-07-24 14:20:00 2006-07-24 14:36:00 16
2006-07-24 15:20:00 2006-07-24 18:07:00 167
the data are good.
The number of records is, in this last request, 9645.
In complement, for this request :
select mesure_id, mesure_date, station_id, oxygene
from mesure
where station_id = 2
and mesure_date >= '2006-07-12' and mesure_date <= '2006-07-24'
order by mesure_date;
2 files : raw_data.txt, and insert_query.txt.
Requests generate with SQL Workbench/J Build 115 (2013-09-01 11:26)
Server :
cat /etc/issue : Debian GNU/Linux 7
uname -r : 3.2.0-4-amd64
psql --version : psql (PostgreSQL) 9.3.5
Tanks,
--
Éric Quinton
Ingénieur en bases de données
Unité de recherche "Écosystèmes aquatiques et changements globaux" (EABX)
IRSTEA
50, avenue de Verdun-Gazinet
F-33612 Cestas Cedex
+33 (0)5 57 89 27 10
www.irstea.fr
Pour mieux affirmer ses missions,
le Cemagref devient Irstea
Éric Quinton
Ingénieur en bases de données
Unité de recherche "Écosystèmes aquatiques et changements globaux" (EABX)
IRSTEA
50, avenue de Verdun-Gazinet
F-33612 Cestas Cedex
+33 (0)5 57 89 27 10
www.irstea.fr
Pour mieux affirmer ses missions,
le Cemagref devient Irstea
Attachment
pgsql-bugs by date: