Re: [PERFORM] Tuning queries on large database - Mailing list pgsql-general

From Valerie Schneider DSI/DEV
Subject Re: [PERFORM] Tuning queries on large database
Date
Msg-id 200408041318.i74DIGO19457@mu.meteo.fr
Whole thread Raw
List pgsql-general
>Content-class: urn:content-classes:message
>MIME-Version: 1.0
>Subject: RE: [PERFORM] Tuning queries on large database
>X-MimeOLE: Produced By Microsoft Exchange V6.5.6944.0
>Date: Wed, 4 Aug 2004 09:06:54 -0400
>X-MS-Has-Attach:
>X-MS-TNEF-Correlator:
>Thread-Topic: [PERFORM] Tuning queries on large database
>thread-index: AcR6Iae9QRnQrjxYRJyInj9KrC3FYQAAOJgQ
>From: "Merlin Moncure" <merlin.moncure@rcsonline.com>
>To: "Valerie Schneider DSI/DEV" <Valerie.Schneider@meteo.fr>
>Cc: <pgsql-general@postgresql.org>
>Content-Transfer-Encoding: 8bit
>X-MIME-Autoconverted: from quoted-printable to 8bit by mu.meteo.fr id
i74D9IO19408
>
>>
>> The result is that for "short queries" (Q1 and Q2) it runs in a few
>> seconds on both Oracle and PG. The difference becomes important with
>> Q3 : 8 seconds with oracle
>>      80 sec with PG
>> and too much with Q4 : 28s with oracle
>>                        17m20s with PG !
>>
>> Of course when I run 100 or 1000 parallel queries such as Q3 or Q4,
>> it becomes a disaster !
>> I can't understand these results. The way to execute queries is the
>> same I think. I've read recommended articles on the PG site.
>> I tried with a table containing 30 millions rows, results are similar.
>
>
>I don't trust the Oracle #s.  Lets look at Q4: returns 3 million rows.
>Using your #s of 160 fields and 256 bytes, your are asking for a result
>set of 160 * 256 * 3M  = 12 GB!  This data has to be gathered by the
>disk, assembled, and sent over the network.

Yes, but queries such as Q3 or Q4 look like :

select 'Q3',sum(rr1),count(ff)
from data
where num_poste in (:p1,:p1 + 2)
;

select 'Q4',count(*)
from data
where t<td
and num_poste between :p1 and :p1 + 25
;

I need to declare a cursor also in this case (group functions) ?

>
>I don't know Oracle, but it probably has some 'smart' result set that
>uses a cursor behind the scenes to do the fetching.
>
>With a 3M row result set, you need to strongly consider using cursors.
>Try experimenting with the same query (Q4), declared as a cursor, and
>fetch the data in 10k blocks in a loop (fetch 10000), and watch the #s
>fly.
>
>Merlin
>
>



********************************************************************
*    Les points de vue exprimes sont strictement personnels et     *
*      n'engagent pas la responsabilite de METEO-FRANCE.           *
********************************************************************
* Valerie SCHNEIDER             Tel : +33 (0)5 61 07 81 91         *
* METEO-FRANCE / DSI/DEV        Fax : +33 (0)5 61 07 81 09         *
* 42, avenue G. Coriolis        Email : Valerie.Schneider@meteo.fr *
* 31057 TOULOUSE Cedex - FRANCE         http://www.meteo.fr        *
********************************************************************


pgsql-general by date:

Previous
From: "Merlin Moncure"
Date:
Subject: Re: [PERFORM] Tuning queries on large database
Next
From: Christopher Kings-Lynne
Date:
Subject: Re: [PERFORM] Tuning queries on large database