Re: Transaction progress - Mailing list pgsql-general

From Pablo Yaggi
Subject Re: Transaction progress
Date 19 January 2003
Msg-id 200301192022.45142.pyaggi@aulamagna.com.ar
In response to Re: Transaction progress  (Manfred Koizar <mkoi-pg@aon.at>)
Responses Re: Transaction progress
Re: Transaction progress
List pgsql-general
On Sunday 19 January 2003 07:08 pm, Manfred Koizar wrote:
> On Sun, 19 Jan 2003 14:53:49 -0300, Pablo Yaggi
>
> <pyaggi@aulamagna.com.ar> wrote:
> >if not, can I estimate the time from the explain result ? how ?
> >this is the explain result about the query I'm running:
> >
> >Hash Join  (cost=100347487.59..145240934.03 rows=26 width=60)
> >  ->  Index Scan using inars_rectificaciones_inx on inars_rectificaciones
> > b2 (cost=0.00..37839140.62 rows=9546466 width=34)
>
>                                                 ^^^^^^^       ^^
>                                                   (c)         (d)
>
> >  ->  Hash  (cost=100347379.07..100347379.07 rows=43407 width=26)
> >        ->  Seq Scan on inars_nocontrib_perm1
> >                 (cost=100000000.00..100347379.07 rows=43407 width=26)
>
>                         ^^^^^^^^^^^^     ^^^^^^
>                          (a)               (b)
>
> The end result of explain, 145240934.03, basically tells us that you
> have SET enable_seqscan = OFF (a).
You're right, that's a configuration error. I fixed it, but the query is still running, so the change will only take effect after I restart the server. In any case, the planner says it will use a seq scan anyway, so isn't the result the same whether enable_seqscan is on or off?
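As an aside (a sketch, assuming a live psql session; the SELECT below is just an illustrative query, not the one from this thread): enable_seqscan is a session-level setting, so it can be changed without restarting the server, and turning it off does not forbid sequential scans, it only penalizes them.

```sql
-- enable_seqscan can be changed per session; no server restart is needed:
SET enable_seqscan = on;

-- With it off, the planner does not forbid seq scans; it adds a large
-- constant (100000000) to their estimated cost, which is why the Seq Scan
-- node in the plan above starts at cost=100000000.00.  When no cheaper
-- alternative exists, the chosen plan is still a seq scan.
EXPLAIN SELECT count(*) FROM inars_nocontrib_perm1;  -- illustrative query
```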

>
> (b) looks a bit strange.  Does inars_nocontrib_perm1 have 300000 pages
> but only 43407 tuples?  If so, you should VACUUM more often.  Anyway
> this seq scan contributes only a small fraction to the overall cost.
How did you realize that inars_nocontrib_perm1 is 300000 pages?
I did a VACUUM before executing the query that is still in progress, but the plan I posted last time was made before that, so this is the output of the planner for the query that is now running:

Merge Join  (cost=209527960.44..247478144.43 rows=612 width=60)
  ->  Sort  (cost=209527960.44..209527960.44 rows=35037092 width=26)
        ->  Seq Scan on inars_nocontrib_perm1  (cost=100000000.00..100697315.92 rows=35037092 width=26)
  ->  Index Scan using inars_rectificaciones_inx on inars_rectificaciones b2  (cost=0.00..37838713.13 rows=9546358
width=34)
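One way to see the page and tuple counts the planner is working from (a sketch, assuming the statistics in the catalog are current): a seq scan is costed roughly as relpages plus cpu_tuple_cost times reltuples, so a cost of about 347379 for only 43407 rows implies on the order of 300000+ pages.

```sql
-- The planner's page and row estimates come from the pg_class catalog:
SELECT relname, relpages, reltuples
FROM pg_class
WHERE relname IN ('inars_nocontrib_perm1', 'inars_rectificaciones');
```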


> (c) There are almost 10M rows in inars_rectificaciones, the width of
> which is at least 34 (d).  Counting for overhead and columns not used
34 (d)? Sorry, I didn't get that.

These are the tables and the index:

create table inars_rectificaciones (
        cuil            int8,
        cuit            int8,
        ano             int2,
        mes             int2,
        presentacion    date,
        actividad       int2,
        remuneracion    float);

create table inars_contrib_perm1 (
        cuil            int8,
        cuit            int8,
        ano             int2,
        mes             int2,
        presentacion    date,
        rectificada     bool,
        actividad       int2,
        remuneracion    float);

create index inars_rectificaciones_inx on inars_rectificaciones (ano,mes,cuil,cuit);
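On the width=34 question: that number in the plan is the planner's estimated row width in bytes, and for inars_rectificaciones it matches the declared column sizes exactly (int8 = 8, int2 = 2, date = 4, float = float8 = 8). A quick check:

```sql
-- int8 + int8 + int2 + int2 + date + int2 + float8
-- =  8  +  8  +  2   +  2   +  4   +  2   +  8   = 34 bytes
SELECT 8 + 8 + 2 + 2 + 4 + 2 + 8 AS estimated_width;  -- returns 34
```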

> by your statement, let's guess 80 bytes per tuple (might be more).
> This gives at least 800MB for the whole relation.  Assuming that your
> physical memory is much less than this and the index
> inars_rectificaciones_inx doesn't retrieve tuples in physical order,
> you need one random page read per tuple.  So your question boils down
> to how long it takes to do 10M random reads on your hardware.

My machine has 512MB of memory and a dual Pentium 1.4GHz processor.


> I have to guess again: 4ms seek, 4ms latency.  Ignoring CPU and
> transfer times we get 8ms per page access or 80000 seconds (more than
> 22 hours) for the query on average PC hardware.  This could be much
> more, if your disk is fragmented or slow or both ...
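Redoing that back-of-the-envelope estimate with the row count from the plan (assuming, as above, 4ms seek + 4ms latency per random page read and ignoring CPU and transfer time):

```sql
-- ~9.5M random page reads at ~8ms each:
SELECT 9546466 * 0.008        AS seconds,  -- about 76000 s
       9546466 * 0.008 / 3600 AS hours;    -- about 21 hours
```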

Thanks a lot,
Pablo


