RE: [HACKERS] vacuum process size - Mailing list pgsql-hackers

From Hiroshi Inoue
Subject RE: [HACKERS] vacuum process size
Msg-id 000201beeaa4$ce3887e0$2801007e@cadzone.tpf.co.jp
In response to Re: [HACKERS] vacuum process size  (Tatsuo Ishii <t-ishii@sra.co.jp>)
Responses Re: [HACKERS] vacuum process size
List pgsql-hackers
Hi all,

I found the following comment in utils/mmgr/aset.c.
The high memory usage of a big vacuum is probably caused by this
change.
Calling repalloc() many times with an increasing size parameter
would need a large amount of memory.

Should vacuum call realloc() directly?
Or should AllocSet..() be changed?

Comments?
 * NOTE:
 *      This is a new (Feb. 05, 1999) implementation of the allocation set
 *      routines. AllocSet...() does not use OrderedSet...() any more.
 *      Instead it manages allocations in a block pool by itself, combining
 *      many small allocations in a few bigger blocks. AllocSetFree() does
 *      never free() memory really. It just add's the free'd area to some
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
 *      list for later reuse by AllocSetAlloc(). All memory blocks are
 *      free()'d
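
To put a rough number on the concern, here is a minimal sketch in C. It is
not the aset.c code. It assumes the worst case suggested by the comment
above: every repalloc() to a larger size takes fresh memory, and the old
area only goes to a free list that cannot satisfy the next, larger request.
The function name worst_case_bytes and its parameters are hypothetical,
chosen only for illustration.

```c
#include <stddef.h>

/*
 * Hypothetical model, not PostgreSQL code.
 *
 * Returns the total number of bytes consumed when an array is grown
 * `step` entries at a time until it can hold `nentries` entries,
 * ASSUMING that no previously released area is ever reused for a
 * later, larger request.
 */
static unsigned long long
worst_case_bytes(unsigned long long nentries,
                 unsigned long long step,
                 unsigned long long entry_size)
{
    unsigned long long total = 0;
    unsigned long long cap;

    if (step == 0)
        return 0;               /* avoid looping forever on a bad step */

    /* cap is the capacity requested by each successive (re)allocation */
    for (cap = step;; cap += step)
    {
        total += cap * entry_size;
        if (cap >= nentries)
            break;
    }
    return total;
}
```

Under this assumption the total grows roughly with nentries squared divided
by step, so a ten times larger step gives roughly a ten times smaller total
once the array is much longer than either step. Real behaviour depends on
how AllocSetAlloc() reuses its free lists, so this is only an upper-bound
style estimate.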

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp

>
> >Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> >> Just for testing I made a huge table (>2GB and it has 10000000
> >> tuples).  copy 10000000 tuples took 23 minutes. This is not so
> >> bad. Vacuum analyze took 11 minutes, not too bad. After this I created
> >> an index on int4 column. It took 9 minutes. Next I deleted 5000000
> >> tuples to see how long delete took. I found it was 6
> >> minutes. Good. Then I ran into a problem. After that I did vacuum
> >> analyze, and it seemed to take forever!  (it actually took 47 minutes). The
> >> biggest problem was postgres's process size. It was 478MB! This is not
> >> acceptable for me.  Any idea?
> >
> >Yeah, I've complained about that before --- it seems that vacuum takes
> >a really unreasonable amount of time to remove dead tuples from an index.
> >It's been like that at least since 6.3.2, probably longer.
>
> Hiroshi came up with a workaround for this (see included
> patches). After applying it, the process size shrank from 478MB to
> 86MB! (the processing time did not decrease, however). According to
> him, repalloc seems not very effective with a large number of calls. The
> patches probably decrease the number to 1/10.
> --
> Tatsuo Ishii
>
> -------------------------------------------------------------------------
> *** vacuum.c.orig    Sat Jul  3 09:32:40 1999
> --- vacuum.c    Thu Aug 19 17:34:18 1999
> ***************
> *** 2519,2530 ****
>   static void
>   vc_vpinsert(VPageList vpl, VPageDescr vpnew)
>   {
>
>       /* allocate a VPageDescr entry if needed */
>       if (vpl->vpl_num_pages == 0)
> !         vpl->vpl_pagedesc = (VPageDescr *) palloc(100 * sizeof(VPageDescr));
> !     else if (vpl->vpl_num_pages % 100 == 0)
> !         vpl->vpl_pagedesc = (VPageDescr *) repalloc(vpl->vpl_pagedesc, (vpl->vpl_num_pages + 100) * sizeof(VPageDescr));
>       vpl->vpl_pagedesc[vpl->vpl_num_pages] = vpnew;
>       (vpl->vpl_num_pages)++;
>
> --- 2519,2531 ----
>   static void
>   vc_vpinsert(VPageList vpl, VPageDescr vpnew)
>   {
> + #define PG_NPAGEDESC 1000
>
>       /* allocate a VPageDescr entry if needed */
>       if (vpl->vpl_num_pages == 0)
> !         vpl->vpl_pagedesc = (VPageDescr *) palloc(PG_NPAGEDESC * sizeof(VPageDescr));
> !     else if (vpl->vpl_num_pages % PG_NPAGEDESC == 0)
> !         vpl->vpl_pagedesc = (VPageDescr *) repalloc(vpl->vpl_pagedesc, (vpl->vpl_num_pages + PG_NPAGEDESC) * sizeof(VPageDescr));
>       vpl->vpl_pagedesc[vpl->vpl_num_pages] = vpnew;
>       (vpl->vpl_num_pages)++;
>
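
The effect of the larger step on the number of reallocations can be checked
with a small standalone sketch. It mirrors the shape of vc_vpinsert() but
is not the vacuum.c code: standard realloc() stands in for
palloc()/repalloc(), and the PageList type, the pagelist_insert() function
and the grow_calls counter are hypothetical names added for illustration.

```c
#include <stdlib.h>

/* Hypothetical stand-in for VPageList; not the PostgreSQL struct. */
typedef struct
{
    void  **items;       /* plays the role of vpl_pagedesc; start as NULL */
    size_t  num;         /* plays the role of vpl_num_pages */
    size_t  chunk;       /* growth step: 100 before the patch, 1000 after;
                          * must be nonzero */
    size_t  grow_calls;  /* how many times the array was (re)allocated */
} PageList;

/*
 * Append one entry, growing the array by `chunk` entries whenever the
 * current count is a multiple of `chunk` (the same test the patch uses).
 * Returns 0 on success, -1 if realloc() fails.
 */
static int
pagelist_insert(PageList *pl, void *item)
{
    if (pl->num % pl->chunk == 0)
    {
        /* realloc(NULL, n) behaves like malloc(n) for the first call */
        void **p = realloc(pl->items,
                           (pl->num + pl->chunk) * sizeof(void *));

        if (p == NULL)
            return -1;
        pl->items = p;
        pl->grow_calls++;
    }
    pl->items[pl->num++] = item;
    return 0;
}
```

Inserting the same number of entries with chunk set to 1000 instead of 100
makes about one tenth as many growth calls, which matches the "1/10"
estimate quoted above. The sketch says nothing about how much memory each
call costs inside AllocSet; that depends on the allocator.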


