Re: patch review : Add ability to constrain backend temporary file space

On 01/06/11 09:24, Cédric Villemain wrote:
>
>   Submission review
> ================
>
>      * The patch is not in context diff format.
>      * The patch applies, but contains some extra whitespace.
>      * Documentation is here but is not explicit about 'temp tables';
> maybe worth adding that this won't limit temporary table size?
>      * There is no test provided; one would be expected, to check
> that the feature works.
>

I've created a new patch (attached) with 'git diff -c', so hopefully
the format is OK now. I've added a paragraph about how temporary
*table* storage is not constrained, only temp work files - noting that
the *query* that creates a temp table will still have its work files
constrained. I've also added a test.

The major visible change is that the GUC has been renamed to
'work_disk', to gel better with the idea that it is the disk spillover
for 'work_mem'.

> Code review
> =========
>
> * In fd.c, I think that "temporary_files_size -=
> (double)vfdP->fileSize;" should be done later in the function, once we
> have successfully unlinked the file, not before.
>

Agreed, I've moved it.
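
For reference, the close path now does roughly this (a simplified
sketch from memory, not the exact patch hunk - see the attached patch
for the real code):

    /* In FileClose(): only credit the space back to the global
     * counter once the unlink has actually succeeded (sketch). */
    if (vfdP->fdstate & FD_TEMPORARY)
    {
        if (unlink(vfdP->fileName) == 0)
            temporary_files_size -= (double) vfdP->fileSize;
        else
            elog(LOG, "could not unlink temporary file \"%s\": %m",
                 vfdP->fileName);
    }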

> * I am not sure whether it is better to add a fileSize field like you
> did, or to use RelationGetNumberOfBlocks() when the file is about to
> be truncated or unlinked; that way the seekPos should be enough to
> increase the global counter.
>

The temp files are not relations, so I'd have to call stat() I guess.
Truncate/unlink can happen quite a lot (e.g. a hash with many files),
and I wanted to avoid adding too many library calls to this code for
performance reasons, so on balance I think it is going to be more
efficient to remember the size in the Vfd.
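
To illustrate the idea (again a sketch from memory rather than the
exact patch code - field names follow the patch, details may differ):

    /* In FileWrite(): when a write extends a temp file, remember the
     * new size in the Vfd and charge only the newly added bytes, so
     * no stat() is needed later at truncate/unlink time. */
    returnCode = write(VfdCache[file].fd, buffer, amount);
    if (returnCode > 0 && (VfdCache[file].fdstate & FD_TEMPORARY))
    {
        off_t newPos = VfdCache[file].seekPos + returnCode;

        if (newPos > VfdCache[file].fileSize)
        {
            temporary_files_size +=
                (double) (newPos - VfdCache[file].fileSize);
            VfdCache[file].fileSize = newPos;
        }
    }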

> * temporary_files_size: I think it is better to have a number of pages
> à la PostgreSQL than a size in kilobytes.
>

The temp file related code works in bytes or kilobytes in fd.c and
elsewhere (e.g. log_temp_files), so it seems more natural to stay in
kilobytes. Also, work_disk is really only a spillover from work_mem (in
kilobytes), so it seems logical to match its units.

> * max_temp_files_size: I'd prefer an approach like the shared_buffers
> GUC: you can use pages, or kB, MB, ...
>

We can use kB, MB, and GB - just not pages - again like work_mem. These
files are not relations, so I'm not sure pages are an entirely
appropriate unit for them.
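
The unit handling just comes from flagging the GUC with GUC_UNIT_KB in
guc.c, the same flag work_mem uses (shared_buffers gets its page units
from GUC_UNIT_BLOCKS instead). The entry looks something like this -
illustrative only, the default and limits here are not the patch's
exact values:

    {
        {"work_disk", PGC_USERSET, RESOURCES_MEM,
            gettext_noop("Sets the maximum temporary file space a "
                         "backend's work files may use."),
            NULL,
            GUC_UNIT_KB     /* accepts kB, MB, GB, but not pages */
        },
        &work_disk,
        -1, -1, INT_MAX / 2,
        NULL, NULL, NULL
    },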


> Simple Feature test
> ==============
>
> either explain buffers is wrong or the patch is wrong:
> cedric=# explain (analyze,buffers) select * from foo  order by 1 desc ;
>                                                     QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------
>   Sort  (cost=10260.02..10495.82 rows=94320 width=4) (actual
> time=364.373..518.940 rows=100000 loops=1)
>     Sort Key: generate_series
>     Sort Method: external merge  Disk: 1352kB
>     Buffers: local hit=393, temp read=249 written=249
>     ->   Seq Scan on foo  (cost=0.00..1336.20 rows=94320 width=4)
> (actual time=0.025..138.754 rows=100000 loops=1)
>           Buffers: local hit=393
>   Total runtime: 642.874 ms
> (7 rows)
>
> cedric=# set max_temp_files_size to 1900;
> SET
> cedric=# explain (analyze,buffers) select * from foo  order by 1 desc ;
> ERROR:  aborting due to exceeding max temp files size
> STATEMENT:  explain (analyze,buffers) select * from foo  order by 1 desc ;
> ERROR:  aborting due to exceeding max temp files size
>
> Do you have a testing method I can apply to track this without
> explain (analyze, buffers), before going to low-level monitoring?
>

We're looking at this... In the meantime, setting log_temp_files = 0
will log the size of each temp file when it is deleted, which gives an
independent way to watch the space used.


> Architecture review
> ==============
>
> max_temp_files_size is used for the global space used per backend.
> Based on how work_mem works, I'd expect something like "work_disk" to
> limit per-file usage, and maybe a backend_work_disk (and yes, maybe a
> backend_work_mem?!) per backend.
> So I propose renaming the current GUC to something like
> backend_work_disk.
>

Done - 'work_disk' it is, to match 'work_mem'.

> The patch is not large and is easy to read.
> I like the idea and it sounds useful.
>

Great! Cheers

Mark

