Thread: Front end memory consumption in SELECT

Front end memory consumption in SELECT

From
Douglas Thomson
Date:
I have a large table that I need to traverse in full. I currently
start with a simple unrestricted SELECT, and then fetch each and
every row one at a time. I thought that by fetching just one row at a
time I would not consume any significant amount of memory.

However, judging by the memory consumption of my front-end process,
it would seem that the SELECT is loading the entire table into memory
before I even fetch the first row! Can anyone confirm that this is in
fact what goes on?

If so, is there any way to avoid it? The obvious solution would seem
to be to use LIMIT and OFFSET to get just a few thousand rows at a
time, but will that suffer from a time overhead while the backend
skips over millions of rows to get to the ones it needs??

Thanks for any clues anyone can provide!

Doug.

P.S. If it matters, I am using the Perl interface. I am also running in
     SERIALIZABLE mode...

Re: [INTERFACES] Front end memory consumption in SELECT

From
Tom Lane
Date:
Douglas Thomson <dougt@mugc.cc.monash.edu.au> writes:
> However, judging by the memory consumption of my front-end process,
> it would seem that the SELECT is loading the entire table into memory
> before I even fetch the first row! Can anyone confirm that this is in
> fact what goes on?

libpq handles SELECTs that way.  You should consider DECLARE CURSOR
and FETCH if you need to retrieve a large query result in chunks.

It was probably bad design for libpq to offer random access to query
results --- I'm sure there are few applications that really care,
and the memory-consumption problem is a very real one for many apps.
But I see no way to fix it without fundamental changes to libpq's API,
and it's not clear it's worth that kind of pain.  Maybe there will
be a deliberately-incompatible libpq Mark II someday ... or maybe we'll
switch to a whole new client interface like CORBA.

> If so, is there any way to avoid it? The obvious solution would seem
> to be to use LIMIT and OFFSET to get just a few thousand rows at a
> time, but will that suffer from a time overhead while the backend
> skips over millions of rows to get to the ones it needs??

Yes.  See the CURSOR stuff instead.
        regards, tom lane


Re: [INTERFACES] Front end memory consumption in SELECT

From
Douglas Thomson
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Douglas Thomson <dougt@mugc.cc.monash.edu.au> writes:
> > However, judging by the memory consumption of my front-end process,
> > it would seem that the SELECT is loading the entire table into memory
> > before I even fetch the first row! Can anyone confirm that this is in
> > fact what goes on?
>
> libpq handles SELECTs that way.  You should consider DECLARE CURSOR
> and FETCH if you need to retrieve a large query result in chunks.

Thanks for this suggestion, it has solved my problem very nicely.

In case any Perl interface users are reading this and interested,
this meant changing my old code that used to work like:
    my $result = $conn->exec("SELECT * from big_table");
    $result->resultStatus == PGRES_TUPLES_OK or die(...);
    while (my @tuple = $result->fetchrow) {
        ...
    }
to instead work like:
    my $result = $conn->exec(
            "DECLARE big_cursor CURSOR FOR SELECT * FROM big_table");
    $result->resultStatus == PGRES_COMMAND_OK or die(...);
    do {
        $result = $conn->exec("FETCH 100 FROM big_cursor");
        $result->resultStatus == PGRES_TUPLES_OK or die(...);
        while (my @tuple = $result->fetchrow) {
            ...
        }
    } while ($result->ntuples == 100);
    $conn->exec("CLOSE big_cursor");  # if not about to COMMIT anyway
which limits to 100 the number of tuples in memory at any one time.

I only found two cases that I could not handle this way: SELECT FOR
UPDATE (since PostgreSQL cursors are read-only) and selects done
outside of transactions. In my application both these exceptions only
affected quite small selections (usually just one tuple) so I was
able to revert to the non-cursor select with no problem.

For what it is worth, I measured no significant difference in the
total time taken to traverse the entire table, but the reduction in
memory consumption was dramatic. Note that fetching only a single
tuple at a time (instead of the 100 shown above) did take much
longer.

Hope this information helps someone,

Doug.
A few weeks ago I reported a problem I was having with front end
memory consumption in SELECT, and Tom's suggestion about using a
cursor solved that problem very nicely at the front end.

However, my front and back end processes are running on the same
server, and I have just noticed that the back end process swells to
the total size of all the tables involved in my query for the
duration of the SELECT (or is it the duration of the declared cursor
now?).

To summarise the situation:
    - back end processes on my system are usually about 3M
      (PostgreSQL 6.5.1, Debian slink, Intel)
    - I am attempting a join on 9 tables, using a cursor to fetch the
      selected data
    - I am only selecting a tiny fraction of the total data
    - I am running in serializable mode if it matters
    - the files used to store my tables (including associated keys
      and indexes etc) take just under 17M on disk
    - the back end process grows to over 20M while processing my query

This is a serious concern, because 17M is just some sample data, and
will eventually be much larger.

Any hints for ways of limiting back end memory consumption (hopefully
to something constant rather than something proportional to the
tables sizes)?

Doug.
Douglas Thomson <dougt@mugc.cc.monash.edu.au> writes:
> I have just noticed that the back end process swells to
> the total size of all the tables involved in my query for the
> duration of the SELECT (or is it the duration of the declared cursor
> now?).

The life of the cursor, probably, but I'm a little surprised by this.

> To summarise the situation:
>     - I am attempting a join on 9 tables, using a cursor to fetch the
>       selected data
>     - the back end process grows to over 20M while processing my query

What query plan does EXPLAIN show for your SELECT?
        regards, tom lane

************




Tom Lane <tgl@sss.pgh.pa.us> writes:
> Douglas Thomson <dougt@mugc.cc.monash.edu.au> writes:
> > I have just noticed that the back end process swells to
> > the total size of all the tables involved in my query for the
> > duration of the SELECT (or is it the duration of the declared cursor
> > now?).
>
> The life of the cursor, probably, but I'm a little surprised by this.
>
> > To summarise the situation:
> >     - I am attempting a join on 9 tables, using a cursor to fetch the
> >       selected data
> >     - the back end process grows to over 20M while processing my query
>
> What query plan does EXPLAIN show for your SELECT?

OK, here goes! I don't profess to understand the output from EXPLAIN,
so here it is verbatim. If I hadn't checked that I have indexes on
all the attributes (and attribute combinations) that make up the keys
needed for the join, I would think the back end was trying to read in
and sort a large part of the data just for the query!


NOTICE:  _outNode: don't know how to print type 21
NOTICE:  QUERY DUMP:

{ NESTLOOP :cost 0 :size 600832 :width 240 :state <> :qptargetlist ({ TARGETENTRY :resdom { RESDOM :resno 1 :restype 25
:restypmod-1 :resname "name" :reskey 0 :reskeyop 0 :resgroupref 0 :resjunk false } :expr { VAR :varno 65000 :varattno
11:vartype 25 :vartypmod -1  :varlevelsup 0 :varnoold 9 :varoattno 2}} { TARGETENTRY :resdom { RESDOM :resno 2 :restype
25:restypmod -1 :resname "canonical" :reskey 0 :reskeyop 0 :resgroupref 0 :resjunk false } :expr { VAR :varno 65000
:varattno15 :vartype 25 :vartypmod -1  :varlevelsup 0 :varnoold 5 :varoattno 2}}) :qpqual ({ EXPR :typeOid 0  :opType
op:oper { OPER :opno 98 :opid 67 :opresulttype 16 } :args ({ VAR :varno 65000 :varattno 4 :vartype 25 :vartypmod -1
:varlevelsup0 :varnoold 2 :varoattno 6} { VAR :varno 65001 :varattno 2 :vartype 25 :vartypmod -1  :varlevelsup 0
:varnoold1 :varoattno 2})} { EXPR :typeOid 0  :opType op :oper { OPER :opno 98 :opid 67 :opresulttype 16 } :args ({ VAR
:varno65000 :varattno 3 :vartype 25 :vart! 
ypmod -1  :varlevelsup 0 :varnoold 2 :varoattno 4} { VAR :varno 65001 :varattno 1 :vartype 25 :vartypmod -1
:varlevelsup0 :varnoold 1 :varoattno 1})}) :lefttree { SEQSCAN :cost 0 :size 0 :width 24 :state <> :qptargetlist ({
TARGETENTRY:resdom { RESDOM :resno 1 :restype 25 :restypmod -1 :resname "<>" :reskey 0 :reskeyop 0 :resgroupref 0
:resjunkfalse } :expr { VAR :varno 1 :varattno 1 :vartype 25 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 1}} {
TARGETENTRY:resdom { RESDOM :resno 2 :restype 25 :restypmod -1 :resname "<>" :reskey 0 :reskeyop 0 :resgroupref 0
:resjunkfalse } :expr { VAR :varno 1 :varattno 2 :vartype 25 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 2}})
:qpqual({ EXPR :typeOid 0  :opType op :oper { OPER :opno 98 :opid 67 :opresulttype 16 } :args ({ VAR :varno 1 :varattno
3:vartype 25 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 3} { CONST :consttype 25 :constlen -1 :constisnull
false:constvalue  32 [  32  0  0  0  47  83  117  98  ! 
106  101  99  116  115  47  49  57  57  57  47  115  101  109  49  47  71  67  79  47  49  56  49  50 ]  :constbyval
false})}) :lefttree <> :righttree <> :extprm () :locprm () :initplan <> :nprm 0  :scanrelid 1 } :righttree { SEQSCAN
:cost3.64259e+07 :size 1073741849 :width 216 :state <> :qptargetlist ({ TARGETENTRY :resdom { RESDOM :resno 1 :restype
25:restypmod -1 :resname "<>" :reskey 0 :reskeyop 0 :resgroupref 0 :resjunk false } :expr { VAR :varno 0 :varattno 1
:vartype25 :vartypmod -1  :varlevelsup 0 :varnoold 0 :varoattno 1}} { TARGETENTRY :resdom { RESDOM :resno 2 :restype 25
:restypmod-1 :resname "<>" :reskey 0 :reskeyop 0 :resgroupref 0 :resjunk false } :expr { VAR :varno 0 :varattno 2
:vartype25 :vartypmod -1  :varlevelsup 0 :varnoold 0 :varoattno 3}} { TARGETENTRY :resdom { RESDOM :resno 3 :restype 25
:restypmod-1 :resname "<>" :reskey 0 :reskeyop 0 :resgroupref 0 :resjunk false } :expr { VAR :varno 0 :varattno 3
:vartype25 :vartypmod -1  :varlevelsup 0 :var! 
noold 0 :varoattno 4}} { TARGETENTRY :resdom { RESDOM :resno 4 :restype 25 :restypmod -1 :resname "<>" :reskey 0
:reskeyop0 :resgroupref 0 :resjunk false } :expr { VAR :varno 0 :varattno 4 :vartype 25 :vartypmod -1  :varlevelsup 0
:varnoold0 :varoattno 6}} { TARGETENTRY :resdom { RESDOM :resno 5 :restype 25 :restypmod -1 :resname "<>" :reskey 0
:reskeyop0 :resgroupref 0 :resjunk false } :expr { VAR :varno 0 :varattno 5 :vartype 25 :vartypmod -1  :varlevelsup 0
:varnoold0 :varoattno 4}} { TARGETENTRY :resdom { RESDOM :resno 6 :restype 25 :restypmod -1 :resname "<>" :reskey 0
:reskeyop0 :resgroupref 0 :resjunk false } :expr { VAR :varno 0 :varattno 6 :vartype 25 :vartypmod -1  :varlevelsup 0
:varnoold0 :varoattno 8}} { TARGETENTRY :resdom { RESDOM :resno 7 :restype 25 :restypmod -1 :resname "<>" :reskey 0
:reskeyop0 :resgroupref 0 :resjunk false } :expr { VAR :varno 0 :varattno 7 :vartype 25 :vartypmod -1  :varlevelsup 0
:varnoold0 :varoattno 7}} { TARGETENTRY :resdom { ! 
RESDOM :resno 8 :restype 25 :restypmod -1 :resname "<>" :reskey 0 :reskeyop 0 :resgroupref 0 :resjunk false }NOTICE:
QUERYPLAN: 

Nested Loop  (cost=0.00 rows=600832 width=240)
  ->  Seq Scan on t25 agents  (cost=0.00 rows=0 width=24)
  ->  Seq Scan  (cost=36425892.00 rows=1073741849 width=216)
        ->  ???  (cost=36425892.00 rows=1073741849 width=216)
              ->  Merge Join  (cost=36425892.00 rows=1073741849 width=216)
                    ->  Merge Join  (cost=8371644.00 rows=850088153 width=192)
                          ->  Seq Scan  (cost=8110708.50 rows=7333003 width=144)
                                ->  Sort  (cost=8110708.50 rows=7333003 width=144)
                                      ->  Hash Join  (cost=13897.26 rows=7333003 width=144)
                                            ->  Hash Join  (cost=4635.32 rows=280070 width=132)
                                                  ->  Hash Join  (cost=2674.97 rows=59324 width=120)
                                                        ->  Hash Join  (cost=2180.41 rows=14894 width=108)
                                                              ->  Hash Join  (cost=1705.82 rows=14315 width=96)
                                                                    ->  Seq Scan on t30 enrolments  (cost=667.40
rows=14315width=72) 
                                                                    ->  Hash  (cost=302.51 rows=6955 width=24)
                                                                          ->  Seq Scan on t24 users  (cost=302.51
rows=6955width=24) 
                                                              ->  Hash  (cost=1.10 rows=3 width=12)
                                                                    ->  Seq Scan on t21 studymodes  (cost=1.10 rows=3
width=12)
                                                        ->  Hash  (cost=1.53 rows=16 width=12)
                                                              ->  Seq Scan on t29 periods  (cost=1.53 rows=16 width=12)
                                                  ->  Hash  (cost=1.33 rows=10 width=12)
                                                        ->  Seq Scan on t23 campuses  (cost=1.33 rows=10 width=12)
                                            ->  Hash  (cost=10.32 rows=252 width=12)
                                                  ->  Seq Scan on t28 courses  (cost=10.32 rows=252 width=12)
                          ->  Seq Scan  (cost=17962.40 rows=29831 width=48)
                                ->  Sort  (cost=17962.40 rows=29831 width=48)
                                      ->  Seq Scan on t27 offerings  (cost=1457.42 rows=29831 width=48)
                    ->  Index Scan using t26_unq_0 on t26 subjects  (cost=894.95 rows=13439 width=24)

NOTICE:  _outNode: don't know how to print type 21


Hope this means something to you,

Doug.
Douglas Thomson <dougt@mugc.cc.monash.edu.au> writes:
> If I hadn't checked that I have indexes on
> all the attributes (and attribute combinations) that make up the keys
> needed for the join, I would think the back end was trying to read in
> and sort a large part of the data just for the query!

Well, of course it *is* ... the trick is to make sure that anything
really large ends up on disk (in a temp file) and not in memory.

Offhand the plan looks pretty reasonable for a 9-way join.  I see that
you have two sorts and five hashes going on here.  Each of those will
feel entitled to use as much memory as whatever your -S setting is,
which IIRC is 512K by default.  But 3.5meg altogether isn't enough to
explain your memory usage ... unless you are using a larger-than-normal
-S switch?

Also, while the sorts are pretty certain to honor your -S limit (plus or
minus some slop), the hashes are much more probabilistic affairs.  If a
table is much larger than the planner guesses, or if the distribution of
tuple values is very uneven, a hash join might use a lot more memory
than -S.  It's hard to tell whether this might be happening without a
lot more info than EXPLAIN provides, however.  EXPLAIN shows that your
tables being hashed are all pretty small, the largest being 
't24 users' at 6955 rows.  If that's accurate then I doubt hash overrun
is the explanation...

Anyone have any other ideas?
        regards, tom lane

************




Tom Lane <tgl@sss.pgh.pa.us> writes:
> Douglas Thomson <dougt@mugc.cc.monash.edu.au> writes:
> > If I hadn't checked that I have indexes on
> > all the attributes (and attribute combinations) that make up the keys
> > needed for the join, I would think the back end was trying to read in
> > and sort a large part of the data just for the query!
>
> Well, of course it *is* ... the trick is to make sure that anything
> really large ends up on disk (in a temp file) and not in memory.

What I don't understand is, why would it bother trying to read in
lots of data and sort it when it already has an index sorted in the
required order? I was trying to set up the individual table indexes
so that the join could be done efficiently, and not need either lots
of memory or temporary disk files. Do indexes not get used for joins?

> Offhand the plan looks pretty reasonable for a 9-way join.  I see that
> you have two sorts and five hashes going on here.  Each of those will
> feel entitled to use as much memory as whatever your -S setting is,
> which IIRC is 512K by default.  But 3.5meg altogether isn't enough to
> explain your memory usage ... unless you are using a larger-than-normal
> -S switch?

No, I use the default.

> Also, while the sorts are pretty certain to honor your -S limit (plus or
> minus some slop), the hashes are much more probabilistic affairs.  If a
> table is much larger than the planner guesses, or if the distribution of
> tuple values is very uneven, a hash join might use a lot more memory
> than -S.  It's hard to tell whether this might be happening without a
> lot more info than EXPLAIN provides, however.  EXPLAIN shows that your
> tables being hashed are all pretty small, the largest being
> 't24 users' at 6955 rows.  If that's accurate then I doubt hash overrun
> is the explanation...

6955 rows is accurate for that table. Values could well be highly
skewed though. The joining attribute for t24 is a student ID "number"
text field, and the numbers will be grouped.

Doug.
Douglas Thomson <dougt@mugc.cc.monash.edu.au> writes:
> I have just noticed that the back end process swells to
> the total size of all the tables involved in my query for the
> duration of the SELECT (or is it the duration of the declared cursor
> now?).

The life of the cursor, probably, but I'm a little surprised by this.

> To summarise the situation:
>     - I am attempting a join on 9 tables, using a cursor to fetch the
>       selected data
>     - the back end process grows to over 20M while processing my query

What query plan does EXPLAIN show for your SELECT?
        regards, tom lane

************




A few weeks ago I reported a problem I was having with front end
memory consumption in SELECT, and Tom's suggestion about using a
cursor solved that problem very nicely at the front end.

However, my front and back end processes are running on the same
server, and I have just noticed that the back end process swells to
the total size of all the tables involved in my query for the
duration of the SELECT (or is it the duration of the declared cursor
now?).

To summarise the situation:
    - back end processes on my system are usually about 3M
      (PostgreSQL 6.5.1, Debian slink, Intel)
    - I am attempting a join on 9 tables, using a cursor to fetch the
      selected data
    - I am only selecting a tiny fraction of the total data
    - I am running in serializable mode if it matters
    - the files used to store my tables (including associated keys
      and indexes etc) take just under 17M on disk
    - the back end process grows to over 20M while processing my query

This is a serious concern, because 17M is just some sample data, and
will eventually be much larger.

Any hints for ways of limiting back end memory consumption (hopefully
to something constant rather than something proportional to the
tables sizes)?

Doug.

************
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Douglas Thomson <dougt@mugc.cc.monash.edu.au> writes:
> > I have just noticed that the back end process swells to
> > the total size of all the tables involved in my query for the
> > duration of the SELECT (or is it the duration of the declared cursor
> > now?).
>
> The life of the cursor, probably, but I'm a little surprised by this.
>
> > To summarise the situation:
> >     - I am attempting a join on 9 tables, using a cursor to fetch the
> >       selected data
> >     - the back end process grows to over 20M while processing my query
>
> What query plan does EXPLAIN show for your SELECT?

OK, here goes! I don't profess to understand the output from EXPLAIN,
so here it is verbatim. If I hadn't checked that I have indexes on
all the attributes (and attribute combinations) that make up the keys
needed for the join, I would think the back end was trying to read in
and sort a large part of the data just for the query!


NOTICE:  _outNode: don't know how to print type 21
NOTICE:  QUERY DUMP:

{ NESTLOOP :cost 0 :size 600832 :width 240 :state <> :qptargetlist ({ TARGETENTRY :resdom { RESDOM :resno 1 :restype 25
:restypmod-1 :resname "name" :reskey 0 :reskeyop 0 :resgroupref 0 :resjunk false } :expr { VAR :varno 65000 :varattno
11:vartype 25 :vartypmod -1  :varlevelsup 0 :varnoold 9 :varoattno 2}} { TARGETENTRY :resdom { RESDOM :resno 2 :restype
25:restypmod -1 :resname "canonical" :reskey 0 :reskeyop 0 :resgroupref 0 :resjunk false } :expr { VAR :varno 65000
:varattno15 :vartype 25 :vartypmod -1  :varlevelsup 0 :varnoold 5 :varoattno 2}}) :qpqual ({ EXPR :typeOid 0  :opType
op:oper { OPER :opno 98 :opid 67 :opresulttype 16 } :args ({ VAR :varno 65000 :varattno 4 :vartype 25 :vartypmod -1
:varlevelsup0 :varnoold 2 :varoattno 6} { VAR :varno 65001 :varattno 2 :vartype 25 :vartypmod -1  :varlevelsup 0
:varnoold1 :varoattno 2})} { EXPR :typeOid 0  :opType op :oper { OPER :opno 98 :opid 67 :opresulttype 16 } :args ({ VAR
:varno65000 :varattno 3 :vartype 25 :vart! 
!
ypmod -1  :varlevelsup 0 :varnoold 2 :varoattno 4} { VAR :varno 65001 :varattno 1 :vartype 25 :vartypmod -1
:varlevelsup0 :varnoold 1 :varoattno 1})}) :lefttree { SEQSCAN :cost 0 :size 0 :width 24 :state <> :qptargetlist ({
TARGETENTRY:resdom { RESDOM :resno 1 :restype 25 :restypmod -1 :resname "<>" :reskey 0 :reskeyop 0 :resgroupref 0
:resjunkfalse } :expr { VAR :varno 1 :varattno 1 :vartype 25 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 1}} {
TARGETENTRY:resdom { RESDOM :resno 2 :restype 25 :restypmod -1 :resname "<>" :reskey 0 :reskeyop 0 :resgroupref 0
:resjunkfalse } :expr { VAR :varno 1 :varattno 2 :vartype 25 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 2}})
:qpqual({ EXPR :typeOid 0  :opType op :oper { OPER :opno 98 :opid 67 :opresulttype 16 } :args ({ VAR :varno 1 :varattno
3:vartype 25 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 3} { CONST :consttype 25 :constlen -1 :constisnull
false:constvalue  32 [  32  0  0  0  47  83  117  98  ! 
!
106  101  99  116  115  47  49  57  57  57  47  115  101  109  49  47  71  67  79  47  49  56  49  50 ]  :constbyval
false})}) :lefttree <> :righttree <> :extprm () :locprm () :initplan <> :nprm 0  :scanrelid 1 } :righttree { SEQSCAN
:cost3.64259e+07 :size 1073741849 :width 216 :state <> :qptargetlist ({ TARGETENTRY :resdom { RESDOM :resno 1 :restype
25:restypmod -1 :resname "<>" :reskey 0 :reskeyop 0 :resgroupref 0 :resjunk false } :expr { VAR :varno 0 :varattno 1
:vartype25 :vartypmod -1  :varlevelsup 0 :varnoold 0 :varoattno 1}} { TARGETENTRY :resdom { RESDOM :resno 2 :restype 25
:restypmod-1 :resname "<>" :reskey 0 :reskeyop 0 :resgroupref 0 :resjunk false } :expr { VAR :varno 0 :varattno 2
:vartype25 :vartypmod -1  :varlevelsup 0 :varnoold 0 :varoattno 3}} { TARGETENTRY :resdom { RESDOM :resno 3 :restype 25
:restypmod-1 :resname "<>" :reskey 0 :reskeyop 0 :resgroupref 0 :resjunk false } :expr { VAR :varno 0 :varattno 3
:vartype25 :vartypmod -1  :varlevelsup 0 :var! 
!
noold 0 :varoattno 4}} { TARGETENTRY :resdom { RESDOM :resno 4 :restype 25 :restypmod -1 :resname "<>" :reskey 0
:reskeyop0 :resgroupref 0 :resjunk false } :expr { VAR :varno 0 :varattno 4 :vartype 25 :vartypmod -1  :varlevelsup 0
:varnoold0 :varoattno 6}} { TARGETENTRY :resdom { RESDOM :resno 5 :restype 25 :restypmod -1 :resname "<>" :reskey 0
:reskeyop0 :resgroupref 0 :resjunk false } :expr { VAR :varno 0 :varattno 5 :vartype 25 :vartypmod -1  :varlevelsup 0
:varnoold0 :varoattno 4}} { TARGETENTRY :resdom { RESDOM :resno 6 :restype 25 :restypmod -1 :resname "<>" :reskey 0
:reskeyop0 :resgroupref 0 :resjunk false } :expr { VAR :varno 0 :varattno 6 :vartype 25 :vartypmod -1  :varlevelsup 0
:varnoold0 :varoattno 8}} { TARGETENTRY :resdom { RESDOM :resno 7 :restype 25 :restypmod -1 :resname "<>" :reskey 0
:reskeyop0 :resgroupref 0 :resjunk false } :expr { VAR :varno 0 :varattno 7 :vartype 25 :vartypmod -1  :varlevelsup 0
:varnoold0 :varoattno 7}} { TARGETENTRY :resdom { ! 
!
RESDOM :resno 8 :restype 25 :restypmod -1 :resname "<>" :reskey 0 :reskeyop 0 :resgroupref 0 :resjunk false }NOTICE:
QUERYPLAN: 

Nested Loop  (cost=0.00 rows=600832 width=240)
  ->  Seq Scan on t25 agents  (cost=0.00 rows=0 width=24)
  ->  Seq Scan  (cost=36425892.00 rows=1073741849 width=216)
        ->  ???  (cost=36425892.00 rows=1073741849 width=216)
              ->  Merge Join  (cost=36425892.00 rows=1073741849 width=216)
                    ->  Merge Join  (cost=8371644.00 rows=850088153 width=192)
                          ->  Seq Scan  (cost=8110708.50 rows=7333003 width=144)
                                ->  Sort  (cost=8110708.50 rows=7333003 width=144)
                                      ->  Hash Join  (cost=13897.26 rows=7333003 width=144)
                                            ->  Hash Join  (cost=4635.32 rows=280070 width=132)
                                                  ->  Hash Join  (cost=2674.97 rows=59324 width=120)
                                                        ->  Hash Join  (cost=2180.41 rows=14894 width=108)
                                                              ->  Hash Join  (cost=1705.82 rows=14315 width=96)
                                                                    ->  Seq Scan on t30 enrolments  (cost=667.40
rows=14315width=72) 
                                                                    ->  Hash  (cost=302.51 rows=6955 width=24)
                                                                          ->  Seq Scan on t24 users  (cost=302.51
rows=6955width=24) 
                                                              ->  Hash  (cost=1.10 rows=3 width=12)
                                                                    ->  Seq Scan on t21 studymodes  (cost=1.10 rows=3
width=12)
                                                        ->  Hash  (cost=1.53 rows=16 width=12)
                                                              ->  Seq Scan on t29 periods  (cost=1.53 rows=16 width=12)
                                                  ->  Hash  (cost=1.33 rows=10 width=12)
                                                        ->  Seq Scan on t23 campuses  (cost=1.33 rows=10 width=12)
                                            ->  Hash  (cost=10.32 rows=252 width=12)
                                                  ->  Seq Scan on t28 courses  (cost=10.32 rows=252 width=12)
                          ->  Seq Scan  (cost=17962.40 rows=29831 width=48)
                                ->  Sort  (cost=17962.40 rows=29831 width=48)
                                      ->  Seq Scan on t27 offerings  (cost=1457.42 rows=29831 width=48)
                    ->  Index Scan using t26_unq_0 on t26 subjects  (cost=894.95 rows=13439 width=24)

NOTICE:  _outNode: don't know how to print type 21


Hope this means something to you,

Doug.

************
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Douglas Thomson <dougt@mugc.cc.monash.edu.au> writes:
> > If I hadn't checked that I have indexes on
> > all the attributes (and attribute combinations) that make up the keys
> > needed for the join, I would think the back end was trying to read in
> > and sort a large part of the data just for the query!
>
> Well, of course it *is* ... the trick is to make sure that anything
> really large ends up on disk (in a temp file) and not in memory.

What I don't understand is, why would it bother trying to read in
lots of data and sort it when it already has an index sorted in the
required order? I was trying to set up the individual table indexes
so that the join could be done efficiently, and not need either lots
of memory or temporary disk files. Do indexes not get used for joins?

> Offhand the plan looks pretty reasonable for a 9-way join.  I see that
> you have two sorts and five hashes going on here.  Each of those will
> feel entitled to use as much memory as whatever your -S setting is,
> which IIRC is 512K by default.  But 3.5meg altogether isn't enough to
> explain your memory usage ... unless you are using a larger-than-normal
> -S switch?

No, I use the default.

> Also, while the sorts are pretty certain to honor your -S limit (plus or
> minus some slop), the hashes are much more probabilistic affairs.  If a
> table is much larger than the planner guesses, or if the distribution of
> tuple values is very uneven, a hash join might use a lot more memory
> than -S.  It's hard to tell whether this might be happening without a
> lot more info than EXPLAIN provides, however.  EXPLAIN shows that your
> tables being hashed are all pretty small, the largest being
> 't24 users' at 6955 rows.  If that's accurate then I doubt hash overrun
> is the explanation...

6955 rows is accurate for that table. Values could well be highly
skewed though. The joining attribute for t24 is a student ID "number"
text field, and the numbers will be grouped.

Doug.

************
Douglas Thomson <dougt@mugc.cc.monash.edu.au> writes:
> If I hadn't checked that I have indexes on
> all the attributes (and attribute combinations) that make up the keys
> needed for the join, I would think the back end was trying to read in
> and sort a large part of the data just for the query!

Well, of course it *is* ... the trick is to make sure that anything
really large ends up on disk (in a temp file) and not in memory.

Offhand the plan looks pretty reasonable for a 9-way join.  I see that
you have two sorts and five hashes going on here.  Each of those will
feel entitled to use as much memory as whatever your -S setting is,
which IIRC is 512K by default.  But 3.5meg altogether isn't enough to
explain your memory usage ... unless you are using a larger-than-normal
-S switch?

Also, while the sorts are pretty certain to honor your -S limit (plus or
minus some slop), the hashes are much more probabilistic affairs.  If a
table is much larger than the planner guesses, or if the distribution of
tuple values is very uneven, a hash join might use a lot more memory
than -S.  It's hard to tell whether this might be happening without a
lot more info than EXPLAIN provides, however.  EXPLAIN shows that your
tables being hashed are all pretty small, the largest being 
't24 users' at 6955 rows.  If that's accurate then I doubt hash overrun
is the explanation...

Anyone have any other ideas?
        regards, tom lane

************