Thread: Slow joins against set-returning functions

Slow joins against set-returning functions

From
Michael Fuhr
Date:
PostgreSQL versions: 7.4.3, 8.0.0beta1

Joins against set-returning functions can be slow.  Here's a simple
example (in 8.0.0beta1, the gen_series function can be replaced
with generate_series):

  CREATE FUNCTION gen_series(INTEGER, INTEGER) RETURNS SETOF INTEGER AS '
  DECLARE
      xstart  ALIAS FOR $1;
      xend    ALIAS FOR $2;
      x       INTEGER;
  BEGIN
      FOR x IN xstart .. xend LOOP
          RETURN NEXT x;
      END LOOP;

      RETURN;
  END;
  ' LANGUAGE plpgsql IMMUTABLE STRICT;

  CREATE TABLE stuff (
      id    INTEGER NOT NULL PRIMARY KEY,
      item  VARCHAR(32) NOT NULL
  );

  INSERT INTO stuff (id, item)
    SELECT id, 'Item ' || id FROM gen_series(1, 100000) AS f(id);

  ANALYZE stuff;

Here are two queries; notice how the second query, which uses higher
numbers for the join key, is much slower than the first, apparently
due to an inefficient index scan:

  EXPLAIN ANALYZE
    SELECT stuff.* FROM stuff JOIN gen_series(1, 10) AS f(id) USING (id);

   Merge Join  (cost=62.33..2544.33 rows=1001 width=17) (actual time=1.398..1.950 rows=10 loops=1)
     Merge Cond: ("outer".id = "inner".id)
     ->  Index Scan using stuff_pkey on stuff  (cost=0.00..2217.00 rows=100000 width=17) (actual time=0.667..0.860
rows=11loops=1) 
     ->  Sort  (cost=62.33..64.83 rows=1000 width=4) (actual time=0.646..0.670 rows=10 loops=1)
           Sort Key: f.id
           ->  Function Scan on gen_series f  (cost=0.00..12.50 rows=1000 width=4) (actual time=0.403..0.478 rows=10
loops=1)
   Total runtime: 2.529 ms
  (7 rows)

  EXPLAIN ANALYZE
    SELECT stuff.* FROM stuff JOIN gen_series(99991, 100000) AS f(id) USING (id);

   Merge Join  (cost=62.33..2544.33 rows=1001 width=17) (actual time=2907.078..2907.618 rows=10 loops=1)
     Merge Cond: ("outer".id = "inner".id)
     ->  Index Scan using stuff_pkey on stuff  (cost=0.00..2217.00 rows=100000 width=17) (actual time=0.107..2270.722
rows=100000loops=1) 
     ->  Sort  (cost=62.33..64.83 rows=1000 width=4) (actual time=0.630..0.654 rows=10 loops=1)
           Sort Key: f.id
           ->  Function Scan on gen_series f  (cost=0.00..12.50 rows=1000 width=4) (actual time=0.392..0.469 rows=10
loops=1)
   Total runtime: 2908.205 ms
  (7 rows)

If I turn off enable_mergejoin then both queries are fast:

  SET enable_mergejoin TO off;

  EXPLAIN ANALYZE
    SELECT stuff.* FROM stuff JOIN gen_series(1, 10) AS f(id) USING (id);

   Nested Loop  (cost=0.00..3038.50 rows=1001 width=17) (actual time=0.600..1.912 rows=10 loops=1)
     ->  Function Scan on gen_series f  (cost=0.00..12.50 rows=1000 width=4) (actual time=0.395..0.482 rows=10 loops=1)
     ->  Index Scan using stuff_pkey on stuff  (cost=0.00..3.01 rows=1 width=17) (actual time=0.091..0.107 rows=1
loops=10)
           Index Cond: (stuff.id = "outer".id)
   Total runtime: 2.401 ms
  (5 rows)

  EXPLAIN ANALYZE
    SELECT stuff.* FROM stuff JOIN gen_series(99991, 100000) AS f(id) USING (id);

   Nested Loop  (cost=0.00..3038.50 rows=1001 width=17) (actual time=0.586..1.891 rows=10 loops=1)
     ->  Function Scan on gen_series f  (cost=0.00..12.50 rows=1000 width=4) (actual time=0.394..0.479 rows=10 loops=1)
     ->  Index Scan using stuff_pkey on stuff  (cost=0.00..3.01 rows=1 width=17) (actual time=0.089..0.105 rows=1
loops=10)
           Index Cond: (stuff.id = "outer".id)
   Total runtime: 2.374 ms
  (5 rows)

Is the planner doing something wrong here?

Thanks.

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

Re: Slow joins against set-returning functions

From
Tom Lane
Date:
Michael Fuhr <mike@fuhr.org> writes:
> Is the planner doing something wrong here?

Hard to see how it can be very smart with no idea about what the
function is going to return :-(.

I'd say that the mergejoin plan is actually a good choice given the
limited amount of info, because it has the least degradation when the
input varies from what you expected.  Those "better" nestloop plans
could easily be very far worse, if the function returned more than a
trivial number of rows.

The reason the two mergejoin cases differ so much is that the scan of
the other relation can stop as soon as we've exhausted the function
output.  Evidently scanning to key 10 doesn't traverse much of
stuff_pkey while scanning to key 100000 does.  The planner is aware of
that effect, but with no information about the maximum key value to be
expected from the function scan, it can't apply the knowledge.

            regards, tom lane