Thread: BUG #12866: Another performance problem with intarray extenstion operators

BUG #12866: Another performance problem with intarray extenstion operators

From
maxim.boguk@gmail.com
Date:
The following bug has been logged on the website:

Bug reference:      12866
Logged by:          Maksym Boguk
Email address:      maxim.boguk@gmail.com
PostgreSQL version: 9.3.5
Operating system:   Linux
Description:

Hi,

I found second case where intarray operators have serious performance
issues:

create table test1 as select array_agg(id) as _array from
generate_series(100000,1, -1) as g(id);
analyze verbose test;

generic version:
# EXPLAIN ANALYZE SELECT 1 FROM test1 WHERE ARRAY[16997]
OPERATOR(pg_catalog.<@) _array;
                                           QUERY PLAN
------------------------------------------------------------------------------------------------
 Seq Scan on test1  (cost=0.00..17.38 rows=7 width=0) (actual
time=1.286..1.287 rows=1 loops=1)
   Filter: ('{16997}'::integer[] OPERATOR(pg_catalog.<@) _array)
 Total runtime: 1.301 ms

intarray version:
=# EXPLAIN ANALYZE SELECT 1 FROM test1 WHERE ARRAY[16997]
OPERATOR(public.<@) _array;
                                              QUERY PLAN
------------------------------------------------------------------------------------------------------
 Seq Scan on test1  (cost=0.00..17.38 rows=1 width=0) (actual
time=2738.589..2738.591 rows=1 loops=1)
   Filter: ('{16997}'::integer[] <@ _array)
 Total runtime: 2738.607 ms

Whats made situation worse - intarray version could not be terminated by
ctrl-c or pg_terminate_backend (so don't try it with 1M or more array
elements).

PS: filling array in descending order is critical to reproducing the issue.

PPS: my previous bug report about issues with intarray (wrong selectivity
estimator functions used) seems stuck in the moderation queue.
maxim.boguk@gmail.com writes:
> I found second case where intarray operators have serious performance
> issues:
> ...
> PS: filling array in descending order is critical to reproducing the issue.

The problem is evidently in isort():

    /*
     * We use a simple insertion sort.  While this is O(N^2) in the worst
     * case, it's quite fast if the input is already sorted or nearly so.
     * Also, for not-too-large inputs it's faster than more complex methods
     * anyhow.
     */

That comment, as well as the current code, date to my commit fdf2dbda;
but it doesn't look like the previous code was any better in terms of
worst-case behavior.  It's tempting to just trash the whole thing in
favor of qsort().

            regards, tom lane