Thread: Same query - Slow in production

Same query - Slow in production

From

Brian Wipf

Date:

10 May 2006, 19:40:01

I'm trying to determine why an identical query is running
approximately 500 to 1000 times slower on our production database
compared to our backup database server.

Both database servers are dual 2.3 GHz G5 Xserves running PostgreSQL
8.1.3; both are configured with 8GB of RAM with identical shared
memory settings; both postgresql.conf files are identical; both
databases have identical indexes defined.

The three relevant tables are all clustered the same, although I'm
not sure when clustering was last performed on either server. All
three tables have recently been analyzed on both servers.

The different explain plans for this query seem to be consistent on
both servers regardless of category and the production server is
consistently and drastically slower than the backup server.

If anyone has any ideas on how to have the production server generate
the same explain plan as the backup server, or can suggest anything I
might want to try, I would greatly appreciate it.

Brian Wipf
ClickSpace Interactive Inc.
<brian@clickspace.com>

Here's the query:

SELECT    ac.attribute_id
FROM    attribute_category ac
WHERE    is_browsable = 'true' AND
    category_id = 1000962 AND
    EXISTS     (    SELECT     'X'
            FROM     product_attribute_value pav,
                category_product cp
            WHERE     pav.attribute_id = ac.attribute_id AND
                pav.status_code is null AND
                pav.product_id = cp.product_id AND
                cp.category_id = ac.category_id AND
                cp.product_is_active = 'true' AND
                cp.product_status_code = 'complete'
    )

Explain plans:

Fast (backup server):
  Index Scan using attribute_category__category_id_fk_idx on
attribute_category ac  (cost=0.00..47943.34 rows=7 width=4) (actual
time=0.110..0.263 rows=5 loops=1)
    Index Cond: (category_id = 1000962)
    Filter: (((is_browsable)::text = 'true'::text) AND (subplan))
    SubPlan
      ->  Nested Loop  (cost=0.00..7983.94 rows=3 width=0) (actual
time=0.043..0.043 rows=1 loops=5)
            ->  Index Scan using
category_product__category_id_is_active_and_status_idx on
category_product cp  (cost=0.00..4362.64 rows=1103 width=4) (actual
time=0.013..0.015 rows=2 loops=5)
                  Index Cond: ((category_id = $1) AND
((product_is_active)::text = 'true'::text) AND
((product_status_code)::text = 'complete'::text))
            ->  Index Scan using
product_attribute_value__product_id_fk_idx on product_attribute_value
pav  (cost=0.00..3.27 rows=1 width=4) (actual time=0.016..0.016
rows=1 loops=8)
                  Index Cond: (pav.product_id = "outer".product_id)
                  Filter: ((attribute_id = $0) AND (status_code IS
NULL))
Total runtime: 0.449 ms
(11 rows)

Slow (production server):
  Index Scan using attribute_category__category_id_fk_idx on
attribute_category ac  (cost=0.00..107115.90 rows=7 width=4) (actual
time=1.472..464.437 rows=5 loops=1)
    Index Cond: (category_id = 1000962)
    Filter: (((is_browsable)::text = 'true'::text) AND (subplan))
    SubPlan
      ->  Nested Loop  (cost=18.33..23739.70 rows=4 width=0) (actual
time=92.870..92.870 rows=1 loops=5)
            ->  Bitmap Heap Scan on product_attribute_value pav
(cost=18.33..8764.71 rows=2549 width=4) (actual time=10.191..45.672
rows=5869 loops=5)
                  Recheck Cond: (attribute_id = $0)
                  Filter: (status_code IS NULL)
                  ->  Bitmap Index Scan on
product_attribute_value__attribute_id_fk_idx  (cost=0.00..18.33
rows=2952 width=0) (actual time=9.160..9.160 rows=33330 loops=5)
                        Index Cond: (attribute_id = $0)
            ->  Index Scan using x_category_product_pk on
category_product cp  (cost=0.00..5.86 rows=1 width=4) (actual
time=0.007..0.007 rows=0 loops=29345)
                  Index Cond: ((cp.category_id = $1) AND
("outer".product_id = cp.product_id))
                  Filter: (((product_is_active)::text = 'true'::text)
AND ((product_status_code)::text = 'complete'::text))
Total runtime: 464.667 ms
(14 rows)

Table Descriptions:

\d attribute_category;
          Table "public.attribute_category"
      Column      |         Type         | Modifiers
-----------------+----------------------+-----------
attribute_id    | integer              | not null
category_id     | integer              | not null
is_browsable    | character varying(5) |
is_required     | character varying(5) |
sort_order      | integer              |
default_unit_id | integer              |
Indexes:
     "attribute_category_pk" PRIMARY KEY, btree (attribute_id,
category_id)
     "attribute_category__attribute_id_fk_idx" btree (attribute_id)
     "attribute_category__category_id_fk_idx" btree (category_id)
CLUSTER
Foreign-key constraints:
     "attribute_category_attribute_fk" FOREIGN KEY (attribute_id)
REFERENCES attribute(attribute_id) DEFERRABLE INITIALLY DEFERRED
     "attribute_category_category_fk" FOREIGN KEY (category_id)
REFERENCES category(category_id) DEFERRABLE INITIALLY DEFERRED

\d product_attribute_value;
              Table "public.product_attribute_value"
            Column           |         Type          | Modifiers
----------------------------+-----------------------+-----------
attribute_id               | integer               | not null
attribute_unit_id          | integer               |
attribute_value_id         | integer               |
boolean_value              | character varying(5)  |
decimal_value              | numeric(30,10)        |
product_attribute_value_id | integer               | not null
product_id                 | integer               | not null
product_reference_id       | integer               |
status_code                | character varying(32) |
Indexes:
     "product_attribute_value_pk" PRIMARY KEY, btree
(product_attribute_value_id)
     "product_attribute_value__attribute_id_fk_idx" btree (attribute_id)
     "product_attribute_value__attribute_unit_id_fk_idx" btree
(attribute_unit_id)
     "product_attribute_value__attribute_value_id_fk_idx" btree
(attribute_value_id)
     "product_attribute_value__decimal_value_idx" btree (decimal_value)
     "product_attribute_value__product_id_fk_idx" btree (product_id)
CLUSTER
     "product_attribute_value__product_reference_id_fk_idx" btree
(product_reference_id)
Foreign-key constraints:
     "product_attribute_value_attribute_fk" FOREIGN KEY
(attribute_id) REFERENCES attribute(attribute_id) DEFERRABLE
INITIALLY DEFERRED
     "product_attribute_value_attributeunit_fk" FOREIGN KEY
(attribute_unit_id) REFERENCES attribute_unit(attribute_unit_id)
DEFERRABLE INITIALLY DEFERRED
     "product_attribute_value_attributevalue_fk" FOREIGN KEY
(attribute_value_id) REFERENCES attribute_value(attribute_value_id)
DEFERRABLE INITIALLY DEFERRED
     "product_attribute_value_product_fk" FOREIGN KEY (product_id)
REFERENCES product(product_id) DEFERRABLE INITIALLY DEFERRED
     "product_attribute_value_productreference_fk" FOREIGN KEY
(product_reference_id) REFERENCES product(product_id) DEFERRABLE
INITIALLY DEFERRED

\d category_product;
              Table "public.category_product"
        Column        |          Type          | Modifiers
---------------------+------------------------+-----------
category_id         | integer                | not null
product_id          | integer                | not null
en_name_sort_order  | integer                |
fr_name_sort_order  | integer                |
merchant_sort_order | integer                |
price_sort_order    | integer                |
merchant_count      | integer                |
is_active           | character varying(5)   |
product_is_active   | character varying(5)   |
product_status_code | character varying(32)  |
product_name_en     | character varying(512) |
product_name_fr     | character varying(512) |
product_click_count | integer                |
Indexes:
     "x_category_product_pk" PRIMARY KEY, btree (category_id,
product_id)
     "category_product__category_id_is_active_and_status_idx" btree
(category_id, product_is_active, product_status_code)
     "category_product__is_active_idx" btree (is_active)
     "category_product__merchant_sort_order_idx" btree
(merchant_sort_order)
     "x_category_product__category_id_fk_idx" btree (category_id)
CLUSTER
     "x_category_product__product_id_fk_idx" btree (product_id)
Foreign-key constraints:
     "x_category_product_category_fk" FOREIGN KEY (category_id)
REFERENCES category(category_id) DEFERRABLE INITIALLY DEFERRED
     "x_category_product_product_fk" FOREIGN KEY (product_id)
REFERENCES product(product_id) DEFERRABLE INITIALLY DEFERRED

Re: Same query - Slow in production

From

Brian Wipf

Date:

10 May 2006, 20:57:12

I added to the exists query qualifier: AND cp.category_id = 1000962
(in addition to the cp.category_id = ac.category_id)

Now I am getting a much better query plan on our production server:

Index Scan using attribute_category__category_id_fk_idx on
attribute_category ac  (cost=0.00..485.71 rows=7 width=4) (actual
time=0.104..0.351 rows=5 loops=1)
    Index Cond: (category_id = 1000962)
    Filter: (((is_browsable)::text = 'true'::text) AND (subplan))
    SubPlan
      ->  Nested Loop  (cost=0.00..24.77 rows=1 width=0) (actual
time=0.058..0.058 rows=1 loops=5)
            ->  Index Scan using
x_category_product__category_id_fk_idx on category_product cp
(cost=0.00..6.01 rows=1 width=4) (actual time=0.014..0.014 rows=1
loops=5)
                  Index Cond: ((category_id = $1) AND (category_id =
1000962))
                  Filter: (((product_is_active)::text = 'true'::text)
AND ((product_status_code)::text = 'complete'::text))
            ->  Index Scan using
product_attribute_value__product_id_fk_idx on product_attribute_value
pav  (cost=0.00..18.75 rows=1 width=4) (actual time=0.041..0.041
rows=1 loops=5)
                  Index Cond: (pav.product_id = "outer".product_id)
                  Filter: ((attribute_id = $0) AND (status_code IS
NULL))
Total runtime: 0.558 ms
(12 rows)

It is using the x_category_product__category_id_fk_idx on
category_product instead of the
category_product__category_id_is_active_and_status_idx index as on
our backup server. Still not sure what's causing the differences in
query execution between the servers, but at least the query is fast
again.

Brian

On 10-May-06, at 4:39 PM, Brian Wipf wrote:

> I'm trying to determine why an identical query is running
> approximately 500 to 1000 times slower on our production database
> compared to our backup database server.
>
> Both database servers are dual 2.3 GHz G5 Xserves running
> PostgreSQL 8.1.3; both are configured with 8GB of RAM with
> identical shared memory settings; both postgresql.conf files are
> identical; both databases have identical indexes defined.
>
> The three relevant tables are all clustered the same, although I'm
> not sure when clustering was last performed on either server. All
> three tables have recently been analyzed on both servers.
>
> The different explain plans for this query seem to be consistent on
> both servers regardless of category and the production server is
> consistently and drastically slower than the backup server.
>
> If anyone has any ideas on how to have the production server
> generate the same explain plan as the backup server, or can suggest
> anything I might want to try, I would greatly appreciate it.
>
> Brian Wipf
> ClickSpace Interactive Inc.
> <brian@clickspace.com>
>
> Here's the query:
>
> SELECT    ac.attribute_id
> FROM    attribute_category ac
> WHERE    is_browsable = 'true' AND
>     category_id = 1000962 AND
>     EXISTS     (    SELECT     'X'
>             FROM     product_attribute_value pav,
>                 category_product cp
>             WHERE     pav.attribute_id = ac.attribute_id AND
>                 pav.status_code is null AND
>                 pav.product_id = cp.product_id AND
>                 cp.category_id = ac.category_id AND
>                 cp.product_is_active = 'true' AND
>                 cp.product_status_code = 'complete'
>     )
>
> Explain plans:
>
> Fast (backup server):
>  Index Scan using attribute_category__category_id_fk_idx on
> attribute_category ac  (cost=0.00..47943.34 rows=7 width=4) (actual
> time=0.110..0.263 rows=5 loops=1)
>    Index Cond: (category_id = 1000962)
>    Filter: (((is_browsable)::text = 'true'::text) AND (subplan))
>    SubPlan
>      ->  Nested Loop  (cost=0.00..7983.94 rows=3 width=0) (actual
> time=0.043..0.043 rows=1 loops=5)
>            ->  Index Scan using
> category_product__category_id_is_active_and_status_idx on
> category_product cp  (cost=0.00..4362.64 rows=1103 width=4) (actual
> time=0.013..0.015 rows=2 loops=5)
>                  Index Cond: ((category_id = $1) AND
> ((product_is_active)::text = 'true'::text) AND
> ((product_status_code)::text = 'complete'::text))
>            ->  Index Scan using
> product_attribute_value__product_id_fk_idx on
> product_attribute_value pav  (cost=0.00..3.27 rows=1 width=4)
> (actual time=0.016..0.016 rows=1 loops=8)
>                  Index Cond: (pav.product_id = "outer".product_id)
>                  Filter: ((attribute_id = $0) AND (status_code IS
> NULL))
> Total runtime: 0.449 ms
> (11 rows)
>
> Slow (production server):
>  Index Scan using attribute_category__category_id_fk_idx on
> attribute_category ac  (cost=0.00..107115.90 rows=7 width=4)
> (actual time=1.472..464.437 rows=5 loops=1)
>    Index Cond: (category_id = 1000962)
>    Filter: (((is_browsable)::text = 'true'::text) AND (subplan))
>    SubPlan
>      ->  Nested Loop  (cost=18.33..23739.70 rows=4 width=0) (actual
> time=92.870..92.870 rows=1 loops=5)
>            ->  Bitmap Heap Scan on product_attribute_value pav
> (cost=18.33..8764.71 rows=2549 width=4) (actual time=10.191..45.672
> rows=5869 loops=5)
>                  Recheck Cond: (attribute_id = $0)
>                  Filter: (status_code IS NULL)
>                  ->  Bitmap Index Scan on
> product_attribute_value__attribute_id_fk_idx  (cost=0.00..18.33
> rows=2952 width=0) (actual time=9.160..9.160 rows=33330 loops=5)
>                        Index Cond: (attribute_id = $0)
>            ->  Index Scan using x_category_product_pk on
> category_product cp  (cost=0.00..5.86 rows=1 width=4) (actual
> time=0.007..0.007 rows=0 loops=29345)
>                  Index Cond: ((cp.category_id = $1) AND
> ("outer".product_id = cp.product_id))
>                  Filter: (((product_is_active)::text =
> 'true'::text) AND ((product_status_code)::text = 'complete'::text))
> Total runtime: 464.667 ms
> (14 rows)
>
> Table Descriptions:
>
> \d attribute_category;
>          Table "public.attribute_category"
>      Column      |         Type         | Modifiers
> -----------------+----------------------+-----------
> attribute_id    | integer              | not null
> category_id     | integer              | not null
> is_browsable    | character varying(5) |
> is_required     | character varying(5) |
> sort_order      | integer              |
> default_unit_id | integer              |
> Indexes:
>     "attribute_category_pk" PRIMARY KEY, btree (attribute_id,
> category_id)
>     "attribute_category__attribute_id_fk_idx" btree (attribute_id)
>     "attribute_category__category_id_fk_idx" btree (category_id)
> CLUSTER
> Foreign-key constraints:
>     "attribute_category_attribute_fk" FOREIGN KEY (attribute_id)
> REFERENCES attribute(attribute_id) DEFERRABLE INITIALLY DEFERRED
>     "attribute_category_category_fk" FOREIGN KEY (category_id)
> REFERENCES category(category_id) DEFERRABLE INITIALLY DEFERRED
>
> \d product_attribute_value;
>              Table "public.product_attribute_value"
>            Column           |         Type          | Modifiers
> ----------------------------+-----------------------+-----------
> attribute_id               | integer               | not null
> attribute_unit_id          | integer               |
> attribute_value_id         | integer               |
> boolean_value              | character varying(5)  |
> decimal_value              | numeric(30,10)        |
> product_attribute_value_id | integer               | not null
> product_id                 | integer               | not null
> product_reference_id       | integer               |
> status_code                | character varying(32) |
> Indexes:
>     "product_attribute_value_pk" PRIMARY KEY, btree
> (product_attribute_value_id)
>     "product_attribute_value__attribute_id_fk_idx" btree
> (attribute_id)
>     "product_attribute_value__attribute_unit_id_fk_idx" btree
> (attribute_unit_id)
>     "product_attribute_value__attribute_value_id_fk_idx" btree
> (attribute_value_id)
>     "product_attribute_value__decimal_value_idx" btree (decimal_value)
>     "product_attribute_value__product_id_fk_idx" btree (product_id)
> CLUSTER
>     "product_attribute_value__product_reference_id_fk_idx" btree
> (product_reference_id)
> Foreign-key constraints:
>     "product_attribute_value_attribute_fk" FOREIGN KEY
> (attribute_id) REFERENCES attribute(attribute_id) DEFERRABLE
> INITIALLY DEFERRED
>     "product_attribute_value_attributeunit_fk" FOREIGN KEY
> (attribute_unit_id) REFERENCES attribute_unit(attribute_unit_id)
> DEFERRABLE INITIALLY DEFERRED
>     "product_attribute_value_attributevalue_fk" FOREIGN KEY
> (attribute_value_id) REFERENCES attribute_value(attribute_value_id)
> DEFERRABLE INITIALLY DEFERRED
>     "product_attribute_value_product_fk" FOREIGN KEY (product_id)
> REFERENCES product(product_id) DEFERRABLE INITIALLY DEFERRED
>     "product_attribute_value_productreference_fk" FOREIGN KEY
> (product_reference_id) REFERENCES product(product_id) DEFERRABLE
> INITIALLY DEFERRED
>
> \d category_product;
>              Table "public.category_product"
>        Column        |          Type          | Modifiers
> ---------------------+------------------------+-----------
> category_id         | integer                | not null
> product_id          | integer                | not null
> en_name_sort_order  | integer                |
> fr_name_sort_order  | integer                |
> merchant_sort_order | integer                |
> price_sort_order    | integer                |
> merchant_count      | integer                |
> is_active           | character varying(5)   |
> product_is_active   | character varying(5)   |
> product_status_code | character varying(32)  |
> product_name_en     | character varying(512) |
> product_name_fr     | character varying(512) |
> product_click_count | integer                |
> Indexes:
>     "x_category_product_pk" PRIMARY KEY, btree (category_id,
> product_id)
>     "category_product__category_id_is_active_and_status_idx" btree
> (category_id, product_is_active, product_status_code)
>     "category_product__is_active_idx" btree (is_active)
>     "category_product__merchant_sort_order_idx" btree
> (merchant_sort_order)
>     "x_category_product__category_id_fk_idx" btree (category_id)
> CLUSTER
>     "x_category_product__product_id_fk_idx" btree (product_id)
> Foreign-key constraints:
>     "x_category_product_category_fk" FOREIGN KEY (category_id)
> REFERENCES category(category_id) DEFERRABLE INITIALLY DEFERRED
>     "x_category_product_product_fk" FOREIGN KEY (product_id)
> REFERENCES product(product_id) DEFERRABLE INITIALLY DEFERRED

Re: Same query - Slow in production

From

Tom Lane

Date:

10 May 2006, 22:20:23

Brian Wipf <brian@clickspace.com> writes:
> I'm trying to determine why an identical query is running
> approximately 500 to 1000 times slower on our production database
> compared to our backup database server.

It looks to me like it's pure luck that the query is fast on the backup
server.  The outer side of the EXISTS' join is being badly misestimated:

>             ->  Index Scan using
> category_product__category_id_is_active_and_status_idx on
> category_product cp  (cost=0.00..4362.64 rows=1103 width=4) (actual
> time=0.013..0.015 rows=2 loops=5)
>                   Index Cond: ((category_id = $1) AND
> ((product_is_active)::text = 'true'::text) AND
> ((product_status_code)::text = 'complete'::text))

If there actually had been 1100 matching rows instead of 2, the query
would have run 550 times slower, putting it in the same ballpark as
the other plan.  So what I'm guessing is that the planner sees these
two plans as being nearly the same cost, and small differences in the
stats between the two databases are enough to tip its choice in one
direction or the other.

So what you want, of course, is to improve that rowcount estimate.
I suppose the reason it's so bad is that we don't have multicolumn
statistics ... is there a strong correlation between product_is_active
and product_status_code?  If so, it might be worth your while to find a
way to merge them into one column.

            regards, tom lane

Speed Up Offset and Limit Clause

From

"Christian Paul Cosinas"

Date:

11 May 2006, 03:36:11

Hi!

How can I speed up my server's performance when I use offset and limit
clause.

For example I have a query:
SELECT * FROM table ORDER BY id, name OFFSET 100000 LIMIT 10000

This query takes a long time about more than 2 minutes.

If my query is:
SELECT * FROM table ORDER BY id, name OFFSET 50000 LIMIT 10000
It takes about 2 seconds.

Thanks

Re: Speed Up Offset and Limit Clause

From

Chris

Date:

11 May 2006, 03:51:51

Christian Paul Cosinas wrote:
> Hi!
>
> How can I speed up my server's performance when I use offset and limit
> clause.
>
> For example I have a query:
> SELECT * FROM table ORDER BY id, name OFFSET 100000 LIMIT 10000
>
> This query takes a long time about more than 2 minutes.
>
> If my query is:
> SELECT * FROM table ORDER BY id, name OFFSET 50000 LIMIT 10000
> It takes about 2 seconds.

Please create a new thread rather than replying to someone elses post
and changing the subject. These threads can sometimes get missed.

You do have an index on id and name don't you?

--
Postgresql & php tutorials
http://www.designmagick.com/

Re: Speed Up Offset and Limit Clause

From

PFC

Date:

11 May 2006, 04:04:46

    Why do you want to use it this way ?
    Explain what you want to do, there probably is another faster solution...

On Thu, 11 May 2006 16:45:33 +0200, Christian Paul Cosinas
<cpc@cybees.com> wrote:

> Hi!
>
> How can I speed up my server's performance when I use offset and limit
> clause.
>
> For example I have a query:
> SELECT * FROM table ORDER BY id, name OFFSET 100000 LIMIT 10000
>
> This query takes a long time about more than 2 minutes.
>
> If my query is:
> SELECT * FROM table ORDER BY id, name OFFSET 50000 LIMIT 10000
> It takes about 2 seconds.
>
> Thanks
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>        choose an index scan if your joining column's datatypes do not
>        match

Re: Speed Up Offset and Limit Clause

From

Guillaume Cottenceau

Date:

11 May 2006, 04:06:43

"Christian Paul Cosinas" <cpc 'at' cybees.com> writes:

> Hi!
>
> How can I speed up my server's performance when I use offset and limit
> clause.
>
> For example I have a query:
> SELECT * FROM table ORDER BY id, name OFFSET 100000 LIMIT 10000
>
> This query takes a long time about more than 2 minutes.
>
> If my query is:
> SELECT * FROM table ORDER BY id, name OFFSET 50000 LIMIT 10000
> It takes about 2 seconds.

First you should read the appropriate documentation.

http://www.postgresql.org/docs/8.1/interactive/performance-tips.html

--
Guillaume Cottenceau

Re: Speed Up Offset and Limit Clause

From

"Christian Paul Cosinas"

Date:

16 May 2006, 22:41:41

I am creating an application that gets the value of a large table and write
it to a file.

Why I want to use offset and limit is for me to create a threaded
application so that they will not get the same results.

For example:

Thread 1 : gets offset 0 limit 5000
Thread 2 : gets offset 5000 limit 5000
Thread 3 : gets offset 10000 limit 5000

And so on...

Would there be any other faster way than what It thought?

-----Original Message-----
From: PFC [mailto:lists@peufeu.com]
Sent: Thursday, May 11, 2006 7:06 AM
To: Christian Paul Cosinas; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Speed Up Offset and Limit Clause


    Why do you want to use it this way ?
    Explain what you want to do, there probably is another faster
solution...

On Thu, 11 May 2006 16:45:33 +0200, Christian Paul Cosinas
<cpc@cybees.com> wrote:

> Hi!
>
> How can I speed up my server's performance when I use offset and limit
> clause.
>
> For example I have a query:
> SELECT * FROM table ORDER BY id, name OFFSET 100000 LIMIT 10000
>
> This query takes a long time about more than 2 minutes.
>
> If my query is:
> SELECT * FROM table ORDER BY id, name OFFSET 50000 LIMIT 10000
> It takes about 2 seconds.
>
> Thanks
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>        choose an index scan if your joining column's datatypes do not
>        match

Re: Speed Up Offset and Limit Clause

From

"Craig A. James"

Date:

16 May 2006, 23:20:37

Christian Paul Cosinas wrote:
> I am creating an application that gets the value of a large table and write
> it to a file.
>
> Why I want to use offset and limit is for me to create a threaded
> application so that they will not get the same results.
>
> For example:
>
> Thread 1 : gets offset 0 limit 5000
> Thread 2 : gets offset 5000 limit 5000
> Thread 3 : gets offset 10000 limit 5000
>
> And so on...
>
> Would there be any other faster way than what It thought?

In order to return rows 10000 to 15000, it must select all rows from zero to 15000 and then discard the first 10000 --
probablynot what you were hoping for. 

You might add a "thread" column.  Say you want to run ten threads:

   create sequence thread_seq
     increment by 1
     minvalue 1 maxvalue 10
     cycle
     start with 1;

   create table mytable(
      column1    integer,
      ... other columns...,
      thread     integer default nextval('thread_seq')
   );

   create bitmap index i_mytable_thread on mytable(thread);

Now whenever you insert into mytable, you get a value in mytable.thread between 1 and 10, and it's indexed with a
highlyefficient bitmap index.  So your query becomes: 

   Thread 1:  select ... from mytable where ... and thread = 1;
   Thread 2:  select ... from mytable where ... and thread = 2;
   ... and so forth.

Craig

Re: Speed Up Offset and Limit Clause

From

"Jim C. Nasby"

Date:

17 May 2006, 16:04:16

On Tue, May 16, 2006 at 07:20:12PM -0700, Craig A. James wrote:
> >Why I want to use offset and limit is for me to create a threaded
> >application so that they will not get the same results.
>
> In order to return rows 10000 to 15000, it must select all rows from zero
> to 15000 and then discard the first 10000 -- probably not what you were
> hoping for.
>
> You might add a "thread" column.  Say you want to run ten threads:

Another possibility is partitioning the table. If you do that using
inheritance-based partitioning, you could just select directly from
different partition tables, which probably be even faster than using a
single table. The downside is it's more work to setup.
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: Speed Up Offset and Limit Clause

From

PFC

Date:

27 May 2006, 18:02:27

> Thread 1 : gets offset 0 limit 5000
> Thread 2 : gets offset 5000 limit 5000
> Thread 3 : gets offset 10000 limit 5000
>
> Would there be any other faster way than what It thought?

    Yeah, sure, use a thread which does the whole query (maybe using a
cursor) and fills a queue with the results, then N threads consuming from
that queue... it will work better.