Thread: how to improve perf of 131MM row table?

how to improve perf of 131MM row table?

From: AJ Weber
Date:
Sorry for the semi-newbie question...

I have a relatively sizable postgresql 9.0.2 DB with a few large tables
(keep in mind "large" is relative, I'm sure there are plenty larger out
there).

One of my queries that seems to be bogging-down performance is a join
between two tables on each of their BIGINT PK's (so they have default
unique constraint/PK indexes on them).  One table is a detail table for
the other.  The "master" has about 6mm rows.  The detail table has about
131mm rows (table size = 17GB, index size = 16GB).

I unfortunately have limited disks, so I can't actually move to multiple
spindles, but wonder if there is anything I can do (should I partition
the data, etc.) to improve performance?  Maybe some further tuning to my
.conf, but I do think that's using as much mem as I can spare right now
(happy to send it along if it would help).

DB is vacuumed nightly with stats updates enabled.  I can send the
statistics info listed in pgAdmin tab if that would help.

Any suggestions, tips, tricks, links, etc. are welcomed!

Thanks in advance,
AJ



Re: how to improve perf of 131MM row table?

From: Shaun Thomas
Date:
On 06/25/2014 03:10 PM, AJ Weber wrote:

> I have a relatively sizable postgresql 9.0.2 DB with a few large tables
> (keep in mind "large" is relative, I'm sure there are plenty larger out
> there).

Regardless of any help we might offer regarding this, you need to
upgrade your installation to 9.0.17. You are behind by several
performance, security, and integrity bugfixes, some of which address
critical corruption bugs related to replication.

> One of my queries that seems to be bogging-down performance is a join
> between two tables on each of their BIGINT PK's (so they have default
> unique constraint/PK indexes on them).  One table is a detail table for
> the other.

This isn't enough information. Just knowing the relative sizes of the
tables doesn't tell us which columns are indexed, whether or not the
query is using those indexes, how many rows usually match, which queries
are performing badly, and so on.

Please refer to this page to ask performance related questions:

https://wiki.postgresql.org/wiki/Slow_Query_Questions

Without much of this information, we'd only be speculating.

--
Shaun Thomas
OptionsHouse, LLC | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
312-676-8870
sthomas@optionshouse.com

______________________________________________

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email


Re: how to improve perf of 131MM row table?

From: Aaron Weber
Date:
I will gather the other data tonight. Thank you.

In the meantime, I guess I wasn't clear about some other particulars:
The query's where clause is only an "IN", with a list of id's (those I mentioned are the PK), and the join is explicitly on the PK (so, indexed).

Thus, there should be only the explicit matches to the in clause returned, and if postgresql isn't using the unique index on that column, I would be very shocked (to the point I would suggest there is a bug somewhere).

An IN with 50 int values took 23sec to return (by way of example).

Thanks again.
--
Aaron


Re: how to improve perf of 131MM row table?

From: Shaun Thomas
Date:
On 06/25/2014 04:40 PM, Aaron Weber wrote:

> In the meantime, I guess I wasn't clear about some other particulars
> The query's where clause is only an "IN", with a list of id's (those
> I mentioned are the PK), and the join is explicitly on the PK (so,
> indexed).

Indexed doesn't mean indexed if the wrong datatypes are used. We need to
see the table and index definitions, and a sample query with EXPLAIN
ANALYZE output.
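A hypothetical illustration of the datatype point (the table and column names here are invented, not taken from the schema under discussion): a btree index matches only the column's declared type, so a predicate that casts the column away from that type cannot use it.

```sql
-- Invented example table with an indexed BIGINT column
CREATE TABLE detail (node_id BIGINT NOT NULL);
CREATE INDEX idx_detail_node ON detail (node_id);

-- Can use idx_detail_node: the literal is compared as BIGINT
EXPLAIN SELECT * FROM detail WHERE node_id = 175769;

-- Cannot use idx_detail_node: the column itself is cast to text,
-- so the btree on the BIGINT value no longer applies
EXPLAIN SELECT * FROM detail WHERE node_id::text = '175769';
```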

> An IN with 50 int values took 23sec to return (by way of example).

To me, this sounds like a sequence scan, or one of your key matches so
many rows, the random seeks are throwing off your performance. Of
course, I can't confirm that without EXPLAIN output.

--
Shaun Thomas
OptionsHouse, LLC | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
312-676-8870
sthomas@optionshouse.com



Re: how to improve perf of 131MM row table?

From: Aaron Weber
Date:
Will get what you asked for ASAP. Thanks for your time.
--
Aaron


Re: how to improve perf of 131MM row table?

From: AJ Weber
Date:
OK, the sample query is attached (hopefully attachments are allowed) as
"query.sql".
The "master table" definition is attached as "table1.sql".
The "detail table" definition is attached as "table2.sql".
The EXPLAIN (ANALYZE, BUFFERS) output is here:
http://explain.depesz.com/s/vd5

Let me know if I can provide anything else, and thank you again.

-AJ


On 6/25/2014 5:55 PM, Shaun Thomas wrote:
> On 06/25/2014 04:40 PM, Aaron Weber wrote:
>
>> In the meantime, I guess I wasn't clear about some other particulars
>> The query's where clause is only an "IN", with a list of id's (those
>> I mentioned are the PK), and the join is explicitly on the PK (so,
>> indexed).
>
> Indexed doesn't mean indexed if the wrong datatypes are used. We need
> to see the table and index definitions, and a sample query with
> EXPLAIN ANALYZE output.
>
>> An IN with 50 int values took 23sec to return (by way of example).
>
> To me, this sounds like a sequence scan, or one of your key matches so
> many rows, the random seeks are throwing off your performance. Of
> course, I can't confirm that without EXPLAIN output.
>


Attachment

Re: how to improve perf of 131MM row table?

From: Matheus de Oliveira
Date:

On Thu, Jun 26, 2014 at 10:26 AM, AJ Weber <aweber@comcast.net> wrote:
OK, the sample query is attached (hopefully attachments are allowed) as "query.sql".
The "master table" definition is attached as "table1.sql".
The "detail table" definition is attached as "table2.sql".
The EXPLAIN (ANALYZE, BUFFERS) output is here: http://explain.depesz.com/s/vd5

Could you try changing your query and sending the EXPLAIN of the following?

Instead of `node_id IN ('175769', '175771', ...)` try `node_id IN (VALUES ('175769'), ('175771'), ... )`.


--
Matheus de Oliveira
Analista de Banco de Dados
Dextra Sistemas - MPS.Br nível F!
www.dextra.com.br/postgres

Re: how to improve perf of 131MM row table?

From: Shaun Thomas
Date:
On 06/26/2014 08:26 AM, AJ Weber wrote:

> The "master table" definition is attached as "table1.sql".
> The "detail table" definition is attached as "table2.sql".

I'm not sure what you think a primary key is, but neither of these
tables has one. Primary keys are declared in one of two ways:

CREATE TABLE foo
(
   id    BIGINT PRIMARY KEY,
   col1  VARCHAR,
   col2  INT
);

Or this:

CREATE TABLE foo
(
   id    BIGINT,
   col1  VARCHAR,
   col2  INT
);

ALTER TABLE foo ADD constraint pk_foo PRIMARY KEY (id);

On your alf_node_properties table, you only have an index on node_id
because you created one. If you look at your alf_node table, there is no
index on the id column at all. This is confirmed by the explain output
you attached:

Seq Scan on alf_node node  (cost=0.00..227265.29 rows=5733429 width=16)
(actual time=0.013..2029.649 rows=5733888 loops=1)

Since it has no index, the database is reading the entire table to find
your matching values. Then it's using the index on node_id in the other
table to find the 'detail' matches, as seen here:

Bitmap Index Scan on fk_alf_nprop_n  (cost=0.00..1240.00 rows=52790
width=0) (actual time=0.552..0.552 rows=1071 loops=1)

Add an actual primary key to your alf_node table, and your query
performance should improve substantially. But I also strongly suggest
you spend some time learning how to read an EXPLAIN plan, as that would
have made your problem obvious immediately.
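If the primary key really is missing, the fix Shaun describes would look something like this (a sketch only; adding the constraint will fail if the id column contains duplicates or NULLs):

```sql
-- Add the missing primary key on alf_node
ALTER TABLE alf_node ADD CONSTRAINT alf_node_pkey PRIMARY KEY (id);

-- Re-run the query under EXPLAIN; the Seq Scan on alf_node should be
-- replaced by an index scan on alf_node_pkey
EXPLAIN (ANALYZE, BUFFERS) SELECT ...;  -- the original attached query
```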

Here's a link for your version:

http://www.postgresql.org/docs/9.0/static/sql-explain.html

You should still consider upgrading to the latest release of 9.0 too.

--
Shaun Thomas
OptionsHouse, LLC | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
312-676-8870
sthomas@optionshouse.com



Re: how to improve perf of 131MM row table?

From: AJ Weber
Date:
I will try this, but can you clarify the syntax?  I only know the VALUES clause from insert statements, and it would be one set of parens like VALUES('175769', '175771', ... )

You seem to indicate a VALUES clause that has strange parentheses corresponding to it.

Thank you for the feedback and offer to help!

-AJ



Re: how to improve perf of 131MM row table?

From: Matheus de Oliveira
Date:

On Thu, Jun 26, 2014 at 11:07 AM, AJ Weber <aweber@comcast.net> wrote:
> I will try this, but can you clarify the syntax?  I only know the VALUES clause from insert statements, and it would be one set of parens like VALUES('175769', '175771', ... )


That is for multiple columns; mine is for multiple rows (and it also works on INSERT, to insert many rows at once).

The result would be:

    WHERE  node_id IN
               (VALUES ('175769'), ('175771'), ('175781'), ('175825'), ('175881'), ('175893'), ('175919'), ('175932'), ('175963'), ('175999'), ('176022'), ('176079'), ('176099'), ('176115'), ('176118'), ('176171'), ('176181'), ('176217'), ('176220'), ('176243'), ('176283'), ('176312'), ('176326'), ('176335'), ('176377'), ('176441'), ('176444'), ('176475'), ('176530'), ('176570'), ('176623'), ('176674'), ('176701'), ('176730'), ('176748'), ('176763'), ('176771'), ('176808'), ('176836'), ('176851'), ('176864'), ('176881'), ('176929'), ('176945'), ('176947'), ('176960'), ('177006'), ('177039'), ('177079'), ('177131'), ('177144'))


> You seem to indicate a VALUES clause that has strange parentheses corresponding to it.

No, nothing strange, you are just not aware of the syntax. See [1], and a clearer example (for INSERT) at [2]; look for "To insert multiple rows using the multirow VALUES syntax:".
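For reference, the multirow INSERT form of the same syntax looks like this (example adapted from the PostgreSQL documentation, not from this schema):

```sql
-- Each parenthesized group after VALUES is one row
INSERT INTO films (code, title) VALUES
    ('B6717', 'Tampopo'),
    ('HG120', 'The Dinner Game');
```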

Re: how to improve perf of 131MM row table?

From: AJ Weber
Date:
I sent the details as identified by pgAdmin III.

psql output shows this:
\d alf_node
                Table "public.alf_node"
      Column     |          Type          | Modifiers
----------------+------------------------+-----------
  id             | bigint                 | not null
  version        | bigint                 | not null
  store_id       | bigint                 | not null
  uuid           | character varying(36)  | not null
  transaction_id | bigint                 | not null
  node_deleted   | boolean                | not null
  type_qname_id  | bigint                 | not null
  locale_id      | bigint                 | not null
  acl_id         | bigint                 |
  audit_creator  | character varying(255) |
  audit_created  | character varying(30)  |
  audit_modifier | character varying(255) |
  audit_modified | character varying(30)  |
  audit_accessed | character varying(30)  |
Indexes:
     "alf_node_pkey" PRIMARY KEY, btree (id) CLUSTER
     "store_id" UNIQUE, btree (store_id, uuid)
     "fk_alf_node_acl" btree (acl_id)
     "fk_alf_node_loc" btree (locale_id)
     "fk_alf_node_store" btree (store_id)
     "fk_alf_node_tqn" btree (type_qname_id)
     "fk_alf_node_txn" btree (transaction_id)
     "idx_alf_node_del" btree (node_deleted)
     "idx_alf_node_txn_del" btree (transaction_id, node_deleted)
Foreign-key constraints:
     "fk_alf_node_acl" FOREIGN KEY (acl_id) REFERENCES
alf_access_control_list(id)
     "fk_alf_node_loc" FOREIGN KEY (locale_id) REFERENCES alf_locale(id)
     "fk_alf_node_store" FOREIGN KEY (store_id) REFERENCES alf_store(id)
     "fk_alf_node_tqn" FOREIGN KEY (type_qname_id) REFERENCES alf_qname(id)
     "fk_alf_node_txn" FOREIGN KEY (transaction_id) REFERENCES
alf_transaction(id)
Referenced by:
     TABLE "alf_child_assoc" CONSTRAINT "fk_alf_cass_cnode" FOREIGN KEY
(child_node_id) REFERENCES alf_node(id)
     TABLE "alf_child_assoc" CONSTRAINT "fk_alf_cass_pnode" FOREIGN KEY
(parent_node_id) REFERENCES alf_node(id)
     TABLE "alf_node_aspects" CONSTRAINT "fk_alf_nasp_n" FOREIGN KEY
(node_id) REFERENCES alf_node(id)
     TABLE "alf_node_assoc" CONSTRAINT "fk_alf_nass_snode" FOREIGN KEY
(source_node_id) REFERENCES alf_node(id)
     TABLE "alf_node_assoc" CONSTRAINT "fk_alf_nass_tnode" FOREIGN KEY
(target_node_id) REFERENCES alf_node(id)
     TABLE "alf_node_properties" CONSTRAINT "fk_alf_nprop_n" FOREIGN KEY
(node_id) REFERENCES alf_node(id)
     TABLE "alf_store" CONSTRAINT "fk_alf_store_root" FOREIGN KEY
(root_node_id) REFERENCES alf_node(id)
     TABLE "alf_subscriptions" CONSTRAINT "fk_alf_sub_node" FOREIGN KEY
(node_id) REFERENCES alf_node(id) ON DELETE CASCADE
     TABLE "alf_subscriptions" CONSTRAINT "fk_alf_sub_user" FOREIGN KEY
(user_node_id) REFERENCES alf_node(id) ON DELETE CASCADE
     TABLE "alf_usage_delta" CONSTRAINT "fk_alf_usaged_n" FOREIGN KEY
(node_id) REFERENCES alf_node(id)

This line of the output:
     "alf_node_pkey" PRIMARY KEY, btree (id) CLUSTER
would indicate to me that there is a PK on alf_node table, it is on
column "id", it is of type btree, and the table is clustered around that
index.

Am I reading this totally wrong?

The supporting table actually seems to have a multi-column PK defined,
and a separate btree index on node_id as you mentioned.

-AJ


On 6/26/2014 10:05 AM, Shaun Thomas wrote:
> On 06/26/2014 08:26 AM, AJ Weber wrote:
>
>> The "master table" definition is attached as "table1.sql".
>> The "detail table" definition is attached as "table2.sql".
>
> I'm not sure what you think a primary key is, but neither of these
> tables have one. Primary keys are declared one of two ways:
>
> CREATE TABLE foo
> (
>   id    BIGINT PRIMARY KEY,
>   col1  VARCHAR,
>   col2  INT
> );
>
> Or this:
>
> CREATE TABLE foo
> (
>   id    BIGINT,
>   col1  VARCHAR,
>   col2  INT
> );
>
> ALTER TABLE foo ADD constraint pk_foo PRIMARY KEY (id);
>
> On your alf_node_properties table, you only have an index on node_id
> because you created one. If you look at your alf_node table, there is
> no index on the id column at all. This is confirmed by the explain
> output you attached:
>
> Seq Scan on alf_node node  (cost=0.00..227265.29 rows=5733429
> width=16) (actual time=0.013..2029.649 rows=5733888 loops=1)
>
> Since it has no index, the database is reading the entire table to
> find your matching values. Then it's using the index on node_id in the
> other table to find the 'detail' matches, as seen here:
>
> Bitmap Index Scan on fk_alf_nprop_n  (cost=0.00..1240.00 rows=52790
> width=0) (actual time=0.552..0.552 rows=1071 loops=1)
>
> Add an actual primary key to your alf_node table, and your query
> performance should improve substantially. But I also strongly suggest
> you spend some time learning how to read an EXPLAIN plan, as that
> would have made your problem obvious immediately.
>
> Here's a link for your version:
>
> http://www.postgresql.org/docs/9.0/static/sql-explain.html
>
> You should still consider upgrading to the latest release of 9.0 too.
>



Re: how to improve perf of 131MM row table?

From: Shaun Thomas
Date:
On 06/26/2014 09:22 AM, AJ Weber wrote:

> I sent the details as identified by pgAdmin III.

Interesting. Either there is a bug in pgAdmin, or you're connecting to a
different database that is missing the primary key. What is the EXPLAIN
ANALYZE output if you execute the query you sent on a psql prompt?

>      "alf_node_pkey" PRIMARY KEY, btree (id) CLUSTER
> would indicate to me that there is a PK on alf_node table, it is on
> column "id", it is of type btree, and the table is clustered around that
> index.
>
> Am I reading this totally wrong?

No, that's right. But that wasn't in the SQL you sent. In fact, there's
a lot of stuff missing in that output.

Try running the EXPLAIN ANALYZE using the same psql connection you used
to retrieve the actual table structure just now. I suspect you've
accidentally connected to the wrong database. If it's still doing the
sequence scan, we'll have to dig deeper.

--
Shaun Thomas
OptionsHouse, LLC | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
312-676-8870
sthomas@optionshouse.com



Re: how to improve perf of 131MM row table?

From: AJ Weber
Date:
From psql (same session as previous \d output) --

 Hash Join  (cost=328182.35..548154.83 rows=52790 width=187) (actual time=4157.886..4965.466 rows=1071 loops=1)
   Hash Cond: (prop.node_id = node.id)
   Buffers: shared hit=146711 read=23498, temp read=23676 written=23646
   ->  Bitmap Heap Scan on alf_node_properties prop  (cost=1253.19..189491.88 rows=52790 width=179) (actual time=0.429..1.154 rows=1071 loops=1)
         Recheck Cond: (node_id = ANY ('{175769,175771,175781,175825,175881,175893,175919,175932,175963,175999,176022,176079,176099,176115,176118,176171,176181,176217,176220,176243,176283,176312,176326,176335,176377,176441,176444,176475,176530,176570,176623,176674,176701,176730,176748,176763,176771,176808,176836,176851,176864,176881,176929,176945,176947,176960,177006,177039,177079,177131,177144}'::bigint[]))
         Buffers: shared hit=278
         ->  Bitmap Index Scan on fk_alf_nprop_n  (cost=0.00..1240.00 rows=52790 width=0) (actual time=0.411..0.411 rows=1071 loops=1)
               Index Cond: (node_id = ANY ('{175769,175771,175781,175825,175881,175893,175919,175932,175963,175999,176022,176079,176099,176115,176118,176171,176181,176217,176220,176243,176283,176312,176326,176335,176377,176441,176444,176475,176530,176570,176623,176674,176701,176730,176748,176763,176771,176808,176836,176851,176864,176881,176929,176945,176947,176960,177006,177039,177079,177131,177144}'::bigint[]))
               Buffers: shared hit=207
   ->  Hash  (cost=227265.29..227265.29 rows=5733429 width=16) (actual time=4156.075..4156.075 rows=5734255 loops=1)
         Buckets: 65536  Batches: 16  Memory Usage: 16888kB
         Buffers: shared hit=146433 read=23498, temp written=23609
         ->  Seq Scan on alf_node node  (cost=0.00..227265.29 rows=5733429 width=16) (actual time=0.004..1908.493 rows=5734255 loops=1)
               Buffers: shared hit=146433 read=23498
 Total runtime: 4967.674 ms
(15 rows)

On 6/26/2014 10:37 AM, Shaun Thomas wrote:
> On 06/26/2014 09:22 AM, AJ Weber wrote:
>
>> I sent the details as identified by pgAdmin III.
>
> Interesting. Either there is a bug in pgAdmin, or you're connecting to
> a different database that is missing the primary key. What is the
> EXPLAIN ANALYZE output if you execute the query you sent on a psql
> prompt?
>
>>      "alf_node_pkey" PRIMARY KEY, btree (id) CLUSTER
>> would indicate to me that there is a PK on alf_node table, it is on
>> column "id", it is of type btree, and the table is clustered around that
>> index.
>>
>> Am I reading this totally wrong?
>
> No, that's right. But that wasn't in the SQL you sent. In fact,
> there's a lot of stuff missing in that output.
>
> Try running the EXPLAIN ANALYZE using the same psql connection you
> used to retrieve the actual table structure just now. I suspect you've
> accidentally connected to the wrong database. If it's still doing the
> sequence scan, we'll have to dig deeper.
>



Re: how to improve perf of 131MM row table?

From: Claudio Freire
Date:
On Thu, Jun 26, 2014 at 10:26 AM, AJ Weber <aweber@comcast.net> wrote:
> OK, the sample query is attached (hopefully attachments are allowed) as
> "query.sql".
> The "master table" definition is attached as "table1.sql".
> The "detail table" definition is attached as "table2.sql".
> The EXPLAIN (ANALYZE, BUFFERS) output is here:
> http://explain.depesz.com/s/vd5


I think the problem is that you're sending strings in the ids, instead
of integers.

Remove the quotes, leave only the numbers. That will make pg able to
infer that node.id = prop.node_id means it can also use an index on
alf_node_properties.

I think.

Try.
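Spelled out, with the id list truncated (the full query was attached rather than inlined, so this is a paraphrase):

```sql
-- As generated by the application: quoted string literals
SELECT * FROM alf_node_properties WHERE node_id IN ('175769', '175771');

-- The suggestion: bare integer literals
SELECT * FROM alf_node_properties WHERE node_id IN (175769, 175771);
```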


Re: how to improve perf of 131MM row table?

From: AJ Weber
Date:
I noticed this too.  I am trying to find where the actual SQL is
generated, and I am seeing if this is an artifact of Hibernate.

Will test the same query without the quotes as you recommend.  (But I
don't know where to fix that, if it is the actual issue, unfortunately.)

On 6/26/2014 11:35 AM, Claudio Freire wrote:
> On Thu, Jun 26, 2014 at 10:26 AM, AJ Weber <aweber@comcast.net> wrote:
>> OK, the sample query is attached (hopefully attachments are allowed) as
>> "query.sql".
>> The "master table" definition is attached as "table1.sql".
>> The "detail table" definition is attached as "table2.sql".
>> The EXPLAIN (ANALYZE, BUFFERS) output is here:
>> http://explain.depesz.com/s/vd5
>
> I think the problem is that you're sending strings in the ids, instead
> of integers.
>
> Remove the quotes, leave only the numbers. That will make pg able to
> infer that node.id = prop.node_id means it can also use an index on
> alf_node_properties.
>
> I think.
>
> Try.



Re: how to improve perf of 131MM row table?

From: Claudio Freire
Date:
On Thu, Jun 26, 2014 at 12:38 PM, AJ Weber <aweber@comcast.net> wrote:
> I noticed this too.  I am trying to find where the actual SQL is generated,
> and I am seeing if this is an artifact of Hibernate.
>
> Will test the same query without the quotes as you recommend.  (But I don't
> know where to fix that, if it is the actual issue, unfortunately.)

Last time I used Hibernate it was an ancient version, but that version
didn't handle bigint very well, IIRC.

What we did is write a customized dialect that did the proper handling
of it. I really can't remember the details, nor whether this still
applies to the latest version. But it's worth a look.


Re: how to improve perf of 131MM row table?

From: AJ Weber
Date:
FWIW: I tested removing the quotes around each value, and it did not
change the plan (I am a little surprised too, but I guess PG is "smarter
than that").

Thanks for the idea.

On 6/26/2014 11:38 AM, AJ Weber wrote:
> I noticed this too.  I am trying to find where the actual SQL is
> generated, and I am seeing if this is an artifact of Hibernate.
>
> Will test the same query without the quotes as you recommend. (But I
> don't know where to fix that, if it is the actual issue, unfortunately.)
>
> On 6/26/2014 11:35 AM, Claudio Freire wrote:
>> On Thu, Jun 26, 2014 at 10:26 AM, AJ Weber <aweber@comcast.net> wrote:
>>> OK, the sample query is attached (hopefully attachments are allowed) as
>>> "query.sql".
>>> The "master table" definition is attached as "table1.sql".
>>> The "detail table" definition is attached as "table2.sql".
>>> The EXPLAIN (ANALYZE, BUFFERS) output is here:
>>> http://explain.depesz.com/s/vd5
>>
>> I think the problem is that you're sending strings in the ids, instead
>> of integers.
>>
>> Remove the quotes, leave only the numbers. That will make pg able to
>> infer that node.id = prop.node_id means it can also use an index on
>> alf_node_properties.
>>
>> I think.
>>
>> Try.
>
>
>



Re: how to improve perf of 131MM row table?

From: Claudio Freire
Date:
On Thu, Jun 26, 2014 at 12:48 PM, AJ Weber <aweber@comcast.net> wrote:
> FWIW: I tested removing the quotes around each value, and it did not change
> the plan (I am a little surprised too, but I guess PG is "smarter than
> that").
>
> Thanks for the idea.


Ok, second round.

Try changing node_id in (...) into node.id in (...)
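Spelled out against the two tables (a paraphrase, since the real query was attached; the id list is truncated here):

```sql
-- Before: the IN list filters the detail table's FK column
SELECT ...
FROM   alf_node node
JOIN   alf_node_properties prop ON prop.node_id = node.id
WHERE  prop.node_id IN (175769, 175771 /* , ... */);

-- After: the IN list filters the master table's PK, so the planner can
-- probe alf_node_pkey for ~50 rows and loop into fk_alf_nprop_n
SELECT ...
FROM   alf_node node
JOIN   alf_node_properties prop ON prop.node_id = node.id
WHERE  node.id IN (175769, 175771 /* , ... */);
```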


Re: how to improve perf of 131MM row table?

From: Shaun Thomas
Date:
On 06/26/2014 11:19 AM, Claudio Freire wrote:

> Try changing node_id in (...) into node.id in (...)

Wow. How did we not see that earlier? That's probably the issue. If you
look at the estimates of his query:

Bitmap Heap Scan on alf_node_properties prop  (cost=1253.19..189491.87
rows=52790 width=179) (actual time=0.571..1.349 rows=1071 loops=1)

The planner is off by an order of magnitude, and since the matches are
against node_id instead of node.id, it thinks it would have to index
seek on the alf_node table for over 50k rows. I could easily see it
opting for a sequence scan in that case, depending on how high
random_page_cost is.

--
Shaun Thomas
OptionsHouse, LLC | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
312-676-8870
sthomas@optionshouse.com



Re: how to improve perf of 131MM row table?

From: Sébastien Lorion
Date:

I see "CONSTRAINT alf_node_pkey PRIMARY KEY (id)" for table1 and "CONSTRAINT alf_node_properties_pkey PRIMARY KEY (node_id, qname_id, list_index, locale_id)" for table2. When you say there is no primary key defined, is that based on the execution plan?

Sébastien

Re: how to improve perf of 131MM row table?

From: AJ Weber
Date:
On 6/26/2014 12:23 PM, Shaun Thomas wrote:
> On 06/26/2014 11:19 AM, Claudio Freire wrote:
>
>> Try changing node_id in (...) into node.id in (...)
>
That looks much better to my untrained eye!  (Am I right?)

 Nested Loop  (cost=218.29..21305.47 rows=53480 width=187) (actual time=42.347..43.617 rows=1071 loops=1)
   Buffers: shared hit=487 read=15
   ->  Bitmap Heap Scan on alf_node node  (cost=218.29..423.40 rows=51 width=16) (actual time=42.334..42.413 rows=51 loops=1)
         Recheck Cond: (id = ANY ('{175769,175771,175781,175825,175881,175893,175919,175932,175963,175999,176022,176079,176099,176115,176118,176171,176181,176217,176220,176243,176283,176312,176326,176335,176377,176441,176444,176475,176530,176570,176623,176674,176701,176730,176748,176763,176771,176808,176836,176851,176864,176881,176929,176945,176947,176960,177006,177039,177079,177131,177144}'::bigint[]))
         Buffers: shared hit=159 read=15
         ->  Bitmap Index Scan on alf_node_pkey  (cost=0.00..218.28 rows=51 width=0) (actual time=42.326..42.326 rows=51 loops=1)
               Index Cond: (id = ANY ('{175769,175771,175781,175825,175881,175893,175919,175932,175963,175999,176022,176079,176099,176115,176118,176171,176181,176217,176220,176243,176283,176312,176326,176335,176377,176441,176444,176475,176530,176570,176623,176674,176701,176730,176748,176763,176771,176808,176836,176851,176864,176881,176929,176945,176947,176960,177006,177039,177079,177131,177144}'::bigint[]))
               Buffers: shared hit=146 read=7
   ->  Index Scan using fk_alf_nprop_n on alf_node_properties prop  (cost=0.00..396.34 rows=1049 width=179) (actual time=0.006..0.013 rows=21 loops=51)
         Index Cond: (prop.node_id = node.id)
         Buffers: shared hit=328
 Total runtime: 43.747 ms

AM I RIGHT?  (That it's much better -- I thank Claudio and Shaun for
being right!)


Re: how to improve perf of 131MM row table?

From: Jeff Janes
Date:
On Wed, Jun 25, 2014 at 2:40 PM, Aaron Weber <aweber@comcast.net> wrote:
> I will gather the other data tonight. Thank you.
>
> In the meantime, I guess I wasn't clear about some other particulars
> The query's where clause is only an "IN", with a list of id's (those I
> mentioned are the PK), and the join is explicitly on the PK (so, indexed).

The PK of the master table and the PK of the detail table cannot be
the same thing, or they would not have a master-detail relationship.
One side has to be an FK, not a PK.
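In schema terms, the distinction looks like this (a generic sketch, not the actual Alfresco DDL):

```sql
CREATE TABLE master (
    id BIGINT PRIMARY KEY               -- the PK carries a unique index
);

CREATE TABLE detail (
    master_id BIGINT NOT NULL REFERENCES master (id),  -- the FK side
    qname_id  BIGINT NOT NULL,
    PRIMARY KEY (master_id, qname_id)   -- detail PK is composite
);

-- PostgreSQL does not index FK columns automatically; here the composite
-- PK's leading column already covers lookups by master_id.
```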

>
> An IN with 50 int values took 23sec to return (by way of example).

If that is 50 PKs from the master table, it would be about 1000 on the
detail table.  If you have 5600 rpm drives and every detail row
requires one index leaf page and one table page to be read from disk,
then 23 seconds is right on the nose. Although they shouldn't require
a different leaf page each because all entries for the same master row
should be adjacent in the index, so that does sound a little high if
this is the only thing going on.

Cheers,

Jeff


Re: how to improve perf of 131MM row table?

From: Shaun Thomas
Date:
On 06/26/2014 03:14 PM, Jeff Janes wrote:

> If that is 50 PKs from the master table, it would be about 1000 on the
> detail table.

You're right. But here's the funny part: we solved this after we noticed
his where clause was directed at the *detail* table instead of the
master table. This was compounded by the fact the planner incorrectly
estimated the row match count on the detail table due to the well-known
correlation deficiencies especially present in older versions. The row
count went from 1000 to 50,000.

Then it joined against the master table. Since 50,000 index page fetches
followed by 50,000 data page fetches would be pretty damn slow, the
planner went for a sequence scan on the master table instead. Clearly
the old 9.0 planner does not consider transitive IN equality.

I'm curious to see if Aaron can test his structure on 9.3 with the
original data and WHERE clause and see if the planner still goes for the
terrible plan. If it does, that would seem like an obvious planner tweak
to me.

--
Shaun Thomas
OptionsHouse, LLC | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
312-676-8870
sthomas@optionshouse.com



Re: how to improve perf of 131MM row table?

From: Aaron Weber
Date:


>The PK of the master table and the PK of the detail table cannot be
>the same thing, or they would not have a master-detail relationship.
>One side has to be an FK, not a PK.
>
Of course this is correct. I was trying to make the point that there should be unique indices (of whatever flavor PG uses for PK's by default) on the relevant columns.  Since we're referring to a select statement, the actual integrity constraints should not come into play.

I will remember to be more explicit about the schema next time.



Re: how to improve perf of 131MM row table?

From: Aaron Weber
Date:


>I'm curious to see if Aaron can test his structure on 9.3 with the
>original data and WHERE clause and see if the planner still goes for
>the
>terrible plan. If it does, that would seem like an obvious planner
>tweak
>to me.

I will try to spin up a test 9.3 db and run the same queries to see if this is the case.  Least I can do in return for the help.