Thread: Need more speed from this.

Need more speed from this.

From
Brian Avis
Date:
I hope someone can help me out with this.

Postgres 7.1.3
Solaris 2.8

I have a table called system_data setup like this.


     | server_id |  date  |  time  | data_type_id | data |
     |___________|________|________|______________|______|
     |           |        |        |              |      |
     |     26    | a date | a time |     8        |  98  |
     |           |        |        |              |      |
     -----------------------------------------------------


This table is indexed on server_id and data_type_id.

It currently has about 674,456 rows and is growing on a daily
basis.

The problem is that selects from this table are really slow.

For instance if I do this.

SELECT AVG(data) FROM system_data
        WHERE date BETWEEN CAST('2001-02-01' AS DATE)
            AND CAST ('2001-02-28' AS DATE)
        AND
            server_id = 26
        AND
            data_type_id = 8;

It will take several seconds to complete this select.

I need this type of data retrieval to be much faster if
possible.

Thanks in advance for any assistance.


Re: Need more speed from this.

From
Martín Marqués
Date:
On Lun 19 Nov 2001 18:06, you wrote:
> I hope someone can help me out with this.
>
> Postgres 7.1.3
> Solaris 2.8
>
> I have a table called system_data setup like this.
>
>      | server_id |  date  |  time  | data_type_id | data |
>      |___________|________|________|______________|______|
>      |
>      |     26    | a date | a time |     8        |  98  |
>
>      -----------------------------------------------------
>
>
> This table is indexed on server_id and data_type_id.
>
> It currently has about 674,456 rows and is growing on a daily
> basis.
>
> The problem is that selects from this table are really slow.
>
> For instance if I do this.
>
> SELECT AVG(data) FROM system_data
>         WHERE date BETWEEN CAST('2001-02-01' AS DATE)
>             AND CAST ('2001-02-28' AS DATE)
>         AND
>             server_id = 26
>         AND
>             data_type_id = 8;
>
> It will take several seconds to complete this select.
>
> I need this type of data retrieval to be much faster if
> possible.

have you tried adding an index on the date column? It should make the
condition 'date between ...' much faster.

Saludos... :-)

--
Porqué usar una base de datos relacional cualquiera,
si podés usar PostgreSQL?
-----------------------------------------------------------------
Martín Marqués                  |        mmarques@unl.edu.ar
Programador, Administrador, DBA |       Centro de Telematica
                       Universidad Nacional
                            del Litoral
-----------------------------------------------------------------

Re: Need more speed from this.

From
"Eric Ridge"
Date:
I bet EXPLAIN shows a seq. scan b/c of the date field.  index the date
field too.

eric

> -----Original Message-----
> From: Brian Avis [mailto:brian.avis@searhc.org]
> Sent: Monday, November 19, 2001 4:06 PM
> To: pgsql-general@postgresql.org
> Subject: [GENERAL] Need more speed from this.
>
>
> I hope someone can help me out with this.
>
> Postgres 7.1.3
> Solaris 2.8
>
> I have a table called system_data setup like this.
>
>
>      | server_id |  date  |  time  | data_type_id | data |
>      |___________|________|________|______________|______|
>      |           |        |        |              |      |
>      |     26    | a date | a time |     8        |  98  |
>      |           |        |        |              |      |
>      -----------------------------------------------------
>
>
> This table is indexed on server_id and data_type_id.
>
> It currently has about 674,456 rows and is growing on a daily
> basis.
>
> The problem is that selects from this table are really slow.
>
> For instance if I do this.
>
> SELECT AVG(data) FROM system_data
>         WHERE date BETWEEN CAST('2001-02-01' AS DATE)
>             AND CAST ('2001-02-28' AS DATE)
>         AND
>             server_id = 26
>         AND
>             data_type_id = 8;
>
> It will take several seconds to complete this select.
>
> I need this type of data retrieval to be much faster if
> possible.
>
> Thanks in advance for any assistance.
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org
>

Re: Need more speed from this.

From
wsheldah@lexmark.com
Date:

Standard questions:
Have you run VACUUM ANALYZE recently?

Could you send the results of running EXPLAIN on the query in question?

Do the statistics from the EXPLAIN look reasonably accurate?

How about adding an index to the date column?

Is the table growing interactively, or via a once-a-day batch?

HTH,

Wes Sheldahl




Brian Avis <brian.avis%searhc.org@interlock.lexmark.com> on 11/19/2001 04:06:17
PM

To:   pgsql-general%postgresql.org@interlock.lexmark.com
cc:    (bcc: Wesley Sheldahl/Lex/Lexmark)
Subject:  [GENERAL] Need more speed from this.


I hope someone can help me out with this.

Postgres 7.1.3
Solaris 2.8

I have a table called system_data setup like this.


     | server_id |  date  |  time  | data_type_id | data |
     |___________|________|________|______________|______|
     |           |        |        |              |      |
     |     26    | a date | a time |     8        |  98  |
     |           |        |        |              |      |
     -----------------------------------------------------


This table is indexed on server_id and data_type_id.

It currently has about 674,456 rows and is growing on a daily
basis.

The problem is that selects from this table are really slow.

For instance if I do this.

SELECT AVG(data) FROM system_data
          WHERE date BETWEEN CAST('2001-02-01' AS DATE)
               AND CAST ('2001-02-28' AS DATE)
          AND
               server_id = 26
          AND
               data_type_id = 8;

It will take several seconds to complete this select.

I need this type of data retrieval to be much faster if
possible.

Thanks in advance for any assistance.


---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org





Re: Need more speed from this.

From
Stephan Szabo
Date:
On Mon, 19 Nov 2001, Brian Avis wrote:

>      | server_id |  date  |  time  | data_type_id | data |
>      |___________|________|________|______________|______|
>      |           |        |        |              |      |
>      |     26    | a date | a time |     8        |  98  |
>      |           |        |        |              |      |
>      -----------------------------------------------------
>
>
> This table is indexed on server_id and data_type_id.
>
> It currently has about 674,456 rows and is growing on a daily
> basis.
>
> The problem is that selects from this table are really slow.
>
> For instance if I do this.
>
> SELECT AVG(data) FROM system_data
>         WHERE date BETWEEN CAST('2001-02-01' AS DATE)
>             AND CAST ('2001-02-28' AS DATE)
>         AND
>             server_id = 26
>         AND
>             data_type_id = 8;
>
> It will take several seconds to complete this select.

Let's go through the usual.  Have you used vacuum analyze
recently and what does explain show for the query?



Re: Need more speed from this.

From
wsheldah@lexmark.com
Date:

Forwarding to the list......
---------------------- Forwarded by Wesley Sheldahl/Lex/Lexmark on 11/19/2001
05:16 PM ---------------------------

Brian Avis <brian.avis%searhc.org@interlock.lexmark.com> on 11/19/2001 05:02:39
PM

To:   "Wesley_Sheldahl/Lex/Lexmark.LEXMARK"@sweeper.lex.lexmark.com
cc:    (bcc: Wesley Sheldahl/Lex/Lexmark)
Subject:  Re: [GENERAL] Need more speed from this.


Okay then.  :)

EXPLAIN SELECT AVG(data) FROM system_data
          WHERE date BETWEEN CAST('2000-01-01' AS DATE)
               AND CAST ('2000-01-31' AS DATE)
          AND
               server_id = 26
          AND
               data_type_id = 8;



Gives me this.

NOTICE:  QUERY PLAN:

Aggregate  (cost=17296.84..17296.84 rows=1 width=8)
  ->  Index Scan using data_type_id_idx on system_data
(cost=0.00..17296.84 rows=1 width=8)

EXPLAIN


The table grows once a day via a perl script.

I will add an index to the date column and see what happens.




wsheldah@lexmark.com wrote:

>
>Standard questions:
>Have you run VACUUM ANALYZE recently?
>
>Could you send the results of running EXPLAIN on the query in question?
>
>Do the statistics from the EXPLAIN look reasonably accurate?
>
>How about adding an index to the date column?
>
>Is the table growing interactively, or via a once-a-day batch?
>
>HTH,
>
>Wes Sheldahl
>
>
>
>
>
>
>

--
Brian Avis
SEARHC Medical Clinic
Juneau, AK 99801
(907) 463-4049
cd /pub
more beer







Re: Need more speed from this.

From
wsheldah@lexmark.com
Date:

Forwarding to the list...........

---------------------- Forwarded by Wesley Sheldahl/Lex/Lexmark on 11/19/2001
05:19 PM ---------------------------

Brian Avis <brian.avis%searhc.org@interlock.lexmark.com> on 11/19/2001 05:10:31
PM

To:   "Wesley_Sheldahl/Lex/Lexmark.LEXMARK"@sweeper.lex.lexmark.com
cc:    (bcc: Wesley Sheldahl/Lex/Lexmark)
Subject:  Re: [GENERAL] Need more speed from this.


indexing the date field sped things up a lot. Thanks for the help.

Oh and I forgot to mention.  I didn't mention the vacuum analyze bit
because this is a brand new database that hasn't had very many updates
or deletes at all, maybe half a dozen.

Thanks again gang.


wsheldah@lexmark.com wrote:

>
>Standard questions:
>Have you run VACUUM ANALYZE recently?
>
>Could you send the results of running EXPLAIN on the query in question?
>
>Do the statistics from the EXPLAIN look reasonably accurate?
>
>How about adding an index to the date column?
>
>Is the table growing interactively, or via a once-a-day batch?
>
>HTH,
>
>Wes Sheldahl
>
>
>
>
>Brian Avis <brian.avis%searhc.org@interlock.lexmark.com> on 11/19/2001 04:06:17
>PM
>
>To:   pgsql-general%postgresql.org@interlock.lexmark.com
>cc:    (bcc: Wesley Sheldahl/Lex/Lexmark)
>Subject:  [GENERAL] Need more speed from this.
>
>
>I hope someone can help me out with this.
>
>Postgres 7.1.3
>Solaris 2.8
>
>I have a table called system_data setup like this.
>
>
>     | server_id |  date  |  time  | data_type_id | data |
>     |___________|________|________|______________|______|
>     |           |        |        |              |      |
>     |     26    | a date | a time |     8        |  98  |
>     |           |        |        |              |      |
>     -----------------------------------------------------
>
>
>This table is indexed on server_id and data_type_id.
>
>It currently has about 674,456 rows and is growing on a daily
>basis.
>
>The problem is that selects from this table are really slow.
>
>For instance if I do this.
>
>SELECT AVG(data) FROM system_data
>          WHERE date BETWEEN CAST('2001-02-01' AS DATE)
>               AND CAST ('2001-02-28' AS DATE)
>          AND
>               server_id = 26
>          AND
>               data_type_id = 8;
>
>It will take several seconds to complete this select.
>
>I need this type of data retrieval to be much faster if
>possible.
>
>Thanks in advance for any assistance.
>
>
>---------------------------(end of broadcast)---------------------------
>TIP 6: Have you searched our list archives?
>
>http://archives.postgresql.org
>
>
>
>

--
Brian Avis
SEARHC Medical Clinic
Juneau, AK 99801
(907) 463-4049
cd /pub
more beer







Re: Need more speed from this.

From
Tom Lane
Date:
wsheldah@lexmark.com writes:
> Aggregate  (cost=17296.84..17296.84 rows=1 width=8)
>   ->  Index Scan using data_type_id_idx on system_data
> (cost=0.00..17296.84 rows=1 width=8)

That's an awfully high cost for an indexscan that's only supposed to
return one row!

I'd guess that in fact the indexscan itself is returning a lot of rows,
and that the reduction to a small number of rows comes mostly from
applying the restrictions on date and server_id (which actually happens
above the index search, though you can't see the difference in EXPLAIN's
depiction).  How many rows are there with data_type_id = 8?

Merely having an index is not much help if the index is unselective,
which is to say that there aren't many distinct values in the indexed
column.  It would seem though that the system thinks the data_type_id
index is the best alternative it's got.  Would scanning on server_id
yield a smaller number of matches in the index?  How about date, or
a multicolumn index combining some of these fields?

            regards, tom lane