Re: EXPLAIN detail - Mailing list pgsql-performance
From | Luigi N. Puleio |
---|---|
Subject | Re: EXPLAIN detail |
Date | |
Msg-id | 981759.47577.qm@web33504.mail.mud.yahoo.com Whole thread Raw |
In response to | EXPLAIN detail ("Luigi N. Puleio" <npuleio@rocketmail.com>) |
Responses |
Re: EXPLAIN detail
Re: EXPLAIN detail |
List | pgsql-performance |
>> With all that casting, is it possible that appropriate indexes aren't >> being used because your WHERE / ON clauses aren't an exact type match >> for the index? > > You mean to put an index on date with timestamptz datatype column?... > Er ... I'm not quite sure what you mean. Do you mean an index on a cast > of the column, eg: > CREATE INDEX some_idx_name ON some_table ( some_timestamp_field::date ) > then ... maybe. It's hard to be sure when there is so little information > available. It shouldn't be necessary, but there are certainly uses for > that sort of thing - for example, I use a couple of functional indexes > in the schema I'm working on at the moment. It's probably a good idea to > look at ways to avoid doing that first, though. >> Can you post the full EXPLAIN ANALYZE from the query? This snippet >> doesn't even show how records are being looked up. > > HashAggregate (cost=52236.31..52236.33 rows=1 width=34) (actual time=7004.779...7004.782 rows=1 loops=1) > -> Nested Loop (cost=0.00..52236.30 rows=1 width=34) (actual time=3939.450..7004.592 rows=1 loops=1) > Join filter: (("inner".calldate)::time without time zone => ("outer".calldate)::time without time zone) > -> Seq Scan on table a (cost=0.00..27444.03 rows=1 width=26) (actual time=2479.199..2485.266 rows=3 loops=1) > Filter: (((calldate)::date = '2008-04-09'::date) AND ((src)::text = '410'::text) AND (substr((dst)::text, 1,4)='*100'::text) AND ((lastdata)::text ='/dati/ita/loginok'::text)) > ->Seq Scan on table b (cost=0.00..24792.22 rows=3 width=16) (actual time=1504.508..1506.374 rows=1 loops=3) > Filter: ((((lastdata)::text ='/dati/ita/logoutok'::text) AND ('410'::text=(src)::text) AND ('2008-04-09'::date= (calldate)::date)) > Total runtime: 7005.706 ms > Personally, I'd want to get rid of all those casts first. Once that's > cleaned up I'd want to look at creating appropriate indexes on your > tables. If necessary, I might even create a composite index on > (lastdata,src,calldate) . >> What about a \d of the table from psql, or at least a summary of the >> involved column data types and associated indexes? > > this table has an acctid column which is PK then most of the other columns are varchar(80) or so.... > Do you mean that the columns involved in your WHERE and ON clauses, the > ones you're casting to date, timestamp, etc, are stored as VARCHAR? If > so, it's no surprise that the query is slow because you're forcing > PostgreSQL to convert a string to a date, timestamp, or time datatype to > do anything with it ... and you're doing it many times in every query. > That will be VERY slow, and prevent the use of (simple) indexes on those > columns. > If you're really storing dates/times as VARCHAR, you should probably > look at some changes to your database design, starting with the use of > appropriate data types. > That's all guesswork, because you have not provided enough information. > Can you please post the output of psql's \d command on the table in > question? > If for some reason you cannot do that, please at least include the data > type of the primary key and all fields involved in the query, as well as > a list of all the indexes on both tables. > The easy way to do that is to just launch "psql" then run: > \d table > and paste the output to an email. > So for 4 records result, 7 seconds are too way a lot I guess... but as I said before I'm gonna wait if > the responsibleof the server did a VACUUM on the table... > > What do you think?... > If you're really casting VARCHAR to DATE, TIME, TIMESTAMP, etc on demand > then personally I really doubt that dead rows are your problem. Well, this table has a primary key index on first column called acctid which is an integer; instead the calldate column isa TIMESTAMPTZ and in fact I'm using to do (calldate)::date in the ON clause because since the time part of that columnis always different and in the nesting I have to identificate the date is the same... the other two columns (src and lastdata) are both VARCHAR(80) and the query is this one: EXPLAIN ANALYZE SELECT (a.calldate)::date, a.src, a.dst, MIN(e.calldate) - a.calldate FROM cdr a INNER JOIN cdr e ON ((e.calldate)::date = (a.calldate)::date AND e.src = a.src AND e.lastdata = '/dati/ita/logoutok' AND e.calldate >= a.calldate) WHERE (a.calldate)::date = '2008-04-09' AND a.src = '410' AND substr(a.dst, 1, 4) = '*100' AND a.lastdata = '/dati/ita/loginok' GROUP BY a.calldate, a.src, a.dst __________________________________________________ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com
pgsql-performance by date: