Re: Need help optimizing this query - Mailing list pgsql-general

From Pat Maddox
Subject Re: Need help optimizing this query
Date
Msg-id 810a540e0707181412x5f520a8bg1aa3b9d003dc0c46@mail.gmail.com
Whole thread Raw
In response to Re: Need help optimizing this query  (Michael Glaesemann <grzm@seespotcode.net>)
Responses Re: Need help optimizing this query
List pgsql-general
On 7/18/07, Michael Glaesemann <grzm@seespotcode.net> wrote:
>
> On 7/18/07, Pat Maddox <pergesu@gmail.com> wrote:
> > I've got a query that's taking forever (as will be obvious when you
> > see it and the explain output).  I can't figure out what indexes to
> > add to make this run faster.  I'd appreciate any help.
> >
>
> I'm curious why it's choosing to use hash joins rather than taking
> advantage of the indexes you have on the foreign key columns. What
> are the table definitions? Are hit, logged_in, played, downloaded all
> columns of videos_views?
>
> Have you ANALYZEd these tables? What is the EXPLAIN ANALYZE output
> for this query? You only provided the EXPLAIN output, which doesn't
> compare the plan with the actual query cost.
>
> I found it quite difficult to read you query. I reformatted it and
> also used some SQL functions to abstract away the CASE expressions.
> These SQL functions will probably be inlined so there should be very
> little overhead. If you have a boolean column, you don't need to test
> IS TRUE or IS FALSE: you can just use the value itself. I also find
> it helpful to separate the join conditions (in the JOIN clause) from
> the restrictions (the WHERE clause), which I've done below.
>
> CREATE OR REPLACE FUNCTION ternary(BOOLEAN, INTEGER, INTEGER)
> RETURNS INTEGER
> LANGUAGE SQL AS $_$SELECT CASE WHEN $1 THEN $2 ELSE $3 END$_$;
>
> CREATE OR REPLACE FUNCTION value_when(BOOLEAN, INTEGER)
> RETURNS INTEGER
> LANGUAGE SQL AS $_$SELECT ternary($1,$2,0)$_$;
>
> CREATE OR REPLACE FUNCTION one_when(BOOLEAN)
> RETURNS INTEGER
> LANGUAGE SQL as $_$SELECT value_when($1,1)$_$;
>
> SELECT
>      SUM (one_when(hit AND logged_in)) AS count_hits_console
>      , SUM (one_when(hit AND NOT logged_in)) AS count_hits_remote
>      , SUM (one_when(played AND logged_in)) AS count_played_console
>      , SUM (one_when(played AND NOT logged_in)) AS count_played_remote
>      , SUM (one_when(downloaded AND logged_in)) AS
> count_downloaded_console
>      , SUM (one_when(downloaded AND NOT logged_in)) AS
> count_downloaded_remote
>      , SUM (value_when((played OR downloaded) AND logged_in,
> assets.size))
>          as download_size_console
>      , SUM (value_when((played OR downloaded) AND NOT logged_in),
> assets.size)
>          as download_size_remote
>      , videos.id
>      , videos.title
>      , videos.guid
> FROM video_views
> JOIN assets ON (video_views.video_id=videos.id)
> JOIN videos ON (video_views.asset_id=assets.id)
> WHERE videos.company_id=1
> GROUP BY videos.id
>      , videos.title
>      , videos.guid
> ORDER BY count_hits_remote
> DESC LIMIT 100
>
>
>
> Michael Glaesemann
> grzm seespotcode net
>
>
>

Michael,

I tried your SQL but it didn't work - it was missing the videos table
in the FROM clause.  But when I add it, I get the error:

ERROR:  invalid reference to FROM-clause entry for table "video_views"
LINE 20: JOIN assets ON (video_views.video_id=videos.id)
                         ^
HINT:  There is an entry for table "video_views", but it cannot be
referenced from this part of the query.


Not really sure what that means.

Here are the table definitions:

twistage_development=# \d video_views
                                     Table "public.video_views"
   Column   |            Type             |
Modifiers
------------+-----------------------------+----------------------------------------------------------
 id         | integer                     | not null default
nextval('video_views_id_seq'::regclass)
 video_id   | integer                     |
 created_at | timestamp without time zone |
 asset_id   | integer                     |
 played     | boolean                     | default false
 downloaded | boolean                     | default false
 hit        | boolean                     | default false
 logged_in  | boolean                     | default false
Indexes:
    "video_views_pkey" PRIMARY KEY, btree (id)
    "index_video_views_on_asset_id" btree (asset_id)
    "index_video_views_on_video_id" btree (video_id)
    "index_video_views_on_video_id_and_asset_id_and_created_at" btree
(video_id, created_at, asset_id)

twistage_development=# \d videos
                                           Table "public.videos"
        Column         |            Type             |                      Modi
fiers
-----------------------+-----------------------------+--------------------------
---------------------------
 id                    | integer                     | not null default nextval(
'videos_id_seq'::regclass)
 title                 | character varying(255)      |
 duration              | double precision            |
 description           | text                        |
 status                | character varying(255)      |
 user_id               | integer                     |
 created_at            | timestamp without time zone |
 upload_finished       | boolean                     | default false
 publisher_name        | character varying(255)      |
 company_id            | integer                     |
 available_for_display | boolean                     | default true
 guid                  | character varying(255)      |
 main_asset_id         | integer                     |
 container_type        | character varying(255)      |
 codec                 | character varying(255)      |
 site_id               | integer                     |
 deleted_at            | timestamp without time zone |
 purged_at             | timestamp without time zone |
 remote_hits_count     | integer                     | default 0
Indexes:
    "videos_pkey" PRIMARY KEY, btree (id)
    "complete_videos_without_deleted_at" btree (company_id, status)
    "index_complete_videos" btree (deleted_at, purged_at, status, created_at, co
mpany_id)
    "index_videos_on_company_id" btree (company_id)
    "index_videos_on_deleted_at_and_status_and_site_id" btree (status, deleted_a
t, site_id)
    "index_videos_on_guid" btree (guid)
    "index_videos_on_publisher_name" btree (publisher_name)
    "index_videos_on_site_id" btree (site_id)
    "index_videos_on_user_id" btree (user_id)

twistage_development=# \d assets
                                        Table "public.assets"
     Column      |            Type             |
Modifiers
-----------------+-----------------------------+-----------------------------------------------------
 id              | integer                     | not null default
nextval('assets_id_seq'::regclass)
 video_id        | integer                     |
 video_format_id | integer                     |
 guid            | character varying(255)      |
 source_path     | character varying(255)      |
 size            | bigint                      |
 vresolution     | integer                     |
 hresolution     | integer                     |
 video_bitrate   | integer                     |
 frame_rate      | integer                     |
 container       | character varying(255)      |
 vcodec          | character varying(255)      |
 status          | character varying(255)      |
 deleted_at      | timestamp without time zone |
 audio_bitrate   | integer                     |
 acodec          | character varying(255)      |
 duration        | double precision            |
Indexes:
    "assets_pkey" PRIMARY KEY, btree (id)
    "index_assets_on_video_format_id" btree (video_format_id)
    "index_assets_on_video_id" btree (video_id)

pgsql-general by date:

Previous
From: "Andrej Ricnik-Bay"
Date:
Subject: Re: Sylph-Searcher 1.0.0 released
Next
From: Michael Glaesemann
Date:
Subject: Re: Need help optimizing this query