Re: [PERFORM] Inappropriate inner table for nested loop join - Mailing list pgsql-performance

From Akihiko Odaki
Subject Re: [PERFORM] Inappropriate inner table for nested loop join
Date
Msg-id 81a17ec0-4556-8d41-e043-777433d8ba10@stu.hosei.ac.jp
Whole thread Raw
In response to Re: [PERFORM] Inappropriate inner table for nested loop join  (Akihiko Odaki <akihiko.odaki.4i@stu.hosei.ac.jp>)
Responses Re: [PERFORM] Inappropriate inner table for nested loop join
Re: [PERFORM] Inappropriate inner table for nested loop join
List pgsql-performance
On 2017-06-23 20:20, Albe Laurenz wrote:
> You could either try to do something like
>
> SELECT *
> FROM (SELECT "posts".*
>        FROM "posts"
>           JOIN "follows" ON "follows"."target_account" =
"posts"."account"
>        WHERE "follows"."owner_account" = $1
>        OFFSET 0) q
> ORDER BY "posts"."timestamp"
> LIMIT 100;

Now I wonder whether it actually sorted or not. As you said, I want to
"find rows with the greatest 'timestamp', match with rows from 'posts'
in a nested loop and stop as soon as it has found 100 matches".

However, it seems to query 100 records without any consideration for
"timestamp", and then sorts them. That is not expected. Here is a
abstract query plan:

  Limit
    ->  Sort
          Sort Key: posts.id DESC
          ->  Nested Loop
                ->  Seq Scan on follows
                      Filter: (owner_account = $1)
                ->  Index Scan using index_posts_on_account on posts
                      Index Cond: (account_id = follows.target_account)

index_posts_on_account is an obsolete index on "posts" and only for
"account". So it does nothing for sorting "timestamp".

Regards,
Akihiko Odaki


pgsql-performance by date:

Previous
From: Akihiko Odaki
Date:
Subject: Re: [PERFORM] Inappropriate inner table for nested loop join
Next
From: Adam Brusselback
Date:
Subject: Re: [PERFORM] Dataset is fetched from cache but still takes same timeto fetch records as first run