Re: join with redundant results VS simpler join plus multiple selects - Mailing list pgsql-general

From WireSpot
Subject Re: join with redundant results VS simpler join plus multiple selects
Date
Msg-id b2d4b0380811200556p26593c1eh883f6aa6c517460d@mail.gmail.com
Whole thread Raw
In response to Re: join with redundant results VS simpler join plus multiple selects  (Craig Ringer <craig@postnewspapers.com.au>)
Responses Re: join with redundant results VS simpler join plus multiple selects  (Craig Ringer <craig@postnewspapers.com.au>)
List pgsql-general
On Thu, Nov 20, 2008 at 15:05, Craig Ringer <craig@postnewspapers.com.au> wrote:
> That's probably going to be the case.  PostgreSQL won't need to read the
> redundant info in from disk each time, and relative to the image data it's
> going to be pretty small. By doing it all in one join you're avoiding the
> overhead of all those network round trips (if on a network), statement
> preparation and planning, etc etc etc. Additionally, PostgreSQL is probably
> going to be using a join plan that's much more efficient than anything
> you'll get by looping over each user and asking for images.

How about if the subset of images for each user is randomized? As in
ORDER BY RANDOM() LIMIT 3. I'm guessing that will put somewhat of a
cramp on the big join scenario and perhaps it becomes better to have
the RANDOM() in the small individual selects?

I'm probably going to give myself the answer, please advise if I'm not
thinking straight:

In this case, from EXPLAIN ANALYZE I get that after introducing
random() and limit, while the cost for the big join scenario is
practically the same, the actual execution time increases with about
0.100ms. Whereas on individual selects with random() limit I get an
increase of 0.040, and since the people will be shown paginated 10 per
page, I'm looking at a 10 x 0.040 = 0.400 increase.

So the big join still comes ahead.

> Note, however, that when testing method (1) in your post you will REALLY
> need to make sure that you're using parameterized prepared statements for
> the image queries.

Definitely, and I'm already working on that (see my other thread).

pgsql-general by date:

Previous
From: Sam Mason
Date:
Subject: Re: Prepared statement already exists
Next
From: Scara Maccai
Date:
Subject: Re: return MAX and when it happened