Re: OFFSET/LIMIT - Disparate Performance w/ Go application - Mailing list pgsql-performance

From Matt Silverlock
Subject Re: OFFSET/LIMIT - Disparate Performance w/ Go application
Date
Msg-id 7B027793-2097-4309-A754-BB82B3A5EFD9@eatsleprepeat.net
Whole thread Raw
In response to Re: OFFSET/LIMIT - Disparate Performance w/ Go application  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-performance
Thanks for the replies Jeff, Tom and Merlin.

Pages that SELECT multiple rows with OFFSET and LIMIT conditions struggle to top 1.3k req/s

Is that tested at the OFFSET and LIMIT of 0 and 15, as shown in the
explain plan?


Yes — 0 (OFFSET) and 16 (LIMIT), or 15 and 31 (i.e. “second page” of results). There’s no difference on that front. For context, OFFSET is a multiple of 15 (i.e. 15 results per page) and LIMIT is always 15 + 1 in an attempt to fetch one more result, get the len of the returned slice and then return paginate true + slice the last result off if there’s more than 15.



The query in question is: http://explain.depesz.com/s/7g8 and the table schema is as below:

The reported runtime of 0.078 ms should be able to sustain nearly 10
times the reported rate of 1.3k/s, so the bottleneck would seem to be
elsewhere.

Perhaps the bottleneck is formatting the result set in postgres to be
sent over the wire, then sending it over the wire, then parsing it in
the Go connection library to hand back to the Go user code, and then
the Go user code doing something meaningful with it.

What happens if you get rid of the offset and the order by, and just
use limit?  I bet it doesn't change the speed much (because that is
not where the bottleneck is).

You seem to be selecting an awful lot of wide columns.  Do you really
need to see all of them?

- Testing SELECT * FROM … with just LIMIT 15 and no offset yields 1299 request/s at the front end of the application.
- Testing SELECT id, title, company, location, commute, term, expiry_date (a subset of fields) with LIMIT 15 and no OFFSET yields 1800 request/s at the front end.

There’s definitely an increase to be realised there (I’d say by just tossing the rendered HTML field). 

Based on your comments about the Go side of things, I ran a quick test by cutting the table down to 6 records from the 39 in the test DB in all previous tests. This skips the pagination logic (below) and yields 3068 req/s on the front-end. 

// Determine if we have more than one page of results.
// If so, trim the extra result off and set pagination = true
if len(listings) > opts.PerPage {
paginate = true
listings = listings[:opts.PerPage]
}

So there certainly appears to be a bottleneck on the Go side as well (outside of even the DB driver), probably from the garbage generated from slicing the slice, although I’d be keen to know if there’s a better way to approach returning a paginated list of results.

Well, you can also do client side pagination using the row-wise
comparison feature, implemented by you :-).  Cursors can be the best
approach, but it's nice to know the client side approach if you're
really stateless and/or want to be able to pick up external changes
during the browse.


What would be a better approach here? The cursor approach isn’t ideal in my case (although I could make it work), but what other options are there that are stateless?



Some pgbench results from this machine as well:

$ pgbench -c 128 -C -j 4 -T 15 -M extended -S

This is just benchmarking how fast you can make and break connections
to the database.

Because your app is using an embedded connection pooler, this
benchmark isn't very relevant to your situation.

Noted — thanks.


On 13 Jun 2014, at 4:46 AM, Jeff Janes <jeff.janes@gmail.com> wrote:

<snip>

pgsql-performance by date:

Previous
From: Jeff Janes
Date:
Subject: Re: OFFSET/LIMIT - Disparate Performance w/ Go application
Next
From: Tim Kane
Date:
Subject: Re: postgres files in use not staying in linux file cache