Thread: Bad performance on simple query
Hi folks,

I have a simple table that keeps track of a user's access history.
It has a few fields, but the important ones are:
 - ownerId: the user's ID, an int8
 - accessTS: the timestamp of the record

The table right now is small, only 1942 records.
The user I test with (10015) has only 89 entries.

What I want is to get the last 5 accesses of a user:

   SELECT * FROM triphistory WHERE ownerId = 10015 ORDER BY accessTS DESC LIMIT 5

If I create a composite index *and* analyze:

   create index IDX_TRIP_HISTORY_OWNER_ACCESS_TS on tripHistory (ownerId, accessTS);
   ANALYZE triphistory;

It takes 0.091s (!):

perpedes_db=# EXPLAIN ANALYZE SELECT * FROM triphistory WHERE ownerId = 10015 ORDER BY accessTS DESC LIMIT 5;
                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..7.99 rows=5 width=106) (actual time=0.024..0.042 rows=5 loops=1)
   ->  Index Scan Backward using idx_trip_history_owner_access_ts on triphistory  (cost=0.00..142.20 rows=89 width=106) (actual time=0.021..0.034 rows=5 loops=1)
         Index Cond: (ownerid = 10015)
 Total runtime: 0.091 ms
(4 rows)

BTW, this is after several runs of the query, shouldn't all this stuff be in memory?

This is not a fast machine, but this seems rather excessive, no?

-- 
Dimi Paun <dimi@lattica.com>
Lattica, Inc.
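For reference, a minimal sketch of the schema under discussion. The column
types are assumed from the description above (accessTS could equally be
timestamp without time zone), and the DESC-index variant at the end, along
with its name, is hypothetical:

   -- Minimal reconstruction of the table described above (the real
   -- table has additional fields, elided here):
   CREATE TABLE triphistory (
       ownerId  int8        NOT NULL,
       accessTS timestamptz NOT NULL
       -- ... other fields ...
   );

   -- The composite index from the post. The plan shown satisfies
   -- ORDER BY accessTS DESC by walking this index backward:
   CREATE INDEX idx_trip_history_owner_access_ts
       ON triphistory (ownerId, accessTS);

   -- On PostgreSQL 8.3 or later, declaring the sort order in the index
   -- lets the same query use a plain forward index scan instead. This
   -- is optional here, since the query already runs in microseconds:
   CREATE INDEX idx_trip_history_owner_access_ts_desc
       ON triphistory (ownerId, accessTS DESC);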
On Monday 17 November 2008, Dimi Paun <dimi@lattica.com> wrote:
> It takes 0.091s (!):
>
> perpedes_db=# EXPLAIN ANALYZE SELECT * FROM triphistory WHERE ownerId = 10015 ORDER BY accessTS DESC LIMIT 5;
>                                    QUERY PLAN
> --------------------------------------------------------------------------
>  Limit  (cost=0.00..7.99 rows=5 width=106) (actual time=0.024..0.042 rows=5 loops=1)
>    ->  Index Scan Backward using idx_trip_history_owner_access_ts on triphistory  (cost=0.00..142.20 rows=89 width=106) (actual time=0.021..0.034 rows=5 loops=1)
>          Index Cond: (ownerid = 10015)
>  Total runtime: 0.091 ms

That's 0.091 milliseconds (0.000091 seconds).

-- 
Corporations will ingest natural resources and defecate garbage until all
resources are depleted, debt can no longer be repaid and our money becomes
worthless - Jay Hanson
On Mon, Nov 17, 2008 at 8:53 AM, Dimi Paun <dimi@lattica.com> wrote:
> Hi folks,
>
> I have a simple table that keeps track of a user's access history.
> It has a few fields, but the important ones are:
>  - ownerId: the user's ID, an int8
>  - accessTS: the timestamp of the record
>
> The table right now is small, only 1942 records.
> The user I test with (10015) has only 89 entries.
>
> What I want is to get the last 5 accesses of a user:
>
>    SELECT * FROM triphistory WHERE ownerId = 10015 ORDER BY accessTS DESC LIMIT 5
>
> If I create a composite index *and* analyze:
>
>    create index IDX_TRIP_HISTORY_OWNER_ACCESS_TS on tripHistory (ownerId, accessTS);
>    ANALYZE triphistory;
>
> It takes 0.091s (!):
>
> perpedes_db=# EXPLAIN ANALYZE SELECT * FROM triphistory WHERE ownerId = 10015 ORDER BY accessTS DESC LIMIT 5;
>                                    QUERY PLAN
> --------------------------------------------------------------------------
>  Limit  (cost=0.00..7.99 rows=5 width=106) (actual time=0.024..0.042 rows=5 loops=1)
>    ->  Index Scan Backward using idx_trip_history_owner_access_ts on triphistory  (cost=0.00..142.20 rows=89 width=106) (actual time=0.021..0.034 rows=5 loops=1)
>          Index Cond: (ownerid = 10015)
>  Total runtime: 0.091 ms
> (4 rows)
>
> BTW, this is after several runs of the query, shouldn't all this stuff be in memory?

Are you saying it's excessive that you need the compound index? Cause
that's running in 91 microseconds, as pointed out by Alan.
On Mon, 2008-11-17 at 09:53 -0700, Scott Marlowe wrote:
> Are you saying it's excessive that you need the compound index? Cause
> that's running in 91 microseconds, as pointed out by Alan.

Of course, my bad. I read that as 91ms (<blush/>).

Confusion came from the fact that pgAdmin III reports the query
taking 20-40ms, so I read the 0.091 as seconds, not ms.

-- 
Dimi Paun <dimi@lattica.com>
Lattica, Inc.
On Mon, Nov 17, 2008 at 10:07 AM, Dimi Paun <dimi@lattica.com> wrote:
> On Mon, 2008-11-17 at 09:53 -0700, Scott Marlowe wrote:
>> Are you saying it's excessive that you need the compound index? Cause
>> that's running in 91 microseconds, as pointed out by Alan.
>
> Of course, my bad. I read that as 91ms (<blush/>).
>
> Confusion came from the fact that pgAdmin III reports the query
> taking 20-40ms, so I read the 0.091 as seconds, not ms.

Ahhh. Keep in mind that if you just run the query, pgAdmin III will
tell you how long it took to run AND return all the data across the
network, so it will definitely take longer then. But most of that's
network I/O wait, so it's not a real issue unless you're saturating
your network.
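The two numbers can be seen side by side in psql: EXPLAIN ANALYZE's
"Total runtime" is measured inside the server, while \timing reports the
full round trip as observed by the client, including network and
client-side overhead. A sketch, with illustrative timings borrowed from
figures quoted elsewhere in this thread:

   perpedes_db=# \timing
   Timing is on.
   perpedes_db=# EXPLAIN ANALYZE SELECT * FROM triphistory
   perpedes_db-#   WHERE ownerId = 10015 ORDER BY accessTS DESC LIMIT 5;
    ...
    Total runtime: 0.091 ms    <-- executor time, measured on the server
   (4 rows)

   Time: 0.760 ms              <-- round trip as seen by the client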
On Mon, 2008-11-17 at 10:16 -0700, Scott Marlowe wrote:
> Ahhh. Keep in mind that if you just run the query, pgAdmin III will
> tell you how long it took to run AND return all the data across the
> network, so it will definitely take longer then. But most of that's
> network I/O wait, so it's not a real issue unless you're saturating
> your network.

But that is brutal -- there's no way it can take 20ms for a request
across an unloaded network.

Moreover, I got something like this:

            pgAdmin III | psql
 w/o index:    45ms       0.620ms
 w/ index:     20ms       0.091ms

Now when I try to replicate, I get 45ms in both cases. This is
very misleading...

-- 
Dimi Paun <dimi@lattica.com>
Lattica, Inc.
On Mon, Nov 17, 2008 at 10:31 AM, Dimi Paun <dimi@lattica.com> wrote:
> On Mon, 2008-11-17 at 10:16 -0700, Scott Marlowe wrote:
>> Ahhh. Keep in mind that if you just run the query, pgAdmin III will
>> tell you how long it took to run AND return all the data across the
>> network, so it will definitely take longer then. But most of that's
>> network I/O wait, so it's not a real issue unless you're saturating
>> your network.
>
> But that is brutal -- there's no way it can take 20ms for a request
> across an unloaded network.
>
> Moreover, I got something like this:
>
>             pgAdmin III | psql
>  w/o index:    45ms       0.620ms
>  w/ index:     20ms       0.091ms
>
> Now when I try to replicate, I get 45ms in both cases. This is
> very misleading...

I'm guessing a fair bit of that time is pgAdmin III prettifying the
output for you, etc. I.e. it's not all transfer time. Hard to say
without hooking some kind of profiler into pgAdmin III. Is psql running
locally and pgAdmin III remotely? Or are they both remote? If both psql
and pgAdmin III are remote (i.e. same basic circumstances) then it's
got to be a difference in the client causing the extra time. Or is
this the output of EXPLAIN ANALYZE?
On Nov 17, 2008, at 12:40 PM, Scott Marlowe wrote:
> On Mon, Nov 17, 2008 at 10:31 AM, Dimi Paun <dimi@lattica.com> wrote:
>> On Mon, 2008-11-17 at 10:16 -0700, Scott Marlowe wrote:
>>> Ahhh. Keep in mind that if you just run the query, pgAdmin III will
>>> tell you how long it took to run AND return all the data across the
>>> network, so it will definitely take longer then. But most of that's
>>> network I/O wait, so it's not a real issue unless you're saturating
>>> your network.
>>
>> But that is brutal -- there's no way it can take 20ms for a request
>> across an unloaded network.
>>
>> Moreover, I got something like this:
>>
>>             pgAdmin III | psql
>>  w/o index:    45ms       0.620ms
>>  w/ index:     20ms       0.091ms
>>
>> Now when I try to replicate, I get 45ms in both cases. This is
>> very misleading...
>
> I'm guessing a fair bit of that time is pgAdmin III prettifying the
> output for you, etc. I.e. it's not all transfer time. Hard to say
> without hooking some kind of profiler into pgAdmin III. Is psql running
> locally and pgAdmin III remotely? Or are they both remote? If both psql
> and pgAdmin III are remote (i.e. same basic circumstances) then it's
> got to be a difference in the client causing the extra time. Or is
> this the output of EXPLAIN ANALYZE?

Side note: I have never seen pgAdmin III show a time below 20ms (the
time in the bottom right corner). When I do a query like this:

   select 1;

it takes around 20ms according to pgAdmin III (whatever that time is).

What I normally do to find my real query time is put an EXPLAIN ANALYZE
in front of my query to get the real query time.

Ries
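Applied to that example, the technique looks like this. The plan shape is
what PostgreSQL 8.x produces for a constant select; the timings shown are
illustrative, not measured:

   perpedes_db=# EXPLAIN ANALYZE SELECT 1;
                                        QUERY PLAN
   -------------------------------------------------------------------------------------
    Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.002..0.003 rows=1 loops=1)
    Total runtime: 0.020 ms
   (2 rows)

Even when pgAdmin III's status bar shows ~20ms for this statement, the
server-side "Total runtime" stays in the microsecond range, which points
at client overhead rather than the server.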
On Mon, 2008-11-17 at 10:40 -0700, Scott Marlowe wrote:
> I'm guessing a fair bit of that time is pgAdmin III prettifying the
> output for you, etc. I.e. it's not all transfer time. Hard to say
> without hooking some kind of profiler into pgAdmin III. Is psql running
> locally and pgAdmin III remotely? Or are they both remote? If both psql
> and pgAdmin III are remote (i.e. same basic circumstances) then it's
> got to be a difference in the client causing the extra time. Or is
> this the output of EXPLAIN ANALYZE?

With \timing on I get basically the same output (local vs remote)
in psql (0.668ms vs. 0.760ms). More like it.

WTH is pgAdmin III reporting?!?

-- 
Dimi Paun <dimi@lattica.com>
Lattica, Inc.
On Mon, Nov 17, 2008 at 6:14 PM, Dimi Paun <dimi@lattica.com> wrote:
> On Mon, 2008-11-17 at 10:40 -0700, Scott Marlowe wrote:
>> I'm guessing a fair bit of that time is pgAdmin III prettifying the
>> output for you, etc. I.e. it's not all transfer time. Hard to say
>> without hooking some kind of profiler into pgAdmin III. Is psql running
>> locally and pgAdmin III remotely? Or are they both remote? If both psql
>> and pgAdmin III are remote (i.e. same basic circumstances) then it's
>> got to be a difference in the client causing the extra time. Or is
>> this the output of EXPLAIN ANALYZE?
>
> With \timing on I get basically the same output (local vs remote)
> in psql (0.668ms vs. 0.760ms). More like it.
>
> WTH is pgAdmin III reporting?!?

Exactly what it's supposed to be. However, it's using libpq's
asynchronous query interface and has to pass the query result through
the wxWidgets event handling system, both of which seem to add a few
milliseconds to the overall query time, from the quick testing I've
just done.

In a GUI app like pgAdmin, we need to use this kind of architecture to
allow the UI to continue processing events (such as button clicks,
redraws etc), and to allow multiple windows to work independently
without one query locking up the whole app.

Note that the rendering time that Tom mentioned the other day, which
used to confuse things, has not been an issue for a couple of years -
that was dependent on resultset size and could lead to much bigger
variations. That was fixed by having libpq act as a virtual data store
for the UI instead of transferring data from the PGresult to the data
grid's own data store.

I think the bottom line is that you cannot compare psql and pgAdmin's
timings because the architectures of the two apps are very different.
Further, pgAdmin isn't the best choice for micro-optimisation of
extremely fast queries.

-- 
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com