Re: Why is indexonlyscan so darned slow? - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: Why is indexonlyscan so darned slow? |
Date | |
Msg-id | 7600.1337541890@sss.pgh.pa.us Whole thread Raw |
In response to | Re: Why is indexonlyscan so darned slow? (Jeff Janes <jeff.janes@gmail.com>) |
Responses |
Re: Why is indexonlyscan so darned slow?
Re: Why is indexonlyscan so darned slow? |
List | pgsql-hackers |
Jeff Janes <jeff.janes@gmail.com> writes: > On Thu, May 17, 2012 at 11:35 AM, Joshua Berkus <josh@agliodbs.com> wrote: >> That's in-RAM speed ... I ran the query twice to make sure the index was cached, and it didn't get any better. And Imeant 5X per byte rather than 5X per tuple. > Ah, OK that makes more sense. I played around with this, specifically > count(*), quite a bit when IOS first came out, and I attributed a > large part of the time to the code that forms a tuple out of raw > bytes, and the code that advances the aggregate. The first one is > probably more a per-tuple cost than per byte, and the second > definitely is per tuple cost. > I can't find my detailed notes from this work, so this is just from memory. I did a quick investigation of this example with oprofile, and found that there's not going to be any easy large improvement available. It's basically all per-tuple CPU costs, breaking down like this: samples % symbol name 15513 13.4664 IndexOnlyNext 10886 9.4498 index_getnext_tid 7526 6.5331 visibilitymap_test 7116 6.1772 ExecClearTuple 7054 6.1234 _bt_checkkeys 6804 5.9064 _bt_next 6344 5.5070 ExecProject 6033 5.2371 advance_aggregates 5619 4.8777 ExecScan 5331 4.6277 advance_transition_function 5202 4.5157 btgettuple 4928 4.2779 _bt_saveitem 4653 4.0391 ExecProcNode 4473 3.8829 int8inc 3404 2.9549 MemoryContextReset 3125 2.7127 _bt_readpage 2768 2.4028 FunctionCall2Coll 2278 1.9775 ExecAgg 1502 1.3038 ExecStoreVirtualTuple 1198 1.0399 BufferGetBlockNumber 1105 0.9592 ExecIndexOnlyScan 946 0.8212 hash_search_with_hash_value A fair chunk of what's being attributed to IndexOnlyNext is actually the inlined code for StoreIndexTuple, and that in turn is mostly the inlined code for index_getattr. We might possibly save a bit here if we noticed that the query doesn't actually need us to fetch the indexed columns, but it looks like that would only buy a couple percent overall --- and testing for that would add useless cycles to every case where we *do* need the indexed value. So I'm afraid that it might amount to optimizing SELECT COUNT(*) at the expense of everything else, which I'm not for. Another possibility is to try to reduce the costs of index_getnext_tid and FunctionCall2Coll, which are basically just trampolines to reach btgettuple. It's not immediately obvious how to make that much better though. Anyway, on my machine it seems that the per-tuple CPU costs for SELECT COUNT(*) with an index-only scan are something like 10% higher than the per-tuple costs with a heap scan. We might get that down to roughly par with some hacking, but it's never going to be vastly better. The argument in favor of index-only scans is mainly about reducing I/O costs anyway. regards, tom lane
pgsql-hackers by date: