Re: [PERFORM] Efficiently merging and sorting collections of sorted rows - Mailing list pgsql-performance

From Merlin Moncure
Subject Re: [PERFORM] Efficiently merging and sorting collections of sorted rows
Date
Msg-id CAHyXU0ydXL0WxgJ_PmNFor=LExQ7KtN99Xv3Sk77_fY0d9stXg@mail.gmail.com
Whole thread Raw
In response to Re: [PERFORM] Efficiently merging and sorting collections of sorted rows  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
On Fri, Jun 23, 2017 at 6:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Clint Miller <clint.miller1@gmail.com> writes:
>> That's a good plan because it's not doing a quick sort. Instead, it's just
>> reading the sort order off of the index, which is exactly what I want. (I
>> had to disable enable_sort because I didn't have enough rows of test data
>> in the table to get Postgres to use the index. But if I had enough rows,
>> the enable_sort stuff wouldn't be necessary. My real table has lots of rows
>> and doesn't need enable_sort turned off to do the sort with the index.)
>
> TBH, I think this whole argument is proceeding from false premises.
> Using an indexscan as a substitute for an explicit sort of lots of
> rows isn't all that attractive, because it implies a whole lot of
> random access to the table (unless the table is nearly in index
> order, which isn't a condition you can count on without expending
> a lot of maintenance effort to keep it that way).  seqscan-and-sort
> is often a superior alternative, especially if you're willing to give
> the sort a reasonable amount of work_mem.

Hm, if he reverses the index terms he gets his sort order for free and
a guaranteed IOS.   This would only be sensible to do only if several
conditions applied, you'd have to live under the IOS criteria
generally, the number of rows returned to what relative to what was
thrown out would have to be reasonably high (this is key), and the
result set would have to be large making the sort an expensive
consideration relative to the filtering.  You'd also have to be
uninterested in explicit filters on 's' or be willing to create
another index to do that if you were.

merlin

postgres=# \d foo
      Table "public.foo"
 Column │  Type   │ Modifiers
────────┼─────────┼───────────
 s      │ text    │
 i      │ integer │
Indexes:
    "foo_i_s_idx" btree (i, s)  -- reversed

postgres=# set enable_sort = false;
SET

postgres=# explain analyze select * from foo where s = 'a' or s = 'b'
order by i;
                                                       QUERY PLAN

─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 Index Only Scan using foo_i_s_idx on foo  (cost=0.15..68.75 rows=12
width=36) (actual time=0.004..0.004 rows=0 loops=1)
   Filter: ((s = 'a'::text) OR (s = 'b'::text))
   Heap Fetches: 0
 Planning time: 0.215 ms
 Execution time: 0.025 ms







merlin


pgsql-performance by date:

Previous
From: Brad DeJong
Date:
Subject: Re: [PERFORM]
Next
From: Ulf Lohbrügge
Date:
Subject: Re: [PERFORM] Performance of information_schema with many schemataand tables