performance of SELECT * much faster than SELECT with large offset - Mailing list pgsql-performance

From Marc Slemko
Subject performance of SELECT * much faster than SELECT with large offset
Date
Msg-id CAN1FPGN1ynBj3m1DMszc9MEYVj41S96OMy8-Q1cgKMGxv482SA@mail.gmail.com
Whole thread Raw
Responses Re: performance of SELECT * much faster than SELECT with large offset
List pgsql-performance
I ran into this oddity lately that goes against everything I thought I
understood and was wondering if anyone had any insight.  Version/env
details at the end.

The root of it is these query times:

marcs=# select * from ccrimes offset 5140000 limit 1;
[...data omitted...]
(1 row)
Time: 650.280 ms
marcs=# select description from ccrimes offset 5140000 limit 1;
             description
-------------------------------------
 FINANCIAL IDENTITY THEFT OVER $ 300
(1 row)

Time: 1298.672 ms

These times are all from data that is cached and are very repeatable.
Yes, I know that offset and limit without an order by isn't useful for
paging through data.

And an explain on them both... everything looks the same other than
the width and actual times:

marcs=# explain (analyze,buffers) select * from ccrimes offset 5140000 limit 1;
                                                         QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=204146.73..204146.73 rows=1 width=202) (actual
time=1067.901..1067.901 rows=1 loops=1)
   Buffers: shared hit=152743
   ->  Seq Scan on ccrimes  (cost=0.00..204146.73 rows=5139873
width=202) (actual time=0.014..810.672 rows=5140001 loops=1)
         Buffers: shared hit=152743
 Total runtime: 1067.951 ms
(5 rows)

Time: 1068.612 ms
marcs=# explain (analyze,buffers) select description from ccrimes
offset 5140000 limit 1;
                                                         QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=204146.73..204146.73 rows=1 width=17) (actual
time=1713.027..1713.027 rows=1 loops=1)
   Buffers: shared hit=152743
   ->  Seq Scan on ccrimes  (cost=0.00..204146.73 rows=5139873
width=17) (actual time=0.013..1457.521 rows=5140001 loops=1)
         Buffers: shared hit=152743
 Total runtime: 1713.053 ms
(5 rows)

Time: 1713.612 ms

When I run the query and capture a profile using "perf" and compare
the two, the thing that stands out is the slot_getsomeattrs call that
dominates the trace in the slow query but not in the faster "SELECT *"
version:

-  39.25%  postgres  postgres           [.] _start
   - _start
      - 99.47% slot_getsomeattrs
           ExecProject
           ExecScan
           ExecProcNode
           ExecLimit
           ExecProcNode
           standard_ExecutorRun
           0x7f4315f7c427
           PortalRun
           PostgresMain
           PostmasterMain
           main
           __libc_start_main
      + 0.53% ExecProject
+  18.82%  postgres  postgres           [.] HeapTupleSatisfiesMVCC
+  12.01%  postgres  postgres           [.] 0xb6353
+   9.47%  postgres  postgres           [.] ExecProject


 The table is defined as:

        Column        |              Type              | Modifiers
----------------------+--------------------------------+-----------
 s_updated_at_0       | timestamp(3) with time zone    |
 s_version_1          | bigint                         |
 s_id_2               | bigint                         |
 s_created_at_3       | timestamp(3) with time zone    |
 id                   | numeric                        |
 case_number          | text                           |
 date                 | timestamp(3) without time zone |
 block                | text                           |
 iucr                 | text                           |
 primary_type         | text                           |
 description          | text                           |
 location_description | text                           |
 arrest               | boolean                        |
 domestic             | boolean                        |
 beat                 | text                           |
 district             | text                           |
 ward                 | numeric                        |
 community_area       | text                           |
 fbi_code             | text                           |
 x_coordinate         | numeric                        |
 y_coordinate         | numeric                        |
 year                 | numeric                        |
 updated_on           | timestamp(3) without time zone |
 latitude             | numeric                        |
 longitude            | numeric                        |
 location_lat         | double precision               |
 location_long        | double precision               |


I've been testing this against Postgres 9.3.5 on Ubuntu 12.04 LTS
running with a 3.2.0 kernel, and get similar results on both raw
hardware and in Azure VMs.  This repros on boxes with no other load.

Any suggestions about what is going on or where to dig further would
be appreciated.  I can make a pgdump of the data I'm using if anyone
is interested.

Thanks.


pgsql-performance by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: Yet another abort-early plan disaster on 9.3
Next
From: Tom Lane
Date:
Subject: Re: performance of SELECT * much faster than SELECT with large offset