Re: SELECT * FROM huge_table LIMIT 10; Why does it take more than 10 min to complete, with cold caches - Mailing list pgsql-general

From David G. Johnston
Subject Re: SELECT * FROM huge_table LIMIT 10; Why does it take more than 10 min to complete, with cold caches
Date
Msg-id CAKFQuwaUy=P_pNGNgcoyFcNExrRSqwhVTUE-SfUrGbj08BzLRw@mail.gmail.com
Whole thread Raw
In response to Re: SELECT * FROM huge_table LIMIT 10; Why does it take more than 10 min to complete, with cold caches  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: SELECT * FROM huge_table LIMIT 10; Why does it take more than 10 min to complete, with cold caches
List pgsql-general
On Tue, Jan 31, 2023 at 8:28 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Dimitrios Apostolou <jimis@gmx.net> writes:
> The question is why this simple query is taking so long to complete.

Do you get the same 10 rows when you repeat the command?

On the basis of the limited info you provided, I'm going to guess that

(1) there are huge subranges of the table containing no live rows,
so that a seqscan might have to pass over many blocks before it finds
some to return;

It feels like there is room for improvement here using table statistics and the visibility map to significantly reduce the number of pages retrieved that turn out to be all dead.  A limit without an order is not a query to optimize for of course...


(2) once you do reach an area having live rows, the next SELECT picks
up scanning in that same area due to the effects of
"synchronize_seqscans", so you get immediate answers until you reach
the next desert of dead tuples.

Interesting, I was assuming that the behavior on repeated calls was more due to intelligent buffer pool usage.  When doing a sequential scan (I suppose the key word being sequential...) the system decides which pages to fetch before checking to see if they are already in shared buffers (as opposed to returning the ones in shared buffers first then fetching more if needed)?

David J.

pgsql-general by date:

Previous
From: Christophe Pettus
Date:
Subject: Re: SELECT * FROM huge_table LIMIT 10; Why does it take more than 10 min to complete, with cold caches
Next
From: "jacktby@gmail.com"
Date:
Subject: Re: Re: How could I elog the tupleTableSlot to the fronted terminal?