SELECT * FROM huge_table LIMIT 10; Why does it take more than 10 min to complete, with cold caches - Mailing list pgsql-general
From:      Dimitrios Apostolou
Subject:   SELECT * FROM huge_table LIMIT 10; Why does it take more than 10 min to complete, with cold caches
Msg-id:    b847d458-5a03-718e-5467-422fc0e51004@gmx.net
List:      pgsql-general
Responses: Re: SELECT * FROM huge_table LIMIT 10; Why does it take more than 10 min to complete, with cold caches
           VACUUM vs VACUUM FULL (was: SELECT * FROM huge_table LIMIT 10; Why does it take more than 10 min to complete, with cold caches)
Hello list,

I'm running a very simple SELECT on a huge table (350M rows, 10 columns of type integer or timestamp). The server is an old Mac Mini with 4GB RAM and a slow HDD. All caches are cold, either after a reboot or after restarting postgres and using the `purge` command to flush the OS cache.

The question is why this simple query takes more than 10 minutes to complete. While it runs I see constant 30MB/s read I/O. Memory usage is low, and the system is not swapping or otherwise unhealthy. Postgres is not serving any other requests. Is postgres reading the whole table sequentially, just to return 10 rows in no particular order? I'm trying to understand what PostgreSQL is doing and the data access patterns involved. All feedback is appreciated.

# SELECT version();
 PostgreSQL 15.0 on x86_64-apple-darwin20.6.0, compiled by Apple clang version 12.0.0 (clang-1200.0.32.29), 64-bit

# EXPLAIN (ANALYZE, VERBOSE, BUFFERS, SETTINGS) SELECT * FROM test_runs_raw LIMIT 10;
 Limit  (cost=0.00..0.29 rows=10 width=42) (actual time=746005.251..746014.910 rows=10 loops=1)
   Output: run_n, test_name_n, workitem_n, started_on, duration_ms, test_result_n, test_executable_n, test_function_n, test_datatag_n
   Buffers: shared read=2334526
   I/O Timings: shared/local read=691137.029
   ->  Seq Scan on public.test_runs_raw  (cost=0.00..9250235.80 rows=317603680 width=42) (actual time=745910.672..745910.677 rows=10 loops=1)
         Output: run_n, test_name_n, workitem_n, started_on, duration_ms, test_result_n, test_executable_n, test_function_n, test_datatag_n
         Buffers: shared read=2334526
         I/O Timings: shared/local read=691137.029
 Settings: effective_cache_size = '2GB', max_parallel_workers_per_gather = '0', work_mem = '64MB'
 Planning:
   Buffers: shared hit=69 read=18
   I/O Timings: shared/local read=265.550
 Planning Time: 271.719 ms
 Execution Time: 746057.424 ms
(14 rows)

Repeating the query returns instantaneously, and EXPLAIN then shows shared hit=14 and no reads. Changing the LIMIT to 100 takes another ~3 min to respond, with constant read I/O while it runs:

# EXPLAIN (ANALYZE, VERBOSE, BUFFERS, SETTINGS) SELECT * FROM test_runs_raw LIMIT 100;
 Limit  (cost=0.00..2.91 rows=100 width=42) (actual time=0.030..184692.698 rows=100 loops=1)
   Output: run_n, test_name_n, workitem_n, started_on, duration_ms, test_result_n, test_executable_n, test_function_n, test_datatag_n
   Buffers: shared hit=14 read=772427
   I/O Timings: shared/local read=171484.072
   ->  Seq Scan on public.test_runs_raw  (cost=0.00..9250235.80 rows=317603680 width=42) (actual time=0.029..184692.664 rows=100 loops=1)
         Output: run_n, test_name_n, workitem_n, started_on, duration_ms, test_result_n, test_executable_n, test_function_n, test_datatag_n
         Buffers: shared hit=14 read=772427
         I/O Timings: shared/local read=171484.072
 Settings: effective_cache_size = '2GB', max_parallel_workers_per_gather = '0', work_mem = '64MB'
 Planning Time: 0.083 ms
 Execution Time: 184692.748 ms
(11 rows)

Rerunning this is instantaneous again.

Other postgres settings:

shared_buffers = 1024MB
effective_io_concurrency = 0
wal_compression = zstd
wal_recycle = on
min_wal_size = 1GB
max_wal_size = 10GB
checkpoint_timeout = 15min
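For perspective on the numbers above: 2334526 buffers of 8 kB each is roughly 18 GB, and at a steady 30 MB/s that alone is 10+ minutes of read I/O, which matches the observed runtime. So the timing is consistent with the seq scan physically reading that many pages before it finds 10 rows to return. A minimal sketch of the catalog queries I could run to check whether the start of the heap is mostly dead space (my own diagnostic idea, nothing I have tried yet; all standard views):

  -- physical heap size vs. the planner's page/row estimates
  SELECT pg_size_pretty(pg_relation_size('test_runs_raw'));
  SELECT relpages, reltuples FROM pg_class WHERE relname = 'test_runs_raw';

  -- dead-tuple counters and last (auto)vacuum times
  SELECT n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
    FROM pg_stat_user_tables WHERE relname = 'test_runs_raw';

A large gap between the on-disk size and what the live rows should occupy would suggest the scan is mostly churning through pages with no visible tuples.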
Bonus question: some time ago the server crashed and rebooted, and after a long WAL recovery and some hours of idle time, I ran the same query. It took even longer to respond, and besides the reads there was also a lot of writing going on. What kind of writes was postgres doing?

# EXPLAIN (ANALYZE, VERBOSE, BUFFERS, SETTINGS) SELECT * FROM test_runs_raw LIMIT 10;
 Limit  (cost=0.00..0.29 rows=10 width=42) (actual time=1284114.810..1284114.819 rows=10 loops=1)
   Output: run_n, test_name_n, workitem_n, started_on, duration_ms, test_result_n, test_executable_n, test_function_n, test_datatag_n
   Buffers: shared hit=47490 read=2287036 dirtied=1809974 written=1762484
   I/O Timings: shared/local read=1041111.264 write=107258.150
   ->  Seq Scan on public.test_runs_raw  (cost=0.00..9250235.80 rows=317603680 width=42) (actual time=1283955.130..1283955.135 rows=10 loops=1)
         Output: run_n, test_name_n, workitem_n, started_on, duration_ms, test_result_n, test_executable_n, test_function_n, test_datatag_n
         Buffers: shared hit=47490 read=2287036 dirtied=1809974 written=1762484
         I/O Timings: shared/local read=1041111.264 write=107258.150
 Settings: effective_cache_size = '2GB', max_parallel_workers_per_gather = '0', work_mem = '64MB'
 Planning Time: 0.545 ms
 Execution Time: 1284114.972 ms
(11 rows)

Relevant settings:

synchronous_commit = off
wal_recycle = on
min_wal_size = 1GB
max_wal_size = 10GB
track_io_timing = on
track_wal_io_timing = on
wal_compression = zstd

Thanks in advance,
Dimitris
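P.S. My own unverified guess on the bonus question: the first scan after recovery can dirty pages simply by setting tuple hint bits (the commit status of each tuple's transaction gets stamped on the page on first touch), and those dirtied pages then have to be written back, which would fit the dirtied=1809974 / written=1762484 counters above. A sketch of how I might test that guess, assuming a plain VACUUM performs the same one-time page rewrites up front:

  VACUUM (VERBOSE) test_runs_raw;  -- would set hint bits / rewrite the pages once

  -- if the guess is right, this should now show no dirtied/written buffers:
  EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM test_runs_raw LIMIT 10;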