Re: Why could different data in a table be processed with different performance? - Mailing list pgsql-performance

From:           Fabio Pardi
Subject:        Re: Why could different data in a table be processed with different performance?
Date:
Msg-id:         e8d93055-9643-0ad2-d8bd-3f4fc344bf51@portavita.eu
In response to: Re: Why could different data in a table be processed with different performance? (Vladimir Ryabtsev <greatvovan@gmail.com>)
Responses:      Re: Why could different data in a table be processed with different performance?
List:           pgsql-performance
Hi,
answers (and questions) in line here below
On 22/09/18 11:19, Vladimir Ryabtsev wrote:
> agree, should not play a role here

>> is the length of the text equally distributed over the 2 partitions?

> Not 100% equally, but to me it does not seem to be a big deal... Considering the ranges independently:
>
> First range: ~70% < 10 KB, ~25% for 10-20 KB, ~3% for 20-30 KB, everything else is less than 1% (with 10 KB steps).
> Second range: ~80% < 10 KB, ~18% for 10-20 KB, ~2% for 20-30 KB, everything else is less than 1% (with 10 KB steps).
>> From what you posted, the first query retrieves 5005 rows, but the second 2416. It might be helpful if we are able to compare 5000 vs 5000

> Yes, it was just an example; here are the plans for approximately the same number of rows:
>
> Aggregate  (cost=9210.12..9210.13 rows=1 width=16) (actual time=4265.478..4265.479 rows=1 loops=1)
>   Buffers: shared hit=27027 read=4311
>   I/O Timings: read=2738.728
>   ->  Index Scan using articles_pkey on articles  (cost=0.57..9143.40 rows=5338 width=107) (actual time=12.254..873.081 rows=5001 loops=1)
>         Index Cond: ((article_id >= 438030000) AND (article_id <= 438035000))
>         Buffers: shared hit=4282 read=710
>         I/O Timings: read=852.547
> Planning time: 0.235 ms
> Execution time: 4265.554 ms
>
> Aggregate  (cost=11794.59..11794.60 rows=1 width=16) (actual time=62298.559..62298.559 rows=1 loops=1)
>   Buffers: shared hit=15071 read=14847
>   I/O Timings: read=60703.859
>   ->  Index Scan using articles_pkey on articles  (cost=0.57..11709.13 rows=6837 width=107) (actual time=24.686..24582.221 rows=5417 loops=1)
>         Index Cond: ((article_id >= '100021040000'::bigint) AND (article_id <= '100021060000'::bigint))
>         Buffers: shared hit=195 read=5244
>         I/O Timings: read=24507.621
> Planning time: 0.494 ms
> Execution time: 62298.630 ms
>
> If we subtract I/O from total time, we get 1527 ms vs 1596 ms: very close timings for everything other than I/O (considering the slightly higher number of rows in the second case). But the I/O time differs dramatically.

>> Also it is worth noticing that the 'estimated' differs from the 'actual' on the second query. I think that happens because data is differently distributed over the ranges. Probably the analyzer does not have enough samples to understand the real distribution.

> I think we should not worry about it unless the planner chooses a poor plan, should we? Statistics affect picking a proper plan, but not the execution of the plan, don't they?
Agree, it was pure speculation
>> or to create partial indexes on the 2 ranges.

> Sure, will try it with partial indexes. Should I drop the existing PK index, or is ensuring that the planner picks the range index enough?

You cannot drop it, since it is on a PKEY. You can create 2 partial indexes and the planner will pick them up for you (and the planning time will go up a bit).
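A minimal sketch of what I mean, assuming the table and column names from your plans; the range boundaries here are only placeholders, adjust them to your two real ranges:

```sql
-- Hypothetical boundaries: substitute the two article_id ranges you query.
CREATE INDEX articles_id_low  ON articles (article_id)
  WHERE article_id BETWEEN 438000000    AND 439000000;

CREATE INDEX articles_id_high ON articles (article_id)
  WHERE article_id BETWEEN 100021000000 AND 100022000000;
```

The planner only uses a partial index when the query's WHERE clause implies the index predicate, so keep the boundaries aligned with the ranges you actually query.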
>> - does the raid controller have a cache?
>> - how big is the cache? (when you measure disk speed, that will influence the result very much if you do not run the test on a big-enough data chunk) best if it is disabled during your tests

> I am pretty sure there is some; usually it's several tens of megabytes. But I ran disk read tests several times, with chunks that could not fit in the cache and with random offsets, so I am pretty sure that the figure of around 500 MB/s is reasonably accurate (but it is only for sequential reads).
It is not unusual to have 1 GB of cache or more... and do not forget to drop the cache between tests + do a sync.
>> - is the OS caching disk blocks too? maybe you want to drop everything from there too.

> How can I find it out? And how to drop it? Or do you mean the hypervisor OS?
> Anyway, don't you think that caching specifics could not really explain these issues?

Sorry, I meant the hypervisor OS.
Given that most of the time is spent on I/O, caching may well be playing a role.
I tried to reproduce your problem, but I cannot get even close to your results. Everything goes smoothly with or without shared buffers or the OS cache.
A few questions and considerations came to mind:
- how big is your index?
- how big is the table?
- given the size of shared_buffers, almost 2M blocks should fit, but you say 2 consecutive runs are still hitting the disk. That is strange indeed, unless you are reading way more than 2M blocks.
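To answer the size questions, something along these lines should do (relation names taken from the plans earlier in the thread):

```sql
-- Table, primary-key index, and total size (the total includes indexes and TOAST).
SELECT pg_size_pretty(pg_relation_size('articles'))       AS table_size,
       pg_size_pretty(pg_relation_size('articles_pkey'))  AS index_size,
       pg_size_pretty(pg_total_relation_size('articles')) AS total_size;
```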
Did you check whether there are any other processes or cron jobs (in Postgres or on the system) that might be reading data and flushing out the cache?
You can make use of pg_buffercache in order to see what is actually cached. That might help you get an overview of its contents.
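For example, with a query close to the one in the pg_buffercache documentation, showing which relations occupy the most buffers (it assumes the default 8 kB block size, and the extension must be installed):

```sql
CREATE EXTENSION IF NOT EXISTS pg_buffercache;

-- Top 10 relations by number of shared buffers currently held.
SELECT c.relname,
       count(*)                        AS buffers,
       pg_size_pretty(count(*) * 8192) AS buffered
FROM pg_buffercache b
JOIN pg_class c
  ON b.relfilenode = pg_relation_filenode(c.oid)
 AND b.reldatabase IN (0, (SELECT oid FROM pg_database
                           WHERE datname = current_database()))
GROUP BY c.relname
ORDER BY buffers DESC
LIMIT 10;
```

Running it before and after your two queries would show whether the blocks of the slow range are actually staying in shared_buffers.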
- As Laurenz suggested (VACUUM FULL), you might want to move data around. You can also try a dump + restore to narrow the problem down to the data or the disk.
- You might also want to watch the disk graph in Windows while you are running your tests. It can show you whether data is actually being fetched from disk (and, good to know, how much).
regards,
fabio pardi