On 8/7/23 18:56, Nathan Bossart wrote:
> On Mon, Aug 07, 2023 at 12:51:24PM +0200, Tomas Vondra wrote:
>> The bad news is this seems to have negative impact on cases with few
>> partitions, that'd fit into 16 slots. Which is not surprising, as the
>> code has to walk longer arrays, it probably affects caching etc. So this
>> would hurt the systems that don't use that many relations - not much,
>> but still.
>>
>> The regression appears to be consistently ~3%, and v2 aimed to improve
>> that - at least for the case with just 100 rows. It even gains ~5% in a
>> couple cases. It's however a bit strange v2 doesn't really help the two
>> larger cases.
>>
>> Overall, I think this seems interesting - it's hard to not like doubling
>> the throughput in some cases. Yes, it's 100 rows only, and the real
>> improvements are bound to be smaller, but it would still help short
>> OLTP queries that only process a couple of rows.
>
> Indeed. I wonder whether we could mitigate the regressions by using SIMD
> intrinsics in the loops. Or auto-vectorization, if that is possible.
>
Maybe, but from what I know about SIMD, it would require substantial
changes to the design so that the loops don't mix accesses to different
PGPROC fields (fpLockBits, fpRelId), and so on. I think it'd be better
to simply avoid walking the whole array on a regular basis.
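
To illustrate what I mean, here is a rough sketch. It is not the actual
lock.c code: SketchProc, the SK_* constants and the sk_* functions are
made-up names, the sizes (4 groups of 16 slots, 3 bits per slot) are
assumptions, and mapping a relation to a group with a simple modulo is
just one possible way to avoid the full walk. The first lookup shows the
loop touching both fields on every iteration, the second one only scans
a single group.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sizes, chosen only for this sketch. */
#define SK_SLOTS_PER_GROUP 16
#define SK_GROUPS          4
#define SK_SLOTS           (SK_SLOTS_PER_GROUP * SK_GROUPS)
#define SK_BITS_PER_SLOT   3
#define SK_MASK            UINT64_C(0x7)

/* Simplified stand-in for the fast-path part of PGPROC (an assumption). */
typedef struct SketchProc
{
    uint64_t fpLockBits[SK_GROUPS]; /* 3 lock-mode bits per slot, one word per group */
    uint32_t fpRelId[SK_SLOTS];     /* relation id stored in each slot */
} SketchProc;

/* Extract the lock-mode bits of one slot: a shift and a mask on fpLockBits. */
static inline uint64_t
sk_get_bits(const SketchProc *proc, int slot)
{
    int group = slot / SK_SLOTS_PER_GROUP;
    int idx = slot % SK_SLOTS_PER_GROUP;

    return (proc->fpLockBits[group] >> (SK_BITS_PER_SLOT * idx)) & SK_MASK;
}

/*
 * Full walk: every iteration reads fpLockBits (shift + mask) and fpRelId,
 * and the loop exits early on a match. Mixing the two fields like this is
 * what makes the loop awkward to vectorize.
 */
static int
sk_find_full(const SketchProc *proc, uint32_t relid)
{
    for (int slot = 0; slot < SK_SLOTS; slot++)
    {
        if (sk_get_bits(proc, slot) != 0 && proc->fpRelId[slot] == relid)
            return slot;
    }
    return -1;
}

/* Grouped lookup: the relation id selects one group, only 16 slots are scanned. */
static int
sk_find_grouped(const SketchProc *proc, uint32_t relid)
{
    int first = (int) (relid % SK_GROUPS) * SK_SLOTS_PER_GROUP;

    for (int slot = first; slot < first + SK_SLOTS_PER_GROUP; slot++)
    {
        if (sk_get_bits(proc, slot) != 0 && proc->fpRelId[slot] == relid)
            return slot;
    }
    return -1;
}

/* Store relid in the first free slot of its group; returns -1 if the group is full. */
static int
sk_insert_grouped(SketchProc *proc, uint32_t relid, unsigned mode_bit)
{
    int group = (int) (relid % SK_GROUPS);
    int first = group * SK_SLOTS_PER_GROUP;

    assert(mode_bit < SK_BITS_PER_SLOT);

    for (int slot = first; slot < first + SK_SLOTS_PER_GROUP; slot++)
    {
        if (sk_get_bits(proc, slot) == 0)
        {
            int idx = slot - first;

            proc->fpRelId[slot] = relid;
            proc->fpLockBits[group] |=
                UINT64_C(1) << (SK_BITS_PER_SLOT * idx + mode_bit);
            return slot;
        }
    }
    return -1;
}
```

With this layout a lookup inspects at most 16 slots no matter how many
slots exist in total, at the price of a group filling up while other
groups still have free slots. Whether that trade-off pays off would have
to be measured.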

regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company