Re: [HACKERS] Remove 1MB size limit in tsvector - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: [HACKERS] Remove 1MB size limit in tsvector
Msg-id f240a088-aab8-832c-f4e5-f6fdf2624ac8@2ndquadrant.com
In response to Re: [HACKERS] Remove 1MB size limit in tsvector  (Ildus Kurbangaliev <i.kurbangaliev@postgrespro.ru>)
Responses Re: [HACKERS] Remove 1MB size limit in tsvector
List pgsql-hackers
Hi,

On 08/17/2017 12:23 PM, Ildus Kurbangaliev wrote:
> In my benchmarks, when the database fits into buffers (so it's a
> measurement of the time required for the tsvector conversion), it gives
> me these results:
> 
> Without conversion:
> 
> $ ./tsbench2 -database test1 -bench_time 300
> 2017/08/17 12:04:44 Number of connections:  4
> 2017/08/17 12:04:44 Database:  test1
> 2017/08/17 12:09:44 Processed: 51419
> 
> With conversion:
> 
> $ ./tsbench2 -database test1 -bench_time 300
> 2017/08/17 12:14:31 Number of connections:  4
> 2017/08/17 12:14:31 Database:  test1
> 2017/08/17 12:19:31 Processed: 43607
> 
> I ran a bunch of these tests, and these results are stable on my
> machine. So in these specific tests the performance regression is
> about 15%.
> 
> At the same time, I think this could be the worst case, because usually
> the data is on disk and the conversion will not affect performance as much.
> 

That seems like a fairly significant regression, TBH. I don't quite
agree we can simply assume in-memory workloads don't matter. Plenty of
databases have a 99% cache hit ratio (particularly when considering not
just shared buffers, but also the page cache).
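
As a sanity check on the quoted numbers, here is a minimal sketch of the arithmetic behind the "about 15%" figure. It assumes the two "Processed" counts are directly comparable (same 300-second run, same 4 connections); the function names are made up for illustration and are not part of tsbench2.

```python
# Processed counts from the two 300-second tsbench2 runs quoted above.
# What exactly "Processed" counts is an assumption here; only the ratio matters.
BENCH_TIME_S = 300
PROCESSED_WITHOUT_CONVERSION = 51419
PROCESSED_WITH_CONVERSION = 43607


def throughput(processed, seconds=BENCH_TIME_S):
    """Average number of processed items per second over the run."""
    return processed / seconds


def regression_pct(baseline, candidate):
    """Relative throughput drop of candidate versus baseline, in percent."""
    return (baseline - candidate) / baseline * 100.0


rate_without = throughput(PROCESSED_WITHOUT_CONVERSION)
rate_with = throughput(PROCESSED_WITH_CONVERSION)
drop = regression_pct(PROCESSED_WITHOUT_CONVERSION, PROCESSED_WITH_CONVERSION)

print(f"without conversion: {rate_without:.1f}/s")
print(f"with conversion:    {rate_with:.1f}/s")
print(f"regression:         {drop:.1f}%")
```

With the counts above this gives a drop of roughly 15%, which matches the figure in the quoted message.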

Can you share the benchmarks, so that others can retry running them?

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


