
From: Thom Brown
Subject: Re: [WIP] Effective storage of duplicates in B-tree index.
Msg-id: CAA-aLv6CD8VZYo40unBewD+8XfQ-9jT0KcPYjQObBiaUsKicgA@mail.gmail.com
In response to: Re: [WIP] Effective storage of duplicates in B-tree index. (Anastasia Lubennikova <a.lubennikova@postgrespro.ru>)
Responses: Re: [WIP] Effective storage of duplicates in B-tree index.
List: pgsql-hackers
On 2 February 2016 at 11:47, Anastasia Lubennikova
<a.lubennikova@postgrespro.ru> wrote:
>
>
> 29.01.2016 20:43, Thom Brown:
>
>> On 29 January 2016 at 16:50, Anastasia Lubennikova
>> <a.lubennikova@postgrespro.ru>  wrote:
>>>
>>> 29.01.2016 19:01, Thom Brown:
>>>>
>>>> On 29 January 2016 at 15:47, Aleksander Alekseev
>>>> <a.alekseev@postgrespro.ru>  wrote:
>>>>>
>>>>> I tested this patch on x64 and ARM servers for a few hours today. The
>>>>> only problem I could find is that INSERT works considerably slower after
>>>>> applying the patch. Besides that, everything looks fine - no crashes, tests
>>>>> pass, memory doesn't seem to leak, etc.
>>>
>>> Thank you for testing. I rechecked that, and insertions are really very
>>> very
>>> very slow. It seems like a bug.
>>>
>>>>>> Okay, now for some badness.  I've restored a database containing 2
>>>>>> tables, one 318MB, another 24kB.  The 318MB table contains 5 million
>>>>>> rows with a sequential id column.  I get a problem if I try to delete
>>>>>> many rows from it:
>>>>>> # delete from contacts where id % 3 != 0 ;
>>>>>> WARNING:  out of shared memory
>>>>>> WARNING:  out of shared memory
>>>>>> WARNING:  out of shared memory
>>>>>
>>>>> I didn't manage to reproduce this. Thom, could you describe the exact
>>>>> steps to reproduce this issue, please?
>>>>
>>>> Sure, I used my pg_rep_test tool to create a primary (pg_rep_test
>>>> -r0), which creates an instance with a custom config, as follows:
>>>>
>>>> shared_buffers = 8MB
>>>> max_connections = 7
>>>> wal_level = 'hot_standby'
>>>> cluster_name = 'primary'
>>>> max_wal_senders = 3
>>>> wal_keep_segments = 6
>>>>
>>>> Then create a pgbench data set (I didn't originally use pgbench, but
>>>> you can get the same results with it):
>>>>
>>>> createdb -p 5530 pgbench
>>>> pgbench -p 5530 -i -s 100 pgbench
>>>>
>>>> And delete some stuff:
>>>>
>>>> thom@swift:~/Development/test$ psql -p 5530 pgbench
>>>> Timing is on.
>>>> psql (9.6devel)
>>>> Type "help" for help.
>>>>
>>>>
>>>>    ➤ psql://thom@[local]:5530/pgbench
>>>>
>>>> # DELETE FROM pgbench_accounts WHERE aid % 3 != 0;
>>>> WARNING:  out of shared memory
>>>> WARNING:  out of shared memory
>>>> WARNING:  out of shared memory
>>>> WARNING:  out of shared memory
>>>> WARNING:  out of shared memory
>>>> WARNING:  out of shared memory
>>>> WARNING:  out of shared memory
>>>> ...
>>>> WARNING:  out of shared memory
>>>> WARNING:  out of shared memory
>>>> DELETE 6666667
>>>> Time: 22218.804 ms
>>>>
>>>> There were 358 lines of that warning message.  I don't get these
>>>> messages without the patch.
>>>>
>>>> Thom
>>>
>>> Thank you for this report.
>>> I tried to reproduce it, but I couldn't. Debugging will be much easier now.
>>>
>>> I hope I'll fix these issues within the next few days.
>>>
>>> BTW, I found a silly mistake: the previous patch contained some unrelated
>>> changes. I fixed it in the new version (attached).
>>
>> Thanks.  Well I've tested this latest patch, and the warnings are no
>> longer generated.  However, the index sizes show that the patch
>> doesn't seem to be doing its job, so I'm wondering if you removed too
>> much from it.
>
>
> Huh, this patch seems to be enchanted :) It works fine for me. Did you perform
> "make distclean"?

Yes.  Just tried it again:

git clean -fd
git stash
make distclean
patch -p1 < ~/Downloads/btree_compression_2.0.patch
../dopg.sh (script I've always used to build with)
pg_ctl start
createdb pgbench
pgbench -i -s 100 pgbench

$ psql pgbench
Timing is on.
psql (9.6devel)
Type "help" for help.

➤ psql://thom@[local]:5488/pgbench

# \di+
                                  List of relations
 Schema |         Name          | Type  | Owner |      Table       |  Size  | Description
--------+-----------------------+-------+-------+------------------+--------+-------------
 public | pgbench_accounts_pkey | index | thom  | pgbench_accounts | 214 MB |
 public | pgbench_branches_pkey | index | thom  | pgbench_branches | 24 kB  |
 public | pgbench_tellers_pkey  | index | thom  | pgbench_tellers  | 48 kB  |
(3 rows)

Previously, this would show an index size of 87MB for pgbench_accounts_pkey.
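For reference, the size of that specific index can also be queried directly, without listing all relations; a minimal sketch using PostgreSQL's built-in size functions (index name taken from the \di+ output above):

-- Report the on-disk size of the pgbench_accounts primary key index
SELECT pg_size_pretty(pg_relation_size('pgbench_accounts_pkey'));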

> Anyway, I'll send a new version soon.
> I'm just writing to say that I haven't disappeared and I do remember about the
> issue.
> I even almost fixed the insert speed problem. But I'm very very busy this
> week.
> I'll send an updated patch next week as soon as possible.

Thanks.

> Thank you for your attention to this work.

Thanks for your awesome patches.

Thom


