> On Apr 10, 2019, at 9:08 PM, Mark Kirkwood <mark.kirkwood@catalyst.net.nz> wrote:
>
>
>> On 11/04/19 4:01 PM, Mark Kirkwood wrote:
>>> On 9/04/19 12:27 PM, Ashwin Agrawal wrote:
>>>
>>> Heikki and I have been hacking for the past few weeks to implement
>>> in-core columnar storage for PostgreSQL. Here's the design and initial
>>> implementation of Zedstore, compressed in-core columnar storage (table
>>> access method). Attaching the patch and link to github branch [1] to
>>> follow along.
>>>
>>>
>>
>> Very nice. I realize that it is very early days, but applying this patch
>> I've managed to stumble over some compression bugs doing some COPYs:
>>
>> benchz=# COPY dim1 FROM '/data0/dump/dim1.dat'
>> USING DELIMITERS ',';
>> psql: ERROR: compression failed. what now?
>> CONTEXT: COPY dim1, line 458
>>
>> The log has:
>>
>> 2019-04-11 15:48:43.976 NZST [2006] ERROR: XX000: compression failed. what now?
>> 2019-04-11 15:48:43.976 NZST [2006] CONTEXT: COPY dim1, line 458
>> 2019-04-11 15:48:43.976 NZST [2006] LOCATION: zs_compress_finish, zedstore_compression.c:287
>> 2019-04-11 15:48:43.976 NZST [2006] STATEMENT: COPY dim1 FROM '/data0/dump/dim1.dat'
>> USING DELIMITERS ',';
>>
>> The dataset is generated from an old DW benchmark I wrote
>> (https://sourceforge.net/projects/benchw/). The row concerned looks like:
>>
>> 457,457th interesting measure,1th measure type,aqwycdevcmybxcnpwqgrdsmfelaxfpbhfxghamfezdiwfvneltvqlivstwralshsppcpchvdkdbraoxnkvexdbpyzgamajfp
>> 458,458th interesting measure,2th measure type,bjgdsciehjvkxvxjqbhtdwtcftpfewxfhfkzjsdrdabbvymlctghsblxucezydghjrgsjjjnmmqhncvpwbwodhnzmtakxhsg
>>
>>
>> I'll see if changing to LZ4 makes any difference.
>>
>>
>
> The COPY works with LZ4 configured.

Thank you for trying it out. Yes, we noticed that for certain patterns
pg_lzcompress() actually requires a much larger output buffer. For example,
one source of length 86 required an output buffer of length 2296. The current
zedstore code doesn't handle this case and errors out. LZ4 works fine for the
same patterns, so I would highly recommend using only LZ4; it is also very
fast.
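
For reference, one common way to cope with input that a compressor cannot
shrink is to cap the output at the source length and fall back to storing the
bytes raw behind a one-byte flag. The sketch below illustrates only that
general idea; it is not what zedstore does today. Every name in it
(store_chunk, compress_fn, toy_rle_compress, CHUNK_RAW, CHUNK_COMPRESSED) is
hypothetical, and the toy run-length encoder merely stands in for a real
compressor that can expand its input.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical compressor signature (an assumption for this sketch):
 * returns the compressed size, or -1 if the output does not fit in dest_cap.
 */
typedef int (*compress_fn)(const char *src, int src_len,
                           char *dest, int dest_cap);

enum { CHUNK_RAW = 0, CHUNK_COMPRESSED = 1 };

/*
 * Toy run-length encoder standing in for a real compressor. It emits
 * (count, byte) pairs, so input without repeated bytes doubles in size.
 */
static int
toy_rle_compress(const char *src, int src_len, char *dest, int dest_cap)
{
    int in = 0;
    int out = 0;

    while (in < src_len)
    {
        int run = 1;

        while (in + run < src_len && run < 255 && src[in + run] == src[in])
            run++;
        if (out + 2 > dest_cap)
            return -1;          /* output would overflow: report failure */
        dest[out++] = (char) run;
        dest[out++] = src[in];
        in += run;
    }
    return out;
}

/*
 * Store src into dest as [flag byte][payload]. The compressor is only
 * allowed src_len bytes of output; if it fails or does not save space,
 * the payload is the raw input. Returns the total bytes written.
 */
static int
store_chunk(compress_fn compress, const char *src, int src_len,
            char *dest, int dest_cap)
{
    int clen;

    /* The raw fallback needs one flag byte plus the input itself. */
    assert(src_len >= 0 && dest_cap >= src_len + 1);

    clen = compress(src, src_len, dest + 1, src_len);
    if (clen >= 0 && clen < src_len)
    {
        dest[0] = CHUNK_COMPRESSED;
        return clen + 1;
    }

    /* Compression failed or did not help: keep the bytes as they are. */
    dest[0] = CHUNK_RAW;
    memcpy(dest + 1, src, (size_t) src_len);
    return src_len + 1;
}
```

With this layout the caller never sees a "compression failed" error: a
repetitive input such as "aaaaaaaa" is stored compressed, while an input like
"abcdefgh" is stored raw, at a cost of one extra byte.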