Re: pglz compression performance, take two - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: pglz compression performance, take two
Date
Msg-id e04e4097-4cac-445b-f7bf-53edf305eccc@enterprisedb.com
Whole thread Raw
In response to Re: pglz compression performance, take two  (Andres Freund <andres@anarazel.de>)
Responses Re: pglz compression performance, take two
List pgsql-hackers

On 2/7/23 21:18, Andres Freund wrote:
> Hi,
> 
> On 2023-02-05 10:36:39 -0800, Andrey Borodin wrote:
>> On Fri, Jan 6, 2023 at 10:02 PM Andrey Borodin <amborodin86@gmail.com> wrote:
>>>
>>> Hello! Please find attached v8.
>>
>> I got some interesting feedback from some patch users.
>> There was an oversight that frequently yielded results that are 1,2 or
>> 3 bytes longer than expected.
>> Looking closer I found that the correctness of the last 3-byte tail is
>> checked in two places. PFA fix for this. Previously compressed data
>> was correct, however in some cases few bytes longer than the result of
>> current pglz implementation.
> 
> This version fails on cfbot, due to address sanitizer:
> 
> https://cirrus-ci.com/task/4921632586727424
> https://api.cirrus-ci.com/v1/artifact/task/4921632586727424/log/src/test/regress/log/initdb.log
> 
> 
> performing post-bootstrap initialization ... =================================================================
> ==15991==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x61e000002ee0 at pc 0x558e1b847b16 bp
0x7ffd35782f70sp 0x7ffd35782f68
 
> READ of size 1 at 0x61e000002ee0 thread T0
>     #0 0x558e1b847b15 in pglz_hist_add /tmp/cirrus-ci-build/src/common/pg_lzcompress.c:310
>     #1 0x558e1b847b15 in pglz_compress /tmp/cirrus-ci-build/src/common/pg_lzcompress.c:680
>     #2 0x558e1aa86ef0 in pglz_compress_datum /tmp/cirrus-ci-build/src/backend/access/common/toast_compression.c:65
>     #3 0x558e1aa87af2 in toast_compress_datum /tmp/cirrus-ci-build/src/backend/access/common/toast_internals.c:68
>     #4 0x558e1ac22989 in toast_tuple_try_compression
/tmp/cirrus-ci-build/src/backend/access/table/toast_helper.c:234
>     #5 0x558e1ab6af24 in heap_toast_insert_or_update /tmp/cirrus-ci-build/src/backend/access/heap/heaptoast.c:197
>     #6 0x558e1ab4a2a6 in heap_update /tmp/cirrus-ci-build/src/backend/access/heap/heapam.c:3533
> ...
> 

Yeah, and valgrind seems to hit the same issue (it's not labeled as
buffer overflow, but it seems to be exactly the same place):

==380682== Invalid read of size 1
==380682==    at 0xBCEAAB: pglz_hist_add (pg_lzcompress.c:310)
==380682==    by 0xBCF130: pglz_compress (pg_lzcompress.c:670)
==380682==    by 0x4A911F: pglz_compress_datum (toast_compression.c:65)
==380682==    by 0x4A97E2: toast_compress_datum (toast_internals.c:68)
==380682==    by 0x54CCA4: toast_tuple_try_compression (toast_helper.c:234)
==380682==    by 0x4FFC33: heap_toast_insert_or_update (heaptoast.c:197)
==380682==    by 0x4ED498: heap_update (heapam.c:3624)
==380682==    by 0x4EE023: simple_heap_update (heapam.c:4060)
==380682==    by 0x5B1B2B: CatalogTupleUpdateWithInfo (indexing.c:329)
==380682==    by 0x65C3AB: update_attstats (analyze.c:1741)
==380682==    by 0x65A054: do_analyze_rel (analyze.c:602)
==380682==    by 0x659405: analyze_rel (analyze.c:261)
==380682==    by 0x70A162: vacuum (vacuum.c:523)
==380682==    by 0x8DF8F7: autovacuum_do_vac_analyze (autovacuum.c:3155)
==380682==    by 0x8DE74A: do_autovacuum (autovacuum.c:2473)
==380682==    by 0x8DD49E: AutoVacWorkerMain (autovacuum.c:1716)
==380682==    by 0x8DD097: StartAutoVacWorker (autovacuum.c:1494)
==380682==    by 0x8EA5B2: StartAutovacuumWorker (postmaster.c:5481)
==380682==    by 0x8EA10A: process_pm_pmsignal (postmaster.c:5192)
==380682==    by 0x8E6121: ServerLoop (postmaster.c:1770)
==380682==  Address 0xe722c78 is 103,368 bytes inside a recently
re-allocated block of size 131,072 alloc'd
==380682==    at 0x48457AB: malloc (vg_replace_malloc.c:393)
==380682==    by 0xB95423: AllocSetAlloc (aset.c:929)
==380682==    by 0xBA2B6C: palloc (mcxt.c:1224)
==380682==    by 0x4A0962: heap_copytuple (heaptuple.c:687)
==380682==    by 0x73A2BB: tts_buffer_heap_copy_heap_tuple
(execTuples.c:842)
==380682==    by 0x658E42: ExecCopySlotHeapTuple (tuptable.h:464)
==380682==    by 0x65B288: acquire_sample_rows (analyze.c:1261)
==380682==    by 0x659E42: do_analyze_rel (analyze.c:536)
==380682==    by 0x659405: analyze_rel (analyze.c:261)
==380682==    by 0x70A162: vacuum (vacuum.c:523)
==380682==    by 0x8DF8F7: autovacuum_do_vac_analyze (autovacuum.c:3155)
==380682==    by 0x8DE74A: do_autovacuum (autovacuum.c:2473)
==380682==    by 0x8DD49E: AutoVacWorkerMain (autovacuum.c:1716)
==380682==    by 0x8DD097: StartAutoVacWorker (autovacuum.c:1494)
==380682==    by 0x8EA5B2: StartAutovacuumWorker (postmaster.c:5481)
==380682==    by 0x8EA10A: process_pm_pmsignal (postmaster.c:5192)
==380682==    by 0x8E6121: ServerLoop (postmaster.c:1770)
==380682==    by 0x8E5B54: PostmasterMain (postmaster.c:1463)
==380682==    by 0x7A806C: main (main.c:200)
==380682==

The place allocating the buffer changes over time, but the first part
(invalid read) seems to be exactly the same.

FWIW I did run previous versions using valgrind, so this gotta be due
some recent change.

> 
> Independent of this failure, I'm worried about the cost/benefit analysis of a
> pglz change that changes this much at once. It's quite hard to review.
> 

I agree.

I think I managed to understand what the patch does during the review,
but it's so much harder - it'd definitely be better to have this split
into smaller parts, somehow. Interestingly enough the commit message
actually says this:

  This patch accumulates several changes to pglz compression:
  1. Convert macro-functions to regular functions for readability
  2. Use more compact hash table with uint16 indexes instead of pointers
  3. Avoid prev pointer in hash table
  4. Use 4-byte comparisons during a search instead of 1-byte
     comparisons

Which I think is a pretty good recipe how to split the patch. (And we
also need a better commit message, or at least a proposal.)

This'd probably also help when investigating the extra byte issue,
discussed yesterday. (Assuming it's not related to the invalid access
reported by valgrind / asan).


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: "Hayato Kuroda (Fujitsu)"
Date:
Subject: RE: Exit walsender before confirming remote flush in logical replication
Next
From: Laurenz Albe
Date:
Subject: Re: Why cann't simplify stable function in planning phase?