Re: Making all nbtree entries unique by having heap TIDs participate in comparisons - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Making all nbtree entries unique by having heap TIDs participate in comparisons
Date
Msg-id CAH2-Wz=03oLFYfKBFJtER2wsySXtKMaB8MHKCsUNBd8C5mD0qg@mail.gmail.com
In response to Re: Making all nbtree entries unique by having heap TIDs participate in comparisons  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Thu, Oct 18, 2018 at 1:44 PM Andres Freund <andres@anarazel.de> wrote:
> What kind of backend_flush_after values were you trying?
> backend_flush_after=0 obviously is the default, so I'm not clear on
> that.   How large is the database here, and how high is shared_buffers

I *was* trying backend_flush_after=512kB, but it's
backend_flush_after=0 in the benchmark I posted. See the
"postgres*settings" files.

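For anyone following along, the by-hand equivalent of flipping that
setting between runs looks like this (just a sketch -- the benchmark
itself applies it through the postgres*settings files):

SHOW backend_flush_after;                        -- 0 in the runs I posted
ALTER SYSTEM SET backend_flush_after = '512kB';  -- value used in earlier runs
SELECT pg_reload_conf();                         -- pick up the new value
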
On the master branch, things looked like this after the last run:

pg@tpcc_oltpbench[15547]=# \dt+
                      List of relations
 Schema │    Name    │ Type  │ Owner │   Size   │ Description
────────┼────────────┼───────┼───────┼──────────┼─────────────
 public │ customer   │ table │ pg    │ 4757 MB  │
 public │ district   │ table │ pg    │ 5240 kB  │
 public │ history    │ table │ pg    │ 1442 MB  │
 public │ item       │ table │ pg    │ 10192 kB │
 public │ new_order  │ table │ pg    │ 140 MB   │
 public │ oorder     │ table │ pg    │ 1185 MB  │
 public │ order_line │ table │ pg    │ 19 GB    │
 public │ stock      │ table │ pg    │ 9008 MB  │
 public │ warehouse  │ table │ pg    │ 4216 kB  │
(9 rows)

pg@tpcc_oltpbench[15547]=# \di+
                                         List of relations
 Schema │                 Name                 │ Type  │ Owner │   Table    │  Size   │ Description
────────┼──────────────────────────────────────┼───────┼───────┼────────────┼─────────┼─────────────
 public │ customer_pkey                        │ index │ pg    │ customer   │ 367 MB  │
 public │ district_pkey                        │ index │ pg    │ district   │ 600 kB  │
 public │ idx_customer_name                    │ index │ pg    │ customer   │ 564 MB  │
 public │ idx_order                            │ index │ pg    │ oorder     │ 715 MB  │
 public │ item_pkey                            │ index │ pg    │ item       │ 2208 kB │
 public │ new_order_pkey                       │ index │ pg    │ new_order  │ 188 MB  │
 public │ oorder_o_w_id_o_d_id_o_c_id_o_id_key │ index │ pg    │ oorder     │ 715 MB  │
 public │ oorder_pkey                          │ index │ pg    │ oorder     │ 958 MB  │
 public │ order_line_pkey                      │ index │ pg    │ order_line │ 9624 MB │
 public │ stock_pkey                           │ index │ pg    │ stock      │ 904 MB  │
 public │ warehouse_pkey                       │ index │ pg    │ warehouse  │ 56 kB   │
(11 rows)

> Is it possible that there's new / prolonged cases where a buffer is read
> from disk after the patch? Because that might require doing *write* IO
> when evicting the previous contents of the victim buffer, and obviously
> that can take longer if you're running with backend_flush_after > 0.

Yes, I suppose that's possible, because buffer popularity/usage_count
will be affected in ways that cannot easily be predicted. However, I'm
not running with "backend_flush_after > 0" here -- that was only in the
earlier runs.

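One crude way to check that from the stats side is pg_stat_bgwriter (a
sketch -- buffers_backend counts buffers written out directly by
backends, which is where any extra eviction write I/O would show up):

SELECT buffers_backend, buffers_backend_fsync,
       buffers_clean, buffers_checkpoint
FROM pg_stat_bgwriter;
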
> I wonder if it'd make sense to hack up a patch that logs when evicting a
> buffer while already holding another lwlock. That shouldn't be too hard.

I'll look into this.

Thanks
-- 
Peter Geoghegan
