Re: [HACKERS] COPY FREEZE and PD_ALL_VISIBLE - Mailing list pgsql-hackers

From Darafei "Komяpa" Praliaskouski
Subject Re: [HACKERS] COPY FREEZE and PD_ALL_VISIBLE
Date
Msg-id CAC8Q8tJhCO7-2A+Gtv8ymtfMpOyrkyqyEuv=FPuv5xNRavxmLQ@mail.gmail.com
Whole thread Raw
In response to Re: COPY FREEZE and PD_ALL_VISIBLE  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: [HACKERS] COPY FREEZE and PD_ALL_VISIBLE
List pgsql-hackers
Hello,

Today I bumped into need to limit first VACUUM time on data import.
I'm using utility called osmium together with COPY FREEZE to import openstreetmap data into database.

osmium export -c osmium.config -f pg belarus-latest.osm.pbf  -v --progress | psql -1 -c 'create table byosm(geom geometry, osm_type text, osm_id bigint, tags jsonb);copy byosm from stdin freeze;'

However, first pass of VACUUM rewrites the whole table. Here is two logs of VACUUM VERBOSE in a row:


In Russian Postgres Telegram group I've been recommended this thread. 
Can the patch be revived? What is needed to get it up for 12?

On Sun, Aug 14, 2016 at 10:37 PM Jeff Janes <jeff.janes@gmail.com> wrote:
On Tue, Nov 3, 2015 at 6:37 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 3 November 2015 at 15:23, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> On Fri, Oct 23, 2015 at 6:29 AM, Simon Riggs <simon@2ndquadrant.com>
>> wrote:
>>>
>>> On 21 October 2015 at 13:31, Jeff Janes <jeff.janes@gmail.com> wrote:
>>>
>>>> Index-only scans will visit the heap for each tuple until the first
>>>> VACUUM is done.
>>>>
>>>> The first vacuum will read the entire table, but not need to write it
>>>> anymore.  And will create the _vm file.
>>>>
>>>> I think we really want to create _vm file as well as set PD_ALL_VISIBLE,
>>>> but I don't know the best way to do that.  Set a flag somewhere and then
>>>> create it in bulk at the end of the transaction?  Set it bit by bit as the
>>>> pages are extended and initialized?
>>>
>>>
>>> Easy enough to do it at the end of the COPY FREEZE in one step.
>>
>>
>> Here, we might want to consider that setting bit in visibility map
>> will generate WAL log whereas Copy Freeze otherwise skip WAL
>> when wal_level is less than archive.  This can lead to extra disk
>> writes which can slow down Copy Freeze, but OTOH that might
>> be acceptable.
>
>
> I'm building the map as I go, in the latest version of this patch I'm
> working on.

Hi Simon,

Is this still on your radar?  If you would like someone else to pick
it up, can you post the WIP patch you have?

Thanks,

Jeff


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


--
Darafei Praliaskouski

pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: insensitive collations
Next
From: Tom Lane
Date:
Subject: Re: current_logfiles not following group access and instead follows log_file_mode permissions