Thread: how to use trace_lock_oidmin config correctly
I do not see logs for locks linked to attaching and detaching partitions if I use the trace_lock_oidmin config set to an OID below the table of concern; basically, any locking on objects above the threshold OID does not get logged.

From the doc:

trace_lock_oidmin (integer)
If set, do not trace locks for tables below this OID (used to avoid output on system tables).
This parameter is only available if the LOCK_DEBUG macro was defined when PostgreSQL was compiled.

I compiled my PostgreSQL with the LOCK_DEBUG macro:

/*
postgres@ubuntu:/tmp$ pg_config
BINDIR = /opt/postgresql/bin
DOCDIR = /opt/postgresql/share/doc
HTMLDIR = /opt/postgresql/share/doc
INCLUDEDIR = /opt/postgresql/include
PKGINCLUDEDIR = /opt/postgresql/include
INCLUDEDIR-SERVER = /opt/postgresql/include/server
LIBDIR = /opt/postgresql/lib
PKGLIBDIR = /opt/postgresql/lib
LOCALEDIR = /opt/postgresql/share/locale
MANDIR = /opt/postgresql/share/man
SHAREDIR = /opt/postgresql/share
SYSCONFDIR = /opt/postgresql/etc
PGXS = /opt/postgresql/lib/pgxs/src/makefiles/pgxs.mk
CONFIGURE = '--prefix=/opt/postgresql' '--with-openssl' '--enable-debug' '--enable-profiling' '--enable-cassert' '--enable-tap-tests' 'CFLAGS=-ggdb -Og -g3 -fno-omit-frame-pointer -DLOCK_DEBUG -DBTREE_BUILD_STATS -DWAL_DEBUG '
CC = gcc
CPPFLAGS = -D_GNU_SOURCE
CFLAGS = -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wimplicit-fallthrough=3 -Wcast-function-type -Wshadow=compatible-local -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-format-truncation -Wno-stringop-truncation -g -pg -DLINUX_PROFILE -ggdb -Og -g3 -fno-omit-frame-pointer -DLOCK_DEBUG -DBTREE_BUILD_STATS -DWAL_DEBUG
CFLAGS_SL = -fPIC
LDFLAGS = -Wl,--as-needed -Wl,-rpath,'/opt/postgresql/lib',--enable-new-dtags
LDFLAGS_EX =
LDFLAGS_SL =
LIBS = -lpgcommon -lpgport -lssl -lcrypto -lz -lreadline -lm
VERSION = PostgreSQL 18devel

postgres@ubuntu:/tmp$ psql
psql (18devel)
Type "help" for help.

postgres=# show client_min_messages;
 client_min_messages
---------------------
 log
(1 row)

postgres=# show trace_lock_oidmin;
 trace_lock_oidmin
-------------------
 16400
(1 row)

postgres=# select 't'::regclass::oid;
  oid
-------
 16401
(1 row)

postgres=# select 't1'::regclass::oid;
  oid
-------
 16404
(1 row)

postgres=# alter table t detach partition t1;
ALTER TABLE
postgres=# alter table t attach partition t1 for values in (0);
ALTER TABLE
*/

But if I map trace_lock_table to the OID of one table, it logs locking fine:

/*
postgres=# alter system set trace_lock_table = 16401;
ALTER SYSTEM
postgres=# select pg_reload_conf();
 pg_reload_conf
----------------
 t
(1 row)

postgres=# alter table t detach partition t1;
LOG: LockAcquire: lock [5,16401] AccessExclusiveLock
LOG: LockAcquire: new: lock(0x708b6d12ae78) id(5,16401,0,0,0,1) grantMask(0) req(0,0,0,0,0,0,0)=0 grant(0,0,0,0,0,0,0)=0 wait(0) type(AccessExclusiveLock)
LOG: LockAcquire: new: proclock(0x708b6d1da680) lock(0x708b6d12ae78) method(1) proc(0x708b6d4b8250) hold(0)
LOG: LockCheckConflicts: no conflict: proclock(0x708b6d1da680) lock(0x708b6d12ae78) method(1) proc(0x708b6d4b8250) hold(0)
LOG: GrantLock: lock(0x708b6d12ae78) id(5,16401,0,0,0,1) grantMask(100) req(0,0,0,0,0,0,0)=1 grant(0,0,0,0,0,0,0)=1 wait(0) type(AccessExclusiveLock)
LOG: LockAcquire: lock [5,16401] AccessExclusiveLock
LOG: LockReleaseAll: proclock(0x708b6d1da680) lock(0x708b6d12ae78) method(1) proc(0x708b6d4b8250) hold(100)
LOG: LockReleaseAll: lock(0x708b6d12ae78) id(5,16401,0,0,0,1) grantMask(100) req(0,0,0,0,0,0,0)=1 grant(0,0,0,0,0,0,0)=1 wait(0) type(INVALID)
LOG: UnGrantLock: updated: lock(0x708b6d12ae78) id(5,16401,0,0,0,1) grantMask(0) req(0,0,0,0,0,0,0)=0 grant(0,0,0,0,0,0,0)=0 wait(0) type(AccessExclusiveLock)
LOG: UnGrantLock: updated: proclock(0x708b6d1da680) lock(0x708b6d12ae78) method(1) proc(0x708b6d4b8250) hold(0)
LOG: LockReleaseAll: updated: lock(0x708b6d12ae78) id(5,16401,0,0,0,1) grantMask(0) req(0,0,0,0,0,0,0)=0 grant(0,0,0,0,0,0,0)=0 wait(0) type(INVALID)
LOG: CleanUpLock: deleting: proclock(0x708b6d1da680) lock(0x708b6d12ae78) method(1) proc(0x708b6d4b8250) hold(0)
LOG: CleanUpLock: deleting: lock(0x708b6d12ae78) id(5,16401,0,0,0,1) grantMask(0) req(0,0,0,0,0,0,0)=0 grant(0,0,0,0,0,0,0)=0 wait(0) type(INVALID)
ALTER TABLE
*/

--
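A side note for anyone reproducing this setup: it can help to look at all of the LOCK_DEBUG-related settings side by side. A minimal sketch follows; the idea that trace_locks must also be enabled for the trace_lock_oidmin threshold to produce any output is an assumption based on how the lock-tracing code path is gated, not something the quoted documentation states, so treat it as something to verify.

-- trace_locks, trace_lock_oidmin, trace_lock_table and client_min_messages
-- are all documented settings; the interplay noted below is an assumption.
SELECT name, setting
FROM pg_settings
WHERE name IN ('trace_locks', 'trace_lock_oidmin',
               'trace_lock_table', 'client_min_messages');

SET trace_locks = on;             -- assumption: required in addition to the threshold
SET trace_lock_oidmin = 16400;    -- trace locks only for tables at or above this OID
SET trace_lock_table = 16401;     -- unconditionally trace locks on this one table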
How to Copy/Load 1 Billion Rows into a Partitioned Table Fast
I am trying to copy a table (Postgres) that is close to 1 billion rows into a partitioned table (Postgres) within the same DB. What is the fastest way to copy the data? The table has 37 columns, some of which are text data types.
Thank you
Kam Fook Wong
There are many ways to achieve this. One of them is the pg_bulkload utility described in the previous email, but I have always preferred Python multiprocessing, which I think is more efficient. Below is code you can modify to fit your requirements:
import multiprocessing

import psycopg2


def insert_partition(date_range):
    """Copy one range of the partition key from the source table into the partitioned table."""
    conn = psycopg2.connect("dbname=your_db user=your_user password=your_password")
    cur = conn.cursor()
    # Column lists are placeholders: spell out the real column lists of both tables.
    cur.execute(
        """
        INSERT INTO partitioned_table (column1, column2, ...)
        SELECT column1, column2, ...
        FROM source_table
        WHERE partition_key BETWEEN %s AND %s;
        """,
        (date_range[0], date_range[1]),
    )
    conn.commit()
    cur.close()
    conn.close()


if __name__ == "__main__":
    # Non-overlapping ranges that together cover the whole partition key space.
    ranges = [
        ('2024-01-01', '2024-03-31'),
        ('2024-04-01', '2024-06-30'),
        # Add more ranges as needed
    ]
    pool = multiprocessing.Pool(processes=4)  # Adjust based on CPU cores
    pool.map(insert_partition, ranges)
    pool.close()
    pool.join()
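One assumption baked into the sketch above: the ranges must not overlap and must together cover the full span of partition_key in source_table, otherwise rows end up duplicated or missed. Because each worker commits its range independently, a failed range can be rerun on its own without touching the others.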
Re: How to Copy/Load 1 Billion Rows into a Partitioned Table Fast
On Tue, 15 Oct 2024 at 06:59, Wong, Kam Fook (TR Technology) <kamfook.wong@thomsonreuters.com> wrote:
> I am trying to copy a table (Postgres) that is close to 1 billion rows into a partitioned table (Postgres) within the same DB. What is the fastest way to copy the data? The table has 37 columns, some of which are text data types.

Is the purpose of this question because you're partitioning an existing table? If so, you might want to consider whether speed is really the biggest consideration in the requirements.

It is possible, for example, to partition a table "online" by using table inheritance as an intermediate way to partition the table and migrate the rows in smaller batches into inheritance child tables with CHECK constraints backing up the partition constraint. You can use a CTE with a DELETE .. WHERE <clause to some small batch of rows> RETURNING with an INSERT INTO new_table SELECT * FROM cte;. Once the inheritance parent table is empty, you can then consider rearranging the inheritance hierarchy into a partitioned table and its partitions. The CHECK constraint will allow the tables to be ATTACHed as partitions to a new partitioned table without having to scan each partition to ensure no rows violate the partition constraint. If done correctly, the only blocking operation is some DDL which includes renaming a table and attaching all the partitions. All of that should be metadata-only operations. You'll want to rehearse the migration a few times away from production to help ensure it'll run smoothly on the day.

I'm not familiar with pg_bulkload so can't comment on the other suggestions; however, I'd be surprised if exporting the data out of and back into PostgreSQL would be faster than having it remain inside PostgreSQL. Not exporting/importing means you don't need to call output and input functions for every row and column.

If you didn't want to go down the inheritance table route as an intermediate step, then you might find it's quite fast to start up a series of parallel jobs to INSERT INTO partition_name SELECT * FROM original_table WHERE <rows for this partition>;

David
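To make the batched move David describes concrete, here is a minimal sketch under assumed names: orders is the inheritance parent, orders_2024q1 is a child table whose CHECK constraint matches the eventual partition bound, order_date is the partition key, id is the primary key, and the batch size of 10000 is arbitrary.

WITH batch AS (
    DELETE FROM ONLY orders o
    WHERE o.id IN (
        SELECT id
        FROM ONLY orders                      -- ONLY: read just the parent, not the children
        WHERE order_date >= DATE '2024-01-01'
          AND order_date <  DATE '2024-04-01'
        LIMIT 10000                           -- repeat until the statement moves 0 rows
    )
    RETURNING o.*
)
INSERT INTO orders_2024q1
SELECT * FROM batch;

Once the parent is empty, the switch-over David mentions should be metadata-only: because each child already carries a validated CHECK constraint equivalent to its partition bound, ALTER TABLE ... ATTACH PARTITION can skip the scan that would otherwise be needed to verify the partition constraint.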