Re: Is there any limit on the number of rows to import using copy command - Mailing list pgsql-general
| From | Adrian Klaver |
| --- | --- |
| Subject | Re: Is there any limit on the number of rows to import using copy command |
| Date | |
| Msg-id | e8fb9b99-40a2-41eb-8932-0959db8356c6@aklaver.com |
| In response to | Re: Is there any limit on the number of rows to import using copy command ("sivapostgres@yahoo.com" <sivapostgres@yahoo.com>) |
| List | pgsql-general |
On 7/24/25 16:59, sivapostgres@yahoo.com wrote:

> 1. Testcase. Created a new database, modified the triggers (split into
> three), populated the required master data and lookup tables. Then
> transferred 86420 records. Checked whether all 86420 records were
> inserted in table1 and also whether the trigger created the required
> records in table2. Yes, it did.
>
> 2. In the test case above, the total time taken to insert 86420 records
> was only 1.15 min. Earlier (before splitting the triggers) we waited
> more than 1.5 hrs the first time and 2.5 hrs the second time, with no
> records inserted.
>
> 3. Regarding moving the logic to a procedure: won't the trigger work?
> Will it be a burden for 86420 records? It works if we insert a few
> thousand records. After the split of the trigger function, it works for
> 86420 records. Are triggers overhead for handling even 100000 records?
> In the production system, the same (single) trigger is working with 3
> million records. There might be better alternatives to triggers, but
> triggers should also work, IMHO.

Reread this post, in the thread, from Laurenz Albe:

https://www.postgresql.org/message-id/de08fd016dd9c630f65c52b80292550e0bcdea4c.camel%40cybertec.at

> 4. Staging tables. Yes, I have done that in another case, where there
> was a need to add/transform data for a few more columns. It worked like
> a charm. In this case, since there was no need for any other
> calculation (transformation), just column-to-column matching, I thought
> the copy command would do.

There is a transformation: you are moving data to another table. That
is overhead, especially if the triggers are not optimized.

> Before splitting the trigger into three, we tried:
>
> 1. Transferring data using DataWindow / PowerBuilder (that's the tool
> we use to develop our front end). With the same single trigger, it took
> a few hours (more than 4 hours, exact time not noted down) to transfer
> the same 86420 records. (DataWindow fires insert statements for every
> row.) Works, but the time taken is not acceptable.

INSERTs by row are going to be slow, especially if the tool is doing a
commit for each, which I suspect it is. Check the Postgres logs.

> 2. Next, we split the larger csv file into 8, with each file containing
> 10,000 records and the last one 16420 records. The copy command worked.
> Works, but the time taken to split the file is not acceptable. We wrote
> a batch file to split the larger csv file. We felt a batch file is
> easier to automate for the whole process using PowerBuilder.

I find most GUI tools create extra steps and overhead. My preference is
simpler tools, e.g. using the Python csv module to batch/stream rows
that the Python psycopg Postgres driver can insert or copy into the
database. See:

https://www.psycopg.org/psycopg3/docs/basic/copy.html
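For illustration, a minimal sketch of that csv-plus-psycopg approach,
using the psycopg 3 copy API from the linked page. The connection
string, file name, and table/column names here are invented
placeholders, not the poster's actual schema:

```python
import csv

import psycopg  # psycopg 3, per the linked documentation

# Placeholder connection string and file name.
with psycopg.connect("dbname=test user=postgres") as conn:
    with conn.cursor() as cur:
        with open("data.csv", newline="") as f:
            reader = csv.reader(f)
            next(reader)  # skip the header row, if the file has one
            # Stream every row through a single COPY statement instead
            # of one INSERT (and possibly one commit) per row.
            with cur.copy("COPY table1 (col1, col2) FROM STDIN") as copy:
                for row in reader:
                    copy.write_row(row)
    # The connection context manager commits on a clean exit.
```

Because all rows travel through one COPY statement, there is no per-row
round trip or per-row commit, which is where the row-by-row INSERT
approach above loses its time.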
> 3. What we observed here is that the insert statement succeeds and the
> copy command fails if the records exceed a certain number. Haven't
> arrived at the exact number of rows at which the copy command fails.
>
> Will do further work after my return from a holiday.
>
> Happiness Always
> BKR Sivaprakash
>
> On Thursday 24 July, 2025 at 08:18:07 pm IST, Adrian Klaver
> <adrian.klaver@aklaver.com> wrote:
>
> On 7/24/25 05:18, sivapostgres@yahoo.com wrote:
> > Thanks Merlin, Adrian, Laurenz
> >
> > As a testcase, I split the trigger function into three, one each for
> > insert, update, delete, each called from a separate trigger.
> >
> > IT WORKS!
>
> It worked before, it just slowed down as your cases got bigger. You
> need to provide more information on what test case you used and how
> you define worked.
>
> > Shouldn't we have one trigger function for all the three trigger
> > events? Is it prohibited for bulk insert like this?
>
> No. Triggers are overhead and they add to the processing that needs to
> be done for moving the data into the table. Whether that is an issue
> is a case by case determination.
>
> > I tried this in PGAdmin only, will complete the testing from the
> > program which we are developing, after my return from holiday.
>
> From Merlin Moncure's post:
>
> "* reconfiguring your logic to a procedure can be a better idea; COPY
> your data into some staging tables (perhaps temp, and indexed), then
> write to various tables with joins, upserts, etc."
>
> I would suggest looking into implementing the above.
>
> > Happiness Always
> > BKR Sivaprakash
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com

--
Adrian Klaver
adrian.klaver@aklaver.com
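As a rough sketch of the staging-table pattern Merlin Moncure describes
above (COPY into a temp staging table, then one set-based write into
the real table), again with psycopg 3. Every name here (staging,
table1, id, col1, col2) is a placeholder, and the upsert assumes a
unique constraint on table1.id that may not match the real schema:

```python
import psycopg  # psycopg 3

# Placeholder connection string, file name, and schema.
with psycopg.connect("dbname=test user=postgres") as conn:
    with conn.cursor() as cur:
        # Temp staging table: same shape as the target, but with no
        # triggers or indexes, so the bulk load itself stays cheap.
        cur.execute(
            "CREATE TEMP TABLE staging (LIKE table1 INCLUDING DEFAULTS)"
        )
        with open("data.csv") as f:
            with cur.copy(
                "COPY staging FROM STDIN WITH (FORMAT csv, HEADER)"
            ) as copy:
                while data := f.read(8192):  # stream the file in blocks
                    copy.write(data)
        # One set-based statement does the work the per-row trigger was
        # doing; ON CONFLICT assumes a unique constraint on table1.id.
        cur.execute("""
            INSERT INTO table1 (id, col1, col2)
            SELECT id, col1, col2 FROM staging
            ON CONFLICT (id) DO UPDATE
               SET col1 = EXCLUDED.col1,
                   col2 = EXCLUDED.col2
        """)
```

Whether the trigger logic can actually be folded into one such
statement depends on what table2 needs, which is the case by case
determination mentioned above.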