Thread: Batch Insert Performance
I'm trying to seed a database and I'm surprised by how slow it is. I'm doing a query like:

    INSERT INTO source_other_xref (source_id, other_id)
        (SELECT source_id, 1 + trunc(1000*random())
           FROM source_table);

where source_table has 271,000 records. Table source_other_xref has two columns, both of which are foreign key references: one to source_table and the other to another table.

It's been running for hours and hours. The machine doesn't seem to be thrashing its disk with much reading or writing; it seems to be completely cpu-bound. I'm thinking it's mostly wasting time checking the foreign key constraints.

Is there any equivalent to Oracle's ability to "disable" a constraint and then re-enable it later? Even with the foreign key constraints, I wouldn't have expected inserting 200k records to take this long.

-- greg
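[For anyone finding this thread later: PostgreSQL has no direct equivalent of Oracle's DISABLE CONSTRAINT for foreign keys; the usual workaround is to drop the constraints, load, and re-add them. A sketch, with the constraint and referenced table/column names guessed (check \d source_other_xref in psql for the real ones):

```sql
-- Drop both foreign keys (constraint names are assumptions here).
ALTER TABLE source_other_xref DROP CONSTRAINT source_other_xref_source_id_fkey;
ALTER TABLE source_other_xref DROP CONSTRAINT source_other_xref_other_id_fkey;

-- ... run the big INSERT ... SELECT here ...

-- Re-add them; each ADD re-validates every existing row.
ALTER TABLE source_other_xref
    ADD FOREIGN KEY (source_id) REFERENCES source_table (source_id);
ALTER TABLE source_other_xref
    ADD FOREIGN KEY (other_id) REFERENCES other_table (other_id);
```

The re-add step is where the per-row checking cost comes back, so the saving is in doing it once in bulk rather than per inserted row.]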
Hi greg,

Here is an info page that could be useful to you: "Referential Integrity Tutorial & Hacking" at http://techdocs.postgresql.org/techdocs/hackingreferentialintegrity.php But be careful: it has you deal with the system tables directly.

Have you tried selecting the data into a temp table and "copying" that into the source_other_xref table? Could that be faster?

Ciao,
peter
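[A sketch of the temp-table route peter suggests, with column names assumed from greg's original query; the staging table carries no foreign keys, so the random data is generated without any constraint checking, and only the final load pays that cost:

```sql
-- Stage the rows in a temp table with no constraints attached.
CREATE TEMP TABLE xref_staging AS
    SELECT source_id, 1 + trunc(1000*random()) AS other_id
      FROM source_table;

-- Load the staged rows into the real table in one pass.
-- (Alternatively, COPY the staging table out to a file and COPY it back in.)
INSERT INTO source_other_xref (source_id, other_id)
    SELECT source_id, other_id FROM xref_staging;
```

Whether this helps depends on where the time actually goes; if the cost is the foreign key checks on source_other_xref itself, the final INSERT still pays them.]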
Ok, a weird thing happened. I gave up on the batch insert, went to a friend's house to drink some scotch, came back and tried it again, and the whole insert took 5 minutes. That's _much_ more reasonable. Now, it was late at night and I wasn't necessarily completely sober, but I'm pretty sure _nothing_ had changed. I can only imagine the query was somehow frozen trying to grab a lock on something. Only nothing else was running, and it was consuming an awful lot of cpu for something trying to acquire a lock.

For what it's worth, I tried it with the constraints removed using ALTER TABLE. It took 10 seconds to do the entire insert, then 1m15s to restore each constraint. (Can I restore them together? Would it be faster?) So that's faster, but only by a factor of 2.

-- greg
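[On restoring them together: ALTER TABLE does accept multiple ADD subcommands in one statement, so syntactically both constraints can come back at once (names assumed again):

```sql
-- Both foreign keys re-added in a single ALTER TABLE statement.
-- Each constraint is still validated by its own check over the table,
-- so this is mostly a convenience; don't count on it halving the 1m15s.
ALTER TABLE source_other_xref
    ADD FOREIGN KEY (source_id) REFERENCES source_table (source_id),
    ADD FOREIGN KEY (other_id) REFERENCES other_table (other_id);
```

Whether combining them actually saves time would need measuring, since the validation work per constraint is the same either way.]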