Thread: insert
Hi,

is there anything I can do to speed up inserts? One of my tables gets
about 100 new rows every five minutes, and somehow the inserts tend to
take more and more time.

Any suggestions welcome.

TIA
Ulrich
Tips!

* Delete indexes and recreate them after the insert.
* Disable auto-commit.
* Performing a COPY will be faster, sure.

Best wishes,
Guido

> Hi,
>
> is there anything I can do to speed up inserts? One of my tables gets
> about 100 new rows every five minutes. And somehow the inserts tend to
> take more and more time.
>
> Any suggestions welcome.
>
> TIA
> Ulrich
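The COPY tip above can be sketched as follows; the table name, columns, and file path are made up for illustration, and the file must be readable by the server:

```sql
-- Load one burst of rows with a single COPY statement instead of
-- 100 individual INSERTs (hypothetical table and path):
COPY measurements (ts, value) FROM '/tmp/batch.dat';
```

Each row of the input file supplies the listed columns, tab-separated by default, so the whole burst is parsed and inserted in one statement.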
"G u i d o B a r o s i o" <gbarosio@uolsinectis.com.ar> wrote:

[speeding up 100 inserts every 5 minutes]

> Tips!
> * Delete indexes and recreate them after the insert.

sounds a bit extreme, for only 100 inserts

gnari
As I see it, it's 100 inserts every 5 minutes, not only 100 inserts.

Sure, it's extreme for only 100 inserts.

Cheers,
Guido

> "G u i d o B a r o s i o" <gbarosio@uolsinectis.com.ar> wrote:
>
> [speeding up 100 inserts every 5 minutes]
>
> > Tips!
> > * Delete indexes and recreate them after the insert.
>
> sounds a bit extreme, for only 100 inserts
>
> gnari
From: "G u i d o B a r o s i o" <gbarosio@uolsinectis.com.ar>:

> As I see it, it's 100 inserts every 5 minutes, not only 100 inserts.
>
> Sure, it's extreme for only 100 inserts.

I am sorry, I do not quite grasp what you are saying. My understanding
was that there are constantly new inserts, coming in bursts of 100,
every 5 minutes. I imagined that the indexes were needed in between.

If this is the case, the bunches of 100 inserts should be done inside a
transaction (or by 1 COPY statement).

If, on the other hand, the inserts happen independently, at a rate of
100 inserts / 5 minutes, then this will not help.

gnari
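The transaction suggestion above amounts to this pattern; the table and values are hypothetical:

```sql
-- One burst of 100 rows committed as a single transaction, so the
-- WAL flush cost is paid once rather than per row (hypothetical table):
BEGIN;
INSERT INTO measurements (ts, value) VALUES (now(), 1.0);
INSERT INTO measurements (ts, value) VALUES (now(), 2.0);
-- ... the rest of the burst ...
COMMIT;
```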
Hi,

my inserts are done in one transaction, but due to some foreign key
constraints and five indexes, sometimes the 100 inserts will take more
than 5 minutes.

/Ulrich
On 13/08/2004 13:10 Ulrich Wisser wrote:
> Hi,
>
> my inserts are done in one transaction, but due to some foreign key
> constraints and five indexes sometimes the 100 inserts will take more
> than 5 minutes.

Two possibilities come to mind:

a) you need an index on the referenced FK field
b) you have an index but a type mismatch (e.g., an int4 field
   referencing an int8 field)

Either of these will cause a sequential table scan and poor performance.

--
Paul Thomas
+------------------------------+---------------------------------------------+
| Thomas Micro Systems Limited | Software Solutions for Business             |
| Computer Consultants         | http://www.thomas-micro-systems-ltd.co.uk   |
+------------------------------+---------------------------------------------+
On Fri, 2004-08-13 at 08:10, Ulrich Wisser wrote:
> Hi,
>
> my inserts are done in one transaction, but due to some foreign key
> constraints and five indexes sometimes the 100 inserts will take more
> than 5 minutes.

It is likely that you are missing an index on one of those foreign
key'd items. Do an

  EXPLAIN ANALYZE SELECT * FROM foreign_table
   WHERE foreign_col = '<insert value>';

Fix them until they're quick.
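A concrete version of that check might look like this, with hypothetical table and column names standing in for the real schema:

```sql
-- Time the lookup that the FK trigger performs on every insert:
EXPLAIN ANALYZE SELECT * FROM category WHERE category_id = 42;

-- If the plan shows a Seq Scan rather than an Index Scan,
-- an index on the looked-up column is the usual fix:
CREATE INDEX category_category_id_idx ON category (category_id);
```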
On Fri, Aug 13, 2004 at 08:57:56 -0400,
  Rod Taylor <pg@rbt.ca> wrote:
> On Fri, 2004-08-13 at 08:10, Ulrich Wisser wrote:
> > my inserts are done in one transaction, but due to some foreign key
> > constraints and five indexes sometimes the 100 inserts will take more
> > than 5 minutes.
>
> It is likely that you are missing an index on one of those foreign
> key'd items.

I don't think that is too likely, as a foreign key reference must be a
unique key, which would have an index. I think the type mismatch
suggestion is probably what the problem is.

The current solution is to make the types match. In 8.0.0 it would
probably work efficiently as is, though it isn't normal for foreign keys
to have a type mismatch and he may want to change that anyway.
On Fri, Aug 13, 2004 at 17:17:10 +0100,
  Matt Clark <matt@ymogen.net> wrote:
> > > It is likely that you are missing an index on one of those foreign
> > > key'd items.
> >
> > I don't think that is too likely as a foreign key reference
> > must be a unique key which would have an index.
>
> I think you must be thinking of primary keys, not foreign keys. All
> one-to-many relationships have non-unique foreign keys.

The target of the reference needs to have at least a unique index. I am
not sure if it needs to actually be declared as either a unique or
primary key, though that is the intention. The records doing the
referencing don't need to be (and normally aren't) unique.
> > It is likely that you are missing an index on one of those foreign
> > key'd items.
>
> I don't think that is too likely as a foreign key reference
> must be a unique key which would have an index.

I think you must be thinking of primary keys, not foreign keys. All
one-to-many relationships have non-unique foreign keys.
Bruno Wolff III <bruno@wolff.to> writes:
> Rod Taylor <pg@rbt.ca> wrote:
>> It is likely that you are missing an index on one of those foreign
>> key'd items.

> I don't think that is too likely as a foreign key reference must be a
> unique key which would have an index. I think the type mismatch
> suggestion is probably what the problem is.

I agree. It is possible to have a lack-of-index problem on the
referencing column (as opposed to the referenced column), but that
normally only hurts you for deletes from the referenced table.

> The current solution is to make the types match. In 8.0.0 it would
> probably work efficiently as is, though it isn't normal for foreign
> keys to have a type mismatch and he may want to change that anyway.

8.0 will not fix this particular issue, as I did not add any
numeric-vs-int comparison operators. If we see a lot of complaints we
could think about adding such, but for 8.0 only the more common cases
such as int-vs-bigint are covered.

regards, tom lane
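Making the types match, as suggested above, might be sketched like this; the table and column names are hypothetical, and ALTER COLUMN ... TYPE requires 8.0 or later (on older versions the usual route is adding a new column of the right type and copying the data over):

```sql
-- Change the referencing column to the same type as the referenced key,
-- so the FK lookup can use the index on the referenced table:
ALTER TABLE products ALTER COLUMN category TYPE integer;
```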
gnari wrote:
> "G u i d o B a r o s i o" <gbarosio@uolsinectis.com.ar> wrote:
>
> [speeding up 100 inserts every 5 minutes]
>
>> Tips!
>> * Delete indexes and recreate them after the insert.
>
> sounds a bit extreme, for only 100 inserts

Which fsync method are you using? Change it and see what happens.

Regards,
Gaetano Mendola
I think I figured it out myself. If anyone sees issues with this
(simple) approach, please let me know.

I changed my table definitions to this:

CREATE SEQUENCE public.product_id_seq;

CREATE TABLE products (
    product_id INTEGER DEFAULT nextval('product_id_seq'::regclass) NOT NULL,
    name       VARCHAR(60) NOT NULL,
    category   SMALLINT NOT NULL,
    CONSTRAINT product_id PRIMARY KEY (product_id)
);

CREATE SEQUENCE public.category_id_seq;

CREATE TABLE category (
    category_id INTEGER DEFAULT nextval('category_id_seq'::regclass) NOT NULL,
    name        VARCHAR(20) NOT NULL,
    CONSTRAINT category_id PRIMARY KEY (category_id)
);

ALTER TABLE products
    ADD CONSTRAINT category_products_fk FOREIGN KEY (category)
    REFERENCES category (category_id)
    ON DELETE NO ACTION ON UPDATE CASCADE;

Then created this function:

CREATE OR REPLACE FUNCTION getid(_table text, _pk text, _name text)
RETURNS integer AS $$
DECLARE
    _id integer;
BEGIN
    EXECUTE 'SELECT ' || _pk
         || ' FROM ' || _table::regclass
         || ' WHERE name = ' || quote_literal(_name)
        INTO _id;
    IF _id > 0 THEN
        RETURN _id;
    ELSE
        EXECUTE 'INSERT INTO ' || _table
             || ' VALUES (DEFAULT, ' || quote_literal(_name) || ')'
             || ' RETURNING ' || _pk
            INTO _id;
        RETURN _id;
    END IF;
END;
$$ LANGUAGE plpgsql VOLATILE;

Now I can just insert into the products table via:

INSERT INTO products
VALUES (DEFAULT, 'Postgresql for Dummies',
        getid('category', 'category_id', 'books'));

For example:

testdb=# select * from products;
 product_id | name | category
------------+------+----------
(0 rows)

testdb=# select * from category;
 category_id | name
-------------+------
(0 rows)

testdb=# insert into products values(DEFAULT, 'Postgresql for Dummies',
         getid('category', 'category_id', 'books'));
INSERT 0 1

testdb=# select * from category;
 category_id | name
-------------+-------
           1 | books

testdb=# select * from products;
 product_id |          name          | category
------------+------------------------+----------
          1 | Postgresql for Dummies |        1

Updates to category_id in the category table are also cascaded to the
products table:

testdb=# UPDATE category SET category_id = 2 WHERE category_id = 1;
UPDATE 1

testdb=# SELECT * FROM products;
 product_id |          name          | category
------------+------------------------+----------
          1 | Postgresql for Dummies |        2

Alan
Hello.

Please note that in a multitasking environment you may have problems
with your code. Two connections may check if "a" is available and, if
not (and both got an empty "select" result), try to insert. One will
succeed; the other will fail if you have a unique constraint on the
category name (and you'd better have one).

Please note that SELECT FOR UPDATE won't help you much, since it is a
new record you are looking for, and the select doesn't return (and
lock) it. I am using "LOCK TABLE <tableName> IN SHARE ROW EXCLUSIVE
MODE" in this case.

But then, if you have multiple lookup dictionaries, you need to ensure a
strict order of locking or you will be getting deadlocks. As for me, I
created a special application-side class to retrieve such values. If I
can't find a value in the main connection with a simple select, I open a
new connection, perform the table lock, and check if the value is in
there. If it is not, I add the value and commit. This may produce
orphaned dictionary entries (if the dictionary entry is committed and
the main transaction is rolled back), but this is usually OK for
dictionaries. At the same time, I don't introduce hard locks into the
main transaction and don't have to worry about deadlocks.

Best regards,
Vitalii Tymchyshyn
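The locking pattern described above might look like this in SQL, using the hypothetical category dictionary table from earlier in the thread; the re-check after taking the lock is what closes the race window:

```sql
BEGIN;
-- Blocks other writers taking the same lock, but not plain readers:
LOCK TABLE category IN SHARE ROW EXCLUSIVE MODE;

-- Re-check now that no concurrent inserter can get in ahead of us:
SELECT category_id FROM category WHERE name = 'books';

-- Only if the select returned no row:
INSERT INTO category (name) VALUES ('books');
COMMIT;
```

If several such dictionary tables exist, always locking them in the same order avoids the deadlocks mentioned above.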
Vitalii Tymchyshyn <tivv00@gmail.com> wrote:
> Please note that in multitasking environment you may have problems
> with your code. Two connections may check if "a" is available and
> if not (and both got empty "select" result), try to insert. One
> will succeed, another will fail if you have a unique constraint on
> category name (and you'd better have one).
>
> Please note that select for update won't help you much, since this
> is new record you are looking for, and select don't return (and
> lock) it. I am using "lock table <tableName> in SHARE ROW
> EXCLUSIVE mode" in this case.
>
> But then, if you have multiple lookup dictionaries, you need to
> ensure strict order of locking or you will be getting deadlocks.

It sounds like you might want to check out the new "truly serializable"
transactions in version 9.1. If you can download the latest beta version
and test with default_transaction_isolation = 'serializable', I would be
interested to hear your results. Note that you can't have deadlocks, but
you can have other types of serialization failures, so your software
needs to be prepared to start a transaction over from the beginning when
the SQLSTATE of a failure is '40001'.
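Under that approach, the lookup-or-insert needs no explicit table lock; a sketch, reusing the hypothetical category table (the retry loop itself lives in the application):

```sql
-- Run the check-then-insert at serializable isolation; if any statement
-- or the COMMIT fails with SQLSTATE '40001', the application simply
-- retries the whole block from BEGIN:
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT category_id FROM category WHERE name = 'books';
-- only if the select found nothing:
INSERT INTO category (name) VALUES ('books');
COMMIT;
```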
The Wiki page which was used to document and organize the work is:

http://wiki.postgresql.org/wiki/Serializable

This is in a little bit of a funny state because not all of the wording
that was appropriate while the feature was under development (e.g.,
future-tense verbs) has been changed to something more appropriate for a
finished feature, but it should cover the theoretical ground pretty
well.

An overlapping document, which was initially based on parts of the Wiki
page and has received more recent attention, is the README-SSI file
here:

http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob_plain;f=src/backend/storage/lmgr/README-SSI;hb=master

Some examples geared toward programmers and DBAs are at this Wiki page:

http://wiki.postgresql.org/wiki/SSI

It could use a couple more examples and a bit of language cleanup, but
what is there is fairly sound. The largest omission is that we need to
show more explicitly that serialization failures can occur at times
other than COMMIT. (I got a little carried away trying to show that
there was no blocking and that the "first committer wins".)

-Kevin