Thread: Queries never returning...
I am currently having a problem with a query never finishing (or at least not in a reasonable amount of time.) I have had similar problems to this in Postgres over the past several years, and I have always found workarounds. This time I'd like to actually understand what is happening. I have two tables, tbl_claim and tbl_insured. tbl_claim has a column insured_id which references the primary key in tbl_insured, also named insured_id. Originally they were to have a one to many relationship (one tbl_insured record could potentionally be referenced by multiple records in tbl_claim). In practice, however, they're essentially 1 to 1. There are a few "stragglers", but it isn't really necessary to keep the integrity. I want to move all data from tbl_insured into tbl_claim. I added all pertinent columns to tbl_insured. However, when I run the following query: UPDATE tbl_claim SET ins_lname = tbl_insured.ins_lname, ins_fname = tbl_insured.ins_fname, ins_mi = tbl_insured.ins_mi, ins_add1 = tbl_insured.ins_add1, ins_add2 = tbl_insured.ins_add2, ins_city = tbl_insured.ins_city, ins_state = tbl_insured.ins_state, ins_zip = tbl_insured.ins_zip, ins_phone = tbl_insured.ins_phone, ins_altphone =tbl_insured.ins_altphone, ins_cell = tbl_insured.ins_cell, ins_pager = tbl_insured.ins_pager, ins_fax = tbl_insured.ins_fax, ins_email = tbl_insured.ins_email FROM tbl_insured WHERE tbl_claim.insured_id = tbl_insured.insured_id; it never comes back. I have left it running for 30 minutes or so, but it never comes back (It is simply not acceptable to run this query for 30 minutes, as this is part of a much larger system overhaul...I can't have my production system down for this long) Here are my record counts: select count(*) FROM tbl_insured; count -------- 242083 (1 row) select count(*) FROM tbl_claim; count -------- 243121 select count(*) FROM tbl_claim INNER JOIN tbl_insured ON tbl_claim.insured_id = tbl_insured.insured_id; count -------- 243117 Here is my explain of the update query: Hash Join (cost=11033.04..94030.41 rows=313765 width=596) Hash Cond: ("outer".insured_id = "inner".insured_id) -> Seq Scan on tbl_claim (cost=0.00..16240.29 rows=445829 width=442) -> Hash (cost=4871.83..4871.83 rows=242083 width=158) -> Seq Scan on tbl_insured (cost=0.00..4871.83 rows=242083 width=158) (5 rows) When I try an explain analyze, it just spins forever as well (I've never let it run for more than 30 minutes). I could easily write a little PHP script to copy over the data, but I'd like to do it in SQL, and if not at least know WHY it isn't working. I can't quite see why this is taking so long, and I don't know how I would go about diagnosing the problem, since the explain doesn't really seem like it should take too long, and explain analyze is hanging as well. Note: I did try a VACUUM FULL ANALYZE prior to running the query.
John McCawley <nospam@hardgeus.com> writes: > I am currently having a problem with a query never finishing (or at > least not in a reasonable amount of time.) I have had similar problems > to this in Postgres over the past several years, and I have always found > workarounds. This time I'd like to actually understand what is happening. If that join SELECT comes back in a reasonable period of time, then the UPDATE shouldn't take too long either --- they're both doing about the same thing as far as performing the join goes. The differential would have to be index updates or triggers fired by the UPDATE. I'd bet on the latter, but since you've told us zip about your schema or what PG version this is, it's impossible to speculate further... regards, tom lane
On 12/28/05, John McCawley <nospam@hardgeus.com> wrote: > I am currently having a problem with a query never finishing (or at > least not in a reasonable amount of time.) I have had similar problems > to this in Postgres over the past several years, and I have always found > workarounds. This time I'd like to actually understand what is happening. > I bet it would go faster if you dropped the RI constraints and any other triggers first.
Tom Lane wrote: >The differential would have to be index updates or triggers fired by the UPDATE. I'd bet on >the latter, but since you've told us zip about your schema or what PG >version this is, it's impossible to speculate further... > > This is my development machine. I'm running PostgreSQL 8.0.3 on a Pentium 4 3GHZ Gentoo machine with a 2.6.12 kernel, 1 gig of RAM. Everything is running on one big partition on a SATA drive. You're right, it looks to be trigger related. I did have a timestamp trigger, which I have removed, however it still has a bunch of foreign key triggers on it. Even if I run: update tbl_claim SET ins_lname = NULL; I get the same problem. In looking at the "\d tbl_claim" output, there is something odd I notice. I have many foreign keys (the claim_id in tbl_claim is referenced by 12 or so other tables, and tbl_claim references about 6 or so tables by their _id) What is strange is that two of my newer foreign keys are shown as follows: Foreign-key constraints: "fk_tbl_claim_tbl_stormgroup_stormgroup_id" FOREIGN KEY (stormgroup_id) REFERENCES tbl_stormgroup(stormgroup_id) MATCH FULL "fk_tbl_claim_emp_id" FOREIGN KEY (emp_id) REFERENCES tbl_employee(emp_id) MATCH FULL Which matches the syntax I used to create them, however all of my older foreign keys are under the Triggers section and are defined as follows: "RI_ConstraintTrigger_23354821" AFTER INSERT OR UPDATE ON tbl_claim FROM tbl_agents NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_check_ins"('fk_tbl_claim_tbl_agents_fk', 'tbl_claim', 'tbl_agents', 'UNSPECIFIED', 'agent_id', 'agent_id') Why are they different? Should all of my foreign keys look like the first two, or are they logically identical? I assume the difference is because the older keys were initially created in a 7.x version of Postgres, and got into 8.x from a pg_dumpall Below is a full \d dump of this table. Obviously, there are a ton of triggers on it, but how would I avoid this in a heavily referenced table? (tbl_claim is the core table of this entire system). Certainly, dropping all of the triggers, indexes, etc. would solve the problem and allow me to update, but I'd like a more elegant solution. I don't have THAT many records in this table, and I wouldn't expect a simple update of a column to hang everything. Should I modify my foreign key triggers? ----------------------------------------------------- Table "public.tbl_claim" Column | Type | Modifiers ------------------+-----------------------------+---------------------------------------------------------- claim_id | integer | not null default nextval('tbl_claim_claim_id_key'::text) worlfilenum | character varying(12) | createby | integer | claimnum | character varying(50) | insured_id | integer | comaster_id | integer | clntmaster_id | integer | agent_id | integer | storm_id | integer | claim_createdate | timestamp with time zone | claim_lossdate | timestamp with time zone | claim_mailer | timestamp with time zone | claim_contdate | timestamp with time zone | claim_inpecdate | timestamp with time zone | claim_closedate | timestamp with time zone | claim_clntnum | character varying(25) | claim_deductible | double precision | clmtype_id | integer | subrogation | character varying(10) | peril_id | integer | rcv | double precision | policydate | timestamp with time zone | limita | double precision | limitb | double precision | limitc | double precision | limitd | double precision | deductible | double precision | riskadd | character varying(100) | riskcity | character varying(50) | riskstate | character varying(50) | riskzip | character varying(50) | secinjury | character varying(10) | searchtext | character varying(32) | lossreserves | double precision | expensereserves | double precision | notes | character varying(512) | active | integer | default 1 policyexpiredate | timestamp with time zone | deductible2 | double precision | salvage | integer | siu | integer | policynum | character varying(32) | groupnumber | integer | stormgroup_id | integer | printed | integer | severitycode | character varying(32) | otherreserves | double precision | personalreserves | double precision | stamp | timestamp without time zone | emp_id | integer | ins_lname | character varying(50) | ins_fname | character varying(100) | ins_mi | character varying(50) | ins_add1 | character varying(50) | ins_add2 | character varying(50) | ins_city | character varying(50) | ins_state | character varying(50) | ins_zip | character varying(50) | ins_phone | character varying(50) | ins_altphone | character varying(50) | ins_cell | character varying(50) | ins_pager | character varying(50) | ins_fax | character varying(50) | ins_email | character varying(256) | Indexes: "tbl_claim_pkey" PRIMARY KEY, btree (claim_id) "idx_claim_claimnum" btree (claimnum) "idx_tbl_claim_comaster_id" btree (comaster_id) "idx_tbl_claim_createby" btree (createby) "idx_tbl_claim_insured_id" btree (insured_id) "idx_tbl_claim_storm_id" btree (storm_id) "tbl_claim_agent_id" btree (agent_id) Foreign-key constraints: "fk_tbl_claim_tbl_stormgroup_stormgroup_id" FOREIGN KEY (stormgroup_id) REFERENCES tbl_stormgroup(stormgroup_id) MATCH FULL "fk_tbl_claim_emp_id" FOREIGN KEY (emp_id) REFERENCES tbl_employee(emp_id) MATCH FULL Triggers: "RI_ConstraintTrigger_23354821" AFTER INSERT OR UPDATE ON tbl_claim FROM tbl_agents NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_check_ins"('fk_tbl_claim_tbl_agents_fk', 'tbl_claim', 'tbl_agents', 'UNSPECIFIED', 'agent_id', 'agent_id') "RI_ConstraintTrigger_23354824" AFTER INSERT OR UPDATE ON tbl_claim FROM tbl_claimtype NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_check_ins"('fk_tbl_claim_tbl_claimtype_fk', 'tbl_claim', 'tbl_claimtype', 'UNSPECIFIED', 'clmtype_id', 'clmtype_id') "RI_ConstraintTrigger_23354827" AFTER INSERT OR UPDATE ON tbl_claim FROM tbl_clntmaster NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_check_ins"('fk_tbl_claim_tbl_clntmaster_fk', 'tbl_claim', 'tbl_clntmaster', 'UNSPECIFIED', 'clntmaster_id', 'clntmaster_id') "RI_ConstraintTrigger_23354830" AFTER INSERT OR UPDATE ON tbl_claim FROM tbl_insured NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_check_ins"('fk_tbl_claim_tbl_insured_fk', 'tbl_claim', 'tbl_insured', 'UNSPECIFIED', 'insured_id', 'insured_id') "RI_ConstraintTrigger_23354833" AFTER INSERT OR UPDATE ON tbl_claim FROM tbl_peril NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_check_ins"('fk_tbl_claim_tbl_peril_fk', 'tbl_claim', 'tbl_peril', 'UNSPECIFIED', 'peril_id', 'peril_id') "RI_ConstraintTrigger_23354836" AFTER INSERT OR UPDATE ON tbl_claim FROM tbl_foocomstr NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_check_ins"('fk_tbl_claim_tbl_foocomst_fk', 'tbl_claim', 'tbl_foocomstr', 'UNSPECIFIED', 'comaster_id', 'comaster_id') "RI_ConstraintTrigger_23354846" AFTER DELETE ON tbl_claim FROM tbl_claimactivity NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_noaction_del"('fk_tbl_claimactivity_tbl_cla_fk', 'tbl_claimactivity', 'tbl_claim', 'UNSPECIFIED', 'claim_id', 'claim_id') "RI_ConstraintTrigger_23354847" AFTER UPDATE ON tbl_claim FROM tbl_claimactivity NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_noaction_upd"('fk_tbl_claimactivity_tbl_cla_fk', 'tbl_claimactivity', 'tbl_claim', 'UNSPECIFIED', 'claim_id', 'claim_id') "RI_ConstraintTrigger_23354858" AFTER DELETE ON tbl_claim FROM tbl_claimchecklog NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_noaction_del"('fk_tbl_claimtpa_tbl_claim_fk', 'tbl_claimchecklog', 'tbl_claim', 'UNSPECIFIED', 'claim_id', 'claim_id') "RI_ConstraintTrigger_23354859" AFTER UPDATE ON tbl_claim FROM tbl_claimchecklog NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_noaction_upd"('fk_tbl_claimtpa_tbl_claim_fk', 'tbl_claimchecklog', 'tbl_claim', 'UNSPECIFIED', 'claim_id', 'claim_id') "RI_ConstraintTrigger_23354861" AFTER DELETE ON tbl_claim FROM tbl_claimclaimant NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_noaction_del"('fk_tbl_claimclaimant_tbl_cla_fk', 'tbl_claimclaimant', 'tbl_claim', 'UNSPECIFIED', 'claim_id', 'claim_id') "RI_ConstraintTrigger_23354862" AFTER UPDATE ON tbl_claim FROM tbl_claimclaimant NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_noaction_upd"('fk_tbl_claimclaimant_tbl_cla_fk', 'tbl_claimclaimant', 'tbl_claim', 'UNSPECIFIED', 'claim_id', 'claim_id') "RI_ConstraintTrigger_23354873" AFTER DELETE ON tbl_claim FROM tbl_claimdocument NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_noaction_del"('fk_tbl_claimdocument_tbl_cla_fk', 'tbl_claimdocument', 'tbl_claim', 'UNSPECIFIED', 'claim_id', 'claim_id') "RI_ConstraintTrigger_23354874" AFTER UPDATE ON tbl_claim FROM tbl_claimdocument NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_noaction_upd"('fk_tbl_claimdocument_tbl_cla_fk', 'tbl_claimdocument', 'tbl_claim', 'UNSPECIFIED', 'claim_id', 'claim_id') "RI_ConstraintTrigger_23354876" AFTER DELETE ON tbl_claim FROM tbl_claimwitness NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_noaction_del"('fk_tbl_claimwitness_tbl_clai_fk', 'tbl_claimwitness', 'tbl_claim', 'UNSPECIFIED', 'claim_id', 'claim_id') "RI_ConstraintTrigger_23354877" AFTER UPDATE ON tbl_claim FROM tbl_claimwitness NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_noaction_upd"('fk_tbl_claimwitness_tbl_clai_fk', 'tbl_claimwitness', 'tbl_claim', 'UNSPECIFIED', 'claim_id', 'claim_id') "RI_ConstraintTrigger_23354948" AFTER DELETE ON tbl_claim FROM tbl_invoice NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_noaction_del"('fk_tbl_invoice_tbl_claim_fk', 'tbl_invoice', 'tbl_claim', 'UNSPECIFIED', 'claim_id', 'claim_id') "RI_ConstraintTrigger_23354949" AFTER UPDATE ON tbl_claim FROM tbl_invoice NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_noaction_upd"('fk_tbl_invoice_tbl_claim_fk', 'tbl_invoice', 'tbl_claim', 'UNSPECIFIED', 'claim_id', 'claim_id')
John McCawley <nospam@hardgeus.com> writes: > In looking at the "\d tbl_claim" output, there is something odd I > notice. I have many foreign keys (the claim_id in tbl_claim is > referenced by 12 or so other tables, and tbl_claim references about 6 or > so tables by their _id) It seems a good bet that the poor performance is due to lack of indexes on the columns that reference tbl_claim from other tables. PG enforces an index on the referenced side of an FK constraint, but not on the referencing side. This is OK if you mostly update the referencing table, but it hurts for updates and deletes on the referenced table. Try creating those indexes. (You'll likely need to start a fresh psql session afterwards to make sure that the RI mechanism notices the new indexes.) > Which matches the syntax I used to create them, however all of my older > foreign keys are under the Triggers section and are defined as follows: > "RI_ConstraintTrigger_23354821" AFTER INSERT OR UPDATE ON tbl_claim > FROM tbl_agents NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE > PROCEDURE "RI_FKey_check_ins"('fk_tbl_claim_tbl_agents_fk', 'tbl_claim', > 'tbl_agents', 'UNSPECIFIED', 'agent_id', 'agent_id') These are probably inherited from some pre-7.3-or-so schema? I'd suggest dropping those triggers and recreating the constraints with ALTER TABLE ADD CONSTRAINT. You could also look at contrib/adddepend/ which is alleged to fix such things automatically (but I wouldn't trust it too much, because it's not been maintained since 7.3). This won't make any difference to performance, but it'll clean up your schema into a more future-proof form. regards, tom lane
It looks like my primary slowdown on that query was the timestamp trigger. However, even after removing that trigger, and ensuring that all of my referencing tables had their foreign keys indexed, a simple update of one column on 244451 records took 14 minutes. Given the specs I mentioned in the earlier email, is this to be expected? I have also modified my entire schema to use the more updated constraint syntax. I wrote a php script which can be used on a pg_dump. I have attached it here in case anyone else ends up needing this: -----------------file fixkey.php------------------- #!/usr/bin/php <? echo "This script modifies a schema file generated by pg_dump and converts any pre 7.3 foreign key triggers to proper foreign key constraint syntax. I have only tested it on a dumpfile generated by Postgres 8.0.3. I have no idea if this will work on any other version, or with other people's wacky schemas. This worked for me and that's all I can say. Don't blame me if this script burns down your house.\n\n"; if( $argc != 3 ) { die("Usage: fixkey.php schemafile.db outfile.db\n"); } $fp = fopen($argv[1], "r"); $fpout = fopen($argv[2], "w"); if( !$fp ) { die("Error opening '" . $argv[1] . "' for read\n"); } if( !$fpout ) { die("Error opening '" . $argv[2] . "' for write\n"); } while( $line = fgets($fp, 5000) ) { if( strstr($line, "ConstraintTrigger_" ) ) { //echo "Skipping comment $line\n"; } else if( strstr($line, "CREATE CONSTRAINT TRIGGER" ) ) { $keyname = substr($line, strlen("CREATE CONSTRAINT TRIGGER" ) ); $keyname = trim($keyname); //Get 5 lines after declaration for foreign key info $line2 = fgets($fp, 5000); $line3 = fgets($fp, 5000); $line4 = fgets($fp, 5000); $line5 = fgets($fp, 5000); $line6 = fgets($fp, 5000); //Foreign keys are apparently made of up 3 triggers...we only care about the first one //I assume that the subsequent ones will be implicitly created by the new syntax if( !$key_array[$keyname] ) { //Store key name so we don't process it again $key_array[$keyname] = 1; //Referencing table name is in line 2 $table = explode(" ", $line2); $table = $table[count($table)-1]; $table = trim($table); //Referenced table is in line 3 $parent = explode(" ", $line3); $parent = $parent[count($parent)-1]; $parent = trim($parent); //Referencing column is on line 6 $column = explode(",", $line6); $column = $column[4]; $column = str_replace("'", "", $column); $column = trim($column); //Referenced column is on line 6 $parentcolumn = explode(",", $line6); $parentcolumn = $parentcolumn[5]; $parentcolumn = str_replace("'", "", $parentcolumn); $parentcolumn = str_replace(")", "", $parentcolumn); $parentcolumn = str_replace(";", "", $parentcolumn); $parentcolumn = trim($parentcolumn); $sKeySQL = "ALTER TABLE $table ADD CONSTRAINT $keyname FOREIGN KEY ($column) REFERENCES $parent ($parentcolumn) MATCH FULL;\n"; echo $sKeySQL; fputs($fpout, "\n\n" . $sKeySQL . "\n\n"); } } else { fputs($fpout, $line); } } fclose($fp); fclose($fpout); ?>
I have decided that I'm going to resume working on my pgDesigner project ( http://www.hardgeus.com/projects/pgdesigner/ ). I haven't really maintained it for over a year, but I get about 200 visits a day to the project homepage, and about two or three emails a week asking for help. As the project currently stands, it stores datamodel information in a text file separate from the database. What I would like to do is modify the application such that it can operate in "live" mode. i.e. store all visual datamodel information in the database itself in a series of tables. My question for this list is the following: Is there a standard way for storing quasi-system related information in Postgres. i.e., this is essentially an admin tool, and I was wondering if there are naming conventions etc. I should follow such as the table names etc. John
On Wed, Jan 04, 2006 at 01:36:45PM -0600, John McCawley wrote: > I have decided that I'm going to resume working on my pgDesigner project > ( http://www.hardgeus.com/projects/pgdesigner/ ). I haven't really > maintained it for over a year, but I get about 200 visits a day to the > project homepage, and about two or three emails a week asking for help. > As the project currently stands, it stores datamodel information in a > text file separate from the database. What I would like to do is modify > the application such that it can operate in "live" mode. i.e. store all > visual datamodel information in the database itself in a series of > tables. My question for this list is the following: Is there a > standard way for storing quasi-system related information in Postgres. > i.e., this is essentially an admin tool, and I was wondering if there > are naming conventions etc. I should follow such as the table names etc. > > John > It is recommended that the application data tables reside in their own schema. If the schema is not meant to be used by anything other than your program, then consider prepending the schema name with an underscore. For example: _pgdes.tablex _pgdes.tabley _pgdes.tablez Be sure you tell people that the tables are being created in their databases when you do it. elein