Thread: Make UPDATE query quicker?
Hi all, Wondered if someone had any tips about how to do this UPDATE query quicker. I've got two tables: CREATE TABLE import_table ( date_time TIMESTAMP person_id, INTEGER data REAL) CREATE TABLE master_table (date_time TIMESTAMP person_id INTEGER data REAL) Each table has 172800 rows in it. I want to move the 'data' from the import_table to the master_table by matching on both the date_time and the person_id. I do this with this query: UPDATE master_table a SET data = (SELECT b.date FROM import_table b WHERE a.date_time = b.date_time AND a.person_id = b.person_id AND b.data IS NOT NULL) However I need to do this about 20 times, and the first time is still running at the moment (about 15 minutes). Am I doing something wrong? Should I put some indexes on the columns somehow to improve performance? Thanks James
On 12 October 2012 12:16, James David Smith <james.david.smith@gmail.com> wrote: > > Hi all, > > Wondered if someone had any tips about how to do this UPDATE query > quicker. I've got two tables: > > CREATE TABLE import_table > ( date_time TIMESTAMP > person_id, INTEGER > data REAL) > > CREATE TABLE master_table > (date_time TIMESTAMP > person_id INTEGER > data REAL) > > Each table has 172800 rows in it. > > I want to move the 'data' from the import_table to the master_table by > matching on both the date_time and the person_id. I do this with this > query: > > UPDATE master_table a > SET data = > (SELECT b.date > FROM import_table b > WHERE a.date_time = b.date_time AND a.person_id = b.person_id AND > b.data IS NOT NULL) > > However I need to do this about 20 times, and the first time is still > running at the moment (about 15 minutes). > > Am I doing something wrong? Should I put some indexes on the columns > somehow to improve performance? I don't know if this will help performance-wise, but maybe it's worth a try: See if doing it with an UPDATE FROM helps at all. e.g.: http://stackoverflow.com/a/2766766/495319 -- Michael Wood <esiotrot@gmail.com>
Michael, Thanks, I will give that a try later today. The first of the 20 updates I need to do took about 25 minutes in the end by the way. All other useful suggestions gratefully recevied... ;-) Thanks James On 12 October 2012 12:53, Michael Wood <esiotrot@gmail.com> wrote: > On 12 October 2012 12:16, James David Smith <james.david.smith@gmail.com> wrote: >> >> Hi all, >> >> Wondered if someone had any tips about how to do this UPDATE query >> quicker. I've got two tables: >> >> CREATE TABLE import_table >> ( date_time TIMESTAMP >> person_id, INTEGER >> data REAL) >> >> CREATE TABLE master_table >> (date_time TIMESTAMP >> person_id INTEGER >> data REAL) >> >> Each table has 172800 rows in it. >> >> I want to move the 'data' from the import_table to the master_table by >> matching on both the date_time and the person_id. I do this with this >> query: >> >> UPDATE master_table a >> SET data = >> (SELECT b.date >> FROM import_table b >> WHERE a.date_time = b.date_time AND a.person_id = b.person_id AND >> b.data IS NOT NULL) >> >> However I need to do this about 20 times, and the first time is still >> running at the moment (about 15 minutes). >> >> Am I doing something wrong? Should I put some indexes on the columns >> somehow to improve performance? > > I don't know if this will help performance-wise, but maybe it's worth a try: > > See if doing it with an UPDATE FROM helps at all. e.g.: > > http://stackoverflow.com/a/2766766/495319 > > -- > Michael Wood <esiotrot@gmail.com>
On Fri, Oct 12, 2012 at 3:46 PM, James David Smith <james.david.smith@gmail.com> wrote:
Hi all,
Wondered if someone had any tips about how to do this UPDATE query
quicker. I've got two tables:
CREATE TABLE import_table
( date_time TIMESTAMP
person_id, INTEGER
data REAL)
CREATE TABLE master_table
(date_time TIMESTAMP
person_id INTEGER
data REAL)
Each table has 172800 rows in it.
I want to move the 'data' from the import_table to the master_table by
matching on both the date_time and the person_id. I do this with this
query:
UPDATE master_table a
SET data =
(SELECT b.date
FROM import_table b
WHERE a.date_time = b.date_time AND a.person_id = b.person_id AND
b.data IS NOT NULL)
However I need to do this about 20 times, and the first time is still
running at the moment (about 15 minutes).
Am I doing something wrong? Should I put some indexes on the columns
somehow to improve performance?
Any idea how long the SELECT is taking ? Indexes on import_table (an Index on person_id) should help. ofcourse it all depends on the uniqueness of the column and size of the table etc. Whats the uniqueness (the n_distinct from pg_stats table) on all the 3 columns in the WHERE clause ?
Regards,
VB
Hi VB, 1) No idea how long the select is taking unfortunately. How can I figure this out? 2) I'll stick an index on person_id in the import_table. 3) Not sure what you mean by this unfortunately: "Whats the uniqueness (the n_distinct from pg_stats table) on all the 3 columns in the WHERE clause ?" Thanks James On 12 October 2012 15:00, VB N <vbnpgc@gmail.com> wrote: > > On Fri, Oct 12, 2012 at 3:46 PM, James David Smith > <james.david.smith@gmail.com> wrote: >> >> Hi all, >> >> Wondered if someone had any tips about how to do this UPDATE query >> quicker. I've got two tables: >> >> CREATE TABLE import_table >> ( date_time TIMESTAMP >> person_id, INTEGER >> data REAL) >> >> CREATE TABLE master_table >> (date_time TIMESTAMP >> person_id INTEGER >> data REAL) >> >> Each table has 172800 rows in it. >> >> I want to move the 'data' from the import_table to the master_table by >> matching on both the date_time and the person_id. I do this with this >> query: >> >> UPDATE master_table a >> SET data = >> (SELECT b.date >> FROM import_table b >> WHERE a.date_time = b.date_time AND a.person_id = b.person_id AND >> b.data IS NOT NULL) >> >> However I need to do this about 20 times, and the first time is still >> running at the moment (about 15 minutes). >> >> Am I doing something wrong? Should I put some indexes on the columns >> somehow to improve performance? > > > Any idea how long the SELECT is taking ? Indexes on import_table (an Index > on person_id) should help. ofcourse it all depends on the uniqueness of the > column and size of the table etc. Whats the uniqueness (the n_distinct from > pg_stats table) on all the 3 columns in the WHERE clause ? > > Regards, > VB
On Fri, 2012-10-12 at 11:16 +0100, James David Smith wrote: > Hi all, > > Wondered if someone had any tips about how to do this UPDATE query > quicker. I've got two tables: > > CREATE TABLE import_table > ( date_time TIMESTAMP > person_id, INTEGER > data REAL) > > CREATE TABLE master_table > (date_time TIMESTAMP > person_id INTEGER > data REAL) > > Each table has 172800 rows in it. > > I want to move the 'data' from the import_table to the master_table by > matching on both the date_time and the person_id. I do this with this > query: > > UPDATE master_table a > SET data = > (SELECT b.date > FROM import_table b > WHERE a.date_time = b.date_time AND a.person_id = b.person_id AND > b.data IS NOT NULL) > > However I need to do this about 20 times, and the first time is still > running at the moment (about 15 minutes). > > Am I doing something wrong? Should I put some indexes on the columns > somehow to improve performance? > > Thanks > > James > > For this you definitely want indexes. Also you have not defined any primary keys. I'm going to assume the combination of date_time and person_id defines a unique record. A suitable index will automatically be created if you define these two columns as the primary key. ALTER TABLE master_table ADD PRIMARY KEY(date_time, person_id); Do the same for import_table. Also your query isn't doing what you think it is. Where b.data is null, then no record is returned by the select statement but the update is not conditional so you end up with a.data is null. You might want to check any records in master_table that previously had a data value but the same record in update_table had a null value for data. SELECT * from import_table WHERE data IS NULL; would be a good place to start. The query you want is something like this: UPDATE master_table a SET data = b.data from import_table b where a.date_time = b.date_time and a.person_id = a.person_id and b.data is not null; HTH Tim Bowden