Trying to figure out why these queries are so slow - Mailing list pgsql-general
From | Tim Uckun
---|---
Subject | Trying to figure out why these queries are so slow
Date |
Msg-id | AANLkTim-BRRttUjWBDQ1_3gXztp_8RhYkhK7HBbmL6wp@mail.gmail.com
Responses | Re: Trying to figure out why these queries are so slow
List | pgsql-general
I have two tables. Table C has about 300K records in it. Table E has about a million records in it. Today I tried to run this query:

```sql
update C set result_count = X.result_count
from C inner join
  (select c_id, count(c_id) as result_count
   from E
   where c_id is not null
   group by c_id) as X
  on C.id = X.c_id
```

All the fields mentioned are indexed. In the case of table C it's the primary key; in the case of table E it's just a non-unique index. I let this query run for about three hours before I cancelled it.

Next, thinking the subquery was the problem, I did a CREATE TABLE of the subquery: I extracted the results into a separate table and made the c_id field that table's primary key. That table only had about 80 thousand records, and creating it took just a couple of minutes. Perfectly reasonable. I tried to run the same update again, this time joining the newly created table instead of the subquery. I killed that query after an hour.

Then I tried to be clever about it, so I added

```sql
where id in (select id from C where result_count = 0 limit 10)
```

to the where clause. I figured this would run fast because it's only updating ten records. The result_count field is indexed, so that part should run very quickly, and the WHERE id IN should be quick because it's looking up ten items in the primary key, right? Well, that didn't go very well either. I killed that query after about 15 minutes.

In the end (once again) I wrote a ruby script to pull up the records from table E in small chunks, pull the corresponding records from C in small chunks, and update them myself. In other words, I wrote a ruby script to do the join and the update in chunks.

I am simply baffled. What am I doing wrong? Is the database not tuned? Do I not have enough RAM? Is there some parameter I need to tweak someplace? There was nothing else going on in the database when this query was running; I shut off all applications touching the database.

The database machine is a linode host. It's got two gigs of RAM. I realize that this is not the ideal host for a database, but it's what I have to work with.

    kernel.shmmax = 536870912
    kernel.shmall = 2097152
    shared_buffers = 128MB
    effective_cache_size = 256MB
    log_min_duration_statement = 1000
    max_connections = 100

This was a case of doing an update on about 80,000 records using an inner join on two tables, using the primary keys of both tables. Why would this query take more than a few seconds to run?
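For reference, the table-extraction step described above boils down to something like the following sketch. The table name result_counts is hypothetical; the exact statements aren't shown in the post:

```sql
-- Materialize the aggregation once (hypothetical table name),
-- then make c_id the primary key as described above.
create table result_counts as
  select c_id, count(c_id) as result_count
  from E
  where c_id is not null
  group by c_id;

alter table result_counts add primary key (c_id);
```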
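And here is a minimal sketch of the kind of per-chunk statement the ruby workaround amounts to; the id range is a hypothetical batch boundary, not taken from the original script. Note that, unlike the original query, it names C only once: repeating the updated table in the FROM clause of a PostgreSQL UPDATE gives the planner a second, uncorrelated copy of C to join against.

```sql
-- One chunk of the update; the BETWEEN bounds are hypothetical.
update C
set result_count = X.result_count
from (select c_id, count(c_id) as result_count
      from E
      where c_id is not null
        and c_id between 1 and 1000
      group by c_id) as X
where C.id = X.c_id;
```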