Re: Fwd: Tricky join and update with same table - Mailing list pgsql-novice
From | Michael Swierczek |
---|---|
Subject | Re: Fwd: Tricky join and update with same table |
Date | |
Msg-id | CAHp1f1MrE0eZ_aHRRKTauJ-7gJ8bkPLzkbRCKtSjz1uk43qo1g@mail.gmail.com Whole thread Raw |
In response to | Re: Fwd: Tricky join and update with same table (James David Smith <james.david.smith@gmail.com>) |
List | pgsql-novice |
On Fri, Jul 26, 2013 at 10:36 AM, James David Smith <james.david.smith@gmail.com> wrote: > Hi Mike, > > About half an hour before you replied I actually managed to come up with > that myself! :-) > > Using it in an update query is troubling me though. I've done it a few times > now and it keeps updating the wrong fields. > > Luca sent me an email a short while ago with some code to try however, and > I'm doing that at the moment. Though it's been running for about 30 minutes > now and isn't done. Though there are 230,000 rows, so perhaps that isn't > that surprising after all. Here it is for posterity: > > PDATE hlhs_day hd > SET point_geom = ( SELECT hd2.point_geom > FROM hlhs_day hd2 > WHERE hd.spid = hd2.spid > AND hd2.point_geom IS NOT NULL > AND hd2.point_time = ( SELECT min( hd3.point_time ) > FROM hlhs_day hd3 > WHERE hd.spid = hd3.spid > AND hd3.point_geom IS NOT NULL > AND hd3.point_time > > hd.point_time > ) > ) > WHERE hd.point_geom IS null > > Thanks > > James > > This is a bit of a kluge solution, but if it were me I would take the query I suggested (and you figured out on your own), put the results in a text file, and then use Excel or some regular expressions in Vim/Emacs/Notepad++/whatever to translate each line from spidx,point_timex,point_geomx to update hlhs_day set point_geom = point_geomx where spid = spidx and point_time = point_timex; Then run it through psql or pgAdmin3. I believe that would probably run much more quickly, and you can break it up into batches. Good luck either way. -Mike > > > > > On 26 July 2013 14:37, Michael Swierczek <mike.swierczek@gmail.com> wrote: >> >> On Fri, Jul 26, 2013 at 5:24 AM, James David Smith >> <james.david.smith@gmail.com> wrote: >> > Hi all, >> > >> > Some help if you can please. I have GPS data for a number of people. >> > Each >> > person turned the GPS on, at home, at a different time of the day. >> > Before >> > that time, I presume that the person was in their house. I have made a >> > table >> > for each person for a whole 24 hours (one record per minute), and I now >> > want >> > to 'fill in' their location for the rows before they turned the GPS on. >> > So >> > for each person I want to take the first row where the point_geom is not >> > null, and update all of the rows above it with that value. It's driving >> > me >> > nuts. >> > >> > spid | point_time | point_geom >> > ----------------------------------------------------------- >> > 1 | 2012-01-01 00:01:00 | >> > 1 | 2012-01-01 00:02:00 | >> > 1 | 2012-01-01 00:03:00 | POINT(X, Y) >> > 1 | 2012-01-01 00:04:00 | POINT(X, Y) >> > 1 | 2012-01-01 00:05:00 | POINT(X, Y) >> > 2 | 2012-01-01 00:01:00 | >> > 2 | 2012-01-01 00:02:00 | >> > 2 | 2012-01-01 00:03:00 | >> > 2 | 2012-01-01 00:04:00 | >> > 2 | 2012-01-01 00:05:00 | POINT(X, Y) >> > 3 | 2012-01-01 00:01:00 | >> > 3 | 2012-01-01 00:02:00 | POINT(X, Y) >> > 3 | 2012-01-01 00:03:00 | POINT(X, Y) >> > 3 | 2012-01-01 00:04:00 | POINT(X, Y) >> > 3 | 2012-01-01 00:05:00 | POINT(X, Y) >> > >> > I've managed to select the correct row using this: >> > >> > SELECT spid, min(point_time) as point_time >> > FROM hlhs_day >> > WHERE point_geom IS NOT NULL >> > GROUP BY spid; >> > >> > However when I try to add in the column 'point_geom' to the query, it >> > won't >> > work. >> > >> > Thanks for your help and suggestions. >> > >> > James >> >> Usually when I chime in on questions like this, someone comes along >> with a better solution after I finish. But I think you can get the >> information you want by "wrapping" that query: >> >> SELECT hd1.spid, hd1.point_time, hd2.point_geom >> FROM >> (SELECT spid, min(point_time) as point_time >> FROM hlhs_day >> WHERE point_geom IS NOT NULL GROUP BY spid) as hd1 >> INNER JOIN hlhs_day hd2 ON hd1.spid = hd2.spid AND hd1.point_time = >> hd2.point_time >> ORDER BY hd1.spid; >> >> I hope this helps. >> --Mike > >
pgsql-novice by date: