Thread: question about a select
OK, I have the following table: create table citations_by_level ( aid smallint, wid smallint, v_level varchar(50), w_level varchar(50), x_level varchar(50), y_level varchar(50), z_level varchar(50), byteloc integer ); (If it helps, aid/wid identifies a text or work, the levels are citation levels for that work (all but z_level potentially optional); eg for some work y_level might indicate chapters [z_levels indicate lines], and byteloc is the file position of that particular citation in the work). What I would *like* to be able to do is construct a query that groups by a level, but sorts by byteloc. I don't seem to be able to do this. Here are some examples. Note that y_level (any level) may have duplicates (which I want to eliminate), and that it's ordering is strictly on byteloc, not on its own value. postgres doesn't seem to have envisioned this scenario and/or I'm not being creative enough in constructing the query... Text=# select distinct on (y_level) y_level from citations_by_level where aid=543 and wid=1; y_level --------- 1 10 10a 11 12 13 14 15 16 17 18 19 2 20 21 22 23 24 25 26 27 28 29 2a 3 30 31 32 33 34 35 36 37 4 5 5a 6 7 7,8 8 9 t (42 rows) but as you can see, the "ordering" winds up being alphabetic on y_level which simply does not do. [In this case it is only coincidental that y_level appears numeric, it is a string and could be anything; and the 7,8 is such an example]. The *byteloc* associated with a given y_level (the location of that particular citation) does, but I can't seem to use it: Text=# select distinct on (y_level) y_level, byteloc from citations_by_level where aid=543 and wid=1 order by byteloc; ERROR: SELECT DISTINCT ON expressions must match initial ORDER BY expressions Text=# select distinct on (y_level) y_level, byteloc from citations_by_level where aid=543 and wid=1 order by byteloc, y_level; ERROR: SELECT DISTINCT ON expressions must match initial ORDER BY expressions ??? --Cindy -- ctmoore@uci.edu
On Thu, 9 Oct 2003, Cindy wrote: > Text=# select distinct on (y_level) y_level, byteloc from > citations_by_level where aid=543 and wid=1 order by byteloc; > ERROR: SELECT DISTINCT ON expressions must match initial ORDER BY > expressions Maybe something like: select y_level from (select distinct on (y_level) y_level, byteloc from citations_by_level where aid=543 and wid=1) tab order by byteloc;
Thanks for the speedy response! Stephan Szabo writes: >On Thu, 9 Oct 2003, Cindy wrote: > > >> Text=# select distinct on (y_level) y_level, byteloc from >> citations_by_level where aid=543 and wid=1 order by byteloc; >> ERROR: SELECT DISTINCT ON expressions must match initial ORDER BY >> expressions > >Maybe something like: > >select y_level from (select distinct on (y_level) y_level, >byteloc from citations_by_level where aid=543 and wid=1) tab order by >byteloc; Hm...this seems to pick out the y_level with the largest byteloc associated with it...I need the y_level with the smallest byteloc and then sorted by that...lemme play this, but any other suggestions also welcomed... (EG, I got: Text=# select y_level, byteloc from (select distinct on (y_level) y_level, byteloc from citations_by_level where aid=543::smallint and wid=1::smallint) tab order by byteloc; y_level | byteloc ---------+--------- 13 | 24017 11 | 36231 8 | 47240 10 | 96777 2 | 285678 5 | 400212 7,8 | 420879 ... but if you look at the raw data (or some of it): ---------+--------- 1 | 29 1 | 84 2 | 423 3 | 827 4 | 1005 5 | 1169 6 | 1530 1 | 1698 2 | 1988 3 | 2284 4 | 2460 5 | 2638 6 | 2924 7 | 3155 8 | 3396 1 | 3673 2 | 4095 3 | 4387 4 | 4699 5 | 4944 6 | 5055 7 | 5406 8 | 5704 9 | 5996 10 | 6349 1 | 6578 2 | 7110 I want 1 | 29 2 | 423 3 | 827 4 | 1005 5 | 1169 6 | 1530 7 | 3155 8 | 3396 9 | 5996 10 | 6349 etc... -- Cindy ctmoore@uci.edu
On Thu, 9 Oct 2003, Cindy wrote: > Stephan Szabo writes: > >On Thu, 9 Oct 2003, Cindy wrote: > >> Text=# select distinct on (y_level) y_level, byteloc from > >> citations_by_level where aid=543 and wid=1 order by byteloc; > >> ERROR: SELECT DISTINCT ON expressions must match initial ORDER BY > >> expressions > > > >Maybe something like: > > > >select y_level from (select distinct on (y_level) y_level, > >byteloc from citations_by_level where aid=543 and wid=1) tab order by > >byteloc; > > Hm...this seems to pick out the y_level with the largest byteloc > associated with it...I need the y_level with the smallest byteloc and > then sorted by that...lemme play this, but any other suggestions also > welcomed... I think an order by y_level, byteloc in the subquery might give you the ordering you want for the distinct on step, and then the outer order by will order the y_levels by their respective bytelocs.