Thread: question about a select

question about a select

From
Cindy
Date:
OK, I have the following table:

create table citations_by_level
(
        aid smallint,
        wid smallint,
        v_level varchar(50),
        w_level varchar(50),
        x_level varchar(50),
        y_level varchar(50),
        z_level varchar(50),
        byteloc integer
);

(If it helps, aid/wid identifies a text or work, the levels are
citation levels for that work (all but z_level potentially optional);
eg for some work y_level might indicate chapters [z_levels indicate
lines], and byteloc is the file position of that particular citation
in the work).

What I would *like* to be able to do is construct a query that groups
by a level, but sorts by byteloc.  I don't seem to be able to do this.

Here are some examples.  Note that y_level (any level) may have
duplicates (which I want to eliminate), and that it's ordering is
strictly on byteloc, not on its own value.  postgres doesn't seem to
have envisioned this scenario and/or I'm not being creative enough in
constructing the query...

Text=# select distinct on (y_level) y_level from citations_by_level where aid=543 and wid=1;
 y_level
---------
 1
 10
 10a
 11
 12
 13
 14
 15
 16
 17
 18
 19
 2
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 2a
 3
 30
 31
 32
 33
 34
 35
 36
 37
 4
 5
 5a
 6
 7
 7,8
 8
 9
 t
(42 rows)

but as you can see, the "ordering" winds up being alphabetic on
y_level which simply does not do.  [In this case it is only
coincidental that y_level appears numeric, it is a string and could be
anything; and the 7,8 is such an example].  The *byteloc* associated
with a given y_level (the location of that particular citation) does,
but I can't seem to use it:

Text=# select distinct on (y_level) y_level, byteloc from citations_by_level where aid=543 and wid=1 order by byteloc;
ERROR:  SELECT DISTINCT ON expressions must match initial ORDER BY expressions

Text=# select distinct on (y_level) y_level, byteloc from citations_by_level where aid=543 and wid=1 order by byteloc,
y_level;
ERROR:  SELECT DISTINCT ON expressions must match initial ORDER BY expressions

???

--Cindy
--
ctmoore@uci.edu

Re: question about a select

From
Stephan Szabo
Date:
On Thu, 9 Oct 2003, Cindy wrote:


> Text=# select distinct on (y_level) y_level, byteloc from
> citations_by_level where aid=543 and wid=1 order by byteloc;
> ERROR:  SELECT DISTINCT ON expressions must match initial ORDER BY
> expressions

Maybe something like:

select y_level from (select distinct on (y_level) y_level,
byteloc from citations_by_level where aid=543 and wid=1) tab order by
byteloc;

Re: question about a select

From
Cindy
Date:
Thanks for the speedy response!

Stephan Szabo writes:
 >On Thu, 9 Oct 2003, Cindy wrote:
 >
 >
 >> Text=# select distinct on (y_level) y_level, byteloc from
 >> citations_by_level where aid=543 and wid=1 order by byteloc;
 >> ERROR:  SELECT DISTINCT ON expressions must match initial ORDER BY
 >> expressions
 >
 >Maybe something like:
 >
 >select y_level from (select distinct on (y_level) y_level,
 >byteloc from citations_by_level where aid=543 and wid=1) tab order by
 >byteloc;

Hm...this seems to pick out the y_level with the largest byteloc
associated with it...I need the y_level with the smallest byteloc and
then sorted by that...lemme play this, but any other suggestions also
welcomed...

(EG, I got:
Text=# select y_level, byteloc from (select distinct on (y_level)
y_level, byteloc from citations_by_level where aid=543::smallint and
wid=1::smallint) tab order by byteloc;


 y_level | byteloc
---------+---------
 13      |   24017
 11      |   36231
 8       |   47240
 10      |   96777
 2       |  285678
 5       |  400212
 7,8     |  420879
...

but if you look at the raw data (or some of it):
---------+---------
 1       |      29
 1       |      84
 2       |     423
 3       |     827
 4       |    1005
 5       |    1169
 6       |    1530
 1       |    1698
 2       |    1988
 3       |    2284
 4       |    2460
 5       |    2638
 6       |    2924
 7       |    3155
 8       |    3396
 1       |    3673
 2       |    4095
 3       |    4387
 4       |    4699
 5       |    4944
 6       |    5055
 7       |    5406
 8       |    5704
 9       |    5996
 10      |    6349
 1       |    6578
 2       |    7110

I want

 1       |      29
 2       |     423
 3       |     827
 4       |    1005
 5       |    1169
 6       |    1530
 7       |    3155
 8       |    3396
 9       |    5996
 10      |    6349

etc...


--
Cindy
ctmoore@uci.edu

Re: question about a select

From
Stephan Szabo
Date:
On Thu, 9 Oct 2003, Cindy wrote:

> Stephan Szabo writes:
>  >On Thu, 9 Oct 2003, Cindy wrote:
>  >> Text=# select distinct on (y_level) y_level, byteloc from
>  >> citations_by_level where aid=543 and wid=1 order by byteloc;
>  >> ERROR:  SELECT DISTINCT ON expressions must match initial ORDER BY
>  >> expressions
>  >
>  >Maybe something like:
>  >
>  >select y_level from (select distinct on (y_level) y_level,
>  >byteloc from citations_by_level where aid=543 and wid=1) tab order by
>  >byteloc;
>
> Hm...this seems to pick out the y_level with the largest byteloc
> associated with it...I need the y_level with the smallest byteloc and
> then sorted by that...lemme play this, but any other suggestions also
> welcomed...

I think an order by y_level, byteloc in the subquery might give you the
ordering you want for the distinct on step, and then the outer order by
will order the y_levels by their respective bytelocs.