Thread: returning row numbers in select

returning row numbers in select

From
Randall Skelton
Date:
Is there a way to return an integer row number for a query?  Note that
there may be a large number of rows so I would rather not have joined
selects...

For a rather simple query:

SELECT timestamp from test
WHERE  timestamp > '2004-02-02 04:15:00.00 +0'
AND    timestamp < '2004-02-02 04:15:10.00 +0';

where 'test' is

   Column   |           Type           | Modifiers
-----------+--------------------------+-----------
  timestamp | timestamp with time zone |
  value     | double precision         |
Indexes: table_timestamp

I to return a sequential row number beginning at 1?

i.e.

row|         timestamp
---+----------------------------
1   2004-02-01 23:15:00.824-05
2   2004-02-01 23:15:01.824-05
3   2004-02-01 23:15:02.824-05
...

My reason for wanting row numbers is so I can use a 'MOD(row_number,
n)' to get the nth row from the table.

Cheers,
Randall


Re: returning row numbers in select

From
Chris
Date:
> I to return a sequential row number beginning at 1?
>
> i.e.
>
> row|         timestamp
> ---+----------------------------
> 1   2004-02-01 23:15:00.824-05
> 2   2004-02-01 23:15:01.824-05
> 3   2004-02-01 23:15:02.824-05
> ...
>
> My reason for wanting row numbers is so I can use a 'MOD(row_number,
> n)' to get the nth row from the table.


Doesn't LIMIT and OFFSET do the job?
http://www.postgresql.org/docs/7.3/interactive/queries-limit.html


Bye, Chris.



Re: returning row numbers in select

From
Randall Skelton
Date:
>> I to return a sequential row number beginning at 1?
>>
>> i.e.
>>
>> row|         timestamp
>> ---+----------------------------
>> 1   2004-02-01 23:15:00.824-05
>> 2   2004-02-01 23:15:01.824-05
>> 3   2004-02-01 23:15:02.824-05
>> ...
>>
>> My reason for wanting row numbers is so I can use a 'MOD(row_number,
>> n)' to get the nth row from the table.

Correction, I don't want to simply get the nth row, I want all rows
that are divisible by n.  Essentially, the timestamp is at a regular
interval and I want a way of selecting rows at different sampling
intervals.

> Doesn't LIMIT and OFFSET do the job?
> http://www.postgresql.org/docs/7.3/interactive/queries-limit.html

It would if I only wanted an offset butI want a query to return the
first, fifth, and tenth, and so on row.  This would be 'MOD(row_num,
5)' but given that I don't know the number of rows a priori, it is
difficult to write a LIMIT.  Moreover, the offset doesn't make it easy
to get the first row.  Unless, of course, I am missing something
obvious?

Cheers,
Randall


Re: returning row numbers in select

From
Tom Lane
Date:
Randall Skelton <skelton@brutus.uwaterloo.ca> writes:
> Correction, I don't want to simply get the nth row, I want all rows
> that are divisible by n.  Essentially, the timestamp is at a regular
> interval and I want a way of selecting rows at different sampling
> intervals.

Couldn't you code this as a WHERE test on the timestamp?

            regards, tom lane

Re: returning row numbers in select

From
Bernard Clement
Date:
Maybe by using a sequence and a function.

The sequece to generate the row number.

The a function
1) to reset the sequence and
2) to perform a select with the first column nextval(seq) and the column the
timestamp

However, I am a newbie with PostgreSQL and I am not sure it this will work
correctly...you might have to play with it (or wait for somebody with more
experience than me).

Bernard

On Wednesday 10 March 2004 16:23, Randall Skelton wrote:
> >> I to return a sequential row number beginning at 1?
> >>
> >> i.e.
> >>
> >> row|         timestamp
> >> ---+----------------------------
> >> 1   2004-02-01 23:15:00.824-05
> >> 2   2004-02-01 23:15:01.824-05
> >> 3   2004-02-01 23:15:02.824-05
> >> ...
> >>
> >> My reason for wanting row numbers is so I can use a 'MOD(row_number,
> >> n)' to get the nth row from the table.
>
> Correction, I don't want to simply get the nth row, I want all rows
> that are divisible by n.  Essentially, the timestamp is at a regular
> interval and I want a way of selecting rows at different sampling
> intervals.
>
> > Doesn't LIMIT and OFFSET do the job?
> > http://www.postgresql.org/docs/7.3/interactive/queries-limit.html
>
> It would if I only wanted an offset butI want a query to return the
> first, fifth, and tenth, and so on row.  This would be 'MOD(row_num,
> 5)' but given that I don't know the number of rows a priori, it is
> difficult to write a LIMIT.  Moreover, the offset doesn't make it easy
> to get the first row.  Unless, of course, I am missing something
> obvious?
>
> Cheers,
> Randall
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)


Re: returning row numbers in select

From
Randall Skelton
Date:
>> Correction, I don't want to simply get the nth row, I want all rows
>> that are divisible by n.  Essentially, the timestamp is at a regular
>> interval and I want a way of selecting rows at different sampling
>> intervals.
>
> Couldn't you code this as a WHERE test on the timestamp?

That would be ideal as it is theoretically possible for there to be
missing rows due to sampling errors; nevertheless, a WHERE test doesn't
seem obvious to me.  Can you please post an example?  The time spacing
between rows is 1 second but I want my select statement to return rows
every 5 seconds (see marked lines below).  I've tried various interval
operations but I don't really understand how to relate the timestamp
and and the interval.

SELECT timestamp FROM test WHERE timestamp > '2004-02-02 04:15:00.00
+0' AND timestamp < '2004-02-02 04:15:10.00 +0' ORDER BY timestamp;

          timestamp
----------------------------
  2004-02-01 23:15:00.824-05  *
  2004-02-01 23:15:01.824-05
  2004-02-01 23:15:02.824-05
  2004-02-01 23:15:03.824-05
  2004-02-01 23:15:04.824-05
  2004-02-01 23:15:05.824-05  *
  2004-02-01 23:15:06.824-05
  2004-02-01 23:15:07.824-05
  2004-02-01 23:15:08.824-05
  2004-02-01 23:15:09.824-05
  2004-02-01 23:15:10.824-05  *
  ...

Thanks,
Randall


Re: returning row numbers in select

From
Tom Lane
Date:
Randall Skelton <skelton@brutus.uwaterloo.ca> writes:
>> Couldn't you code this as a WHERE test on the timestamp?

> That would be ideal as it is theoretically possible for there to be
> missing rows due to sampling errors; nevertheless, a WHERE test doesn't
> seem obvious to me.  Can you please post an example?

Something like
    WHERE (EXTRACT(EPOCH FROM timestamp)::numeric % 5) = 0;
The EXTRACT function returns double precision, but there's no double
modulo operator for some reason, hence the cast to numeric which does
have one.

            regards, tom lane

Re: returning row numbers in select

From
Bruno Wolff III
Date:
On Wed, Mar 10, 2004 at 17:25:49 -0500,
  Randall Skelton <skelton@brutus.uwaterloo.ca> wrote:
>
> That would be ideal as it is theoretically possible for there to be
> missing rows due to sampling errors; nevertheless, a WHERE test doesn't
> seem obvious to me.  Can you please post an example?  The time spacing
> between rows is 1 second but I want my select statement to return rows
> every 5 seconds (see marked lines below).  I've tried various interval
> operations but I don't really understand how to relate the timestamp
> and and the interval.

You could extract seconds from timestamp, cast to integer and apply
the mod function and test against whichever remainder you want.

Re: returning row numbers in select

From
Steve Crawford
Date:
On Wednesday 10 March 2004 12:25 pm, Randall Skelton wrote:
> Is there a way to return an integer row number for a query?  Note
> that there may be a large number of rows so I would rather not have
> joined selects...
<snip>

Well...if your result has a unique column you can do something like
this:

steve=# select (select count(*) from bar as barcount where
barcount.sec<=bar.sec) as rownum, sec from bar order by sec;

 rownum |    sec
--------+------------
      1 | 1063966688
      2 | 1063966689
      3 | 1063966690
      4 | 1063966691
      5 | 1063966692
      6 | 1063966693
      7 | 1063966694
      8 | 1063966695
      9 | 1063966696
     10 | 1063966697
     11 | 1063966698
     12 | 1063966699
     13 | 1063966700
     14 | 1063966701
     15 | 1063966702
     16 | 1063966703
     17 | 1063966704
     18 | 1063966705

As you might guess, this is not a fast query - more of a brute-force
kludge. It's likely that you will be better off postprocessing the
query to select every n records or possibly writing a function that
will handle the situation.

Cheers,
Steve