Re: TABLESAMPLE usage - Mailing list pgsql-general

From Matija Lesar
Subject Re: TABLESAMPLE usage
Msg-id CAPx3hmPVfnaw6kZzTSSseLqiDwiF174MMJ0vj8RdRu70FipTnA@mail.gmail.com
In response to Re: TABLESAMPLE usage  (Tom Smith <tomsmith1989sk@gmail.com>)
Responses Re: TABLESAMPLE usage  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-general
On 25 January 2016 at 09:55, Tom Smith <tomsmith1989sk@gmail.com> wrote:
Thanks, that solution would work for fixed-interval timestamps.
But the data I am dealing with has irregular timestamps, so the sample points cannot be generated with exact steps.

I would consider this a special case of random sampling: evenly distributed sampling according to the defined timestamp index.

On Mon, Jan 25, 2016 at 3:48 AM, Vik Fearing <vik@2ndquadrant.fr> wrote:
On 01/25/2016 05:09 AM, Tom Smith wrote:
> Hello:
>
> I have a big table that is always appended with new data, with a unique
> sequence id (always incremented, or a timestamp as unique index) on each row.
> I'd like to sample, say, 100 rows out of, say, 1000 rows evenly across all
> the rows,
> so that it would return rows 1, 101, 201, 301; you get the idea.
> Can TABLESAMPLE get one row for every 100 rows, based on the order
> in which the rows were added to the table, using the timestamp as an already
> indexed/sorted sequence?

No, TABLESAMPLE is intended to take a random sampling of the data using
various methods.

You're looking for something more like this:

    select t.*
    from generate_series(1, (select max(id) from t), 100) g
    join t on t.id = g;
--
Vik Fearing                                          +33 6 46 75 15 36
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support



Hi,

you can accomplish this with row_number():

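-- Build 1000 demo rows; clock_timestamp() advances during the statement,
-- so later rows get later timestamps.
WITH data_cte AS (
    SELECT
        id,
        clock_timestamp() AS ctimestamp
    FROM generate_series(1,1000) AS id
    )
SELECT
    *
FROM
    -- Number the rows in timestamp order, regardless of gaps between values.
    (SELECT
        id,
        ctimestamp,
        row_number() OVER (ORDER BY ctimestamp) AS rownum
    FROM data_cte
    ) AS data_withrownumbers
WHERE
    -- Keep every 100th row: rownum 1, 101, 201, ...
    rownum % 100 = 1;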
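
Applied to a real table, the same idea might look like the sketch below. The table name t is borrowed from the earlier example in this thread, and the timestamp column name ts is an assumption, so adjust both to your schema. The second query is an untested variation for the case where you want a roughly fixed number of rows (about 100 here) instead of a fixed step; it derives the step from the total row count.

```sql
-- Assumed schema (hypothetical): t(id, ts), with ts holding irregular timestamps.

-- Variant 1: one row out of every 100, in timestamp order.
SELECT id, ts
FROM (
    SELECT
        id,
        ts,
        row_number() OVER (ORDER BY ts) AS rownum
    FROM t
) AS numbered
WHERE rownum % 100 = 1          -- rownum 1, 101, 201, ...
ORDER BY ts;

-- Variant 2: approximately 100 rows spread evenly over the whole table.
SELECT id, ts
FROM (
    SELECT
        id,
        ts,
        row_number() OVER (ORDER BY ts) AS rownum,
        count(*)     OVER ()            AS total
    FROM t
) AS numbered
-- Integer division gives the step; greatest() keeps it at 1 or more
-- for tables with fewer than 100 rows.
WHERE (rownum - 1) % greatest(total / 100, 1) = 0
ORDER BY ts;
```

Note that both variants number every row of the table before filtering, so on a very large table this can be expensive even with an index on the timestamp column.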


Bye,
Matija Lesar
