Thread: flagging first row inserted for each "group" of key

flagging first row inserted for each "group" of key

From

Ivan Sergio Borgonovo

Date:

17 December 2009, 08:29:02

I've a web application.

I'm logging data related to the landing page and subsequent selected
hits.

create table track_hit (
  hitid serial not null,
  /* pk? I don't mind if serial wrap around
  pk could be (hitid, tracking_time) */
  esid varchar(32), -- related to session
  tracking_time timestamp not null default now(),
  -- some other stats
  -- first_hit boolean not null default false, -- ???
);

I'd like to be sure I just count one "first hit" in a session (same
esid) in an efficient way that means I'd like to mark them with a
flag and avoid a group by, min subquery cycle when I'm doing
reporting.
I can't trust the browser and I just want one first hit for each
esid, no matter if they have the same tracking_time.

Of course the problem is concurrency, but maybe I missed some pg
trick that could help me.

I'm on 8.3 and no plan to move to 8.4 shortly, so no windowing
functions that will make reporting easier/faster.

--
Ivan Sergio Borgonovo
http://www.webthatworks.it

Re: flagging first row inserted for each "group" of key

From

"A. Kretschmer"

Date:

17 December 2009, 08:38:44

In response to Ivan Sergio Borgonovo :
> I've a web application.
>
> I'm logging data related to the landing page and subsequent selected
> hits.
>
> create table track_hit (
>   hitid serial not null,
>   /* pk? I don't mind if serial wrap around
>   pk could be (hitid, tracking_time) */
>   esid varchar(32), -- related to session
>   tracking_time timestamp not null default now(),
>   -- some other stats
>   -- first_hit boolean not null default false, -- ???
> );
>
> I'd like to be sure I just count one "first hit" in a session (same
> esid) in an efficient way that means I'd like to mark them with a

select distinct on (esid) esid, tracking_time from track_hit order by esid, tracking_time;

returns only one record for each esid, ordered by tracking_time. Should
work with 8.x, maybe sice 7.x (I'm not sure)


HTH, Andreas
--
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG: 0x31720C99, 1006 CCB4 A326 1D42 6431  2EB0 389D 1DC2 3172 0C99

Re: flagging first row inserted for each "group" of key

From

Ivan Sergio Borgonovo

Date:

17 December 2009, 09:58:29

On Thu, 17 Dec 2009 10:38:32 +0100
"A. Kretschmer" <andreas.kretschmer@schollglas.com> wrote:

> In response to Ivan Sergio Borgonovo :
> > I've a web application.
> >
> > I'm logging data related to the landing page and subsequent
> > selected hits.
> >
> > create table track_hit (
> >   hitid serial not null,
> >   /* pk? I don't mind if serial wrap around
> >   pk could be (hitid, tracking_time) */
> >   esid varchar(32), -- related to session
> >   tracking_time timestamp not null default now(),
> >   -- some other stats
> >   -- first_hit boolean not null default false, -- ???
> > );
> >
> > I'd like to be sure I just count one "first hit" in a session
> > (same esid) in an efficient way that means I'd like to mark them
> > with a
>
> select distinct on (esid) esid, tracking_time from track_hit order
> by esid, tracking_time;
>
> returns only one record for each esid, ordered by tracking_time.
> Should work with 8.x, maybe sice 7.x (I'm not sure)

I think I've tried to resolve a concurrency problem in the wrong
place... still... what is the difference between:

select min(hitid) as h from track_hit group by esid;

and

select distinct on (esid) hitid from track_hit order by esid,
track_time;

I haven't enough test data to see if they perform differently. The
second form seems to perform a little bit faster.
I'd expect the opposite: the first performing better.

I think I'll add an index on track_time for reporting and maybe make
primary key (hitid, tracking_time).
I don't want to be bothered by hitid wrap around, so I don't want to
make it a pk alone, still I may need a pk.

thanks

--
Ivan Sergio Borgonovo
http://www.webthatworks.it