On Thu, 17 Dec 2009 10:38:32 +0100
"A. Kretschmer" <andreas.kretschmer@schollglas.com> wrote:
> In response to Ivan Sergio Borgonovo :
> > I've a web application.
> >
> > I'm logging data related to the landing page and subsequent
> > selected hits.
> >
> > create table track_hit (
> > hitid serial not null,
> > /* pk? I don't mind if serial wrap around
> > pk could be (hitid, tracking_time) */
> > esid varchar(32), -- related to session
> > tracking_time timestamp not null default now(),
> > -- some other stats
> > -- first_hit boolean not null default false, -- ???
> > );
> >
> > I'd like to be sure I just count one "first hit" in a session
> > (same esid) in an efficient way that means I'd like to mark them
> > with a
>
> select distinct on (esid) esid, tracking_time from track_hit order
> by esid, tracking_time;
>
> returns only one record for each esid, ordered by tracking_time.
> Should work with 8.x, maybe sice 7.x (I'm not sure)
I think I've tried to resolve a concurrency problem in the wrong
place... still... what is the difference between:
select min(hitid) as h from track_hit group by esid;
and
select distinct on (esid) hitid from track_hit order by esid,
track_time;
I haven't enough test data to see if they perform differently. The
second form seems to perform a little bit faster.
I'd expect the opposite: the first performing better.
I think I'll add an index on track_time for reporting and maybe make
primary key (hitid, tracking_time).
I don't want to be bothered by hitid wrap around, so I don't want to
make it a pk alone, still I may need a pk.
thanks
--
Ivan Sergio Borgonovo
http://www.webthatworks.it