Home > mailing lists

Re: [HACKERS] Much Ado About COUNT(*) - Mailing list pgsql-general

From	Csaba Nagy
Subject	Re: [HACKERS] Much Ado About COUNT(*)
Date	January 13, 2005 16:08:45
Msg-id	1105632432.14493.31.camel@localhost.localdomain Whole thread Raw
In response to	Re: [HACKERS] Much Ado About COUNT(*) (Greg Stark <gsstark@mit.edu>)
List	pgsql-general

Tree view

[snip]
> See:
>
>   http://www.jlcomp.demon.co.uk/faq/random.html
>
> I think the Oracle syntax looks like
>
>   SELECT * FROM foo SAMPLE (0.1)
>
> I don't think I would have picked this syntax but it seems like a better idea
> to copy the existing practice rather than invent a new one.
>
> There are some details, like what to do when there's a WHERE clause or joins.
> Oracle disallows joins entirely and I'm unclear what the best thing to do
> about where clauses would be.

The where clauses could be applied exactly as for a normal select, with
the sampling being just a pre-filtering condition for what rows to
consider.

If there would be a way to specify the table on which to apply the
sampling, then the whole construct could be replaced automatically by
the inline view the oracle link recommends.
I doubt there would be any benefit in sampling more than one table in a
query, it should just work to sample the biggest table, and join the
result with the others. Sampling is only useful for really big tables
anyway.

So the syntax above could be extended to:

SELECT * FROM foo SAMPLE (foo, 0.1)

and:

SELECT foo.*, bar.*
  FROM foo, bar
  WHERE foo.key = bar.key
  SAMPLE (foo, 0.1)

which means sample foo and join the result with bar.

All this makes sense from a user point of view, I wonder how big a PITA
is to implement it...

Cheers,
Csaba.

pgsql-general by date:

From: Tom Lane
Date: 13 January 2005, 15:53:42
Subject: Re: Trace triggers

From: Alvaro Herrera
Date: 13 January 2005, 16:42:14
Subject: Re: Réf. : Re: Réf

Re: [HACKERS] Much Ado About COUNT(*) - Mailing list pgsql-general

Previous

Next