Home > mailing lists

Re: Optimal time series sampling. - Mailing list pgsql-general

From	Scott Marlowe
Subject	Re: Optimal time series sampling.
Date	November 9, 2007 15:23:05
Msg-id	dcc563d10711091122o29956187ga5243db9394b5910@mail.gmail.com Whole thread Raw
In response to	Re: Optimal time series sampling. (Ted Byers <r.ted.byers@rogers.com>)
Responses	Re: Optimal time series sampling.
List	pgsql-general

Tree view

On Nov 9, 2007 11:47 AM, Ted Byers <r.ted.byers@rogers.com> wrote:
>
> --- Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> > Ted Byers <r.ted.byers@rogers.com> writes:
> > > OK, it is challenging to present it in plain text,
> > but
> > > here is the HTML exported by MySQL Query Browser.
> >
> > Why are you asking this list for help with a MySQL
> > performance problem?
> >
> because my question isn't really about MySQL, but
> rather about how best to construct the SQL required to
> get the job done, regardless of what database is used.
>  I have seen some claims that it is better to use
> joins instead of correlated subqueries and others that
> say the opposite.  And I do not, at this stage, know
> if there are other options in SQL that may or may not
> be better.

Which is better depends largely on how your database is built.  MySQL
still uses loops for all subselects, so with large numbers of tuples
in the subselect method, it will be slow.  But they might fix this in
a later release.

Fairly recent versions of PostgreSQL could make some bad choices when
doing joins for certain datasets that would be much faster with a
correlated subquery (specifically the old left join where
righttable.field is null trick made some pgsql versions choose an
inefficient join method)

So, the "right" way is a question of which db, and even which version
of that DB you're on.


>  At this time, the database in use is
> irrelevant (I want to stick as close to the ANSI
> standard as practicable so the rewriting required will
> be minimal should we decide to change the database
> later, for whatever reason).

If you want to stick with ANSI, MySQL tends to be more divergent from
the spec than pgsql and other rdbms.

Most people would consider the correlate subquery the better method.
But it's also likely to be the slowest on MySQL.

pgsql-general by date:

From: "Scott Marlowe"
Date: 09 November 2007, 15:18:30
Subject: Re: Insert statements really slow

From: Tomas Vondra
Date: 09 November 2007, 15:54:03
Subject: Re: INSERT performance deteriorates quickly during a large import

Re: Optimal time series sampling. - Mailing list pgsql-general

Previous

Next