Thread: Strange performance boost with random()

Strange performance boost with random()

From
M Putz
Date:
Hello,

While analyzing performance, we encountered the following phenomenon,

   SELECT sum(pow(.5*generate_series,.5))
   FROM generate_series(1,1000000);

is much much (a hundred times) slower than

   SELECT sum(pow(random()*generate_series,.5))
   FROM generate_series(1,1000000);

and asymptotic difference is even more astounding.
This seems counter-intuitive, considering the cost of
an additional random() call instead of a constant factor.
What are the reasons for this strange performance boost?

Thanks, M. Putz



Re: Strange performance boost with random()

From
Heikki Linnakangas
Date:
On 02/10/2014 09:52 PM, M Putz wrote:
>
> Hello,
>
> While analyzing performance, we encountered the following phenomenon,
>
>     SELECT sum(pow(.5*generate_series,.5))
>     FROM generate_series(1,1000000);
>
> is much much (a hundred times) slower than
>
>     SELECT sum(pow(random()*generate_series,.5))
>     FROM generate_series(1,1000000);
>
> and asymptotic difference is even more astounding.
> This seems counter-intuitive, considering the cost of
> an additional random() call instead of a constant factor.
> What are the reasons for this strange performance boost?

Different data type. The first uses numeric, which is pretty slow for
doing calculations. random() returns a double, which makes the pow and
sum to also use double, which is a lot faster.

To see the effect, try these variants:

SELECT sum(pow(.5::float8 * generate_series,.5))
FROM generate_series(1,1000000);

SELECT sum(pow(random()::numeric * generate_series,.5))
FROM generate_series(1,1000000);

- Heikki


Re: Strange performance boost with random()

From
Sébastien Lorion
Date:
On Mon, Feb 10, 2014 at 3:03 PM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
On 02/10/2014 09:52 PM, M Putz wrote:

Hello,

While analyzing performance, we encountered the following phenomenon,

    SELECT sum(pow(.5*generate_series,.5))
    FROM generate_series(1,1000000);

is much much (a hundred times) slower than

    SELECT sum(pow(random()*generate_series,.5))
    FROM generate_series(1,1000000);

and asymptotic difference is even more astounding.
This seems counter-intuitive, considering the cost of
an additional random() call instead of a constant factor.
What are the reasons for this strange performance boost?

Different data type. The first uses numeric, which is pretty slow for doing calculations. random() returns a double, which makes the pow and sum to also use double, which is a lot faster.

To see the effect, try these variants:

SELECT sum(pow(.5::float8 * generate_series,.5))
FROM generate_series(1,1000000);

SELECT sum(pow(random()::numeric * generate_series,.5))
FROM generate_series(1,1000000);

- Heikki



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

That's interesting .. Does PostgreSQL always use the NUMERIC data type for constants in absence of cast ?

Sébastien

Re: Strange performance boost with random()

From
Tom Lane
Date:
=?UTF-8?Q?S=C3=A9bastien_Lorion?= <sl@thestrangefactory.com> writes:
> That's interesting .. Does PostgreSQL always use the NUMERIC data type for
> constants in absence of cast ?

If they're not integers, yes, that's the initial assumption.

            regards, tom lane


list number of entries to be delete in cascading deletes

From
Eildert Groeneveld
Date:
Dear All

this probably not the best list to post this question:

I use cascading deletes but would like to first inform the user what she
is about to do.
Something like : explain delete from PANEL where panel_id=21;
-- you are about to delete 32144 records in tables abc aaa wewew

This is clearly something that can be programmed but as all information
is available in the database schema, there should be a generalized
procedure available.

Is there someone who has heard about this problem?

greetings

Eildert



Re: list number of entries to be delete in cascading deletes

From
Claudio Freire
Date:
On Tue, Feb 11, 2014 at 5:54 PM, Eildert Groeneveld
<eildert.groeneveld@fli.bund.de> wrote:
> Dear All
>
> this probably not the best list to post this question:
>
> I use cascading deletes but would like to first inform the user what she
> is about to do.
> Something like : explain delete from PANEL where panel_id=21;
> -- you are about to delete 32144 records in tables abc aaa wewew
>
> This is clearly something that can be programmed but as all information
> is available in the database schema, there should be a generalized
> procedure available.

Granted, this is somewhat ugly, but you could issue the delete, note
down the rowcount, and rollback.


Re: list number of entries to be delete in cascading deletes

From
Eildert Groeneveld
Date:
On Di, 2014-02-11 at 18:58 -0200, Claudio Freire wrote:
> On Tue, Feb 11, 2014 at 5:54 PM, Eildert Groeneveld
> <eildert.groeneveld@fli.bund.de> wrote:
> > Dear All
> >
> > this probably not the best list to post this question:
> >
> > I use cascading deletes but would like to first inform the user what she
> > is about to do.
> > Something like : explain delete from PANEL where panel_id=21;
> > -- you are about to delete 32144 records in tables abc aaa wewew
> >
> > This is clearly something that can be programmed but as all information
> > is available in the database schema, there should be a generalized
> > procedure available.
>
> Granted, this is somewhat ugly, but you could issue the delete, note
> down the rowcount, and rollback.
Thanks Claudio, thats an option and a fallback if we do not come up with
a better version. I am sure that there is something around.


>
>

--
Eildert Groeneveld
===================================================
Institute of Farm Animal Genetics (FLI)
Mariensee 31535 Neustadt Germany
Tel : (+49)(0)5034 871155 Fax : (+49)(0)5034 871143
e-mail: eildert.groeneveld@fli.bund.de
web:    http://vce.tzv.fal.de
==================================================