Thread: Strange performance boost with random()
Hello, While analyzing performance, we encountered the following phenomenon, SELECT sum(pow(.5*generate_series,.5)) FROM generate_series(1,1000000); is much much (a hundred times) slower than SELECT sum(pow(random()*generate_series,.5)) FROM generate_series(1,1000000); and asymptotic difference is even more astounding. This seems counter-intuitive, considering the cost of an additional random() call instead of a constant factor. What are the reasons for this strange performance boost? Thanks, M. Putz
On 02/10/2014 09:52 PM, M Putz wrote: > > Hello, > > While analyzing performance, we encountered the following phenomenon, > > SELECT sum(pow(.5*generate_series,.5)) > FROM generate_series(1,1000000); > > is much much (a hundred times) slower than > > SELECT sum(pow(random()*generate_series,.5)) > FROM generate_series(1,1000000); > > and asymptotic difference is even more astounding. > This seems counter-intuitive, considering the cost of > an additional random() call instead of a constant factor. > What are the reasons for this strange performance boost? Different data type. The first uses numeric, which is pretty slow for doing calculations. random() returns a double, which makes the pow and sum to also use double, which is a lot faster. To see the effect, try these variants: SELECT sum(pow(.5::float8 * generate_series,.5)) FROM generate_series(1,1000000); SELECT sum(pow(random()::numeric * generate_series,.5)) FROM generate_series(1,1000000); - Heikki
On Mon, Feb 10, 2014 at 3:03 PM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
On 02/10/2014 09:52 PM, M Putz wrote:Different data type. The first uses numeric, which is pretty slow for doing calculations. random() returns a double, which makes the pow and sum to also use double, which is a lot faster.
Hello,
While analyzing performance, we encountered the following phenomenon,
SELECT sum(pow(.5*generate_series,.5))
FROM generate_series(1,1000000);
is much much (a hundred times) slower than
SELECT sum(pow(random()*generate_series,.5))
FROM generate_series(1,1000000);
and asymptotic difference is even more astounding.
This seems counter-intuitive, considering the cost of
an additional random() call instead of a constant factor.
What are the reasons for this strange performance boost?
To see the effect, try these variants:
SELECT sum(pow(.5::float8 * generate_series,.5))
FROM generate_series(1,1000000);
SELECT sum(pow(random()::numeric * generate_series,.5))
FROM generate_series(1,1000000);
- Heikki
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
That's interesting .. Does PostgreSQL always use the NUMERIC data type for constants in absence of cast ?
Sébastien
=?UTF-8?Q?S=C3=A9bastien_Lorion?= <sl@thestrangefactory.com> writes: > That's interesting .. Does PostgreSQL always use the NUMERIC data type for > constants in absence of cast ? If they're not integers, yes, that's the initial assumption. regards, tom lane
Dear All this probably not the best list to post this question: I use cascading deletes but would like to first inform the user what she is about to do. Something like : explain delete from PANEL where panel_id=21; -- you are about to delete 32144 records in tables abc aaa wewew This is clearly something that can be programmed but as all information is available in the database schema, there should be a generalized procedure available. Is there someone who has heard about this problem? greetings Eildert
On Tue, Feb 11, 2014 at 5:54 PM, Eildert Groeneveld <eildert.groeneveld@fli.bund.de> wrote: > Dear All > > this probably not the best list to post this question: > > I use cascading deletes but would like to first inform the user what she > is about to do. > Something like : explain delete from PANEL where panel_id=21; > -- you are about to delete 32144 records in tables abc aaa wewew > > This is clearly something that can be programmed but as all information > is available in the database schema, there should be a generalized > procedure available. Granted, this is somewhat ugly, but you could issue the delete, note down the rowcount, and rollback.
On Di, 2014-02-11 at 18:58 -0200, Claudio Freire wrote: > On Tue, Feb 11, 2014 at 5:54 PM, Eildert Groeneveld > <eildert.groeneveld@fli.bund.de> wrote: > > Dear All > > > > this probably not the best list to post this question: > > > > I use cascading deletes but would like to first inform the user what she > > is about to do. > > Something like : explain delete from PANEL where panel_id=21; > > -- you are about to delete 32144 records in tables abc aaa wewew > > > > This is clearly something that can be programmed but as all information > > is available in the database schema, there should be a generalized > > procedure available. > > Granted, this is somewhat ugly, but you could issue the delete, note > down the rowcount, and rollback. Thanks Claudio, thats an option and a fallback if we do not come up with a better version. I am sure that there is something around. > > -- Eildert Groeneveld =================================================== Institute of Farm Animal Genetics (FLI) Mariensee 31535 Neustadt Germany Tel : (+49)(0)5034 871155 Fax : (+49)(0)5034 871143 e-mail: eildert.groeneveld@fli.bund.de web: http://vce.tzv.fal.de ==================================================