Thread: pl/pgsql faster than raw SQL?
Hi, I have the following strange situation: oocms=# vacuum full analyze; VACUUM oocms=# \df+ class_get_number_of_objects Список функций Схема | Имя | Тип данных результата | Типы данных аргументов | Владелец | Язык | Исходный текст| Описание -------+-----------------------------+-----------------------+------------------------+----------+---------+----------------+----------------------------------------------------------------------------------------------- oocms | class_get_number_of_objects | integer | text | oocms | plpgsql | DECLARE arg_class_name ALIAS FOR $1; BEGIN IF arg_class_name IS NULL THEN RAISE WARNING 'class_get_number_of_objects() with NULL class name called'; RETURN NULL; END IF; RETURN count(1) FROM objects WHERE class = arg_class_name; END; | Return the number of existing or deleted objects of a class. Arguments: the name of the class (1 запись) oocms=# explain analyze select count(1) from objects where class = 'Picture'; QUERY PLAN ---------------------------------------------------------------------------------------------------------------- Aggregate (cost=278.16..278.16 rows=1 width=0) (actual time=44.121..44.123 rows=1 loops=1) -> Seq Scan on objects (cost=0.00..267.65 rows=4205 width=0) (actual time=0.030..33.325 rows=4308 loops=1) Filter: ("class" = 'Picture'::text) Total runtime: 44.211 ms (записей: 4) oocms=# explain analyze select class_get_number_of_objects('Picture'); QUERY PLAN -------------------------------------------------------------------------------------- Result (cost=0.00..0.01 rows=1 width=0) (actual time=27.019..27.022 rows=1 loops=1) Total runtime: 27.062 ms (записей: 2) I.e. a function takes 27 ms to do what takes an equivalent piece of sql 43 ms. How can this be explained? Some more info: oocms=# select class_get_number_of_objects('Picture'); class_get_number_of_objects ----------------------------- 4308 (1 запись) oocms=# select count(1) from objects; count ------- 13332 (1 запись) oocms=# \d objects Таблица "oocms.objects" Колонка | Тип | Модификаторы -----------+--------------------------+--------------------------------------------------------------- object_id | integer | not null default nextval('oocms.objects_object_id_seq'::text) class | text | not null created | timestamp with time zone | not null default ('now'::text)::timestamp(6) with time zone Индексы: "objects_pkey" PRIMARY KEY, btree (object_id) "fooooo" btree ("class") Ограничения по внешнему ключу: "objects_class_fkey" FOREIGN KEY ("class") REFERENCES classes(name) ON UPDATE CASCADE -- Markus Bertheau ☭ <twanger@bluetwanger.de>
Markus Bertheau ☭ wrote: > oocms=# explain analyze select count(1) from objects where class = 'Picture'; > QUERY PLAN > ---------------------------------------------------------------------------------------------------------------- > Aggregate (cost=278.16..278.16 rows=1 width=0) (actual time=44.121..44.123 rows=1 loops=1) > -> Seq Scan on objects (cost=0.00..267.65 rows=4205 width=0) (actual time=0.030..33.325 rows=4308 loops=1) > Filter: ("class" = 'Picture'::text) > Total runtime: 44.211 ms > (записей: 4) > > oocms=# explain analyze select class_get_number_of_objects('Picture'); > QUERY PLAN > -------------------------------------------------------------------------------------- > Result (cost=0.00..0.01 rows=1 width=0) (actual time=27.019..27.022 rows=1 loops=1) > Total runtime: 27.062 ms Well, you're saving planning time with the plpgsql version, but that's not going to come to 17ms (you'd hope). The EXPLAIN will take up time itself, and it can look deeper into the SQL version. Try timing two scripts with 100 of each and see if they really differ by that much. -- Richard Huxton Archonet Ltd
Markus Bertheau ☭ wrote: >Hi, I have the following strange situation: > > > ... >oocms=# explain analyze select count(1) from objects where class = 'Picture'; > QUERY PLAN >---------------------------------------------------------------------------------------------------------------- > Aggregate (cost=278.16..278.16 rows=1 width=0) (actual time=44.121..44.123 rows=1 loops=1) > -> Seq Scan on objects (cost=0.00..267.65 rows=4205 width=0) (actual time=0.030..33.325 rows=4308 loops=1) > Filter: ("class" = 'Picture'::text) > Total runtime: 44.211 ms >(записей: 4) > >oocms=# explain analyze select class_get_number_of_objects('Picture'); > QUERY PLAN >-------------------------------------------------------------------------------------- > Result (cost=0.00..0.01 rows=1 width=0) (actual time=27.019..27.022 rows=1 loops=1) > Total runtime: 27.062 ms >(записей: 2) > > >I.e. a function takes 27 ms to do what takes an equivalent piece of sql >43 ms. How can this be explained? > >Some more info: > > In explain analyze, there is a per-row overhead of 2 gettimeofday() calls. This is usually very low and hidden in I/O, but on queries where you go through a lot of rows, but things are cached in ram, it can show up. So the explain analyze is going deep into the SQL query. With a stored procedure, explain analyze only runs the procedure, it doesn't instrument the actual function. So you don't have that per-row overhead. For an alternate accurate view. Try: # \timing # explain analyze select count(1) from objects where class = 'Picture'; # explain analyze select class_get_number_of_objects('Picture'); \timing will also give you the time it takes to run the query, but it doesn't instrument anything. John =:->
Attachment
Markus Bertheau ☭ wrote: > Hi, I have the following strange situation: that is no so strange. I have an example where: SELECT * FROM my_view WHERE field1 = 'New'; ==> 800 seconds SELECT * FROM my_view; ==> 2 seconds the only solution I had was to write a function table with the second select in a loop that was returnin the row if the field1 was equal = 'New'. It's strange but happen. Regards Gaetano Mendola
Gaetano Mendola wrote: >Markus Bertheau ☭ wrote: > > >>Hi, I have the following strange situation: >> >> > >that is no so strange. I have an example where: > >SELECT * FROM my_view WHERE field1 = 'New'; ==> 800 seconds > >SELECT * FROM my_view; ==> 2 seconds > >the only solution I had was to write a function table with >the second select in a loop that was returnin the row if >the field1 was equal = 'New'. >It's strange but happen. > > > >Regards >Gaetano Mendola > > That sounds more like you had bad statistics on the field1 column, which caused postgres to switch from a seqscan to an index scan, only there were so many rows with field1='New' that it actually would have been faster with a seqscan. Otherwise what you did is very similar to the "nested loop" of postgres which it selects when appropriate. The other issue with views is that depending on their definition, sometimes postgres can flatten them out and optimize the query, and sometimes it can't. Order by is one of the big culprits for bad queries involving views. John =:->
Attachment
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 John A Meinel wrote: > That sounds more like you had bad statistics on the field1 column, which > caused postgres to switch from a seqscan to an index scan, only there > were so many rows with field1='New' that it actually would have been > faster with a seqscan. The field1 was a calculated field and with the filter "='New'" postgres was executing that function on more rows than without filter. Regards Gaetano Mendola -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (MingW32) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFCLtwZ7UpzwH2SGd4RAhU5AJwMeFWwIO/UfdU0QTDo+FTCxPhqYACfYNVl 1yBUEObhZhUDnNDXdsJ/bi0= =xc8U -----END PGP SIGNATURE-----