Thread: pl/pgsql faster than raw SQL?

pl/pgsql faster than raw SQL?

From
Markus Bertheau ☭
Date:
Hi, I have the following strange situation:

oocms=# vacuum full analyze;
VACUUM
oocms=# \df+ class_get_number_of_objects
                                                                                                       Список функций
 Схема |             Имя             | Тип данных результата | Типы данных аргументов | Владелец |  Язык   | Исходный
текст|                                           Описание 

-------+-----------------------------+-----------------------+------------------------+----------+---------+----------------+-----------------------------------------------------------------------------------------------
 oocms | class_get_number_of_objects | integer               | text                   | oocms    | plpgsql |
DECLARE
        arg_class_name ALIAS FOR $1;
BEGIN
        IF arg_class_name IS NULL THEN
                RAISE WARNING 'class_get_number_of_objects() with NULL class name called';
                RETURN NULL;
        END IF;
        RETURN
                count(1)
        FROM
                objects
        WHERE
                class = arg_class_name;
END;
               | Return the number of existing or deleted objects of a class. Arguments: the name of the class
(1 запись)

oocms=# explain analyze select count(1) from objects where class = 'Picture';
                                                   QUERY PLAN
----------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=278.16..278.16 rows=1 width=0) (actual time=44.121..44.123 rows=1 loops=1)
   ->  Seq Scan on objects  (cost=0.00..267.65 rows=4205 width=0) (actual time=0.030..33.325 rows=4308 loops=1)
         Filter: ("class" = 'Picture'::text)
 Total runtime: 44.211 ms
(записей: 4)

oocms=# explain analyze select class_get_number_of_objects('Picture');
                                      QUERY PLAN
--------------------------------------------------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=0) (actual time=27.019..27.022 rows=1 loops=1)
 Total runtime: 27.062 ms
(записей: 2)


I.e. a function takes 27 ms to do what takes an equivalent piece of sql
43 ms. How can this be explained?

Some more info:

oocms=# select class_get_number_of_objects('Picture');
 class_get_number_of_objects
-----------------------------
                        4308
(1 запись)

oocms=# select count(1) from objects;
 count
-------
 13332
(1 запись)

oocms=# \d objects
                                       Таблица "oocms.objects"
  Колонка  |           Тип            |                         Модификаторы
-----------+--------------------------+---------------------------------------------------------------
 object_id | integer                  | not null default nextval('oocms.objects_object_id_seq'::text)
 class     | text                     | not null
 created   | timestamp with time zone | not null default ('now'::text)::timestamp(6) with time zone
Индексы:
    "objects_pkey" PRIMARY KEY, btree (object_id)
    "fooooo" btree ("class")
Ограничения по внешнему ключу:
    "objects_class_fkey" FOREIGN KEY ("class") REFERENCES classes(name) ON UPDATE CASCADE


--
Markus Bertheau ☭ <twanger@bluetwanger.de>


Re: pl/pgsql faster than raw SQL?

From
Richard Huxton
Date:
Markus Bertheau ☭ wrote:
> oocms=# explain analyze select count(1) from objects where class = 'Picture';
>                                                    QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=278.16..278.16 rows=1 width=0) (actual time=44.121..44.123 rows=1 loops=1)
>    ->  Seq Scan on objects  (cost=0.00..267.65 rows=4205 width=0) (actual time=0.030..33.325 rows=4308 loops=1)
>          Filter: ("class" = 'Picture'::text)
>  Total runtime: 44.211 ms
> (записей: 4)
>
> oocms=# explain analyze select class_get_number_of_objects('Picture');
>                                       QUERY PLAN
> --------------------------------------------------------------------------------------
>  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=27.019..27.022 rows=1 loops=1)
>  Total runtime: 27.062 ms

Well, you're saving planning time with the plpgsql version, but that's
not going to come to 17ms (you'd hope). The EXPLAIN will take up time
itself, and it can look deeper into the SQL version. Try timing two
scripts with 100 of each and see if they really differ by that much.
--
   Richard Huxton
   Archonet Ltd

Re: pl/pgsql faster than raw SQL?

From
John A Meinel
Date:
Markus Bertheau ☭ wrote:

>Hi, I have the following strange situation:
>
>
>
...

>oocms=# explain analyze select count(1) from objects where class = 'Picture';
>                                                   QUERY PLAN
>----------------------------------------------------------------------------------------------------------------
> Aggregate  (cost=278.16..278.16 rows=1 width=0) (actual time=44.121..44.123 rows=1 loops=1)
>   ->  Seq Scan on objects  (cost=0.00..267.65 rows=4205 width=0) (actual time=0.030..33.325 rows=4308 loops=1)
>         Filter: ("class" = 'Picture'::text)
> Total runtime: 44.211 ms
>(записей: 4)
>
>oocms=# explain analyze select class_get_number_of_objects('Picture');
>                                      QUERY PLAN
>--------------------------------------------------------------------------------------
> Result  (cost=0.00..0.01 rows=1 width=0) (actual time=27.019..27.022 rows=1 loops=1)
> Total runtime: 27.062 ms
>(записей: 2)
>
>
>I.e. a function takes 27 ms to do what takes an equivalent piece of sql
>43 ms. How can this be explained?
>
>Some more info:
>
>
In explain analyze, there is a per-row overhead of 2 gettimeofday()
calls. This is usually very low and hidden in I/O, but on queries where
you go through a lot of rows,  but things are cached in ram, it can show up.
So the explain analyze is going deep into the SQL query.
With a stored procedure, explain analyze only runs the procedure, it
doesn't instrument the actual function. So you don't have that per-row
overhead.

For an alternate accurate view. Try:
# \timing
# explain analyze select count(1) from objects where class = 'Picture';
# explain analyze select class_get_number_of_objects('Picture');

\timing will also give you the time it takes to run the query, but it
doesn't instrument anything.

John
=:->


Attachment

Re: pl/pgsql faster than raw SQL?

From
Gaetano Mendola
Date:
Markus Bertheau ☭ wrote:
> Hi, I have the following strange situation:

that is no so strange. I have an example where:

SELECT * FROM my_view WHERE field1 = 'New'; ==> 800 seconds

SELECT * FROM my_view;  ==> 2 seconds

the only solution I had was to write a function table with
the second select in a loop that was returnin the row if
the field1 was equal = 'New'.
It's strange but happen.



Regards
Gaetano Mendola









Re: pl/pgsql faster than raw SQL?

From
John A Meinel
Date:
Gaetano Mendola wrote:

>Markus Bertheau ☭ wrote:
>
>
>>Hi, I have the following strange situation:
>>
>>
>
>that is no so strange. I have an example where:
>
>SELECT * FROM my_view WHERE field1 = 'New'; ==> 800 seconds
>
>SELECT * FROM my_view;  ==> 2 seconds
>
>the only solution I had was to write a function table with
>the second select in a loop that was returnin the row if
>the field1 was equal = 'New'.
>It's strange but happen.
>
>
>
>Regards
>Gaetano Mendola
>
>

That sounds more like you had bad statistics on the field1 column, which
caused postgres to switch from a seqscan to an index scan, only there
were so many rows with field1='New' that it actually would have been
faster with a seqscan.

Otherwise what you did is very similar to the "nested loop" of postgres
which it selects when appropriate.

The other issue with views is that depending on their definition,
sometimes postgres can flatten them out and optimize the query, and
sometimes it can't. Order by is one of the big culprits for bad queries
involving views.

John
=:->


Attachment

Re: pl/pgsql faster than raw SQL?

From
Gaetano Mendola
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

John A Meinel wrote:
 > That sounds more like you had bad statistics on the field1 column, which
> caused postgres to switch from a seqscan to an index scan, only there
> were so many rows with field1='New' that it actually would have been
> faster with a seqscan.

The field1 was a calculated field and with the filter "='New'"
postgres was executing that function on more rows than without filter.



Regards
Gaetano Mendola



-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (MingW32)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFCLtwZ7UpzwH2SGd4RAhU5AJwMeFWwIO/UfdU0QTDo+FTCxPhqYACfYNVl
1yBUEObhZhUDnNDXdsJ/bi0=
=xc8U
-----END PGP SIGNATURE-----