generating a large XML document - Mailing list pgsql-performance

From Julius Tuskenis
Subject generating a large XML document
Date
Msg-id 4DFA5AA4.8090001@nsoft.lt
Whole thread Raw
Responses Re: generating a large XML document
List pgsql-performance
Hello, postgresql guru's

When developing ticket export mechanism for our ticketing system I
thought the wise thing would be to generate the result (XML file) in
stored function. This worked fine for small datasets, but for the larger
ones I see it takes much longer to generate the document than to select
the data needed for it.

Here are several "explain alalyze"s :

explain analyze
SELECT * from tex.fnk_access_control_tickets(8560, 0);

"Function Scan on fnk_access_control_tickets  (cost=0.25..10.25
rows=1000 width=246) (actual time=479.276..484.429 rows=16292 loops=1)"
"Total runtime: 486.774 ms"

explain analyze
SELECT
     --xmlagg(
       XMLELEMENT ( NAME "bar",
         XMLELEMENT ( NAME "code", tick_barcode),
         XMLELEMENT ( NAME "stat", status),
         CASE WHEN type IS NOT NULL THEN XMLELEMENT ( NAME "tip", type)
ELSE NULL END,
         CASE WHEN sec_name IS NOT NULL THEN XMLELEMENT ( NAME "sec",
sec_name) ELSE NULL END,
         CASE WHEN row_name IS NOT NULL THEN XMLELEMENT ( NAME "row",
row_name) ELSE NULL END,
         CASE WHEN seat_name IS NOT NULL THEN XMLELEMENT ( NAME "plc",
seat_name) ELSE NULL END,
         CASE WHEN substr(tick_barcode,length(tick_barcode),1)= '1' THEN
         XMLELEMENT ( NAME "groups",
             XMLELEMENT ( NAME "group", 1)
            )
         ELSE NULL END
        )
     -- )
     FROM tex.fnk_access_control_tickets(8560, 0);

"Function Scan on fnk_access_control_tickets  (cost=0.25..17.75
rows=1000 width=238) (actual time=476.446..924.785 rows=16292 loops=1)"
"Total runtime: 928.768 ms"


explain analyze
SELECT
     xmlagg(
       XMLELEMENT ( NAME "bar",
         XMLELEMENT ( NAME "code", tick_barcode),
         XMLELEMENT ( NAME "stat", status),
         CASE WHEN type IS NOT NULL THEN XMLELEMENT ( NAME "tip", type)
ELSE NULL END,
         CASE WHEN sec_name IS NOT NULL THEN XMLELEMENT ( NAME "sec",
sec_name) ELSE NULL END,
         CASE WHEN row_name IS NOT NULL THEN XMLELEMENT ( NAME "row",
row_name) ELSE NULL END,
         CASE WHEN seat_name IS NOT NULL THEN XMLELEMENT ( NAME "plc",
seat_name) ELSE NULL END,
         CASE WHEN substr(tick_barcode,length(tick_barcode),1)= '1' THEN
         XMLELEMENT ( NAME "groups",
             XMLELEMENT ( NAME "group", 1)
            )
         ELSE NULL END
        )
      )
     FROM tex.fnk_access_control_tickets(8560, 0);

"Aggregate  (cost=12.75..12.77 rows=1 width=238) (actual
time=16110.847..16110.848 rows=1 loops=1)"
"  ->  Function Scan on fnk_access_control_tickets  (cost=0.25..10.25
rows=1000 width=238) (actual time=500.029..520.974 rows=16292 loops=1)"
"Total runtime: 16111.264 ms"

It seems the aggregate combining the elements is to blame... What I did
next was rewriting it in stored function using FOR loop and combining it
in a text variable. Sadly the result was the same (approximately 16
seconds).

 From that I had to conclude that text := text + some_text operation is
an expensive one, but now I have no ideas left how to solve the problem.
Any notices, recommendations, advices are very welcome.

I've also tried google'ing on XML creation in Postgresql, but found no
warnings or even mentioning xmlagg could cause a headache. I have
nowhere to turn for help now, so please advice...

Not sure if that will help, but anyway:
Server:
PostgreSQL 9.0.3 on i486-pc-linux-gnu, compiled by GCC gcc-4.4.real
(Debian 4.4.5-10) 4.4.5, 32-bit
see the result of select * from pg_settings  in attachment if needed.
Client:
Windows XP, pgAdmin 1.12.3


Thank you in advance.

--
Julius Tuskenis
Programavimo skyriaus vadovas
UAB nSoft
mob. +37068233050


Attachment

pgsql-performance by date:

Previous
From: Merlin Moncure
Date:
Subject: Re: Performance advice for a new low(er)-power server
Next
From: Greg Smith
Date:
Subject: Re: Performance advice for a new low(er)-power server