to high memory expensive xpath_table from xml2 contrib extension - Mailing list pgsql-bugs
From | Pavel Stehule |
---|---|
Subject | to high memory expensive xpath_table from xml2 contrib extension |
Date | |
Msg-id | CAFj8pRD75ZmgdKNmrVFQhkxxsoNt7NdoLRwmgO3huyeUGcruyg@mail.gmail.com Whole thread Raw |
List | pgsql-bugs |
Hi I found some unexpected behave when I tested some possibilities for my customer. This test needs installed extension xml2 with xslt buildin support create or replace function xml_generator(int) returns xml as $$ select xslt_process(xmlelement(name data, xmlagg(xmlelement(name row, xmlforest(right('0000000' || i, 8) as a, current_date as b, current_date + 100 as c, 'XXX' as d, 'short' as e, 'short' as f, 'some longer description' as g, 'some other longer description english' as h))))::text, '<xsl:stylesheet version="1.0" xmlns:xsl=" http://www.w3.org/1999/XSL/Transform"> <xsl:strip-space elements="*" /><xsl:output method="xml" indent="yes" /> <xsl:template match="node() | @*"> <xsl:copy> <xsl:apply-templates select="node() | @*" /> </xsl:copy> </xsl:template> </xsl:stylesheet>')::xml from generate_series(1,$1) g(i) $$ language sql; postgres=# select xml_generator(1); xml_generator -------------------------------------------------- <data> + <row> + <a>00000001</a> + <b>2015-04-05</b> + <c>2015-07-14</c> + <d>XXX</d> + <e>short</e> + <f>short</f> + <g>some longer description</g> + <h>some other longer description english</h>+ </row> + </data> + (1 row) create table test(id int, a xml); insert into test values(1, xml_generator(10)); insert into test values(2, xml_generator(100)); insert into test values(3, xml_generator(1000)); insert into test values(4, xml_generator(2000)); insert into test values(5, xml_generator(4000)); insert into test values(6, xml_generator(8000)); The size of xml document with 8K fields is about 1.7MB postgres=# select length(xml_generator(8000)::text); length --------- 1712037 (1 row) processing via buildin function xpath is fast: postgres=# select a[1]::text a, a[2]::text::date b, a[3]::text::date c, a[4]::text, a[5]::text d, a[6]::text e, a[7]::text f, a[8]::text g from (select xpath('/row/a/text()|/row/b/text()|/row/c/text()|/row/d/text()|/row/e/text()|/row/f/text()|/row/g/text()|/row/h/text()', r) a from (select unnest(xpath('/data/row', a)) r from test where id = 1) s1) s2; a | b | c | a | d | e | f | g ----------+------------+------------+-----+-------+-------+-------------------------+--------------------------------------- 00000001 | 2015-04-05 | 2015-07-14 | XXX | short | short | some longer description | some other longer description english 00000002 | 2015-04-05 | 2015-07-14 | XXX | short | short | some longer description | some other longer description english 00000003 | 2015-04-05 | 2015-07-14 | XXX | short | short | some longer description | some other longer description english 00000004 | 2015-04-05 | 2015-07-14 | XXX | short | short | some longer description | some other longer description english 00000005 | 2015-04-05 | 2015-07-14 | XXX | short | short | some longer description | some other longer description english 00000006 | 2015-04-05 | 2015-07-14 | XXX | short | short | some longer description | some other longer description english 00000007 | 2015-04-05 | 2015-07-14 | XXX | short | short | some longer description | some other longer description english 00000008 | 2015-04-05 | 2015-07-14 | XXX | short | short | some longer description | some other longer description english 00000009 | 2015-04-05 | 2015-07-14 | XXX | short | short | some longer description | some other longer description english 00000010 | 2015-04-05 | 2015-07-14 | XXX | short | short | some longer description | some other longer description english (10 rows) postgres=# \timing Timing is on. postgres=# \o /dev/null postgres=# select a[1]::text a, a[2]::text::date b, a[3]::text::date c, a[4]::text, a[5]::text d, a[6]::text e, a[7]::text f, a[8]::text g from (select xpath('/row/a/text()|/row/b/text()|/row/c/text()|/row/d/text()|/row/e/text()|/row/f/text()|/row/g/text()|/row/h/text()', r) a from (select unnest(xpath('/data/row', a)) r from test where id = 6) s1) s2; Time: 2620.013 ms postgres=# select * from xpath_table('id', 'a', 'test', '/data/row/a/text()|/data/row/b/text()|/data/row/c/text()|/data/row/d/text()|/data/row/e/text()|/data/row/f/text()|/data/row/g/text()|/data/row/h/text()', 'id = 1') as (id int, a text, b date, c date, d text, e text, f text, g text, h text); id | a | b | c | d | e | f | g | h ----+----------+------------+------------+-----+-------+-------+-------------------------+--------------------------------------- 1 | 00000001 | 2015-04-05 | 2015-07-14 | XXX | short | short | some longer description | some other longer description english 1 | 00000002 | 2015-04-05 | 2015-07-14 | XXX | short | short | some longer description | some other longer description english 1 | 00000003 | 2015-04-05 | 2015-07-14 | XXX | short | short | some longer description | some other longer description english 1 | 00000004 | 2015-04-05 | 2015-07-14 | XXX | short | short | some longer description | some other longer description english 1 | 00000005 | 2015-04-05 | 2015-07-14 | XXX | short | short | some longer description | some other longer description english 1 | 00000006 | 2015-04-05 | 2015-07-14 | XXX | short | short | some longer description | some other longer description english 1 | 00000007 | 2015-04-05 | 2015-07-14 | XXX | short | short | some longer description | some other longer description english 1 | 00000008 | 2015-04-05 | 2015-07-14 | XXX | short | short | some longer description | some other longer description english 1 | 00000009 | 2015-04-05 | 2015-07-14 | XXX | short | short | some longer description | some other longer description english 1 | 00000010 | 2015-04-05 | 2015-07-14 | XXX | short | short | some longer description | some other longer description english (10 rows) Time: 4.965 ms But for processing xml(5) .. 4K fields it needs 1.5GB RAM and 89sec. Using xpath_table is relative comfortable, but dangerous for little bit larger data than small. Regards Pavel Stehule
pgsql-bugs by date: