New ECPG idea, was: Re: ECPG FETCH readahead - Mailing list pgsql-hackers

From Boszormenyi Zoltan
Subject New ECPG idea, was: Re: ECPG FETCH readahead
Date
Msg-id 520F5845.7060104@cybertec.at
Whole thread Raw
In response to Re: ECPG FETCH readahead  (Boszormenyi Zoltan <zb@cybertec.at>)
Responses Proposal: UPDATE/DELETE ... WHERE OFFSET n OF cursor_name, was: Re: New ECPG idea, was: Re: ECPG FETCH readahead  (Boszormenyi Zoltan <zb@cybertec.at>)
List pgsql-hackers
<div class="moz-cite-prefix">2013-08-17 12:08 keltezéssel, Boszormenyi Zoltan írta:<br /></div><blockquote
cite="mid:520F4B90.2010800@cybertec.at"type="cite">Hi, <br /><br /> I am restarting this old thread... :-) <br /><br />
2012-04-2410:17 keltezéssel, Michael Meskes írta: <br /><blockquote type="cite"><blockquote type="cite">OK, I will
implement#2. Another question popped up: what to do <br /> with FETCH ALL? The current readahead window size or
temporarily<br /> bumping it to say some tens of thousands can be used. We may not <br /> know how much is the "all
records".This, although lowers performance, <br /> saves memory. <br /></blockquote> I would say doing a large fetch in
twoor three batches won't cost too much in <br /> terms of performance. <br /><br /><blockquote type="cite">Please,
don'tapply this patch yet. I discovered a rather big hole <br /> that can confuse the cursor position tracking if you
dothis: <br /> ... <br /> That will also need a new round of review. Sorry for that. <br /></blockquote> No problem,
betterto find it now instead of after release. <br /><br /> Anyway, I moved the patch to 2012-next (I hope I did it
correctly)so 2012-1 <br /> can be closed. Let's try to get this patch done in the next commit fest. <br /><br />
Michael<br /></blockquote><br /> I had time to look into this patch of mine again after about 1.5 years. <br />
Frankly,this time was too long to remember every detail of the patch <br /> and looking at parts of the patch as a big
entitywas confusing. <br /><br /> So I started fresh and to make review easier, I broke the patch up <br /> into small
piecesthat all build on each other. I have also fixed quite <br /> a few bugs, mostly in my code, but some in the ECPG
parserand <br /> the regression tests as well. <br /><br /> I have put the broken up patchset into a GIT tree of mine
atGitHub: <br /><a class="moz-txt-link-freetext"
href="https://github.com/zboszor/ecpg-readahead/">https://github.com/zboszor/ecpg-readahead/</a><br/> but the huge
compressedpatch is also attached for reference. <br /> It was generated with <br /><br /> $ git diff
221e92f64c6e136e550ec2592aac3ae0d4623209..870922676e6ae0faa4ebbf94b92e0b97ec418e16<br/><br /> ECPG regression tests are
nowValgrind-clean except two of them <br /> but both are pre-existing bugs. <br /><br /> 1.
ecpg/test/compat_informix/rfmtlong.pgcpoints out a problem in <br />    ecpg/compatlib/informix.c <br /><br /> ==5036==
1errors in context 1 of 4: <br /> ==5036== Invalid read of size 4 <br /> ==5036==    at 0x4E3453C: rfmtlong
(informix.c:941)<br /> ==5036==    by 0x4007DA: fmtlong.constprop.0 (rfmtlong.pgc:22) <br /> ==5036==    by 0x4006BE:
main(rfmtlong.pgc:45) <br /> ==5036==  Address 0x60677d8 is 24 bytes inside a block of size 25 alloc'd <br />
==5036==   at 0x4C28409: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) <br /> ==5036==    by
0x4E34268:rfmtlong (informix.c:783) <br /> ==5036==    by 0x4007DA: fmtlong.constprop.0 (rfmtlong.pgc:22) <br />
==5036==   by 0x4006BE: main (rfmtlong.pgc:45) <br /><br /> The same error is reported 4 times. <br /><br /> 2.
ecpg_add_mem()seems to leak memory: <br /><br /> ==5463== 256 bytes in 16 blocks are definitely lost in loss record 1
of1 <br /> ==5463==    at 0x4C2A121: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) <br />
==5463==   by 0x4E3E153: ecpg_alloc (memory.c:21) <br /> ==5463==    by 0x4E3E212: ecpg_add_mem (memory.c:110) <br />
==5463==   by 0x4E3542B: ecpg_store_result (execute.c:409) <br /> ==5463==    by 0x4E37E5A: ecpg_process_output
(execute.c:1777)<br /> ==5463==    by 0x4E38CCA: ecpg_do (execute.c:2137) <br /> ==5463==    by 0x4E38D8A: ECPGdo
(execute.c:2159)<br /> ==5463==    by 0x400A82: fn (alloc.pgc:51) <br /> ==5463==    by 0x5152C52: start_thread
(pthread_create.c:308)<br /> ==5463==    by 0x545C13C: clone (clone.S:113) <br /><br /> The last two issue we talked
aboutin this thread are also implemented: <br /> - permanently raise the readahead window if the application sends a
<br/>   bigger FETCH command, and <br /> - temporarily raise the readahead window for FETCH ALL commands <br /><br />
Thecursor position tracking was completely rewritten, so the client side <br /> properly follows the cursor position
knownby the backend and doesn't <br /> skip MOVE statements where it shouldn't. The previously known <br /> bug is
completelyeliminated this way. <br /><br /> Please, review that patch. <br /></blockquote><br /> I have another idea to
makeECPG building on this huge patch.<br /><br /> Currently, UPDATE/DELETE WHERE CURRENT OF has to issue a MOVE<br />
beforethe command in case the cursor positions known by the application<br /> and the backend are different.<br /><br
/>My idea builds on the fact that UPDATE/DELETE RETURNING is present<br /> in all supported back branches.<br /><br />
Amini-parser only understanding SELECT, UPDATE and DELETE should<br /> be added to ecpglib, so<br /><br />     DECLARE
cursorCURSOR FOR SELECT ...<br /><br /> and<br /><br />     PREPARE prepared_stmt FROM :query;<br />     DECLARE cursor
CURSORFOR prepared_stmt;<br /><br /> can be analyzed and tweaked behind the application's back.<br /><br /> This is
neededto detect whether a query is a simple updatable<br /> scan of a table, and returning errors early to the
applicationif it's not,<br /> without actually sending the UPDATE/DELETE WHERE CURRENT OF<br /> query to the
backend.<br/><br /> For the purpose of WHERE CURRENT OF, I would add a ctid<br /> column at the end of the targelist
thatis treated like "resjunk"<br /> in the backend when returning data to the application.<br /><br /> So, SELECTs
wouldreturn the ctid information of the tuples.<br /> The cursor query was a FETCH N with abs(N)>1 because of<br />
thereadahead. For this reason, the cursor positions known<br /> by the application and the backend are different.<br
/><br/> The extra MOVE can be eliminated by replacing<br /><br />     UPDATE table SET ... WHERE CURRENT OF cursor;<br
/><br/> with<br /><br />     UPDATE table SET ... WHERE ctid='...' RETURNING ctid;<br /><br /> Of course, if the
originalquery contained RETURNING, the<br /> ctid field would be appended in resjunk fashion.<br /><br /> The new ctid
canbe saved back into the cache using PQsetvalue()<br /> and the (position ; new ctid) pair into a hash, both can be<br
/>looked up in case the application wants to modify the<br /> same tuple again. Some protection is needed against
growing<br/> the hash too big. But usually[*] programs don't go back<br /> to modify the same record twice.<br /><br />
[*]This is only an educated guess.<br /><br /> How about it?<br /><br /> Best regards,<br /> Zoltán Böszörményi<br
/><br/><blockquote cite="mid:520F4B90.2010800@cybertec.at" type="cite"><br /> Thanks in advance and best regards, <br
/>Zoltán Böszörményi <br /><br /><br /><fieldset class="mimeAttachmentHeader"></fieldset><br /><pre wrap="">
 
</pre></blockquote><br /><br /><pre class="moz-signature" cols="90">-- 
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: <a class="moz-txt-link-freetext" href="http://www.postgresql-support.de">http://www.postgresql-support.de</a>
<aclass="moz-txt-link-freetext" href="http://www.postgresql.at/">http://www.postgresql.at/</a>
 
</pre>

pgsql-hackers by date:

Previous
From: Boszormenyi Zoltan
Date:
Subject: Re: ECPG FETCH readahead
Next
From: Huang Bambo
Date:
Subject: Re: [pgsql-zh-general] Chinese in Postgres