Re: [PERFORM] printing results of query to file in different times - Mailing list pgsql-performance

From George Neuner
Subject Re: [PERFORM] printing results of query to file in different times
Date
Msg-id 634f2eb6-daf6-6314-e6f5-1b531b50a9ec@comcast.net
Whole thread Raw
In response to [PERFORM] printing results of query to file in different times  (Mariel Cherkassky <mariel.cherkassky@gmail.com>)
Responses Re: [PERFORM] printing results of query to file in different times
List pgsql-performance
On Thu, 24 Aug 2017 16:15:19 +0300, Mariel Cherkassky
<mariel.cherkassky@gmail.com> wrote:

 >I'm trying to understand what postgresql doing in an issue that I'm
having.
 >Our app team wrote a function that runs with a cursor over the results
of a
 >query and via the utl_file func they write some columns to a file. I dont
 >understand why, but postgresql write the data into the file in the fs in
 >parts. I mean that it runs the query and it takes time to get back results
 >and when I see that the results back postgresql write to file the data and
 >then suddenly stops for X minutes. After those x minutes it starts
again to
 >write the data and it continues that way until its done. The query returns
 >total *100* rows. I want to understand why it stops suddenly. There arent
 >any locks in the database during this operation.
 >
 >my function looks like that :
 >
 >func(a,b,c...)
 >cursor cr for
 >select ab,c,d,e.....
 >begin
 >raise notice - 'starting loop time - %',timeofday();
 > for cr_record in cr
 >    Raise notice 'print to file - '%',timeofday();
 >    utl_file.write(file,cr_record)
 > end loop
 >end
 >
 >I see the log of the running the next output :
 >
 >starting loop 16:00
 >print to file : 16:03
 >print to file : 16:03
 >print to file : 16:07
 >print to file : 16:07
 >print to file : 16:07
 >print to file : 16:010
 >
 >......
 >
 >Can somebody explain to me this kind of behavior ? Why is it taking some
 >much time to write and in different minutes after the query already been
 >executed and finished ? Mybe I'm getting from the cursor only part of the
 >rows ?


First I'd ask where did you get  utl_file  from?  Postrgesql has no such
facility, so you must be using an extension.  And not one I'm familiar
with either -  EnterpriseDB has a utl_file implementation in their
Oracle compatibility stuff, but it uses "get" and "put" calls rather
than "read" and "write".

Second, raising notices can be slow - I assume you added them to see
what was happening?  How does the execution time compare if you remove them?

I saw someone else asked about the execution plan, but I'm not sure that
will help here because it would affect only the initial select ... the
cursor would be working with the result set and should be able to skip
directly to the target rows.  I might expect several seconds for the
loop with I/O ... but certainly not minutes unless the server is
severely overloaded.

One thing you might look at is the isolation level of the query.  If you
are using READ_COMMITTED or less, and the table is busy, other writing
queries may be stepping on yours and even potentially changing your
result set during the cursor loop.  I would try using REPEATABLE_READ
and see what happens.


George


pgsql-performance by date:

Previous
From: Roman Konoval
Date:
Subject: Re: [PERFORM] printing results of query to file in different times
Next
From: 우성민
Date:
Subject: [PERFORM] Handling small inserts from many connections.