Thread: Any insights on Qlik Sense using CURSOR ?
Hi,

I am using PostgreSQL 12.7 on Ubuntu as a data warehouse, which is then queried by Qlik Sense to produce business analytics.

One of my data loaders, which runs multiple queries, sometimes takes about 3 hours to feed Qlik with the relevant records (about 10M records), but sometimes goes crazy and times out (Qlik stops it when it takes more than 480 minutes).

The point is that Qlik is using a CURSOR to retrieve the data. I'm not familiar with CURSOR, and the PostgreSQL documentation mainly cites functions as a use case. I don't really know how Qlik creates these cursors when executing my queries...

I tried log_min_duration_statement to pinpoint the problem, but it only shows things like this:

...
LOG: duration: 294774.600 ms  statement: fetch 100000 in "SQL_CUR4"
LOG: duration: 282867.279 ms  statement: fetch 100000 in "SQL_CUR4"
...

So I don't know exactly which of my queries is hiding behind "SQL_CUR4"...

Is there a way to log the actual query?
Is using a CURSOR a best practice for retrieving big datasets? (It seems Qlik uses it for every connection to PostgreSQL.)
Does each FETCH re-run the query, or is the result somehow cached (on disk?)?

Thanks for any insight on CURSOR and/or Qlik queries on PostgreSQL!

Best regards,
Franck
On Tue, May 18, 2021 at 6:22 PM Franck Routier (perso) <alci@mecadu.org> wrote:
Have you tried setting the parameter below?
log_statement = 'all'
That will get all queries logged to the server log files.
Regards,
Ganesh Korde.
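[For later readers: a minimal sketch of how these logging settings look in postgresql.conf; the 60-second threshold is just an example value, not from the thread:

# postgresql.conf
log_statement = 'all'                  # log every statement (verbose; enable temporarily)
log_min_duration_statement = 60000     # also log any statement slower than 60 s (value in ms; -1 disables)

Both parameters take effect on a configuration reload (SELECT pg_reload_conf();), no server restart needed.]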
For the record,

Qlik uses the ODBC driver with UseDeclareFetch=1, hence the use of cursors.

By default, the PostgreSQL planner optimizes a cursor's execution plan for retrieving only 10% of the rows. This can be controlled with the cursor_tuple_fraction parameter.

In my case, setting it to 1.0 (instead of the default 0.1) brought the query down from more than 1 hour (sometimes going crazy and running for several hours) to 15 minutes.

In general, I think 1.0 is the right value when using Qlik, as its loaders read all rows.
Franck
-- Sent from /e/ Mail.
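[To make the fix above concrete: the setting can be applied per session before the cursor is declared, or persisted for the role Qlik connects as, so every ODBC connection inherits it. A sketch; "qlik_user" is a hypothetical role name, not from the thread:

-- per session, before the cursor is declared:
set cursor_tuple_fraction = 1.0;

-- or once, for all future connections made by the Qlik role:
alter role qlik_user set cursor_tuple_fraction = 1.0;]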
On 20 May 2021 at 21:33:25 GMT+02:00, "Franck Routier (perso)" <alci@mecadu.org> wrote:
Thanks Ganesh,
this gave me the select that is slow. It does indeed look like this:
begin; declare "SQL_CUR4" cursor with hold for select ...
then a bunch of:
fetch 100000 in "SQL_CUR4"
then a commit
I also found this article
https://www.cybertec-postgresql.com/en/declare-cursor-in-postgresql-or-how-to-reduce-memory-consumption/
to be an interesting introduction to cursors in PostgreSQL.
I'll now work on this query to try to understand the problem.
Franck
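[A side note for anyone debugging a similar case: EXPLAIN accepts DECLARE, so the plan the cursor actually gets (which can differ from the plan of the bare SELECT, because of cursor_tuple_fraction) can be inspected directly. A sketch; "my_table" stands in for the real query:

begin;
-- plans the cursor's query without creating the cursor or fetching rows
explain declare "SQL_CUR4" cursor with hold for select * from my_table;
rollback;]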
On Thursday, 20 May 2021 at 17:59 +0530, Ganesh Korde wrote: