Thread: Issues of slow running queries when dealing with Big Data

Issues of slow running queries when dealing with Big Data

From
Shaozhong SHI
Date:
It has been found that issues occur when Big Data is being handled with PostGIS.  Typically, queries can be killed by the system or memory is out.  Often, queries can be very slow.  Sometimes, it will take days or weeks to complete.

What are the best approaches and means for improving the performance of queries and processes in PostgreSQL/PostGIS?

Can anyone shed light on this?

Regards,

Shao

Re: Issues of slow running queries when dealing with Big Data

From
Gavin Flower
Date:
On 28/07/2020 22:54, Shaozhong SHI wrote:
> It has been found that issues occur when Big Data is being handled 
> with PostGIS. Typically, queries can be killed by the system or memory 
> is out.  Often, queries can be very slow.  Sometimes, it will take 
> days or weeks to complete.
>
> What are the best approaches and means for improving the performance 
> of queries and processes in PostgreSQL/PostGIS?
>
> Can anyone shed light on this?
>
> Regards,
>
> Shao

Probably helps if you can give us more details!

Such as O/S, versions of PostgreSQL/PostGIS, and hardware used. Plus 
anything else you think might be relevant.


Cheers,
Gavin




Re: Issues of slow running queries when dealing with Big Data

From
Shaozhong SHI
Date:
Hi, Gavin,

PostgreSQL Version - 10.11. PostGIS Version - 2.4.7. Operating System - Red Hat Enterprise Linux 7.7 .

That is all I know at the moment.

As I understand, our IT staff is building another one to sit on Azure.

Regards,

Shao 

On Tue, 28 Jul 2020 at 12:31, Gavin Flower <GavinFlower@archidevsys.co.nz> wrote:
On 28/07/2020 22:54, Shaozhong SHI wrote:
> It has been found that issues occur when Big Data is being handled
> with PostGIS. Typically, queries can be killed by the system or memory
> is out.  Often, queries can be very slow.  Sometimes, it will take
> days or weeks to complete.
>
> What are the best approaches and means for improving the performance
> of queries and processes in PostgreSQL/PostGIS?
>
> Can anyone shed light on this?
>
> Regards,
>
> Shao

Probably helps if you can give us more details!

Such as O/S, versions of PostgreSQL/PostGIS, and hardware used. Plus
anything else you think might be relevant.


Cheers,
Gavin

Re: Issues of slow running queries when dealing with Big Data

From
Michael Lewis
Date:
On Tue, Jul 28, 2020 at 7:59 AM Shaozhong SHI <shishaozhong@gmail.com> wrote:
Hi, Gavin,

PostgreSQL Version - 10.11. PostGIS Version - 2.4.7. Operating System - Red Hat Enterprise Linux 7.7 .

That is all I know at the moment.

As I understand, our IT staff is building another one to sit on Azure.

Please don't top-post in these mailing lists.

Can you share the results of the below?

select name, setting, source from pg_settings where source <> 'default';

Re: Issues of slow running queries when dealing with Big Data

From
Shaozhong SHI
Date:
Hi,

Please find the result of select name, setting, source from pg_settings where source <> 'default';

Regards,

Shao



On Tue, 28 Jul 2020 at 16:42, Michael Lewis <mlewis@entrata.com> wrote:
On Tue, Jul 28, 2020 at 7:59 AM Shaozhong SHI <shishaozhong@gmail.com> wrote:
Hi, Gavin,

PostgreSQL Version - 10.11. PostGIS Version - 2.4.7. Operating System - Red Hat Enterprise Linux 7.7 .

That is all I know at the moment.

As I understand, our IT staff is building another one to sit on Azure.

Please don't top-post in these mailing lists.

Can you share the results of the below?

select name, setting, source from pg_settings where source <> 'default';
Attachment

Re: Issues of slow running queries when dealing with Big Data

From
Gavin Flower
Date:
On 29/07/2020 03:51, Shaozhong SHI wrote:
> Hi,
>
> Please find the result of select name, setting, source from 
> pg_settings where source <> 'default';
>
> Regards,
>
> Shao
>
>
>
> On Tue, 28 Jul 2020 at 16:42, Michael Lewis <mlewis@entrata.com 
> <mailto:mlewis@entrata.com>> wrote:
>
>     On Tue, Jul 28, 2020 at 7:59 AM Shaozhong SHI
>     <shishaozhong@gmail.com <mailto:shishaozhong@gmail.com>> wrote:
>
>         Hi, Gavin,
>
>         PostgreSQL Version - 10.11. PostGIS Version - 2.4.7. Operating
>         System - Red Hat Enterprise Linux 7.7 .
>
>         That is all I know at the moment.
>
>         As I understand, our IT staff is building another one to sit
>         on Azure.
>
>
>     Please don't top-post in these mailing lists.
>
Shaozhong, you're top posting again!

Although, sometimes it is appropriate to intersperse your comments in 
the previos email, like I this


>
>     Can you share the results of the below?
>
>     select name, setting, source from pg_settings where source <>
>     'default';
>
Basically you're being asked to put your reply after the rest of the 
email, like I am doing here!



Cheers,
Gavin




Re: Issues of slow running queries when dealing with Big Data

From
Shaozhong SHI
Date:


On Tue, 28 Jul 2020 at 20:36, Gavin Flower <GavinFlower@archidevsys.co.nz> wrote:
On 29/07/2020 03:51, Shaozhong SHI wrote:
> Hi,
>
> Please find the result of select name, setting, source from
> pg_settings where source <> 'default';
>
> Regards,
>
> Shao
>
>
>
> On Tue, 28 Jul 2020 at 16:42, Michael Lewis <mlewis@entrata.com
> <mailto:mlewis@entrata.com>> wrote:
>
>     On Tue, Jul 28, 2020 at 7:59 AM Shaozhong SHI
>     <shishaozhong@gmail.com <mailto:shishaozhong@gmail.com>> wrote:
>
>         Hi, Gavin,
>
>         PostgreSQL Version - 10.11. PostGIS Version - 2.4.7. Operating
>         System - Red Hat Enterprise Linux 7.7 .
>
>         That is all I know at the moment.
>
>         As I understand, our IT staff is building another one to sit
>         on Azure.
>
>
>     Please don't top-post in these mailing lists.
>
Shaozhong, you're top posting again!

Although, sometimes it is appropriate to intersperse your comments in
the previos email, like I this


>
>     Can you share the results of the below?
>
>     select name, setting, source from pg_settings where source <>
>     'default';
>
Basically you're being asked to put your reply after the rest of the
email, like I am doing here!



Cheers,
Gavin

Hi, Gavin,

Many thanks.  Is this way correct?  I am learning what is meant by 'top posting'.

Regards,

Shao 

Re: Issues of slow running queries when dealing with Big Data

From
Michael Lewis
Date:
Many thanks.  Is this way correct?  I am learning what is meant by 'top posting'.

Yes.

On the subject of your settings, I don't see anything for work_mem, random_page_cost and other commonly tuned parameters. That would be a good start. What sort of machine specs are there for Postgres instance? Cpu, ram, disk type (spinning disk, ssd, etc)? Are any other applications running on this machine? 

Do you have an example query and explain analyze output? Are you aware of https://explain.depesz.com/ ?

Re: Issues of slow running queries when dealing with Big Data

From
Shaozhong SHI
Date:


On Tue, 28 Jul 2020 at 22:43, Michael Lewis <mlewis@entrata.com> wrote:
Many thanks.  Is this way correct?  I am learning what is meant by 'top posting'.

Yes.

On the subject of your settings, I don't see anything for work_mem, random_page_cost and other commonly tuned parameters. That would be a good start. What sort of machine specs are there for Postgres instance? Cpu, ram, disk type (spinning disk, ssd, etc)? Are any other applications running on this machine? 

Do you have an example query and explain analyze output? Are you aware of https://explain.depesz.com/ ?


Hi, Michael,

Many thanks for pointing me to the right direction.   I do not know the machine specs myself.  We have an IT person who helps my team.  

As I understand, he is building a new PostGIS on Azure for us.  I will ask him.

I would like to understand the matter much better for various reasons.  If you are aware of technical papers or articles on the subject, it will be useful to me.

I have been discussing the matter with another manager.  I am also thinking about how to frame up a project as a formal project for the IT person.  If you have ideas, please let me know.  Basically, approaches and methods, areas to be investigated for tuning up performance.  So that, I can pass the information to him.  He is technically excellent, but has not got much experience in turning PostGIS for improving performance. 

I need to do such type of work, in order to not only solve the problem, but provide justifications to secure his job.

I hope that you understand.

Regards,

Shao