Issues of slow running queries when dealing with Big Data

Started by Shaozhong SHIover 5 years ago9 messagesgeneral
Jump to latest
#1Shaozhong SHI
shishaozhong@gmail.com

It has been found that issues occur when Big Data is being handled with
PostGIS. Typically, queries can be killed by the system or memory is out.
Often, queries can be very slow. Sometimes, it will take days or weeks to
complete.

What are the best approaches and means for improving the performance of
queries and processes in PostgreSQL/PostGIS?

Can anyone shed light on this?

Regards,

Shao

#2Gavin Flower
GavinFlower@archidevsys.co.nz
In reply to: Shaozhong SHI (#1)
Re: Issues of slow running queries when dealing with Big Data

On 28/07/2020 22:54, Shaozhong SHI wrote:

It has been found that issues occur when Big Data is being handled
with PostGIS. Typically, queries can be killed by the system or memory
is out.  Often, queries can be very slow.  Sometimes, it will take
days or weeks to complete.

What are the best approaches and means for improving the performance
of queries and processes in PostgreSQL/PostGIS?

Can anyone shed light on this?

Regards,

Shao

Probably helps if you can give us more details!

Such as O/S, versions of PostgreSQL/PostGIS, and hardware used. Plus
anything else you think might be relevant.

Cheers,
Gavin

#3Shaozhong SHI
shishaozhong@gmail.com
In reply to: Gavin Flower (#2)
Re: Issues of slow running queries when dealing with Big Data

Hi, Gavin,

PostgreSQL Version - 10.11. PostGIS Version - 2.4.7. Operating System - Red
Hat Enterprise Linux 7.7 .

That is all I know at the moment.

As I understand, our IT staff is building another one to sit on Azure.

Regards,

Shao

On Tue, 28 Jul 2020 at 12:31, Gavin Flower <GavinFlower@archidevsys.co.nz>
wrote:

Show quoted text

On 28/07/2020 22:54, Shaozhong SHI wrote:

It has been found that issues occur when Big Data is being handled
with PostGIS. Typically, queries can be killed by the system or memory
is out. Often, queries can be very slow. Sometimes, it will take
days or weeks to complete.

What are the best approaches and means for improving the performance
of queries and processes in PostgreSQL/PostGIS?

Can anyone shed light on this?

Regards,

Shao

Probably helps if you can give us more details!

Such as O/S, versions of PostgreSQL/PostGIS, and hardware used. Plus
anything else you think might be relevant.

Cheers,
Gavin

#4Michael Lewis
mlewis@entrata.com
In reply to: Shaozhong SHI (#3)
Re: Issues of slow running queries when dealing with Big Data

On Tue, Jul 28, 2020 at 7:59 AM Shaozhong SHI <shishaozhong@gmail.com>
wrote:

Hi, Gavin,

PostgreSQL Version - 10.11. PostGIS Version - 2.4.7. Operating System -
Red Hat Enterprise Linux 7.7 .

That is all I know at the moment.

As I understand, our IT staff is building another one to sit on Azure.

Please don't top-post in these mailing lists.

Can you share the results of the below?

select name, setting, source from pg_settings where source <> 'default';

#5Shaozhong SHI
shishaozhong@gmail.com
In reply to: Michael Lewis (#4)
Re: Issues of slow running queries when dealing with Big Data

Hi,

Please find the result of select name, setting, source from pg_settings
where source <> 'default';

Regards,

Shao

On Tue, 28 Jul 2020 at 16:42, Michael Lewis <mlewis@entrata.com> wrote:

Show quoted text

On Tue, Jul 28, 2020 at 7:59 AM Shaozhong SHI <shishaozhong@gmail.com>
wrote:

Hi, Gavin,

PostgreSQL Version - 10.11. PostGIS Version - 2.4.7. Operating System -
Red Hat Enterprise Linux 7.7 .

That is all I know at the moment.

As I understand, our IT staff is building another one to sit on Azure.

Please don't top-post in these mailing lists.

Can you share the results of the below?

select name, setting, source from pg_settings where source <> 'default';

Attachments:

1.PNGimage/png; name=1.PNGDownload
2.PNGimage/png; name=2.PNGDownload
3.PNGimage/png; name=3.PNGDownload
#6Gavin Flower
GavinFlower@archidevsys.co.nz
In reply to: Shaozhong SHI (#5)
Re: Issues of slow running queries when dealing with Big Data

On 29/07/2020 03:51, Shaozhong SHI wrote:

Hi,

Please find the result of select name, setting, source from
pg_settings where source <> 'default';

Regards,

Shao

On Tue, 28 Jul 2020 at 16:42, Michael Lewis <mlewis@entrata.com
<mailto:mlewis@entrata.com>> wrote:

On Tue, Jul 28, 2020 at 7:59 AM Shaozhong SHI
<shishaozhong@gmail.com <mailto:shishaozhong@gmail.com>> wrote:

Hi, Gavin,

PostgreSQL Version - 10.11. PostGIS Version - 2.4.7. Operating
System - Red Hat Enterprise Linux 7.7 .

That is all I know at the moment.

As I understand, our IT staff is building another one to sit
on Azure.

Please don't top-post in these mailing lists.

Shaozhong, you're top posting again!

Although, sometimes it is appropriate to intersperse your comments in
the previos email, like I this

Can you share the results of the below?

select name, setting, source from pg_settings where source <>
'default';

Basically you're being asked to put your reply after the rest of the
email, like I am doing here!

Cheers,
Gavin

#7Shaozhong SHI
shishaozhong@gmail.com
In reply to: Gavin Flower (#6)
Re: Issues of slow running queries when dealing with Big Data

On Tue, 28 Jul 2020 at 20:36, Gavin Flower <GavinFlower@archidevsys.co.nz>
wrote:

On 29/07/2020 03:51, Shaozhong SHI wrote:

Hi,

Please find the result of select name, setting, source from
pg_settings where source <> 'default';

Regards,

Shao

On Tue, 28 Jul 2020 at 16:42, Michael Lewis <mlewis@entrata.com
<mailto:mlewis@entrata.com>> wrote:

On Tue, Jul 28, 2020 at 7:59 AM Shaozhong SHI
<shishaozhong@gmail.com <mailto:shishaozhong@gmail.com>> wrote:

Hi, Gavin,

PostgreSQL Version - 10.11. PostGIS Version - 2.4.7. Operating
System - Red Hat Enterprise Linux 7.7 .

That is all I know at the moment.

As I understand, our IT staff is building another one to sit
on Azure.

Please don't top-post in these mailing lists.

Shaozhong, you're top posting again!

Although, sometimes it is appropriate to intersperse your comments in
the previos email, like I this

Can you share the results of the below?

select name, setting, source from pg_settings where source <>
'default';

Basically you're being asked to put your reply after the rest of the
email, like I am doing here!

Cheers,
Gavin

Hi, Gavin,

Many thanks. Is this way correct? I am learning what is meant by 'top
posting'.

Regards,

Shao

#8Michael Lewis
mlewis@entrata.com
In reply to: Shaozhong SHI (#7)
Re: Issues of slow running queries when dealing with Big Data

Many thanks. Is this way correct? I am learning what is meant by 'top
posting'.

Yes.

On the subject of your settings, I don't see anything for work_mem,
random_page_cost and other commonly tuned parameters. That would be a good
start. What sort of machine specs are there for Postgres instance? Cpu,
ram, disk type (spinning disk, ssd, etc)? Are any other applications
running on this machine?

Do you have an example query and explain analyze output? Are you aware of
https://explain.depesz.com/ ?

#9Shaozhong SHI
shishaozhong@gmail.com
In reply to: Michael Lewis (#8)
Re: Issues of slow running queries when dealing with Big Data

On Tue, 28 Jul 2020 at 22:43, Michael Lewis <mlewis@entrata.com> wrote:

Many thanks. Is this way correct? I am learning what is meant by 'top

posting'.

Yes.

On the subject of your settings, I don't see anything for work_mem,
random_page_cost and other commonly tuned parameters. That would be a good
start. What sort of machine specs are there for Postgres instance? Cpu,
ram, disk type (spinning disk, ssd, etc)? Are any other applications
running on this machine?

Do you have an example query and explain analyze output? Are you aware of
https://explain.depesz.com/ ?

Hi, Michael,

Many thanks for pointing me to the right direction. I do not know the
machine specs myself. We have an IT person who helps my team.

As I understand, he is building a new PostGIS on Azure for us. I will ask
him.

I would like to understand the matter much better for various reasons. If
you are aware of technical papers or articles on the subject, it will be
useful to me.

I have been discussing the matter with another manager. I am also thinking
about how to frame up a project as a formal project for the IT person. If
you have ideas, please let me know. Basically, approaches and methods,
areas to be investigated for tuning up performance. So that, I can pass
the information to him. He is technically excellent, but has not got much
experience in turning PostGIS for improving performance.

I need to do such type of work, in order to not only solve the problem, but
provide justifications to secure his job.

I hope that you understand.

Regards,

Shao