Same query, same data different plan

Started by Kostas Papadopoulosover 3 years ago15 messagesgeneral
Jump to latest
#1Kostas Papadopoulos
kostas@methodosit.com.cy

I have two identical databases running in the same instance of Postgresql. Ran
analyze on both. Running the same query I'm getting different plans, one x10 slower.
Although I have solved my problem by re-writing the query, I want to understand why
this is happening. If the configuration, Postgresql version, schema and data are the
same, what other factors is the planner considering?

--

Kostas Papadopoulos
KE MethodosIT

#2Daevor The Devoted
dollien@gmail.com
In reply to: Kostas Papadopoulos (#1)
Re: Same query, same data different plan

Hi

Is the table stats up to date on both?

https://www.postgresql.org/docs/current/planner-stats.html

Best regards,
Na-iem Dollie

On Mon, Oct 10, 2022 at 2:56 PM Kostas Papadopoulos <
kostas@methodosit.com.cy> wrote:

Show quoted text

I have two identical databases running in the same instance of Postgresql.
Ran
analyze on both. Running the same query I'm getting different plans, one
x10 slower.
Although I have solved my problem by re-writing the query, I want to
understand why
this is happening. If the configuration, Postgresql version, schema and
data are the
same, what other factors is the planner considering?

--

Kostas Papadopoulos
KE MethodosIT

#3Kostas Papadopoulos
kostas@methodosit.com.cy
In reply to: Daevor The Devoted (#2)
Re: Same query, same data different plan

Hi,

Yes, I ran ANALYZE in both databases.

Kostas

Show quoted text

On 10/10/2022 16:03, Daevor The Devoted wrote:

Hi

Is the table stats up to date on both?

https://www.postgresql.org/docs/current/planner-stats.html

Best regards,
Na-iem Dollie

On Mon, Oct 10, 2022 at 2:56 PM Kostas Papadopoulos <
kostas@methodosit.com.cy> wrote:

I have two identical databases running in the same instance of Postgresql.
Ran
analyze on both. Running the same query I'm getting different plans, one
x10 slower.
Although I have solved my problem by re-writing the query, I want to
understand why
this is happening. If the configuration, Postgresql version, schema and
data are the
same, what other factors is the planner considering?

--

Kostas Papadopoulos
KE MethodosIT

#4Julien Rouhaud
rjuju123@gmail.com
In reply to: Kostas Papadopoulos (#3)
Re: Same query, same data different plan

On Mon, Oct 10, 2022 at 04:05:42PM +0300, Kostas Papadopoulos wrote:

Hi,

Yes, I ran ANALYZE in both databases.

Please look at https://wiki.postgresql.org/wiki/Slow_Query_Questions to provide
more information.

#5Kostas Papadopoulos
kostas@methodosit.com.cy
In reply to: Julien Rouhaud (#4)
Re: Same query, same data different plan

Hi,

Thank you for responding. My question is not about the performance of a specific
query. As I wrote, that is already solved.

My question is "how can it be that the same query run in two exactly the same
databases can have different plans."

Kostas Papadopoulos

Show quoted text

On 10/10/2022 16:12, Julien Rouhaud wrote:

On Mon, Oct 10, 2022 at 04:05:42PM +0300, Kostas Papadopoulos wrote:

Hi,

Yes, I ran ANALYZE in both databases.

Please look at https://wiki.postgresql.org/wiki/Slow_Query_Questions to provide
more information.

#6Pavel Stehule
pavel.stehule@gmail.com
In reply to: Julien Rouhaud (#4)
Re: Same query, same data different plan

po 10. 10. 2022 v 15:12 odesílatel Julien Rouhaud <rjuju123@gmail.com>
napsal:

On Mon, Oct 10, 2022 at 04:05:42PM +0300, Kostas Papadopoulos wrote:

Hi,

Yes, I ran ANALYZE in both databases.

This can be a common case. Check your configuration: work_mem,
shared_buffers, effective_cache_size, random_page_cost, seq_page_cost, ...

Show quoted text

Please look at https://wiki.postgresql.org/wiki/Slow_Query_Questions to
provide
more information.

#7Kostas Papadopoulos
kostas@methodosit.com.cy
In reply to: Pavel Stehule (#6)
Re: Same query, same data different plan

Hi,

I cannot see how it can be configuration since the two databases are in the same
Postgres instance.

Kostas Papadopoulos

Show quoted text

On 10/10/2022 16:16, Pavel Stehule wrote:

po 10. 10. 2022 v 15:12 odesílatel Julien Rouhaud <rjuju123@gmail.com>
napsal:

On Mon, Oct 10, 2022 at 04:05:42PM +0300, Kostas Papadopoulos wrote:

Hi,

Yes, I ran ANALYZE in both databases.

This can be a common case. Check your configuration: work_mem,
shared_buffers, effective_cache_size, random_page_cost, seq_page_cost, ...

Please look at https://wiki.postgresql.org/wiki/Slow_Query_Questions to
provide
more information.

#8Imre Samu
pella.samu@gmail.com
In reply to: Kostas Papadopoulos (#1)
Re: Same query, same data different plan

Ran analyze on both. Running the same query I'm getting different plans,

one x10 slower.

theory:

the "statistics target" is too low ?
THEN
different random sample --> different statistics ---> different plan,.

*"For large tables, ANALYZE takes a random sample of the table contents,
rather than examining every row. "*
https://www.postgresql.org/docs/current/sql-analyze.html

IMHO: Try to increase the statistics target (
"default_statistics_target" , .. )

regards,
Imre

Kostas Papadopoulos <kostas@methodosit.com.cy> ezt írta (időpont: 2022.
okt. 10., H, 14:56):

Show quoted text

I have two identical databases running in the same instance of Postgresql.
Ran
analyze on both. Running the same query I'm getting different plans, one
x10 slower.
Although I have solved my problem by re-writing the query, I want to
understand why
this is happening. If the configuration, Postgresql version, schema and
data are the
same, what other factors is the planner considering?

--

Kostas Papadopoulos
KE MethodosIT

#9Ron
ronljohnsonjr@gmail.com
In reply to: Kostas Papadopoulos (#5)
Re: Same query, same data different plan

How identical is "identical"?

For example, does diff says that "pg_dump --schema-only" of DB1 and DB2 are
perfectly identical?

And are the table counts identical?

On 10/10/22 08:15, Kostas Papadopoulos wrote:

Hi,

Thank you for responding. My question is not about the performance of a
specific query. As I wrote, that is already solved.

My question is "how can it be that the same query run in two exactly the
same databases can have different plans."

Kostas Papadopoulos

On 10/10/2022 16:12, Julien Rouhaud wrote:

On Mon, Oct 10, 2022 at 04:05:42PM +0300, Kostas Papadopoulos wrote:

Hi,

Yes, I ran ANALYZE in both databases.

Please look at https://wiki.postgresql.org/wiki/Slow_Query_Questions to
provide
more information.

--
Angular momentum makes the world go 'round.

#10Kostas Papadopoulos
kostas@methodosit.com.cy
In reply to: Ron (#9)
Re: Same query, same data different plan

On 10/10/2022 16:44, Ron wrote:

How identical is "identical"?

For example, does diff says that "pg_dump --schema-only" of DB1 and DB2 are perfectly
identical?

And are the table counts identical?

I created the second database using pg_dump from the first, so they should be exactly
the same.

Show quoted text

On 10/10/22 08:15, Kostas Papadopoulos wrote:

Hi,

Thank you for responding. My question is not about the performance of a specific
query. As I wrote, that is already solved.

My question is "how can it be that the same query run in two exactly the same
databases can have different plans."

Kostas Papadopoulos

On 10/10/2022 16:12, Julien Rouhaud wrote:

On Mon, Oct 10, 2022 at 04:05:42PM +0300, Kostas Papadopoulos wrote:

Hi,

Yes, I ran ANALYZE in both databases.

Please look at https://wiki.postgresql.org/wiki/Slow_Query_Questions to provide
more information.

#11Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Julien Rouhaud (#4)
Re: Same query, same data different plan

On 10/10/22 06:12, Julien Rouhaud wrote:

On Mon, Oct 10, 2022 at 04:05:42PM +0300, Kostas Papadopoulos wrote:

Hi,

Yes, I ran ANALYZE in both databases.

Please look at https://wiki.postgresql.org/wiki/Slow_Query_Questions to provide
more information.

Without the information, as detailed at above link, an answer to this
question will be nothing more then guesses.

--
Adrian Klaver
adrian.klaver@aklaver.com

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Kostas Papadopoulos (#7)
Re: Same query, same data different plan

Kostas Papadopoulos <kostas@methodosit.com.cy> writes:

I cannot see how it can be configuration since the two databases are in the same
Postgres instance.

There is such a thing as ALTER DATABASE ... SET to install different
settings at the per-database level.

In general, the answer to your question is that the databases are
*not* identical. You just haven't figured out how yet. I'm wondering
if it has something to do with the dump/reload having compacted out
bloat in the tables or indexes, causing cost estimates to change.

regards, tom lane

#13Kostas Papadopoulos
kostas@methodosit.com.cy
In reply to: Tom Lane (#12)
Re: Same query, same data different plan

On 10/10/2022 17:53, Tom Lane wrote:

Kostas Papadopoulos <kostas@methodosit.com.cy> writes:

I cannot see how it can be configuration since the two databases are in the same
Postgres instance.

There is such a thing as ALTER DATABASE ... SET to install different
settings at the per-database level.

I understand, but I created the databases to be the same. Our original problem was
that developers' workstations (Debian and Windows) were running a specific query
different from a test db (Ubuntu). After eliminating everything we thought of (data,
versions, configurations, OS, etc) we ended up with the scenario I described here.

In general, the answer to your question is that the databases are
*not* identical. You just haven't figured out how yet. I'm wondering
if it has something to do with the dump/reload having compacted out
bloat in the tables or indexes, causing cost estimates to change.

I will look into that and a couple of other ideas I got from this list.

regards, tom lane

Thanks
kostas

#14Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Kostas Papadopoulos (#13)
Re: Same query, same data different plan

On 10/10/22 8:12 AM, Kostas Papadopoulos wrote:

On 10/10/2022 17:53, Tom Lane wrote:

Kostas Papadopoulos <kostas@methodosit.com.cy> writes:

I cannot see how it can be configuration since the two databases are
in the same
Postgres instance.

There is such a thing as ALTER DATABASE ... SET to install different
settings at the per-database level.

I understand, but I created the databases to be the same. Our original
problem was that developers' workstations (Debian and Windows) were
running a specific query different from a test db (Ubuntu). After
eliminating everything we thought of (data, versions, configurations,
OS, etc) we ended up with the scenario I described here.

So there is more to the story.

Information needed:

1) The query and its EXPLAIN ANALYZE for both slow/fast cases.

2) Postgres version.

3) What database are the developers workstation pointing at?

4) What is the test db and is it the same as 3)?

5) What clients are you using to run the query?

In general, the answer to your question is that the databases are
*not* identical.  You just haven't figured out how yet.  I'm wondering
if it has something to do with the dump/reload having compacted out
bloat in the tables or indexes, causing cost estimates to change.

I will look into that and a couple of other ideas I got from this list.

            regards, tom lane

Thanks
kostas

--
Adrian Klaver
adrian.klaver@aklaver.com

#15Kostas Papadopoulos
kostas@methodosit.com.cy
In reply to: Adrian Klaver (#14)
Re: Same query, same data different plan

Hi Adrian,

On 10/10/2022 20:59, Adrian Klaver wrote:

Information needed:

1) The query and its EXPLAIN ANALYZE for both slow/fast cases.

2) Postgres version.

3) What database are the developers workstation pointing at?

4) What is the test db and is it the same as 3)?

5) What clients are you using to run the query?

Thanks for the interest. I was just looking for pointers on what to look at next. Got
that from Tom (table bloat) and Imre (different random sample of analyze).

Regards
kostas