FDW, too long to run explain
Hi,
with pg v10.1
I have a setup as below:
7 shards (1 RW, 2 RO each),
all fronted by FDW talking to each other.
we write directly to the shards, and read via FDW from all shards (RO)
our DB size is ~ 500GB each shard, and tables are huge too.
1 table ~200GB, 1 ~55GB, 1 ~40GB, and a lot of small tables, but large
indices on the large tables.
the sharding was done based on a key to enable shard isolation at app layer
using a fact table.
select id,shard from fact_table;
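roughly, the directory mapping looks like this (DDL is just a sketch, names made up):

```sql
-- hypothetical directory table: one row per entity, pointing at its shard
CREATE TABLE fact_table (
    id    bigint PRIMARY KEY,
    shard text   NOT NULL      -- e.g. 'sh01' .. 'sh07'
);

-- the app looks up the shard before routing a read/write
SELECT shard FROM fact_table WHERE id = 12345;
```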
server resources are,
32GB mem, 8 vcpu, 500GB SSD.
the shards connect to each other via FDW, fronted by haproxy -> pgbouncer
-> postgresql.
Hope this is good enough background :)
now we have some long-running queries via FDW that take minutes and get
killed; the explain runs as idle in transaction on the remote servers. (we
set use_remote_estimate = true)
when the query is run on individual shards directly, it runs pretty
quickly,
but when run via FDW, it takes very long.
i even altered fetch_size to 10000, so that in case some filters do not get
pushed down, they can be applied on the FDW side quickly.
but i am lost at understanding why explain runs forever via FDW.
we have a view on the remote servers. we import the public schema from each
remote server into a custom schema on the coordinator, and then union all:
select * from (
select * from sh01.view1
union all
select * from sh02.view1
...
) t where t.foo = 'bar' limit 10;
now the explain for
select * from sh01.view1 keeps running for minutes sometimes,
then fetch too keeps running for minutes, although the total rows are <
10000 maybe.
idle in transaction | FETCH 10000 FROM c1
we have very aggressive settings for autovacuum and auto analyze.
autovacuum_naptime = '15s'
autovacuum_vacuum_scale_factor = '0.001'
autovacuum_analyze_scale_factor = '0.005'
log_autovacuum_min_duration = '0'
maintenance_work_mem = '2GB'
autovacuum_vacuum_cost_limit = '5000'
autovacuum_vacuum_cost_delay = '5ms'
other questions:
also, what is the cost of fetch_size?
we have in our settings => use_remote_estimate=true,fetch_size=10000
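i.e. the options are applied per foreign server, roughly like this (server name made up):

```sql
-- postgres_fdw server options, as we set them (sketch)
ALTER SERVER sh01
    OPTIONS (ADD use_remote_estimate 'true', ADD fetch_size '10000');
```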
I mean given we have a query
select * from foobar limit 10000; via FDW
limit 10000 does not get pushed.
so it seems all rows come to the FDW node and then the limit is applied?
i currently do not have the queries, but i have a screenshot of a long
running explain via FDW.
also, since the whole query does not show up in pg_stat_statements, i am not
sure that would be of great help; the predicates, although applied, do not
show up in pg_stat_activity.
I know, there can be more info i can provide, but if anyone has
experienced this, pls let me know.
BTW, i know citus is an option, but let's keep that option aside.
we see better ways to handle this in the future, by sharding on ids, further
partitioning the tables, and parallel execution of FDW queries, but we need
to know if this is a known issue of pg10 or whether i am doing something
wrong that will bite in pg11 too.
Appreciate your help, always.
Regards,
Vijay
On Monday, 4 February 2019 09:14:14 EET Vijaykumar Jain wrote:
now we have some long running queries via FDW that take minutes and get
killed; explain runs as idle in transaction on remote servers. (we set
use_remote_estimate = true)
when the query is run on individual shards directly, it runs pretty
quickly, but when run via FDW, it takes very long.
In general, the plans via FDW are not the same as the ones running locally. We're having similar issues and the reason seems to be that queries via FDW are optimized for startup cost or few rows.
Regards,
Vijay
--
Regards,
Peter
Related to this question:
Postgresql cursors are in most cases I've tried extremely slow. The cause is as described in my previous answer, in my experience. Is there any plan or way to improve this situation? For example, for FDW one would expect the plan on the remote side to be similar, if not identical, to the one locally, with the exception of the setup cost.
--
Regards,
Peter
I am yet to figure out the reason; what we have done is implement fake
columns to represent samples, giving them random numbers, and filtering on
them to fake a limit.
Most of the queries that were impacted were the ones that did not push
order by and limit to the foreign servers.
I am also trying to upgrade to pg11 to make use of parallelisation.
For now I am making use of materialised view on each shard and using
predicates that get pushed directly to ensure a simple plan is created.
There is a compromise but this is what is reasonable for now.
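as a sketch of that workaround (all names hypothetical):

```sql
-- per-shard materialized view holding only recent, pre-filtered rows, so the
-- remote plan is a trivial scan and the predicate pushes down cleanly
CREATE MATERIALIZED VIEW mv_view1 AS
SELECT id, foo, scheduled_bdt
FROM   big_table
WHERE  scheduled_bdt >= now() - interval '30 days';

CREATE UNIQUE INDEX ON mv_view1 (id);  -- required for REFRESH ... CONCURRENTLY
CREATE INDEX ON mv_view1 (foo);

REFRESH MATERIALIZED VIEW CONCURRENTLY mv_view1;
```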
--
Regards,
Vijay
On Mon, Feb 4, 2019 at 2:15 AM Vijaykumar Jain <vjain@opentable.com> wrote:
now we have some long running queries via FDW that take minutes and get
killed explain runs as idle in transaction on remote servers.
Are you saying the EXPLAIN itself gets killed, or execution of the plan
generated based on the EXPLAIN (issued under use_remote_estimate = true)
gets killed? Who is doing the killing, the local side or the foreign
side? Can you include verbatim log entries for this?
now the explain for
select * from sh01.view1 keeps running for minutes sometimes, then fetch
too keeps running for minutes, although the total rows are <
10000 maybe.
idle in transaction | FETCH 10000 FROM c1
What is this? Is it from some monitoring tool, or pg_stat_activity, or
what? And is it on the local side or the foreign side?
other questions:
also, what is the cost of fetch_size?
It will always fetch rows from the foreign server in chunks of this size. A
larger fetch_size will have less network latency and computational overhead
if many rows are going to be consumed, but also consumes more memory on the
local server, as all rows of each chunk are held in memory. Also, in the
case of a LIMIT, it reads a large number of rows even if most of them may
be unneeded. Conceptually, the LIMIT could be used to modify the FETCH
downward to match the LIMIT, but that is not implemented. In the case of a
view over UNION ALL, I don't think the individual subqueries even know what
the global LIMIT is.
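In other words, the number of round trips is just ceil(rows consumed / fetch_size), e.g.:

```sql
-- 25000 rows consumed at fetch_size = 10000 means 3 FETCHes:
SELECT ceil(25000 / 10000.0) AS round_trips;  -- => 3
```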
I mean given we have a query
select * from foobar limit 10000; via FDW
limit 10000 does not get pushed.
so it seems all rows come to the FDW node and then the limit is applied?
It should not read all rows. It should read as many multiples of
fetch_size as needed, which should just be 1 multiple in this case.
Cheers,
Jeff
Assuming your questions as 1,2,3, please find my answers below.
1) "explain" on the foreign servers shows as "idle in transaction". because
they were running very long (on the order of minutes), the pgbouncer (in
tx-level pooling) setting kills them (idle-in-tx time limit of 5 mins
exceeded), or else they result in too many connections piling up.
2)yes, i get those from pg_stat_activity, it truncates the full statement,
but it shows up as * EXPLAIN select col1, col2 .... * 00:00:44 | idle in
transaction (this is just one of the screenshots i have). (on the foreign
side)
3) yes, i think we kind of understood that part (fetch and memory), but i am
not sure if that is used as any hint in plan generation too. i am sorry, i
did not turn auto_explain on on the foreign servers, as that required a
restart of the server.
(this is the real content of the screenshot, yes 13 mins), masking the
colname and viewname:
20678 | 00:13:38.990025 | EXPLAIN SELECT cols from view | idle in
transaction
the explain analyze of the same query on the foreign server is in ms.
I am sorry i am vague about the queries in the email. i cannot reproduce
the issue, as we do not have multiple shards of 500G in my qa environment,
and i cannot take a dump of prod to test in our test env because of gdpr :)
but as i said in the mail, we were speculating that since the limit was not
pushed down, the plans may have been bad. We tricked the foreign server by
using a sample column to fake limit push-down, and now have improved
response times. We made vacuum/analyze very aggressive to ensure stats are
never stale after large updates or deletes.
Unless someone else can reproduce this, i guess i'll close this mail. (I'll
try to reproduce it myself again, but for now i have too little data to
share to convince anyone it happened.)
Regards,
Vijay
On Sun, Feb 17, 2019 at 12:41 PM Jeff Janes <jeff.janes@gmail.com> wrote:
After thinking about it a bit more, I think I see the issue here. The
EXPLAIN pursuant to use_remote_estimate is issued in the same remote
transaction as the following DECLARE and FETCH's are. But after the
EXPLAIN is issued, the local server executes the query for a different FDW
to satisfy some other branch of the UNION ALL, giving the first FDW
connection time to do an idle-in-transaction timeout. This happens even if
no rows need to be fetched from that FDW, because another branch of the UNION
ALL satisfied the LIMIT.
A question for the PostgreSQL hackers would be, Is it necessary and
desirable that the EXPLAIN be issued in the same transaction as the
eventual DECLARE and FETCHes? I don't think it is. I guess if the foreign
side table definition got changed between EXPLAIN and DECLARE it would
cause problems, but changing the foreign side definition out of sync with
the local side can cause problems anyway, so is that important to preserve?
Changing that might narrow but not completely fix the problem, as there
might still be delays between the DECLARE and the FETCH or between
successive FETCHes.
So a question for you would be, why do you have such an aggressive setting
for idle_in_transaction_session_timeout that it causes this to happen?
Couldn't you relax it, perhaps just for the role used for the FDW
connections?
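For example, something along these lines (role name hypothetical; if the killing is done by pgbouncer rather than the server, the equivalent knob there is idle_transaction_timeout):

```sql
-- server-side: relax the timeout only for the role postgres_fdw connects as
ALTER ROLE fdw_user SET idle_in_transaction_session_timeout = '15min';
```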
Cheers,
Jeff
Jeff Janes <jeff.janes@gmail.com> writes:
A question for the PostgreSQL hackers would be, Is it necessary and
desirable that the EXPLAIN be issued in the same transaction as the
eventual DECLARE and FETCHes? I don't think it is.
It seems like a good idea to me. I certainly don't think "I've got
an idle-in-transaction timeout on the remote that's shorter than my
local transaction runtime" is a plausible argument for changing that.
You could trip over that with a slow query regardless of whether we
separated the EXPLAIN step, just because there's no guarantee how
often we'll ask the FDW to fetch some rows.
I guess if the foreign
side table definition got changed between EXPLAIN and DECLARE it would
cause problems, but changing the foreign side definition out of sync with
the local side can cause problems anyway, so is that important to preserve?
I believe that the EXPLAIN will leave the remote transaction holding
AccessShareLock on the query's tables, meaning that doing it in one
transaction provides some positive protection against such problems,
which we'd lose if we changed this.
regards, tom lane
On Sun, Feb 17, 2019 at 1:52 PM Vijaykumar Jain <vjain@opentable.com> wrote:
Assuming your questions as 1,2,3, please find my answers below.
1)"explain" on foreign servers run as "idle in transactions". coz they
were running very long (in the order of some minutes) , pgbouncer (in tx
level pooling) setting kill them (as idle in tx time limit exceeded of 5
mins) or else results in too many connections piling up.
2)yes, i get those from pg_stat_activity, it truncates the full statement,
but it shows up as * EXPLAIN select col1, col2 .... * 00:00:44 | idle in
transaction (this is just one of the screenshots i have). (on the foreign
side)
You are misinterpreting that data. The EXPLAIN is not currently running.
It is the last statement that was running prior to the connection going
idle-in-transaction. See my just previous email--I think the reason it is
idle is that the local is servicing some other part of the query (probably
on a different FDW), and that is taking a long time.
Are all the connections piling up from postgres_fdw, or are many of them
from other applications? I think your timeout is just shifting symptoms
around without fixing the underlying problem, while also making that
underlying problem hard to diagnose.
3)yes, i think we kind of understood that part (fetch and memory), but i
am not sure if that is used as any hint in plan generation too.
The query is planned as part of a cursor. As such, it will
use cursor_tuple_fraction as the "hint". Perhaps you could tweak this
parameter on the foreign side. I think that a low setting for this
parameter should give similar plans as a small LIMIT would give you, while
large settings would give the same plans as a large (or no) LIMIT would.
I think postgres_fdw should pass down the LIMIT when it can do so, but it
doesn't currently.
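e.g., on the foreign side, something like this (role name hypothetical):

```sql
-- plan cursors for the full result set instead of fast startup (default 0.1)
ALTER ROLE fdw_user SET cursor_tuple_fraction = 1.0;
```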
Cheers,
Jeff
Hey Jeff,
yes, we have now relaxed the idle-in-transaction setting to 15 mins.
i was hesitant to increase the setting as it blocked autovacuum. We use
hot_standby_feedback = true also, as we split reads/writes and allow long
running queries on read replicas; this too affects autovacuum.
so overall, all the options i set to ensure autovacuum gets triggered are
impacted by the increased idle-in-tx timeout and hot_standby_feedback =
true, both of which seem to be necessary for the setup now.
we have been trying to make sharding work (multi-coordinator FDW) on
our own (and have been successful, although with hiccups), using
directory-based sharding in pg10. (if we cannot handle growth, everything
goes to mongo for its automatic sharding and failover)
I have to admit we can do better here, though. we need to rebalance the data
in the shards when we come close to 90% disk. those are long delete/upsert
queries. We have very aggressive autovacuum to ensure we do not have a lot
of stale stats.
I have plans to rearchitect the whole setup with pg11, where we plan to
introduce time-based sharding, then partition the tables in each shard
further by time, and also use materialized views for day-old data, with
pre-aggregated fields on each shard, so that explain does not have to work
too hard :)
and then create foreign tables and attach them as partitions. similar to
https://github.com/MasahikoSawada/pgconf-asia-demo/tree/c47e25bf589c7d401c9d342329b400ec26eb61db
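i.e. roughly this shape (sketch only, names made up):

```sql
-- pg11: a time-partitioned parent whose partitions are foreign tables on the shards
CREATE TABLE events (
    event_time timestamptz NOT NULL,
    payload    text
) PARTITION BY RANGE (event_time);

CREATE FOREIGN TABLE events_2019_01
    PARTITION OF events
    FOR VALUES FROM ('2019-01-01') TO ('2019-02-01')
    SERVER sh01
    OPTIONS (schema_name 'public', table_name 'events_2019_01');
```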
i guess, i am diverting the query, but just saying :)
Thanks for suggestions and help Jeff. Appreciate it.
Regards,
Vijay
Regards,
Vijay
On Mon, Feb 18, 2019 at 12:56 AM Jeff Janes <jeff.janes@gmail.com> wrote:
On Sun, Feb 17, 2019 at 1:52 PM Vijaykumar Jain <vjain@opentable.com> wrote:
Assuming your questions as 1,2,3, please find my answers below.
1)"explain" on foreign servers run as "idle in transactions". coz they
were running very long (in the order of some minutes) , pgbouncer (in tx
level pooling) setting kill them (as idle in tx time limit exceeded of 5
mins) or else results in too many connections piling up.
2)yes, i get those from pg_stat_activity, it truncates the full
statement, but it shows up as * EXPLAIN select col1, col2 .... * 00:00:44
| idle in transaction (this is just one of the screenshots i have). (on the
foreign side)
You are misinterpreting that data. The EXPLAIN is not currently running.
It is the last statement that was running prior to the connection going
idle-in-transaction. See my just previous email--I think the reason it is
idle is that the local is servicing some other part of the query (probably
on a different FDW), and that is taking a long time.
Ok, i raked this from the logs, where we enabled log_min_duration_statement =
10s:
2019-01-31 12:48:18 UTC LOG: duration: 29863.311 ms statement: EXPLAIN
SELECT blah, FROM public.view WHERE ((scheduled_bdt >= '2019-01-20'::date))
AND ((scheduled_bdt <= '2019-01-26'::date)) AND ((somekey = ANY
('{269029,123399,263164,261487}'::bigint[]))) (both the columns are
indexed)
Are all the connections piling up from postgres_fdw, or are many of them
from other applications? I think your timeout is just shifting symptoms
around without fixing the underlying problem, while also making that
underlying problem hard to diagnose.
same application, but when more than one person is using the analytical
tool that runs the underlying query.
3)yes, i think we kind of understood that part (fetch and memory), but i
am not sure if that is used as any hint in plan generation too.
The query is planned as part of a cursor. As such, it will
use cursor_tuple_fraction as the "hint". Perhaps you could tweak this
parameter on the foreign side. I think that a low setting for this
parameter should give similar plans as a small LIMIT would give you, while
large settings would give the same plans as a large (or no) LIMIT would.
I think postgres_fdw should pass down the LIMIT when it can do so, but it
doesn't currently.
As i already said, we have worked around the limit issue with a fake sample
column in the huge tables. that way we limit the number of rows on the
foreign server itself before the fetch. this is not the best and has its
edge cases, but yeah, it works for now.
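the trick, as a sketch (names made up):

```sql
-- persisted random bucket; the equality filter is pushed to the foreign
-- server, capping the rows scanned there before any FETCH happens
ALTER TABLE big_table ADD COLUMN sample_bucket int;
UPDATE big_table SET sample_bucket = (random() * 999)::int;
CREATE INDEX ON big_table (sample_bucket);

-- ~1/1000th of the table, filtered remotely (approximates a pushed-down LIMIT)
SELECT * FROM big_table WHERE sample_bucket = 0;
```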
Cheers,
Jeff
On Sunday, 17 February 2019 20:58:47 EET Jeff Janes wrote:
A question for the PostgreSQL hackers would be, Is it necessary and
desirable that the EXPLAIN be issued in the same transaction as the
eventual DECLARE and FETCHes? I don't think it is. I guess if the foreign
side table definition got changed between EXPLAIN and DECLARE it would
cause problems, but changing the foreign side definition out of sync with
the local side can cause problems anyway, so is that important to preserve?
Won't separate transactions cause issues if the statistics of the table change in the meantime in a way that affects the plan?
--
Regards,
Peter
On Sun, Feb 17, 2019 at 2:37 PM Vijaykumar Jain <vjain@opentable.com> wrote:
Ok, i raked this from the logs, where we enabled log_min_duration_statement =
10s:
2019-01-31 12:48:18 UTC LOG: duration: 29863.311 ms statement: EXPLAIN
SELECT blah, FROM public.view WHERE ((scheduled_bdt >= '2019-01-20'::date))
AND ((scheduled_bdt <= '2019-01-26'::date)) AND ((somekey = ANY
('{269029,123399,263164,261487}'::bigint[]))) (both the columns are
indexed)
That is interesting. Was that in the logs for the local or the foreign
side? And is it common, or rare?
If on the local side, could it be that the EXPLAINs sent to the foreign
side are being made to wait by the connection pooler, leading to long
delays? If that is from the foreign side, then it should be conceptually
unrelated to FDW. Any chance you could reproduce the slowness in your test
environment? Slowness in the planner is probably related to the schema
structure, not the data itself.
I don't think this would be related to the idle-in-transaction, except that
one FDW connection maybe idle-in-transaction after its EXPLAIN is done
because it is waiting for another FDW connection to slowly run its EXPLAIN.
Cheers,
Jeff
Ok.
I’ll try to work on it this week and see if i am able to reproduce anything.
--
Regards,
Vijay
On Sun, Feb 17, 2019 at 6:32 AM Vijaykumar Jain <vjain@opentable.com> wrote:
I am yet to figure out the reason; what we have done is implement fake
columns to represent samples, giving them random numbers, and filtering on
them to fake a limit.
Most of the queries that were impacted were the ones that did not push
order by and limit to foreign servers.
I am also trying to upgrade pg11 to make use of parallelisation.
postgres_fdw operates through declared cursors, and declared cursors
inhibit parallel query. This doesn't change in v11, see
https://www.postgresql.org/docs/11/when-can-parallel-query-be-used.html
I'm not aware of any other changes in v11 that are likely to help you out.
Cheers,
Jeff
Show quoted text
Oh wow, i guess you are right.
I just ran an example where the local runs make use of a parallel setup, but
not FDW.
i have three servers
2 x pg10
1 x pg11
i run queries on the coordinator node (pg11), which makes calls to the
foreign servers to do a simple count.
the individual nodes run the query in parallel, the setup is repeatable.
but via FDW it runs a simple seq scan.
i guess this is for the same reason as you mentioned wrt declared cursors.
on pg11
create schema pg10;
create schema pg10_qa;
import foreign schema pg10 from server pg10 into pg10;
import foreign schema pg10_qa from server pg10_qa into pg10_qa;
explain (analyze,verbose) SELECT COUNT(1) FROM pg10.tbl_ItemTransactions;
----this query is via FDW
QUERY PLAN
----------------------------------------------------------------------------------------------------
Foreign Scan (cost=108.53..152.69 rows=1 width=8) (actual
time=6584.498..6584.500 rows=1 loops=1)
Output: (count(1))
Relations: Aggregate on (pg10.tbl_itemtransactions)
Remote SQL: SELECT count(1) FROM pg10.tbl_itemtransactions
Planning Time: 0.112 ms
Execution Time: 6585.435 ms
(6 rows)
2019-02-18 09:56:48 UTC LOG: duration: 6593.046 ms plan:
Query Text: DECLARE c1 CURSOR FOR
SELECT count(1) FROM pg10.tbl_itemtransactions
Aggregate (cost=768694.80..768694.81 rows=1 width=8) (actual
time=6593.039..6593.039 rows=1 loops=1)
Output: count(1)
Buffers: shared hit=259476
-> Seq Scan on pg10.tbl_itemtransactions (cost=0.00..666851.04
rows=40737504 width=0) (actual time=0.024..3389.245 rows=40737601 loops=1)
Output: tranid, transactiondate, transactionname
Buffers: shared hit=259476
--------
on pg10 (1) -- foreign server pg10
create schema pg10;
CREATE TABLE pg10.tbl_ItemTransactions
(
TranID SERIAL
,TransactionDate TIMESTAMPTZ
,TransactionName TEXT
);
INSERT INTO pg10.tbl_ItemTransactions
(TransactionDate, TransactionName)
SELECT x, 'dbrnd'
FROM generate_series('2014-01-01 00:00:00'::timestamptz, '2016-08-01
00:00:00'::timestamptz,'2 seconds'::interval) a(x);
explain analyze SELECT count(1) FROM pg10.tbl_itemtransactions; --this
query is local
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=472650.72..472650.73 rows=1 width=8) (actual
time=2576.053..2576.054 rows=1 loops=1)
-> Gather (cost=472650.50..472650.71 rows=2 width=8) (actual
time=2575.721..2626.980 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=471650.50..471650.51 rows=1 width=8)
(actual time=2569.302..2569.302 rows=1 loops=3)
-> Parallel Seq Scan on tbl_itemtransactions
(cost=0.00..429215.60 rows=16973960 width=0) (actual time=0.048..1492.144
rows=13579200 loops=3)
Planning time: 0.405 ms
Execution time: 2627.455 ms
(8 rows)
--------
on pg10 (2) -- foreign server pg10_qa
create schema pg10_qa;
CREATE TABLE pg10_qa.tbl_ItemTransactions
(
TranID SERIAL
,TransactionDate TIMESTAMPTZ
,TransactionName TEXT
);
INSERT INTO pg10_qa.tbl_ItemTransactions
(TransactionDate, TransactionName)
SELECT x, 'dbrnd'
FROM generate_series('2014-01-01 00:00:00'::timestamptz, '2016-08-01
00:00:00'::timestamptz,'2 seconds'::interval) a(x);
explain analyze SELECT count(1) FROM pg10_qa.tbl_itemtransactions; -- this
query is local
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=472650.72..472650.73 rows=1 width=8) (actual
time=2568.469..2568.469 rows=1 loops=1)
-> Gather (cost=472650.50..472650.71 rows=2 width=8) (actual
time=2568.067..2613.006 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=471650.50..471650.51 rows=1 width=8)
(actual time=2563.893..2563.893 rows=1 loops=3)
-> Parallel Seq Scan on tbl_itemtransactions
(cost=0.00..429215.60 rows=16973960 width=0) (actual time=0.017..1388.417
rows=13579200 loops=3)
Planning time: 0.048 ms
Execution time: 2613.246 ms
(8 rows)
but i guess partition elimination still works across the shards (see
attached). at least we'll benefit from that in pg11 :)
Regards,
Vijay