explain vs auto_explain

Started by Vijaykumar Jainover 1 year ago7 messagesgeneral

vijaykumarjain.github@gmail.com

over 1 year ago

Hi,

Why does auto_explain have extended flags to support nested statements and
triggers whereas explain does not.
the objects of concern have the same ownership and access, so it does not
even sound like a security thing.

or maybe it is just a format/display issue ?
i tried to check the doc, there is no explicit mention of differences
PostgreSQL: Documentation: 17: F.3. auto_explain — log execution plans of
slow queries <https://www.postgresql.org/docs/current/auto-explain.html>

postgres@ubuntu:/tmp$ cat db1/postgresql.auto.conf | grep -v '^#'
port=5432
session_preload_libraries = auto_explain
auto_explain.log_min_duration = 0
auto_explain.log_analyze = true
auto_explain.log_buffers = true
auto_explain.log_timing = true
auto_explain.log_verbose = on
auto_explain.log_triggers = on
auto_explain.log_nested_statements = on
postgres@ubuntu:/tmp$ psql -p 5432 demo
psql (16.4 (Ubuntu 16.4-0ubuntu0.24.04.1))
Type "help" for help.

demo=# create table t(col1 int primary key, col2 int);
CREATE TABLE
demo=# create table r(col3 int primary key, col1 int references
t(col1) on update cascade on delete cascade); -- foreign table with
col1 having no index
CREATE TABLE
demo=# insert into t select x, x from generate_series(1, 1000000) x;
INSERT 0 1000000
demo=# insert into r select x, (x % 3) + 1 from generate_series(1,
100000) x; -- only 1,2,3 for col1
INSERT 0 100000
demo=# begin;
BEGIN
demo=*# explain (analyze,verbose,buffers) delete from t where col1 <
5; -- when we delete from t, it will show it touches r but will not
show what plan was used to delete data from r. we need to check the
actual plan to figure out why the delete is slow and how to fix it
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Delete on public.t (cost=0.42..8.50 rows=0 width=0) (actual
time=0.025..0.026 rows=0 loops=1)
Buffers: shared hit=13
-> Index Scan using t_pkey on public.t (cost=0.42..8.50 rows=4
width=6) (actual time=0.006..0.008 rows=4 loops=1)
Output: ctid
Index Cond: (t.col1 < 5)
Buffers: shared hit=4
Planning:
Buffers: shared hit=7
Planning Time: 0.083 ms
Trigger RI_ConstraintTrigger_a_16475 for constraint r_col1_fkey:
time=46.274 calls=4
Execution Time: 46.460 ms
(11 rows)

demo=*# -- note Trigger RI_ConstraintTrigger_a_16475 for constraint
r_col1_fkey: time=46.274 calls=4 , it made 4 calls to r
demo=*# rollback;
ROLLBACK
demo=#
\q

-------------- why do we not see this via console EXPLAIN, it is still
the same PID

postgres@ubuntu:/tmp$ less db1.log
postgres@ubuntu:/tmp$ tail -20 db1.log
2024-10-05 18:40:26.429 UTC [2260] LOG: duration: 1.087 ms plan:
Query Text: DELETE FROM ONLY "public"."r" WHERE $1
OPERATOR(pg_catalog.=) "col1"
Query Parameters: $1 = '4'
Delete on public.r (cost=0.00..1693.00 rows=0 width=0)
(actual time=1.086..1.087 rows=0 loops=1)
Buffers: shared hit=443
-> Seq Scan on public.r (cost=0.00..1693.00 rows=1
width=6) (actual time=1.086..1.086 rows=0 loops=1)
Output: ctid
Filter: (4 = r.col1)
Buffers: shared hit=443
2024-10-05 18:40:26.429 UTC [2260] CONTEXT: SQL statement "DELETE
FROM ONLY "public"."r" WHERE $1 OPERATOR(pg_catalog.=) "col1""
2024-10-05 18:40:26.429 UTC [2260] LOG: duration: 46.303 ms plan:
Query Text: explain (analyze,verbose,buffers) delete from t
where col1 < 5;
Delete on public.t (cost=0.42..8.50 rows=0 width=0) (actual
time=0.025..0.026 rows=0 loops=1)
Buffers: shared hit=13
-> Index Scan using t_pkey on public.t (cost=0.42..8.50
rows=4 width=6) (actual time=0.006..0.008 rows=4 loops=1)
Output: ctid
Index Cond: (t.col1 < 5)
Buffers: shared hit=4
Trigger RI_ConstraintTrigger_a_16475 for constraint
r_col1_fkey: time=46.274 calls=4

--
Thanks,
Vijay

Open to work
Resume - Vijaykumar Jain <https://github.com/cabecada>

David G. Johnston

david.g.johnston@gmail.com

over 1 year ago

In reply to: Vijaykumar Jain (#1)

Re: explain vs auto_explain

On Sat, Oct 19, 2024 at 10:05 AM Vijaykumar Jain <
vijaykumarjain.github@gmail.com> wrote:

Why does auto_explain have extended flags to support nested statements and
triggers whereas explain does not.
the objects of concern have the same ownership and access, so it does not
even sound like a security thing.

or maybe it is just a format/display issue ?
i tried to check the doc, there is no explicit mention of differences
PostgreSQL: Documentation: 17: F.3. auto_explain — log execution plans of
slow queries <https://www.postgresql.org/docs/current/auto-explain.html>

With auto_explain you end up producing multiple individual explains, one
for each top-level query being executed. Each one produced by auto_explain
which hooks into the point where a top-level query begins execution.
Running explain directly on a top-level query doesn't establish any of the
needed hooks to produce these additional explains. While those hooks could
be installed the output for manual explain is the single query result sent
to the client. How multiple explains could be combined into that single
output channel would need to be figured out as well. While auto_explain
uses the log file which is much simpler to use in this manner since each
auto_explain can just be told to write its output to the log and not worry
about any other considerations.

In short, there is no fundamental reason manual explain couldn't be
improved along these lines.

David J.

Vijaykumar Jain

vijaykumarjain.github@gmail.com

over 1 year ago

In reply to: David G. Johnston (#2)

Re: explain vs auto_explain

On Sat, 19 Oct 2024 at 22:55, David G. Johnston <david.g.johnston@gmail.com>
wrote:

With auto_explain you end up producing multiple individual explains, one
for each top-level query being executed. Each one produced by auto_explain
which hooks into the point where a top-level query begins execution.
Running explain directly on a top-level query doesn't establish any of the
needed hooks to produce these additional explains. While those hooks could
be installed the output for manual explain is the single query result sent
to the client. How multiple explains could be combined into that single
output channel would need to be figured out as well. While auto_explain
uses the log file which is much simpler to use in this manner since each
auto_explain can just be told to write its output to the log and not worry
about any other considerations.

Thanks David.

The reason is, a normal db user who does not have access to logs or the db
server, will rely on what the console throws back.
as a result, they are not able to get the complete view of what is
happening ...

not sure how relevant is this context, but when i was playing with citus,
they have a flag
set citus.explain_all_tasks = 1;
which is session based and dumps the plan on all remote workers / shard
nodes right to the user, which otherwise only dumps for one node .. yes its
verbose, but it helps the user to see what was run on the shard.

Similarly, I was expecting to explain to give a full plan with nested plans
if requested back to the user in their session directly. if they dont want
the extended output, the default flags should be fine ?
Or is it the case, where there can be million rows from remote tables, and
auto_explain will log all scans for all matching rows, thereby killing the
console session or something ?
i tried to check the code for auto_explain , there is nothing that helps
understand why it was provided as a separate .

--
Thanks,
Vijay

Open to work
Resume - Vijaykumar Jain <https://github.com/cabecada>

David G. Johnston

david.g.johnston@gmail.com

over 1 year ago

In reply to: Vijaykumar Jain (#3)

Re: explain vs auto_explain

On Sat, Oct 19, 2024 at 10:43 AM Vijaykumar Jain <
vijaykumarjain.github@gmail.com> wrote:

i tried to check the code for auto_explain , there is nothing that helps
understand why it was provided as a separate .

Probably because output to log was easier than reworking the internals to
make output to client happen.

If you are interested in working on a patch to address your use case I say
go for it. Designing the inputs and outputs actually doesn't seem that
challenging, if exceptionally vulnerable to bike-shedding. But I imagine
the internals are quite a different matter.

David J.

Tom Lane

tgl@sss.pgh.pa.us

over 1 year ago

In reply to: David G. Johnston (#4)

Re: explain vs auto_explain

"David G. Johnston" <david.g.johnston@gmail.com> writes:

On Sat, Oct 19, 2024 at 10:43 AM Vijaykumar Jain <
vijaykumarjain.github@gmail.com> wrote:

i tried to check the code for auto_explain , there is nothing that helps
understand why it was provided as a separate .

Probably because output to log was easier than reworking the internals to
make output to client happen.

The reason that auto_explain exists is to capture plans for queries
that are being issued by real applications --- which aren't programmed
to issue EXPLAIN for themselves, and likely don't have a good place to
put the data if they did. Also, auto_explain can capture runtime
details for queries that are really being executed and delivering
results, whereas EXPLAIN ANALYZE doesn't deliver the query results and
thus can't be shoehorned into real applications. So it's partly a
matter of not having a protocol spec that would allow the EXPLAIN data
to be delivered on a side channel, but mostly a recognition that
rewriting applications to capture such data would be painful.

regards, tom lane

Vijaykumar Jain

vijaykumarjain.github@gmail.com

over 1 year ago

In reply to: Tom Lane (#5)

Re: explain vs auto_explain

On Sat, 19 Oct 2024 at 23:31, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"David G. Johnston" <david.g.johnston@gmail.com> writes:

On Sat, Oct 19, 2024 at 10:43 AM Vijaykumar Jain <
vijaykumarjain.github@gmail.com> wrote:

i tried to check the code for auto_explain , there is nothing that helps
understand why it was provided as a separate .

Probably because output to log was easier than reworking the internals to
make output to client happen.

The reason that auto_explain exists is to capture plans for queries
that are being issued by real applications --- which aren't programmed
to issue EXPLAIN for themselves, and likely don't have a good place to
put the data if they did. Also, auto_explain can capture runtime
details for queries that are really being executed and delivering
results, whereas EXPLAIN ANALYZE doesn't deliver the query results and
thus can't be shoehorned into real applications. So it's partly a
matter of not having a protocol spec that would allow the EXPLAIN data
to be delivered on a side channel, but mostly a recognition that
rewriting applications to capture such data would be painful.

regards, tom lane

ok, it makes sense for the reason of having auto_explain. but maybe i did
ask correctly,
why do we not have the extended flags in auto_explain , in , explain wrt
nested_statements, and triggers ...
a user who finds the console output complicated, could well use a pager or
redirect the output to the file via \o which is client side.

as i mentioned the reason is, there are differences on what auto_explain
captures and what explain does... and the dev user is not able to see the
difference
without having access to logs.
for example , iirc
refresh materialised view does not show the plan , although there was once
a feature reported, which showed the difference in support for parallelism.
ex in this discussion
Thread: CREATE/REFRESH MATERIALIZED VIEW planner difference? : Postgres
Professional <https://postgrespro.com/list/thread-id/2553661>

i dont expect this to be a feature request or something, it was just that i
wanted to be aware why there are differences,
because the cloud guys have strict control over logs as it has many other
things, so they just wont give access at all.

--
Thanks,
Vijay

Open to work
Resume - Vijaykumar Jain <https://github.com/cabecada>

Vijaykumar Jain

vijaykumarjain.github@gmail.com

over 1 year ago

In reply to: Vijaykumar Jain (#6)

Re: explain vs auto_explain

On Sat, 19 Oct 2024 at 23:48, Vijaykumar Jain <
vijaykumarjain.github@gmail.com> wrote:

ok, it makes sense for the reason of having auto_explain. but maybe i did
ask correctly,
why do we not have the extended flags in auto_explain , in , explain wrt
nested_statements, and triggers ...
a user who finds the console output complicated, could well use a pager or
redirect the output to the file via \o which is client side.

actually my bad. pls ignore.

i forgot we could leverage client_min_messages to get the output to console

/*
postgres=# explain analyze create materialized view mv as select * from t;
QUERY PLAN
-----------------------------------------------------------------------------------------------
Seq Scan on t (cost=0.00..35.50 rows=2550 width=4) (actual
time=0.027..0.028 rows=0 loops=1)
Planning Time: 0.549 ms
Execution Time: 7.309 ms
(3 rows)

^
postgres=# explain analyze refresh materialized view mv;
QUERY PLAN
-------------------------------------------
Utility statements have no plan structure
(1 row)

postgres=# alter system set client_min_messages TO log;
ALTER SYSTEM
postgres=# select pg_reload_conf();
pg_reload_conf
----------------
t
(1 row)

postgres=# refresh materialized view
postgres-# mv ;
LOG: duration: 0.016 ms plan:
Query Text: refresh materialized view
mv ;
Seq Scan on public.t (cost=0.00..35.50 rows=2550 width=4) (actual
time=0.009..0.009 rows=0 loops=1)
Output: col1
REFRESH MATERIALIZED VIEW
*/

I think i am good. thanks all.
--
Thanks,
Vijay

Open to work
Resume - Vijaykumar Jain <https://github.com/cabecada>