Can we get rid of repeated queries from pg_dump?
Hi,
I have the following case: local pg_dump (v15) connecting to a remote
PostgreSQL (v12).
I'm trying to get just schema (pg_dump -s). It's taking very long, which
is kinda OK given that there is long distance and latency, but I got
curious and checked queries that the pg_dump was running (select * from
pg_stat_activity where application_name = 'pg_dump').
And I noticed that many of these queries repeat many times.
The ones that I noticed were:
SELECT pg_catalog.format_type('2281'::pg_catalog.oid, NULL)
around the time that
SELECT
proretset,
prosrc,
probin,
provolatile,
proisstrict,
prosecdef,
lanname,
proconfig,
procost,
prorows,
pg_catalog.pg_get_function_arguments(p.oid) AS funcargs,
pg_catalog.pg_get_function_identity_arguments(p.oid) AS funciargs,
pg_catalog.pg_get_function_result(p.oid) AS funcresult,
proleakproof,
array_to_string(protrftypes, ' ') AS protrftypes,
proparallel,
prokind,
prosupport,
NULL AS prosqlbody
FROM
pg_catalog.pg_proc p,
pg_catalog.pg_language l
WHERE
p.oid = '60188'::pg_catalog.oid
AND l.oid = p.prolang
was called too.
It seems that for every function, pg_dump is getting its data, and then
runs format_type on each parameter/output type? I'm mostly guessing
here, as I didn't read the code.
Wouldn't it be possible to get all type formats at once, and cache them
in pg_dump? Or at the very least reuse already received information?
Unfortunately it seems I can't run pg_dump closer to the db server, and
the latency of queries is killing me.
It's been 15 minutes, and pg_dump (called: pg_dump -v -s -f schema.dump,
with env variables configuring the db connection) hasn't written even a single
byte to schema.dump.
depesz
hubert depesz lubaczewski <depesz@depesz.com> writes:
It seems that for every function, pg_dump is getting its data, and then
runs format_type on each parameter/output type? I'm mostly guessing
here, as I didn't read the code.
Wouldn't it be possible to get all type formats at once, and cache them
in pg_dump? Or at the very least reuse already received information?
Send a patch ...
regards, tom lane
On Thu, Aug 26, 2021 at 10:02:07AM -0400, Tom Lane wrote:
hubert depesz lubaczewski <depesz@depesz.com> writes:
It seems that for every function, pg_dump is getting its data, and then
runs format_type on each parameter/output type? I'm mostly guessing
here, as I didn't read the code.
Wouldn't it be possible to get all type formats at once, and cache them
in pg_dump? Or at the very least reuse already received information?
Send a patch ...
Yeah, that's not going to work, my C skills are next-to-none :(
I guess I'll have to wait till someone else decides it's a problem,
someone with the skills to do something about it.
Best regards,
depesz
hubert depesz lubaczewski <depesz@depesz.com> writes:
On Thu, Aug 26, 2021 at 10:02:07AM -0400, Tom Lane wrote:
hubert depesz lubaczewski <depesz@depesz.com> writes:
Wouldn't it be possible to get all type formats at once, and cache them
in pg_dump? Or at the very least reuse already received information?
Send a patch ...
Yeah, that's not going to work, my C skills are next-to-none :(
I guess I'll have to wait till someone else decides it's a problem,
someone with the skills to do something about it.
Well, you could move it forward by doing the legwork to identify which
queries are worth merging. Is it really sane to do a global "select
format_type() from pg_type" query and save all the results on the client
side? I wonder whether there are cases where that'd be a net loss.
You could do the experimentation to figure that out without necessarily
having the C skills to make pg_dump actually do it.
regards, tom lane
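[Editor's note: the experiment Tom proposes - one global query whose results
pg_dump would keep client-side - is easy to try by hand. A minimal sketch
(the psql invocation is an assumption; pg_dump itself issues its queries
from C):]

```shell
# The single round-trip replacement for thousands of per-OID calls:
# fetch every type's formatted name once, and keep the oid -> name
# pairs client-side. With a reachable server this would be run as:
#   psql -At -c "$Q" > type_names.tsv
Q="SELECT oid, pg_catalog.format_type(oid, NULL) FROM pg_catalog.pg_type"
echo "$Q"
```

The result set is one row per pg_type entry, small enough to hold in full on
the client for the duration of the dump.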
On Thu, Aug 26, 2021 at 10:20:29AM -0400, Tom Lane wrote:
hubert depesz lubaczewski <depesz@depesz.com> writes:
On Thu, Aug 26, 2021 at 10:02:07AM -0400, Tom Lane wrote:
hubert depesz lubaczewski <depesz@depesz.com> writes:
Wouldn't it be possible to get all type formats at once, and cache them
in pg_dump? Or at the very least reuse already received information?
Send a patch ...
Yeah, that's not going to work, my C skills are next-to-none :(
I guess I'll have to wait till someone else will assume it's a problem,
someone with skills to do something about it.
Well, you could move it forward by doing the legwork to identify which
queries are worth merging. Is it really sane to do a global "select
format_type() from pg_type" query and save all the results on the client
side?
Sure. On it. Will report back when I have more info.
Best regards,
depesz
On 8/26/21 1:44 AM, hubert depesz lubaczewski wrote:
Hi,
I have the following case: local pg_dump (v15) connecting to a remote
PostgreSQL (v12).
So you are using a dev version of pg_dump or is that a typo?
It's been 15 minutes, and pg_dump (called: pg_dump -v -s -f schema.dump,
with env variables configuring the db connection) hasn't written even a single
byte to schema.dump.
What happens if you run without the -v?
depesz
--
Adrian Klaver
adrian.klaver@aklaver.com
On Thu, Aug 26, 2021 at 07:34:26AM -0700, Adrian Klaver wrote:
On 8/26/21 1:44 AM, hubert depesz lubaczewski wrote:
Hi,
I have the following case: local pg_dump (v15) connecting to a remote
PostgreSQL (v12).
So you are using a dev version of pg_dump or is that a typo?
Yes. I'm running pg_dump from my computer to a (very) remote db server.
It's been 15 minutes, and pg_dump (called: pg_dump -v -s -f schema.dump,
with env variables configuring the db connection) hasn't written even a single
byte to schema.dump.
What happens if you run without the -v?
Well, I guess it works, but with no output I can't judge how fast.
Definitely doesn't seem to be going any faster.
Best regards,
depesz
On 8/26/21 7:35 AM, hubert depesz lubaczewski wrote:
On Thu, Aug 26, 2021 at 07:34:26AM -0700, Adrian Klaver wrote:
On 8/26/21 1:44 AM, hubert depesz lubaczewski wrote:
Hi,
I have the following case: local pg_dump (v15) connecting to a remote
PostgreSQL (v12).
So you are using a dev version of pg_dump or is that a typo?
Yes. I'm running pg_dump from my computer to a (very) remote db server.
SSHing and dumping on the remote is out as a short term solution?
Well, I guess it works, but with no output I can't judge how fast.
Definitely doesn't seem to be going any faster.
Unknown slow, that didn't help.
Best regards,
depesz
--
Adrian Klaver
adrian.klaver@aklaver.com
On Thu, Aug 26, 2021 at 07:46:46AM -0700, Adrian Klaver wrote:
On 8/26/21 7:35 AM, hubert depesz lubaczewski wrote:
On Thu, Aug 26, 2021 at 07:34:26AM -0700, Adrian Klaver wrote:
On 8/26/21 1:44 AM, hubert depesz lubaczewski wrote:
Hi,
I have the following case: local pg_dump (v15) connecting to a remote
PostgreSQL (v12).
So you are using a dev version of pg_dump or is that a typo?
Yes. I'm running pg_dump from my computer to (very) remote db server.
SSHing and dumping on the remote is out as a short term solution?
As I mentioned in the original post - I can't run pg_dump closer to the server.
SSH is not available, at least for me.
Anyway - I got the dump, so I am good for now, but I think that this
could be improved, so I'll work on getting some stats on queries.
Best regards,
depesz
On Thu, Aug 26, 2021 at 10:20:29AM -0400, Tom Lane wrote:
Well, you could move it forward by doing the legwork to identify which
queries are worth merging. Is it really sane to do a global "select
format_type() from pg_type" query and save all the results on the client
side? I wonder whether there are cases where that'd be a net loss.
You could do the experimentation to figure that out without necessarily
having the C skills to make pg_dump actually do it.
So, I got some info.
First, some stats. The DB contains:
- 14 extensions
- 1 aggregate
- 107 functions
- 5 schemas
- 5 sequences
- 188 logged tables
- 1 unlogged table
- 206 "normal" indexes
- 30 unique indexes
- 15 materialized views
- 16 triggers
- 87 types
- 26 views
pg_dump -s of it is ~ 670kB.
Interestingly, while dumping (pg_dump -s -v), we can see progress going on, and then, after:
====
...
pg_dump: reading publications
pg_dump: reading publication membership
pg_dump: reading subscriptions
pg_dump: reading dependency data
pg_dump: saving encoding = UTF8
pg_dump: saving standard_conforming_strings = on
pg_dump: saving search_path =
====
It stops (the progress output in the console halts). And then, in the pg logs I see queries like:
#v+
SELECT
proretset,
prosrc,
probin,
provolatile,
proisstrict,
prosecdef,
lanname,
proconfig,
procost,
prorows,
pg_catalog.pg_get_function_arguments(p.oid) AS funcargs,
pg_catalog.pg_get_function_identity_arguments(p.oid) AS funciargs,
pg_catalog.pg_get_function_result(p.oid) AS funcresult,
proleakproof,
array_to_string(protrftypes, ' ') AS protrftypes,
proparallel,
prokind,
prosupport,
NULL AS prosqlbody
FROM pg_catalog.pg_proc p, pg_catalog.pg_language l
WHERE p.oid = '43875'::pg_catalog.oid AND l.oid = p.prolang
#v-
Now for query stats.
To dump it all, pg_dump needed 9173 queries (logged by
log_min_duration_statement = 0 for this user).
I extracted all queries to separate files, and made stats. In total there were
only 4257 unique queries.
Then I checked for repeated queries. Top 10 most repeated offenders were:
615 times : SELECT pg_catalog.format_type('25'::pg_catalog.oid, NULL)
599 times : SELECT pg_catalog.format_type('23'::pg_catalog.oid, NULL)
579 times : SELECT pg_catalog.format_type('2281'::pg_catalog.oid, NULL)
578 times : SELECT pg_catalog.format_type('41946'::pg_catalog.oid, NULL)
523 times : SELECT pg_catalog.format_type('701'::pg_catalog.oid, NULL)
459 times : SELECT pg_catalog.format_type('42923'::pg_catalog.oid, NULL)
258 times : SELECT pg_catalog.format_type('16'::pg_catalog.oid, NULL)
176 times : SELECT pg_catalog.format_type('19'::pg_catalog.oid, NULL)
110 times : SELECT pg_catalog.format_type('21'::pg_catalog.oid, NULL)
106 times : SELECT pg_catalog.format_type('42604'::pg_catalog.oid, NULL)
In total, there were 5000 queries:
SELECT pg_catalog.format_type('[0-9]+'::pg_catalog.oid, NULL)
But there were only 83 separate oids that were scanned.
The only other repeated command was:
SELECT pg_catalog.set_config('search_path', '', false);
and it was called only twice.
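[Editor's note: counts like the ones above can be reproduced from a file of
logged statements with a standard pipeline. A sketch - real log lines carry a
log_line_prefix and timestamp that would need stripping first, and the sample
file below is made up:]

```shell
# One query per line in; "count  query" out, most frequent first --
# the same shape as the top-10 list above.
cat > /tmp/queries.txt <<'EOF'
SELECT pg_catalog.format_type('25'::pg_catalog.oid, NULL)
SELECT pg_catalog.format_type('23'::pg_catalog.oid, NULL)
SELECT pg_catalog.format_type('25'::pg_catalog.oid, NULL)
EOF
sort /tmp/queries.txt | uniq -c | sort -rn
```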
Based on my reading of the queries in order, they seem to follow this pattern:
One call for:
SELECT proretset, prosrc, probin, provolatile, proisstrict, prosecdef, lanname, proconfig, procost, prorows, pg_catalog.pg_get_function_arguments(p.oid) AS funcargs, pg_catalog.pg_get_function_identity_arguments(p.oid) AS funciargs, pg_catalog.pg_get_function_result(p.oid) AS funcresult, proleakproof, array_to_string(protrftypes, ' ') AS protrftypes, proparallel, prokind, prosupport, NULL AS prosqlbody FROM pg_catalog.pg_proc p, pg_catalog.pg_language l WHERE p.oid = 'SOME_NUMBER'::pg_catalog.oid AND l.oid = p.prolang
and then one or more:
SELECT pg_catalog.format_type('SOME_NUMBER'::pg_catalog.oid, NULL)
In one case, after the proc query, there were 94 consecutive
pg_catalog.format_type queries.
I hope it helps.
Best regards,
depesz
On Thu, 2021-08-26 at 18:06 +0200, hubert depesz lubaczewski wrote:
Now for query stats.
To dump it all, pg_dump needed 9173 queries (logged by
log_min_duration_statement = 0 for this user).
I extracted all queries to separate files, and made stats. In total there were
only 4257 unique queries.
Then I checked for repeated queries. Top 10 most repeated offenders were:
615 times : SELECT pg_catalog.format_type('25'::pg_catalog.oid, NULL)
599 times : SELECT pg_catalog.format_type('23'::pg_catalog.oid, NULL)
579 times : SELECT pg_catalog.format_type('2281'::pg_catalog.oid, NULL)
578 times : SELECT pg_catalog.format_type('41946'::pg_catalog.oid, NULL)
523 times : SELECT pg_catalog.format_type('701'::pg_catalog.oid, NULL)
459 times : SELECT pg_catalog.format_type('42923'::pg_catalog.oid, NULL)
258 times : SELECT pg_catalog.format_type('16'::pg_catalog.oid, NULL)
176 times : SELECT pg_catalog.format_type('19'::pg_catalog.oid, NULL)
110 times : SELECT pg_catalog.format_type('21'::pg_catalog.oid, NULL)
106 times : SELECT pg_catalog.format_type('42604'::pg_catalog.oid, NULL)
In total, there were 5000 queries:
SELECT pg_catalog.format_type('[0-9]+'::pg_catalog.oid, NULL)
But there were only 83 separate oids that were scanned.
That is a strong argument for using a hash table to cache the types.
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
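[Editor's note: the effect of the cache Laurenz describes can be illustrated
without any C - a quick simulation where each cache miss stands in for one
server round trip; the oids and counter are made up for the sketch:]

```shell
# Memoized lookups: with Hubert's numbers (5000 format_type queries
# over 83 distinct oids), a cache would cut the round trips to 83.
# Here: 6 lookups over 3 distinct oids -> 3 simulated round trips.
seen=""
round_trips=0
for oid in 25 23 25 2281 23 25; do
    case " $seen " in
        *" $oid "*) : ;;                        # cache hit: no query sent
        *) seen="$seen $oid"                    # cache miss: remember oid...
           round_trips=$((round_trips + 1)) ;;  # ...and pay one round trip
    esac
done
echo "round trips: $round_trips"
```

This prints "round trips: 3"; over a high-latency link, every avoided round
trip is wall-clock time saved.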
Laurenz Albe <laurenz.albe@cybertec.at> writes:
On Thu, 2021-08-26 at 18:06 +0200, hubert depesz lubaczewski wrote:
In total, there were 5000 queries:
SELECT pg_catalog.format_type('[0-9]+'::pg_catalog.oid, NULL)
But there were only 83 separate oids that were scanned.
That is a strong argument for using a hash table to cache the types.
Those queries are coming from getFormattedTypeName(), which is used
for function arguments and the like. I'm not quite sure why Hubert
is seeing 5000 such calls in a database with only ~100 functions;
surely they don't all have an average of 50 arguments?
I experimented with the attached, very quick-n-dirty patch to collect
format_type results during the initial scan of pg_type, instead. On the
regression database in HEAD, it reduces the number of queries pg_dump
issues from 3260 to 2905; but I'm having a hard time detecting any net
performance change.
(This is not meant for commit as-is; notably, I didn't bother to fix
getTypes' code paths for pre-9.6 servers. It should be fine for
performance testing though.)
regards, tom lane
Attachments:
avoid-repeated-format_type-queries-0.1.patch (text/x-diff, +19 -4)
On 8/27/21 2:23 PM, Tom Lane wrote:
Laurenz Albe <laurenz.albe@cybertec.at> writes:
On Thu, 2021-08-26 at 18:06 +0200, hubert depesz lubaczewski wrote:
In total, there were 5000 queries:
SELECT pg_catalog.format_type('[0-9]+'::pg_catalog.oid, NULL)
But there were only 83 separate oids that were scanned.
That is a strong argument for using a hash table to cache the types.
Those queries are coming from getFormattedTypeName(), which is used
for function arguments and the like. I'm not quite sure why Hubert
is seeing 5000 such calls in a database with only ~100 functions;
surely they don't all have an average of 50 arguments?
Could be.
From the stats post:
"Based on my reading of queries in order it seems to follow the pattern of:
One call for:
SELECT proretset, prosrc, probin, provolatile, proisstrict,
prosecdef, lanname, proconfig, procost, prorows,
pg_catalog.pg_get_function_arguments(p.oid) AS funcargs,
pg_catalog.pg_get_function_identity_arguments(p.oid) AS funciargs,
pg_catalog.pg_get_function_result(p.oid) AS funcresult,
proleakproof, array_to_string(protrftypes,
' ') AS protrftypes, proparallel, prokind, prosupport, NULL AS
prosqlbody FROM pg_catalog.pg_proc p, pg_catalog.pg_language l WHERE
p.oid = 'SOME_NUMBER'::pg_catalog.oid AND l.oid = p.prolang
and then one or more:
SELECT pg_catalog.format_type('SOME_NUMBER'::pg_catalog.oid, NULL)
In one case, after the proc query, there were 94 consecutive
pg_catalog.format_type queries.
"
I experimented with the attached, very quick-n-dirty patch to collect
format_type results during the initial scan of pg_type, instead. On the
regression database in HEAD, it reduces the number of queries pg_dump
issues from 3260 to 2905; but I'm having a hard time detecting any net
performance change.
(This is not meant for commit as-is; notably, I didn't bother to fix
getTypes' code paths for pre-9.6 servers. It should be fine for
performance testing though.)
regards, tom lane
--
Adrian Klaver
adrian.klaver@aklaver.com
Greetings,
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
Laurenz Albe <laurenz.albe@cybertec.at> writes:
On Thu, 2021-08-26 at 18:06 +0200, hubert depesz lubaczewski wrote:
In total, there were 5000 queries:
SELECT pg_catalog.format_type('[0-9]+'::pg_catalog.oid, NULL)
But there were only 83 separate oids that were scanned.
That is a strong argument for using a hash table to cache the types.
Those queries are coming from getFormattedTypeName(), which is used
for function arguments and the like. I'm not quite sure why Hubert
is seeing 5000 such calls in a database with only ~100 functions;
surely they don't all have an average of 50 arguments?
I experimented with the attached, very quick-n-dirty patch to collect
format_type results during the initial scan of pg_type, instead. On the
regression database in HEAD, it reduces the number of queries pg_dump
issues from 3260 to 2905; but I'm having a hard time detecting any net
performance change.
Seems like the issue here is mainly just the latency of each query being
rather high compared to most use-cases, so local testing where there's
basically zero latency wouldn't see any change in timing, but throw a
trans-atlantic or worse amount of latency between the system running
pg_dump and the PG server and you'd see notable wall-clock savings in
time.
Only took a quick look but generally +1 on reducing the number of
queries that pg_dump is doing and the changes suggested looked good to
me.
Thanks,
Stephen
Stephen Frost <sfrost@snowman.net> writes:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
I experimented with the attached, very quick-n-dirty patch to collect
format_type results during the initial scan of pg_type, instead. On the
regression database in HEAD, it reduces the number of queries pg_dump
issues from 3260 to 2905; but I'm having a hard time detecting any net
performance change.
Seems like the issue here is mainly just the latency of each query being
rather high compared to most use-cases, so local testing where there's
basically zero latency wouldn't see any change in timing, but throw a
trans-atlantic or worse amount of latency between the system running
pg_dump and the PG server and you'd see notable wall-clock savings in
time.
Yeah. What I was more concerned about was the potential downside
of running format_type() for each pg_type row, even though we might
use only a few of those results. The fact that I'm *not* seeing
a performance hit with a local server is encouraging from that
standpoint.
regards, tom lane
Greetings,
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
Stephen Frost <sfrost@snowman.net> writes:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
I experimented with the attached, very quick-n-dirty patch to collect
format_type results during the initial scan of pg_type, instead. On the
regression database in HEAD, it reduces the number of queries pg_dump
issues from 3260 to 2905; but I'm having a hard time detecting any net
performance change.
Seems like the issue here is mainly just the latency of each query being
rather high compared to most use-cases, so local testing where there's
basically zero latency wouldn't see any change in timing, but throw a
trans-atlantic or worse amount of latency between the system running
pg_dump and the PG server and you'd see notable wall-clock savings in
time.
Yeah. What I was more concerned about was the potential downside
of running format_type() for each pg_type row, even though we might
use only a few of those results. The fact that I'm *not* seeing
a performance hit with a local server is encouraging from that
standpoint.
Ah, yes, agreed.
Thanks!
Stephen
Adrian Klaver <adrian.klaver@aklaver.com> writes:
On 8/27/21 2:23 PM, Tom Lane wrote:
Those queries are coming from getFormattedTypeName(), which is used
for function arguments and the like. I'm not quite sure why Hubert
is seeing 5000 such calls in a database with only ~100 functions;
surely they don't all have an average of 50 arguments?
Could be.
Maybe. I'm disturbed by the discrepancy between my result (about
10% of pg_dump's queries are these) and Hubert's (over 50% are).
I'd like to know the reason for that before we push forward.
regards, tom lane
On Fri, Aug 27, 2021 at 05:23:23PM -0400, Tom Lane wrote:
I experimented with the attached, very quick-n-dirty patch to collect
format_type results during the initial scan of pg_type, instead. On the
regression database in HEAD, it reduces the number of queries pg_dump
issues from 3260 to 2905; but I'm having a hard time detecting any net
performance change.
(This is not meant for commit as-is; notably, I didn't bother to fix
getTypes' code paths for pre-9.6 servers. It should be fine for
performance testing though.)
Hi,
thanks a lot for this. Will test and report back, most likely on Monday,
though.
Best regards,
depesz
On Fri, Aug 27, 2021 at 05:23:23PM -0400, Tom Lane wrote:
Those queries are coming from getFormattedTypeName(), which is used
for function arguments and the like. I'm not quite sure why Hubert
is seeing 5000 such calls in a database with only ~100 functions;
surely they don't all have an average of 50 arguments?
Oh, I missed that part.
So I checked. In the meantime I got a -Fc dump, so:
#v+
=$ pg_restore -l schema.dump | \
grep -P '^\d*; \d+ \d+ FUNCTION ' |
sed 's/^[^(]*(//; s/)[^)]*$//' |
awk -F, '{print NF}' |
sort -n |
uniq -c
23 0
52 1
21 2
8 3
1 4
2 5
#v-
23 functions with 0 arguments, 52 with 1, and the max is 5 arguments - two
functions have these.
Not sure if it matters, but there are a lot of enums - 83 of them. And they
have up to 250 elements (2 such types).
Best regards,
depesz
hubert depesz lubaczewski <depesz@depesz.com> writes:
On Fri, Aug 27, 2021 at 05:23:23PM -0400, Tom Lane wrote:
Those queries are coming from getFormattedTypeName(), which is used
for function arguments and the like. I'm not quite sure why Hubert
is seeing 5000 such calls in a database with only ~100 functions;
surely they don't all have an average of 50 arguments?
23 functions with 0 arguments, 52 with 1, and the max is 5 arguments - two
functions have these.
Not sure if it matters, but there are a lot of enums - 83 of them. And they
have up to 250 elements (2 such types).
Hmm, no, I don't see any getFormattedTypeName calls in dumpEnumType.
There are two of 'em in dumpCast though. Does this DB by chance
have a ton of user-defined casts?
regards, tom lane