having difficulty with explain analyze output
I'm not really looking for information on how to speed this query
up. I'm just trying to interpret the output enough to tell me which step
is slow:
Seq Scan on mags (cost=0.00..187700750.56 rows=47476 width=4) (actual time=3004851.889..3004851.889 rows=0 loops=1)
Filter: ((signum IS NOT NULL) AND (NOT (SubPlan 1)))
SubPlan 1
-> Materialize (cost=0.00..3713.93 rows=95862 width=4) (actual time=0.011..16.145 rows=48139 loops=94951)
-> Seq Scan on sigs (cost=0.00..2906.62 rows=95862 width=4) (actual time=0.010..674.201 rows=95862 loops=1)
Total runtime: 3004852.005 ms
It looks like the inner seq scan takes 674ms, then the materialize takes
an additional 16ms? Or is that 16ms * 94951? Or 674 * 94951?
And the outer seq scan takes 3004851-3004851 = 0ms?
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On Tue, Nov 26, 2013 at 02:43:42PM -0500, David Rysdam wrote:
I'm not really looking for information on how to speed this query
up. I'm just trying to interpret the output enough to tell me which step
is slow:Seq Scan on mags (cost=0.00..187700750.56 rows=47476 width=4) (actual time=3004851.889..3004851.889 rows=0 loops=1)
Filter: ((signum IS NOT NULL) AND (NOT (SubPlan 1)))
SubPlan 1
-> Materialize (cost=0.00..3713.93 rows=95862 width=4) (actual time=0.011..16.145 rows=48139 loops=94951)
-> Seq Scan on sigs (cost=0.00..2906.62 rows=95862 width=4) (actual time=0.010..674.201 rows=95862 loops=1)
Total runtime: 3004852.005 msIt looks like the inner seq scan takes 674ms, then the materialize takes
an additional 16ms? Or is that 16ms * 94951? Or 674 * 94951?
The Seq Scan took 674ms and was run once (loops=1)
The Materialise was run 94951 times and took, on average, 0.011ms to
return the first row and 16ms to complete.
16.145 * 94951 = 1532983.895
And the outer seq scan takes 3004851-3004851 = 0ms?
The outer plan took 3004851ms to return its first row, and last row
also as apparently it matched now rows at all. And if this is the
complete plan, it took 1,500 seconds for itself.
Hope this helps,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
He who writes carelessly confesses thereby at the very outset that he does
not attach much importance to his own thoughts.
-- Arthur Schopenhauer
David Rysdam <drysdam@ll.mit.edu> writes:
On Tue, 26 Nov 2013 14:51:22 -0500, Martijn van Oosterhout <kleptog@svana.org> wrote:
The Seq Scan took 674ms and was run once (loops=1)
The Materialise was run 94951 times and took, on average, 0.011ms to
return the first row and 16ms to complete.16.145 * 94951 = 1532983.895
OK, this is helpful. But why would Materialize run 94k times? I thought
the whole point of Materialize was to make a "virtual table" that could
speed up "parent" queries. If it has to recreate the subquery, why would
the planner choose to Materialize?
It *does* create a virtual table, or tuplestore --- what you are seeing
here is just the cost of pulling tuples out of that tuplestore. There
were 48139 * (94951 - 1) = 4570798050 tuples returned out of the
tuplestore rather than by directly querying the underlying seqscan, and
the time spent doing that was 16.145 * 94951 - 674.201 = 1532309.694 msec,
so on your machine it takes about a third of a microsecond to pull a tuple
from the in-memory tuplestore, which sounds about right allowing for
EXPLAIN ANALYZE's instrumentation overhead. Without the Materialize
node it'd have had to rescan the "sigs" table for each outer row, and
you'd probably still be waiting :-(
BTW, it's possible for a tuplestore to spill to disk, negating some of the
benefit, if the storage required exceeds work_mem. If you have a small
work_mem setting, does it go any faster if you increase work_mem?
regards, tom lane
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general