Weird CPU utilization patterns with Postgres
Hi,
We are seeing an interesting problem with the Postgres 9.3 instance
in our infrastructure.
A few days ago our box started to show huge CPU spikes while IO wait
remained negligible. After a while I installed perf and started to
monitor the Postgres master process, and here is what I found:
Samples: 372K of event 'cycles', Event count (approx.): 110095222173
93.65% libc-2.12.so [.] __strcoll_l
0.97% libc-2.12.so [.] memcpy
0.90% postgres [.] slot_getattr
0.88% postgres [.] nocachegetattr
0.64% postgres [.] varstr_cmp
0.52% libc-2.12.so [.] __strcmp_sse42
0.43% postgres [.] hash_any
0.32% postgres [.] pg_detoast_datum_packed
0.31% libc-2.12.so [.] __strlen_sse2
0.22% postgres [.] bttextcmp
0.18% postgres [.] ExecStoreTuple
0.14% postgres [.] MemoryContextReset
0.09% postgres [.] pgstat_end_function_usage
0.08% libc-2.12.so [.] strcoll
0.08% postgres [.] heap_hot_search_buffer
0.07% postgres [.] lc_collate_is_c
0.06% [kernel] [k] sys_semtimedop
0.06% postgres [.] heap_page_prune_opt
0.05% postgres [.] slot_getsomeattrs
0.05% postgres [.] heap_fill_tuple
0.04% postgres [.] hash_search
0.03% postgres [.] GetMemoryChunkSpace
0.03% postgres [.] heap_form_minimal_tuple
0.03% [kernel] [k] update_queue
0.02% postgres [.] ReadBufferExtended
0.02% postgres [.] memcpy@plt
It seems that the box is spending almost all of its CPU time in
__strcoll_l (93.65% of samples). Query performance is down: previously
the box was able to sustain ~20 clients, but right now it can hardly
keep up with 5.
I am wondering what the root cause might be here.
Let me know if anybody has seen this before.
Regards,
Istvan
--
the sun shines for all
On Fri, Dec 5, 2014 at 5:14 PM, István <leccine@gmail.com> wrote:
I am wondering what the root cause might be here.
My guess would be that an important text-based sort operation began to
go to disk. The external sort code (tapesort) is known to do far more
comparisons than quicksort. With text sorts, tapesort tends to be
very CPU bound, which might not be the case with integer sorts.
I'm currently trying to fix this across the board [1], but my first
suggestion is to try enabling log_temp_files to see if external sorts
can be correlated with these stalls.
[1]: https://commitfest.postgresql.org/action/patch_view?id=1462
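
To make the "comparison-bound" point a bit more concrete, here is a rough
Python sketch (not Postgres code) that counts how often a collation-aware
comparator gets called while sorting strings. The function name
count_collation_calls is made up for illustration, and Python's built-in
sort is Timsort rather than the quicksort or tapesort discussed above, so
the counts only illustrate that every comparison in a text sort goes
through a strcoll-style call; they say nothing about Postgres's actual
numbers.

```python
import functools
import locale


def count_collation_calls(words):
    """Sort `words` with locale.strcoll and count the comparator calls.

    Hypothetical helper for illustration only. It uses whatever
    LC_COLLATE setting is active in the process (the "C" locale unless
    locale.setlocale() has been called).
    """
    calls = 0

    def counting_strcoll(a, b):
        nonlocal calls
        calls += 1
        # locale.strcoll wraps the C library's collation-aware comparison.
        return locale.strcoll(a, b)

    ordered = sorted(words, key=functools.cmp_to_key(counting_strcoll))
    return ordered, calls


if __name__ == "__main__":
    sample = ["pear", "apple", "fig", "banana", "cherry", "date"]
    ordered, calls = count_collation_calls(sample)
    print(ordered)
    print("comparator calls:", calls)
```

A sort algorithm that needs more comparisons for the same input pays the
strcoll cost that many more times, which would be consistent with a
profile dominated by __strcoll_l.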
--
Regards,
Peter Geoghegan
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On Tue, Dec 9, 2014 at 5:46 PM, Peter Geoghegan
<peter.geoghegan86@gmail.com> wrote:
I'm currently trying to fix this across the board [1], but my first
suggestion is to try enabling log_temp_files to see if external sorts
can be correlated with these stalls.
See also: /messages/by-id/CAM3SWZTijoBPpqFF7mN3021Vvtu+5Fd1ymABQ8tLoV4zhfAqxA@mail.gmail.com
--
Regards,
Peter Geoghegan