Lock Wait Statistics (next commitfest)
Where I work we make extensive use of PostgreSQL. One of the things
people typically want to know about is lock waits. Out of the box
there is not much in the way of tracking for these, particularly in older
versions. The pg_locks view is fine for stuff happening *now*, but
typically I find I'm being asked about something that happened last
night...
Now for those platforms with dtrace there is a lock wait probe - useful,
but not for those of us on Linux! There is the config option (log_lock_waits,
8.3 onwards) to log lock waits longer than deadlock_timeout, and this is great, but I
wondered if having something like this available as part of the stats
module would be useful.
So here is my initial attempt at this, at this point merely to spark
discussion (attached patch)
I have followed some of the ideas from the function execution stats, but
locks required some special treatment.
- new parameter track_locks (maybe should be track_lock_waits?)
- new locks hash attached to the stats dbentry
- several new pgstat lock C functions
- new view pg_stat_lock_waits
- new attributes for pg_stat_database
This version simply exposes the locktag structure in the view, along
with the corresponding #waits and wait time. This should probably be
reworked to look a little more like pg_locks. I figured this is easily
doable along with the no doubt many changes coming from review comments.
Also I did not do any clever amalgamation of transaction lock waits -
there are probably going to be a lot of those, and keeping the detail is
unlikely to be useful. It would be easy to collect them all together in
one transaction entry.
regards
Mark
Attachment: lockstats-1.patch.gz
On Sun, Jan 25, 2009 at 6:57 PM, Mark Kirkwood<markir@paradise.net.nz> wrote:
So here is my initial attempt at this, at this point merely to spark
discussion (attached patch)
this patch doesn't apply cleanly to head... can you update it, please?
--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157
On Fri, Jul 10, 2009 at 12:23 PM, Jaime
Casanova<jcasanov@systemguards.com.ec> wrote:
On Sun, Jan 25, 2009 at 6:57 PM, Mark Kirkwood<markir@paradise.net.nz> wrote:
So here is my initial attempt at this, at this point merely to spark
discussion (attached patch)
this patch doesn't apply cleanly to head... can you update it, please?
i did it myself, i think this is something we need...
this compiles and seems to work... something i was wondering is that
having the total time of lock waits is not very accurate, because we
can have 9 lock waits of 1 sec each and one waiting for 1
minute... simply summing them all will give a bad statistic, or am i
missing something?
Attachment: lockstats-2.patch (+681, -23)
Jaime Casanova wrote:
i did it myself, i think this is something we need...
this compiles and seems to work... something i was wondering is that
having the total time of lock waits is not very accurate, because we
can have 9 lock waits of 1 sec each and one waiting for 1
minute... simply summing them all will give a bad statistic, or am i
missing something?
Thank you Jaime - looks good. I seem to have been asleep at the wheel
and missed *both* of your emails until now, belated apologies for that
- especially the first one :-(
With respect to the sum of wait times being not very granular, yes -
quite true. I was thinking it is useful to be able to answer the
question 'where is my wait time being spent' - but it hides cases like
the one you mention. What would you like to see? would max and min wait
times be a useful addition, or are you thinking along different lines?
Cheers
Mark
On Fri, Jul 17, 2009 at 3:38 AM, Mark Kirkwood<markir@paradise.net.nz> wrote:
With respect to the sum of wait times being not very granular, yes - quite
true. I was thinking it is useful to be able to answer the question 'where
is my wait time being spent' - but it hides cases like the one you mention.
What would you like to see? would max and min wait times be a useful
addition, or are you thinking along different lines?
track number of locks, sum of wait times, max(wait time).
but actually i started to think that the best is just make use of
log_lock_waits send the logs to csvlog and analyze there...
Jaime Casanova wrote:
On Fri, Jul 17, 2009 at 3:38 AM, Mark Kirkwood<markir@paradise.net.nz> wrote:
With respect to the sum of wait times being not very granular, yes - quite
true. I was thinking it is useful to be able to answer the question 'where
is my wait time being spent' - but it hides cases like the one you mention.
What would you like to see? would max and min wait times be a useful
addition, or are you thinking along different lines?
track number of locks, sum of wait times, max(wait time).
but actually i started to think that the best is just make use of
log_lock_waits send the logs to csvlog and analyze there...
Right - I'll look at adding max (at least) early next week.
Yeah, enabling log_lock_waits is certainly another approach, however you
currently miss out on those that are < deadlock_timeout - and
potentially they could be the source of your problem (i.e millions of
waits all < deadlock_timeout but taken together rather significant).
This shortcoming could be overcome by making the cutoff wait time
decoupled from deadlock_timeout (e.g a new parameter
log_min_lock_wait_time or similar).
I'm thinking that having the lock waits easily analyzable via SQL may
mean that most people won't need to collect and analyze their
logs for this stuff (they can just examine the lock stats view from pgAdmin
or similar).
Cheers
Mark
Mark Kirkwood <markir@paradise.net.nz> writes:
Yeah, enabling log_lock_waits is certainly another approach, however you
currently miss out on those that are < deadlock_timeout - and
potentially they could be the source of your problem (i.e millions of
waits all < deadlock_timeout but taken together rather significant).
This shortcoming could be overcome by making the cutoff wait time
decoupled from deadlock_timeout (e.g a new parameter
log_min_lock_wait_time or similar).
The reason that they're tied together is to keep from creating
unreasonable complexity (and an unreasonable number of extra kernel
calls) in management of the timeout timers. You will find that you
can't just wave your hand and decree that they are now decoupled.
regards, tom lane
Tom Lane wrote:
Mark Kirkwood <markir@paradise.net.nz> writes:
Yeah, enabling log_lock_waits is certainly another approach, however you
currently miss out on those that are < deadlock_timeout - and
potentially they could be the source of your problem (i.e millions of
waits all < deadlock_timeout but taken together rather significant).
This shortcoming could be overcome by making the cutoff wait time
decoupled from deadlock_timeout (e.g a new parameter
log_min_lock_wait_time or similar).
The reason that they're tied together is to keep from creating
unreasonable complexity (and an unreasonable number of extra kernel
calls) in management of the timeout timers. You will find that you
can't just wave your hand and decree that they are now decoupled.
Thanks Tom - I did wonder if there was a deeper reason they were tied
together!
Cheers
Mark
Mark Kirkwood wrote:
Jaime Casanova wrote:
On Fri, Jul 17, 2009 at 3:38 AM, Mark Kirkwood<markir@paradise.net.nz> wrote:
With respect to the sum of wait times being not very granular, yes - quite
true. I was thinking it is useful to be able to answer the question 'where
is my wait time being spent' - but it hides cases like the one you mention.
What would you like to see? would max and min wait times be a useful
addition, or are you thinking along different lines?
track number of locks, sum of wait times, max(wait time).
but actually i started to think that the best is just make use of
log_lock_waits send the logs to csvlog and analyze there...
Right - I'll look at adding max (at least) early next week.
I'm also thinking of taking a look at amalgamating transaction type lock
waits. This seems like a good idea because:
- individually, and viewed at a later date, I don't think the
individual detail is going to be useful
- there will be a lot of them
- I think the statistical data (count, sum elapsed, max elapsed) may be
sufficiently interesting
Cheers
Mark
Mark Kirkwood wrote:
Jaime Casanova wrote:
On Fri, Jul 17, 2009 at 3:38 AM, Mark Kirkwood<markir@paradise.net.nz> wrote:
With respect to the sum of wait times being not very granular, yes - quite
true. I was thinking it is useful to be able to answer the question 'where
is my wait time being spent' - but it hides cases like the one you mention.
What would you like to see? would max and min wait times be a useful
addition, or are you thinking along different lines?
track number of locks, sum of wait times, max(wait time).
but actually i started to think that the best is just make use of
log_lock_waits send the logs to csvlog and analyze there...
Right - I'll look at adding max (at least) early next week.
Patch with max(wait time).
Still TODO
- amalgamate individual transaction lock waits
- redo (rather ugly) temporary pg_stat_lock_waits in a form more like
pg_locks
Mark Kirkwood wrote:
Mark Kirkwood wrote:
Jaime Casanova wrote:
On Fri, Jul 17, 2009 at 3:38 AM, Mark Kirkwood<markir@paradise.net.nz> wrote:
With respect to the sum of wait times being not very granular, yes - quite
true. I was thinking it is useful to be able to answer the question 'where
is my wait time being spent' - but it hides cases like the one you mention.
What would you like to see? would max and min wait times be a useful
addition, or are you thinking along different lines?
track number of locks, sum of wait times, max(wait time).
but actually i started to think that the best is just make use of
log_lock_waits send the logs to csvlog and analyze there...
Right - I'll look at adding max (at least) early next week.
Patch with max(wait time).
Still TODO
- amalgamate individual transaction lock waits
- redo (rather ugly) temporary pg_stat_lock_waits in a form more like
pg_locks
This version has the individual transaction lock waits amalgamated.
Still TODO: redo pg_stat_lock_waits ...
Mark,
Your last email on this patch, from August 9th, indicates that you've
still got "TODO: redo pg_stat_lock_waits ...". Have you updated this
patch since then?
Thanks!
Stephen
I have this patch, if you're interested.
LWLock Instrumentation Patch
- counts locks and waits in shared and exclusive mode
- for selected locks, measures wait and hold times
- for selected locks, displays a histogram of wait and hold times
- information is printed at backend exit
Configurable by #define's in lwlock.c
Regards,
pierre
Attachment: lwlock_instrument_v4.patch (+444, -90)
Pierre,
Configurable by #define's in lwlock.c
Given that we already have dtrace/systemtap probes around the lwlocks,
is there some way you could use those instead of extra #defines?
--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com
Stephen Frost wrote:
Mark,
Your last email on this patch, from August 9th, indicates that you've
still got "TODO: redo pg_stat_lock_waits ...". Have you updated this
patch since then?
Stephen,
No - that is still a TODO for me - real life getting in the way :-)
Cheers
Mark
On Sat, Aug 8, 2009 at 7:47 PM, Mark Kirkwood <markir@paradise.net.nz> wrote:
Patch with max(wait time).
Still TODO
- amalgamate individual transaction lock waits
- redo (rather ugly) temporary pg_stat_lock_waits in a form more like
pg_locks
This version has the individual transaction lock waits amalgamated.
Still TODO: redo pg_stat_lock_waits ...
it applies with some hunks, compiles fine and seems to work...
i'm still not sure this is what we need, some more comments could be helpful.
what kind of questions are we capable of answering with this, and what
kind of questions are we still missing?
for example, now we know "number of locks that had to wait", "total
time waiting" and "max time waiting for a single lock"... but still we
can have an inaccurate understanding if we have lots of locks waiting
little time and a few waiting a huge amount of time...
something i have been asked when the system starts to slow down is "can we
know if there was lock contention in that period"? for now the only
way to answer that is logging locks
about the patch itself:
you haven't documented it either: what is the pg_stat_lock_waits view
for, and what are those fields it has?
i'll leave this patch as "needs review" for more people to comment on it...
On Mon, Sep 28, 2009 at 12:14 AM, Jaime Casanova
<jcasanov@systemguards.com.ec> wrote:
On Sat, Aug 8, 2009 at 7:47 PM, Mark Kirkwood <markir@paradise.net.nz> wrote:
Patch with max(wait time).
Still TODO
- amalgamate individual transaction lock waits
- redo (rather ugly) temporary pg_stat_lock_waits in a form more like
pg_locks
This version has the individual transaction lock waits amalgamated.
Still TODO: redo pg_stat_lock_waits ...
it applies with some hunks, compiles fine and seems to work...
i'm still not sure this is what we need, some more comments could be helpful.
I'm pretty sure the logic of this patch is not correct.
in pgstat_init_lock_wait(LOCKTAG *locktag)
...
+ l_curr = htabent->l_counts.l_tot_wait_time;
+ INSTR_TIME_SET_CURRENT(l_start);
+ INSTR_TIME_ADD(l_curr, l_start);
+ htabent->l_counts.l_tot_wait_time = l_curr;
in pgstat_end_lock_wait(LOCKTAG *locktag)
...
+ l_start = htabent->l_counts.l_tot_wait_time;
+ INSTR_TIME_SET_CURRENT(l_end);
+ INSTR_TIME_SUBTRACT(l_end, l_start);
+ htabent->l_counts.l_tot_wait_time = l_end;
So l_start = time cumulatively waited previously + time at start of this wait.
l_end - l_start is equal to:
= time at end of this wait - ( time at start of this wait + time
cumulatively waited previously)
= (time at end of this wait - time at start of this wait) - time
cumulatively waited previously
= (duration of this wait) - time waited cumulatively previously.
That minus sign in the last line can't be good, can it?
Also
+ htabent->l_counts.l_tot_wait_time = l_end;
+
+ if (INSTR_TIME_GET_MICROSEC(l_end) >
INSTR_TIME_GET_MICROSEC(htabent->l_counts.l_max_wait_time))
+ htabent->l_counts.l_max_wait_time = l_end;
The total wait time is equal to the max wait time (which are both
equal to l_end)?
One or both of those has to end up being wrong. At this stage, is
l_end supposed to be the last wait time, or the cumulative wait time?
One of the things in the patch review checklist is about compiler
warnings, so I am reporting these:
lock.c: In function `LockAcquire':
lock.c:797: warning: passing arg 1 of `pgstat_init_lock_wait' discards
qualifiers from pointer target type
lock.c:802: warning: passing arg 1 of `pgstat_end_lock_wait' discards
qualifiers from pointer target type
Cheers,
Jeff
Jaime Casanova wrote:
it applies with some hunks, compiles fine and seems to work...
i'm still not sure this is what we need, some more comments could be helpful.
Yeah, that's the big question. Are the current capabilities (logging 'em
for waits > deadlock timeout + dtrace hooks) enough? I'm thinking that
if we had dtrace for linux generally available, then the need for this
patch would be lessened.
what kind of questions are we capable of answering with this, and what
kind of questions are we still missing?
for example, now we know "number of locks that had to wait", "total
time waiting" and "max time waiting for a single lock"... but still we
can have an inaccurate understanding if we have lots of locks waiting
little time and a few waiting a huge amount of time...
something i have been asked when the system starts to slow down is "can we
know if there was lock contention in that period"? for now the only
way to answer that is logging locks
Right - there still may be other aggregates that need to be captured....
it would be great to have some more feedback from the field about this.
In my case, I was interested in seeing if the elapsed time was being
spent waiting for locks or actually executing (in fact it turned out to
be the latter - but was still very useful to be able to rule out locking
issues). However, as you mention, there may be cases where the
question is more about part of the system suffering a disproportionate
number of (or time spent in) lock waits...
about the patch itself:
you haven't documented either. what is the pg_stat_lock_waits view
for? and what are those fieldx it has?
Yeah, those fields are straight from the LOCKTAG structure. I need to
redo the view to be more like pg_locks, and also do the doco. However
I've been a bit hesitant to dive into these last two steps until I see
that there is some real enthusiasm for this patch (or failing that, a
feeling that it is needed...).
i'll leave this patch as "needs review" for more people to comment on it...
Agreed, thanks for the comments.
Mark
Jeff Janes wrote:
The total wait time is equal to the max wait time (which are both
equal to l_end)?
One or both of those has to end up being wrong. At this stage, is
l_end supposed to be the last wait time, or the cumulative wait time?
Hmm - I may well have fat-fingered the arithmetic; thanks, I'll take a look!
One of the things in the patch review checklist is about compiler
warnings, so I am reporting these:
lock.c: In function `LockAcquire':
lock.c:797: warning: passing arg 1 of `pgstat_init_lock_wait' discards
qualifiers from pointer target type
lock.c:802: warning: passing arg 1 of `pgstat_end_lock_wait' discards
qualifiers from pointer target type
Right, will look at too.
Cheers
Mark
On Mon, Sep 28, 2009 at 12:14 AM, Jaime Casanova
<jcasanov@systemguards.com.ec> wrote:
On Sat, Aug 8, 2009 at 7:47 PM, Mark Kirkwood <markir@paradise.net.nz> wrote:
Patch with max(wait time).
Still TODO
- amalgamate individual transaction lock waits
- redo (rather ugly) temporary pg_stat_lock_waits in a form more like
pg_locks
This version has the individual transaction lock waits amalgamated.
Still TODO: redo pg_stat_lock_waits ...
it applies with some hunks, compiles fine and seems to work...
i'm still not sure this is what we need, some more comments could be helpful.
what kind of questions are we capable of answering with this, and what
kind of questions are we still missing?
for example, now we know "number of locks that had to wait", "total
time waiting" and "max time waiting for a single lock"... but still we
can have an inaccurate understanding if we have lots of locks waiting
little time and a few waiting a huge amount of time...
Aren't the huge ones already loggable from the deadlock detector?
With the max, we can at least put an upper limit on how long the
longest ones could have been. However, is there a way to reset the
max? I tried deleting data/pg_stat_tmp, but that didn't work. With
cumulative values, you can take snapshots and then take the
difference of them, but that won't work with max. If the max can't be
reset except with an initdb, I think that makes it barely usable.
something i have been asked when the system starts to slow down is "can we
know if there was lock contention in that period"? for now the only
way to answer that is logging locks
I was surprised to find that running with track_locks on did not cause
a detectable difference in performance, so you could just routinely do
regularly scheduled snapshots and go back and mine them over the time
that a problem was occurring. I just checked with pgbench over
various levels of concurrency and fsync settings. If potential
slowness wouldn't show up there, I don't know how else to look for it.
Cheers,
Jeff