Possible performance regression in version 10.1 with pgbench read-write tests.

Started by Mithun Cy, almost 8 years ago, 23 messages
#1 Mithun Cy
mithun.cy@enterprisedb.com
2 attachment(s)

Hi all,

When I was doing read-write pgbench benchmarking of PostgreSQL
9.6.6 vs 10.1, I found that PostgreSQL 10.1 regresses against 9.6.6 in
some cases.

Non-default settings and test
======================
Server:
./postgres -c shared_buffers=8GB -N 200 -c min_wal_size=15GB -c
max_wal_size=20GB -c checkpoint_timeout=900 -c
maintenance_work_mem=1GB -c checkpoint_completion_target=0.9 &

Pgbench:
CASE 1: when data fits shared buffers.
./pgbench -i -s 1000 postgres

CASE 2: when data exceeds shared buffers.
./pgbench -i -s 1000 postgres

./pgbench -c $threads -j $threads -T 1800 -M prepared postgres

The attached script "perf_buff_mgmt_write-2.sh" can be used to run the same tests.
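
For readers without the attachment, the run above could be driven by a loop along these lines. This is only a sketch and not the attached script: the client counts in CLIENTS and the DRY_RUN switch are assumptions made for illustration; only the pgbench command line itself comes from this mail.

```shell
#!/bin/sh
# Sketch of a driver loop; NOT the attached perf_buff_mgmt_write-2.sh.
# CLIENTS and DRY_RUN are illustrative assumptions.
CLIENTS="${CLIENTS:-1 8 16 32 64 72 128}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = run them

# Build the pgbench command line for a given client count.
pgbench_cmd() {
    echo "./pgbench -c $1 -j $1 -T 1800 -M prepared postgres"
}

# Print (or run) one pgbench invocation per client count.
run_all() {
    for threads in $CLIENTS; do
        cmd=$(pgbench_cmd "$threads")
        if [ "$DRY_RUN" = "1" ]; then
            echo "$cmd"
        else
            $cmd
        fi
    done
}

run_all
```

With the default DRY_RUN=1 the loop only prints the commands, which makes it easy to check the client counts before committing to 30 minutes per run.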

Machine: "cthulhu", an 8-node NUMA machine with 128 hyperthreads.
===================================================

numactl --hardware

available: 8 nodes (0-7)
node 0 cpus: 0 65 66 67 68 69 70 71 96 97 98 99 100 101 102 103
node 0 size: 65498 MB
node 0 free: 37885 MB
node 1 cpus: 72 73 74 75 76 77 78 79 104 105 106 107 108 109 110 111
node 1 size: 65536 MB
node 1 free: 31215 MB
node 2 cpus: 80 81 82 83 84 85 86 87 112 113 114 115 116 117 118 119
node 2 size: 65536 MB
node 2 free: 15331 MB
node 3 cpus: 88 89 90 91 92 93 94 95 120 121 122 123 124 125 126 127
node 3 size: 65536 MB
node 3 free: 36774 MB
node 4 cpus: 1 2 3 4 5 6 7 8 33 34 35 36 37 38 39 40
node 4 size: 65536 MB
node 4 free: 62 MB
node 5 cpus: 9 10 11 12 13 14 15 16 41 42 43 44 45 46 47 48
node 5 size: 65536 MB
node 5 free: 9653 MB
node 6 cpus: 17 18 19 20 21 22 23 24 49 50 51 52 53 54 55 56
node 6 size: 65536 MB
node 6 free: 50209 MB
node 7 cpus: 25 26 27 28 29 30 31 32 57 58 59 60 61 62 63 64
node 7 size: 65536 MB
node 7 free: 43966 MB

CASE 1:
In 9.6.6, peak performance is reached at 72 concurrent clients with TPS
35554.573858; in 10.1 at 72 clients, TPS dips to 26882.828133, a
decrease of roughly 24%.

CASE 2:
In 9.6.6, peak performance is reached at 72 concurrent clients with TPS
24861.074079; in 10.1 at 72 clients, TPS dips to 18372.565663, a
decrease of roughly 26%.
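
The relative drop can be recomputed from the TPS figures as (old - new) / old * 100. A small sketch; the helper name pct_drop is made up for illustration:

```shell
# pct_drop OLD NEW: percentage decrease from OLD to NEW, one decimal place.
pct_drop() {
    awk -v old="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (old - new) / old * 100 }'
}

pct_drop 35554.573858 26882.828133   # CASE 1: 9.6.6 vs 10.1 at 72 clients
pct_drop 24861.074079 18372.565663   # CASE 2: 9.6.6 vs 10.1 at 72 clients
```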

The attached "Postgresql_benchmarking_9.6vs10.ods" gives more detailed
TPS numbers. Each TPS figure is the median of 3 runs.
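
Taking the median of three runs is simple to script. A minimal sketch; the function name median3 is an assumption, and the values in the usage line are placeholders, not measured results:

```shell
# median3 A B C: print the median of three numeric values.
median3() {
    # Sort numerically in the C locale so "." is the decimal point,
    # then pick the middle line.
    printf '%s\n' "$1" "$2" "$3" | LC_ALL=C sort -n | sed -n '2p'
}

median3 100.0 300.0 200.0   # placeholder values
```

The median is less sensitive than the mean to a single outlier run, for example one that happens to overlap a checkpoint.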

I have not run bisect yet to find what has caused the issue.

--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com

Attachments:

Postgresql_benchmarking_9.6vs10.ods (application/vnd.oasis.opendocument.spreadsheet)
perf_buff_mgmt_write-2.sh (application/x-sh)

#2 Amit Kapila
amit.kapila16@gmail.com
In reply to: Mithun Cy (#1)
Re: Possible performance regression in version 10.1 with pgbench read-write tests.

On Wed, Jan 24, 2018 at 12:06 AM, Mithun Cy <mithun.cy@enterprisedb.com> wrote:

Hi all,

When I was doing read-write pgbench benchmarking of PostgreSQL
9.6.6 vs 10.1, I found that PostgreSQL 10.1 regresses against 9.6.6 in
some cases.

Non-default settings and test
======================
Server:
./postgres -c shared_buffers=8GB -N 200 -c min_wal_size=15GB -c
max_wal_size=20GB -c checkpoint_timeout=900 -c
maintenance_work_mem=1GB -c checkpoint_completion_target=0.9 &

Pgbench:
CASE 1: when data fits shared buffers.
./pgbench -i -s 1000 postgres

CASE 2: when data exceeds shared buffers.
./pgbench -i -s 1000 postgres

Both the cases look identical, but from the document attached, it
seems the case-1 is for scale factor 300.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#3 Mithun Cy
mithun.cy@enterprisedb.com
In reply to: Amit Kapila (#2)
Re: Possible performance regression in version 10.1 with pgbench read-write tests.

On Wed, Jan 24, 2018 at 7:36 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

Both the cases look identical, but from the document attached, it
seems the case-1 is for scale factor 300.

Oops, sorry, that was a typo. CASE 1 uses scale factor 300, which fits
in shared_buffers = 8GB.

--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com

#4 Robert Haas
robertmhaas@gmail.com
In reply to: Mithun Cy (#1)
Re: Possible performance regression in version 10.1 with pgbench read-write tests.

On Wed, Feb 21, 2018 at 10:03 PM, Mithun Cy <mithun.cy@enterprisedb.com> wrote:

Seeing futex in the call stack, Andres suggested that the following
commit could be the reason for the regression:

commit ecb0d20a9d2e09b7112d3b192047f711f9ff7e59
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: 2016-10-09 18:03:45 -0400

Use unnamed POSIX semaphores, if available, on Linux and FreeBSD.

After commenting out the same in src/template/linux, I ran the benchmark
tests again; performance improved from 26871.567326 to 34286.620251 TPS
(each the median of 3 TPS results).

Hmm. So that commit might not have been the greatest idea.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#5 Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#4)
Re: Possible performance regression in version 10.1 with pgbench read-write tests.

On Thu, Feb 22, 2018 at 7:55 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Feb 21, 2018 at 10:03 PM, Mithun Cy <mithun.cy@enterprisedb.com> wrote:

seeing futex in the call stack andres suggested that following commit could
be the reason for regression

commit ecb0d20a9d2e09b7112d3b192047f711f9ff7e59
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: 2016-10-09 18:03:45 -0400

Use unnamed POSIX semaphores, if available, on Linux and FreeBSD.

Commenting out same in src/template/linux I did run the benchmark tests
again
performance improved from 26871.567326 to 34286.620251 (both median of 3
TPS).

Hmm. So that commit might not have been the greatest idea.

It appears so. I think we should do something about it as the
regression is quite noticeable.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#6Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Amit Kapila (#5)
Re: Possible performance regression in version 10.1 with pgbench read-write tests.

On 2018-Jul-19, Amit Kapila wrote:

On Thu, Feb 22, 2018 at 7:55 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Feb 21, 2018 at 10:03 PM, Mithun Cy <mithun.cy@enterprisedb.com> wrote:

seeing futex in the call stack andres suggested that following commit could
be the reason for regression

commit ecb0d20a9d2e09b7112d3b192047f711f9ff7e59
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: 2016-10-09 18:03:45 -0400

Use unnamed POSIX semaphores, if available, on Linux and FreeBSD.

Hmm. So that commit might not have been the greatest idea.

It appears so. I think we should do something about it as the
regression is quite noticeable.

So the fix is just to revert the change for the linux makefile? Sounds
easy enough, code-wise. Do we need more evidence that it's harmful?

Since it was changed in pg10 not 11, I don't think this is an open-item
per se. (Maybe an "older bug", if we must really have it there.)

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#7Andres Freund
andres@anarazel.de
In reply to: Mithun Cy (#1)
Re: Possible performance regression in version 10.1 with pgbench read-write tests.

Hi,

On 2018-01-24 00:06:44 +0530, Mithun Cy wrote:

Server:
./postgres -c shared_buffers=8GB -N 200 -c min_wal_size=15GB -c
max_wal_size=20GB -c checkpoint_timeout=900 -c
maintenance_work_mem=1GB -c checkpoint_completion_target=0.9 &

Which kernel & glibc version does this server have?

Greetings,

Andres Freund

#8Andres Freund
andres@anarazel.de
In reply to: Alvaro Herrera (#6)
Re: Possible performance regression in version 10.1 with pgbench read-write tests.

Hi,

On 2018-07-19 15:39:44 -0400, Alvaro Herrera wrote:

On 2018-Jul-19, Amit Kapila wrote:

On Thu, Feb 22, 2018 at 7:55 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Feb 21, 2018 at 10:03 PM, Mithun Cy <mithun.cy@enterprisedb.com> wrote:

seeing futex in the call stack andres suggested that following commit could
be the reason for regression

commit ecb0d20a9d2e09b7112d3b192047f711f9ff7e59
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: 2016-10-09 18:03:45 -0400

Use unnamed POSIX semaphores, if available, on Linux and FreeBSD.

Hmm. So that commit might not have been the greatest idea.

It appears so. I think we should do something about it as the
regression is quite noticeable.

So the fix is just to revert the change for the linux makefile? Sounds
easy enough, code-wise. Do we need more evidence that it's harmful?

Since it was changed in pg10 not 11, I don't think this is an open-item
per se. (Maybe an "older bug", if we must really have it there.)

I'm a bit hesitant to just revert without further evaluation - it's just
about as likely we'll regress on other hardware / kernel
versions. Except it'd be in a minor release, whereas the current issue
was in a major release. It'd also suddenly make some installations not
start, due to sysv semaphore # limitations.

There've been a few annoying, and a few embarrassing, issues with
futexes, but they receive far more attention from a performance POV.

Greetings,

Andres Freund

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#6)
Re: Possible performance regression in version 10.1 with pgbench read-write tests.

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

On 2018-Jul-19, Amit Kapila wrote:

It appears so. I think we should do something about it as the
regression is quite noticeable.

It's not *that* noticeable, as I failed to demonstrate any performance
difference before committing the patch. I think some more investigation
is warranted to find out why some other people are getting different
results.

So the fix is just to revert the change for the linux makefile? Sounds
easy enough, code-wise. Do we need more evidence that it's harmful?

Some fraction of 3d21f08bc would also need to be undone. It's also worth
contemplating that we'd be re-introducing old problems with not-enough-
SysV-semaphores, so even if there is a performance benefit to be had,
it's very far from being free.

regards, tom lane

#10Mithun Cy
mithun.cy@enterprisedb.com
In reply to: Andres Freund (#7)
Re: Possible performance regression in version 10.1 with pgbench read-write tests.

Hi Andres,

On Fri, Jul 20, 2018 at 1:21 AM, Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2018-01-24 00:06:44 +0530, Mithun Cy wrote:

Server:
./postgres -c shared_buffers=8GB -N 200 -c min_wal_size=15GB -c
max_wal_size=20GB -c checkpoint_timeout=900 -c
maintenance_work_mem=1GB -c checkpoint_completion_target=0.9 &

Which kernel & glibc version does this server have?

[mithun.cy@cthulhu ~]$ cat /proc/version
Linux version 3.10.0-693.5.2.el7.x86_64 (builder@kbuilder.dev.centos.org)
(gcc version 4.8.5 20150623 (Red Hat 4.8.5-16) (GCC) ) #1 SMP Fri Oct 20
20:32:50 UTC 2017

[mithun.cy@cthulhu ~]$ ldd --version
ldd (GNU libc) 2.17

--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#8)
Re: Possible performance regression in version 10.1 with pgbench read-write tests.

Andres Freund <andres@anarazel.de> writes:

I'm a bit hesitant to just revert without further evaluation - it's just
about as likely we'll regress on other hardware / kernel
versions.

I looked into the archives for the discussion that led up to ecb0d20a9,
and found it here:

/messages/by-id/8536.1475704230@sss.pgh.pa.us

The test cases I tried in that thread said that POSIX semas were *faster*
... by single-digit percentages, but still faster. So I think we really
need to study this issue, rather than just take one contrary result as
being gospel.

regards, tom lane

#12Thomas Munro
thomas.munro@enterprisedb.com
In reply to: Tom Lane (#9)
Re: Possible performance regression in version 10.1 with pgbench read-write tests.

On Fri, Jul 20, 2018 at 7:56 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

On 2018-Jul-19, Amit Kapila wrote:

It appears so. I think we should do something about it as the
regression is quite noticeable.

It's not *that* noticeable, as I failed to demonstrate any performance
difference before committing the patch. I think some more investigation
is warranted to find out why some other people are getting different
results.

Maybe false sharing is a factor, since sizeof(sem_t) is 32 bytes on
Linux/amd64 and we're probably hitting elements clustered at one end
of the array? Let's see... I tried sticking padding into
PGSemaphoreData and I got ~8% more TPS (72 client on multi socket
box, pgbench scale 100, only running for a minute but otherwise the
same settings that Mithun showed).

--- a/src/backend/port/posix_sema.c
+++ b/src/backend/port/posix_sema.c
@@ -45,6 +45,7 @@
 typedef struct PGSemaphoreData
 {
        sem_t           pgsem;
+       char            padding[PG_CACHE_LINE_SIZE - sizeof(sem_t)];
 } PGSemaphoreData;

That's probably not the right idiom and my tests probably weren't long
enough, but there seems to be some effect here.
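
As a rough illustration of the layout arithmetic behind that padding expression, here is a minimal sketch. It is not PostgreSQL code: the helper names are hypothetical, and the 32-byte element and 64-byte or 128-byte line sizes are simply the figures mentioned in this thread. It assumes the array starts on a cache line boundary.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: largest number of array elements that can begin
 * within a single cache line, for elements of elem_size bytes packed
 * back to back in an array whose base is line-aligned.
 */
static size_t
elements_per_line(size_t elem_size, size_t line_size)
{
	assert(elem_size > 0 && line_size > 0);
	if (elem_size >= line_size)
		return 1;
	return (line_size + elem_size - 1) / elem_size;	/* ceil(line / elem) */
}

/*
 * Hypothetical helper mirroring "PG_CACHE_LINE_SIZE - sizeof(sem_t)":
 * bytes of padding needed so that one element fills a whole line.
 * Only meaningful when the element is no larger than the line.
 */
static size_t
padding_needed(size_t elem_size, size_t line_size)
{
	assert(elem_size <= line_size);
	return line_size - elem_size;
}
```

With a 32-byte element, two elements start in each 64-byte line, so two backends waiting on neighbouring semaphores can touch the same line. Padding to a 128-byte stride needs 96 extra bytes per element. The assertion in `padding_needed` marks the weak spot of the subtraction idiom: it breaks down if the element is ever larger than the chosen line size.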

--
Thomas Munro
http://www.enterprisedb.com

#13Mithun Cy
mithun.cy@enterprisedb.com
In reply to: Thomas Munro (#12)
Re: Possible performance regression in version 10.1 with pgbench read-write tests.

On Fri, Jul 20, 2018 at 10:52 AM, Thomas Munro <
thomas.munro@enterprisedb.com> wrote:

On Fri, Jul 20, 2018 at 7:56 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

It's not *that* noticeable, as I failed to demonstrate any performance
difference before committing the patch. I think some more investigation
is warranted to find out why some other people are getting different
results

Maybe false sharing is a factor, since sizeof(sem_t) is 32 bytes on
Linux/amd64 and we're probably hitting elements clustered at one end
of the array? Let's see... I tried sticking padding into
PGSemaphoreData and I got ~8% more TPS (72 client on multi socket
box, pgbench scale 100, only running for a minute but otherwise the
same settings that Mithun showed).

--- a/src/backend/port/posix_sema.c
+++ b/src/backend/port/posix_sema.c
@@ -45,6 +45,7 @@
 typedef struct PGSemaphoreData
 {
        sem_t           pgsem;
+       char            padding[PG_CACHE_LINE_SIZE - sizeof(sem_t)];
 } PGSemaphoreData;

That's probably not the right idiom and my tests probably weren't long
enough, but there seems to be some effect here.

I did a quick test applying the patch with same settings as initial mail I
have reported (On postgresql 10 latest code)
72 clients

CASE 1:
Without Patch : TPS 29269.823540

With Patch : TPS 36005.544960. --- 23% jump

Just Disabling using unnamed POSIX semaphores: TPS 34481.207959

So it seems that is the issue as the test is being run on 8 node numa
machine.
I also came across a presentation [1] : slide 20 which says one of those
futex architecture is bad for NUMA machine. I am not sure the new fix for
same is included as part of Linux version 3.10.0-693.5.2.el7.x86_64 which
is on my test machine.

[1]: https://www.slideshare.net/davidlohr/futex-scaling-for-multicore-systems
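
For readers who want to check the percentages, here is a small sketch of the arithmetic. The helper name is hypothetical, and the inputs are the single-run TPS figures reported above, so the results carry whatever run-to-run noise those figures have.

```c
#include <assert.h>

/* Hypothetical helper: relative change from `before` to `after`, in percent. */
static double
percent_change(double before, double after)
{
	assert(before != 0.0);
	return (after - before) / before * 100.0;
}
```

Applied to the CASE 1 numbers, `percent_change(29269.823540, 36005.544960)` comes to about 23.0, matching the quoted jump, and `percent_change(29269.823540, 34481.207959)` comes to about 17.8 for the build with unnamed POSIX semaphores disabled.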

--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com

#14Andres Freund
andres@anarazel.de
In reply to: Mithun Cy (#13)
Re: Possible performance regression in version 10.1 with pgbench read-write tests.

Hi,

On 2018-07-21 00:53:28 +0530, Mithun Cy wrote:

On Fri, Jul 20, 2018 at 10:52 AM, Thomas Munro <
thomas.munro@enterprisedb.com> wrote:

On Fri, Jul 20, 2018 at 7:56 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

It's not *that* noticeable, as I failed to demonstrate any performance
difference before committing the patch. I think some more investigation
is warranted to find out why some other people are getting different
results

Maybe false sharing is a factor, since sizeof(sem_t) is 32 bytes on
Linux/amd64 and we're probably hitting elements clustered at one end
of the array? Let's see... I tried sticking padding into
PGSemaphoreData and I got ~8% more TPS (72 client on multi socket
box, pgbench scale 100, only running for a minute but otherwise the
same settings that Mithun showed).

--- a/src/backend/port/posix_sema.c
+++ b/src/backend/port/posix_sema.c
@@ -45,6 +45,7 @@
 typedef struct PGSemaphoreData
 {
        sem_t           pgsem;
+       char            padding[PG_CACHE_LINE_SIZE - sizeof(sem_t)];
 } PGSemaphoreData;

That's probably not the right idiom and my tests probably weren't long
enough, but there seems to be some effect here.

I did a quick test applying the patch with same settings as initial mail I
have reported (On postgresql 10 latest code)
72 clients

CASE 1:
Without Patch : TPS 29269.823540

With Patch : TPS 36005.544960. --- 23% jump

Just Disabling using unnamed POSIX semaphores: TPS 34481.207959

So it seems that is the issue as the test is being run on 8 node numa
machine.

Cool. I think we should just backpatch that then. Does anybody want to
argue against?

I also came across a presentation [1] : slide 20 which says one of those
futex architecture is bad for NUMA machine. I am not sure the new fix for
same is included as part of Linux version 3.10.0-693.5.2.el7.x86_64 which
is on my test machine.

Similar issues are also present internally for sysv semas, so I don't
think this really means that much.

Greetings,

Andres Freund

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#14)
Re: Possible performance regression in version 10.1 with pgbench read-write tests.

Andres Freund <andres@anarazel.de> writes:

On 2018-07-21 00:53:28 +0530, Mithun Cy wrote:

I did a quick test applying the patch with same settings as initial mail I
have reported (On postgresql 10 latest code)
72 clients

CASE 1:
Without Patch : TPS 29269.823540

With Patch : TPS 36005.544960. --- 23% jump

Just Disabling using unnamed POSIX semaphores: TPS 34481.207959

So it seems that is the issue as the test is being run on 8 node numa
machine.

Cool. I think we should just backpatch that then. Does anybody want to
argue against?

Not entirely clear to me what change is being proposed here?

In any case, I strongly resist making performance-based changes on
the basis of one test on one kernel and one hardware platform.
We should reproduce the results on a few different machines before
we even think of committing anything. I'm happy to test on what
I have, although I'd be the first to agree that what I'm checking
is relatively low-end cases. (Too bad hydra's gone.)

regards, tom lane

#16Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#15)
Re: Possible performance regression in version 10.1 with pgbench read-write tests.

On 2018-07-20 15:35:39 -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On 2018-07-21 00:53:28 +0530, Mithun Cy wrote:

I did a quick test applying the patch with same settings as initial mail I
have reported (On postgresql 10 latest code)
72 clients

CASE 1:
Without Patch : TPS 29269.823540

With Patch : TPS 36005.544960. --- 23% jump

Just Disabling using unnamed POSIX semaphores: TPS 34481.207959

So it seems that is the issue as the test is being run on 8 node numa
machine.

Cool. I think we should just backpatch that then. Does anybody want to
argue against?

Not entirely clear to me what change is being proposed here?

Adding padding to struct PGSemaphoreData, so multiple semas don't share
a cacheline.

In any case, I strongly resist making performance-based changes on
the basis of one test on one kernel and one hardware platform.
We should reproduce the results on a few different machines before
we even think of committing anything. I'm happy to test on what
I have, although I'd be the first to agree that what I'm checking
is relatively low-end cases. (Too bad hydra's gone.)

Sure, it'd be good to do more of that. But from a theoretical POV it's
quite logical that posix semas sharing cachelines is bad for
performance, if there's any contention. When backed by futexes -
i.e. all non ancient linux machines - the hot path just does a cmpxchg
of the *userspace* data (I've copied the relevant code below). Given
that we don't have a large number of semas these days, that there's
reasons to make the change even without measuring it, that we have
benchmark results, and that it's hard to see how it'd cause regressions,
I don't think going for a fix quickly is unreasonable.

You could argue it'd be better to make semaphores be embeddable in
bigger structures like PGPROC, rather than allocated in an array. While
I suspect you'd get a bit of a performance benefit from that, it seems
like a far bigger change we'd want to do in a minor release.

int
__new_sem_wait (sem_t *sem)
{
/* We need to check whether we need to act upon a cancellation request here
because POSIX specifies that cancellation points "shall occur" in
sem_wait and sem_timedwait, which also means that they need to check
this regardless whether they block or not (unlike "may occur"
functions). See the POSIX Rationale for this requirement: Section
"Thread Cancellation Overview" [1] and austin group issue #1076 [2]
for thoughts on why this may be a suboptimal design.

[1] http://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xsh_chap02.html
[2] http://austingroupbugs.net/view.php?id=1076 */
__pthread_testcancel ();

if (__new_sem_wait_fast ((struct new_sem *) sem, 0) == 0)
return 0;
else
return __new_sem_wait_slow((struct new_sem *) sem, NULL);
}

/* Fast path: Try to grab a token without blocking. */
static int
__new_sem_wait_fast (struct new_sem *sem, int definitive_result)
{
/* We need acquire MO if we actually grab a token, so that this
synchronizes with all token providers (i.e., the RMW operation we read
from or all those before it in modification order; also see sem_post).
We do not need to guarantee any ordering if we observed that there is
no token (POSIX leaves it unspecified whether functions that fail
synchronize memory); thus, relaxed MO is sufficient for the initial load
and the failure path of the CAS. If the weak CAS fails and we need a
definitive result, retry. */
#if __HAVE_64B_ATOMICS
uint64_t d = atomic_load_relaxed (&sem->data);
do
{
if ((d & SEM_VALUE_MASK) == 0)
break;
if (atomic_compare_exchange_weak_acquire (&sem->data, &d, d - 1))
return 0;
}
while (definitive_result);
return -1;
#else
unsigned int v = atomic_load_relaxed (&sem->value);
do
{
if ((v >> SEM_VALUE_SHIFT) == 0)
break;
if (atomic_compare_exchange_weak_acquire (&sem->value,
&v, v - (1 << SEM_VALUE_SHIFT)))
return 0;
}
while (definitive_result);
return -1;
#endif
}

/* See sem_wait for an explanation of the algorithm. */
int
__new_sem_post (sem_t *sem)
{
struct new_sem *isem = (struct new_sem *) sem;
int private = isem->private;

#if __HAVE_64B_ATOMICS
/* Add a token to the semaphore. We use release MO to make sure that a
thread acquiring this token synchronizes with us and other threads that
added tokens before (the release sequence includes atomic RMW operations
by other threads). */
/* TODO Use atomic_fetch_add to make it scale better than a CAS loop? */
uint64_t d = atomic_load_relaxed (&isem->data);
do
{
if ((d & SEM_VALUE_MASK) == SEM_VALUE_MAX)
{
__set_errno (EOVERFLOW);
return -1;
}
}
while (!atomic_compare_exchange_weak_release (&isem->data, &d, d + 1));

/* If there is any potentially blocked waiter, wake one of them. */
if ((d >> SEM_NWAITERS_SHIFT) > 0)
futex_wake (((unsigned int *) &isem->data) + SEM_VALUE_OFFSET, 1, private);
#else
/* Add a token to the semaphore. Similar to 64b version. */
unsigned int v = atomic_load_relaxed (&isem->value);
do
{
if ((v >> SEM_VALUE_SHIFT) == SEM_VALUE_MAX)
{
__set_errno (EOVERFLOW);
return -1;
}
}
while (!atomic_compare_exchange_weak_release
(&isem->value, &v, v + (1 << SEM_VALUE_SHIFT)));

/* If there is any potentially blocked waiter, wake one of them. */
if ((v & SEM_NWAITERS_MASK) != 0)
futex_wake (&isem->value, 1, private);
#endif

return 0;
}

Greetings,

Andres Freund

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#16)
Re: Possible performance regression in version 10.1 with pgbench read-write tests.

Andres Freund <andres@anarazel.de> writes:

On 2018-07-20 15:35:39 -0400, Tom Lane wrote:

In any case, I strongly resist making performance-based changes on
the basis of one test on one kernel and one hardware platform.

Sure, it'd be good to do more of that. But from a theoretical POV it's
quite logical that posix semas sharing cachelines is bad for
performance, if there's any contention. When backed by futexes -
i.e. all non ancient linux machines - the hot path just does a cmpxchg
of the *userspace* data (I've copied the relevant code below).

Here's the thing: the hot path is of little or no interest, because
if we are in the sema code at all, we are expecting to block. The
only case where we wouldn't block is if the lock manager decided the
current process needs to sleep, but some other process already released
us by the time we reach the futex/kernel call. Certainly that will happen
some of the time, but it's not likely to be the way to bet. So I'm very
dubious of any arguments based on the speed of the "uncontended" path.

It's possible that the bigger picture here is that the kernel boys
optimized for the "uncontended" path to the point where they broke
performance of the blocking path. It's hard to see how they could
have broke it to the point of being slower than the SysV sema API,
though.

Anyway, I think we need to test first and patch second. I'm working
on getting some numbers on my own machines now.

On my RHEL6 machine, with unmodified HEAD and 8 sessions (since I've
only got 8 cores) but other parameters matching Mithun's example,
I just got

transaction type: <builtin: TPC-B (sort of)>
scaling factor: 300
query mode: prepared
number of clients: 8
number of threads: 8
duration: 1800 s
number of transactions actually processed: 29001016
latency average = 0.497 ms
tps = 16111.575661 (including connections establishing)
tps = 16111.623329 (excluding connections establishing)

which is interesting because vmstat was pretty consistently reporting
around 500000 context swaps/second during the run, or circa 30
cs/transaction. We'd have a minimum of 14 cs/transaction just between
client and server (due to seven SQL commands per transaction in TPC-B)
so that seems on the low side; not a lot of lock contention here it
seems. I wonder what the corresponding ratio was in Mithun's runs.
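
The ratio above can be reproduced with a few lines of arithmetic. This is only a sketch: the helper names are hypothetical, the 500000 cs/s figure is the approximate vmstat reading quoted above, and the factor of two per SQL command is one reading of the "14 cs/transaction" floor (one switch into the backend and one back to the client).

```c
#include <assert.h>

/* Context switches per transaction, from a vmstat cs/s rate and a TPS figure. */
static double
cs_per_transaction(double cs_per_second, double tps)
{
	assert(tps > 0.0);
	return cs_per_second / tps;
}

/*
 * Floor from client/server round trips alone, assuming each SQL command
 * costs one switch to the server and one back to the client.
 */
static int
min_cs_per_transaction(int sql_commands)
{
	assert(sql_commands >= 0);
	return 2 * sql_commands;
}
```

With these inputs, `cs_per_transaction(500000.0, 16111.575661)` is about 31 and `min_cs_per_transaction(7)` is 14, which leaves roughly 17 context switches per transaction for everything else, lock waits included.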

regards, tom lane

#18Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#17)
Re: Possible performance regression in version 10.1 with pgbench read-write tests.

Hi,

On 2018-07-20 16:43:33 -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On 2018-07-20 15:35:39 -0400, Tom Lane wrote:

In any case, I strongly resist making performance-based changes on
the basis of one test on one kernel and one hardware platform.

Sure, it'd be good to do more of that. But from a theoretical POV it's
quite logical that posix semas sharing cachelines is bad for
performance, if there's any contention. When backed by futexes -
i.e. all non ancient linux machines - the hot path just does a cmpxchg
of the *userspace* data (I've copied the relevant code below).

Here's the thing: the hot path is of little or no interest, because
if we are in the sema code at all, we are expecting to block.

Note that we're also using semas for ProcArrayGroupClearXid(), which is
pretty commonly hot for pgbench style workloads, and where the expected
wait times are very short.

It's possible that the bigger picture here is that the kernel boys
optimized for the "uncontended" path to the point where they broke
performance of the blocking path. It's hard to see how they could
have broke it to the point of being slower than the SysV sema API,
though.

I don't see how this is a likely proposition, given that adding padding
to the *userspace* portion of futexes increased the performance quite
significantly.

On my RHEL6 machine, with unmodified HEAD and 8 sessions (since I've
only got 8 cores) but other parameters matching Mithun's example,
I just got

It's *really* common to have more actual clients than cpus for oltp
workloads, so I don't think it's insane to test with more clients.

Greetings,

Andres Freund

#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#18)
Re: Possible performance regression in version 10.1 with pgbench read-write tests.

Andres Freund <andres@anarazel.de> writes:

On 2018-07-20 16:43:33 -0400, Tom Lane wrote:

On my RHEL6 machine, with unmodified HEAD and 8 sessions (since I've
only got 8 cores) but other parameters matching Mithun's example,
I just got

It's *really* common to have more actual clients than cpus for oltp
workloads, so I don't think it's insane to test with more clients.

I finished a set of runs using similar parameters to Mithun's test except
for using 8 clients, and another set using 72 clients (but, being
impatient, 5-minute runtime) just to verify that the results wouldn't
be markedly different. I got TPS numbers like this:

                      8 clients    72 clients

unmodified HEAD         16112        16284
with padding patch      16096        16283
with SysV semas         15926        16064
with padding+SysV       15949        16085

This is on RHEL6 (kernel 2.6.32-754.2.1.el6.x86_64), hardware is dual
4-core Intel E5-2609 (Sandy Bridge era). This hardware does show NUMA
effects, although no doubt less strongly than Mithun's machine.

I would like to see some other results with a newer kernel. I tried to
repeat this test on a laptop running Fedora 28, but soon concluded that
anything beyond very short runs was mainly going to tell me about thermal
throttling :-(. I could possibly get repeatable numbers from, say,
1-minute SELECT-only runs, but that would be a different test scenario,
likely one with a lot less lock contention.

Anyway, for me, the padding change is a don't-care. Given that both
Mithun and Thomas showed some positive effect from it, I'm not averse
to applying it. I'm still -1 on going back to SysV semas.

regards, tom lane

#20Thomas Munro
thomas.munro@enterprisedb.com
In reply to: Tom Lane (#19)
Re: Possible performance regression in version 10.1 with pgbench read-write tests.

On Sun, Jul 22, 2018 at 8:19 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Andres Freund <andres@anarazel.de> writes:

On 2018-07-20 16:43:33 -0400, Tom Lane wrote:

On my RHEL6 machine, with unmodified HEAD and 8 sessions (since I've
only got 8 cores) but other parameters matching Mithun's example,
I just got

It's *really* common to have more actual clients than cpus for oltp
workloads, so I don't think it's insane to test with more clients.

I finished a set of runs using similar parameters to Mithun's test except
for using 8 clients, and another set using 72 clients (but, being
impatient, 5-minute runtime) just to verify that the results wouldn't
be markedly different. I got TPS numbers like this:

                      8 clients    72 clients

unmodified HEAD         16112        16284
with padding patch      16096        16283
with SysV semas         15926        16064
with padding+SysV       15949        16085

This is on RHEL6 (kernel 2.6.32-754.2.1.el6.x86_64), hardware is dual
4-core Intel E5-2609 (Sandy Bridge era). This hardware does show NUMA
effects, although no doubt less strongly than Mithun's machine.

I would like to see some other results with a newer kernel. I tried to
repeat this test on a laptop running Fedora 28, but soon concluded that
anything beyond very short runs was mainly going to tell me about thermal
throttling :-(. I could possibly get repeatable numbers from, say,
1-minute SELECT-only runs, but that would be a different test scenario,
likely one with a lot less lock contention.

I did some testing on 2-node, 4-node and 8-node systems running Linux
3.10.something (slightly newer but still ancient). Only the 8-node
box (= same one Mithun used) shows the large effect (the 2-node box
may be a tiny bit faster patched but I'm calling that noise for now...
it's not slower, anyway).

On the problematic box, I also tried some different strides (char
padding[N - sizeof(sem_t)]) and was surprised by the result:

Unpatched = ~35k TPS
64 byte stride = ~35k TPS
128 byte stride = ~42k TPS
4096 byte stride = ~47k TPS

Huh. PG_CACHE_LINE_SIZE is 128, but the true cache line size on this
system is 64 bytes. That exaggeration turned out to do something
useful, though I can't explain it.

While looking for discussion of 128 byte cache effects I came across
the Intel "L2 adjacent cache line prefetcher"[1]. Maybe this, or some
of the other prefetchers (enabled in the BIOS) or related stuff could
be at work here. It could be microarchitecture-dependent (this is an
old Westmere box), though I found a fairly recent discussion about a
similar effect[2] that mentions more recent hardware. The spatial
prefetcher reference can be found in the Optimization Manual[3].

[1]: https://software.intel.com/en-us/articles/disclosure-of-hw-prefetcher-control-on-some-intel-processors
[2]: https://groups.google.com/forum/#!msg/mechanical-sympathy/i3-M2uCYTJE/P7vyoOTIAgAJ
[3]: https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf
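
One way to picture the stride results under the adjacent-line hypothesis is to ask whether two neighbouring array elements start in the same aligned block of a given size. The sketch below is illustrative only: the helper name is hypothetical, the array base is assumed to be aligned to the block size, and nothing here shows that the prefetcher is actually the cause.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: do elements i and j of an array start in the same
 * aligned block of `block` bytes?  Assumes the array base is aligned to
 * `block` and the elements are `stride` bytes apart.
 */
static bool
same_block(size_t i, size_t j, size_t stride, size_t block)
{
	assert(stride > 0 && block > 0);
	return (i * stride) / block == (j * stride) / block;
}
```

With a 64-byte stride, elements 0 and 1 sit in different 64-byte lines but in the same 128-byte pair, which would fit the observation that this stride was no better than unpatched. With a 128-byte stride they fall in different pairs, and with a 4096-byte stride each element gets its own page-sized block.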

--
Thomas Munro
http://www.enterprisedb.com

#21Thomas Munro
thomas.munro@enterprisedb.com
In reply to: Thomas Munro (#20)
Re: Possible performance regression in version 10.1 with pgbench read-write tests.

On Mon, Jul 23, 2018 at 3:40 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:

I did some testing on 2-node, 4-node and 8-node systems running Linux
3.10.something (slightly newer but still ancient). Only the 8-node
box (= same one Mithun used) shows the large effect (the 2-node box
may be a tiny bit faster patched but I'm calling that noise for now...
it's not slower, anyway).

(I forgot to add that the 4 node system that showed no change is POWER
architecture.)

--
Thomas Munro
http://www.enterprisedb.com

#22Thomas Munro
thomas.munro@enterprisedb.com
In reply to: Thomas Munro (#20)
1 attachment(s)
Re: Possible performance regression in version 10.1 with pgbench read-write tests.

On Mon, Jul 23, 2018 at 3:40 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:

On Sun, Jul 22, 2018 at 8:19 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

                      8 clients    72 clients

unmodified HEAD         16112        16284
with padding patch      16096        16283
with SysV semas         15926        16064
with padding+SysV       15949        16085

This is on RHEL6 (kernel 2.6.32-754.2.1.el6.x86_64), hardware is dual
4-core Intel E5-2609 (Sandy Bridge era). This hardware does show NUMA
effects, although no doubt less strongly than Mithun's machine.

I would like to see some other results with a newer kernel. I tried to
repeat this test on a laptop running Fedora 28, but soon concluded that
anything beyond very short runs was mainly going to tell me about thermal
throttling :-(. I could possibly get repeatable numbers from, say,
1-minute SELECT-only runs, but that would be a different test scenario,
likely one with a lot less lock contention.

I did some testing on 2-node, 4-node and 8-node systems running Linux
3.10.something (slightly newer but still ancient). Only the 8-node
box (= same one Mithun used) shows the large effect (the 2-node box
may be a tiny bit faster patched but I'm calling that noise for now...
it's not slower, anyway).

Here's an attempt to use existing style better: a union, like
LWLockPadded and WALInsertLockPadded. I think we should back-patch to
10. Thoughts?
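
For readers unfamiliar with the union idiom, here is a standalone sketch of the general shape. It is not the attached patch: `DemoSemPadded`, `DEMO_CACHE_LINE_SIZE` and `demo_sem_offset` are hypothetical names, and 128 is used as a stand-in for PG_CACHE_LINE_SIZE as quoted earlier in the thread.

```c
#include <assert.h>
#include <semaphore.h>
#include <stddef.h>

#define DEMO_CACHE_LINE_SIZE 128	/* stand-in for PG_CACHE_LINE_SIZE */

/*
 * Union idiom: the char array forces the size of each element up to at
 * least one full (nominal) cache line, without computing a remainder.
 */
typedef union DemoSemPadded
{
	sem_t		sem;
	char		pad[DEMO_CACHE_LINE_SIZE];
} DemoSemPadded;

/* Compile-time check that the padding took effect. */
static_assert(sizeof(DemoSemPadded) >= DEMO_CACHE_LINE_SIZE,
			  "padded semaphore must span at least one cache line");

/* Byte offset of element i within an array of padded semaphores. */
static size_t
demo_sem_offset(size_t i)
{
	return i * sizeof(DemoSemPadded);
}
```

Compared with a `char padding[N - sizeof(sem_t)]` member, the union needs no subtraction, so it still compiles sensibly on a platform where `sem_t` happens to be as large as or larger than the chosen line size. Padding the element size does not by itself align the start of the array; that is a separate question this sketch does not address.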

--
Thomas Munro
http://www.enterprisedb.com

Attachments:

0001-Pad-semaphores-to-avoid-false-sharing.patch (application/octet-stream)
From 82756e109a0d38a590402881a6094c8bef7f5ece Mon Sep 17 00:00:00 2001
From: Thomas Munro <tmunro@postgresql.org>
Date: Mon, 23 Jul 2018 20:49:45 +1200
Subject: [PATCH] Pad semaphores to avoid false sharing.

In a USE_UNNAMED_SEMAPHORES build, the default on Linux and FreeBSD
since commit ecb0d20a, we have an array of sem_t objects.  This
turned out to reduce performance compared to the previous default
USE_SYSV_SEMAPHORES on an 8 socket system.  Testing showed that the
lost performance could be regained by padding the array elements so
that they have their own cachelines.  This matches what we do for
similar hot arrays (see LWLockPadded, WALInsertLockPadded).

Author: Thomas Munro
Reviewed-by: Andres Freund
Reported-by: Mithun Cy
Tested-by: Mithun Cy, Tom Lane
Discussion: https://postgr.es/m/CAD__OugYDM3O%2BdyZnnZSbJprSfsGFJcQ1R%3De59T3hcLmDug4_w%40mail.gmail.com
---
 src/backend/port/posix_sema.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/backend/port/posix_sema.c b/src/backend/port/posix_sema.c
index a2cabe58fc..5174550794 100644
--- a/src/backend/port/posix_sema.c
+++ b/src/backend/port/posix_sema.c
@@ -41,13 +41,19 @@
 #error cannot use named POSIX semaphores with EXEC_BACKEND
 #endif
 
+typedef union SemTPadded
+{
+	sem_t		pgsem;
+	char		pad[PG_CACHE_LINE_SIZE];
+} SemTPadded;
+
 /* typedef PGSemaphore is equivalent to pointer to sem_t */
 typedef struct PGSemaphoreData
 {
-	sem_t		pgsem;
+	SemTPadded	sem_padded;
 } PGSemaphoreData;
 
-#define PG_SEM_REF(x)	(&(x)->pgsem)
+#define PG_SEM_REF(x)	(&(x)->sem_padded.pgsem)
 
 #define IPCProtection	(0600)	/* access/modify by user only */
 
-- 
2.17.0

#23Thomas Munro
thomas.munro@enterprisedb.com
In reply to: Thomas Munro (#22)
Re: Possible performance regression in version 10.1 with pgbench read-write tests.

On Mon, Jul 23, 2018 at 10:06 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:

Here's an attempt to use existing style better: a union, like
LWLockPadded and WALInsertLockPadded. I think we should back-patch to
10. Thoughts?

Pushed to 10, 11, master.

It's interesting that I could see a further ~12% speedup by using VM
page-size stride on that 8 socket machine, but that's something to
look at another day. The PG_CACHE_LINE_SIZE padding change gets us
back to approximately where we were in 9.6.

/me . o O ( Gee, it'd be really nice to see this change on a graph on
a web page that tracks a suite of tests on a farm of interesting
machines on each branch over time. )

--
Thomas Munro
http://www.enterprisedb.com