Further XLogInsert scaling tweaking

Started by Heikki Linnakangasover 12 years ago5 messages
#1Heikki Linnakangas
hlinnakangas@vmware.com
2 attachment(s)

Now that I've had a little break from the big XLogInsert scaling patch,
I went back to do some testing and profiling of it. I saw a lot of
contention on the first access of RedoRecPtr and force/fullPageWrites,
which made me realize that I put those variables right next to the
heavily-contended insertpos_lck spinlock and the variables that it
protects. Needless to say, that's a recipe for contention.

I added some padding between those two, per attached patch, and got a
big improvement. So we should definitely do that. I just added a
char[128] field between them, which is enough to put them on different
cache lines on the server I'm testing on. I don't think there is any
platform-independent way to get the current cache line size,
unfortunately. Since this doesn't affect correctness, I'm inclined to
just add the 128-byte padding field there.

Attached is a graph generated by pgbench-tools. Full results are
available here: http://hlinnaka.iki.fi/xloginsert-scaling/padding/. The
test query used was:

insert into foo:client_id select generate_series(1, 100);

That is, each client inserts a lot of rows into a different table. The
table has no indexes. This is pretty much the worst-case scenario for
stressing XLogInsert.

The "master-b03d196" is unpatched git master, "xlog-padding-fb741c0" is
with the padding. The -16 plots are the same, but with
xloginsert_slots=16 (the default is 8). The server has 16 cores, 32 with
hyperthreading.

It's interesting that although the peak performance is better with 8
slots than with 16, it doesn't scale as gracefully with 16 slots. I
believe that's because with more insertion slots, you have more backends
fighting over the insertpos_lck. With fewer slots, the extra work is
queued behind the insertion slots instead, and sleeping is better than
spinning.

In either case, the performance with the padding is better than without.

- Heikki

Attachments:

clients-sets-padding.pngimage/png; name=clients-sets-padding.pngDownload
�PNG


IHDR��,� PLTE���������������@��Ai��� �@���0`��@�������**��@��333MMMfff�������������������22�������U����������������d�"�".�W��p��������������P����E��r��z�����k������� �����������P@Uk/���@�@��`��`�����@��@��`��p������������������������|�@�� �����K	pHYs���+IDATx�����*57���x��m�O�7�MTB�
������\���S@�Dm��N�0}%����<���88^)��6\�����T@�lD<�
��uM�Ji�I���.q��V;��a�M���G���MuU�c�G����A�\
T�4j+31����>��x�75���vU��k��Y_b�Shj�),�#�rT7�
.��6�lX��p�E{}bo_x��x�9��'�q���F@~���8�T@={���O�Dm�u�  (b@P�$G�R������i}�m�+��#.C��IE������IK'i�GZ:�2$-�T�}i�����tR�����t,��c�zy����dOJ>nB��������R��W�z��zT��!%71/-~�����r��u������-CJ>n  Y@�
P�$�����"y�u��R��x8*{�.��A����M����:��<E^�Q�].�_��0K:g
����u������r����31�Y�Hs�������1�O���&�L�" ��a��@��	��bt�y�f�!�b�l/e�K��M�:�2��x���Fw[�l��O��$�gcxs
(p Z��-����Ky���1��x)�#e�K�x);1�K�J�y#GR�!��<�6���*�	8C���N�9�������,2��H�T&����N�G&:�z:\3��e����i�`���&�T���Q��2�c>z����TZ�����2Kz8�[R2fUGBr
����LhU-�|��Me�u��2A,_�����B�|�{��n����T+��*���k2���l������_��(H@-^S<��k����Jf���u'��W���,}��Y7p4���`�~�����RN��O;!��I�����a�Z������6�i��f���A��VL3>DO�����H��-��] ��\�����M������@������i�������!zP#��TP��D/X>=�y�QG��L�v��q-!%����L3c7������21`�i	�>yP�N�|��Y�3},��K��%������0���Zd��R�B+�|9"'���]k��;�Z��i����t�a/�k�����2X|3�������T��
zb�m���N���o6�:�y_;`��]FV�mGgd�o�7�gL2OE��
x5��z��eU�;�,��p��~}���5�@���cA&�Q�-�U<�S�|��Yc���]���^H�z�yb�����	8("X��V�@vI�H*������%��u�`��j���c����	�n��5]�j\
��M�T������b�l��	|�d����,�����t-�*G|�U���(���[5
��}�[i���*��	��+��J-k4@b�K���f�yY�o:�o���k���X*����R�Dy���y !�����~l������I�V/8��|������_ ������\��q��y��1��B�tz��7)!4�����D�)nZLD
��C��b�;�S�M@{8�+5
��n�E���������6�}��'iB�{t!`����&=F��CH�!���� �H�2����J�Dd�H���������:	U�N;Zr0��CH�Q.i���E�=oN�'�wdW3�3���F�M!H�PJ>�T�8�p�C���o�j�u6���h��P&����O�����!%S�T�4:Kxp�����U
�)k�����xi����:�=�G�t���t��$<���j�)�m��O@��������e��D�+�g�C����P�
'����n�[@��`P�l0]��,	�P��N�!�po����J��wR���Z�TN�}K�9+�p���}��SP��N���v�$<,0z�VZM�vP������U��M5�?�5N��P�^���p
k5��*����c��T�U�{�(B)�c�o�{�,Gg�^++t����M@���t��e.N������
��M)�����&`	�^0$�.�3���e�KE%x�����pQk?
<�M�#vs�����h]@��]��-<+����;�>A����.������g%	��G&zTy��M��b��amYMj��;j�y�90����m���<4.����y���p�V��9��<��W��|2��3�K�$}YXy��T���K�������fS��V�����dY�&���ro.x�5im��u"�6Cp�,;�S$4}���37o�VIp?>�����k�Gy�������������
P����z3K
��r�
Mp!1X��M�����Vk�0�!��e5}Q;(���$�������6q����[i�^��6���)`���&x8/�o�����3���L�+����b��X�0����ZYo��8]	TQ�	�|'��Do��-h�/O,L@�V��3���^�����0� ��q��j�}�J�|�&_8FA3���	��U�t#a6����G�0[SnI����2�+��M�@�g�O��a��ry6,)�B����X~�c<����aV�a#�A
��xi�9.�	9K7��q���F�����pa��{�����>����s�g���IX=�Y'��S$ms0J�����t������I�
���g��sC��SLBDB���/���[��:t��b��P�4�f��G��=���]�B,L�����������-tJ�T
�����O���5b@�S�I�������p��N�_M���"��X�����pOgb}d���xHt9�h��L��$6�i�[P�OAB�$�N��QgP5.�U@���{�0��g�u2j��q�b�>W�k����a4�8I}
�O�*�MD2[���{���N�N/�N����[x����n����7���?���V@X����A�'��k@����-�j :!r��M@g�D(m?��\	P1�9C�^�D\���� PT�1 C�N�{�0J@i�I\��, ��. `h\����.(�5�@)�&�z�t�<�	��;f�G�
����\��$$���tp�������(`��4�}4�P
��y�����	H�
�T=6L��7��VUv�|jf�����aZ0�B�e�&���IX�;���.>��IS>;�LtgKZ���P9gYV.?V���i�����S�N�Z+��r�=%���8���uE�n��q,!F)���|����hR�'�[*�vL��)?)�����������C��0S�&�b�>���;��
h�K35��Pa���Q&�
�K�v����������u,�2��vL������z�	d���+�}C��*`bz=�X�#���*���z���%`p�(��l�o��Q����'���G��
�����9��MUjc�"�UnT��x�W�A�yv�|��4B��n�EhlMz���=�X4��f��n�j�Ka��n�����SL�lGF,OZ:/X�:�O��N��0���y�V�>������]._�x�D/����=�c���Qw�whH�qxB@����w �}��.��n�wM��'�g����3����5�)�v�`�S�|$��_��w>+��������#���_�V|��/��7#���������R��;t�~>3����l�r��K��y��Zt=��Y�����`����<1`�^�m>���>��|��~�?|����6x�m+]�]������{���p��'C*#?:���Vv��C{��}_���~��j�i�7
�hk z�?k@3�{�����.:��"�^c�(�����V�0+/�h1�cg�\-M��4�	y*�����'b�U����.8z(9���	��B[��o�� ���1Z"hM�i�M��[aq�.[���`EBt�ciJ����{���kg�h�mg��FsF�Q�3�]�#��)2�&�ZzT�%�:��
/�L�u���2�& W����S`M���Pd5p�U���=cn���*�<��A��HDk��/�v�{H����P�������%�Q@���]�K3��9&�!&��H���Z���wXf\
�>O����5pM�K2t�J�&���������)�����1���fL�9�:4:5��L�pP��U�����������A��Y�@��t��[(-v{ah���n/���q��/�V�S�Hx�\�8������z�W�|oM�>���N�p<.�Q-�>�n�������P��������IW�6�C�,L��@��V��~S��r*�U�>�K�1`}W��P�r�F�ql_�eL�Y����GX(C@�)�2��k�C�oF��	)Z�PX����tEoS���j�b�����a��4�Q��5��>w b\
X�B���b�[<#�~�����& ������$UW|�l"����V�\6��X�����K"b�G>t\�e�5��l?��	�i�,�4�U��t `�~&'����9�q�P~�B����������EPO~��f�aL��QZ(\@�6Ll�5�euCDX.��X�8�d�FHl����-4�
C�����+���04,�Y�kX$� `�}��.Nthz;�v���X�~�����#��e���,C��� �#V�!��`�>3�A�1�#����
�����T3�������
9��_Huo��5���:=
���T�:�s��#O7����/��(Y��%������/dh�	nV@� F�����m��_/������C��-y�a�;&����������Znwax&O9@������������x������/`�/�>���tEs*�������e�����59���/dhN@���V���7Ac@@P��3Z�1 C^3a8��/dh@@e*Vk�6��&�����=���������T���I+_� ^��7��/�04 `T
��3��!^���i�iP@�����V���%C*@i�z4�tq�s��|���7��/d.�2�i�|�n��l�,b@��0�?i� `�t�h�o�G+��N-`|,���	(�|! �tc+@i�$��H��p�!`��!d�/`�����/d,�F�IZ���-0��1�oV�C8����|��1 C!��� �5<
�8����SD>���$�|C�&X��f
��g���W�?�@b\�,
���������=�{���z��O�TFfh�.Y�an���b@tB�D�l���/`�M@�R\oN�G��=����7��/d��e�B<C��[`�!X�B@�!P��F"d.`���2@������!W@��
 `Y2�&�����������t��V��A������A��L�$���zPZ�B@��m���/`,`-0����*�,��
x�?i�$
���W���0���p����N�a����6b@��d��f
(�|! �H�����0pZ�:i��@�j@�"��R"d(�����K+_� P��<E��PCZ��������x�vc b@�~��;��H+_� Q��S2��/`+ xD�Ep�7��{�u% b@�2�	�k��)J+_�PD�1I��a@i������_
�� ���
�-���"�5�=�Ar��G
����}����?	���O������O�TFfh$��a��P�0��^����( �{'�2�#�	a�'�0���KGZ�B@y.}����+_� R�����/R��/A���|}��A���z��V��A����V��A�������<��8�=���_j]�"d'��Jk�������5PZ�y��aP�k�8��\���k�vjB}2�P_������(�|! �8mz�m^��H+_������f��u!ha�f�����t��QPI�6�U��.�V�1 C�����ub.��'q����?��
�aw���kCP�^t-����*!1���9�zP+g��
�H�O�2����FR:���A'�!$	�����;[�s�AA$	��Cx���f���H�.��u�2�����q�\���]��U.�mm�wKDo��Q�����[�,|%Rv��|i��z|��Rv��|T=Zk��v"�#-��a��9+"��B@;M	�%F�\v��O}z�>�ZY����������Rv��|�P���������/%��i��s����d);���L:�tB"jR�Z�R|��4��S�B
i����
���/H�������S#C^�����7�F��x>zMG\�D8���6%,��^��!i�(i��#�`�P:4��|���n����O�B������r��y�YO�O�tr}�l��7U&,�1GJ��wKgK+S�A���%�d2�b���c����4D�L�!`zE|�%��t�����e����JM'�r����OL$OJGd��LM��%���\;,��)})M����P�%��� %�������2���I�����K�����|���D2���N<�q��G@H	�q�w���b�r���4��\�fR������G�(�m���m�������� ���	q4���#�
�h���g����x��`��#�f�IEND�B`�
xloginsert-padding-1.patchtext/x-diff; name=xloginsert-padding-1.patchDownload
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 39c58d0..28e62ea 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -428,8 +428,14 @@ typedef struct XLogCtlInsert
 	uint64		CurrBytePos;
 	uint64		PrevBytePos;
 
-	/* insertion slots, see above for details */
-	XLogInsertSlotPadded *insertSlots;
+	/*
+	 * Make sure the above heavily-contended spinlock and byte positions are
+	 * on their own cache line. In particular, the RedoRecPtr and full page
+	 * write variables below should be on a different cache line. They are
+	 * read on every WAL insertion, but updated rarely, and we don't want
+	 * those reads to steal the cache line containing Curr/PrevBytePos.
+	 */
+	char		pad[128];
 
 	/*
 	 * fullPageWrites is the master copy used by all backends to determine
@@ -455,6 +461,9 @@ typedef struct XLogCtlInsert
 	bool		exclusiveBackup;
 	int			nonExclusiveBackups;
 	XLogRecPtr	lastBackupStart;
+
+	/* insertion slots, see XLogInsertSlot struct above for details */
+	XLogInsertSlotPadded *insertSlots;
 } XLogCtlInsert;
 
 /*
#2Bruce Momjian
bruce@momjian.us
In reply to: Heikki Linnakangas (#1)
Re: Further XLogInsert scaling tweaking

On Mon, Sep 2, 2013 at 10:14:03AM +0300, Heikki Linnakangas wrote:

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 39c58d0..28e62ea 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -428,8 +428,14 @@ typedef struct XLogCtlInsert
uint64		CurrBytePos;
uint64		PrevBytePos;
-	/* insertion slots, see above for details */
-	XLogInsertSlotPadded *insertSlots;
+	/*
+	 * Make sure the above heavily-contended spinlock and byte positions are
+	 * on their own cache line. In particular, the RedoRecPtr and full page
+	 * write variables below should be on a different cache line. They are
+	 * read on every WAL insertion, but updated rarely, and we don't want
+	 * those reads to steal the cache line containing Curr/PrevBytePos.
+	 */
+	char		pad[128];

Do we adjust for cache line lengths anywhere else? PGPROC? Should it
be a global define?

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Merlin Moncure
mmoncure@gmail.com
In reply to: Bruce Momjian (#2)
Re: Further XLogInsert scaling tweaking

On Mon, Sep 2, 2013 at 10:32 PM, Bruce Momjian <bruce@momjian.us> wrote:

On Mon, Sep 2, 2013 at 10:14:03AM +0300, Heikki Linnakangas wrote:

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 39c58d0..28e62ea 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -428,8 +428,14 @@ typedef struct XLogCtlInsert
uint64          CurrBytePos;
uint64          PrevBytePos;
-     /* insertion slots, see above for details */
-     XLogInsertSlotPadded *insertSlots;
+     /*
+      * Make sure the above heavily-contended spinlock and byte positions are
+      * on their own cache line. In particular, the RedoRecPtr and full page
+      * write variables below should be on a different cache line. They are
+      * read on every WAL insertion, but updated rarely, and we don't want
+      * those reads to steal the cache line containing Curr/PrevBytePos.
+      */
+     char            pad[128];

Do we adjust for cache line lengths anywhere else? PGPROC? Should it
be a global define?

+1 -- that is, I think it should be.

merlin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Bruce Momjian (#2)
Re: Further XLogInsert scaling tweaking

Bruce Momjian wrote:

On Mon, Sep 2, 2013 at 10:14:03AM +0300, Heikki Linnakangas wrote:

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 39c58d0..28e62ea 100644
-	XLogInsertSlotPadded *insertSlots;
+	/*
+	 * Make sure the above heavily-contended spinlock and byte positions are
+	 * on their own cache line. In particular, the RedoRecPtr and full page
+	 * write variables below should be on a different cache line. They are
+	 * read on every WAL insertion, but updated rarely, and we don't want
+	 * those reads to steal the cache line containing Curr/PrevBytePos.
+	 */
+	char		pad[128];

Do we adjust for cache line lengths anywhere else? PGPROC? Should it
be a global define?

We have LWLockPadded in lwlock.c; it only adjusts the size of the struct
to be a power of 2.

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Merlin Moncure (#3)
Re: Further XLogInsert scaling tweaking

On 03.09.2013 16:22, Merlin Moncure wrote:

On Mon, Sep 2, 2013 at 10:32 PM, Bruce Momjian<bruce@momjian.us> wrote:

On Mon, Sep 2, 2013 at 10:14:03AM +0300, Heikki Linnakangas wrote:

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 39c58d0..28e62ea 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -428,8 +428,14 @@ typedef struct XLogCtlInsert
uint64          CurrBytePos;
uint64          PrevBytePos;
-     /* insertion slots, see above for details */
-     XLogInsertSlotPadded *insertSlots;
+     /*
+      * Make sure the above heavily-contended spinlock and byte positions are
+      * on their own cache line. In particular, the RedoRecPtr and full page
+      * write variables below should be on a different cache line. They are
+      * read on every WAL insertion, but updated rarely, and we don't want
+      * those reads to steal the cache line containing Curr/PrevBytePos.
+      */
+     char            pad[128];

Do we adjust for cache line lengths anywhere else? PGPROC? Should it
be a global define?

+1 -- that is, I think it should be.

Ok, committed that way. No, we adjust for cache line lengths anywhere
else. As Alvaro noted, LWLocks are padded, but that's just to keep them
from crossing cache line boundaries, not to keep two lwlocks on separate
cache lines.

Thanks!

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers