basebackups seem to have serious issues with FILE_COPY in CREATE DATABASE

Started by Tomas Vondra over 1 year ago, 3 messages
#1 Tomas Vondra
tomas.vondra@enterprisedb.com
1 attachment(s)

Hi,

While doing some additional testing of (incremental) backups, I ran into
a couple of regular failures. After pulling my hair out for a couple of
days, I realized the issue seems to affect regular backups, and
incremental backups (which I've been trying to test) are likely innocent.

I'm using a simple (and admittedly not very pretty) bash script that
takes and verifies backups, concurrently with this workload:

1) initialize a cluster

2) initialize pgbench in database 'db'

3) run short pgbench on 'db'

4) maybe do vacuum [full] on 'db'

5) drop a database 'db_copy' if it exists

6) create a database 'db_copy' by copying 'db' using one of the
available strategies (file_copy, wal_log)

7) run short pgbench on 'db_copy'

8) maybe do vacuum [full] on 'db_copy'
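Steps 5 and 6 above might be expressed roughly as follows. This is only a sketch of the SQL involved, not the attached script; the helper name `recreate_copy_sql` and its defaults are hypothetical, while the `STRATEGY` option of CREATE DATABASE is the one available since PG15.

```python
from typing import List

# The two CREATE DATABASE strategies mentioned in step 6.
STRATEGIES = ("file_copy", "wal_log")


def recreate_copy_sql(strategy: str, source: str = "db",
                      target: str = "db_copy") -> List[str]:
    """Return the statements for steps 5 and 6 (hypothetical helper)."""
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy!r}")
    return [
        # step 5: drop the copy if it exists
        f"DROP DATABASE IF EXISTS {target}",
        # step 6: copy the source database using the chosen strategy
        f"CREATE DATABASE {target} TEMPLATE {source} STRATEGY = {strategy}",
    ]


for statement in recreate_copy_sql("file_copy"):
    print(statement + ";")
```

The statements would be run from a session connected to some other database (for example 'postgres'), since the source database must have no other sessions while it is being copied.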

And concurrently with this, it takes a basebackup, starts a cluster on
it (on a different port, ofc), and does various checks on that:

a) verify checksums using pg_checksums (cluster has them enabled)

b) run amcheck on tables/indexes on both databases

c) SQL check (we expect all tables to be 'consistent' as if we did a
PITR - in particular sum(balance) is expected to be the same value on
all pgbench tables) on both databases

I believe those are reasonable expectations - that we get a database
with valid checksums, with non-broken tables/indexes, and that the
database looks like a snapshot taken at a single instant.
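The SQL check in (c) could be sketched like this. It assumes the standard pgbench schema and the default TPC-B-like script, where every transaction adds the same delta to one account, one teller and one branch and records it in the history table. The queries and the helper name `sums_consistent` are illustrative assumptions, not taken from the attached scripts, and whether the history table is comparable depends on how pgbench is invoked.

```python
from typing import Dict

# Queries against the standard pgbench tables (column names as created
# by "pgbench -i"). Illustrative only; the attached scripts may differ.
SUM_QUERIES: Dict[str, str] = {
    "accounts": "SELECT sum(abalance) FROM pgbench_accounts",
    "branches": "SELECT sum(bbalance) FROM pgbench_branches",
    "tellers": "SELECT sum(tbalance) FROM pgbench_tellers",
    "history": "SELECT coalesce(sum(delta), 0) FROM pgbench_history",
}


def sums_consistent(sums: Dict[str, int]) -> bool:
    """True if every table reports the same total balance."""
    return len(set(sums.values())) == 1


# A consistent snapshot: all four totals agree.
print(sums_consistent(
    {"accounts": 136132, "branches": 136132,
     "tellers": 136132, "history": 136132}))

# Totals that disagree between tables are reported as inconsistent.
print(sums_consistent(
    {"accounts": 156142, "branches": 136132,
     "tellers": 136132, "history": -42826}))
```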

Unfortunately it doesn't take long for the tests to start failing with
various strange symptoms on the db_copy database (I have yet to see an
issue on the 'db' database):

i) amcheck fails with 'heap tuple lacks matching index tuple'

ERROR: heap tuple (116195,22) from table "pgbench_accounts" lacks
matching index tuple within index "pgbench_accounts_pkey"
HINT: Retrying verification using the function
bt_index_parent_check() might provide a more specific error.

I've seen this with other tables/indexes too, e.g. system catalogs
(pg_statistic) or toast tables, but 'accounts' is most common.

ii) amcheck fails with 'could not open file'

ERROR: could not open file "base/18121/18137": No such file or
directory
LINE 9: lateral verify_heapam(relation => c.oid, on_error_stop =>
f...
^
ERROR: could not open file "base/18121/18137": No such file or
directory

iii) failures in the SQL check, with different tables having different
balance sums

SQL check fails (db_copy) (account 156142 branches 136132 tellers
136132 history -42826)

Sometimes this is preceded by amcheck issue, but not always.

I guess this is not the behavior we expect :-(

I've reproduced all of this on PG16 - I haven't tried with older
releases, but I have no reason to assume pre-16 releases are not affected.

With incremental backups I've observed a couple more symptoms, but those
are most likely just fallout of this - not realizing the initial state
is a bit wrong, and making it worse by applying the increments.

The important observation is that this only happens if a database is
created while the backup is running, and that it only happens with the
FILE_COPY strategy - I've never seen this with WAL_LOG (which is the
default since PG15).

I don't recall any reports of similar issues from pre-15 releases, where
FILE_COPY was the only available option - I'm not sure why that is.
Either it didn't have this issue back then, or maybe people happen to
not create databases concurrently with a backup very often. It's a race
condition / timing issue, essentially.

I have no ambition to investigate this part of the code much deeper, or
invent a fix myself, at least not in the foreseeable future. But it seems
like something we probably should fix - subtly broken backups are not a
great thing.

I see there have been a couple threads proposing various improvements to
FILE_COPY, that might make it more efficient/faster, namely using the
filesystem cloning [1] or switching pg_upgrade to use it [2]. But having
something that's (maybe) faster but not quite correct does not seem like
a winning strategy to me ...

Alternatively, if we don't have a clear desire to fix it, maybe the right
solution would be to get rid of it?

regards

[1]: /messages/by-id/CA+hUKGLM+t+SwBU-cHeMUXJCOgBxSHLGZutV5zCwY4qrCcE02w@mail.gmail.com

[2]: /messages/by-id/Zl9ta3FtgdjizkJ5@nathan

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

basebackup-test-scripts.tgz (application/x-compressed-tar)

#2 Nathan Bossart
nathandbossart@gmail.com
In reply to: Tomas Vondra (#1)
Re: basebackups seem to have serious issues with FILE_COPY in CREATE DATABASE

On Mon, Jun 24, 2024 at 04:12:38PM +0200, Tomas Vondra wrote:

> The important observation is that this only happens if a database is
> created while the backup is running, and that it only happens with the
> FILE_COPY strategy - I've never seen this with WAL_LOG (which is the
> default since PG15).

My first thought is that this sounds related to the large comment in
CreateDatabaseUsingFileCopy():

/*
* We force a checkpoint before committing. This effectively means that
* committed XLOG_DBASE_CREATE_FILE_COPY operations will never need to be
* replayed (at least not in ordinary crash recovery; we still have to
* make the XLOG entry for the benefit of PITR operations). This avoids
* two nasty scenarios:
*
* #1: When PITR is off, we don't XLOG the contents of newly created
* indexes; therefore the drop-and-recreate-whole-directory behavior of
* DBASE_CREATE replay would lose such indexes.
*
* #2: Since we have to recopy the source database during DBASE_CREATE
* replay, we run the risk of copying changes in it that were committed
* after the original CREATE DATABASE command but before the system crash
* that led to the replay. This is at least unexpected and at worst could
* lead to inconsistencies, eg duplicate table names.
*
* (Both of these were real bugs in releases 8.0 through 8.0.3.)
*
* In PITR replay, the first of these isn't an issue, and the second is
* only a risk if the CREATE DATABASE and subsequent template database
* change both occur while a base backup is being taken. There doesn't
* seem to be much we can do about that except document it as a
* limitation.
*
* See CreateDatabaseUsingWalLog() for a less cheesy CREATE DATABASE
* strategy that avoids these problems.
*/

> I don't recall any reports of similar issues from pre-15 releases, where
> FILE_COPY was the only available option - I'm not sure why is that.
> Either it didn't have this issue back then, or maybe people happen to
> not create databases concurrently with a backup very often. It's a race
> condition / timing issue, essentially.

If it requires concurrent activity on the template database, I wouldn't be
surprised at all that this is rare.

> I see there have been a couple threads proposing various improvements to
> FILE_COPY, that might make it more efficient/faster, namely using the
> filesystem cloning [1] or switching pg_upgrade to use it [2]. But having
> something that's (maybe) faster but not quite correct does not seem like
> a winning strategy to me ...
>
> Alternatively, if we don't have clear desire to fix it, maybe the right
> solution would be get rid of it?

It would be unfortunate if we couldn't use this for pg_upgrade, especially
if it is unaffected by these problems.

--
nathan

#3 Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: Nathan Bossart (#2)
Re: basebackups seem to have serious issues with FILE_COPY in CREATE DATABASE

On 6/24/24 17:14, Nathan Bossart wrote:

> On Mon, Jun 24, 2024 at 04:12:38PM +0200, Tomas Vondra wrote:
>
>> The important observation is that this only happens if a database is
>> created while the backup is running, and that it only happens with the
>> FILE_COPY strategy - I've never seen this with WAL_LOG (which is the
>> default since PG15).
>
> My first thought is that this sounds related to the large comment in
> CreateDatabaseUsingFileCopy():
>
> /*
> * We force a checkpoint before committing. This effectively means that
> * committed XLOG_DBASE_CREATE_FILE_COPY operations will never need to be
> * replayed (at least not in ordinary crash recovery; we still have to
> * make the XLOG entry for the benefit of PITR operations). This avoids
> * two nasty scenarios:
> *
> * #1: When PITR is off, we don't XLOG the contents of newly created
> * indexes; therefore the drop-and-recreate-whole-directory behavior of
> * DBASE_CREATE replay would lose such indexes.
> *
> * #2: Since we have to recopy the source database during DBASE_CREATE
> * replay, we run the risk of copying changes in it that were committed
> * after the original CREATE DATABASE command but before the system crash
> * that led to the replay. This is at least unexpected and at worst could
> * lead to inconsistencies, eg duplicate table names.
> *
> * (Both of these were real bugs in releases 8.0 through 8.0.3.)
> *
> * In PITR replay, the first of these isn't an issue, and the second is
> * only a risk if the CREATE DATABASE and subsequent template database
> * change both occur while a base backup is being taken. There doesn't
> * seem to be much we can do about that except document it as a
> * limitation.
> *
> * See CreateDatabaseUsingWalLog() for a less cheesy CREATE DATABASE
> * strategy that avoids these problems.
> */

Perhaps. The risks mentioned there certainly seem like they might be
related to the issues I'm observing.

>> I don't recall any reports of similar issues from pre-15 releases, where
>> FILE_COPY was the only available option - I'm not sure why is that.
>> Either it didn't have this issue back then, or maybe people happen to
>> not create databases concurrently with a backup very often. It's a race
>> condition / timing issue, essentially.
>
> If it requires concurrent activity on the template database, I wouldn't be
> surprised at all that this is rare.

Right. Although, "concurrent" here means a somewhat different thing.
AFAIK there can't be any changes concurrent with the CREATE DATABASE
directly, because we make sure there are no connections:

createdb: error: database creation failed: ERROR: source database
"test" is being accessed by other users
DETAIL: There is 1 other session using the database.

But per the comment, it'd be a problem if there is activity after the
database gets copied, but before the backup completes (which is where
the replay will happen).
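To illustrate the scenario from the comment, here is a deliberately tiny model of replay on a restored backup. It is a sketch under strong assumptions, not a description of the actual redo code: each database is reduced to a single "page" with an LSN and a value, the names `Page`, `WAL` and `restore` are made up for the example, and the only replay rule modeled is that a change is skipped when the page LSN is already at or past the record's LSN.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Page:
    lsn: int    # LSN of the last change applied to this page
    value: int  # stand-in for the page contents (e.g. a balance)


# Toy WAL record: (lsn, kind, target database, delta)
WalRecord = Tuple[int, str, str, int]

WAL: List[WalRecord] = [
    (10, "create_file_copy", "db_copy", 0),  # CREATE DATABASE ... FILE_COPY
    (20, "update", "db_copy", 7),            # pgbench on db_copy
    (30, "update", "db", 50),                # later activity on the template
]


def restore(db_in_backup: Page) -> Dict[str, Page]:
    """Replay WAL on a backup whose copy of 'db' is db_in_backup."""
    files = {"db": Page(db_in_backup.lsn, db_in_backup.value)}
    for lsn, kind, target, delta in WAL:
        if kind == "create_file_copy":
            # Replay re-copies the source as it exists in the backup,
            # which may already contain changes made after the copy.
            src = files["db"]
            files[target] = Page(src.lsn, src.value)
        else:
            page = files[target]
            if page.lsn < lsn:  # skip changes the page already "has"
                page.value += delta
                page.lsn = lsn
    return files


# The backup read 'db' before the later activity: replay is correct.
early = restore(Page(lsn=5, value=100))
print(early["db_copy"].value)  # 107, same as on the primary

# The backup read 'db' after the later activity: db_copy inherits it,
# and its own update at LSN 20 is skipped.
late = restore(Page(lsn=30, value=150))
print(late["db_copy"].value)   # 150, but the primary has 107
```

In this model the restored 'db' is fine in both cases and only db_copy ends up wrong, which at least matches the pattern of the failures showing up on db_copy only.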

>> I see there have been a couple threads proposing various improvements to
>> FILE_COPY, that might make it more efficient/faster, namely using the
>> filesystem cloning [1] or switching pg_upgrade to use it [2]. But having
>> something that's (maybe) faster but not quite correct does not seem like
>> a winning strategy to me ...
>>
>> Alternatively, if we don't have clear desire to fix it, maybe the right
>> solution would be get rid of it?
>
> It would be unfortunate if we couldn't use this for pg_upgrade, especially
> if it is unaffected by these problems.

Yeah. I wouldn't mind using FILE_COPY in contexts where we know it's
safe, like pg_upgrade. I just don't want to let users unknowingly
step on this.

regards

--
Tomas Vondra