Abbreviated keys for text cost model fix
Over in the abbreviated keys for numeric thread, Tomas Vondra reports
a case where the ad-hoc cost model of abbreviated keys/sortsupport for
text is far too conservative, and aborts a case where abbreviated keys
can still greatly help.
Previously, I proposed additions to the cost model that dealt with all
this, by passing down a reltuples hint from the optimizer or a
relcache entry to tuplesort. Robert seemed to think that this was a
little bit much, and that it could be discussed at a later point. It's
clear that we need to do something about cases like the one reported
by Tomas, though.
Attached patch adds a decay to the threshold that (estimated)
abbreviated key cardinality must cross as a proportion of the
(estimated) cardinality of the original set of strings to be sorted,
while also quadrupling the initial required proportional cardinality
to 20% of full key cardinality (but for just the first 10,000 strings,
before this threshold is decayed aggressively). This seems
significantly less invasive than what I previously proposed, while
still being effective. I suggest that this be treated as a bugfix,
since Tomas reports a 5% -7% regression with his case. OTOH, with this
patch, I saw the same case have a ~300% improvement in performance.
The adjusted cost model strongly prefers to decide to abort early,
which seems desirable. In one way this makes it a little more
vulnerable to skewed sets, since early data matters more, but in
another more important way it's less vulnerable, since perhaps that
skew indicates a general skew in the data, a skew that makes the idea
of measuring the cardinality of abbreviated keys as a proxy for full
key cardinality less useful.
Thoughts?
--
Peter Geoghegan
Attachments:
cost_model.patchtext/x-patch; charset=US-ASCII; name=cost_model.patchDownload
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index 3edd283..d7ff7ad 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -66,6 +66,7 @@ typedef struct
bool collate_c;
hyperLogLogState abbr_card; /* Abbreviated key cardinality state */
hyperLogLogState full_card; /* Full key cardinality state */
+ double prop_card; /* Required cardinality proportion */
#ifdef HAVE_LOCALE_T
pg_locale_t locale;
#endif
@@ -1845,6 +1846,7 @@ btsortsupport_worker(SortSupport ssup, Oid collid)
*/
if (abbreviate)
{
+ tss->prop_card = 0.20;
initHyperLogLog(&tss->abbr_card, 10);
initHyperLogLog(&tss->full_card, 10);
ssup->abbrev_full_comparator = ssup->comparator;
@@ -2125,7 +2127,7 @@ bttext_abbrev_abort(int memtupcount, SortSupport ssup)
Assert(ssup->abbreviate);
/* Have a little patience */
- if (memtupcount < 20)
+ if (memtupcount < 100)
return false;
abbrev_distinct = estimateHyperLogLog(&tss->abbr_card);
@@ -2151,8 +2153,9 @@ bttext_abbrev_abort(int memtupcount, SortSupport ssup)
{
double norm_abbrev_card = abbrev_distinct / (double) memtupcount;
- elog(DEBUG_elog_output, "abbrev_distinct after %d: %f (key_distinct: %f, norm_abbrev_card: %f)",
- memtupcount, abbrev_distinct, key_distinct, norm_abbrev_card);
+ elog(DEBUG_elog_output, "abbrev_distinct after %d: %f (key_distinct: %f, norm_abbrev_card: %f, prop_card: %f)",
+ memtupcount, abbrev_distinct, key_distinct, norm_abbrev_card,
+ tss->prop_card);
}
#endif
@@ -2172,8 +2175,38 @@ bttext_abbrev_abort(int memtupcount, SortSupport ssup)
* abbreviated comparison with a cheap memcmp()-based authoritative
* resolution are equivalent.
*/
- if (abbrev_distinct > key_distinct * 0.05)
+ if (abbrev_distinct > key_distinct * tss->prop_card)
+ {
+ /*
+ * When we have exceeded 10,000 tuples, decay required cardinality
+ * aggressively for next call.
+ *
+ * This is useful because the number of comparisons required on average
+ * increases at a linearithmic rate, and at roughly 10,000 tuples that
+ * factor will start to dominate over the linear costs of string
+ * transformation (this is a conservative estimate). The decay rate is
+ * chosen to be a little less aggressive than halving -- which (since
+ * we're called at points at which memtupcount has doubled) would never
+ * see the cost model actually abort past the first call following a
+ * decay. This decay rate is mostly a precaution against a sudden,
+ * violent swing in how well abbreviated cardinality tracks full key
+ * cardinality. The decay also serves to prevent a marginal case from
+ * being aborted too late, when too much has already been invested in
+ * string transformation.
+ *
+ * It's possible for sets of several million distinct strings with mere
+ * tens of thousands of distinct abbreviated keys to still benefit very
+ * significantly. This will generally occur provided each abbreviated
+ * key is a proxy for a roughly uniform number of the set's full keys.
+ * If it isn't so, we hope to catch that early and abort. If it isn't
+ * caught early, by the time the problem is apparent it's probably not
+ * worth aborting.
+ */
+ if (memtupcount > 10000)
+ tss->prop_card *= 0.65;
+
return false;
+ }
/*
* Abort abbreviation strategy.
@@ -2187,8 +2220,8 @@ bttext_abbrev_abort(int memtupcount, SortSupport ssup)
* lose but much to gain, which our strategy reflects.
*/
#ifdef DEBUG_ABBREV_KEYS
- elog(DEBUG_elog_output, "would have aborted abbreviation due to worst-case at %d. abbrev_distinct: %f, key_distinct: %f",
- memtupcount, abbrev_distinct, key_distinct);
+ elog(DEBUG_elog_output, "would have aborted abbreviation due to worst-case at %d. abbrev_distinct: %f, key_distinct: %f, prop_card: %f",
+ memtupcount, abbrev_distinct, key_distinct, tss->prop_card);
/* Actually abort only when debugging is disabled */
return false;
#endif
Hi,
On 22.2.2015 02:59, Peter Geoghegan wrote:
Over in the abbreviated keys for numeric thread, Tomas Vondra reports
a case where the ad-hoc cost model of abbreviated keys/sortsupport for
text is far too conservative, and aborts a case where abbreviated keys
can still greatly help.Previously, I proposed additions to the cost model that dealt with all
this, by passing down a reltuples hint from the optimizer or a
relcache entry to tuplesort. Robert seemed to think that this was a
little bit much, and that it could be discussed at a later point. It's
clear that we need to do something about cases like the one reported
by Tomas, though.Attached patch adds a decay to the threshold that (estimated)
abbreviated key cardinality must cross as a proportion of the
(estimated) cardinality of the original set of strings to be sorted,
while also quadrupling the initial required proportional cardinality
to 20% of full key cardinality (but for just the first 10,000 strings,
before this threshold is decayed aggressively). This seems
significantly less invasive than what I previously proposed, while
still being effective. I suggest that this be treated as a bugfix,
since Tomas reports a 5% -7% regression with his case. OTOH, with this
patch, I saw the same case have a ~300% improvement in performance.The adjusted cost model strongly prefers to decide to abort early,
which seems desirable. In one way this makes it a little more
vulnerable to skewed sets, since early data matters more, but in
another more important way it's less vulnerable, since perhaps that
skew indicates a general skew in the data, a skew that makes the idea
of measuring the cardinality of abbreviated keys as a proxy for full
key cardinality less useful.Thoughts?
I've done some more testing, and this helps a lot. I did the tests on
two different machines (one with Xeon E5450, one with i5-2500k), to see
how the patches (this and the datum/numeric ones) behave on different
CPUs. I also extended the tests a bit, to test random data (as before,
and data already sorted in ASC/DESC order). And of course, I did that
with different dataset sizes.
The aggregated results are available as a spreadsheet here:
Xeon E5450: http://bit.ly/1AyRN7M
i5-2500k: http://bit.ly/1z9dLdP
You probably only want to look at the 'summary' sheet, which shows
speedup vs. master. Detailed results along with benchmarking scripts and
logs attached.
In short, this fixes all the cases except for the ASC sorted data. I
haven't done any code review, but I think we want this.
I'll use data from the i5-2500k, but it applies to the Xeon too, except
that the Xeon results are more noisy and the speedups are not that
significant.
For the 'text' data type, and 'random' dataset, the results are these:
scale datum cost-model
-------------------------------
100000 328% 323%
1000000 392% 391%
2000000 96% 565%
3000000 97% 572%
4000000 97% 571%
5000000 98% 570%
The numbers are speedup vs. master, so 100% means exactly the same
speed, 200% means twice as fast.
So while with 'datum' patch this actually caused very nice speedup for
small datasets - about 3-4x speedup up to 1M rows, for larger datasets
we've seen small regression (~3% slower). With the cost model fix, we
actually see a significant speedup (about 5.7x) for these cases.
I haven't verified whether this produces the same results, but if it
does this is very nice.
For 'DESC' dataset (i.e. data sorted in reverse order), we do get even
better numbers, with up to 6.5x speedup on large datasets.
But for 'ASC' dataset (i.e. already sorted data), we do get this:
scale datum cost-model
-------------------------------
100000 85% 84%
1000000 87% 87%
2000000 76% 96%
3000000 82% 90%
4000000 91% 83%
5000000 93% 81%
Ummm, not that great, I guess :-(
--
Tomas Vondra http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
results-xeon-e5450.tgzapplication/x-tgz; name=results-xeon-e5450.tgzDownload
� �B�T ��m�\�����_�G���|�f���G>G����5�V�v��"�@��
@c�
I�����'�V#�����������=Dg����7_"#�x�oW�o�^���g���?�\�9������V��[�|H5�$�I�8|M�\��$����������O�]_��W��������n>�L����_?���z��4x���%��������?����������o��>p�����C��������O.��������/��x��������'w�7��������>�|s��������w�����|u��/o�������/o�{����w���^������w_���������7��__������?�^����W/__}~�B�,�]��Y
b��!�������_�T��.CpS%]����y�/k�9Z��h��y*�>\��g��e�q.h�1w���Y0.��sA��%���>�Q����������<�~�����an_��?����*�g)��X4/�i��|$M�������
����-q�!�5���r�h��F)��K2��`���5�7 �s�6����M������)��a�/���������e������^�����Ob�/�<\((e���+�X�F-S�O�2�6�(�%S����z)z���j��e?�et�m��`�29=����������FM��+�/�+�Wz�6���kZC������7�GwY�����+��O�<���8�~�Qj�2�����T�|Sj4\b�
�e�����H��y���h{��8�XRW��(��q�6�^�y'CA/�'{�+�$;�<�dw]c�Dj�6�1�w�g���H�3�#�f���'N�1����Fi�W�m�����6��l"�x����
5�/��� �N������u3��/F���-�aW�C�Y�2o]�g�l������������5bM�F�����c'���U0 �i�S��x�fvf�FZP�(�i4��-2�x���M���d��z��X����)|�~����g�h��z��1
%6�2�-��yrVY������.��y������y����o�f!��������.��|h�}��+��I�=���X�Y�^+��l����F�^����������#�=����k��}O�^���������y�RU�����EAN�F�%�������<v7g�/��%s{����}������h�jA��i����B��`:�.�XZ6�^Zv9�^�U2�=o���'���e)�Y��e�3�S?����s�t@^7��X��|��oa>��'�|��C���/ix��|R��t+��s�M ��n(��=@V?��
���%���i�ww�~���7���~���.,o�{r��
������3�U��L7/Y��n��C��'7�y���y��$�3<��gMX~��i!�������y/Lf/�<�J'�S�_?����K!��w��?t/�a�����}(�b;o�I6��<!W�������#��yze9^;?Ow���y���y������]����C�o�{�/�x��������s�%���Y���s<~�e/�����5Cv�@�C���^5C��S�����<:�7u���Y�A������I�V�i�����7n�����x�����w�����g�O~_f��LT���:���T�t���7O��&<����������X���$>�c���A+�|���}�����O&GS?&�����<q��e��� 2Ia�o�rF���M{[0�Z0k��e�t��5�/���FA����DK�V�`���G9����}L��� 1��4jd�i�|K��mK��S�s���%p(��*��g>6VA�o�hE\����]F�����������K���������S�����'n/g��S1������O��P�����'���V��� 3����:�����6\��w2�������A6����7�5����!o��C��(���,�77m�U��0��=#�;*��������L������I'8A�H�~���*�W���\�r0����7�c���>.G?���b^��'���%6����ML�;�M���MJ����7��|o��\�����j�������b��z��?p�;`�ay�G���N�d�����4j��3�����������C�.�'������K��w�~����K�����|U�e��3�����SS�re�W8O7r�u�t���������������_7��<tpEh���M����������l�����m�<�25���2�_���N{��=0�<�=�7��`�N����L{%��[3���!���"o=|�{7��=�����5�nfn�G<�u�����I(���>��N���OdF�>}q���<d`��#;e����s�d��r�|���������}����2��6;as��yw$(o���{��=��9W~�&�����Q��a@nW������f;�>F�R�a��M���f�WwkDm������*�\@��\������L�����#��������� ����K�z���$��K����FfO�/�0+�'�R����}�b�����y 5h��d�-`� ��mD�Rp��d{ \P&�����f*��vK�+�����F�Ox#XR�k�x����f���*|�#�S����q"
��8�hU'�A���[s�r-�=x$�V&}��k/��J�H��U,<�i�sp�z~�'�.�����m��H:;pj^P!������A�D/���^C%I�|��A_Zza��l��r�`� @Q�z�(�-n�4��F������=[c��2���������!���y�� ���w{�! /90D�f^��gjV�!��D!��$:��A�����^�K��l� �ZA��]
]�3��;��H�?�!6���� ��D�aB�Q�
����O�A�-tn����~ m6�x^I�K5��`%%���/^?B�$�Bt��.���H���W���\�5U:�)����}����'6�o�;Y� ���.�����Pk�h� W�w+�7���m����Q�7_��N�5i�������i�U��I�
Xq����|��w�b��>�_�<�u��T?�G5W�\�J�#��c��������,��*���A�a�;����id��
�l�0 �l��l~3Dx��_`�L{��p�M�z>UyBp[���G��h���3C��y� 4$���Y����?;���������c�`D�
{�)���;�PVuQ�����c�s��#�N�����W�i/���/E���e2���������g���<���������F��3�G�v��+%��F}��~w����yd��������T����o[��
�|�W����vmoL����i���-
�"