FW: Possible optimisation: push down SORT and LIMIT nodes

Started by Christopher Wilsonover 7 years ago6 messages
#1Christopher Wilson
chris.wilson@cantabcapital.com
1 attachment(s)

Dear Postgres developers,

I sent this query to the performance list a couple of days ago, but nobody has come up with any suggestions. I was wondering if you'd like to consider it?

If this is interesting but nobody has time to implement it, then I would potentially be willing to implement and submit it myself, in my own time. I am experienced with C and C++, but I have not modified Postgres before, and I would need significant support (e.g. on IRC) to help me to find my way around the codebase and finish the task in an acceptable amount of time.

Thanks, Chris.

From: Christopher Wilson
Sent: 30 May 2018 16:47
To: 'pgsql-performance@postgresql.org'
Cc: Steven Winfield (steven.winfield@cantabcapital.com)
Subject: Possible optimisation: push down SORT and LIMIT nodes

Hi all,

We have a query which is rather slow (about 10 seconds), and it looks like this:

select inventory.date, asset.name, inventory.quantity
from temp.inventory
left outer join temp.asset on asset.id = id_asset
order by inventory.date, asset.name
limit 100

The inventory table has the quantity of each asset in the inventory on each date (complete SQL to create and populate the tables with dummy data is below). The query plan looks like this (the non-parallel version is similar):

[cid:image002.png@01D3F831.EEE55810]

Or in text form:

Limit (cost=217591.77..217603.60 rows=100 width=32) (actual time=9122.235..9122.535 rows=100 loops=1)
Buffers: shared hit=6645, temp read=6363 written=6364
-> Gather Merge (cost=217591.77..790859.62 rows=4844517 width=32) (actual time=9122.232..9122.518 rows=100 loops=1)
Workers Planned: 3
Workers Launched: 3
Buffers: shared hit=6645, temp read=6363 written=6364
-> Sort (cost=216591.73..220628.83 rows=1614839 width=32) (actual time=8879.909..8880.030 rows=727 loops=4)
Sort Key: inventory.date, asset.name
Sort Method: external merge Disk: 50904kB
Buffers: shared hit=27365, temp read=25943 written=25947
-> Hash Join (cost=26.52..50077.94 rows=1614839 width=32) (actual time=0.788..722.095 rows=1251500 loops=4)
Hash Cond: (inventory.id_asset = asset.id)
Buffers: shared hit=27236
-> Parallel Seq Scan on inventory (cost=0.00..29678.39 rows=1614839 width=12) (actual time=0.025..237.977 rows=1251500 loops=4)
Buffers: shared hit=27060
-> Hash (cost=14.01..14.01 rows=1001 width=28) (actual time=0.600..0.600 rows=1001 loops=4)
Buckets: 1024 Batches: 1 Memory Usage: 68kB
Buffers: shared hit=32
-> Seq Scan on asset (cost=0.00..14.01 rows=1001 width=28) (actual time=0.026..0.279 rows=1001 loops=4)
Buffers: shared hit=32
Planning time: 0.276 ms
Execution time: 9180.144 ms

I can see why it does this, but I can also imagine a potential optimisation, which would enable it to select far fewer rows from the inventory table.

As we are joining to the primary key of the asset table, we know that this join will not add extra rows to the output. Every output row comes from a distinct inventory row. Therefore only 100 rows of the inventory table are required. But which ones?

If we selected exactly 100 rows from inventory, ordered by date, then all of the dates that were complete (every row for that date returned) would be in the output. However, if there is a date which is incomplete (we haven't emitted all the inventory records for that date), then it's possible that we would need some records that we haven't emitted yet. This can only be known after joining to the asset table and sorting this last group by both date and asset name.

But we do know that there can only be 0 or 1 incomplete groups: either the last group (by date) is incomplete, if the LIMIT cut it off in mid-group, or its end coincided with the LIMIT boundary and it is complete. As long as we continue selecting rows from this table until we exhaust the prefix of the overall SORT which applies to it (in this case, just the date) then it will be complete, and we will have all the inventory rows that can appear in the output (because no possible values of columns that appear later in the sort order can cause any rows with different dates to appear in the output).

I'm imagining something like a sort-limit-finish node, which sorts its input and then returns at least the limit number of rows, but keeps returning rows until it exhausts the last sort prefix that it read.

This node could be created from an ordinary SORT and LIMIT pair:

SORT + LIMIT -> SORT-LIMIT-FINISH + SORT + LIMIT

And then pushed down through either a left join, or an inner join on a foreign key, when the right side is unique, and no columns from the right side appear in WHERE conditions, nor anywhere in the sort order except at the end. This sort column suffix would be removed by pushing the node down. The resulting query plan would then look something like:

Index Scan on inventory
SORT-LIMIT-FINISH(sort=[inventory.date], limit=100) (pushed down through the join to asset)
Seq Scan on asset
Hash Left Join (inventory.id_asset = asset.id)
Sort (inventory.date, asset.name)
Limit (100)

And would emit only about 100-1000 inventory rows from the index scan.

Does this sound correct, reasonable and potentially interesting to Postgres developers?

SQL to reproduce:

create schema temp;
create table temp.asset (
id serial primary key,
name text
);
insert into temp.asset (name) select 'Thing ' || random()::text as name from generate_series(0, 1000) as s;
create table temp.inventory (
date date,
id_asset int,
quantity int,
primary key (date, id_asset),
CONSTRAINT id_asset_fk FOREIGN KEY (id_asset) REFERENCES temp.asset (id) MATCH SIMPLE
);
insert into temp.inventory (date, id_asset, quantity)
select current_date - days, asset.id, random() from temp.asset, generate_series(0, 5000) as days;

Thanks, Chris.

This email is confidential. If you are not the intended recipient, please advise us immediately and delete this message.
The registered name of Cantab- part of GAM Systematic is Cantab Capital Partners LLP.
See - http://www.gam.com/en/Legal/Email+disclosures+EU for further information on confidentiality, the risks of non-secure electronic communication, and certain disclosures which we are required to make in accordance with applicable legislation and regulations.
If you cannot access this link, please notify us by reply message and we will send the contents to you.

GAM Holding AG and its subsidiaries (Cantab – GAM Systematic) will collect and use information about you in the course of your interactions with us.
Full details about the data types we collect and what we use this for and your related rights is set out in our online privacy policy at https://www.gam.com/en/legal/privacy-policy.
Please familiarise yourself with this policy and check it from time to time for updates as it supplements this notice.

Attachments:

image002.pngimage/png; name=image002.pngDownload
�PNG


IHDR#��}$ysRGB���<�IDATx^��	|T�������,@@@�QAQKY,5VE�U��O��b]������A[7��ZPY*j�����	!�&;d�33���?�L&73��;3w�����ps8�,���g~�?��n�8@@@@B@@�4�$����#���@@@BEJ+Td�.���@i��P��
Y�   *�=L��`��'[�6��DK'�L��i�g�Ai��qcQ�������1K�����=���.��#����x$���J�'������T��2���{�mZ��������S	B�������~���0�6"�>�b���=*��������-�`P���������9cN��'iTF�jP����;��3YN����PR6�;s��I=������lM95���l}�����x���7�_�T�n�����6�kM�6�@��J�Q��*{�V���K=���5���j��i���>�|��p<�����U�)I#$kTSr�{�(�h����T�*Y��aX�a�L���nG�c?����B$�LF��)EC�{�������t#_�����?%��kSNOB`}��z��#nlx���O��ZR�O�L/i�W4��i�6��[�v��n�V���f7�XH��^k�[u���T�Z�i������A��� J��������T����L��}�/�G_����=+E�u���&�%�6�ZM��[+M'[����iZu������{�
f[����Q��;-3599"1�����JS���c���b;�bm�X�	����M
�	�Z� ���6�_>
�z�{�yAa���'�\1���3�_�������sf�1���;0��AY?�����7_=6��C��d���z���E��5vX���93����zLN����7z�r$_���k���l����1�h��m>vh��j��C-�-��s>>��V&bH����_}��~��3_�9����)��u��l�����w����C
��cL�nr�aR�����a�:���������t}~�V~R�)�@a���O����e��'�c3
3W65�U@4whEc���]g�r������Q���f�7���z��o����v���W����8���w��w�������r�������4�mn�I��2A�C��>(9����6��V�|P_�N�Fe���B/^�%:i�*K������Z8��n��>��d���7����F����{~����<������:5U��S�V�u��b�~��_6��,��V���!����k����;�J�L�����������~����&'������#�V4��F(,���Il���������|Z$��5g�����n}1��7�k�`e;�}u���"�e�����-]�=��Hli�4�Y���}�mY�e�iVB��O`�G�;[���S���v�Y�)-�O�K�������������|�c.E�	RZE?����h4�V���������~9���]�Qi������TZ�/�p���o��p�Z]���3��kj���*�AC����
����E�*0���+0#����Z[��=)I�������oz����_5��EN��,�p�p����;�b��	�hE,���[���Ci]���5���_v�J+���L���Y��7���,MV��Ai%��:5Mv�{��sK��b��i0�zH���o��9���<R�fBq�5��?[��x�u��%�����m7�II���s�Bi2�g�}v���+�Hi-Y"���ST�������b��=%Ex�i�J������������Y�n�6�=]�y��(-%-���F+�o/E��5���n�����e�]�F�����+����rA���e��[t���Js������W}���������vQZ4�:�������@��45�B#�������\i�������4�2��>
�=<�4PZ�z���(�`��G�U\�3������}6���}i,.�2�<(-��(fA�G�hx�JE,m>-�����\6�,=���R���x�V���c����S�����g*�O�ik��W���h��d��u�Z����p%�){��������]v�>+K���.-��X��CC�����4mr�8p ��(�'��-��8Vk�?����ha�V��jZ��� ����/���hhEaA���w��V/Qx�J�0�����f�{��W�I����K_��g����%�����2}���~��O���;W^Q�����o^��c����V���,(/����XB��d.���{��a@�'�M�/*�i���Bz]�E��i27-��>UVA���:��!�N�V�N�Gm?r��S?�o�V��#���"t�y�o%�e]j\8Bp��7������c�bT�O��;s���\X)5N��[3�ldQ���2N��1��ya4
�=&��%��.����>��K���_����y��V���S��7�wp��ys.���������R[:���0������ ���$������������g!�>�V?<��?�Z��!5:c��w9�^S�1��Hcl����=1y�v�v���0�V=r���/U�,�{]cik���_>��=��������m�������-��e�hU�*
*�Dk#[~�=�7|�������S��,w���;K���H���Z��sb�4���m^�����������>������?��V_uJm��Q�)|��cD<,(���8hEF���-j~��w�����F�{����U�%o�WZ4l��(c)�����-�!Iy��{|��������P�]��W���N�%�����=�1��z74�L���T��7[�����Q�ai�!)js�I*�h8�������r��A�jm6�%�F�#^���0��,g��++�U��qt]c�����i�����#�V�o���Py�������%=Hf��C��O�&�Es�/�w���o����8�"�-z
�?��K�������?������x����/�#��O�k9��\��y��j���{�(���.V`��	�hD�-���nci���1��La/��qL��
qW�4���������s[�e���_sM����c��Ne�X�0���T�tnVg�h����~{�j�Z��n�Z��UU5�s��x�b�B�U3�Q�{�^��m��c���u�?��%h�!?�r$�H�UW[jk-R��P��*�-h=�������*�O���}��rZwh������Q�Ihn���a`����}�7	t�4M�~%�p����Z��>4XPk�K���eR!�Z}�F�A�E���Dlv�r������$W����3R�^�3��'���}��z�8�^��>��2�����NJ�k�����"���)wj&h� -�E��^�a�����B����K���~7��m��K�i0�s�"4�p[���a'Ym����]���~�.�f��������9F�V��*Y��Q����L"K���>���*��ZvA�ls�������O��%\�@,�Z��>4�ZP�g�a�C��w����f�Y�����.���Cv�����S��o��zK�<J�������������O��
)Y��z�Y��=��E��+9�rd�"������\<���S�f�����>>�IO�����Zm�:�|X5�,������,|U�m����ju����"2����/g���n����L�5V@������s�O��U~�C<���G�V4����!�E���!]S��w�2�5��i0]�������O4���sn�_
;���k��sv���i
�O�5b�=#f/}��2��'������w�9�g�S������l�@v�����T�����{�������@�Y�����C�M\������f��I��W����w�������Gi+
y��/�|�����[~���������!'?�6[���I���z�23uY3B~��'�|�2k�����c�'�C�'~+��oe�G�t���n��V�~�'��$v�=dO?�R��dv&�RT��?�S����
<�O�@�y����<�N{���,�q��FP�����4���\M!S�����������T�B��h�������h{���>����mMG-�g(��Zij<��p�N�]��Jb�a�&��;����RU����Zw����vn��\#>S��K��4����Ob�{rH�G#�I{t�-��c�m�������se�M��|�>������������GO2��is���=����8}�0����Y&lZ��x9 9��0�hhE��|d/������f�V��jW������N��+�F�r�>
��Cre��r�"L�+}�1re��b!6;
��Cre��r����w�����E���c�~{��tk�h��-0E��
����u��*t'�����4
�"�50U[����Y�]�����
=O����������w�~�f�|�i-�q����56[�j#���G2���oA������+���aN�[h�*Xs������������W]Gc>^y�u�������Ohh`q����P�33S�����|������"�}���7-�2�ZBj�ke�\B�
�>@+8����2[+�g4Qu���Z+��'�_�S�1���!���
������#�
��~�>ZJ�<��X��K1NK��j"������
A{��f�//�2j������D�2+V���>��e��$o����OrY��<��ok8�V����9[0�00��.9$�uZ�L(;!�=*�����qa�a�a������B9O�,5����N�I'����i��$��\8a�If���I��UU��s��kv��Zn+�����r,(��hEe�R6���"���Y<_�]H�,Z��d�V�f���4��G2��Y����f&�����LC�]�O�K�N>�P��M�,�X����$��C�9J����������g��o�W��f���i�/R4�~)�l���t�yY�k/	vl~O;�>�%o���Y�0]�_���uH��Ok�i��m���2H���u����C�NjJ����v���[�|Z��"i,��"������=o��S0�}����?��O���oQ��+^s�"g+��YWG��.������)S�}��6��A\>�wo��W|�8�������|V
����F���PZ
�>��r�z\Z����(��dM?��X�f�]z�<��s�>e�����NUVo1��Z5������7�s��=I���	�RtO������R��8��>�5
�R�E�sN�
��m������\>
J+��Y�Ii�n�_��E�XZ��z��\i����>B���Ox�?��i�7�SfQl���wEJK��T3g:�VF���PZ2m*��2�B+*���|ZpMn����}���(��������{1�N3{Z��g-���_��{4(-���6���V�z�G���v��G��R�������O�/�*"��7�J�����<=>��V�y���2%i�d�jJ�q�eMy����Q%�u2�>,qQZ�
}������W��������[Z��qZ�������8��^l�$����w*������Ue;y�X�9K�L��=*->��9���d�]w�.��ewj�gj*�Ytz=����x��	J�hr,�����TPh9�=BF�Z�����������
x��Q��JKN�TZ���/��+^_���0�0��Dr�|*�S�#s�vQ���oA��ZQ�������&��E��X1�vZ���!���h���coZ�����!�4��b���]������r���'g����U����,$%{�����=��#���{|����%��hHK�6���]�Vi�����*{�Qk�j�g������R���~5�_���
��'�vy����qHi�������rw����651�6����9<�����_��MX?R8!�	�<�:��P�F�A(*'�*�R��T�����'X�H~���<Ii��LC�WH{E�
s����k:��Kk�;Wp�{i�Z�a���m
?�Ui�+>~�`��u���������r��*��`2�3i�G��DmNM}������:O��N���>^�Bp(B��b�����n�|k��[�?����:
��V���6�P������
�������V{~�If�Vj�<�Ru���$��eY-H���:d��Q������rU��}��a6J��C����8^�`A9UG+*��"q�� �!C\+�����h'��x�YF�9H����Lwl>���_�JK�g &�m���Z�Mm��y�Jn��e�f����B�@�[����O��������l���#rqw|I�����]��}������;re����M���O�'0:�e�����Y�lb;+eS�~���0Hz)ZQE�3,�E��V�H��I(QWq�%�)
/
�����w95�(�<�<��.PZ>M�h�mi���_�8�3�yT�QfN�q[�pU,����/��U�k�<(n\(}�i�4ur�uz��FR{RZA|���Ut��i��U����C����1������C)�8�X��������\VR��"�����<��e���FeQ=������JJ�O��XHrvR�
L�8���+�j�{(TF���M^�.��j�����h�t�z��z���O�|�R5�1"J%��8�X�@8V4n;I��f�R���/����F6����@�Blx��p.2N�,�Y�&�����R>�!�iH��G.�i�|�� ��Z3\8evt8Pv��E�X}�������Hj	8N+`�|B���&�nE,�>+H�	m-�r�!\����J�'���@mDKV��&[�����<d4cB�1%�V�r�Z1������@�:����c��2=����$�Vj`p�.9{�{���I�Sp�a"1x�L��[��_��v�,�J�� �v��,�B�[QXP�]�m��S���	im��w��y��[f�1���y���W�����F^�?4��w6�7�l��#f���.;!�Wk�������}VI����FD'�K�����0�>�[2���-E�*��R�U&�Q(P���
�������Y;� �LmtF��z���bYi��r��W^Q[���*n���G���7�^P"+�����eBB+*��h�� k���P�+�����
���V�#o�����?|�o�LC����]x`���'�����NR���_�(Gia���1A@��n��Y#������-t����w�_cB���b���w�I�6�������[k�s}ZQ��Y�5�s���������Z�N�-Z�A2����555.����9�W6�EMqtP���A�[�\Y|��s�8�]����3=�1^�
uw��;n�`�u%y������B<�I�;��_b��TMsi:�Ml�<�:��R��?�T��c���x��eQ�P�iJ	(eAP���X��cA�2����Ci1�����(�:f�l����s�����H��� 6�7:��[vZ�A�Z�����2
����V����V
WQ~'���!��PR6��T�lHM�PX�fS��z�4��FeQ1h��!��GjL���9,$9��4L?�f���4�0)�?;3����46�:
�iH��C9�L�8a���!����Up��LCqM�z���
��o�QcQ�p�sY�(2�i��E�qZ��j�����t�w���i���q�8������V4R���W����f#�/��~�+�"S��/��]c��yRj}���$D�a:�n=r����<����,����r�z���Ym����l����M���Pt����#^���q#|�bg��i����|<��!� ��n8p�����J��#�VT9����V<[�{�Vo)��Ru����������^�����Zi��i��&��T�>��9��Z��3����������\d�"E�?����(B#���D��w>8���f�'G�
@��VUw#�~C�,<q  ��������,�
#eS�UD�PZQ`!:����b�J  �����@@Yd2HXPZ	kzT���:��@@ a	@i%��Q�n	(��h�D&����G�����  "PZ!�dc�Z�j9�@���VT��
#8��Y��@���J8���^����@@@YPZ������e]�l?�t��]x��Y��COK>��r8	�'�x�JK������e��p����(���6��r��\n�[(���j�qh��N��v��>��87��J*�0��&1���	��E
�9�F��i�JH�����~�����?"(-��K+�n��!s6-,XS��_aY�?��A��gn���m��hp	���Xuh�D6��+�N�3v����#r�d,Q
UY���A�W�m����T�,(}��s��#�=��Je�t ��T����
�D�0}��mL���F��2�#��������Sm�C�e���YA=��~/9�d�n'�%q�u_x��k�[�����E�t��(:"���RI:����C����_�m������bPj�(}��pdZ�?}����7H�zy�=A���]�����!���vy���Y����
�����ba�k�_���`�no���Se���+��vm�%���S$��e\XT��/�����|G������Z�P��Z��������oz����JY9qD� ����*\�����mq���)#�����-{��[Q�f���-Rs+gt�<����1C��qq`��.�,����Su���I&��^\(fT��t�K���G���K���X�r�v��n��9�Y�n\�����Eo��E�C��3���kQ��u��(X�M$�j��s��&z8}�w������M!�w����Fw�Z��s��>�(���}��f�Qtk%�����xi��O���`��+�9���=��{���=�h�}~GD1�*������b3�����O�wJ��'���y�y��7���B�>�f��������:s�7_���*z�\��+���9�N!�T�mc����v��5��l�_0�����S�{�,������.���q���j�[7���g����d������!�d�6�-g��T1�<<�����\���'^}����LA6FD�N��������x����K���xo���o�S�~�:�y�]��:R��B��(���-��(-�:��2����>&��a/�+���QXLO��$�,k�If;���[�u��.
�Y<���;����TZ�/eB�����|�(rhy�����i���K���@w�d�4�
D?T���<���@�+j�&
]	{��~5���'�8�VVJ+8�vx��oX��=q�	���|G��P+���~��(�p����8�T<�>�H�����K�d�����'L����[�!�|Bp��}~2�������L�a�Vw�t���e��p��&Q_�O�	�v��&��C����k���L)�^j����4<���1��^q�L�I~B{�����Z���7�D5m����;���h\P�c����B��%����2������[J�����7��r���z��:��8���vId�W����T:�f���w������R�p�����`��+bq��'�n��z7=����,�Zb��h
"8�5�i�A���Hpv}l���O�"�7^�h��G�O~��%��H���X������
��D;��#�����n�z�}�
����h���.A�O5;G��^�I�wPE��q���b��|1��B!��xd�c]7�H�V�:�BI�D7j����2�$�K��y ���"�s�a8��� ���,�k���� �
de�H�%�Vxy#�H�C+�@�  �X��J,{'xm��J������
?s�0B+*��B��@����w�~����  ~PZ�g�#O���%�� ��vN�Z�������@d@iE�;r� 8�"Y��@���J4�'b}��JD���  ����(E���.��@@����s����s�z  �����>(����R'�MJ�7#��]ph���Pr�PZ�aG��78�|3B�	@i)M�E
8���(�$.(���}B���27* �CJ+zl��(I-%i"-�@	@iJ��8�b�V()��(�x�(�C���c  %����(F���*�H@@@(-�%���S�BaA@ �	@i����zph%�����@�	@iE�(����R�&���V��@��C+Z-�r��@��J c�}U���{��  s��b�d(�,ph���H  !&�b�H>\��
i�  �(-?`!j��C+V,�r��@����{'D��J3��  ���b�h(�Wph��PZ�c�$@ph������V�#�0�C+������oPZ�!F4�C+��������v�|
On����'������T����/	b����kn�����������P�L�R��V0$q/��@(���&<���SG��Y�*V���-�K�\1����CFe�?�P<��1��u*���4���~��gc�4�4,��T�s��pW�9'Z��jv[���o��~��P�0�\���[���������.�����������h[.-(-�&FL������d��g�>;��]����d�7U������`��Y�jM;kL��Mb���u�]���:3]S�������
f�Om$�Y$��\8�p���./\\t������)�(�-�\�� ���F(!�$ ������l��9���'d��t��6\���ai:
�C;*�0��aB/��(��vd�~T�~x���)$/Y{v��t��iz�N@�!�2	���L;��%5����	Y��r�����6��1B+0n�@@ ���=���5����������
y��E��y��;y7Yz�TYF����5�����>-����C���D����gU��!���z���[vWS=/��������/�0����M{��Z���\�C�aZ,��7��c!{��*
�s��s�9B(��0��Uk��w���aw
�f������
��V��Gw�Ovq���?�8�J5��<�������	]d��
[�  � �n{If��:��r$������K.�i�����'�Y��o����3���������O+?���`�9)�Y~5���~7�������}����	G��0D��ZQo"@���qZ-����Rv��bG��3�_�gg~�Tt������k��4WW[jk---V/7�DE�m��<�n���c�����r���#����A@�F���������%�M����TWR�%�^N��6{�g�'*��?,c;i�"�\1}�Si]s�
�a��I?I�{�����;�BhJ�/f����+G���q~MoT��ph)I��(N����4��t>������
�3�kv&S�01d���$�O���w�B����B���y�F�BpD�MT�����v������ulm���������k�L��:�A�IFSK�6Z�6���9�1����=�  f�)���si��w��I��)d��kK�-g�;�4��Bh�{�;�?�']S����~�N9?i�;��X���5����v1$�5���22�YY�^�t����d��h��z��8�P���MW�O��]���%��d�G�4f��	�z��i�A=tC��}S"�NZ���v  ��{�py��s��>������U����8f�=|��B�i(�+��{���LC��C��E2�|>�P�E,�Y��]m�S�Z?���*�<=�����.�������.��~��_6�NJ;L!}u��y������1�?y�{��gn�I�*t����0�P�����h���u�c��(�Hf�I�l.�8��f�����]hms�4��>i��M�{h��CE�J2�e������6���n7n����t�C(�Z�|W�.�S}~ts��o����[4B�o��x]��_Tm�/P�����Z�!��@�	xSZ6��N����w�`(��"��9G������x�����/�gi�s�V��X�'�I�<�>5b�!���U����[O�ii�C�L
GLM����[*�*^`�1B+0n�@@ ��������R|�����OEDW$��WlSV�����G����>lV+���V��fs��8��+�^8���BL��[ii���
�,B����Y��y����
i$�4��s����!���5���R���?���p�-f#q�Rrl���jc���X�^6+�<������ph�/oJ�j��l����4��������s^!
~��]���w
;i�;�i��~�1���Uk���jv~�����|����C���N��N�&G����P��@�Zg��7������|���#$��>)����3d��I�`i�ph���  QB����_�H�|��C���|^!/=��HZ9w0d�
���
)�4�y��������Y$�.�Q2t����=�;'J�#-�,��O��������Pzo�yW���%�����%��[�������t����^k\�D��[~����p����ICS'�Io��_����S��<����N)���|(��@b����@PV�X���^3��r�D����:���4���!Y�!H~������C��I��k|�����>D�cM�v���%��I��Z]�Si9�Fb�JZk�RcMn��Y����-m6�~��m���:����1n�2>���U�.�i���W\\{��?r���l�>��5|�;u��Y%��3�g��[���1{�������Yx�A@b��7�u�ULi�����"%*$��Hi��fx{[���.BRZ]%��!*-Z�������Li���4I���PZ1�5E���"�5������o��=���J�z����u �����+�+����m/����1[�]�g��C��][��4N�������9(�P�i���4oJ��+�������]V.%]%*-:��E�������J��*Q����y��!���^]~!���O���b�r��wv����p�u������Q�L��	����Z��|�,���IJ�>��u����PS8�@E��[�75����
)�����L[�V�PZFRZ�����
���e/NE�G"   � ��i��2��h$�s�B��C=���&v�`Hc���l��8��f��"�E�j=�-�>��E��t��>	��Y<�4$w���?^����|w�z:W����6�K�������h��~����|�N~;�,��.�mV_������V���`�����n�����oX���"���@��������zE�����'O��'-=�+T*d�O�D�h+	� ����i����n!�d4����^�i���,_�t������.������O����J`���F^����4/�>%�l���O)��@@ ��)��g����������v0��WH�7��B�l����8���yW�-������Eh�����!y���������{K���~�k�a�IE���d�,f2%�J������D�   �����Li�����r�u~����J��QY����]H��/��mlD<[,�8�K_#�#B'F3]�_%��aS�U�W���_��X�W�;L���1�=��VI}�v+��z��Y2��'�-��b!���]��w��GY�d������O#f������A>�������
  :~�Ms�T(�A�s7�f���]�Q���.�"���	��_�����w!M$����V�p����#���Y`3[��|��l�9H��FB2����-Ux���'����t�S<I/<�9�d�S��\���6�"��,n���W����w:�f;t�jq!|\��C���8W~����
�����X�wa��BB�
i��x�]�����e��~��8�N�:0�0�O&SX$�h�hv2��B2�	,1D�a���1!�1�0�%G�   �����~z?*�[��:G�c�a�����qZ|T�X����iT-� �S����qZ��4*k���L
�NCC�8��x<PH�p�Oi��i�W����|�����B�d^!�`���j���N�q��B��/�Xw�$�|Z��i����JJ��$V8B��7�G�K�=��x���}�C�=T�vH@@b�����������:���O��}|�>/����������3�c�6x�8~���I�LC��.��/V�_|�pA�{H��s�{j��i)�-Rp%�a���;���}�,V:�������"O�������t6"���#���@����a(���$�Ai�����j��D�'J9�zp�/������1����r���j
  �I����^B�U�6��a��.x�!�f���'�WH#��������|R!���
i�!�1��bC�I��I��o���w����,�=�����b�L($��@"�c7�D�u�k�a|T���$��vA�@@@����k��C�Q(��`F&   	IJ+!��J������VX0#��$���fG�A@@�BJ+,��	���@B��JH���   a!���@@@ !	@i%��Qi����
fd   �������4���@X@i�s�e��E-�s�F�A@@ ��Qi��;"�w�+D�
�@@ Q	�Qi%*b�@@@ a	�UZ�
Q9G�Rg�������/��mq�J�^�hx�|�
�a���7EC�Q�=r���E[���X���M$��/.��(�p�S�������h�|�b���t���v��E�c�SL�X���`�6�.>J1�-.��
�`�#��O
  1r�V��j�J�t���C
��3T��1}���.��W��W<��}[YE�*��A���b�D�����hP�0q�����#��%������)-�V.XS���_��W�?��cb�����]���+�W6�h!����RR�@@b��<��<E3�N���������%X��J�m���0�p�ct=�������A���K�$  �L@�����B�X��e��/i��@���M�����+*��:�;��S"X�"F�G���i����%�8 ��.�8��   QH@�O+
�"����@����z��   1KJ+fM�����D=(��7
   ���!�_FVIEND�B`�
#2James Coleman
jtc331@gmail.com
In reply to: Christopher Wilson (#1)
1 attachment(s)
Re: FW: Possible optimisation: push down SORT and LIMIT nodes

The incremental sort patch seems to significantly improve performance for
your query: https://commitfest.postgresql.org/17/1124/

On Fri, Jun 1, 2018 at 7:46 AM, Christopher Wilson <
chris.wilson@cantabcapital.com> wrote:

Show quoted text

Dear Postgres developers,

I sent this query to the performance list a couple of days ago, but nobody
has come up with any suggestions. I was wondering if you’d like to consider
it?

If this is interesting but nobody has time to implement it, then I would
potentially be willing to implement and submit it myself, in my own time. I
am experienced with C and C++, but I have not modified Postgres before, and
I would need significant support (e.g. on IRC) to help me to find my way
around the codebase and finish the task in an acceptable amount of time.

Thanks, Chris.

*From:* Christopher Wilson
*Sent:* 30 May 2018 16:47
*To:* 'pgsql-performance@postgresql.org'
*Cc:* Steven Winfield (steven.winfield@cantabcapital.com)
*Subject:* Possible optimisation: push down SORT and LIMIT nodes

Hi all,

We have a query which is rather slow (about 10 seconds), and it looks like
this:

select inventory.date, asset.name, inventory.quantity

from temp.inventory

left outer join temp.asset on asset.id = id_asset

order by inventory.date, asset.name

limit 100

The inventory table has the quantity of each asset in the inventory on
each date (complete SQL to create and populate the tables with dummy data
is below). The query plan looks like this (the non-parallel version is
similar):

Or in text form:

Limit (cost=217591.77..217603.60 rows=100 width=32) (actual
time=9122.235..9122.535 rows=100 loops=1)

Buffers: shared hit=6645, temp read=6363 written=6364

-> Gather Merge (cost=217591.77..790859.62 rows=4844517 width=32)
(actual time=9122.232..9122.518 rows=100 loops=1)

Workers Planned: 3

Workers Launched: 3

Buffers: shared hit=6645, temp read=6363 written=6364

-> Sort (cost=216591.73..220628.83 rows=1614839 width=32)
(actual time=8879.909..8880.030 rows=727 loops=4)

Sort Key: inventory.date, asset.name

Sort Method: external merge Disk: 50904kB

Buffers: shared hit=27365, temp read=25943 written=25947

-> Hash Join (cost=26.52..50077.94 rows=1614839 width=32)
(actual time=0.788..722.095 rows=1251500 loops=4)

Hash Cond: (inventory.id_asset = asset.id)

Buffers: shared hit=27236

-> Parallel Seq Scan on inventory
(cost=0.00..29678.39 rows=1614839 width=12) (actual time=0.025..237.977
rows=1251500 loops=4)

Buffers: shared hit=27060

-> Hash (cost=14.01..14.01 rows=1001 width=28)
(actual time=0.600..0.600 rows=1001 loops=4)

Buckets: 1024 Batches: 1 Memory Usage: 68kB

Buffers: shared hit=32

-> Seq Scan on asset (cost=0.00..14.01
rows=1001 width=28) (actual time=0.026..0.279 rows=1001 loops=4)

Buffers: shared hit=32

Planning time: 0.276 ms

Execution time: 9180.144 ms

I can see why it does this, but I can also imagine a potential
optimisation, which would enable it to select far fewer rows from the
inventory table.

As we are joining to the primary key of the asset table, we know that this
join will not add extra rows to the output. Every output row comes from a
distinct inventory row. Therefore only 100 rows of the inventory table are
required. But which ones?

If we selected exactly 100 rows from inventory, ordered by date, then all
of the dates that were complete (every row for that date returned) would be
in the output. However, if there is a date which is incomplete (we haven’t
emitted all the inventory records for that date), then it’s possible that
we would need some records that we haven’t emitted yet. This can only be
known after joining to the asset table and sorting this last group by
*both* date and asset name.

But we do know that there can only be 0 or 1 incomplete groups: either the
last group (by date) is incomplete, if the LIMIT cut it off in mid-group,
or its end coincided with the LIMIT boundary and it is complete. As long as
we continue selecting rows from this table until we exhaust the prefix of
the overall SORT which applies to it (in this case, just the date) then it
will be complete, and we will have all the inventory rows that can appear
in the output (because no possible values of columns that appear later in
the sort order can cause any rows with different dates to appear in the
output).

I’m imagining something like a sort-limit-finish node, which sorts its
input and then returns at least the limit number of rows, but keeps
returning rows until it exhausts the last sort prefix that it read.

This node could be created from an ordinary SORT and LIMIT pair:

SORT + LIMIT -> SORT-LIMIT-FINISH + SORT + LIMIT

And then pushed down through either a left join, or an inner join on a
foreign key, when the right side is unique, and no columns from the right
side appear in WHERE conditions, nor anywhere in the sort order except at
the end. This sort column suffix would be removed by pushing the node down.
The resulting query plan would then look something like:

Index Scan on inventory

SORT-LIMIT-FINISH(sort=[inventory.date], limit=100) (pushed down through
the join to asset)

Seq Scan on asset

Hash Left Join (inventory.id_asset = asset.id)

Sort (inventory.date, asset.name)

Limit (100)

And would emit only about 100-1000 inventory rows from the index scan.

Does this sound correct, reasonable and potentially interesting to
Postgres developers?

SQL to reproduce:

create schema temp;

create table temp.asset (

id serial primary key,

name text

);

insert into temp.asset (name) select 'Thing ' || random()::text as name
from generate_series(0, 1000) as s;

create table temp.inventory (

date date,

id_asset int,

quantity int,

primary key (date, id_asset),

CONSTRAINT id_asset_fk FOREIGN KEY (id_asset) REFERENCES
temp.asset (id) MATCH SIMPLE

);

insert into temp.inventory (date, id_asset, quantity)

select current_date - days, asset.id, random() from temp.asset,
generate_series(0, 5000) as days;

Thanks, Chris.

------------------------------

*This email is confidential. If you are not the intended recipient, please
advise us immediately and delete this message. The registered name of
Cantab- part of GAM Systematic is Cantab Capital Partners LLP. See -
http://www.gam.com/en/Legal/Email+disclosures+EU
<http://www.gam.com/en/Legal/Email+disclosures+EU&gt; for further information
on confidentiality, the risks of non-secure electronic communication, and
certain disclosures which we are required to make in accordance with
applicable legislation and regulations. If you cannot access this link,
please notify us by reply message and we will send the contents to you.GAM
Holding AG and its subsidiaries (Cantab – GAM Systematic) will collect and
use information about you in the course of your interactions with us. Full
details about the data types we collect and what we use this for and your
related rights is set out in our online privacy policy at
https://www.gam.com/en/legal/privacy-policy
<https://www.gam.com/en/legal/privacy-policy&gt;. Please familiarise yourself
with this policy and check it from time to time for updates as it
supplements this notice------------------------------ *

Attachments:

image002.pngimage/png; name=image002.pngDownload
�PNG


IHDR#��}$ysRGB���<�IDATx^��	|T�������,@@@�QAQKY,5VE�U��O��b]������A[7��ZPY*j�����	!�&;d�33���?�L&73��;3w�����ps8�,���g~�?��n�8@@@@B@@�4�$����#���@@@BEJ+Td�.���@i��P��
Y�   *�=L��`��'[�6��DK'�L��i�g�Ai��qcQ�������1K�����=���.��#����x$���J�'������T��2���{�mZ��������S	B�������~���0�6"�>�b���=*��������-�`P���������9cN��'iTF�jP����;��3YN����PR6�;s��I=������lM95���l}�����x���7�_�T�n�����6�kM�6�@��J�Q��*{�V���K=���5���j��i���>�|��p<�����U�)I#$kTSr�{�(�h����T�*Y��aX�a�L���nG�c?����B$�LF��)EC�{�������t#_�����?%��kSNOB`}��z��#nlx���O��ZR�O�L/i�W4��i�6��[�v��n�V���f7�XH��^k�[u���T�Z�i������A��� J��������T����L��}�/�G_����=+E�u���&�%�6�ZM��[+M'[����iZu������{�
f[����Q��;-3599"1�����JS���c���b;�bm�X�	����M
�	�Z� ���6�_>
�z�{�yAa���'�\1���3�_�������sf�1���;0��AY?�����7_=6��C��d���z���E��5vX���93����zLN����7z�r$_���k���l����1�h��m>vh��j��C-�-��s>>��V&bH����_}��~��3_�9����)��u��l�����w����C
��cL�nr�aR�����a�:���������t}~�V~R�)�@a���O����e��'�c3
3W65�U@4whEc���]g�r������Q���f�7���z��o����v���W����8���w��w�������r�������4�mn�I��2A�C��>(9����6��V�|P_�N�Fe���B/^�%:i�*K������Z8��n��>��d���7����F����{~����<������:5U��S�V�u��b�~��_6��,��V���!����k����;�J�L�����������~����&'������#�V4��F(,���Il���������|Z$��5g�����n}1��7�k�`e;�}u���"�e�����-]�=��Hli�4�Y���}�mY�e�iVB��O`�G�;[���S���v�Y�)-�O�K�������������|�c.E�	RZE?����h4�V���������~9���]�Qi������TZ�/�p���o��p�Z]���3��kj���*�AC����
����E�*0���+0#����Z[��=)I�������oz����_5��EN��,�p�p����;�b��	�hE,���[���Ci]���5���_v�J+���L���Y��7���,MV��Ai%��:5Mv�{��sK��b��i0�zH���o��9���<R�fBq�5��?[��x�u��%�����m7�II���s�Bi2�g�}v���+�Hi-Y"���ST�������b��=%Ex�i�J������������Y�n�6�=]�y��(-%-���F+�o/E��5���n�����e�]�F�����+����rA���e��[t���Js������W}���������vQZ4�:�������@��45�B#�������\i�������4�2��>
�=<�4PZ�z���(�`��G�U\�3������}6���}i,.�2�<(-��(fA�G�hx�JE,m>-�����\6�,=���R���x�V���c����S�����g*�O�ik��W���h��d��u�Z����p%�){��������]v�>+K���.-��X��CC�����4mr�8p ��(�'��-��8Vk�?����ha�V��jZ��� ����/���hhEaA���w��V/Qx�J�0�����f�{��W�I����K_��g����%�����2}���~��O���;W^Q�����o^��c����V���,(/����XB��d.���{��a@�'�M�/*�i���Bz]�E��i27-��>UVA���:��!�N�V�N�Gm?r��S?�o�V��#���"t�y�o%�e]j\8Bp��7������c�bT�O��;s���\X)5N��[3�ldQ���2N��1��ya4
�=&��%��.����>��K���_����y��V���S��7�wp��ys.���������R[:���0������ ���$������������g!�>�V?<��?�Z��!5:c��w9�^S�1��Hcl����=1y�v�v���0�V=r���/U�,�{]cik���_>��=��������m�������-��e�hU�*
*�Dk#[~�=�7|�������S��,w���;K���H���Z��sb�4���m^�����������>������?��V_uJm��Q�)|��cD<,(���8hEF���-j~��w�����F�{����U�%o�WZ4l��(c)�����-�!Iy��{|��������P�]��W���N�%�����=�1��z74�L���T��7[�����Q�ai�!)js�I*�h8�������r��A�jm6�%�F�#^���0��,g��++�U��qt]c�����i�����#�V�o���Py�������%=Hf��C��O�&�Es�/�w���o����8�"�-z
�?��K�������?������x����/�#��O�k9��\��y��j���{�(���.V`��	�hD�-���nci���1��La/��qL��
qW�4���������s[�e���_sM����c��Ne�X�0���T�tnVg�h����~{�j�Z��n�Z��UU5�s��x�b�B�U3�Q�{�^��m��c���u�?��%h�!?�r$�H�UW[jk-R��P��*�-h=�������*�O���}��rZwh������Q�Ihn���a`����}�7	t�4M�~%�p����Z��>4XPk�K���eR!�Z}�F�A�E���Dlv�r������$W����3R�^�3��'���}��z�8�^��>��2�����NJ�k�����"���)wj&h� -�E��^�a�����B����K���~7��m��K�i0�s�"4�p[���a'Ym����]���~�.�f��������9F�V��*Y��Q����L"K���>���*��ZvA�ls�������O��%\�@,�Z��>4�ZP�g�a�C��w����f�Y�����.���Cv�����S��o��zK�<J�������������O��
)Y��z�Y��=��E��+9�rd�"������\<���S�f�����>>�IO�����Zm�:�|X5�,������,|U�m����ju����"2����/g���n����L�5V@������s�O��U~�C<���G�V4����!�E���!]S��w�2�5��i0]�������O4���sn�_
;���k��sv���i
�O�5b�=#f/}��2��'������w�9�g�S������l�@v�����T�����{�������@�Y�����C�M\������f��I��W����w�������Gi+
y��/�|�����[~���������!'?�6[���I���z�23uY3B~��'�|�2k�����c�'�C�'~+��oe�G�t���n��V�~�'��$v�=dO?�R��dv&�RT��?�S����
<�O�@�y����<�N{���,�q��FP�����4���\M!S�����������T�B��h�������h{���>����mMG-�g(��Zij<��p�N�]��Jb�a�&��;����RU����Zw����vn��\#>S��K��4����Ob�{rH�G#�I{t�-��c�m�������se�M��|�>������������GO2��is���=����8}�0����Y&lZ��x9 9��0�hhE��|d/������f�V��jW������N��+�F�r�>
��Cre��r�"L�+}�1re��b!6;
��Cre��r����w�����E���c�~{��tk�h��-0E��
����u��*t'�����4
�"�50U[����Y�]�����
=O����������w�~�f�|�i-�q����56[�j#���G2���oA������+���aN�[h�*Xs������������W]Gc>^y�u�������Ohh`q����P�33S�����|������"�}���7-�2�ZBj�ke�\B�
�>@+8����2[+�g4Qu���Z+��'�_�S�1���!���
������#�
��~�>ZJ�<��X��K1NK��j"������
A{��f�//�2j������D�2+V���>��e��$o����OrY��<��ok8�V����9[0�00��.9$�uZ�L(;!�=*�����qa�a�a������B9O�,5����N�I'����i��$��\8a�If���I��UU��s��kv��Zn+�����r,(��hEe�R6���"���Y<_�]H�,Z��d�V�f���4��G2��Y����f&�����LC�]�O�K�N>�P��M�,�X����$��C�9J����������g��o�W��f���i�/R4�~)�l���t�yY�k/	vl~O;�>�%o���Y�0]�_���uH��Ok�i��m���2H���u����C�NjJ����v���[�|Z��"i,��"������=o��S0�}����?��O���oQ��+^s�"g+��YWG��.������)S�}��6��A\>�wo��W|�8�������|V
����F���PZ
�>��r�z\Z����(��dM?��X�f�]z�<��s�>e�����NUVo1��Z5������7�s��=I���	�RtO������R��8��>�5
�R�E�sN�
��m������\>
J+��Y�Ii�n�_��E�XZ��z��\i����>B���Ox�?��i�7�SfQl���wEJK��T3g:�VF���PZ2m*��2�B+*���|ZpMn����}���(��������{1�N3{Z��g-���_��{4(-���6���V�z�G���v��G��R�������O�/�*"��7�J�����<=>��V�y���2%i�d�jJ�q�eMy����Q%�u2�>,qQZ�
}������W��������[Z��qZ�������8��^l�$����w*������Ue;y�X�9K�L��=*->��9���d�]w�.��ewj�gj*�Ytz=����x��	J�hr,�����TPh9�=BF�Z�����������
x��Q��JKN�TZ���/��+^_���0�0��Dr�|*�S�#s�vQ���oA��ZQ�������&��E��X1�vZ���!���h���coZ�����!�4��b���]������r���'g����U����,$%{�����=��#���{|����%��hHK�6���]�Vi�����*{�Qk�j�g������R���~5�_���
��'�vy����qHi�������rw����651�6����9<�����_��MX?R8!�	�<�:��P�F�A(*'�*�R��T�����'X�H~���<Ii��LC�WH{E�
s����k:��Kk�;Wp�{i�Z�a���m
?�Ui�+>~�`��u���������r��*��`2�3i�G��DmNM}������:O��N���>^�Bp(B��b�����n�|k��[�?����:
��V���6�P������
�������V{~�If�Vj�<�Ru���$��eY-H���:d��Q������rU��}��a6J��C����8^�`A9UG+*��"q�� �!C\+�����h'��x�YF�9H����Lwl>���_�JK�g &�m���Z�Mm��y�Jn��e�f����B�@�[����O��������l���#rqw|I�����]��}������;re����M���O�'0:�e�����Y�lb;+eS�~���0Hz)ZQE�3,�E��V�H��I(QWq�%�)
/
�����w95�(�<�<��.PZ>M�h�mi���_�8�3�yT�QfN�q[�pU,����/��U�k�<(n\(}�i�4ur�uz��FR{RZA|���Ut��i��U����C����1������C)�8�X��������\VR��"�����<��e���FeQ=������JJ�O��XHrvR�
L�8���+�j�{(TF���M^�.��j�����h�t�z��z���O�|�R5�1"J%��8�X�@8V4n;I��f�R���/����F6����@�Blx��p.2N�,�Y�&�����R>�!�iH��G.�i�|�� ��Z3\8evt8Pv��E�X}�������Hj	8N+`�|B���&�nE,�>+H�	m-�r�!\����J�'���@mDKV��&[�����<d4cB�1%�V�r�Z1������@�:����c��2=����$�Vj`p�.9{�{���I�Sp�a"1x�L��[��_��v�,�J�� �v��,�B�[QXP�]�m��S���	im��w��y��[f�1���y���W�����F^�?4��w6�7�l��#f���.;!�Wk�������}VI����FD'�K�����0�>�[2���-E�*��R�U&�Q(P���
�������Y;� �LmtF��z���bYi��r��W^Q[���*n���G���7�^P"+�����eBB+*��h�� k���P�+�����
���V�#o�����?|�o�LC����]x`���'�����NR���_�(Gia���1A@��n��Y#������-t����w�_cB���b���w�I�6�������[k�s}ZQ��Y�5�s���������Z�N�-Z�A2����555.����9�W6�EMqtP���A�[�\Y|��s�8�]����3=�1^�
uw��;n�`�u%y������B<�I�;��_b��TMsi:�Ml�<�:��R��?�T��c���x��eQ�P�iJ	(eAP���X��cA�2����Ci1�����(�:f�l����s�����H��� 6�7:��[vZ�A�Z�����2
����V����V
WQ~'���!��PR6��T�lHM�PX�fS��z�4��FeQ1h��!��GjL���9,$9��4L?�f���4�0)�?;3����46�:
�iH��C9�L�8a���!����Up��LCqM�z���
��o�QcQ�p�sY�(2�i��E�qZ��j�����t�w���i���q�8������V4R���W����f#�/��~�+�"S��/��]c��yRj}���$D�a:�n=r����<����,����r�z���Ym����l����M���Pt����#^���q#|�bg��i����|<��!� ��n8p�����J��#�VT9����V<[�{�Vo)��Ru����������^�����Zi��i��&��T�>��9��Z��3����������\d�"E�?����(B#���D��w>8���f�'G�
@��VUw#�~C�,<q  ��������,�
#eS�UD�PZQ`!:����b�J  �����@@Yd2HXPZ	kzT���:��@@ a	@i%��Q�n	(��h�D&����G�����  "PZ!�dc�Z�j9�@���VT��
#8��Y��@���J8���^����@@@YPZ������e]�l?�t��]x��Y��COK>��r8	�'�x�JK������e��p����(���6��r��\n�[(���j�qh��N��v��>��87��J*�0��&1���	��E
�9�F��i�JH�����~�����?"(-��K+�n��!s6-,XS��_aY�?��A��gn���m��hp	���Xuh�D6��+�N�3v����#r�d,Q
UY���A�W�m����T�,(}��s��#�=��Je�t ��T����
�D�0}��mL���F��2�#��������Sm�C�e���YA=��~/9�d�n'�%q�u_x��k�[�����E�t��(:"���RI:����C����_�m������bPj�(}��pdZ�?}����7H�zy�=A���]�����!���vy���Y����
�����ba�k�_���`�no���Se���+��vm�%���S$��e\XT��/�����|G������Z�P��Z��������oz����JY9qD� ����*\�����mq���)#�����-{��[Q�f���-Rs+gt�<����1C��qq`��.�,����Su���I&��^\(fT��t�K���G���K���X�r�v��n��9�Y�n\�����Eo��E�C��3���kQ��u��(X�M$�j��s��&z8}�w������M!�w����Fw�Z��s��>�(���}��f�Qtk%�����xi��O���`��+�9���=��{���=�h�}~GD1�*������b3�����O�wJ��'���y�y��7���B�>�f��������:s�7_���*z�\��+���9�N!�T�mc����v��5��l�_0�����S�{�,������.���q���j�[7���g����d������!�d�6�-g��T1�<<�����\���'^}����LA6FD�N��������x����K���xo���o�S�~�:�y�]��:R��B��(���-��(-�:��2����>&��a/�+���QXLO��$�,k�If;���[�u��.
�Y<���;����TZ�/eB�����|�(rhy�����i���K���@w�d�4�
D?T���<���@�+j�&
]	{��~5���'�8�VVJ+8�vx��oX��=q�	���|G��P+���~��(�p����8�T<�>�H�����K�d�����'L����[�!�|Bp��}~2�������L�a�Vw�t���e��p��&Q_�O�	�v��&��C����k���L)�^j����4<���1��^q�L�I~B{�����Z���7�D5m����;���h\P�c����B��%����2������[J�����7��r���z��:��8���vId�W����T:�f���w������R�p�����`��+bq��'�n��z7=����,�Zb��h
"8�5�i�A���Hpv}l���O�"�7^�h��G�O~��%��H���X������
��D;��#�����n�z�}�
����h���.A�O5;G��^�I�wPE��q���b��|1��B!��xd�c]7�H�V�:�BI�D7j����2�$�K��y ���"�s�a8��� ���,�k���� �
de�H�%�Vxy#�H�C+�@�  �X��J,{'xm��J������
?s�0B+*��B��@����w�~����  ~PZ�g�#O���%�� ��vN�Z�������@d@iE�;r� 8�"Y��@���J4�'b}��JD���  ����(E���.��@@����s����s�z  �����>(����R'�MJ�7#��]ph���Pr�PZ�aG��78�|3B�	@i)M�E
8���(�$.(���}B���27* �CJ+zl��(I-%i"-�@	@iJ��8�b�V()��(�x�(�C���c  %����(F���*�H@@@(-�%���S�BaA@ �	@i����zph%�����@�	@iE�(����R�&���V��@��C+Z-�r��@��J c�}U���{��  s��b�d(�,ph���H  !&�b�H>\��
i�  �(-?`!j��C+V,�r��@����{'D��J3��  ���b�h(�Wph��PZ�c�$@ph������V�#�0�C+������oPZ�!F4�C+��������v�|
On����'������T����/	b����kn�����������P�L�R��V0$q/��@(���&<���SG��Y�*V���-�K�\1����CFe�?�P<��1��u*���4���~��gc�4�4,��T�s��pW�9'Z��jv[���o��~��P�0�\���[���������.�����������h[.-(-�&FL������d��g�>;��]����d�7U������`��Y�jM;kL��Mb���u�]���:3]S�������
f�Om$�Y$��\8�p���./\\t������)�(�-�\�� ���F(!�$ ������l��9���'d��t��6\���ai:
�C;*�0��aB/��(��vd�~T�~x���)$/Y{v��t��iz�N@�!�2	���L;��%5����	Y��r�����6��1B+0n�@@ ���=���5����������
y��E��y��;y7Yz�TYF����5�����>-����C���D����gU��!���z���[vWS=/��������/�0����M{��Z���\�C�aZ,��7��c!{��*
�s��s�9B(��0��Uk��w���aw
�f������
��V��Gw�Ovq���?�8�J5��<�������	]d��
[�  � �n{If��:��r$������K.�i�����'�Y��o����3���������O+?���`�9)�Y~5���~7�������}����	G��0D��ZQo"@���qZ-����Rv��bG��3�_�gg~�Tt������k��4WW[jk---V/7�DE�m��<�n���c�����r���#����A@�F���������%�M����TWR�%�^N��6{�g�'*��?,c;i�"�\1}�Si]s�
�a��I?I�{�����;�BhJ�/f����+G���q~MoT��ph)I��(N����4��t>������
�3�kv&S�01d���$�O���w�B����B���y�F�BpD�MT�����v������ulm���������k�L��:�A�IFSK�6Z�6���9�1����=�  f�)���si��w��I��)d��kK�-g�;�4��Bh�{�;�?�']S����~�N9?i�;��X���5����v1$�5���22�YY�^�t����d��h��z��8�P���MW�O��]���%��d�G�4f��	�z��i�A=tC��}S"�NZ���v  ��{�py��s��>������U����8f�=|��B�i(�+��{���LC��C��E2�|>�P�E,�Y��]m�S�Z?���*�<=�����.�������.��~��_6�NJ;L!}u��y������1�?y�{��gn�I�*t����0�P�����h���u�c��(�Hf�I�l.�8��f�����]hms�4��>i��M�{h��CE�J2�e������6���n7n����t�C(�Z�|W�.�S}~ts��o����[4B�o��x]��_Tm�/P�����Z�!��@�	xSZ6��N����w�`(��"��9G������x�����/�gi�s�V��X�'�I�<�>5b�!���U����[O�ii�C�L
GLM����[*�*^`�1B+0n�@@ ��������R|�����OEDW$��WlSV�����G����>lV+���V��fs��8��+�^8���BL��[ii���
�,B����Y��y����
i$�4��s����!���5���R���?���p�-f#q�Rrl���jc���X�^6+�<������ph�/oJ�j��l����4��������s^!
~��]���w
;i�;�i��~�1���Uk���jv~�����|����C���N��N�&G����P��@�Zg��7������|���#$��>)����3d��I�`i�ph���  QB����_�H�|��C���|^!/=��HZ9w0d�
���
)�4�y��������Y$�.�Q2t����=�;'J�#-�,��O��������Pzo�yW���%�����%��[�������t����^k\�D��[~����p����ICS'�Io��_����S��<����N)���|(��@b����@PV�X���^3��r�D����:���4���!Y�!H~������C��I��k|�����>D�cM�v���%��I��Z]�Si9�Fb�JZk�RcMn��Y����-m6�~��m���:����1n�2>���U�.�i���W\\{��?r���l�>��5|�;u��Y%��3�g��[���1{�������Yx�A@b��7�u�ULi�����"%*$��Hi��fx{[���.BRZ]%��!*-Z�������Li���4I���PZ1�5E���"�5������o��=���J�z����u �����+�+����m/����1[�]�g��C��][��4N�������9(�P�i���4oJ��+�������]V.%]%*-:��E�������J��*Q����y��!���^]~!���O���b�r��wv����p�u������Q�L��	����Z��|�,���IJ�>��u����PS8�@E��[�75����
)�����L[�V�PZFRZ�����
���e/NE�G"   � ��i��2��h$�s�B��C=���&v�`Hc���l��8��f��"�E�j=�-�>��E��t��>	��Y<�4$w���?^����|w�z:W����6�K�������h��~����|�N~;�,��.�mV_������V���`�����n�����oX���"���@��������zE�����'O��'-=�+T*d�O�D�h+	� ����i����n!�d4����^�i���,_�t������.������O����J`���F^����4/�>%�l���O)��@@ ��)��g����������v0��WH�7��B�l����8���yW�-������Eh�����!y���������{K���~�k�a�IE���d�,f2%�J������D�   �����Li�����r�u~����J��QY����]H��/��mlD<[,�8�K_#�#B'F3]�_%��aS�U�W���_��X�W�;L���1�=��VI}�v+��z��Y2��'�-��b!���]��w��GY�d������O#f������A>�������
  :~�Ms�T(�A�s7�f���]�Q���.�"���	��_�����w!M$����V�p����#���Y`3[��|��l�9H��FB2����-Ux���'����t�S<I/<�9�d�S��\���6�"��,n���W����w:�f;t�jq!|\��C���8W~����
�����X�wa��BB�
i��x�]�����e��~��8�N�:0�0�O&SX$�h�hv2��B2�	,1D�a���1!�1�0�%G�   �����~z?*�[��:G�c�a�����qZ|T�X����iT-� �S����qZ��4*k���L
�NCC�8��x<PH�p�Oi��i�W����|�����B�d^!�`���j���N�q��B��/�Xw�$�|Z��i����JJ��$V8B��7�G�K�=��x���}�C�=T�vH@@b�����������:���O��}|�>/����������3�c�6x�8~���I�LC��.��/V�_|�pA�{H��s�{j��i)�-Rp%�a���;���}�,V:�������"O�������t6"���#���@����a(���$�Ai�����j��D�'J9�zp�/������1����r���j
  �I����^B�U�6��a��.x�!�f���'�WH#��������|R!���
i�!�1��bC�I��I��o���w����,�=�����b�L($��@"�c7�D�u�k�a|T���$��vA�@@@����k��C�Q(��`F&   	IJ+!��J������VX0#��$���fG�A@@�BJ+,��	���@B��JH���   a!���@@@ !	@i%��Qi����
fd   �������4���@X@i�s�e��E-�s�F�A@@ ��Qi��;"�w�+D�
�@@ Q	�Qi%*b�@@@ a	�UZ�
Q9G�Rg�������/��mq�J�^�hx�|�
�a���7EC�Q�=r���E[���X���M$��/.��(�p�S�������h�|�b���t���v��E�c�SL�X���`�6�.>J1�-.��
�`�#��O
  1r�V��j�J�t���C
��3T��1}���.��W��W<��}[YE�*��A���b�D�����hP�0q�����#��%������)-�V.XS���_��W�?��cb�����]���+�W6�h!����RR�@@b��<��<E3�N���������%X��J�m���0�p�ct=�������A���K�$  �L@�����B�X��e��/i��@���M�����+*��:�;��S"X�"F�G���i����%�8 ��.�8��   QH@�O+
�"����@����z��   1KJ+fM�����D=(��7
   ���!�_FVIEND�B`�
#3Steven Winfield
Steven.Winfield@cantabcapital.com
In reply to: James Coleman (#2)
1 attachment(s)
RE: FW: Possible optimisation: push down SORT and LIMIT nodes

It does indeed!
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=398.50..398.50 rows=100 width=32) (actual time=10.359..10.378 rows=100 loops=1)
Output: inventory.date, asset.name, inventory.quantity
-> Incremental Sort (cost=398.50..403.27 rows=5006001 width=32) (actual time=10.357..10.365 rows=100 loops=1)
Output: inventory.date, asset.name, inventory.quantity
Sort Key: inventory.date, asset.name
Presorted Key: inventory.date
Sort Method: quicksort Memory: 103kB
Sort Groups: 1
-> Nested Loop Left Join (cost=0.71..1702372.39 rows=5006001 width=32) (actual time=0.030..2.523 rows=1002 loops=1)
Output: inventory.date, asset.name, inventory.quantity
Inner Unique: true
-> Index Scan using inventory_pkey on temp.inventory (cost=0.43..238152.40 rows=5006001 width=12) (actual time=0.016..0.290 rows=1002 loops=1)
Output: inventory.date, inventory.id_asset, inventory.quantity
-> Index Scan using asset_pkey on temp.asset (cost=0.28..0.29 rows=1 width=28) (actual time=0.002..0.002 rows=1 loops=1002)
Output: asset.id, asset.name
Index Cond: (asset.id = inventory.id_asset)

I’m guessing the feature-freeze for v11 means we won’t see this in the that version, though, and the extra GUC it requires means it will be in v12 at the earliest?

From: James Coleman [mailto:jtc331@gmail.com]
Sent: 01 June 2018 13:50
To: Christopher Wilson
Cc: pgsql-hackers@lists.postgresql.org; Steven Winfield
Subject: Re: FW: Possible optimisation: push down SORT and LIMIT nodes

The incremental sort patch seems to significantly improve performance for your query: https://commitfest.postgresql.org/17/1124/&lt;https://commitfest.postgresql.org/17/1124/&gt;

On Fri, Jun 1, 2018 at 7:46 AM, Christopher Wilson <chris.wilson@cantabcapital.com<mailto:chris.wilson@cantabcapital.com>> wrote:
Dear Postgres developers,

I sent this query to the performance list a couple of days ago, but nobody has come up with any suggestions. I was wondering if you’d like to consider it?

If this is interesting but nobody has time to implement it, then I would potentially be willing to implement and submit it myself, in my own time. I am experienced with C and C++, but I have not modified Postgres before, and I would need significant support (e.g. on IRC) to help me to find my way around the codebase and finish the task in an acceptable amount of time.

Thanks, Chris.

From: Christopher Wilson
Sent: 30 May 2018 16:47
To: 'pgsql-performance@postgresql.org<mailto:pgsql-performance@postgresql.org>'
Cc: Steven Winfield (steven.winfield@cantabcapital.com<mailto:steven.winfield@cantabcapital.com>)
Subject: Possible optimisation: push down SORT and LIMIT nodes

Hi all,

We have a query which is rather slow (about 10 seconds), and it looks like this:

select inventory.date, asset.name<http://asset.name&gt;, inventory.quantity
from temp.inventory
left outer join temp.asset on asset.id<http://asset.id&gt; = id_asset
order by inventory.date, asset.name<http://asset.name&gt;
limit 100

The inventory table has the quantity of each asset in the inventory on each date (complete SQL to create and populate the tables with dummy data is below). The query plan looks like this (the non-parallel version is similar):

[cid:image001.png@01D3F9C1.92E23920]

Or in text form:

Limit (cost=217591.77..217603.60 rows=100 width=32) (actual time=9122.235..9122.535 rows=100 loops=1)
Buffers: shared hit=6645, temp read=6363 written=6364
-> Gather Merge (cost=217591.77..790859.62 rows=4844517 width=32) (actual time=9122.232..9122.518 rows=100 loops=1)
Workers Planned: 3
Workers Launched: 3
Buffers: shared hit=6645, temp read=6363 written=6364
-> Sort (cost=216591.73..220628.83 rows=1614839 width=32) (actual time=8879.909..8880.030 rows=727 loops=4)
Sort Key: inventory.date, asset.name<http://asset.name&gt;
Sort Method: external merge Disk: 50904kB
Buffers: shared hit=27365, temp read=25943 written=25947
-> Hash Join (cost=26.52..50077.94 rows=1614839 width=32) (actual time=0.788..722.095 rows=1251500 loops=4)
Hash Cond: (inventory.id_asset = asset.id<http://asset.id&gt;)
Buffers: shared hit=27236
-> Parallel Seq Scan on inventory (cost=0.00..29678.39 rows=1614839 width=12) (actual time=0.025..237.977 rows=1251500 loops=4)
Buffers: shared hit=27060
-> Hash (cost=14.01..14.01 rows=1001 width=28) (actual time=0.600..0.600 rows=1001 loops=4)
Buckets: 1024 Batches: 1 Memory Usage: 68kB
Buffers: shared hit=32
-> Seq Scan on asset (cost=0.00..14.01 rows=1001 width=28) (actual time=0.026..0.279 rows=1001 loops=4)
Buffers: shared hit=32
Planning time: 0.276 ms
Execution time: 9180.144 ms

I can see why it does this, but I can also imagine a potential optimisation, which would enable it to select far fewer rows from the inventory table.

As we are joining to the primary key of the asset table, we know that this join will not add extra rows to the output. Every output row comes from a distinct inventory row. Therefore only 100 rows of the inventory table are required. But which ones?

If we selected exactly 100 rows from inventory, ordered by date, then all of the dates that were complete (every row for that date returned) would be in the output. However, if there is a date which is incomplete (we haven’t emitted all the inventory records for that date), then it’s possible that we would need some records that we haven’t emitted yet. This can only be known after joining to the asset table and sorting this last group by both date and asset name.

But we do know that there can only be 0 or 1 incomplete groups: either the last group (by date) is incomplete, if the LIMIT cut it off in mid-group, or its end coincided with the LIMIT boundary and it is complete. As long as we continue selecting rows from this table until we exhaust the prefix of the overall SORT which applies to it (in this case, just the date) then it will be complete, and we will have all the inventory rows that can appear in the output (because no possible values of columns that appear later in the sort order can cause any rows with different dates to appear in the output).

I’m imagining something like a sort-limit-finish node, which sorts its input and then returns at least the limit number of rows, but keeps returning rows until it exhausts the last sort prefix that it read.

This node could be created from an ordinary SORT and LIMIT pair:

SORT + LIMIT -> SORT-LIMIT-FINISH + SORT + LIMIT

And then pushed down through either a left join, or an inner join on a foreign key, when the right side is unique, and no columns from the right side appear in WHERE conditions, nor anywhere in the sort order except at the end. This sort column suffix would be removed by pushing the node down. The resulting query plan would then look something like:

Index Scan on inventory
SORT-LIMIT-FINISH(sort=[inventory.date], limit=100) (pushed down through the join to asset)
Seq Scan on asset
Hash Left Join (inventory.id_asset = asset.id<http://asset.id&gt;)
Sort (inventory.date, asset.name<http://asset.name&gt;)
Limit (100)

And would emit only about 100-1000 inventory rows from the index scan.

Does this sound correct, reasonable and potentially interesting to Postgres developers?

SQL to reproduce:

create schema temp;
create table temp.asset (
id serial primary key,
name text
);
insert into temp.asset (name) select 'Thing ' || random()::text as name from generate_series(0, 1000) as s;
create table temp.inventory (
date date,
id_asset int,
quantity int,
primary key (date, id_asset),
CONSTRAINT id_asset_fk FOREIGN KEY (id_asset) REFERENCES temp.asset (id) MATCH SIMPLE
);
insert into temp.inventory (date, id_asset, quantity)
select current_date - days, asset.id<http://asset.id&gt;, random() from temp.asset, generate_series(0, 5000) as days;

Thanks, Chris.

________________________________
This email is confidential. If you are not the intended recipient, please advise us immediately and delete this message. The registered name of Cantab- part of GAM Systematic is Cantab Capital Partners LLP. See - http://www.gam.com/en/Legal/Email+disclosures+EU&lt;http://www.gam.com/en/Legal/Email+disclosures+EU&gt; for further information on confidentiality, the risks of non-secure electronic communication, and certain disclosures which we are required to make in accordance with applicable legislation and regulations. If you cannot access this link, please notify us by reply message and we will send the contents to you.

GAM Holding AG and its subsidiaries (Cantab – GAM Systematic) will collect and use information about you in the course of your interactions with us. Full details about the data types we collect and what we use this for and your related rights is set out in our online privacy policy at https://www.gam.com/en/legal/privacy-policy&lt;https://www.gam.com/en/legal/privacy-policy&gt;. Please familiarise yourself with this policy and check it from time to time for updates as it supplements this notice
________________________________

Attachments:

image001.pngimage/png; name=image001.pngDownload
�PNG


IHDR#��}$ysRGB���<�IDATx^��	|T�������,@@@�QAQKY,5VE�U��O��b]������A[7��ZPY*j�����	!�&;d�33���?�L&73��;3w�����ps8�,���g~�?��n�8@@@@B@@�4�$����#���@@@BEJ+Td�.���@i��P��
Y�   *�=L��`��'[�6��DK'�L��i�g�Ai��qcQ�������1K�����=���.��#����x$���J�'������T��2���{�mZ��������S	B�������~���0�6"�>�b���=*��������-�`P���������9cN��'iTF�jP����;��3YN����PR6�;s��I=������lM95���l}�����x���7�_�T�n�����6�kM�6�@��J�Q��*{�V���K=���5���j��i���>�|��p<�����U�)I#$kTSr�{�(�h����T�*Y��aX�a�L���nG�c?����B$�LF��)EC�{�������t#_�����?%��kSNOB`}��z��#nlx���O��ZR�O�L/i�W4��i�6��[�v��n�V���f7�XH��^k�[u���T�Z�i������A��� J��������T����L��}�/�G_����=+E�u���&�%�6�ZM��[+M'[����iZu������{�
f[����Q��;-3599"1�����JS���c���b;�bm�X�	����M
�	�Z� ���6�_>
�z�{�yAa���'�\1���3�_�������sf�1���;0��AY?�����7_=6��C��d���z���E��5vX���93����zLN����7z�r$_���k���l����1�h��m>vh��j��C-�-��s>>��V&bH����_}��~��3_�9����)��u��l�����w����C
��cL�nr�aR�����a�:���������t}~�V~R�)�@a���O����e��'�c3
3W65�U@4whEc���]g�r������Q���f�7���z��o����v���W����8���w��w�������r�������4�mn�I��2A�C��>(9����6��V�|P_�N�Fe���B/^�%:i�*K������Z8��n��>��d���7����F����{~����<������:5U��S�V�u��b�~��_6��,��V���!����k����;�J�L�����������~����&'������#�V4��F(,���Il���������|Z$��5g�����n}1��7�k�`e;�}u���"�e�����-]�=��Hli�4�Y���}�mY�e�iVB��O`�G�;[���S���v�Y�)-�O�K�������������|�c.E�	RZE?����h4�V���������~9���]�Qi������TZ�/�p���o��p�Z]���3��kj���*�AC����
����E�*0���+0#����Z[��=)I�������oz����_5��EN��,�p�p����;�b��	�hE,���[���Ci]���5���_v�J+���L���Y��7���,MV��Ai%��:5Mv�{��sK��b��i0�zH���o��9���<R�fBq�5��?[��x�u��%�����m7�II���s�Bi2�g�}v���+�Hi-Y"���ST�������b��=%Ex�i�J������������Y�n�6�=]�y��(-%-���F+�o/E��5���n�����e�]�F�����+����rA���e��[t���Js������W}���������vQZ4�:�������@��45�B#�������\i�������4�2��>
�=<�4PZ�z���(�`��G�U\�3������}6���}i,.�2�<(-��(fA�G�hx�JE,m>-�����\6�,=���R���x�V���c����S�����g*�O�ik��W���h��d��u�Z����p%�){��������]v�>+K���.-��X��CC�����4mr�8p ��(�'��-��8Vk�?����ha�V��jZ��� ����/���hhEaA���w��V/Qx�J�0�����f�{��W�I����K_��g����%�����2}���~��O���;W^Q�����o^��c����V���,(/����XB��d.���{��a@�'�M�/*�i���Bz]�E��i27-��>UVA���:��!�N�V�N�Gm?r��S?�o�V��#���"t�y�o%�e]j\8Bp��7������c�bT�O��;s���\X)5N��[3�ldQ���2N��1��ya4
�=&��%��.����>��K���_����y��V���S��7�wp��ys.���������R[:���0������ ���$������������g!�>�V?<��?�Z��!5:c��w9�^S�1��Hcl����=1y�v�v���0�V=r���/U�,�{]cik���_>��=��������m�������-��e�hU�*
*�Dk#[~�=�7|�������S��,w���;K���H���Z��sb�4���m^�����������>������?��V_uJm��Q�)|��cD<,(���8hEF���-j~��w�����F�{����U�%o�WZ4l��(c)�����-�!Iy��{|��������P�]��W���N�%�����=�1��z74�L���T��7[�����Q�ai�!)js�I*�h8�������r��A�jm6�%�F�#^���0��,g��++�U��qt]c�����i�����#�V�o���Py�������%=Hf��C��O�&�Es�/�w���o����8�"�-z
�?��K�������?������x����/�#��O�k9��\��y��j���{�(���.V`��	�hD�-���nci���1��La/��qL��
qW�4���������s[�e���_sM����c��Ne�X�0���T�tnVg�h����~{�j�Z��n�Z��UU5�s��x�b�B�U3�Q�{�^��m��c���u�?��%h�!?�r$�H�UW[jk-R��P��*�-h=�������*�O���}��rZwh������Q�Ihn���a`����}�7	t�4M�~%�p����Z��>4XPk�K���eR!�Z}�F�A�E���Dlv�r������$W����3R�^�3��'���}��z�8�^��>��2�����NJ�k�����"���)wj&h� -�E��^�a�����B����K���~7��m��K�i0�s�"4�p[���a'Ym����]���~�.�f��������9F�V��*Y��Q����L"K���>���*��ZvA�ls�������O��%\�@,�Z��>4�ZP�g�a�C��w����f�Y�����.���Cv�����S��o��zK�<J�������������O��
)Y��z�Y��=��E��+9�rd�"������\<���S�f�����>>�IO�����Zm�:�|X5�,������,|U�m����ju����"2����/g���n����L�5V@������s�O��U~�C<���G�V4����!�E���!]S��w�2�5��i0]�������O4���sn�_
;���k��sv���i
�O�5b�=#f/}��2��'������w�9�g�S������l�@v�����T�����{�������@�Y�����C�M\������f��I��W����w�������Gi+
y��/�|�����[~���������!'?�6[���I���z�23uY3B~��'�|�2k�����c�'�C�'~+��oe�G�t���n��V�~�'��$v�=dO?�R��dv&�RT��?�S����
<�O�@�y����<�N{���,�q��FP�����4���\M!S�����������T�B��h�������h{���>����mMG-�g(��Zij<��p�N�]��Jb�a�&��;����RU����Zw����vn��\#>S��K��4����Ob�{rH�G#�I{t�-��c�m�������se�M��|�>������������GO2��is���=����8}�0����Y&lZ��x9 9��0�hhE��|d/������f�V��jW������N��+�F�r�>
��Cre��r�"L�+}�1re��b!6;
��Cre��r����w�����E���c�~{��tk�h��-0E��
����u��*t'�����4
�"�50U[����Y�]�����
=O����������w�~�f�|�i-�q����56[�j#���G2���oA������+���aN�[h�*Xs������������W]Gc>^y�u�������Ohh`q����P�33S�����|������"�}���7-�2�ZBj�ke�\B�
�>@+8����2[+�g4Qu���Z+��'�_�S�1���!���
������#�
��~�>ZJ�<��X��K1NK��j"������
A{��f�//�2j������D�2+V���>��e��$o����OrY��<��ok8�V����9[0�00��.9$�uZ�L(;!�=*�����qa�a�a������B9O�,5����N�I'����i��$��\8a�If���I��UU��s��kv��Zn+�����r,(��hEe�R6���"���Y<_�]H�,Z��d�V�f���4��G2��Y����f&�����LC�]�O�K�N>�P��M�,�X����$��C�9J����������g��o�W��f���i�/R4�~)�l���t�yY�k/	vl~O;�>�%o���Y�0]�_���uH��Ok�i��m���2H���u����C�NjJ����v���[�|Z��"i,��"������=o��S0�}����?��O���oQ��+^s�"g+��YWG��.������)S�}��6��A\>�wo��W|�8�������|V
����F���PZ
�>��r�z\Z����(��dM?��X�f�]z�<��s�>e�����NUVo1��Z5������7�s��=I���	�RtO������R��8��>�5
�R�E�sN�
��m������\>
J+��Y�Ii�n�_��E�XZ��z��\i����>B���Ox�?��i�7�SfQl���wEJK��T3g:�VF���PZ2m*��2�B+*���|ZpMn����}���(��������{1�N3{Z��g-���_��{4(-���6���V�z�G���v��G��R�������O�/�*"��7�J�����<=>��V�y���2%i�d�jJ�q�eMy����Q%�u2�>,qQZ�
}������W��������[Z��qZ�������8��^l�$����w*������Ue;y�X�9K�L��=*->��9���d�]w�.��ewj�gj*�Ytz=����x��	J�hr,�����TPh9�=BF�Z�����������
x��Q��JKN�TZ���/��+^_���0�0��Dr�|*�S�#s�vQ���oA��ZQ�������&��E��X1�vZ���!���h���coZ�����!�4��b���]������r���'g����U����,$%{�����=��#���{|����%��hHK�6���]�Vi�����*{�Qk�j�g������R���~5�_���
��'�vy����qHi�������rw����651�6����9<�����_��MX?R8!�	�<�:��P�F�A(*'�*�R��T�����'X�H~���<Ii��LC�WH{E�
s����k:��Kk�;Wp�{i�Z�a���m
?�Ui�+>~�`��u���������r��*��`2�3i�G��DmNM}������:O��N���>^�Bp(B��b�����n�|k��[�?����:
��V���6�P������
�������V{~�If�Vj�<�Ru���$��eY-H���:d��Q������rU��}��a6J��C����8^�`A9UG+*��"q�� �!C\+�����h'��x�YF�9H����Lwl>���_�JK�g &�m���Z�Mm��y�Jn��e�f����B�@�[����O��������l���#rqw|I�����]��}������;re����M���O�'0:�e�����Y�lb;+eS�~���0Hz)ZQE�3,�E��V�H��I(QWq�%�)
/
�����w95�(�<�<��.PZ>M�h�mi���_�8�3�yT�QfN�q[�pU,����/��U�k�<(n\(}�i�4ur�uz��FR{RZA|���Ut��i��U����C����1������C)�8�X��������\VR��"�����<��e���FeQ=������JJ�O��XHrvR�
L�8���+�j�{(TF���M^�.��j�����h�t�z��z���O�|�R5�1"J%��8�X�@8V4n;I��f�R���/����F6����@�Blx��p.2N�,�Y�&�����R>�!�iH��G.�i�|�� ��Z3\8evt8Pv��E�X}�������Hj	8N+`�|B���&�nE,�>+H�	m-�r�!\����J�'���@mDKV��&[�����<d4cB�1%�V�r�Z1������@�:����c��2=����$�Vj`p�.9{�{���I�Sp�a"1x�L��[��_��v�,�J�� �v��,�B�[QXP�]�m��S���	im��w��y��[f�1���y���W�����F^�?4��w6�7�l��#f���.;!�Wk�������}VI����FD'�K�����0�>�[2���-E�*��R�U&�Q(P���
�������Y;� �LmtF��z���bYi��r��W^Q[���*n���G���7�^P"+�����eBB+*��h�� k���P�+�����
���V�#o�����?|�o�LC����]x`���'�����NR���_�(Gia���1A@��n��Y#������-t����w�_cB���b���w�I�6�������[k�s}ZQ��Y�5�s���������Z�N�-Z�A2����555.����9�W6�EMqtP���A�[�\Y|��s�8�]����3=�1^�
uw��;n�`�u%y������B<�I�;��_b��TMsi:�Ml�<�:��R��?�T��c���x��eQ�P�iJ	(eAP���X��cA�2����Ci1�����(�:f�l����s�����H��� 6�7:��[vZ�A�Z�����2
����V����V
WQ~'���!��PR6��T�lHM�PX�fS��z�4��FeQ1h��!��GjL���9,$9��4L?�f���4�0)�?;3����46�:
�iH��C9�L�8a���!����Up��LCqM�z���
��o�QcQ�p�sY�(2�i��E�qZ��j�����t�w���i���q�8������V4R���W����f#�/��~�+�"S��/��]c��yRj}���$D�a:�n=r����<����,����r�z���Ym����l����M���Pt����#^���q#|�bg��i����|<��!� ��n8p�����J��#�VT9����V<[�{�Vo)��Ru����������^�����Zi��i��&��T�>��9��Z��3����������\d�"E�?����(B#���D��w>8���f�'G�
@��VUw#�~C�,<q  ��������,�
#eS�UD�PZQ`!:����b�J  �����@@Yd2HXPZ	kzT���:��@@ a	@i%��Q�n	(��h�D&����G�����  "PZ!�dc�Z�j9�@���VT��
#8��Y��@���J8���^����@@@YPZ������e]�l?�t��]x��Y��COK>��r8	�'�x�JK������e��p����(���6��r��\n�[(���j�qh��N��v��>��87��J*�0��&1���	��E
�9�F��i�JH�����~�����?"(-��K+�n��!s6-,XS��_aY�?��A��gn���m��hp	���Xuh�D6��+�N�3v����#r�d,Q
UY���A�W�m����T�,(}��s��#�=��Je�t ��T����
�D�0}��mL���F��2�#��������Sm�C�e���YA=��~/9�d�n'�%q�u_x��k�[�����E�t��(:"���RI:����C����_�m������bPj�(}��pdZ�?}����7H�zy�=A���]�����!���vy���Y����
�����ba�k�_���`�no���Se���+��vm�%���S$��e\XT��/�����|G������Z�P��Z��������oz����JY9qD� ����*\�����mq���)#�����-{��[Q�f���-Rs+gt�<����1C��qq`��.�,����Su���I&��^\(fT��t�K���G���K���X�r�v��n��9�Y�n\�����Eo��E�C��3���kQ��u��(X�M$�j��s��&z8}�w������M!�w����Fw�Z��s��>�(���}��f�Qtk%�����xi��O���`��+�9���=��{���=�h�}~GD1�*������b3�����O�wJ��'���y�y��7���B�>�f��������:s�7_���*z�\��+���9�N!�T�mc����v��5��l�_0�����S�{�,������.���q���j�[7���g����d������!�d�6�-g��T1�<<�����\���'^}����LA6FD�N��������x����K���xo���o�S�~�:�y�]��:R��B��(���-��(-�:��2����>&��a/�+���QXLO��$�,k�If;���[�u��.
�Y<���;����TZ�/eB�����|�(rhy�����i���K���@w�d�4�
D?T���<���@�+j�&
]	{��~5���'�8�VVJ+8�vx��oX��=q�	���|G��P+���~��(�p����8�T<�>�H�����K�d�����'L����[�!�|Bp��}~2�������L�a�Vw�t���e��p��&Q_�O�	�v��&��C����k���L)�^j����4<���1��^q�L�I~B{�����Z���7�D5m����;���h\P�c����B��%����2������[J�����7��r���z��:��8���vId�W����T:�f���w������R�p�����`��+bq��'�n��z7=����,�Zb��h
"8�5�i�A���Hpv}l���O�"�7^�h��G�O~��%��H���X������
��D;��#�����n�z�}�
����h���.A�O5;G��^�I�wPE��q���b��|1��B!��xd�c]7�H�V�:�BI�D7j����2�$�K��y ���"�s�a8��� ���,�k���� �
de�H�%�Vxy#�H�C+�@�  �X��J,{'xm��J������
?s�0B+*��B��@����w�~����  ~PZ�g�#O���%�� ��vN�Z�������@d@iE�;r� 8�"Y��@���J4�'b}��JD���  ����(E���.��@@����s����s�z  �����>(����R'�MJ�7#��]ph���Pr�PZ�aG��78�|3B�	@i)M�E
8���(�$.(���}B���27* �CJ+zl��(I-%i"-�@	@iJ��8�b�V()��(�x�(�C���c  %����(F���*�H@@@(-�%���S�BaA@ �	@i����zph%�����@�	@iE�(����R�&���V��@��C+Z-�r��@��J c�}U���{��  s��b�d(�,ph���H  !&�b�H>\��
i�  �(-?`!j��C+V,�r��@����{'D��J3��  ���b�h(�Wph��PZ�c�$@ph������V�#�0�C+������oPZ�!F4�C+��������v�|
On����'������T����/	b����kn�����������P�L�R��V0$q/��@(���&<���SG��Y�*V���-�K�\1����CFe�?�P<��1��u*���4���~��gc�4�4,��T�s��pW�9'Z��jv[���o��~��P�0�\���[���������.�����������h[.-(-�&FL������d��g�>;��]����d�7U������`��Y�jM;kL��Mb���u�]���:3]S�������
f�Om$�Y$��\8�p���./\\t������)�(�-�\�� ���F(!�$ ������l��9���'d��t��6\���ai:
�C;*�0��aB/��(��vd�~T�~x���)$/Y{v��t��iz�N@�!�2	���L;��%5����	Y��r�����6��1B+0n�@@ ���=���5����������
y��E��y��;y7Yz�TYF����5�����>-����C���D����gU��!���z���[vWS=/��������/�0����M{��Z���\�C�aZ,��7��c!{��*
�s��s�9B(��0��Uk��w���aw
�f������
��V��Gw�Ovq���?�8�J5��<�������	]d��
[�  � �n{If��:��r$������K.�i�����'�Y��o����3���������O+?���`�9)�Y~5���~7�������}����	G��0D��ZQo"@���qZ-����Rv��bG��3�_�gg~�Tt������k��4WW[jk---V/7�DE�m��<�n���c�����r���#����A@�F���������%�M����TWR�%�^N��6{�g�'*��?,c;i�"�\1}�Si]s�
�a��I?I�{�����;�BhJ�/f����+G���q~MoT��ph)I��(N����4��t>������
�3�kv&S�01d���$�O���w�B����B���y�F�BpD�MT�����v������ulm���������k�L��:�A�IFSK�6Z�6���9�1����=�  f�)���si��w��I��)d��kK�-g�;�4��Bh�{�;�?�']S����~�N9?i�;��X���5����v1$�5���22�YY�^�t����d��h��z��8�P���MW�O��]���%��d�G�4f��	�z��i�A=tC��}S"�NZ���v  ��{�py��s��>������U����8f�=|��B�i(�+��{���LC��C��E2�|>�P�E,�Y��]m�S�Z?���*�<=�����.�������.��~��_6�NJ;L!}u��y������1�?y�{��gn�I�*t����0�P�����h���u�c��(�Hf�I�l.�8��f�����]hms�4��>i��M�{h��CE�J2�e������6���n7n����t�C(�Z�|W�.�S}~ts��o����[4B�o��x]��_Tm�/P�����Z�!��@�	xSZ6��N����w�`(��"��9G������x�����/�gi�s�V��X�'�I�<�>5b�!���U����[O�ii�C�L
GLM����[*�*^`�1B+0n�@@ ��������R|�����OEDW$��WlSV�����G����>lV+���V��fs��8��+�^8���BL��[ii���
�,B����Y��y����
i$�4��s����!���5���R���?���p�-f#q�Rrl���jc���X�^6+�<������ph�/oJ�j��l����4��������s^!
~��]���w
;i�;�i��~�1���Uk���jv~�����|����C���N��N�&G����P��@�Zg��7������|���#$��>)����3d��I�`i�ph���  QB����_�H�|��C���|^!/=��HZ9w0d�
���
)�4�y��������Y$�.�Q2t����=�;'J�#-�,��O��������Pzo�yW���%�����%��[�������t����^k\�D��[~����p����ICS'�Io��_����S��<����N)���|(��@b����@PV�X���^3��r�D����:���4���!Y�!H~������C��I��k|�����>D�cM�v���%��I��Z]�Si9�Fb�JZk�RcMn��Y����-m6�~��m���:����1n�2>���U�.�i���W\\{��?r���l�>��5|�;u��Y%��3�g��[���1{�������Yx�A@b��7�u�ULi�����"%*$��Hi��fx{[���.BRZ]%��!*-Z�������Li���4I���PZ1�5E���"�5������o��=���J�z����u �����+�+����m/����1[�]�g��C��][��4N�������9(�P�i���4oJ��+�������]V.%]%*-:��E�������J��*Q����y��!���^]~!���O���b�r��wv����p�u������Q�L��	����Z��|�,���IJ�>��u����PS8�@E��[�75����
)�����L[�V�PZFRZ�����
���e/NE�G"   � ��i��2��h$�s�B��C=���&v�`Hc���l��8��f��"�E�j=�-�>��E��t��>	��Y<�4$w���?^����|w�z:W����6�K�������h��~����|�N~;�,��.�mV_������V���`�����n�����oX���"���@��������zE�����'O��'-=�+T*d�O�D�h+	� ����i����n!�d4����^�i���,_�t������.������O����J`���F^����4/�>%�l���O)��@@ ��)��g����������v0��WH�7��B�l����8���yW�-������Eh�����!y���������{K���~�k�a�IE���d�,f2%�J������D�   �����Li�����r�u~����J��QY����]H��/��mlD<[,�8�K_#�#B'F3]�_%��aS�U�W���_��X�W�;L���1�=��VI}�v+��z��Y2��'�-��b!���]��w��GY�d������O#f������A>�������
  :~�Ms�T(�A�s7�f���]�Q���.�"���	��_�����w!M$����V�p����#���Y`3[��|��l�9H��FB2����-Ux���'����t�S<I/<�9�d�S��\���6�"��,n���W����w:�f;t�jq!|\��C���8W~����
�����X�wa��BB�
i��x�]�����e��~��8�N�:0�0�O&SX$�h�hv2��B2�	,1D�a���1!�1�0�%G�   �����~z?*�[��:G�c�a�����qZ|T�X����iT-� �S����qZ��4*k���L
�NCC�8��x<PH�p�Oi��i�W����|�����B�d^!�`���j���N�q��B��/�Xw�$�|Z��i����JJ��$V8B��7�G�K�=��x���}�C�=T�vH@@b�����������:���O��}|�>/����������3�c�6x�8~���I�LC��.��/V�_|�pA�{H��s�{j��i)�-Rp%�a���;���}�,V:�������"O�������t6"���#���@����a(���$�Ai�����j��D�'J9�zp�/������1����r���j
  �I����^B�U�6��a��.x�!�f���'�WH#��������|R!���
i�!�1��bC�I��I��o���w����,�=�����b�L($��@"�c7�D�u�k�a|T���$��vA�@@@����k��C�Q(��`F&   	IJ+!��J������VX0#��$���fG�A@@�BJ+,��	���@B��JH���   a!���@@@ !	@i%��Qi����
fd   �������4���@X@i�s�e��E-�s�F�A@@ ��Qi��;"�w�+D�
�@@ Q	�Qi%*b�@@@ a	�UZ�
Q9G�Rg�������/��mq�J�^�hx�|�
�a���7EC�Q�=r���E[���X���M$��/.��(�p�S�������h�|�b���t���v��E�c�SL�X���`�6�.>J1�-.��
�`�#��O
  1r�V��j�J�t���C
��3T��1}���.��W��W<��}[YE�*��A���b�D�����hP�0q�����#��%������)-�V.XS���_��W�?��cb�����]���+�W6�h!����RR�@@b��<��<E3�N���������%X��J�m���0�p�ct=�������A���K�$  �L@�����B�X��e��/i��@���M�����+*��:�;��S"X�"F�G���i����%�8 ��.�8��   QH@�O+
�"����@����z��   1KJ+fM�����D=(��7
   ���!�_FVIEND�B`�
#4Rui DeSousa
rui.desousa@icloud.com
In reply to: Steven Winfield (#3)
Re: Possible optimisation: push down SORT and LIMIT nodes

In the meantime you can force it with CTE.

with inv as (
select id_asset
, inventory.date
, quantity
from inventory
order by inventory.date
limit 100
)
select inv.date, asset.name, inv.quantity
from inv
join asset on id_asset = asset.id
order by inv.date, asset.name
;

Show quoted text

On Jun 1, 2018, at 11:12 AM, Steven Winfield <Steven.Winfield@cantabcapital.com> wrote:

It does indeed!
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=398.50..398.50 rows=100 width=32) (actual time=10.359..10.378 rows=100 loops=1)
Output: inventory.date, asset.name, inventory.quantity
-> Incremental Sort (cost=398.50..403.27 rows=5006001 width=32) (actual time=10.357..10.365 rows=100 loops=1)
Output: inventory.date, asset.name, inventory.quantity
Sort Key: inventory.date, asset.name
Presorted Key: inventory.date
Sort Method: quicksort Memory: 103kB
Sort Groups: 1
-> Nested Loop Left Join (cost=0.71..1702372.39 rows=5006001 width=32) (actual time=0.030..2.523 rows=1002 loops=1)
Output: inventory.date, asset.name, inventory.quantity
Inner Unique: true
-> Index Scan using inventory_pkey on temp.inventory (cost=0.43..238152.40 rows=5006001 width=12) (actual time=0.016..0.290 rows=1002 loops=1)
Output: inventory.date, inventory.id_asset, inventory.quantity
-> Index Scan using asset_pkey on temp.asset (cost=0.28..0.29 rows=1 width=28) (actual time=0.002..0.002 rows=1 loops=1002)
Output: asset.id <http://asset.id/&gt;, asset.name
Index Cond: (asset.id <http://asset.id/&gt; = inventory.id_asset)

I’m guessing the feature-freeze for v11 means we won’t see this in the that version, though, and the extra GUC it requires means it will be in v12 at the earliest?

From: James Coleman [mailto:jtc331@gmail.com <mailto:jtc331@gmail.com>]
Sent: 01 June 2018 13:50
To: Christopher Wilson
Cc: pgsql-hackers@lists.postgresql.org <mailto:pgsql-hackers@lists.postgresql.org>; Steven Winfield
Subject: Re: FW: Possible optimisation: push down SORT and LIMIT nodes

The incremental sort patch seems to significantly improve performance for your query: https://commitfest.postgresql.org/17/1124/ <https://commitfest.postgresql.org/17/1124/&gt;

On Fri, Jun 1, 2018 at 7:46 AM, Christopher Wilson <chris.wilson@cantabcapital.com <mailto:chris.wilson@cantabcapital.com>> wrote:
Dear Postgres developers,

I sent this query to the performance list a couple of days ago, but nobody has come up with any suggestions. I was wondering if you’d like to consider it?

If this is interesting but nobody has time to implement it, then I would potentially be willing to implement and submit it myself, in my own time. I am experienced with C and C++, but I have not modified Postgres before, and I would need significant support (e.g. on IRC) to help me to find my way around the codebase and finish the task in an acceptable amount of time.

Thanks, Chris.

From: Christopher Wilson
Sent: 30 May 2018 16:47
To: 'pgsql-performance@postgresql.org <mailto:pgsql-performance@postgresql.org>'
Cc: Steven Winfield (steven.winfield@cantabcapital.com <mailto:steven.winfield@cantabcapital.com>)
Subject: Possible optimisation: push down SORT and LIMIT nodes

Hi all,

We have a query which is rather slow (about 10 seconds), and it looks like this:

select inventory.date, asset.name <http://asset.name/&gt;, inventory.quantity
from temp.inventory
left outer join temp.asset on asset.id <http://asset.id/&gt; = id_asset
order by inventory.date, asset.name <http://asset.name/&gt;
limit 100

The inventory table has the quantity of each asset in the inventory on each date (complete SQL to create and populate the tables with dummy data is below). The query plan looks like this (the non-parallel version is similar):

<image001.png>

Or in text form:

Limit (cost=217591.77..217603.60 rows=100 width=32) (actual time=9122.235..9122.535 rows=100 loops=1)
Buffers: shared hit=6645, temp read=6363 written=6364
-> Gather Merge (cost=217591.77..790859.62 rows=4844517 width=32) (actual time=9122.232..9122.518 rows=100 loops=1)
Workers Planned: 3
Workers Launched: 3
Buffers: shared hit=6645, temp read=6363 written=6364
-> Sort (cost=216591.73..220628.83 rows=1614839 width=32) (actual time=8879.909..8880.030 rows=727 loops=4)
Sort Key: inventory.date, asset.name <http://asset.name/&gt;
Sort Method: external merge Disk: 50904kB
Buffers: shared hit=27365, temp read=25943 written=25947
-> Hash Join (cost=26.52..50077.94 rows=1614839 width=32) (actual time=0.788..722.095 rows=1251500 loops=4)
Hash Cond: (inventory.id_asset = asset.id <http://asset.id/&gt;)
Buffers: shared hit=27236
-> Parallel Seq Scan on inventory (cost=0.00..29678.39 rows=1614839 width=12) (actual time=0.025..237.977 rows=1251500 loops=4)
Buffers: shared hit=27060
-> Hash (cost=14.01..14.01 rows=1001 width=28) (actual time=0.600..0.600 rows=1001 loops=4)
Buckets: 1024 Batches: 1 Memory Usage: 68kB
Buffers: shared hit=32
-> Seq Scan on asset (cost=0.00..14.01 rows=1001 width=28) (actual time=0.026..0.279 rows=1001 loops=4)
Buffers: shared hit=32
Planning time: 0.276 ms
Execution time: 9180.144 ms

I can see why it does this, but I can also imagine a potential optimisation, which would enable it to select far fewer rows from the inventory table.

As we are joining to the primary key of the asset table, we know that this join will not add extra rows to the output. Every output row comes from a distinct inventory row. Therefore only 100 rows of the inventory table are required. But which ones?

If we selected exactly 100 rows from inventory, ordered by date, then all of the dates that were complete (every row for that date returned) would be in the output. However, if there is a date which is incomplete (we haven’t emitted all the inventory records for that date), then it’s possible that we would need some records that we haven’t emitted yet. This can only be known after joining to the asset table and sorting this last group by both date and asset name.

But we do know that there can only be 0 or 1 incomplete groups: either the last group (by date) is incomplete, if the LIMIT cut it off in mid-group, or its end coincided with the LIMIT boundary and it is complete. As long as we continue selecting rows from this table until we exhaust the prefix of the overall SORT which applies to it (in this case, just the date) then it will be complete, and we will have all the inventory rows that can appear in the output (because no possible values of columns that appear later in the sort order can cause any rows with different dates to appear in the output).

I’m imagining something like a sort-limit-finish node, which sorts its input and then returns at least the limit number of rows, but keeps returning rows until it exhausts the last sort prefix that it read.

This node could be created from an ordinary SORT and LIMIT pair:

SORT + LIMIT -> SORT-LIMIT-FINISH + SORT + LIMIT

And then pushed down through either a left join, or an inner join on a foreign key, when the right side is unique, and no columns from the right side appear in WHERE conditions, nor anywhere in the sort order except at the end. This sort column suffix would be removed by pushing the node down. The resulting query plan would then look something like:

Index Scan on inventory
SORT-LIMIT-FINISH(sort=[inventory.date], limit=100) (pushed down through the join to asset)
Seq Scan on asset
Hash Left Join (inventory.id_asset = asset.id <http://asset.id/&gt;)
Sort (inventory.date, asset.name <http://asset.name/&gt;)
Limit (100)

And would emit only about 100-1000 inventory rows from the index scan.

Does this sound correct, reasonable and potentially interesting to Postgres developers?

SQL to reproduce:

create schema temp;
create table temp.asset (
id serial primary key,
name text
);
insert into temp.asset (name) select 'Thing ' || random()::text as name from generate_series(0, 1000) as s;
create table temp.inventory (
date date,
id_asset int,
quantity int,
primary key (date, id_asset),
CONSTRAINT id_asset_fk FOREIGN KEY (id_asset) REFERENCES temp.asset (id) MATCH SIMPLE
);
insert into temp.inventory (date, id_asset, quantity)
select current_date - days, asset.id <http://asset.id/&gt;, random() from temp.asset, generate_series(0, 5000) as days;

Thanks, Chris.

This email is confidential. If you are not the intended recipient, please advise us immediately and delete this message. The registered name of Cantab- part of GAM Systematic is Cantab Capital Partners LLP. See - http://www.gam.com/en/Legal/Email+disclosures+EU <http://www.gam.com/en/Legal/Email+disclosures+EU&gt; for further information on confidentiality, the risks of non-secure electronic communication, and certain disclosures which we are required to make in accordance with applicable legislation and regulations. If you cannot access this link, please notify us by reply message and we will send the contents to you.

GAM Holding AG and its subsidiaries (Cantab – GAM Systematic) will collect and use information about you in the course of your interactions with us. Full details about the data types we collect and what we use this for and your related rights is set out in our online privacy policy at https://www.gam.com/en/legal/privacy-policy <https://www.gam.com/en/legal/privacy-policy&gt;. Please familiarise yourself with this policy and check it from time to time for updates as it supplements this notice

#5Chris Wilson
chris@simply-italian.co.uk
In reply to: Rui DeSousa (#4)
Re: Possible optimisation: push down SORT and LIMIT nodes

Hi Rui,

Unfortunately sorting and limiting the CTE doesn't work properly, because exactly which 100 rows are selected depends on values in the asset table, which are not known at the time that the cte is evaluated.

I can work around it for our case by querying for the unique dates that make it through the limit, and making the cte return all and only inventory records matching those dates, but of course having this done automatically is infinitely preferable.

I'm really happy that this patch is actively being worked on and pushed forward towards merging, and grateful to all involved in doing that. Thank you for making Postgres even more awesome!

Thanks, Chris.

Sent from my iPhone

Show quoted text

On 1 Jun 2018, at 16:44, Rui DeSousa <rui.desousa@icloud.com> wrote:

In the meantime you can force it with CTE.

with inv as (
select id_asset
, inventory.date
, quantity
from inventory
order by inventory.date
limit 100
)
select inv.date, asset.name, inv.quantity
from inv
join asset on id_asset = asset.id
order by inv.date, asset.name
;

On Jun 1, 2018, at 11:12 AM, Steven Winfield <Steven.Winfield@cantabcapital.com> wrote:

It does indeed!
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=398.50..398.50 rows=100 width=32) (actual time=10.359..10.378 rows=100 loops=1)
Output: inventory.date, asset.name, inventory.quantity
-> Incremental Sort (cost=398.50..403.27 rows=5006001 width=32) (actual time=10.357..10.365 rows=100 loops=1)
Output: inventory.date, asset.name, inventory.quantity
Sort Key: inventory.date, asset.name
Presorted Key: inventory.date
Sort Method: quicksort Memory: 103kB
Sort Groups: 1
-> Nested Loop Left Join (cost=0.71..1702372.39 rows=5006001 width=32) (actual time=0.030..2.523 rows=1002 loops=1)
Output: inventory.date, asset.name, inventory.quantity
Inner Unique: true
-> Index Scan using inventory_pkey on temp.inventory (cost=0.43..238152.40 rows=5006001 width=12) (actual time=0.016..0.290 rows=1002 loops=1)
Output: inventory.date, inventory.id_asset, inventory.quantity
-> Index Scan using asset_pkey on temp.asset (cost=0.28..0.29 rows=1 width=28) (actual time=0.002..0.002 rows=1 loops=1002)
Output: asset.id, asset.name
Index Cond: (asset.id = inventory.id_asset)

I’m guessing the feature-freeze for v11 means we won’t see this in the that version, though, and the extra GUC it requires means it will be in v12 at the earliest?

From: James Coleman [mailto:jtc331@gmail.com]
Sent: 01 June 2018 13:50
To: Christopher Wilson
Cc: pgsql-hackers@lists.postgresql.org; Steven Winfield
Subject: Re: FW: Possible optimisation: push down SORT and LIMIT nodes

The incremental sort patch seems to significantly improve performance for your query: https://commitfest.postgresql.org/17/1124/

On Fri, Jun 1, 2018 at 7:46 AM, Christopher Wilson <chris.wilson@cantabcapital.com> wrote:
Dear Postgres developers,

I sent this query to the performance list a couple of days ago, but nobody has come up with any suggestions. I was wondering if you’d like to consider it?

If this is interesting but nobody has time to implement it, then I would potentially be willing to implement and submit it myself, in my own time. I am experienced with C and C++, but I have not modified Postgres before, and I would need significant support (e.g. on IRC) to help me to find my way around the codebase and finish the task in an acceptable amount of time.

Thanks, Chris.

From: Christopher Wilson
Sent: 30 May 2018 16:47
To: 'pgsql-performance@postgresql.org'
Cc: Steven Winfield (steven.winfield@cantabcapital.com)
Subject: Possible optimisation: push down SORT and LIMIT nodes

Hi all,

We have a query which is rather slow (about 10 seconds), and it looks like this:

select inventory.date, asset.name, inventory.quantity
from temp.inventory
left outer join temp.asset on asset.id = id_asset
order by inventory.date, asset.name
limit 100

The inventory table has the quantity of each asset in the inventory on each date (complete SQL to create and populate the tables with dummy data is below). The query plan looks like this (the non-parallel version is similar):

<image001.png>

Or in text form:

Limit (cost=217591.77..217603.60 rows=100 width=32) (actual time=9122.235..9122.535 rows=100 loops=1)
Buffers: shared hit=6645, temp read=6363 written=6364
-> Gather Merge (cost=217591.77..790859.62 rows=4844517 width=32) (actual time=9122.232..9122.518 rows=100 loops=1)
Workers Planned: 3
Workers Launched: 3
Buffers: shared hit=6645, temp read=6363 written=6364
-> Sort (cost=216591.73..220628.83 rows=1614839 width=32) (actual time=8879.909..8880.030 rows=727 loops=4)
Sort Key: inventory.date, asset.name
Sort Method: external merge Disk: 50904kB
Buffers: shared hit=27365, temp read=25943 written=25947
-> Hash Join (cost=26.52..50077.94 rows=1614839 width=32) (actual time=0.788..722.095 rows=1251500 loops=4)
Hash Cond: (inventory.id_asset = asset.id)
Buffers: shared hit=27236
-> Parallel Seq Scan on inventory (cost=0.00..29678.39 rows=1614839 width=12) (actual time=0.025..237.977 rows=1251500 loops=4)
Buffers: shared hit=27060
-> Hash (cost=14.01..14.01 rows=1001 width=28) (actual time=0.600..0.600 rows=1001 loops=4)
Buckets: 1024 Batches: 1 Memory Usage: 68kB
Buffers: shared hit=32
-> Seq Scan on asset (cost=0.00..14.01 rows=1001 width=28) (actual time=0.026..0.279 rows=1001 loops=4)
Buffers: shared hit=32
Planning time: 0.276 ms
Execution time: 9180.144 ms

I can see why it does this, but I can also imagine a potential optimisation, which would enable it to select far fewer rows from the inventory table.

As we are joining to the primary key of the asset table, we know that this join will not add extra rows to the output. Every output row comes from a distinct inventory row. Therefore only 100 rows of the inventory table are required. But which ones?

If we selected exactly 100 rows from inventory, ordered by date, then all of the dates that were complete (every row for that date returned) would be in the output. However, if there is a date which is incomplete (we haven’t emitted all the inventory records for that date), then it’s possible that we would need some records that we haven’t emitted yet. This can only be known after joining to the asset table and sorting this last group by both date and asset name.

But we do know that there can only be 0 or 1 incomplete groups: either the last group (by date) is incomplete, if the LIMIT cut it off in mid-group, or its end coincided with the LIMIT boundary and it is complete. As long as we continue selecting rows from this table until we exhaust the prefix of the overall SORT which applies to it (in this case, just the date) then it will be complete, and we will have all the inventory rows that can appear in the output (because no possible values of columns that appear later in the sort order can cause any rows with different dates to appear in the output).

I’m imagining something like a sort-limit-finish node, which sorts its input and then returns at least the limit number of rows, but keeps returning rows until it exhausts the last sort prefix that it read.

This node could be created from an ordinary SORT and LIMIT pair:

SORT + LIMIT -> SORT-LIMIT-FINISH + SORT + LIMIT

And then pushed down through either a left join, or an inner join on a foreign key, when the right side is unique, and no columns from the right side appear in WHERE conditions, nor anywhere in the sort order except at the end. This sort column suffix would be removed by pushing the node down. The resulting query plan would then look something like:

Index Scan on inventory
SORT-LIMIT-FINISH(sort=[inventory.date], limit=100) (pushed down through the join to asset)
Seq Scan on asset
Hash Left Join (inventory.id_asset = asset.id)
Sort (inventory.date, asset.name)
Limit (100)

And would emit only about 100-1000 inventory rows from the index scan.

Does this sound correct, reasonable and potentially interesting to Postgres developers?

SQL to reproduce:

create schema temp;
create table temp.asset (
id serial primary key,
name text
);
insert into temp.asset (name) select 'Thing ' || random()::text as name from generate_series(0, 1000) as s;
create table temp.inventory (
date date,
id_asset int,
quantity int,
primary key (date, id_asset),
CONSTRAINT id_asset_fk FOREIGN KEY (id_asset) REFERENCES temp.asset (id) MATCH SIMPLE
);
insert into temp.inventory (date, id_asset, quantity)
select current_date - days, asset.id, random() from temp.asset, generate_series(0, 5000) as days;

Thanks, Chris.

This email is confidential. If you are not the intended recipient, please advise us immediately and delete this message. The registered name of Cantab- part of GAM Systematic is Cantab Capital Partners LLP. See - http://www.gam.com/en/Legal/Email+disclosures+EU for further information on confidentiality, the risks of non-secure electronic communication, and certain disclosures which we are required to make in accordance with applicable legislation and regulations. If you cannot access this link, please notify us by reply message and we will send the contents to you.

GAM Holding AG and its subsidiaries (Cantab – GAM Systematic) will collect and use information about you in the course of your interactions with us. Full details about the data types we collect and what we use this for and your related rights is set out in our online privacy policy at https://www.gam.com/en/legal/privacy-policy. Please familiarise yourself with this policy and check it from time to time for updates as it supplements this notice

#6Rui DeSousa
rui.desousa@icloud.com
In reply to: Chris Wilson (#5)
Re: Possible optimisation: push down SORT and LIMIT nodes

True… but in that case it needs to be more expressive.

i.e.

with d as (
select date
from inventory
order by date
limit 10
), df as (
select distinct date
from d
)
select i.date, a.name, i.quantity
from inventory i
join asset a on a.id = i.id_asset
where i.date in (select date from df)
order by i.date, a.name
limit 10
;

prod=> with d as (
prod(> select date
prod(> from inventory
prod(> order by date
prod(> limit 10
prod(> ), df as (
prod(> select distinct date
prod(> from d
prod(> )
prod-> select i.date, a.name, i.quantity
prod-> from inventory i
prod-> join asset a on a.id = i.id_asset
prod-> where i.date in (select date from df)
prod-> order by i.date, a.name
prod-> limit 10
prod-> ;
date | name | quantity
------------+---------------------------+----------
2004-09-22 | Thing 0.00122669106349349 | 0
2004-09-22 | Thing 0.00140673760324717 | 0
2004-09-22 | Thing 0.00180063676089048 | 0
2004-09-22 | Thing 0.00463481899350882 | 1
2004-09-22 | Thing 0.00622459733858705 | 1
2004-09-22 | Thing 0.00649207830429077 | 0
2004-09-22 | Thing 0.00823836214840412 | 1
2004-09-22 | Thing 0.0109024560078979 | 1
2004-09-22 | Thing 0.0109436474740505 | 0
2004-09-22 | Thing 0.0111544523388147 | 0
(10 rows)

Time: 3.040 ms
prod=> select inventory.date, asset.name, inventory.quantity
prod-> from asset
prod-> join inventory on id_asset = asset.id
prod-> order by inventory.date, asset.name
prod-> limit 10;
date | name | quantity
------------+---------------------------+----------
2004-09-22 | Thing 0.00122669106349349 | 0
2004-09-22 | Thing 0.00140673760324717 | 0
2004-09-22 | Thing 0.00180063676089048 | 0
2004-09-22 | Thing 0.00463481899350882 | 1
2004-09-22 | Thing 0.00622459733858705 | 1
2004-09-22 | Thing 0.00649207830429077 | 0
2004-09-22 | Thing 0.00823836214840412 | 1
2004-09-22 | Thing 0.0109024560078979 | 1
2004-09-22 | Thing 0.0109436474740505 | 0
2004-09-22 | Thing 0.0111544523388147 | 0
(10 rows)

Time: 6733.775 ms (00:06.734)

Show quoted text

On Jun 1, 2018, at 12:46 PM, Chris Wilson <chris@simply-italian.co.uk> wrote:

Hi Rui,

Unfortunately sorting and limiting the CTE doesn't work properly, because exactly which 100 rows are selected depends on values in the asset table, which are not known at the time that the cte is evaluated.

I can work around it for our case by querying for the unique dates that make it through the limit, and making the cte return all and only inventory records matching those dates, but of course having this done automatically is infinitely preferable.

I'm really happy that this patch is actively being worked on and pushed forward towards merging, and grateful to all involved in doing that. Thank you for making Postgres even more awesome!

Thanks, Chris.

Sent from my iPhone

On 1 Jun 2018, at 16:44, Rui DeSousa <rui.desousa@icloud.com <mailto:rui.desousa@icloud.com>> wrote:

In the meantime you can force it with CTE.

with inv as (
select id_asset
, inventory.date
, quantity
from inventory
order by inventory.date
limit 100
)
select inv.date, asset.name, inv.quantity
from inv
join asset on id_asset = asset.id <http://asset.id/&gt;
order by inv.date, asset.name
;

On Jun 1, 2018, at 11:12 AM, Steven Winfield <Steven.Winfield@cantabcapital.com <mailto:Steven.Winfield@cantabcapital.com>> wrote:

It does indeed!
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=398.50..398.50 rows=100 width=32) (actual time=10.359..10.378 rows=100 loops=1)
Output: inventory.date, asset.name, inventory.quantity
-> Incremental Sort (cost=398.50..403.27 rows=5006001 width=32) (actual time=10.357..10.365 rows=100 loops=1)
Output: inventory.date, asset.name, inventory.quantity
Sort Key: inventory.date, asset.name
Presorted Key: inventory.date
Sort Method: quicksort Memory: 103kB
Sort Groups: 1
-> Nested Loop Left Join (cost=0.71..1702372.39 rows=5006001 width=32) (actual time=0.030..2.523 rows=1002 loops=1)
Output: inventory.date, asset.name, inventory.quantity
Inner Unique: true
-> Index Scan using inventory_pkey on temp.inventory (cost=0.43..238152.40 rows=5006001 width=12) (actual time=0.016..0.290 rows=1002 loops=1)
Output: inventory.date, inventory.id_asset, inventory.quantity
-> Index Scan using asset_pkey on temp.asset (cost=0.28..0.29 rows=1 width=28) (actual time=0.002..0.002 rows=1 loops=1002)
Output: asset.id <http://asset.id/&gt;, asset.name
Index Cond: (asset.id <http://asset.id/&gt; = inventory.id_asset)

I’m guessing the feature-freeze for v11 means we won’t see this in the that version, though, and the extra GUC it requires means it will be in v12 at the earliest?

From: James Coleman [mailto:jtc331@gmail.com <mailto:jtc331@gmail.com>]
Sent: 01 June 2018 13:50
To: Christopher Wilson
Cc: pgsql-hackers@lists.postgresql.org <mailto:pgsql-hackers@lists.postgresql.org>; Steven Winfield
Subject: Re: FW: Possible optimisation: push down SORT and LIMIT nodes

The incremental sort patch seems to significantly improve performance for your query: https://commitfest.postgresql.org/17/1124/ <https://commitfest.postgresql.org/17/1124/&gt;

On Fri, Jun 1, 2018 at 7:46 AM, Christopher Wilson <chris.wilson@cantabcapital.com <mailto:chris.wilson@cantabcapital.com>> wrote:
Dear Postgres developers,

I sent this query to the performance list a couple of days ago, but nobody has come up with any suggestions. I was wondering if you’d like to consider it?

If this is interesting but nobody has time to implement it, then I would potentially be willing to implement and submit it myself, in my own time. I am experienced with C and C++, but I have not modified Postgres before, and I would need significant support (e.g. on IRC) to help me to find my way around the codebase and finish the task in an acceptable amount of time.

Thanks, Chris.

From: Christopher Wilson
Sent: 30 May 2018 16:47
To: 'pgsql-performance@postgresql.org <mailto:pgsql-performance@postgresql.org>'
Cc: Steven Winfield (steven.winfield@cantabcapital.com <mailto:steven.winfield@cantabcapital.com>)
Subject: Possible optimisation: push down SORT and LIMIT nodes

Hi all,

We have a query which is rather slow (about 10 seconds), and it looks like this:

select inventory.date, asset.name <http://asset.name/&gt;, inventory.quantity
from temp.inventory
left outer join temp.asset on asset.id <http://asset.id/&gt; = id_asset
order by inventory.date, asset.name <http://asset.name/&gt;
limit 100

The inventory table has the quantity of each asset in the inventory on each date (complete SQL to create and populate the tables with dummy data is below). The query plan looks like this (the non-parallel version is similar):

<image001.png>

Or in text form:

Limit (cost=217591.77..217603.60 rows=100 width=32) (actual time=9122.235..9122.535 rows=100 loops=1)
Buffers: shared hit=6645, temp read=6363 written=6364
-> Gather Merge (cost=217591.77..790859.62 rows=4844517 width=32) (actual time=9122.232..9122.518 rows=100 loops=1)
Workers Planned: 3
Workers Launched: 3
Buffers: shared hit=6645, temp read=6363 written=6364
-> Sort (cost=216591.73..220628.83 rows=1614839 width=32) (actual time=8879.909..8880.030 rows=727 loops=4)
Sort Key: inventory.date, asset.name <http://asset.name/&gt;
Sort Method: external merge Disk: 50904kB
Buffers: shared hit=27365, temp read=25943 written=25947
-> Hash Join (cost=26.52..50077.94 rows=1614839 width=32) (actual time=0.788..722.095 rows=1251500 loops=4)
Hash Cond: (inventory.id_asset = asset.id <http://asset.id/&gt;)
Buffers: shared hit=27236
-> Parallel Seq Scan on inventory (cost=0.00..29678.39 rows=1614839 width=12) (actual time=0.025..237.977 rows=1251500 loops=4)
Buffers: shared hit=27060
-> Hash (cost=14.01..14.01 rows=1001 width=28) (actual time=0.600..0.600 rows=1001 loops=4)
Buckets: 1024 Batches: 1 Memory Usage: 68kB
Buffers: shared hit=32
-> Seq Scan on asset (cost=0.00..14.01 rows=1001 width=28) (actual time=0.026..0.279 rows=1001 loops=4)
Buffers: shared hit=32
Planning time: 0.276 ms
Execution time: 9180.144 ms

I can see why it does this, but I can also imagine a potential optimisation, which would enable it to select far fewer rows from the inventory table.

As we are joining to the primary key of the asset table, we know that this join will not add extra rows to the output. Every output row comes from a distinct inventory row. Therefore only 100 rows of the inventory table are required. But which ones?

If we selected exactly 100 rows from inventory, ordered by date, then all of the dates that were complete (every row for that date returned) would be in the output. However, if there is a date which is incomplete (we haven’t emitted all the inventory records for that date), then it’s possible that we would need some records that we haven’t emitted yet. This can only be known after joining to the asset table and sorting this last group by both date and asset name.

But we do know that there can only be 0 or 1 incomplete groups: either the last group (by date) is incomplete, if the LIMIT cut it off in mid-group, or its end coincided with the LIMIT boundary and it is complete. As long as we continue selecting rows from this table until we exhaust the prefix of the overall SORT which applies to it (in this case, just the date) then it will be complete, and we will have all the inventory rows that can appear in the output (because no possible values of columns that appear later in the sort order can cause any rows with different dates to appear in the output).

I’m imagining something like a sort-limit-finish node, which sorts its input and then returns at least the limit number of rows, but keeps returning rows until it exhausts the last sort prefix that it read.

This node could be created from an ordinary SORT and LIMIT pair:

SORT + LIMIT -> SORT-LIMIT-FINISH + SORT + LIMIT

And then pushed down through either a left join, or an inner join on a foreign key, when the right side is unique, and no columns from the right side appear in WHERE conditions, nor anywhere in the sort order except at the end. This sort column suffix would be removed by pushing the node down. The resulting query plan would then look something like:

Index Scan on inventory
SORT-LIMIT-FINISH(sort=[inventory.date], limit=100) (pushed down through the join to asset)
Seq Scan on asset
Hash Left Join (inventory.id_asset = asset.id <http://asset.id/&gt;)
Sort (inventory.date, asset.name <http://asset.name/&gt;)
Limit (100)

And would emit only about 100-1000 inventory rows from the index scan.

Does this sound correct, reasonable and potentially interesting to Postgres developers?

SQL to reproduce:

create schema temp;
create table temp.asset (
id serial primary key,
name text
);
insert into temp.asset (name) select 'Thing ' || random()::text as name from generate_series(0, 1000) as s;
create table temp.inventory (
date date,
id_asset int,
quantity int,
primary key (date, id_asset),
CONSTRAINT id_asset_fk FOREIGN KEY (id_asset) REFERENCES temp.asset (id) MATCH SIMPLE
);
insert into temp.inventory (date, id_asset, quantity)
select current_date - days, asset.id <http://asset.id/&gt;, random() from temp.asset, generate_series(0, 5000) as days;

Thanks, Chris.

This email is confidential. If you are not the intended recipient, please advise us immediately and delete this message. The registered name of Cantab- part of GAM Systematic is Cantab Capital Partners LLP. See - http://www.gam.com/en/Legal/Email+disclosures+EU <http://www.gam.com/en/Legal/Email+disclosures+EU&gt; for further information on confidentiality, the risks of non-secure electronic communication, and certain disclosures which we are required to make in accordance with applicable legislation and regulations. If you cannot access this link, please notify us by reply message and we will send the contents to you.

GAM Holding AG and its subsidiaries (Cantab – GAM Systematic) will collect and use information about you in the course of your interactions with us. Full details about the data types we collect and what we use this for and your related rights is set out in our online privacy policy at https://www.gam.com/en/legal/privacy-policy <https://www.gam.com/en/legal/privacy-policy&gt;. Please familiarise yourself with this policy and check it from time to time for updates as it supplements this notice