Proposal: scan key push down to heap [WIP]

Started by Dilip Kumarover 9 years ago36 messages
#1Dilip Kumar
dilipbalaut@gmail.com
2 attachment(s)

Hi Hackers,

I would like to propose a patch for pushing down the scan key to heap.

Currently only in case of system table scan keys are pushed down. I
have implemented the POC patch to do the same for normal table scan.

This patch will extract the expression from qual and prepare the scan
keys. Currently in POC version I have only supported "var OP const"
type of qual, because these type of quals can be pushed down using
existing framework.

Purpose of this work is to first implement the basic functionality and
analyze the results. If results are good then we can extend it for
other type of expressions.

However in future when we try to expand the support for complex
expressions, then we need to be very careful while selecting
pushable expression. It should not happen that we push something very
complex, which may cause contention with other write operation (as
HeapKeyTest is done under page lock).

Performance Test: (test done in local machine, with all default setting).

Setup:
----------

create table test(a int, b varchar, c varchar, d int);
insert into test values(generate_series(1,10000000), repeat('x', 30),
repeat('y', 30), generate_series(1,10000000));
analyze test;

Test query:
--------------
select count(*) from test where a < $1;

Results: (execution time in ms)
------------
Selectivity Head(ms) Patch(ms) gain
0.01 612 307 49%
0.1 623 353 43%
0.2 645 398 38%
0.5 780 535 31%
0.8 852 590 30%
1 913 730 20%

Instructions: (Cpu instructions measured with callgrind tool):

Quary : select count(*) from test where a < 100000;

Head: 10,815,730,925
Patch: 4,780,047,331

Summary:
--------------
1. ~50% reduction in both instructions as well as execution time.
2. Here we can see ~ 20% execution time reduction even at selectivity
1 (when all tuples are selected). And, reasoning for the same can be
that HeapKeyTest is much simplified compared to ExecQual. It's
possible that in future when we try to support more variety of keys,
gain at high selectivity may come down.

WIP patch attached..

Thoughts ?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachments:

heap_scankey_pushdown_v1.patchapplication/octet-stream; name=heap_scankey_pushdown_v1.patchDownload
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 22b3f5f..3f08b79 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -1649,7 +1649,8 @@ heap_parallelscan_initialize(ParallelHeapScanDesc target, Relation relation,
  * ----------------
  */
 HeapScanDesc
-heap_beginscan_parallel(Relation relation, ParallelHeapScanDesc parallel_scan)
+heap_beginscan_parallel(Relation relation, ParallelHeapScanDesc parallel_scan,
+						int nkeys, ScanKey key)
 {
 	Snapshot	snapshot;
 
@@ -1657,7 +1658,7 @@ heap_beginscan_parallel(Relation relation, ParallelHeapScanDesc parallel_scan)
 	snapshot = RestoreSnapshot(parallel_scan->phs_snapshot_data);
 	RegisterSnapshot(snapshot);
 
-	return heap_beginscan_internal(relation, snapshot, 0, NULL, parallel_scan,
+	return heap_beginscan_internal(relation, snapshot, nkeys, key, parallel_scan,
 								   true, true, true, false, false, true);
 }
 
diff --git a/src/backend/executor/execScan.c b/src/backend/executor/execScan.c
index fb0013d..3a12ec4 100644
--- a/src/backend/executor/execScan.c
+++ b/src/backend/executor/execScan.c
@@ -205,7 +205,7 @@ ExecScan(ScanState *node,
 		 * when the qual is nil ... saves only a few cycles, but they add up
 		 * ...
 		 */
-		if (!qual || ExecQual(qual, econtext, false))
+		if (!node->ps.qual || ExecQual(node->ps.qual, econtext, false))
 		{
 			/*
 			 * Found a satisfactory scan tuple.
diff --git a/src/backend/executor/nodeSeqscan.c b/src/backend/executor/nodeSeqscan.c
index 00bf3a5..af716e2 100644
--- a/src/backend/executor/nodeSeqscan.c
+++ b/src/backend/executor/nodeSeqscan.c
@@ -33,6 +33,9 @@
 
 static void InitScanRelation(SeqScanState *node, EState *estate, int eflags);
 static TupleTableSlot *SeqNext(SeqScanState *node);
+static void get_scankey_from_qual(List *qual, PlanState *ps, int *nkeys,
+					ScanKey *rs_keys);
+static bool extract_var_and_const(List *args, Var **var, Const **cons);
 
 /* ----------------------------------------------------------------
  *						Scan Support
@@ -64,13 +67,23 @@ SeqNext(SeqScanState *node)
 
 	if (scandesc == NULL)
 	{
+		int 		 nkeys = 0;
+		ScanKey		 keys = NULL;
+
+		/*
+		 * Pull out scan key from qual list
+		 */
+		get_scankey_from_qual(node->ss.ps.plan->qual, &node->ss.ps,
+							&nkeys, &keys);
+
 		/*
 		 * We reach here if the scan is not parallel, or if we're executing a
 		 * scan that was intended to be parallel serially.
 		 */
 		scandesc = heap_beginscan(node->ss.ss_currentRelation,
 								  estate->es_snapshot,
-								  0, NULL);
+								  nkeys, keys);
+
 		node->ss.ss_currentScanDesc = scandesc;
 	}
 
@@ -317,14 +330,24 @@ ExecSeqScanInitializeDSM(SeqScanState *node,
 {
 	EState	   *estate = node->ss.ps.state;
 	ParallelHeapScanDesc pscan;
+	int					 nkeys = 0;
+	ScanKey				 keys = NULL;
+
+	/*
+	 * Pull out scan key from qual list
+	 */
+	get_scankey_from_qual(node->ss.ps.plan->qual, &node->ss.ps,
+						&nkeys, &keys);
+
 
 	pscan = shm_toc_allocate(pcxt->toc, node->pscan_len);
 	heap_parallelscan_initialize(pscan,
 								 node->ss.ss_currentRelation,
 								 estate->es_snapshot);
 	shm_toc_insert(pcxt->toc, node->ss.ps.plan->plan_node_id, pscan);
-	node->ss.ss_currentScanDesc =
-		heap_beginscan_parallel(node->ss.ss_currentRelation, pscan);
+	node->ss.ss_currentScanDesc = heap_beginscan_parallel(
+										node->ss.ss_currentRelation, pscan,
+										nkeys, keys);
 }
 
 /* ----------------------------------------------------------------
@@ -337,8 +360,124 @@ void
 ExecSeqScanInitializeWorker(SeqScanState *node, shm_toc *toc)
 {
 	ParallelHeapScanDesc pscan;
+	int					 nkeys = 0;
+	ScanKey				 keys = NULL;
 
 	pscan = shm_toc_lookup(toc, node->ss.ps.plan->plan_node_id);
-	node->ss.ss_currentScanDesc =
-		heap_beginscan_parallel(node->ss.ss_currentRelation, pscan);
+
+	/*
+	 * Pull out scan key from qual list
+	 */
+	get_scankey_from_qual(node->ss.ps.plan->qual, &node->ss.ps,
+						&nkeys, &keys);
+
+	node->ss.ss_currentScanDesc = heap_beginscan_parallel(
+										node->ss.ss_currentRelation, pscan,
+										nkeys, keys);
+}
+
+/*
+ * extract_var_and_const
+ *
+ * Get var and const expression from arg list.
+ */
+static bool
+extract_var_and_const(List *args, Var **var, Const **cons)
+{
+	Expr		*larg;
+	Expr		*rarg;
+
+	/* for POC only supports var op const so only 2 arguments */
+	if (list_length(args) != 2)
+		return false;
+
+	larg = linitial(args);
+	rarg = lsecond(args);
+
+	if (IsA(larg, Var) || IsA(larg, RelabelType))
+	{
+		if (IsA(larg, Var))
+			*var = (Var*)larg;
+		else if (IsA(((RelabelType*)larg)->arg, Var))
+			*var = (Var*)((RelabelType*)larg)->arg;
+		else
+			return false;
+
+		if (!IsA(rarg, Const))
+			return false;
+
+		*cons = (Const*)rarg;
+	}
+	else
+		return false;
+
+	return true;
+}
+
+/*
+ * get_scankey_from_qual
+ *
+ * Currently in POC we only supports the qual of type (var op const)
+ * so that we can use exising heap scan key pushdown framework.
+ */
+static void
+get_scankey_from_qual(List *qual, PlanState *ps, int *nkeys,
+					ScanKey *rs_keys)
+{
+	ListCell   		*l, *l1;
+	Expr 	   		*expr;
+	ListCell   		*prev = NULL;
+	ScanKey			key;
+	Var 			*var;
+	Const 			*con;
+
+	/*
+	 * Validate each qual whether this qual can be push down to heap node
+	 * or not, If this can be pushed down then create a ScanKey entry
+	 * and delete it from qual list of PlanState
+	 */
+	forboth(l, qual, l1, ps->qual)
+	{
+		expr = (Expr *) lfirst(l);
+
+		if (IsA(expr, OpExpr))
+		{
+			OpExpr		*opexpr = (OpExpr *) expr;
+
+			/* Extract the var and const expression from arg list. */
+			if (!extract_var_and_const(opexpr->args, &var, &con))
+			{
+				prev = l1;
+				continue;
+			}
+
+			/* We don't yet have memory for key */
+			if (*rs_keys == NULL)
+			{
+				*rs_keys = (ScanKey) palloc(sizeof(ScanKeyData) *
+										list_length(qual));
+				key = *rs_keys;
+			}
+
+			/* Create scan key entry */
+			ScanKeyInit(key,
+					var->varattno,
+					BTEqualStrategyNumber,
+					opexpr->opfuncid,
+					con->constvalue);
+
+			key->sk_collation = opexpr->inputcollid;
+
+			key++;
+			(*nkeys)++;
+
+			/* Only one qual in list so set it to NULL */
+			if (list_length(ps->qual) == 1)
+				ps->qual = NIL;
+			else
+				list_delete_cell(ps->qual, l1, prev);
+		}
+
+		prev = l1;
+	}
 }
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index b3a595c..7288555 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -130,7 +130,9 @@ extern HeapTuple heap_getnext(HeapScanDesc scan, ScanDirection direction);
 extern Size heap_parallelscan_estimate(Snapshot snapshot);
 extern void heap_parallelscan_initialize(ParallelHeapScanDesc target,
 							 Relation relation, Snapshot snapshot);
-extern HeapScanDesc heap_beginscan_parallel(Relation, ParallelHeapScanDesc);
+extern HeapScanDesc heap_beginscan_parallel(Relation relation,
+						ParallelHeapScanDesc parallel_scan, int nkeys,
+						ScanKey key);
 
 extern bool heap_fetch(Relation relation, Snapshot snapshot,
 		   HeapTuple tuple, Buffer *userbuf, bool keep_buf,
heap_scankey_pushdown.pngimage/png; name=heap_scankey_pushdown.pngDownload
�PNG


IHDRS�6sBIT|d�tEXtSoftwaregnome-screenshot��> IDATx���y\T������0�;(�(��@��Bj�dr5�L��f�-�L+��[Y?���L�i�7T�j��K������3l0,����f`��_/^p�s�s�s���<�9�0�L 11��������c�zt[��a��1$���T�!�B�S�B!��B�Y%&&�t��B����!�Bxt+�B!<
!�����B!<
!�����B!<
!�����B!<
!�����B!<
!�����B!<
!�����B!<
!���������	///�B8::b����}{��mT������h)�@���)�e�:�:��	iS			�����t��1��_?�����J�bw��e[�ne��������;/-��1c�U>�f��eJ�<�:��	ik�3upB:����c��}		i4/99+W������W^��<u����n�14C/S2�y��OH[�[	�h


�����������3��u+���`cc��� l��Uk��������kW���a���HKK����b��}�o���"}��!,--q��
��n��+++����5���g�������k���������5d2Y�}�g{��_��O>	GGG���c���HMM�zl4�-�J��������X�^�z����HJJBxx8lll����6�Vs?�����vmll0`������H�2������������@�Rd������t<���puu�P(���^{�5(�V#BZ��
�3Y�p!���?;p�S�T����_���7;x� +..f			����=z�_����k�={��A��7n0�R������!C��m��W�vz�����O?U������'�xB�z�1�V�X�6��Ml����]���6m��JJJ�B�`_|�<x���4�_�4�����K�.���b����������G���/�i�=���>���rH$6r�H�����6���y�Ow���BBB��s�XQQ;r�����������u�
<�}���L.�3�J�n���f������V#BZ�b4���l��U����I$��.]�N�8�����#���;���m��&O��O��`t-?r�Hv��1�e�T����l���j�F��~����1���?���6��	��_~�X]����c���Z��)�,))�����KNNVKspp�����t������FS�=��)cDD;|����={�h������:����`...�tK�!�A�1���jv��y�q�F6c�������r9c�1����������/?]����]]]Y^^��2i�������B!+((`�1VXX�����B���c����3����cL�R���@VQQ����7e�6{�l���_���t���m��`%%%�tuu���Z�|N���k��:��)���C�@+//O�2��}}��������li+tv����`K�,a/��c�1KKK���H$�����Q��B�Pke�0���`����1����4�\45��������3�KLLd�/�Z]�+))a~�!d���?�n���3O}�������+O}������>e���`���j�TVV��Mm_��WVV�������0�����B!-�l1:�����}��?�Z���L"�0�svv��\^����===[�b�c�V�b�=�c��^x�����z��c�����e���l������+Z��k{���u��;w�
8Pg�m��9���%�����>e���c2�L--;;[k��}]���	��e���7X~~>+//g��������L�8�v�jr^jj*���'��W�����8s�L�
[��I��s�N���c���7n\��&�J�z�jTVV6��[K����\�rE��Rs?'C��<t��O�>8{��Z��c��}]����l�2������8p������L���#>>�999��������J����v����u��c����o,  �%$$0�R������-[�����U������Xhh(���`J��m������_�l]�ta��'+..n�?c�������{���g��7\�a>�q-��m��I�c�i{#F�`���g
������/���Qg��4�UG+[���t���v���<���111�o��,55�����C����pfaaa���:��������R�d���l���l�������������tv�:q��0a���+
������=�<xPm���X��O~���'���o���c���?������T*e�G�f��]S[�����������5���~�4���p���0��-���ZM����?���C���
spp`O?�4�����_[��>���oi���y�O�����A��H$b���c�O�f��`��v��?�
2����0WWW�������\�r�Jfcc��cDHK���������+W����7M]�'s)����>������?��� ��X����+�J���7���3���M]B���;;;���b���������C!z�>�B�QB!�������H�B!�F�!���[	�B�Q`@!��B�Q`@!��B�Q`@!��B�Q`@!��B�Q`@!��B�Q`@!��B�Q`@H+����u�}kn�;���F�1���L��9����J�8p ���k�bBi���D���R999:t(z������w���K/��i��a����.!�=YDFF�����Z�z5�z�)���{pss���^~�e����8|�0��J��o�ooo���a��YP*����
8;;#66���pss���;���?���j-�����������X���'����_����5j�������'�|iii|Y<==������@�D"�T*,X�^^^��������R�4�������
�H�^�z��?������������������#�o��&�/�������ZJ�...��d8r�z��
�D�>}�h���lqq1&M��������z�OEE���q��q���B"� 44�O�nT�f���w�yG�q���pqqAqq������pww�\.���9W�j��M�u\���>KM��{���������B�D�B���-,,p���f_Btb�@PP;{����V�X�����d2���c,`�/���B6�|�������{����^~�eVPP����{��wo��,,,��	��;w�L&c���{��������c����R�d
��}���l���|[[[6}�tv��
�R���~�>��&��YQQ�������/��#
��O?��^�������U�X���cLg^���t�R��������������A�e�8q"��m�Z��m���	c����������r�}�v��_?���k�w�}���?�����j�uk�H�������WYqq1[�r%�~�E,`s�������S��o�����O�>����:W��wX����h;t}������,66�1����{�X,f{��e�1��O?���P��/!�A�1�H�


t.��wov��e~���+�{���4v��m�c��������E"�Z~��s�����4��G��/--U����?���={��w���yyy�g�����\��O��77��e���1���=�d��c���;v�ZZTT_�8;;����k�n}��
d�n��+���ujj*?�P(�X,��1������&N��*++�������!C��m��'�d			����\��:.��]����Dm �0M�4�-X��1��+����y��2h:��4� D"��������+//������P(�����*���`eee��R�d666��L&c�g�f>>>L,3�<�omm���H$����|���+/me���eJ��_�6HjJEE���{��!c������KVQQ�cl��5L,��������7*o}����j�_}���~�c��_��I$�F����X]]��_��c����,$$����:W������@�g��;�������c�����=z������NLLT����KHs�S	� �w�����K�r�����������:�P���1��i��A("!!yyy(--m��H$���H$(..��f0�������t���l���j�ZYYi�����O��;vbcc1m�4~������K�������*��?_c^�����@uuu���s�---5n����z���E����G�@�9s�`��5����s����
m�JCUUU���CMy��,�}'����l�?J��F��R��������:Tm=m����( !f��w�aS�Nm��o�>��3�������������+W������L�o��y��Z~"�������III��b�Wk����Pm�����������w��w���WW�\a���,++�u��M�I�>�R��R�����l��=���W��?�i?V��������'�����et����3+++v��1fcc��ry���:W�b1+**��/]��d��:.��M�g��;1u�T6i�$���/2�{�����������.!�A-� �~�m�9s���*<x���"���3g�Dtt4���i����B���������i�Z�]�@��>����Cnn.�l��g�y����[7<x%%%8r�>��cx{{���;M�7{�l|��W�w�����v�Z<���-*�����-**
k��Enn.�����>�H��8!!!�����/�ggg�����7x�`���O�����������1]�N�8�V�B^^~��'���G���W��]�g����[�z��^���������Y�0a����4���s%88�6mBii)n����K���I�9�
��,u}'�z�)����F��9{��ATT��e �YL������f���\\\�T*e�
b�w�V[�������K����yzz�7�|��T*~~�SR����������D"6y�d�����c���L,�����k����������WQQ��x�
����lmmYdd$KII���M}�j�t���l���l��I����������w3�D����M��{��<x������D"�N�8�1]�*�J6a�&�Y�>}������]��������woVRR��g��~��N�����\INNf}��a"��������T*����m�t}������L ���,�c���L �����]B�C�cF�F1�@:�����/��.\���+��Gs�&Bi�***����%K� &&���!�,P`@��lmm�����+Wb����.!f�n%B!�GO%B!�G�!�BxB!�G�!�BxV��+i��B!����B!���V!�BxB!�G�!�BxB!�G�!�BxB!�gv�All,,,�VAA���������(jM�5�B1Gf�<y���hjh��+W"""YYY�����������G!��#�
����X���y�������aoo��'��_���k!�b��r�C�@���@*�B.�C"�@�T���EEE���C!���NXZZ�������
�������m��RRRB����m����8::�W�^�.1 �����D"AYY���R�Tk��y��B!����~d2�!���[�n��;3ce�K@@������L���Ck��y�BBB�����W�^���l�"��v����Mv�&W�i1���F\\
���0~�x����B!����@�@��ox�������={K�,���k!�b�			|Pdd�)�b6�����_?��@1+��*\�����������G���!��k�uw���Y]wv*VB������<&��9�M}���{�p���t�B��[���LD_<�/1BiH[���B!<
!���4�B�MUY�����!3�w�E�+/BV����5u��P`@!�DeU5�#3G�,�we5�~��Y2���+��(0 ��f()�@��NxR���6F*����a^1������r���SW�g��a^	���':P`@!���8�xG�r��=����d����a^	��O^S���pWV�;9���/AeU�^�fme	/[������]]��������-6�r?J5�>���Bi��������W`��0�Y2���7�g����~n**[W�[YZ�������q�GW7��������N��7`l#RU�}�������^.w�'[�3_�����JwFZXZ��l?wtu�C;���U����t��B[���a�}�a�y������� ������;�R�z�j;��(0 �K����2s������u���@Og)�������w�CW7;��;��Ij�!���
���=�T�|������tm����C�!�hPU�pO^��H����+/��^aq�^����-|�������������l�U�r�a�/�������H$&*14
!�ZaI9W��T����w�[����,E�*����3�O�W;X[Ypoi�N������_��+W��wo|���`���8u�����X888hL'�t,����)j*��?P����J
��=�0kt,�����&��u��`���X�`�M�����c��98w�`���������;�q�F�^�+V���Nir��*~��?K�h�c|NRx9"��^�����������Z��X[�k�I�%`��G����A~~>D"T*�Tr����{��A`` �_���S�"%%Ec�>�����_?xzz��nb2�>����Y�����}@�7������X�z{�U��]�V�����(����oH"�B7���������F$l�N�������|#�����B���~-*WGr��Q:������D~"22��Ei;���0s�L���a�������t��u����Azz��tB����(-����]�u\�5P���J���^�_[����h�>��v�S��������l��<���+2�2FA��,�<��f�MH{�)Zn���a��A.����������'����������P(DUU����222���Z�o�F@@���[���PTZ�u�^>NX��r����)+�Y2e���S��z��������q���������f_7[�ok���������Z��X[��p�n���|�2�N-f�S�1��?��������3�y��!!! �HPVV�D���RH�R��
egg��H��%�������j���}v~	V����J��+���e����Q�.Ntq�����v�j�U�J �)i�6u��U����������������H��N�<y����X,��3�`�~^@@������L���CkzC�
j�����c@���e�����p��\���l�j�����swhW��uG�5u��u�� ((111�1cv�����`~^tt4����`����a���Z�	������|9�/f�D�m���V�sw�X�;���������}��;��s����k��-[���B���i�p��)DDD`��������z*���\E)�.e!15���HM�iV���vX2e0����������CGv�t�?a�d#D����4�@:E``l��(�@�����z�����!Ck�;��a�kci�|su�W�����6���@�@S��@���@i�a^	/e�-e���.�x,�}}�x�Q7�v1#��j�(SoO��
�j���V����N�~n1R����KY����u���v�������A`g~�{�'��Q
G�����'�������O#�o=
�;��2M��AP`@����)�����K�HH�D��|���{8 "�#�|��^������(���;l%�G�����������v�^:��X��1]�������x�0(0 �Ldd"1�S3u�����	�5��������%a�F���=[[d��>.��~���D�����\�������*�~>�.e"15	)w�����|PWgD���ca~��EWW;#��t(��@J2p1	�p����z]�n@����'����My8t IDATQ`@Hq�n.�R��R3qO^�u�^�.���Z�^�	�Nq
SN�W���oO����T�����J�����8e'm�B���w�j�<�+���@�����}}����0���+Sr���\����Z�(�^��#�@�����1�o>vo�J`���`�*�o����Q`@He�CY�}X`OG)$"�_3��+wd|g��KY���<���@�>���X�o�S�pu����&���'	.&��k��J-���y��D���x�}���{6�Z���X����\���/v�EUu������1���(o�@�������R���p�������d����8���6[�nEXXX��G
p�9�X��r���Fi����m��t)��R�yXZ���P_<��}�������
�����W�r��&N�@���``8�����4�1���[�>
��q�@�����s���e��Xo���:3v`PUU���0��� <<����;w�g����aa����P�!-$h�[_U5CJz6��tyE��~���#�����'@�R]�����=9p�w�@;'.<�s'sK(��fc������_���b����{�����/����8p v��
GGG;v���:�������/��O<����k�:u*<x������+..AAAGyy9<==�a�,Z����X�n����l�2TWWc��ux����v����K����=z`��
������?v���I�&�x������gD:���2��fr
4����X��h�w�0b������C���((�|�fei�G����{tpD_�I��Y�t@�qSNpSO%Z�H��B�s��D�������K�ez
�����*��������D�np���.#������/�������?�{���
6`������O0z�hl�����Gzz:��w���I��h�"|���|^?��3�N�
���Fqq1RRRp��M8p�����/����48po��6���+���Gxx8v���E����s�2e
���Z���V��b��+(-���|��5x{{����l/���������?��}y}��e�"��Py�]#�����������/Z�.������a�����A(�z]g��d��BMD����~=,��q����:t($����Z�]V��6P�����K�����hi9��*]]]q��exzz"''�
������U�T�D�:����r�
���p��}t���1���!))	~~~��e2\]]�<�r9\\\�R� �HPU3�T�>}0~�x��?�_n���Q�F���;->.t+���{�������������}l~#�h�j/���q�7���+��,������*(�*��W�LU�����R�e(��BAITU(.U�����*�si%e(.��
����[B$�����x�fT�!!] S�m����Bb��B�5/+��n��B��������B����?-q-			X�d	�]��R����NNN���VNN����f�����>C~���0������+0p�@x{{c��u������'d2Y�w��NTWWc����u+<<<��7�`����l���8u�����X888hLoHQR�s�CB����^8S�
K�Q��Diy%����D�RU�TUu��Rey�+*QP���U�EePU�V�*�*��6����5�app��N ���~�S�������z���#�{�J��E������O�����i+�������Fwp�����<<<p��U������>}:��]���h���������������t��
[�l�����m�6��9�����S����y3�R)���p��a,[��OsMV+W�DDDv����7b���X�b�������lu5W��R(���zj�����E�*~���2�~%K�TPUT��]Y/o��������Pg�n?,c���m�!��9��ame[�5�b!��,q�����zF���^�t\c������	/?��%p!���M����������C�~ff����{��i9Z{k�;m|���r���X�v-�y������o���G��={B�R��/���������G0[�l��������-������Q�����c��E�8q"�[>��������������a�899a���|g��?���{{{L�8S�N��+4�����B��b���z�+�u�waI9�k"eeY%�+�������*����J��L����_��m���F$�HhYSy[�V"�*o�%�j��b!l%��Z�Q*�Hh	�vkX[Y�A*���
��m�4{����)O�lqb&���y�;�/�>�o���(�f��\���k�i�'�ErO���4�����u������nz|��T�\5k�b�
������|}}�����v�Z�3��e���PRR�U�V��g��g�}����b��18g��iQ`�|�r,\��f����?6o�8}�4���}4Z����<|�����g�Q�[�n�8q"�|�M��K[pvv��M��p�Bx{{c��H�R��rH$(�Jxxx���HczC��>�In��wt7���m]E-������"k+���`'�Zh�b�%$"!WQ-ao#���
bk+8HE�Z������AS�3p����u��A�p��T@j2�p!�u`Z��=�n
��[w�gL��|��x�o�lhp�.�04�hei���X����v�2X�&L���31y���a��c��a����5k~��Gt��������w��GU��oh
����HKK���k�p�B:tPVV��{>\,��F4�7�R��c`(vkXZpMPWA��n��:�9HE�c���� ���;�	8H���-��%��Hh��[��B�V�:HE����vvl<�����[[��u3[��Mc��R��������"k��~�q���s�t\�NB\?(�����L�J�����wEU�pT�CU�00/�:�9c��>�5���lS�L�G}���T����:�.���[x��gZ��QZF����Xxx4�,{vv6�}�Y?~�M�������AAA(,,���/
��������d�H$())���
����^\�#�&k�(����Rk���O��5OK�-`U3J��R����B �T�~_Fr�^1�xE�2]������9�?�����R��i�|]$����;�uf��� RBT���7�s6^{@*;g���AA������RG��C�R!22��-������;w.N�8��!��J%��������(�� >>^�I�������6�����1+���@VV�������=zhMo�����z������m�j'��?� ������9H�_\�Q�F�d�����?�r��>����d��~�jPG�AO�����o<�� �G�6��6�W�~�p���V�ccc�����DF
j�����c������?��3� 11_}��{��6�6'M��m���G����/�nYDGG#..,@\\���5�����7	�V�����c��f��N:��,���}<?_Q�f�����u����;�!P��g{�98�@����
-j�2���!!!�������u��a���4in�����-))����q�������m���/����i�p��)DDD`��������P��LD��
�|]����b��b�e���\,��(����6�OBL"�:0o�*�����o�@��������d@�=nZ���
���������j�����'�������0�������������E��
hH��|H��Q�H������{��)S`gg����90�������^��'O"88��
B|]i�b~f��t���w}��GuW��>����wu�W�;�k�7��+�
�:a'w��K�oo�����]��T�������~�c���>ZN���c�������_�������o���F�n-{ov{���6�}S�Pz�21oy��C�������3���A�\e�W����~�J��*?�A$�*���������=Wo.p���e�l8�����s��"�S�<50X�l�
��s����_|���_o�"B�0��`@Y�x^i	�����Pi��I�"���7����v���k��]��f|crt�:
���`M��&�]�
����_�~�b@���4��������J8��k����_�*��Uw^�$�V��1j�ABB�L��\�s�&��cw���RNp��j{�OXY��+y�zW���5�W�S�}�R�����@��}�����x����v�Zt������������M���0g�$''�j\���6[�nEXXX���JLL�'"#��7q�y��!&&c������G�!&t7
H=�Op�%����T������I�Q��
�������W6���f�A&����?��Y�t�w���Vo���
�f�BLLL������o���Y�p���f�������ik���1b��'����������`�������?5
@�P`�������K�z�[Yc��W�5�(( m�R��@�
X
��j�M���c���j;v}���D"App0:���#33����?r�������
���q��Y>������^^^��c ..AAAGyy9����}�vxzz�����m���~���a����]�v!00b�}��A����������w7k��z+a��������h��VGC����^s[�$�[v_���{ t�r��a�o��=Q�0|<���Y<�	 ���8{D�r�'����%�\�0| ���������h ����g�a���|�����'�`�����};�������F�/^���P���������������HOO��C��x�bL�>?��3��_kkk#%%7o���0�|����HKK�����o�����+�����x���c��]X�h��)S� ..�&M����������������1����]�����\L��p��m*����7��x�����3X��o>v|n�2���S�(�5�1����O<�5k�������*�
"���F����.]���z�r�...����H$BUU���������Y @&������V�z*�
�UUU���������7z���[�0j�(��sG��b���;wRB���������p��P�������1�CH��B5!!K�,��k���95??����K��YYY���-�999�	WW�����Zm=kkk~�}��a��8p ����n�:DDD���L�u�2j`���B}1��ZN9Z^�%�!�##���\�@KB-�y��c�-+�_�s�]?����os�><��]����c�������@ ��]�/�suu���u>�`��u��-[�y�fl��
3g�Dfff��3j`�f��[����:�1 ��fr}.��@���.������=���{0�*�:�����������'T*���K���!--
]�tAZZ|}}amm���z
�|�	�/_�C�a����|Y��������it;@���c��E�8q"PUU7���������������(**��e������!U?H=�MkR�B�a����Q���8��
���~�G�v�Z�3��e���p��)DEE���999X�z5f��OOO��o������q���f��/���1k�,���c�������Oc��!���(���� �QYY��qP�2��:�V����-�rBw ��uA�1:��|XU3��e�I]ll,v���]�v,�	&`����<y����H�7����3d2�}�Y=z��!�m�>��8���a�����
B���k&$���A@Pk��)�����������V�w����u�<�L��3�]�t)���gc��	�����g����jMm%&&�f�R�uQPP�������S:t(bcc����1��&�w%�u��WG�^�j^�6��0� C���d>��E�B����%~��G��='N�hU_<�R�9s���lv���c ������b��}�}�6�'ett4�z�-��������q��q����w����#^}�Ul��
�+V����������
��O���%Rn8����e���ko<����_0��0��bp�������N�<�O>�D-0���7������@\�~S�NEJJ��t}P`��������i�`� �7��6~�\6��qj< 
!���|n���z�j�����O>QK�����3���Azz��t��1��r���}���4����%��t>�A\>�BH�c��������?���g�����G�����1l��6�nBB�����
��B,�#YiJo���CP�T�����OC�C�R���Z���*]��
X����"���5�5����V* @��������r}B��%���!������;E��YZZ�������=��������{�9��7O����j�*���M��H$(++�D"Aii)�R�����x��Fi��V����M�qo���P�[�������>P�q�6�r@���p/^���Vw�\k��T���l4��dB:z���50�|�2���OO�6
�����5����8x� ?-��� ++AAA���D�=@c:������n�MV���G*+j*�\ _��*�9�/S��su��R"	7b��������3�r�
+�����uB�80���+!�����6@ddd�n,,,(
�Y�l�m�#+6|Vtt4����`����a���Z�I����k�)/���=�������RUs�[S���*t�j^^w�_��veI�������wp�\�*~{����'i����}`��@Y)�����SP@����T��7��k���?��@ �����~�z������B�i�����S�����;`kk�1]��VBept'�f��������W��(��,b�BwrS�����*yGW������os�{������o��Z
1s�T���4�+S�
X5w�^ ��G_ ���k���iPT�ve�H�*��J�����k+yW���w6~���{���
u�$��9
��Q�$$$`��)���j�f��9���{������.
W���y.W����/����k��������W���4���U���+��Xw��f����.!���Q�y��!&&c��i��B�I���aSU�W��.�#���������/lp%�_��}���#�fc�U�vN��k@�R=���l����o|��'�BH�g�� 77#F�0����y`�����`�����>��h�`Y^�U�y��:���<Y�t~NMg�\���m�_�3��+�\�T{���}�=��W�r�����.�t�8{]���vB�`���`��������E�Z�r�v'�:��p�G�jY�����:��]��p�5=pR�YYs����Q�d�����;�&n����
����A`����7t��3G}��Q���$DGG���P-�C�1�����]��%��{��{���^��t���'�o��4Evx60�{L�5cB:
��Qo%,_�;w�4�>vN-
j+v�zz����������nn]�����#��n50HII1�>"�����UY�yb�� ��\���������	!��3F
��Y�u�����_7�>O��J����}�����=��?��9{B!DF�c������"T����������3 f
*�*����0X�K�;�BH;E}��Q[d2����I���()�����������BH�b�Z����Z/��d+7
	wS<@!���Qj��	4tc6�j(��Qm�d�bB!�f��j+�K�.�1���B!��Q������tNN�|��o����#akk���0��������DEE���QQQ��K��	!�sf�������
���P�����5k���v`������IDAT�93frrr��k�a��y���+W"""YYY���������B!����+>���8r���6�����0c��D"���������{c��=����1u�T���hL�G||<���OO���-B19z\���1S����]����322��kW��������B!���-�v��5�;{��EXX����������P(DUU�������PRR��VUU�B��!���R�DDD���N0����K?~<6m�� �HPVV�D���RH�R��
������\-����������K��!��g��5u��u������x����q�FDEE��@VV�������=zhMo����[SRR`cc��!��#���!���P�%K�������q�����F\\
���0~�x���B�9�},,,
�t��M���
���M��S��;v���Vc�>��BHgAO%��Nq+���Z�<{{{���oz�B!��S�J �B�~(0 �B�B!��(0 �B�B!��(0 �B�B!��(0 �B�*11�����4aQ!�bjVB!��J �B�B!��(0 �B�-


DEE�����E"�B�Z�\����Bdd$V�^m�"B!m�-������'���'N����j�"B!m�-222��kW������M\"B!�mY���YYY�b1@,C�T6Z����P(ji�1�;w���F)'!��c�.�ITVV�1f�b��@�D���2H$���B*�6Z����
/%%%�J�
��*j��P(`eeS������������bt���(++������bt�1�����E1���|���w������@���dee!((������G�e�J��>������i�b�+�����;���c���������J6��E1���\�v�S�{YY�9�)���2C��@���h���A�P ..���7u�!��6E��o��6��������g�b��%�.!����V��������L]B!�h(0h���MvT�<<<���l�b����;���L]����������aVVVpqq1u1L���"���� $`��	!�BjX%&&����&,
!�BL�Z!����!�����B!<
!����@O���������PXX���bcc�[����X��������o����#akk���0����F.������?�8���k����������`��A�����A�p��E#������M8���i������@VV"##�z��f/w��I���v�7��{,�e��w�����1c� ''������g�����������40���W_}e�����>{�l��?999�;w.���c���9~�I
F��n���c���k,44���}��g�1�:�a��X��������������1��B�`R��hel+��{@@���3f�����.�H������I$�������M8�����R)�r9$	�J%<<<PTT���A����=�:�����}���wc��
8r���J�6��w<���8}�4�����[�����%6}�}����8q"f�����8���������M8�����������P(lr�[}���_$}�E�����5w��]���c�b���3bI
O�}�9s&����G}���������{����������q��
�\.���+�����gO��x���M8��@O�eee���R��B�w���3��&���K�.a�����iS�
��=&��������0���s�,-��tf6ME����0��(R��QQ-�M$����v1�hQH-
KR�EAQA�5��!j���^��I��]u��������3�s`���33W�D�p�B577+��U�+��<xP�N�R:�Vkk��������U�V������7o�h�����n:�
3���������b�x��v��9�-N��?z�H�LF���df��x�����jhhPQQ���������l��jkk�H$�J��H$TSS������0�X��~��1544���z�;�<^g��o�.]��t:���O���r�;�x^g/++���W5::�x<�`08�� _W=N7�>}2UUU�����b1322�����r�:I9������)�f�:��89��x�"-O��?|��l����h4j��������:{__�)//7���7������/�N�����\|,�J�Xs�?J���
�7�1o%�`,���"�`,���"�`L�q~{o$��8�����w�����������������w����f���7O�PH�?>600�h4���"E"%���������Z�����y�������8����=#000 ��g�@&����v�����{���\�����m��������#3��������QSS��9b�>�
6������PSSS���j�=��7jhhH---*--���L��E�l�����+���+I������+�z{{��UVV�{��I�zzz���&�����uE��h4��[�j���
Y+--��'O���444���
�z�J��������$�u���c-[��>f�����������ZZZt��I�����=������g����Y�����C�{�L{�1<H$��m�T^^���T*%��/�q���r��W���G-Y��Suuu����$utt���n����w�����[�ni��=�����Ggg�~��ZII�._����A?~\���v�u]�R)cd���/_r��WSZZ����`0�t:��O�*�Ji��uc�._�\������Raa��~����l�G8���_��}�������7�~���UUU���s�����\������������Kg��Q2�T[[�B���.]:fX���Qss�jkks������^����-����n��m��]k���k������7�������bf��f������;�c�_^c�c������c�)((0�P�tww�x<nJJJr�c�����8���+g={�1��|��8�c��{7�G�L����5::�.�����~�z��0M��w&�����e�]�v-���F8c,>�,���"�`,���"�`,�������i���yl���R�.�
&VIEND�B`�
#2Robert Haas
robertmhaas@gmail.com
In reply to: Dilip Kumar (#1)
Re: Proposal: scan key push down to heap [WIP]

On Tue, Oct 11, 2016 at 4:57 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:

This patch will extract the expression from qual and prepare the scan
keys. Currently in POC version I have only supported "var OP const"
type of qual, because these type of quals can be pushed down using
existing framework.

Purpose of this work is to first implement the basic functionality and
analyze the results. If results are good then we can extend it for
other type of expressions.

However in future when we try to expand the support for complex
expressions, then we need to be very careful while selecting
pushable expression. It should not happen that we push something very
complex, which may cause contention with other write operation (as
HeapKeyTest is done under page lock).

I seriously doubt that this should EVER be supported for anything
other than "var op const", and even then only for very simple
operators. For example, texteq() is probably too complicated, because
it might de-TOAST, and that might involve doing a LOT of work while
holding the buffer content lock. But int4eq() would be OK.

Part of the trick if we want to make this work is going to be figuring
out how we'll identify which operators are safe.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Dilip Kumar
dilipbalaut@gmail.com
In reply to: Robert Haas (#2)
Re: Proposal: scan key push down to heap [WIP]

On Thu, Oct 13, 2016 at 6:05 AM, Robert Haas <robertmhaas@gmail.com> wrote:

I seriously doubt that this should EVER be supported for anything
other than "var op const", and even then only for very simple
operators.

Yes, with existing key push down infrastructure only "var op const",
but If we extend that I think we can cover many other simple
expressions, i.e

Unary Operator : ISNULL, NOTNULL
Other Simple expression : "Var op Const", "Var op Var", "Var op
SimpleExpr(Var, Const)" ..

We can not push down function expression because we can not guarantee
that they are safe, but can we pushdown inbuilt functions ? (I think
we can identify their safety based on their data type, but I am not
too sure about this point).

For example, texteq() is probably too complicated, because

it might de-TOAST, and that might involve doing a LOT of work while
holding the buffer content lock. But int4eq() would be OK.

Part of the trick if we want to make this work is going to be figuring
out how we'll identify which operators are safe.

Yes, I agree that's the difficult part. Can't we process full qual
list and decide decide the operator are safe or not based on their
datatype ?

What I mean to say is instead of checking safety of each operator like
texteq(), text_le()...
we can directly discard any operator involving such kind of data types.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dilip Kumar (#3)
Re: Proposal: scan key push down to heap [WIP]

Dilip Kumar <dilipbalaut@gmail.com> writes:

On Thu, Oct 13, 2016 at 6:05 AM, Robert Haas <robertmhaas@gmail.com> wrote:

I seriously doubt that this should EVER be supported for anything
other than "var op const", and even then only for very simple
operators.

Yes, with existing key push down infrastructure only "var op const",
but If we extend that I think we can cover many other simple
expressions, i.e

I think it is a mistake to let this idea drive any additional
complication of the ScanKey data structure. That will have negative
impacts on indexscan performance, not to mention require touching
quite a lot of unrelated code. And as Robert points out, we do not
want to execute anything expensive while holding the buffer LWLock.

Part of the trick if we want to make this work is going to be figuring
out how we'll identify which operators are safe.

Yes, I agree that's the difficult part. Can't we process full qual
list and decide decide the operator are safe or not based on their
datatype ?

Possibly restricting it to builtin, immutable functions on non-toastable
data types would do. Or for more safety, only allow pass-by-value data
types. But I have a feeling that there might still be counterexamples.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Andres Freund
andres@anarazel.de
In reply to: Dilip Kumar (#1)
Re: Proposal: scan key push down to heap [WIP]

Hi,

On 2016-10-11 17:27:56 +0530, Dilip Kumar wrote:

I would like to propose a patch for pushing down the scan key to heap.

I think that makes sense. Both because scankey eval is faster than
generic expression eval, and because it saves a lot of function calls in
heavily filtered cases.

However in future when we try to expand the support for complex
expressions, then we need to be very careful while selecting
pushable expression. It should not happen that we push something very
complex, which may cause contention with other write operation (as
HeapKeyTest is done under page lock).

I don't think it's a good idea to do this under the content lock in any
case. But luckily I don't think we have to do so at all.

Due to pagemode - which is used by pretty much everything iterating over
heaps, and definitely seqscans - the key test already happens without
the content lock held, in heapgettup_pagemode():
/*
* ----------------
* heapgettup_pagemode - fetch next heap tuple in page-at-a-time mode
*
* Same API as heapgettup, but used in page-at-a-time mode
*
* The internal logic is much the same as heapgettup's too, but there are some
* differences: *****we do not take the buffer content lock**** (that only needs to
* happen inside heapgetpage), and we iterate through just the tuples listed
* in rs_vistuples[] rather than all tuples on the page. Notice that
* lineindex is 0-based, where the corresponding loop variable lineoff in
* heapgettup is 1-based.
* ----------------
*/
static void
heapgettup_pagemode(HeapScanDesc scan,
ScanDirection dir,
int nkeys,
ScanKey key)
...
/*
* advance the scan until we find a qualifying tuple or run out of stuff
* to scan
*/
for (;;)
{
while (linesleft > 0)
{
lineoff = scan->rs_vistuples[lineindex];
lpp = PageGetItemId(dp, lineoff);
Assert(ItemIdIsNormal(lpp));
...
/*
* if current tuple qualifies, return it.
*/
if (key != NULL)
{
bool valid;

HeapKeyTest(tuple, RelationGetDescr(scan->rs_rd),
nkeys, key, valid);
if (valid)
{
scan->rs_cindex = lineindex;
return;
}
}

Instructions: (Cpu instructions measured with callgrind tool):

Note that callgrind's numbers aren't very meaningful in these days. CPU
pipelining and speculative execution/reordering makes them very
inaccurate.

Greetings,

Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Dilip Kumar
dilipbalaut@gmail.com
In reply to: Andres Freund (#5)
Re: Proposal: scan key push down to heap [WIP]

On Fri, Oct 14, 2016 at 11:54 AM, Andres Freund <andres@anarazel.de> wrote:

I don't think it's a good idea to do this under the content lock in any
case. But luckily I don't think we have to do so at all.

Due to pagemode - which is used by pretty much everything iterating over
heaps, and definitely seqscans - the key test already happens without
the content lock held, in heapgettup_pagemode():

Yes, you are right. Now after this clarification, it's clear that
even we push down the key we are not evaluating it under buffer
content lock.

As of now, I have done my further analysis by keeping in mind only
pushing 'var op const'. Below are my observations.

#1. If we process the qual in executor, all temp memory allocation are
under "per_tuple_context" which get reset after each tuple process.
But, on other hand if we do that in heap, then that will be under
"per_query_context". This restrict us to pushdown any qual which need
to do memory allocations like "toastable" field.

Is there any other way to handle this ?

#2. Currently quals are ordered based on cost (refer
order_qual_clauses), But once we pushdown some of the quals, then
those quals will always be executed first. Can this create problem ?

consider below example..

create or replace function f1(anyelement) returns bool as
$$
begin
raise notice '%', $1;
return true;
end;
$$ LANGUAGE plpgsql cost 0.01;

select * from t3 where a>1 and f1(b);

In this case "f1(b)" will always be executed as first qual (cost is
set very low by user) hence it will raise notice for all the tuple.
But if we pushdown "a>1" qual to heap then only qualified tuple will
be passed to "f1(b)".

Is it behaviour change ?

I know that, it can also impact the performance, because when user has
given some function with very low cost then that should be executed
first as it may discard most of the tuple.

One solution to this can be..
1. Only pushdown if either all quals are of same cost.
2. Pushdown all quals until we find first non pushable qual (this way
we can maintain the same qual execution order what is there in
existing system).

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Dilip Kumar (#6)
Re: Proposal: scan key push down to heap [WIP]

Dilip Kumar wrote:

#2. Currently quals are ordered based on cost (refer
order_qual_clauses), But once we pushdown some of the quals, then
those quals will always be executed first. Can this create problem ?

We don't promise order of execution (which is why we can afford to sort
on cost), but I think it makes sense to keep a rough ordering based on
cost, and not let this push-down affect those ordering decisions too
much.

I think it's fine to push-down quals that are within the same order of
magnitude of the cost of a pushable condition, while keeping any other
much-costlier qual (which could otherwise be pushed down) together with
non-pushable conditions; this would sort-of guarantee within-order-of-
magnitude order of execution of quals.

Hopefully an example clarifies what I mean. Let's suppose you have
three quals, where qual2 is non-pushable but 1 and 3 are. cost(1)=10,
cost(2)=11, cost(3)=12. Currently, they are executed in that order.

If you were to compare costs in the straightforward way, you would push
only 1 (because 3 is costlier than 2 which is not push-down-able). With
fuzzy comparisons, you'd push both 1 and 3, because cost of 3 is close
enough to that of qual 2.

But if cost(3)=100 then only push qual 1, and let qual 3 be evaluated
together with 2 outside the scan node.

BTW, should we cost push-down-able quals differently, say discount some
fraction of the cost, to reflect the fact that they are cheaper to run?
However, since the decision of which ones to push down depends on the
cost, and the cost would depend on which ones we push down, it looks
rather messy.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#7)
Re: Proposal: scan key push down to heap [WIP]

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

BTW, should we cost push-down-able quals differently, say discount some
fraction of the cost, to reflect the fact that they are cheaper to run?
However, since the decision of which ones to push down depends on the
cost, and the cost would depend on which ones we push down, it looks
rather messy.

I don't think our cost model is anywhere near refined enough to make it
worth trying to distinguish that. (Frankly, I'm pretty skeptical of this
entire patch being worth the trouble...)

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Dilip Kumar
dilipbalaut@gmail.com
In reply to: Alvaro Herrera (#7)
Re: Proposal: scan key push down to heap [WIP]

On Tue, Oct 25, 2016 at 10:35 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

We don't promise order of execution (which is why we can afford to sort
on cost), but I think it makes sense to keep a rough ordering based on
cost, and not let this push-down affect those ordering decisions too
much.

Ok, Thanks for clarification.

I think it's fine to push-down quals that are within the same order of
magnitude of the cost of a pushable condition, while keeping any other
much-costlier qual (which could otherwise be pushed down) together with
non-pushable conditions; this would sort-of guarantee within-order-of-
magnitude order of execution of quals.

Hopefully an example clarifies what I mean. Let's suppose you have
three quals, where qual2 is non-pushable but 1 and 3 are. cost(1)=10,
cost(2)=11, cost(3)=12. Currently, they are executed in that order.

If you were to compare costs in the straightforward way, you would push
only 1 (because 3 is costlier than 2 which is not push-down-able). With
fuzzy comparisons, you'd push both 1 and 3, because cost of 3 is close
enough to that of qual 2.

But if cost(3)=100 then only push qual 1, and let qual 3 be evaluated
together with 2 outside the scan node.

After putting more thought on this, IMHO it need not to be so
complicated. Currently we are talking about pushing only "var op
const", and cost of all such functions are very low and fixed "1".

Do we really need to take care of any user defined function which is
declared with very low cost ?
Because while building index conditions also we don't take care of
such things. Index conditions will always we evaluated first then only
filter will be applied.

Am I missing something ?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#8)
Re: Proposal: scan key push down to heap [WIP]

On 2016-10-25 13:18:47 -0400, Tom Lane wrote:

(Frankly, I'm pretty skeptical of this entire patch being worth the
trouble...)

The gains are quite noticeable in some cases. So if we can make it work
without noticeable downsides...

What I'm worried about though is that this, afaics, will quite
noticeably *increase* total cost in cases with a noticeable number of
columns and a not that selective qual. The reason for that being that
HeapKeyTest() uses heap_getattr(), whereas upper layers use
slot_getattr(). The latter "caches" repeated deforms, the former
doesn't... That'll lead to deforming being essentially done twice, and
it's quite often already a major cost of query processing.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Amit Kapila
amit.kapila16@gmail.com
In reply to: Andres Freund (#10)
Re: Proposal: scan key push down to heap [WIP]

On Wed, Oct 26, 2016 at 12:01 PM, Andres Freund <andres@anarazel.de> wrote:

On 2016-10-25 13:18:47 -0400, Tom Lane wrote:

(Frankly, I'm pretty skeptical of this entire patch being worth the
trouble...)

The gains are quite noticeable in some cases. So if we can make it work
without noticeable downsides...

+1.

What I'm worried about though is that this, afaics, will quite
noticeably *increase* total cost in cases with a noticeable number of
columns and a not that selective qual. The reason for that being that
HeapKeyTest() uses heap_getattr(), whereas upper layers use
slot_getattr(). The latter "caches" repeated deforms, the former
doesn't... That'll lead to deforming being essentially done twice, and
it's quite often already a major cost of query processing.

heap_getattr() also has some caching mechanism to cache the tuple
offset , however it might not be as good as slot_getattr(). I think
if we decide to form the scan key from a qual only when qual refers to
fixed length column and that column is before any varlen column, the
increased cost will be alleviated. Do you have any other idea to
alleviate such cost?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Andres Freund
andres@anarazel.de
In reply to: Amit Kapila (#11)
Re: Proposal: scan key push down to heap [WIP]

On 2016-10-28 11:23:22 +0530, Amit Kapila wrote:

On Wed, Oct 26, 2016 at 12:01 PM, Andres Freund <andres@anarazel.de> wrote:

What I'm worried about though is that this, afaics, will quite
noticeably *increase* total cost in cases with a noticeable number of
columns and a not that selective qual. The reason for that being that
HeapKeyTest() uses heap_getattr(), whereas upper layers use
slot_getattr(). The latter "caches" repeated deforms, the former
doesn't... That'll lead to deforming being essentially done twice, and
it's quite often already a major cost of query processing.

heap_getattr() also has some caching mechanism to cache the tuple
offset , however it might not be as good as slot_getattr().

It's most definitely not as good. In fact, my measurements show it to be
a net negative in a number of cases.

I think if we decide to form the scan key from a qual only when qual
refers to fixed length column and that column is before any varlen
column, the increased cost will be alleviated. Do you have any other
idea to alleviate such cost?

Well, that'll also make the feature not particularly useful :(. My
suspicion is that the way to suceed here isn't to rely more on testing
as part of the scan, but create a more general fastpath for qual
evaluation, which atm is a *LOT* more heavyweight than what
HeapKeyTest() does. But maybe I'm biased since I'm working on the
latter...

Andres

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13Amit Kapila
amit.kapila16@gmail.com
In reply to: Andres Freund (#12)
Re: Proposal: scan key push down to heap [WIP]

On Fri, Oct 28, 2016 at 12:16 PM, Andres Freund <andres@anarazel.de> wrote:

On 2016-10-28 11:23:22 +0530, Amit Kapila wrote:

I think if we decide to form the scan key from a qual only when qual
refers to fixed length column and that column is before any varlen
column, the increased cost will be alleviated. Do you have any other
idea to alleviate such cost?

Well, that'll also make the feature not particularly useful :(.

Yeah, the number of cases it can benefit will certainly reduce, but
still it can be a win wherever it can be used which is not bad.
Another thing that can be considered here is to evaluate if we can use
selectivity as a measure to decide if we can push the quals down. If
the quals are selective, then I think the non-effective caching impact
can be negated and we can in turn see benefits. For example, we can
consider a table with 20 to 40 columns having large number of rows and
try to use different columns in quals and then test it for different
selectivity to see how the performance varies with this patch.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14Dilip Kumar
dilipbalaut@gmail.com
In reply to: Andres Freund (#10)
1 attachment(s)
Re: Proposal: scan key push down to heap [WIP]

On Wed, Oct 26, 2016 at 12:01 PM, Andres Freund <andres@anarazel.de> wrote:

The gains are quite noticeable in some cases. So if we can make it work
without noticeable downsides...

What I'm worried about though is that this, afaics, will quite
noticeably *increase* total cost in cases with a noticeable number of
columns and a not that selective qual. The reason for that being that
HeapKeyTest() uses heap_getattr(), whereas upper layers use
slot_getattr(). The latter "caches" repeated deforms, the former
doesn't... That'll lead to deforming being essentially done twice, and
it's quite often already a major cost of query processing.

What about putting slot reference inside HeapScanDesc ?. I know it
will make ,heap layer use executor structure but just a thought.

I have quickly hacked this way where we use slot reference in
HeapScanDesc and directly use
slot_getattr inside HeapKeyTest (only if we have valid slot otherwise
use _heap_getattr) and measure the worst case performance (what you
have mentioned above.)

My Test: (21 column table with varchar in beginning + qual is on last
few column + varying selectivity )

postgres=# \d test
Table "public.test"
Column | Type | Modifiers
--------+-------------------+-----------
f1 | integer |
f2 | character varying |
f3 | integer |
f4 | integer |
f5 | integer |
f6 | integer |
f7 | integer |
f8 | integer |
f9 | integer |
f10 | integer |
f11 | integer |
f12 | integer |
f13 | integer |
f14 | integer |
f15 | integer |
f16 | integer |
f17 | integer |
f18 | integer |
f19 | integer |
f20 | integer |
f21 | integer |

tuple count : 10000000 (10 Million)
explain analyze select * from test where f21< $1 and f20 < $1 and f19
< $1 and f15 < $1 and f10 < $1; ($1 vary from 1Million to 1Million).

Target code base:
-----------------------
1. Head
2. Heap_scankey_pushdown_v1
3. My hack for keeping slot reference in HeapScanDesc
(v1+use_slot_in_HeapKeyTest)

Result:
Selectivity Head scan_key_pushdown_v1 v1+use_slot_in_HeapKeyTest
0.1 3880 2980 2747
0.2 4041 3187 2914
0.5 5051 4921 3626
0.8 5378 7296 3879
1.0 6161 8525 4575

Performance graph is attached in the mail..

Observation:
----------------
1. Heap_scankey_pushdown_v1, start degrading after very high
selectivity (this behaviour is only visible if table have 20 or more
columns, I tested with 10 columns but with that I did not see any
regression in v1).

2. (v1+use_slot_in_HeapKeyTest) is always winner, even at very high selectivity.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachments:

heap_scan_key_pushdown.pngimage/png; name=heap_scan_key_pushdown.pngDownload
�PNG


IHDR7�k��sBIT|d�tEXtSoftwaregnome-screenshot��> IDATx���y\�u���p��7� *����H�e�[Y�i�n[{��vZ���nV�����3[M�����<�<�V�AD��y�����s���z>���|��g���|?��G%""""";���
 """jI7DDDdWn�����0��]a�!"""��pCDDDv�������
�
��"""�+N��"""�������L�����������"""�+7DDDdWn�����0��]a�!"""��pCDDDv�������
�
��"""�+7DDDdWn�����0��]a�!""�J�jVyK����8�w���z��
�Z���6�Is���F-"�=��!"�@2sK��w�EnqY��?;;������QTT��m����Z���X�����}RSS��_?b��
��6�9%''7z1��������o���z�*�
�y��������zEEE�;w.���?�y��0y�d���������t�S			�^*��n������xq��6=��w���D��S��"  'O�DHHrrr0l�0\�p�h���J����	

BJJ
q��e�����z�5�����D���!�F���P�T*#;;�c�2^^^P��F�������Z�O7D��X�.�P�R]��zg'�x������n.��j#���S���2*�6m�~�mL�4	*������������W�}N��b�!"���?L������ i�)��]���n�d��
l���y��x���1w�\�\�+V����[��={���|�q��y���`��!����0g������V�pCD����8����^�&�Jt�z�����-����O>����o���o���7>��3�T*���Z�o��&|�A����X�d	��� "9�j8����9u1q�}f�VeU
�}�(�L����
�>-EDDDv�������
�
��"""�+7DDDdWn�����0��]a�!"""��������JNNnT�����������"""�+7DDDdWn�����0��]a�!"""��pCDDDv�����#}{��E�T��n�u����������`?��~>	���p20>x�~@_����j���8~�8�iN_�"����>��{��J��`�c�
�rZ�v-�w�777����hZ���<�}��pww�����?�m���}�B�V#66?��#���X�l����+V4�=����1c�x�
@vv6��$&&���������t:���eee���Aaaa��������V�^
ooot��
'O���g��{s������/G@@BCC�r�J@jj*������@l�����YYYHLL���F���/����pwwGII	���>>>(--������*��Y�7++(((���_�5,X`v�G����U��h
�{�)2�_��N���BE���Hq�����2jB@@�8p@*++e��2p�@C���3��^������k��������^t:�,Y�D���DDD������<�����hd�����kW�����RAA��=Z>��C��)S��������;w���O�'�|R>��S���������z���d��y��h��W^������
����5rrr���{N���e��5��K���-����h4y��W��7}�t�;w�h4��_�*>�����=ZV�^-""III���$�V��U�V������������M�DDd��-��������-X���+m{��7}�����'N���t���h���`�8qAAA&WYY	WWW��
�J���\���j�55�o��T*���a���x��g0k�,C]@@N�<������`��a�p�.^�����'N��J��V�E�=�o�>t�����������\���yyyP�TF�h�7w�T*���������j������AAAHIIA`` ._����p�����p��#''�����������wc���x��������+1}�t�z��������;�|������^�[o��V��^{�����x[����}����0t�P���c�������B���5�g��2d����V��\\\��[7��W��233iT��h
�J���`dgg�t��~����o�|��G;v��`S'00����Fcq{s����NNN��[TT___���N~~�����yyy���'c��M(--������k����V���?����'���;a��-���&L��OG��""[��+`�b�����r����<�]xx[w�g>b�6*|��Wx������@�a8p�"""��


��o��I�&A�R�������T�)*�
Daa!~�al��
�z�t���N����W���9�;��)))���/6o��>}�X<��+W���\������ptt�N����3�^�����F�n�������8u�
��n���>|AAA������q��%���c1d�>|�6m��1c0t�P<x�7o���+"����?��	&���C�1E��sc?7���n&�~^d|�2���)"����:��a�d��URYY)�w����0C���>*�g����BY�j������26������<�?���s�D���Sn�a�����\�zUDD~�������E�����.��v���c����#G��X�R���W_5���4i��������u�D���3�<#NNNV]#S�w�������KII���o�Je�f���2o�<)--���^y�C���~*�����'�����~(����x�b���d��Y2n�8�5k�u����'!"�������`�_w��t����i����S\\\�G��a�C]^^��q����*}����{���������W�v�*�7o����o���."7nDD��]+#G����2����	&�������[v��i������������R�d������#111���*""K�.���	�e��������z�����=v�����K�����o��Zm8^VV�$$$�����3F�����������\�r����������=�%����n�:�~��:�;v4z��QG%���Li������W��w���}�����n���""������w�^<��X�r%�����~9�w����bn���	

��o�ilZ�<����b�
��sCDDDv�������
�
��"""�+7DDDdWn�����0��]a�!"""��pCDDDv�������
�
��"""�+\����:����Fe\�����
oK�]a�!�J������}�v$$$ 44�����������+V�w���N�.]B�>}��������������x��wPSS������yzz�O�>x��'q���k�]�)�7E_7��Hk�S&%%����������������k�������c���#..}��u���������17�al���>�,���`������@NN�n����z
�����o~���l$55w�q���?��^h��]{����W�^������s���w�m�6�S�>�F�AJJ
V�^���c����2eJ����U�d�n�N�HwP9���v������7�wU��;����[�\�����Vmg�8�=�i��������q���
��J����b��o6!K�l��Jtt����4Y�k�.������E�]{����'��;����*))oo�69����{�n����3g��H����Z�T�����_sD.�����t���T*������"qpp�K�.����+"��v��7|�f����WWW����;v���������owww=z�Z�~S�kx�K�.�������Sn��VIOO�a��	 ���3{����t:�������?$�����W_Yl�5���5��u����G����g���i�&���OOOY�j�t��I������M{���2`�9u����Htt�����S��J�n�������������9�����7������K�J�~�D�VK���e���F����/2u�T���'''����'�xB���ED���B<==e���2p�@qqq���p���O�����Orss��<_|����������q�r��9��XQQ!!!!�k�.��<x�;v��y^�������)|�A�����999���,�O�6�������,999V��Z.\���P��][���*���ooo������G]'''�����o�����"�����'O�<��������������/Z�H�u�&���#�V�2����Ipp��}ff��.�"##���3���lS��"�{e��8��*rz�H�f����N/�6m�-[�H||�Q�Z2������RV�X!�)S���O?-%%%2w�\����d��3u����>}���;W4�����|��g0���z�8;;���>+�FV�^-]�v�zs���A\\�|������d��%%""2o�<�h4��+���Ww���$ILLl�3����|��F?�����SO=�d��������5�Q7$"���o��V���d��MRZZ*;v����;���[
��r�-��H^^�TVV��s�d��Y��#��qpp�A��������T>,�z����W�l��9sd��A�q�F���4�����e��ar��Y)++�O>�D�n���6��jILL����Kii�����2z�h����uZ�|������E�����u�4i�,X��h��?�X���^���PQQ����KF�i����p3t�P����E���F����-��b�wqq�1c��������X���'����n_C)))-"����X9p��h�Z9v�����O��_o��s���@�z�jqpp��+W����3g$22RD�����r5���/����������������o�)/���Q�~�~��_��n�������������^x��{R���_���ED����Vs�kx���P�r����aaa&�jJ���BEE�888X���Wk�AEE�a?�}rrr������.��={Jlll�?�L�/==]���D����Hii����IFFF���\�M}}���nHD��5��KLL4��YD��������������}��]��h����^F�i�������oJXX���j5j������k�.��e��mV}nSm ;w�4����YuK����{���*�{��K�.5�����n�:2d�Q�=��#�~����_�����[o�U6l�`�]S�6���e�m3u��~���>//O���,�����rqss��������7n�(�����g��!������<��S2t�Py��'ED���1c������lS�n���/��s���L�<Y���o�����IKK����w$���dqvv6�����j��3u���tqq���jQ~����X��>���g�����]�����������S��M�g���ggg�c���������W^�����V2w�i����_-""o���������qL}}��n����D���ooo�}�:������M���i��!//�h���|�����z9|��,Z�H�O�.���2t�P����y����D�����^����{�n9z��>���bC��cEE����Kff���TVVJ����������aM�y��d��Y����JZZZ��_{O�9�7���h���SDD|||��(,,4
%�/�I�&���������a�DD������>��p���z���oE.Lo�k�����[R�ER�6s��"�����z����_���$>>��{���C��+W��N����e��s��""!���._S�kx���p�z���(=���?���"���!!!�|�r�h4RRRb�����LNN��<���e����?\������$���b4����[������T�Z��}wS{������c��
S�N�����K������O���E�R5��P��F���J����UUU%�=����9SD�� ,�Km���SHVV������e������5���'�����""���,�>��U���5�O������*��w2v�X9{�l��������K�>}DD�F555F�������hx������V+j�Z._�,j�Z�����������z�������9;ND���7k�,7n���5�Q���u�w4Y.����F#�<��899��
&�V����J��{����~�;�?�h�Z�������n��������3g��y��Iii�<���F����r��9���0{��7������������<�?���s�D�R����jsS�M�s��%��� ������1cd���&��;�������X������%X�K������
���K/�$g�����B���0�?\w�k�*���5xm���:����V�ED���R���6�H��Vrr��9���Zs8 ���R^^.#F�0��Xs���vZ��������:tH�j�\7n���?�3�<#""����z���������,$$D.\hE}����>���������b�����[���u��	Y�n����n��K�J@@���e�������i�&�������H�=�Lsrrd��	���)�{�6��h���5lkVV�$$$�����3�����G�JLL��=O[�s��������+]�v���7����7qww�J%k��������T��\�v��u�]���a��o��Q�p2�vs�6�����7�|wSK����w��h������������o�i������o�MBB���0@�����������OD��nn��-nD��F�2t�ZsEDz��%����Q����_�S�NRTTdx�k�.����������s�H���������0����n�o���h��������)S�������G�G}T�"S�N���v��D���w+�����U�Vn�v$v��M7��_J����$::Zv��!eee���%/����}���m���/�-���2����E���w�)]�v������r?~��a�s�N�j�r��a��������d~������������-���r��UY�j�DEE�m��Q���'���RVV&�}��4��X�����FD������n�M�Z�U�QDyjE�V��l����QD��h�"�j�r��q;v�a o�[o�U����F����rY�p�Q�h�pSVV&��9s�����,Y��P�n�:�������KYY�<xPbcc
D�����,^�XDD>��S`���]���������*��s�_�����$"������$}��ggg��������hn����������]��G���|y��7���]DDT*����G�"���._|���v���K������0���1��9�,]�TBCC���C��c�MkM[#����X�B��'������rK����X�C����/nnn��o_����$00�h����e�����.���2a��G9�7�4|999IDD�L�6M����h��?�X��������N�<)�������h������[c��E���j��6<gKsqq����~.�"�
N��F���Y,Y�o����;��M!"�0:�JhD7����c�����"��g����J,^������e���9DD6+99�QoKQ��m)�\]]��K����oo��u(7DDDdW8������
�
��"""�+7DDDdWn�����0��]a�!"""��pCDDDv�������
�
��"""�+7DDDdWn�����0��]qj�]����Fe*�vhQ��m)"""�+7DDDdWn�����0��]a�!"""��pCDDDv�f��������y���h4����"�?���?~<�����[�#"""�e3����3F����/c����?�a�{��70j�(dff"!!o����rKuDDDd�lf?\�x���8r�f�����@\\��_�������3�2e
�;f���>DDDd�l��T*���
�������tDDD:w����4��������~���R#F��{�����~�~�)���
u:�nnn��SVVf��R]C������FeNN6sY���:���8::���l�����1e��!�� IDAT����x������b�S����tP��(//�����rKu
?~�Q�������MujuHz�]�vE�>}���6nz����G�<�E���������=z ##111f�-�5t�w���t��e�������DDD7���,�����9m�{b���������",[��'O6�M�4	k���F���5k0q�D��������~�L�y��7��SO!<<���F�n?���������sg����x�����[�#"""�e3������R&Lh��uxu�����[�������lf�
Qs%$$4*���RDDDD-������Z��������6='oKQ�xw:}�%:�����69-{n�����}�W��/��/�;�����oA7DDD����	��������`��V?=�
������ohb�]��`����"""�qYi���,}M�:Q�����jS8��������8���\������Jz��Vm�
YV]	��_fvg��oz[G'@��������	�������T�"""j��%���IlJTo`P��;��W��=�+��
�:�������6y������W� S�*�jz[�`%�JT^�!��C�/}y�7p���yx��Vm~�""��UE��#sh��Cs!����j��H`pm���P���WQ0{!�/�A����h��a�!""�Y�^+sh�2v��~e,MST@�~�=3�qC�f�R7�6�F�A����Yn����������m��M%���
���A����������n�����VI�7��fz[�N��[���D��];[����c�����GJJ
�������b�����"L�6
{����#���ooo����!""�+5�����fR?�����kp}����u`�����T"bf��3p�@��=S�N��+������C��s����O<�-Z�F��,7��%:�[�l��	Z��]�������m���@�������?�?
p�j�V����������6;���www���������AYY ..���G���q��L�2��3YnnKn�������j:�
��2��O0`T�SM�Am9����76�5j�(,]�3f���5k���h�KOOGD������3�����[�#""�i�:��>�`�@�_N��
��
�3�v p"�Wy��&f3=7g�����#���������O���'������P�T���pvvFMM��rs�\���chx	�z=._��~����'""�K�Y������}p={�J]���T���]�������@�u<��V���PUU�A���9m���O�^�uL�>III����;v�j5t:�j5������a��R]C�������p]�)**j�OJDD8������:{�g����#�U��(�9%=���`T{��W������j�pqqi�s�L���{76l�777L�>�g�6�EGG#33=z�@FFbbb��[�kh��QF���������ZT�8��v��v ���m���L�O59w��/�6kl��s��l&������/�����r�J����&M��5k�`���X�f
&N�h��RQ������0sx���vuU��:8=�?�7pj��{b3cn:��{�������g�}f�?��h0u�T����F����+���i���>��i)""�nYiJ�9�
8�(-6�mh���
�o�:b��e7����������VR���C��[MW.����G�gf`m�LXT����������J ��2������c�gvrz����3^��D����������J�9�8��+3�m���0��W��`js7DDD
\Uf>X��d~��m}��e
�VM@X[���`�!"��[E9p|O������S�Q]�@��������m�K1���E�����a&����AST�rua��0e��i7DDd��f*A����Sq��m�"� 38QY��'���I-�������V��
��v��y���{�>�=Z�������I������:��j����0�zH)k���s�d'��(e�!%''7*�W���l���<<�0���g��_���N�������� 3��V�G�k6�����Fe7DD���m^���w�T��q��6Ey�����hor.�>^'?``�f&!���~�)7DD����^~H	6������%��V;x;p����SS�]�'���6��_y��nJ7DD�~������r��)��S����������AC�""j/9��9���*����_�`�h�/�U�G�
�� ��L����7q2��3@T��iux7DD��2�K(���|��n�Oo���\M��������N�%�?o?�4=�O �k�z^������f���T�F�:EEE?~<���1~�x�-�TGDDm��*����������W50�/��������0�N�_
8��k��c��p#"����)S�G����5j233������z�l��:""j#%E�������T��\�{�x`���������W��'�n*S������4�w�FXX ..���G���q��L�2��3YnnKt:�l��	&���$"�k�Z`�G����=��N����#/4=�����|�^m�VjUYYY���A|||���������{�9s�!�@zz:"""�;wFZZ��rKuDD�J*��
_+�Qf��R�~
�f����!]Z��d�l��F���o��8|�0|||��������J��^����3jjjL����Z;w��^_?���@��"2�SuYK�������y)��r��4����]��<��m���Z-\\\0h��6;�M��l����z�Q��Z
�N�Z���rxxx�-�T����7�;�^�V��
DD���c����c8]�0���9E����{<\��O�TUY9Qc��p��������kT���L������1[n���������t�|�2{n�����|���)����G��uP"8�09::"''�M�i3OK��#G��o�F��&M��5k��h�f�L�8�l��:""�����8�D�^]|�MY"�������N�:�������4*�h4�:u*����Q�Fa������4YnnK���	)���{
8��qyX0�E`�\��i��������T��7DD�8�����M���a����;gN���6�y|���lG�9`��@�z�������,'����Mb�!""cW3�/���2^�����	��?���@%�7DD�(��L�����$J��������~�#��
����HZ���TQ^_��L������C��q0�������Ok?jz���/����}D������fSQ|�D�U�__�R	�(�?u�R	�q1��,��������r/��r'0��@L��T�:�""{'z`��	������
xt>7�}�F�
n�����{���@�i���A�o_�2	d�n�������R	g�G�Vn?��KcCd�n�����}J�9���<<x���D7�""{p�8��?L��4?Cy�����>w%Ng�AU�3XSS}��.��{jBo���h89�N���t"��,������o���	�_������G7�-G�M��������s�`Duj�60�uDW2�����nb��?+k@q�'jN�*T���m��=[�
�zt""jY�W���6,5^���]Y�i�_/�vk�-`�!"�4@���z��D�M�w������Fe6n�z=f���/��������������PTT�i��a��=1b������m���>DDJ��Ok>���rG'`�4�	����k���jq�B.R.��dz.N��!%=5��M7			��T"m�����'� --
/��"6o����{{����;>>>x��'�h�"h4,X��d��},��t��e&L���������r������]���C��O/q�'jE�:�\��
0�HI������[���0O�}f�
(��p3j�(|��G�����&qqqX�~=�w��3g�`��)8v���rs�X�pCD��������@^�?��e�[��i����*�����98u1�/����<d�j,�\���eUoM�v����R'O���S�0v�X���a���������#"B�v���3�����[�#"�9���V���N7����*����TU�q�R>R.������+�2i�E�[�������H�u	@�� �u	@����
���]oZ|Z���L��h48~�8��?���~s����?�@�MqsS�ipssCYY��rKu
m��	z��Q��]�Z���"���}��}9�s.U���@��GP=(��s��A/��B�r���[�_�j��W���r����
�U#:�Q����@T�;"��pr�[�������W�u�����j�89D/ptt��{����_i��k�f����f��	///<������4���j�t:��j�������l�������gn���q��	�����>)����N���'�7*���E����r;"D6�;���-Ng��TFNg*����WV[��V�@/�F��w�zu�C�H?�����c��2��
l?�	7%bTVV���^^��I>0*A>��:ns�L����JJJ"'���EGG#33=z�@FFbbb��[�k(44���N���'�d���{��_N�1.��|���w��DM������?����R���k���O�@��������5������E���wc��gee!''���-r|k�L��<y2���+���`������`��IX�f
f���5k�`���f�-���s������h\<<�s:�"@Iy�5&'.��j���cx{��w���
1u��5{Ll��<-��j1m�4l���
�W_}e�5��h0u�T����F����+���i���>��i)"jqg�/^v}{��O��COs���XEU
N]������s���O���^��'*�v�o zG�K�m��h���	7�����ZL��O[V���rOo`���?r���Du��/�s!�d����C-Mqvr@�p��[J�A&:��*�h'7�m)""����_���1^�I����4�I��d�D��9�
LR.��tF*�j,�JD��40�������'��<�"��R�������&�VY��7���G-�J���'��2&���<��WZ��V��g�����5������[n�n��nTYi��Oe%���N�������OXa��0�a���X�����Z�%����7*��k�n���WE9��g@�����5�hz
�=eU8U��R��<OS���Wby�Z�j���7
0}�"���-�rn�������re���l�����������mdQU����8�Q`N���B3�������uQ�P������
��Dl^�����?
LP��=�=Z�!�r�/��huU&���R��?�A�w��f�� -��~���m��Y����	%G����oT �����=�N��h�V1�Y"��X�:p1����`����pC��l�I���jv/�R3�p��Yf�������<o0W���<�WX��@�`o��u�^���e�6%''7*�W������K��=j\������}�u����G�O�\@y��8��s���/?�p%��'�:�������a�pCD�s�?��% �>e�0����2��c���8�����S��R���[L��������G������E3���O7�X[�|M9�5�(()������_-\�Rdr��zG����������N�C����?MFY����-@SV���QP)(����������/G���V�qvr@�p?e%�����E!*�������pCD7�+���6�R�����
L���\����TR^iB�K�
���Jm8�{o�ZH-!��!~����	������Un�����
,��������%�tZ]�Q���aE�Ce�uk�(wWg�y�����^j�ur����/����Kf���B���F�I[�>0�Q�PT�C����6>����/�V��_l�����r�i�3���vPVQe|+G���k�_T�]��F�]����V�������_y��������g��pCt=n��C���%��)����eB��&z\�J����~�x��;�+��[h����j�^��A$��~����J��d����&����QP��qQ����'G898�����6�*j�mw���
u�����K�%����\��Y�~����S��f���cN�����W\fT_P;n�9��'G�urSn�4��c6�x�mf��Gn�����	sC���^�;��fm"�`3�f�����u���O<�?�PTT�i��a��=1b������m���>Dt����������+FUWb�o�,�w
C��L��C���p��.��[�%9;9�?��o�\T<��{<:���N�_l�f�R�X^%��������_�.���p�����������
7d��AX�z5�uk����s����'�x�-�F���L�����N�-[�`��	7�����O�9E������������dz�������1��w�.0:�VU�S%`�*�U����`$��I1�U�
*^j�uR����"s����������6;����m�������?�e����������_c����������xC
)**���_�u6l������S'�{���2e
,X`���>D����W=��7�	����r���D@b�1�vhz!�h������U����9�T�=$��m�G��K
o��Cd,��$&&"))	���M�_�z>� �o�~C
��������w�^4_~�%"##����Z�FYY���QRRb���>�����}�g.�c�e�I��}�����k��/����l��/�~RE e$���������8�?�(���iT|,=qEDVk�������j���o���>������Crr2>��C<��C�mn��3����b����?>�����7�Q]]
�J�^ggg����,7���6l�pC�&�S^����2�^)����H�.C������cX�����P�)'o�Eq# �&���������s�{��+��___����X�`&O��s���x�


��"e-OOO���B�VC��"44��d��},a�
Q�R3���t�����SY8u1z?���=q[�!���k�Q���
W��������?��sS��_~�}���y�����B�^�����b
9r������z��������Fff&z���������-�TGDm�X[���dc��K�w�2��^FAI��}��00&�{�ch�0���@dP'l?�l���N��� g�D�0�����	�3g�`��
x����n�:DEE�XC�q��9���o��o`���y
&M��5k�`���X�f
&N�h��R���^���� ���^��/��+�{�cx�p�	���6?�����V�5y�3n��"��_��<�[�c��KNNnTf�m�O>�/��{�1��_��-�����;vl�4���#�5k��?��#Gb���

h4L�:{����Q��r�Jxzz�,7��%�-Ed�bm��������{:�N_FQ���>.N���	������W8��
GdP'������?���mj�O�W��rDN�\��9$����("j69��f�pC��^g2���%�IQ���2#z�cXl������pu6����`�W���@Qn������������b �'����f��>)�4�sCD7�bm��RB���,�=��b�V�:;b`��
�����W:v2��`s��Ks�h�z��p/p�L������?O�.�*������OID�"�~��8��D�0�:p��vG���h���7;v��<��<�����C���Ep�b����'��/5��V�����1�+����#;�^���3^�T*��e1�Q�(�ZQq�;�gw=��CD���@�W�/Q��P����P�"�l�f������X�|9n��v888X���lR]����Y8p����2���n/�������������@����A����xX�{�ED�%������41^���)�_o�&Xn���q�����y IDAT�2�u u�2uO/�=}	����������R|L�u�2��(v~����4:��+0b<0�a`�m�J�D��9�#���O1�k4Pq�j����7����}�]<���pw���D����R����2N����V��(O0)�d7�W�Z�(�h��WWs���2c�n|�]"�R��n��p���D�2)7��2�'��-)���O�>�4i^~�e�r��!ju�2{Nea���������Q�����	���u��\+?��J	5M�I�����������Q���
T���.(a�"
����2G�	�.�n�����s�j�*��!j'%��Y~�����������Q�:!�[�a�������i�nN����������������""������Ry����u��6�p�jJ���U�1e������csC�Fj�J����Y��r	�\��Wfd\�����^�l��N��=k��4�O�C[��D�2�.��������0S��[J�p\��h�5F��+t������L|h��q-������/����x��'9������c�)e��}��m���J���]�0�{���M���Y��[I!�u��9in��s���4D���e��������5�X��ktm��\���(��ZaoiS� �������e>�V�P��������w;����PL��F/8����^JQ�a:�e�W�K��2�L�p�
k�^���I��[�������������l`*kCLUv3�����(�����[sj�y@���P+��s���������\895�y�6D�X���Ng�_:�j]������A��{�#��
�CK�ICD-C_�`0�/��vl������W}hq�Vfj��Zn|\��'(��@l��ps��w#))	���M���������[[�qDM]����K�%�^*��_T���1�[z�c@������VE9���J/��9i&��GqN��"5@����r�~`oEP���c����=0���|4m��!���?/,��_|��rf���{��QQ�_iiiiX�~=�.]����zC�lQnq���=)Y8x6��^�q���2���4D���X	.�kLe�2gLs8z+��-�8��D��8[b1��;����;���|.\DEEa��I8p����[��D���Wfwm��>+{e�C}0����������c;�zX��&;�s��(�*3?�Tq�������p���}�����n��_�����z@����;��G3p��_Y�nH�P��oL���f���r	?���VWe��n��q��Ky��]{e���mT���y��9i�|X���t���Gt�j
����0�M��d��omx�����nJ�QY�������BNN���[������y����|�r<���FOaa��i��gF����$x{{�,7����R��m����{�Uy�����f&3"`�"E+���D������h�`�c�-��bO�V[��E����T|��>��Q���(`�����E-1 BH���I2�Y���f&��e������<y2y�5k�����~�]�����	%�V�P�m����}|q���y1+�k���������{\	7���aT|�1�V�
����"�-���,������|�r������3V�Z�+V��^H���9D�$8]���[���Si���rmQ�
%��UW���O=�Ic�S��j�&
Qw�ob
yC32���o/��nH�)��]32�����;P��i�?���o�K/��m������|��c��������{�o�����������Vo:�/}t��3|n]�=y������^�f����
�?��P�IC�E�E��}i�q ���>�5��6���7��G���T"i��C���?�)������NL�6
�?�<����^�w��5k�����[��X,hmm�(�p�\(,,D{{{��T��������?�\v+���F�u�R@��?=�1�9U�)@B���/���/�/���/��������$����$�ptz�d8;��%t�}�������
��D���>.ExV&���C�y:+�k�P�;
�S�[t2��&P0��^#p��&:��t��D����0�by���E�I���C=���~����?���x�b<x�W�e�4(r�y��EM&\.W�����m?p��K8Z�/�q����������������/D����$���CP�����$�p��J��~�Atz����@��������	  Ih=����$�p�|�}�+�N�Q�_�{n-+���A�ge�.��f�,y	��>�����aO��4�������z�����&MkkFK�� �Y	0	������z���d/�����
��6��|�� g��XLjG"���n��k���Q������<�/dx���hb���n<����������c�=�ky������'�E��(��v�b��l��/������&������������Rk� �2��H��No���O�/�7 ��������p�J2:<�rx@��I:�$n_���Oy�/&���Q��A��h�B��@��3���X�
�a�iP>��/��x���9R��%�~�����z��c��cq~�L�N���������<J���w�P�I����I0k����2_����p&�)���5gaN������Q�W���j���pKE��E�J�	�s:�7G����wp_I;�L�0���N�+W�����^!�'�|�O>��.***�e�������eeehjjBii)$m��/��nN��}��5����L�J�Rd��b��,�-�)�_9
�������������5�[�����-F���
�k���z�:��:�F=L-,&:-
�����}XM�h�}h5lf#�:
��F�fc������~��{)�h2�0��2eJ_}�'�5i,����5�#��
�6+W"E��R<	��b���e��qLT!oI�6��a�kh�^��T6�n���?a��%���5j�����W6]��A��\[[���:,]�uuu���I��S_��}��}_��_?�{����V���^�I�A���������tZ�F���4`�U�F�-&��
D#:e{�A	�A�A��C�E�h�����$�W��?�?�����#`�L�IC�'u�����<�R{��R�+�;C?�+�E��9����O����puL�	-n��go��[^]-n�N'����;v������.�Vk��T��u����y���<��l�����*��i`3+3v�Mh�"f�C	�Z�F=�F�z��zm������h����������#�ns�����/����YY�&
e[�Pzt�BIGT	���m�N���En�8�{1�qt����z_^_-�e��s�=h�)H��l��t�M� ����-�
&Z-lf�Z
��!��B��1�� �jY!U���8z:u�wo��|��k���c+���Roc�L��xr-����BFt(	����+�t'^����go�W�(����,t{��7o����WK-^�k����3���S�_������L��P(M��q�{��\#9Y>�l|��q�5i��s���a���?���GhD���4y��O���tLxV$��9X�!���}�3�O��AY�E[�|���w�Ui�X�6�Ey�-46�_c
mz��4p�@0O����r���J;�|��7��w�3��
Q���4���i����k��Pf��`t&�)�����wm�m�$���&JBAC�PH�v��5J��K8���p�z�=�>�_��}���v�y�����+���'����O7KC� +����$	�W^�u����M������
������RKs`G��40���5����������c�x�s �O_�v#��t&�+���D�P(	���LI���0���LIt)��D��D�kh�@��(�3��t`����tY�����u+jkk����p����\�������dP��p��	�3&��$���r��e��n`�j`�{J��Uv�RG3�� j��g��{��z������
�����<d��
�[����S5�b�
1��<
%��{�)��B^O�>��-S}�
�-E�%�5i0��������Q��9�}�|������H(�����S5���&����5��(��.(��okn��{���Tk�L����5ib�w/�;1�N!�gct%��c�����Z�n�$bX�M��nV�\�W_}�=���jn�.YxM��xp&����Q���x��
1_��3�e���5N<���P���(�nW
C	��Q_e���
���v���W��knb����5i�:�SI��Jp�W��I	8i�8[KC���V
�u��M�6�K{��:���~@D�ky}Z�������
@�����Tk�T�H��I�k���)#_L-��k�S&4b(��CL���S�����3C��3��������Vlh��k�H��%\�G��������f_����U��j\z(��N.��.qD�V=&��}�!�?Bj;-E�@[�&�M��xTA��PA����#`Z�f4`�J����Z��^�|�H{:�!�����	��������M����
o+�&��4��U�����������]�S��{3�����S�`��0��J�!"�������~��@�|�&`���B�S��4����%rURc��B��wtTA������
�R������J�O'��e�������������h�~ O�����Pp�)�
��lWS(���^�k(���D���n���<]Ey�L#��]�/��&�K���Y��>`��,��/`bja����]2��Uc��CI�����%�V�a��>#K@�S	>7���;�pu.'��(��m�c��?v����m��|=�����7�k�.t_�7|J�w��j��C�UI�e�;"�$oNK9r.����k��o��n����������;v���o���ka�����zNO4�n��O�3��|���r(P�B!"6��+���@�Sy��T��/NbC���fL6������mU�c�����A����7�tR�bf������N'FB��B#"��W(_
��***P[[����'X�z5^{�5�����_�W\q~���a��Up:�x�����zNJ��O�(u��xy�3��[�]�B�G	�N��+H����]	�!�3���/�7+�V�J-p��rK�h�U���px���
�Rg�Z�q@�g�#��������/Q��]����5k0o�<��ztvvb���p�������|��������{���}����zNR�?����_B&����Vn|�����'>�����PfL�fH��i��'�XlJ�0�]�$��M�6&���l���%�E�-z_�����o)�c�tG��]�}��f>v���=���f��
-�"^���{yi"R�n���x�
|���X�~=�b�����(��r������I�S='�������1u��y���J�����H�r�&NbC�/4+����{1���+�[��6`�bh��b.P������m)��'(K�y8�&��6����uL�������m�gb����&""c��r������-[PZZ
�j��$A��#&mO��X��G��,�&��H2Y L�&H����`B0���l�d0A�!���&�z$��h��7B-��fe_����"��0�sJ�r��V��8a\
���^j�An��]�sJ�L�dh��'��_3~������-�GD���N'dY��I����ySP�����?�.���[�(���@E��nX,���=�E��4��F2�!���MfMf%\����X�
�ASd�	���l
��&s��_W�)&a��!�`W�>xt��Z83��>�^�'��|��j���N2�n���~�DD����C���,u����+W��G��h��>�'�|2�WRR���f������)2������h%�&�
<�$���n`�*+�D�&�r�Rbk��Y��dV��(�d"`�B���E��-������SN�� � )+��~e�����.�-� ��.@�)��;��7j�H~�=��{��(Tu�$�h��`��=���($|Z*��&�|��G�$	K�.����1~��H_mm-����t�R������&e{O}�u�Q�I3��{X�:`*����7��S�9:`H��!u(�j������0	�$���N%`�&jm��5b4�g���(�������M�������Ca��};v,�|���:7N������;P^^�w�}V�5i{��$�p+H��y�,@����/�v��P �w(H�+3�6(3��!��Ft���=Q�	���F����w����N`�r�|#�/fA/QcAq�8�
��e|��v(�5W���E�N	�`��
�� {)t��S$r��Q�����?�m�r7iAh�����"�1����&�M�r�i������H0Z+���nd���N�A�a�jl��U@������4�����/'�fC�����tb�l��{���Z��/��fD�V(��������-���	@�nT�i��(+rn���O�i��K�s������`���M���kVB0������@�-`Dm�1v��'���@��k�
��p6�A���vP��,)wN�"�Jp���B�H���C$��C[%���������1����(��(Kn�0�a���" ��3��Z
������7�vDDF�������m�k���l3s=""""��M4�	�<6�����HDDD��M����8���I�#""���pKc��,.�FDD�O1���0���0���0���0���0�����DDD�o�����1�Q�UQQ��7���?��S��j�b��	��kW���p���
v�UUUhkkK��S�W���E�a��hii��%K�x��H����Q^^���fTTT`��)�{�#"""���p3�|�����l�}����G�F���_��s��f�a��9��aC�������H����f������6m��)S"?766���0b�;v,e{O}DDD�^�,�r�������X�n&L��j��$A��#&mO��X�7o�$I��<�����������(��������y3s@MM
^��H�Q��x �"�n7,K�����
>�[�	�8q���G����h��p��^oV_3o�����Q]]�U�V����[_II	���QVV���&����l��/��q�����xp��	���I�����������e��a���={v\_mm-����t:QWW������=��z�M��F�A�P���+�����tb��y��c��������j�&m��/���M�6����O�'�@r��)���`���Y{��	7���������"���i)"""���pCDDD��pCDDD��pCDDD��pCDDD��7��e���>�������������6��""""Ua�!"""Ua�!"""Ua�!"""Ua�!"""Ua�!"""Ua�!"""Ua�!"""Ua�!"""U��p�v�Zh4�Cr8�����nGUU���R���GDDD��7�f�����?dY��[�|9���������
�X�"e{O}DDD�^ynv���^x!a����1w�\�l6��36lH��S�W�������I�Q\\1b�;����>"""R/ANt(�A�;5��j $I�^�G0L���9�>��cH���M�e����{�DDDD ��W^��o�9k����+]Q��x �"�n7,K������z���~�����w/�L��wo���h�8w�GV_�_����477���MMM(--M��S_���w����gn���z��`���������M�M*���������D]]jjjR���GDDD�PQQ��7�F���z�)l�(�i	IDAT��
#F���={�l����=��z�]Aq�y<l��	����
Q�w��)���`���Y{����!"""�
7DDD�*7DDD�*7DDD�*7DDD�*7DDD�*7DDD�*7DDD�*7DDD�*7DDD�*7DDD�*7DDD�*7DDD�*�\�����R�����1�Q�UQQ����R�UUU������B[[[��DDDDY��p�|�r������X�bE��DDDDY��p�~�z��;6�
s����
r=$"""������FF��c���xDDDD�
�-(�x<0�L�����p��v��������2��� IRV_S��FEx<����
���p������e9�}���}=D""��n�g��TnJJJ������2455���4�v�g��������M�P]]��aR�l��W]u����\�zIcc#PYY���P/��q#&M��a���z(�K<�������jknjkkQWW�����:����zHDDD��
7O=��m��#F`��=X�lY��DDDDY���R6�
�q��ADDDY���"""�9��B�����l��0��|>h4�t���p$I����,�@���x`0�����j IC�^������T������T������T�DDD�o�����������Te@��r8�����nGUU����n�v�ZV�������c����Z��0av�����R��=���O� ��%K�dy���t���}�0y�dX�VL�<���?�<R�\��: k/_����hnnFEEV�X�p����c���	o�I�'���h�"��1---X�d	/^���R��=�/^���G!�2dY����,��2��q}�����#����?�0-Z��������Py?~�|��Y�e���A���n��+���,��c�w�=�o�����xdY�e��)[,����2��1-))�/\�����eH��������zeQ�6F�|��: kn,Z[[!�"\.
����t{A8{�dz\�����k���M�6ei���t�����q��7c����������oc���91�#��ZYY�9s�`������C]]6l���������n�Z-�A�$I����I�g��2=�


�����u�0a��,�����1�?>-Z��o��>�,�;�u���`���t���#Gp�w���C���m�p����`�t9n��j�����E���(**���L�=�M���q=p�jjj�������*�#�te�w.\����8�,��2��q�>}:���^<��X�v-V�^�-[�d�tYr�;t@���������������zC����������k���`���=���9$I��s�h4fm���t������`�X,<����{w6�I���7���������D]]jjjr=$���e��a���={v�GH�J�����?��U��������c��iY)e"��ZVV�5k���v��w���q��<R���^��������J�f�����r{{{���� ���W��U�����W_�b���t��g�}&O�0A�X,���3���O�b���t����{��'��(�'N��������%����YsCDDD�5 OK�z1���0���0���0����r="""�KU__��K����HUxZ����T������T������T������T������T������T������T������T������T�������Oq�]w�f��f�a������/z|� ���S�NM{?����#"b�!��x�<��8{�,N�<�Y�fa��}��G��<���0����#"b�!����=���B��f�l6,[�;w����9s3g���b���S�p8���j���V��=f�7�x#F����OC8�����3g0l��H��$	������s�F�������B������
���g��s"���pCD	=����9s&.\���{.��!���������6��;���'������G�m�|�I�p�
hii�s�=�!C�Pfd�����vEEE���k�g���={0z��H�_����������2����{�n�7��
��O����
NDI566���>�����g�l��7�p`��!8x� �
���L�<��� �Y�d�@aa!8����*�z���~���/��������o~��F����N���~�;�:u
+W��3�<���B,Y�$;��"J���{���/c��������7��x<��F�m�}.�:�.��,�444����]�p�����������n�����3g�����r>��C���BDyH��&D4����&M�^��=6l�����C(((H��T�2g��Eqqq�c7n:::���_��t�[��V�mG�	���m���l63��\}}}\kn�(�'�|�=�.^����_��W����#����x����r���[oa��iq�H���Y���K/��p�/����z<8i�oMM
�y�������>o���x��G����r>"�***�� %���(��5K��l��A�����rSSS����E�����V�<~�xy����,�2��YI��,�rkk�|��w�F�Q������;w�����l����#���s�NYy��mq����eY>z��,�|���^�D���`�
���������]�v������p�(XsCD�2h� L�4	���^��BD9��"""R��0���0���0����"�$Z��2����6�C�������3�|�/?�����P�����(������T������T������T������T������T���?����IEND�B`�
#15Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#12)
Re: Proposal: scan key push down to heap [WIP]

On Fri, Oct 28, 2016 at 2:46 AM, Andres Freund <andres@anarazel.de> wrote:

Well, that'll also make the feature not particularly useful :(. My
suspicion is that the way to suceed here isn't to rely more on testing
as part of the scan, but create a more general fastpath for qual
evaluation, which atm is a *LOT* more heavyweight than what
HeapKeyTest() does. But maybe I'm biased since I'm working on the
latter...

I think you might be right, but I'm not very clear on what the
timeline for your work is. It would be easier to say, sure, let's put
this on hold if we knew that in a month or two we could come back and
retest after you've made some progress. But I don't know whether
we're talking about months or years.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#15)
Re: Proposal: scan key push down to heap [WIP]

On 2016-10-31 09:28:00 -0400, Robert Haas wrote:

On Fri, Oct 28, 2016 at 2:46 AM, Andres Freund <andres@anarazel.de> wrote:

Well, that'll also make the feature not particularly useful :(. My
suspicion is that the way to suceed here isn't to rely more on testing
as part of the scan, but create a more general fastpath for qual
evaluation, which atm is a *LOT* more heavyweight than what
HeapKeyTest() does. But maybe I'm biased since I'm working on the
latter...

I think you might be right, but I'm not very clear on what the
timeline for your work is.

Me neither. But I think if we can stomach Dilip's approach of using a
slot in heapam, then I think my concerns are addressed, and this is
probably going go to be a win regardless of faster expression evaluation
and/or batching.

It would be easier to say, sure, let's put
this on hold if we knew that in a month or two we could come back and
retest after you've made some progress. But I don't know whether
we're talking about months or years.

I sure hope it's months.

Andres

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17Kouhei Kaigai
kaigai@ak.jp.nec.com
In reply to: Dilip Kumar (#14)
Re: Proposal: scan key push down to heap [WIP]

-----Original Message-----
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Dilip Kumar
Sent: Saturday, October 29, 2016 3:48 PM
To: Andres Freund
Cc: Tom Lane; Alvaro Herrera; pgsql-hackers
Subject: Re: [HACKERS] Proposal: scan key push down to heap [WIP]

On Wed, Oct 26, 2016 at 12:01 PM, Andres Freund <andres@anarazel.de> wrote:

The gains are quite noticeable in some cases. So if we can make it work
without noticeable downsides...

What I'm worried about though is that this, afaics, will quite
noticeably *increase* total cost in cases with a noticeable number of
columns and a not that selective qual. The reason for that being that
HeapKeyTest() uses heap_getattr(), whereas upper layers use
slot_getattr(). The latter "caches" repeated deforms, the former
doesn't... That'll lead to deforming being essentially done twice, and
it's quite often already a major cost of query processing.

What about putting slot reference inside HeapScanDesc ?. I know it
will make ,heap layer use executor structure but just a thought.

I have quickly hacked this way where we use slot reference in
HeapScanDesc and directly use
slot_getattr inside HeapKeyTest (only if we have valid slot otherwise
use _heap_getattr) and measure the worst case performance (what you
have mentioned above.)

My Test: (21 column table with varchar in beginning + qual is on last
few column + varying selectivity )

postgres=# \d test
Table "public.test"
Column | Type | Modifiers
--------+-------------------+-----------
f1 | integer |
f2 | character varying |
f3 | integer |
f4 | integer |
f5 | integer |
f6 | integer |
f7 | integer |
f8 | integer |
f9 | integer |
f10 | integer |
f11 | integer |
f12 | integer |
f13 | integer |
f14 | integer |
f15 | integer |
f16 | integer |
f17 | integer |
f18 | integer |
f19 | integer |
f20 | integer |
f21 | integer |

tuple count : 10000000 (10 Million)
explain analyze select * from test where f21< $1 and f20 < $1 and f19
< $1 and f15 < $1 and f10 < $1; ($1 vary from 1Million to 1Million).

Target code base:
-----------------------
1. Head
2. Heap_scankey_pushdown_v1
3. My hack for keeping slot reference in HeapScanDesc
(v1+use_slot_in_HeapKeyTest)

Result:
Selectivity Head scan_key_pushdown_v1 v1+use_slot_in_HeapKeyTest
0.1 3880 2980 2747
0.2 4041 3187 2914
0.5 5051 4921 3626
0.8 5378 7296 3879
1.0 6161 8525 4575

Performance graph is attached in the mail..

Observation:
----------------
1. Heap_scankey_pushdown_v1, start degrading after very high
selectivity (this behaviour is only visible if table have 20 or more
columns, I tested with 10 columns but with that I did not see any
regression in v1).

2. (v1+use_slot_in_HeapKeyTest) is always winner, even at very high selectivity.

Prior to this interface change, it may be a starting point to restrict scan key
pushdown only when OpExpr references the column with static attcacheoff.
This type of column does not need walks on tuples from the head, thus, tuple
deforming cost will not be a downside.

By the way, I'm a bit skeptical whether this enhancement is really beneficial
than works for this enhancement, because we can now easily increase the number
of processor cores to run seq-scan with qualifier, especially, when it has high
selectivity.
How about your thought?

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18Dilip Kumar
dilipbalaut@gmail.com
In reply to: Dilip Kumar (#14)
1 attachment(s)
Re: Proposal: scan key push down to heap [WIP]

On Sat, Oct 29, 2016 at 12:17 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:

What about putting slot reference inside HeapScanDesc ?. I know it
will make ,heap layer use executor structure but just a thought.

I have quickly hacked this way where we use slot reference in
HeapScanDesc and directly use
slot_getattr inside HeapKeyTest (only if we have valid slot otherwise
use _heap_getattr) and measure the worst case performance (what you
have mentioned above.)

My Test: (21 column table with varchar in beginning + qual is on last
few column + varying selectivity )

postgres=# \d test
Table "public.test"
Column | Type | Modifiers
--------+-------------------+-----------
f1 | integer |
f2 | character varying |
f3 | integer |
f4 | integer |
f5 | integer |
f6 | integer |
f7 | integer |
f8 | integer |
f9 | integer |
f10 | integer |
f11 | integer |
f12 | integer |
f13 | integer |
f14 | integer |
f15 | integer |
f16 | integer |
f17 | integer |
f18 | integer |
f19 | integer |
f20 | integer |
f21 | integer |

tuple count : 10000000 (10 Million)
explain analyze select * from test where f21< $1 and f20 < $1 and f19
< $1 and f15 < $1 and f10 < $1; ($1 vary from 1Million to 1Million).

Target code base:
-----------------------
1. Head
2. Heap_scankey_pushdown_v1
3. My hack for keeping slot reference in HeapScanDesc
(v1+use_slot_in_HeapKeyTest)

Result:
Selectivity Head scan_key_pushdown_v1 v1+use_slot_in_HeapKeyTest
0.1 3880 2980 2747
0.2 4041 3187 2914
0.5 5051 4921 3626
0.8 5378 7296 3879
1.0 6161 8525 4575

Performance graph is attached in the mail..

Observation:
----------------
1. Heap_scankey_pushdown_v1, start degrading after very high
selectivity (this behaviour is only visible if table have 20 or more
columns, I tested with 10 columns but with that I did not see any
regression in v1).

2. (v1+use_slot_in_HeapKeyTest) is always winner, even at very high selectivity.

The patch is attached for this (storing slot reference in HeapScanDesc)..

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachments:

heap_scankey_pushdown_v2.patchapplication/octet-stream; name=heap_scankey_pushdown_v2.patchDownload
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index b019bc1..3c6aad1 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -299,6 +299,7 @@ initscan(HeapScanDesc scan, ScanKey key, bool keep_startblock)
 	ItemPointerSetInvalid(&scan->rs_ctup.t_self);
 	scan->rs_cbuf = InvalidBuffer;
 	scan->rs_cblock = InvalidBlockNumber;
+	scan->rs_scanslot = NULL;
 
 	/* page-at-a-time fields are always invalid when not rs_inited */
 
@@ -660,7 +661,8 @@ heapgettup(HeapScanDesc scan,
 
 				if (valid && key != NULL)
 					HeapKeyTest(tuple, RelationGetDescr(scan->rs_rd),
-								nkeys, key, valid);
+							NULL, 0,
+							nkeys, key, valid);
 
 				if (valid)
 				{
@@ -955,6 +957,7 @@ heapgettup_pagemode(HeapScanDesc scan,
 				bool		valid;
 
 				HeapKeyTest(tuple, RelationGetDescr(scan->rs_rd),
+							scan->rs_scanslot, scan->rs_cbuf,
 							nkeys, key, valid);
 				if (valid)
 				{
@@ -1649,7 +1652,8 @@ heap_parallelscan_initialize(ParallelHeapScanDesc target, Relation relation,
  * ----------------
  */
 HeapScanDesc
-heap_beginscan_parallel(Relation relation, ParallelHeapScanDesc parallel_scan)
+heap_beginscan_parallel(Relation relation, ParallelHeapScanDesc parallel_scan,
+						int nkeys, ScanKey key)
 {
 	Snapshot	snapshot;
 
@@ -1657,7 +1661,7 @@ heap_beginscan_parallel(Relation relation, ParallelHeapScanDesc parallel_scan)
 	snapshot = RestoreSnapshot(parallel_scan->phs_snapshot_data);
 	RegisterSnapshot(snapshot);
 
-	return heap_beginscan_internal(relation, snapshot, 0, NULL, parallel_scan,
+	return heap_beginscan_internal(relation, snapshot, nkeys, key, parallel_scan,
 								   true, true, true, false, false, true);
 }
 
diff --git a/src/backend/executor/execScan.c b/src/backend/executor/execScan.c
index fb0013d..3a12ec4 100644
--- a/src/backend/executor/execScan.c
+++ b/src/backend/executor/execScan.c
@@ -205,7 +205,7 @@ ExecScan(ScanState *node,
 		 * when the qual is nil ... saves only a few cycles, but they add up
 		 * ...
 		 */
-		if (!qual || ExecQual(qual, econtext, false))
+		if (!node->ps.qual || ExecQual(node->ps.qual, econtext, false))
 		{
 			/*
 			 * Found a satisfactory scan tuple.
diff --git a/src/backend/executor/nodeSeqscan.c b/src/backend/executor/nodeSeqscan.c
index 00bf3a5..b54d5fd 100644
--- a/src/backend/executor/nodeSeqscan.c
+++ b/src/backend/executor/nodeSeqscan.c
@@ -30,9 +30,12 @@
 #include "executor/execdebug.h"
 #include "executor/nodeSeqscan.h"
 #include "utils/rel.h"
+#include "optimizer/clauses.h"
 
 static void InitScanRelation(SeqScanState *node, EState *estate, int eflags);
 static TupleTableSlot *SeqNext(SeqScanState *node);
+static void get_scankey_from_qual(TupleDesc tupDesc, List *qual, PlanState *ps,
+							int *nkeys, ScanKey *rs_keys);
 
 /* ----------------------------------------------------------------
  *						Scan Support
@@ -64,17 +67,35 @@ SeqNext(SeqScanState *node)
 
 	if (scandesc == NULL)
 	{
+		int 		 nkeys = 0;
+		ScanKey		 keys = NULL;
+
+		/*
+		 * Pull out scan key from qual list
+		 */
+		get_scankey_from_qual(RelationGetDescr(node->ss.ss_currentRelation),
+							 node->ss.ps.plan->qual, &node->ss.ps,
+							 &nkeys, &keys);
+
 		/*
 		 * We reach here if the scan is not parallel, or if we're executing a
 		 * scan that was intended to be parallel serially.
 		 */
 		scandesc = heap_beginscan(node->ss.ss_currentRelation,
 								  estate->es_snapshot,
-								  0, NULL);
+								  nkeys, keys);
+
 		node->ss.ss_currentScanDesc = scandesc;
 	}
 
 	/*
+	 *  If we have pushed down the scan keys to heap and scan slot is not
+	 *  yet stored in heap scan descriptor then do it.
+	 */
+	if (scandesc->rs_nkeys && !scandesc->rs_scanslot)
+		scandesc->rs_scanslot = slot;
+
+	/*
 	 * get the next tuple from the table
 	 */
 	tuple = heap_getnext(scandesc, direction);
@@ -86,16 +107,18 @@ SeqNext(SeqScanState *node)
 	 * not created with palloc() and so should not be pfree()'d.  Note also
 	 * that ExecStoreTuple will increment the refcount of the buffer; the
 	 * refcount will not be dropped until the tuple table slot is cleared.
+	 *
+	 * If scandesc have valid scanslot means we would have already stored
+	 * tuple in slot, so no need to do it again.
 	 */
-	if (tuple)
+	if (!tuple)
+		ExecClearTuple(slot);
+	else if (!scandesc->rs_scanslot)
 		ExecStoreTuple(tuple,	/* tuple to store */
 					   slot,	/* slot to store in */
 					   scandesc->rs_cbuf,		/* buffer associated with this
 												 * tuple */
 					   false);	/* don't pfree this pointer */
-	else
-		ExecClearTuple(slot);
-
 	return slot;
 }
 
@@ -317,14 +340,24 @@ ExecSeqScanInitializeDSM(SeqScanState *node,
 {
 	EState	   *estate = node->ss.ps.state;
 	ParallelHeapScanDesc pscan;
+	int					 nkeys = 0;
+	ScanKey				 keys = NULL;
+
+	/*
+	 * Pull out scan key from qual list
+	 */
+	get_scankey_from_qual(RelationGetDescr(node->ss.ss_currentRelation),
+						 node->ss.ps.plan->qual, &node->ss.ps,
+						 &nkeys, &keys);
 
 	pscan = shm_toc_allocate(pcxt->toc, node->pscan_len);
 	heap_parallelscan_initialize(pscan,
 								 node->ss.ss_currentRelation,
 								 estate->es_snapshot);
 	shm_toc_insert(pcxt->toc, node->ss.ps.plan->plan_node_id, pscan);
-	node->ss.ss_currentScanDesc =
-		heap_beginscan_parallel(node->ss.ss_currentRelation, pscan);
+	node->ss.ss_currentScanDesc = heap_beginscan_parallel(
+										node->ss.ss_currentRelation, pscan,
+										nkeys, keys);
 }
 
 /* ----------------------------------------------------------------
@@ -337,8 +370,130 @@ void
 ExecSeqScanInitializeWorker(SeqScanState *node, shm_toc *toc)
 {
 	ParallelHeapScanDesc pscan;
+	int					 nkeys = 0;
+	ScanKey				 keys = NULL;
 
 	pscan = shm_toc_lookup(toc, node->ss.ps.plan->plan_node_id);
-	node->ss.ss_currentScanDesc =
-		heap_beginscan_parallel(node->ss.ss_currentRelation, pscan);
+
+	/*
+	 * Pull out scan key from qual list
+	 */
+	get_scankey_from_qual(RelationGetDescr(node->ss.ss_currentRelation),
+						 node->ss.ps.plan->qual, &node->ss.ps,
+						 &nkeys, &keys);
+
+	node->ss.ss_currentScanDesc = heap_beginscan_parallel(
+										node->ss.ss_currentRelation, pscan,
+										nkeys, keys);
+}
+
+/*
+ * get_scankey_from_qual
+ *
+ * process complete QUAL list and create scankey entry for pushable quals.
+ *
+ * quals which can be pushed down:
+ * 1. qual should be of form VAR op CONST.
+ * 2. VAR should be for datatype with fixed length.
+ */
+static void
+get_scankey_from_qual(TupleDesc tupDesc, List *qual, PlanState *ps,
+					int *nkeys, ScanKey *rs_keys)
+{
+	ListCell   		*l, *l1;
+	Expr 	   		*expr;
+	ListCell   		*prev = NULL;
+	ScanKey			 key;
+	Expr			*leftop;
+	Expr			*rightop;
+	AttrNumber	 	 varattno;
+	Datum		 	 scanvalue;
+
+	/*
+	 * Validate each qual whether this qual can be push down to heap node
+	 * or not, If this can be pushed down then create a ScanKey entry
+	 * and delete it from qual list of PlanState
+	 */
+	forboth(l, qual, l1, ps->qual)
+	{
+		expr = (Expr *) lfirst(l);
+
+		if (!IsA(expr, OpExpr))
+		{
+			prev = l1;
+			continue;
+		}
+
+
+		/* leftop should be the Var, possibly relabeled. */
+		leftop = (Expr *) get_leftop(expr);
+		if (leftop && IsA(leftop, RelabelType))
+			leftop = ((RelabelType *) leftop)->arg;
+
+		Assert(leftop != NULL);
+
+		/* If leftop is not var then continue and check next qual. */
+		if (!(IsA(leftop, Var)))
+		{
+			prev = l1;
+			continue;
+		}
+
+		varattno = ((Var *) leftop)->varattno;
+
+		/*
+		 * We don't want to push down the qual which can have variable
+		 * length data.
+		 */
+		if (varattno <= 0 ||
+			(tupDesc->attrs[varattno - 1])->attlen == -1)
+		{
+			prev = l1;
+			continue;
+		}
+
+		rightop = (Expr *) get_rightop(expr);
+		if (rightop && IsA(rightop, RelabelType))
+			rightop = ((RelabelType *) rightop)->arg;
+
+		Assert(rightop != NULL);
+
+		/*
+		 * If right op is not constant then we can not push down this key
+		 * so move to next qual.
+		 */
+		if (!IsA(rightop, Const))
+		{
+			prev = l1;
+			continue;
+		}
+
+		scanvalue = ((Const *) rightop)->constvalue;
+
+		/* If we havn't yet allocated memory for scan keys, then do so. */
+		if (*rs_keys == NULL)
+		{
+			*rs_keys = (ScanKey) palloc(sizeof(ScanKeyData) *
+									list_length(qual));
+			key = *rs_keys;
+		}
+
+		/* Create scan key entry */
+		ScanKeyInit(key,
+				varattno,
+				InvalidStrategy,
+				((OpExpr *) expr)->opfuncid,
+				scanvalue);
+
+		key->sk_collation = ((OpExpr *) expr)->inputcollid;
+
+		key++;
+		(*nkeys)++;
+
+		/*
+		 * We have created scankey for this qual entry so remove it from
+		 * qual list.
+		 */
+		ps->qual = list_delete_cell(ps->qual, l1, prev);
+	}
 }
diff --git a/src/backend/utils/cache/catcache.c b/src/backend/utils/cache/catcache.c
index 6016d19..9e12ebe 100644
--- a/src/backend/utils/cache/catcache.c
+++ b/src/backend/utils/cache/catcache.c
@@ -1168,6 +1168,7 @@ SearchCatCache(CatCache *cache,
 		 */
 		HeapKeyTest(&ct->tuple,
 					cache->cc_tupdesc,
+					NULL, 0,
 					cache->cc_nkeys,
 					cur_skey,
 					res);
@@ -1459,6 +1460,7 @@ SearchCatCacheList(CatCache *cache,
 			continue;
 		HeapKeyTest(&cl->tuple,
 					cache->cc_tupdesc,
+					NULL, 0,
 					nkeys,
 					cur_skey,
 					res);
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 0d12bbb..d02f44a 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -130,7 +130,9 @@ extern HeapTuple heap_getnext(HeapScanDesc scan, ScanDirection direction);
 extern Size heap_parallelscan_estimate(Snapshot snapshot);
 extern void heap_parallelscan_initialize(ParallelHeapScanDesc target,
 							 Relation relation, Snapshot snapshot);
-extern HeapScanDesc heap_beginscan_parallel(Relation, ParallelHeapScanDesc);
+extern HeapScanDesc heap_beginscan_parallel(Relation relation,
+						ParallelHeapScanDesc parallel_scan, int nkeys,
+						ScanKey key);
 
 extern bool heap_fetch(Relation relation, Snapshot snapshot,
 		   HeapTuple tuple, Buffer *userbuf, bool keep_buf,
diff --git a/src/include/access/relscan.h b/src/include/access/relscan.h
index de98dd6..82f2910 100644
--- a/src/include/access/relscan.h
+++ b/src/include/access/relscan.h
@@ -20,7 +20,7 @@
 #include "access/itup.h"
 #include "access/tupdesc.h"
 #include "storage/spin.h"
-
+#include "executor/tuptable.h"
 /*
  * Shared state for parallel heap scan.
  *
@@ -70,6 +70,7 @@ typedef struct HeapScanDescData
 	Buffer		rs_cbuf;		/* current buffer in scan, if any */
 	/* NB: if rs_cbuf is not InvalidBuffer, we hold a pin on that buffer */
 	ParallelHeapScanDesc rs_parallel;	/* parallel scan information */
+	TupleTableSlot	*rs_scanslot;
 
 	/* these fields only used in page-at-a-time mode and for bitmap scans */
 	int			rs_cindex;		/* current tuple's index in vistuples */
diff --git a/src/include/access/valid.h b/src/include/access/valid.h
index 30af9df..5c13224 100644
--- a/src/include/access/valid.h
+++ b/src/include/access/valid.h
@@ -21,14 +21,23 @@
  */
 #define HeapKeyTest(tuple, \
 					tupdesc, \
+					slot, \
+					buf, \
 					nkeys, \
 					keys, \
 					result) \
 do \
 { \
 	/* Use underscores to protect the variables passed in as parameters */ \
-	int			__cur_nkeys = (nkeys); \
-	ScanKey		__cur_keys = (keys); \
+	int				__cur_nkeys = (nkeys); \
+	ScanKey			__cur_keys = (keys); \
+	\
+	if ((slot))	\
+			ExecStoreTuple(tuple,	/* tuple to store */	\
+					   (slot),	/* slot to store in */      \
+					   (buf),		/* buffer associated with this \
+												 * tuple */ \
+					   false);	/* don't pfree this pointer */ \
  \
 	(result) = true; /* may change */ \
 	for (; __cur_nkeys--; __cur_keys++) \
@@ -43,10 +52,13 @@ do \
 			break; \
 		} \
  \
-		__atp = heap_getattr((tuple), \
+ 	 	if (!(slot))	\
+			__atp = heap_getattr((tuple), \
 							 __cur_keys->sk_attno, \
 							 (tupdesc), \
 							 &__isnull); \
+		else \
+		    __atp = slot_getattr((slot), __cur_keys->sk_attno, &__isnull); \
  \
 		if (__isnull) \
 		{ \
#19Robert Haas
robertmhaas@gmail.com
In reply to: Kouhei Kaigai (#17)
Re: Proposal: scan key push down to heap [WIP]

On Tue, Nov 1, 2016 at 8:31 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

By the way, I'm a bit skeptical whether this enhancement is really beneficial
than works for this enhancement, because we can now easily increase the number
of processor cores to run seq-scan with qualifier, especially, when it has high
selectivity.
How about your thought?

Are you saying we don't need to both making sequential scans faster
because we could just use parallel sequential scan instead? That
doesn't sound right to me.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20Dilip Kumar
dilipbalaut@gmail.com
In reply to: Robert Haas (#19)
2 attachment(s)
Re: Proposal: scan key push down to heap [WIP]

I have done performance analysis for TPCH queries, I saw visible gain
in 5 queries (10-25%).

Performance Data:

Benchmark : TPCH (S.F. 10)
shared_buffer : 20GB
work_mem : 50MB
Machine : POWER

Results are median of three run (explain analyze results for both
head/patch are attached in TPCH_out.tar).

Query Execution Time in (ms)
Head Patch Improvement
Q3 18475 16558 10%
Q4 7526 5856 22%
Q7 19386 17425 10%
Q10 16994 15019 11%
Q12 13852 10117 26%

Currently we had two major problems about this patch..

Problem1: As Andres has mentioned, HeapKeyTest uses heap_getattr,
whereas ExecQual use slot_getattr().So we can have worst case
performance problem when very less tuple are getting filter out and we
have table with many columns with qual on most of the columns.

Problem2. In HeapKeyTest we are under per_query_ctx, whereas in
ExecQual we are under per_tuple_ctx , so in former we can not afford
to have any palloc.

In this patch I have address both the concern by exposing executor
information to heap (I exposed per_tuple_ctx and slot to HeapDesc),
which is not a very good design.

I have other ideas in mind for solving these concerns, please provide
your thoughts..

For problem1 :
I think it's better to give task of key push down to optimizer, there
we can actually take the decision mostly based on two parameters.
1. Selectivity.
2. Column number on which qual is given.

For problem2 :
I think for solving this we need to limit the number of datatype we
pushdown to heap (I mean we can push all datatype which don't need
palloc in qual test).
1. Don't push down datatype with variable length.
2. Some other datatype with fixed length like 'name' can also
palloc (i.e nameregexeq). so we need to block them as well.

*Note: For exactly understanding which key is pushed down in below
attached exact analyze output, refer this example..

without patch:
-> Parallel Seq Scan on orders (cost=0.00..369972.00 rows=225038
width=20) (actual time=0.025..3216.157 rows=187204 loops=3)
Filter: ((o_orderdate >= '1995-01-01'::date) AND
(o_orderdate < '1995-04-01 00:00:00'::timestamp without time zone))
Rows Removed by Filter: 4812796

with patch:
-> Parallel Seq Scan on orders (cost=0.00..369972.00 rows=225038
width=20) (actual time=0.015..1884.993 rows=187204 loops=3)
Filter: ((o_orderdate >= '1995-01-01'::date) AND
(o_orderdate < '1995-04-01 00:00:00'::timestamp without time zone))

1. So basically on head it shows how many rows are discarded by filter
(Rows Removed by Filter: 4812796), Now if we pushdown all the keys
then it will not show this value.

2. We can also check how much actual time reduced in the SeqScan node.
(i.e in above example on head it was 3216.157 whereas with patch it
was 1884.993).

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

On Thu, Nov 3, 2016 at 7:29 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Nov 1, 2016 at 8:31 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

By the way, I'm a bit skeptical whether this enhancement is really beneficial
than works for this enhancement, because we can now easily increase the number
of processor cores to run seq-scan with qualifier, especially, when it has high
selectivity.
How about your thought?

Are you saying we don't need to both making sequential scans faster
because we could just use parallel sequential scan instead? That
doesn't sound right to me.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachments:

heap_scankey_pushdown_expr_ctx.patchapplication/octet-stream; name=heap_scankey_pushdown_expr_ctx.patchDownload
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index b019bc1..ab95fcc 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -299,6 +299,7 @@ initscan(HeapScanDesc scan, ScanKey key, bool keep_startblock)
 	ItemPointerSetInvalid(&scan->rs_ctup.t_self);
 	scan->rs_cbuf = InvalidBuffer;
 	scan->rs_cblock = InvalidBlockNumber;
+	scan->rs_scanslot = NULL;
 
 	/* page-at-a-time fields are always invalid when not rs_inited */
 
@@ -660,7 +661,8 @@ heapgettup(HeapScanDesc scan,
 
 				if (valid && key != NULL)
 					HeapKeyTest(tuple, RelationGetDescr(scan->rs_rd),
-								nkeys, key, valid);
+							NULL, 0, NULL,
+							nkeys, key, valid);
 
 				if (valid)
 				{
@@ -955,7 +957,9 @@ heapgettup_pagemode(HeapScanDesc scan,
 				bool		valid;
 
 				HeapKeyTest(tuple, RelationGetDescr(scan->rs_rd),
-							nkeys, key, valid);
+						scan->rs_scanslot, scan->rs_cbuf,
+						scan->per_tuple_context,
+						nkeys, key, valid);
 				if (valid)
 				{
 					scan->rs_cindex = lineindex;
@@ -1649,7 +1653,8 @@ heap_parallelscan_initialize(ParallelHeapScanDesc target, Relation relation,
  * ----------------
  */
 HeapScanDesc
-heap_beginscan_parallel(Relation relation, ParallelHeapScanDesc parallel_scan)
+heap_beginscan_parallel(Relation relation, ParallelHeapScanDesc parallel_scan,
+						int nkeys, ScanKey key)
 {
 	Snapshot	snapshot;
 
@@ -1657,7 +1662,7 @@ heap_beginscan_parallel(Relation relation, ParallelHeapScanDesc parallel_scan)
 	snapshot = RestoreSnapshot(parallel_scan->phs_snapshot_data);
 	RegisterSnapshot(snapshot);
 
-	return heap_beginscan_internal(relation, snapshot, 0, NULL, parallel_scan,
+	return heap_beginscan_internal(relation, snapshot, nkeys, key, parallel_scan,
 								   true, true, true, false, false, true);
 }
 
diff --git a/src/backend/executor/execScan.c b/src/backend/executor/execScan.c
index fb0013d..3a12ec4 100644
--- a/src/backend/executor/execScan.c
+++ b/src/backend/executor/execScan.c
@@ -205,7 +205,7 @@ ExecScan(ScanState *node,
 		 * when the qual is nil ... saves only a few cycles, but they add up
 		 * ...
 		 */
-		if (!qual || ExecQual(qual, econtext, false))
+		if (!node->ps.qual || ExecQual(node->ps.qual, econtext, false))
 		{
 			/*
 			 * Found a satisfactory scan tuple.
diff --git a/src/backend/executor/nodeSeqscan.c b/src/backend/executor/nodeSeqscan.c
index 00bf3a5..0f28443 100644
--- a/src/backend/executor/nodeSeqscan.c
+++ b/src/backend/executor/nodeSeqscan.c
@@ -30,9 +30,12 @@
 #include "executor/execdebug.h"
 #include "executor/nodeSeqscan.h"
 #include "utils/rel.h"
+#include "optimizer/clauses.h"
 
 static void InitScanRelation(SeqScanState *node, EState *estate, int eflags);
 static TupleTableSlot *SeqNext(SeqScanState *node);
+static void get_scankey_from_qual(TupleDesc tupDesc, List *qual, PlanState *ps,
+							int *nkeys, ScanKey *rs_keys);
 
 /* ----------------------------------------------------------------
  *						Scan Support
@@ -64,17 +67,39 @@ SeqNext(SeqScanState *node)
 
 	if (scandesc == NULL)
 	{
+		int 		 nkeys = 0;
+		ScanKey		 keys = NULL;
+
+		/*
+		 * Pull out scan key from qual list
+		 */
+		get_scankey_from_qual(RelationGetDescr(node->ss.ss_currentRelation),
+							 node->ss.ps.plan->qual, &node->ss.ps,
+							 &nkeys, &keys);
+
 		/*
 		 * We reach here if the scan is not parallel, or if we're executing a
 		 * scan that was intended to be parallel serially.
 		 */
 		scandesc = heap_beginscan(node->ss.ss_currentRelation,
 								  estate->es_snapshot,
-								  0, NULL);
+								  nkeys, keys);
+
 		node->ss.ss_currentScanDesc = scandesc;
 	}
 
 	/*
+	 *  If we have pushed down the scan keys to heap and scan slot is not
+	 *  yet stored in heap scan descriptor then do it.
+	 */
+	if (scandesc->rs_nkeys && !scandesc->rs_scanslot)
+	{
+		scandesc->rs_scanslot = slot;
+		scandesc->per_tuple_context =
+				node->ss.ps.ps_ExprContext->ecxt_per_tuple_memory;
+	}
+
+	/*
 	 * get the next tuple from the table
 	 */
 	tuple = heap_getnext(scandesc, direction);
@@ -86,16 +111,18 @@ SeqNext(SeqScanState *node)
 	 * not created with palloc() and so should not be pfree()'d.  Note also
 	 * that ExecStoreTuple will increment the refcount of the buffer; the
 	 * refcount will not be dropped until the tuple table slot is cleared.
+	 *
+	 * If scandesc have valid scanslot means we would have already stored
+	 * tuple in slot, so no need to do it again.
 	 */
-	if (tuple)
+	if (!tuple)
+		ExecClearTuple(slot);
+	else if (!scandesc->rs_scanslot)
 		ExecStoreTuple(tuple,	/* tuple to store */
 					   slot,	/* slot to store in */
 					   scandesc->rs_cbuf,		/* buffer associated with this
 												 * tuple */
 					   false);	/* don't pfree this pointer */
-	else
-		ExecClearTuple(slot);
-
 	return slot;
 }
 
@@ -317,14 +344,24 @@ ExecSeqScanInitializeDSM(SeqScanState *node,
 {
 	EState	   *estate = node->ss.ps.state;
 	ParallelHeapScanDesc pscan;
+	int					 nkeys = 0;
+	ScanKey				 keys = NULL;
+
+	/*
+	 * Pull out scan key from qual list
+	 */
+	get_scankey_from_qual(RelationGetDescr(node->ss.ss_currentRelation),
+						 node->ss.ps.plan->qual, &node->ss.ps,
+						 &nkeys, &keys);
 
 	pscan = shm_toc_allocate(pcxt->toc, node->pscan_len);
 	heap_parallelscan_initialize(pscan,
 								 node->ss.ss_currentRelation,
 								 estate->es_snapshot);
 	shm_toc_insert(pcxt->toc, node->ss.ps.plan->plan_node_id, pscan);
-	node->ss.ss_currentScanDesc =
-		heap_beginscan_parallel(node->ss.ss_currentRelation, pscan);
+	node->ss.ss_currentScanDesc = heap_beginscan_parallel(
+										node->ss.ss_currentRelation, pscan,
+										nkeys, keys);
 }
 
 /* ----------------------------------------------------------------
@@ -337,8 +374,129 @@ void
 ExecSeqScanInitializeWorker(SeqScanState *node, shm_toc *toc)
 {
 	ParallelHeapScanDesc pscan;
+	int					 nkeys = 0;
+	ScanKey				 keys = NULL;
 
 	pscan = shm_toc_lookup(toc, node->ss.ps.plan->plan_node_id);
-	node->ss.ss_currentScanDesc =
-		heap_beginscan_parallel(node->ss.ss_currentRelation, pscan);
+
+	/*
+	 * Pull out scan key from qual list
+	 */
+	get_scankey_from_qual(RelationGetDescr(node->ss.ss_currentRelation),
+						 node->ss.ps.plan->qual, &node->ss.ps,
+						 &nkeys, &keys);
+
+	node->ss.ss_currentScanDesc = heap_beginscan_parallel(
+										node->ss.ss_currentRelation, pscan,
+										nkeys, keys);
+}
+
+/*
+ * get_scankey_from_qual
+ *
+ * process complete QUAL list and create scankey entry for pushable quals.
+ *
+ * quals which can be pushed down:
+ * 1. qual should be of form VAR op CONST.
+ * 2. VAR should be for datatype with fixed length.
+ */
+static void
+get_scankey_from_qual(TupleDesc tupDesc, List *qual, PlanState *ps,
+					int *nkeys, ScanKey *rs_keys)
+{
+	ListCell   		*l, *l1;
+	Expr 	   		*expr;
+	ListCell   		*prev = NULL;
+	ScanKey			 key;
+	Expr			*leftop;
+	Expr			*rightop;
+	AttrNumber	 	 varattno;
+	Datum		 	 scanvalue;
+
+	/*
+	 * Validate each qual whether this qual can be push down to heap node
+	 * or not, If this can be pushed down then create a ScanKey entry
+	 * and delete it from qual list of PlanState
+	 */
+	forboth(l, qual, l1, ps->qual)
+	{
+		expr = (Expr *) lfirst(l);
+
+		if (!IsA(expr, OpExpr))
+		{
+			prev = l1;
+			continue;
+		}
+
+
+		/* leftop should be the Var, possibly relabeled. */
+		leftop = (Expr *) get_leftop(expr);
+		if (leftop && IsA(leftop, RelabelType))
+			leftop = ((RelabelType *) leftop)->arg;
+
+		Assert(leftop != NULL);
+
+		/* If leftop is not var then continue and check next qual. */
+		if (!(IsA(leftop, Var)))
+		{
+			prev = l1;
+			continue;
+		}
+
+		varattno = ((Var *) leftop)->varattno;
+
+		/*
+		 * We don't want to push down the qual which can have variable
+		 * length data.
+		 */
+		if (varattno <= 0)
+		{
+			prev = l1;
+			continue;
+		}
+
+		rightop = (Expr *) get_rightop(expr);
+		if (rightop && IsA(rightop, RelabelType))
+			rightop = ((RelabelType *) rightop)->arg;
+
+		Assert(rightop != NULL);
+
+		/*
+		 * If right op is not constant then we can not push down this key
+		 * so move to next qual.
+		 */
+		if (!IsA(rightop, Const))
+		{
+			prev = l1;
+			continue;
+		}
+
+		scanvalue = ((Const *) rightop)->constvalue;
+
+		/* If we havn't yet allocated memory for scan keys, then do so. */
+		if (*rs_keys == NULL)
+		{
+			*rs_keys = (ScanKey) palloc(sizeof(ScanKeyData) *
+									list_length(qual));
+			key = *rs_keys;
+		}
+
+		/* Create scan key entry */
+		ScanKeyInit(key,
+				varattno,
+				InvalidStrategy,
+				((OpExpr *) expr)->opfuncid,
+				scanvalue);
+
+		key->sk_collation = ((OpExpr *) expr)->inputcollid;
+
+		key++;
+		(*nkeys)++;
+
+		/*
+		 * We have created scankey for this qual entry so remove it from
+		 * qual list.
+		 */
+		ps->qual = list_delete_cell(ps->qual, l1, prev);
+	}
 }
diff --git a/src/backend/utils/cache/catcache.c b/src/backend/utils/cache/catcache.c
index 6016d19..48e50ce 100644
--- a/src/backend/utils/cache/catcache.c
+++ b/src/backend/utils/cache/catcache.c
@@ -1168,6 +1168,7 @@ SearchCatCache(CatCache *cache,
 		 */
 		HeapKeyTest(&ct->tuple,
 					cache->cc_tupdesc,
+					NULL, 0, NULL,
 					cache->cc_nkeys,
 					cur_skey,
 					res);
@@ -1459,6 +1460,7 @@ SearchCatCacheList(CatCache *cache,
 			continue;
 		HeapKeyTest(&cl->tuple,
 					cache->cc_tupdesc,
+					NULL, 0, NULL,
 					nkeys,
 					cur_skey,
 					res);
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 0d12bbb..d02f44a 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -130,7 +130,9 @@ extern HeapTuple heap_getnext(HeapScanDesc scan, ScanDirection direction);
 extern Size heap_parallelscan_estimate(Snapshot snapshot);
 extern void heap_parallelscan_initialize(ParallelHeapScanDesc target,
 							 Relation relation, Snapshot snapshot);
-extern HeapScanDesc heap_beginscan_parallel(Relation, ParallelHeapScanDesc);
+extern HeapScanDesc heap_beginscan_parallel(Relation relation,
+						ParallelHeapScanDesc parallel_scan, int nkeys,
+						ScanKey key);
 
 extern bool heap_fetch(Relation relation, Snapshot snapshot,
 		   HeapTuple tuple, Buffer *userbuf, bool keep_buf,
diff --git a/src/include/access/relscan.h b/src/include/access/relscan.h
index de98dd6..f237ecd 100644
--- a/src/include/access/relscan.h
+++ b/src/include/access/relscan.h
@@ -20,7 +20,7 @@
 #include "access/itup.h"
 #include "access/tupdesc.h"
 #include "storage/spin.h"
-
+#include "executor/tuptable.h"
 /*
  * Shared state for parallel heap scan.
  *
@@ -70,6 +70,8 @@ typedef struct HeapScanDescData
 	Buffer		rs_cbuf;		/* current buffer in scan, if any */
 	/* NB: if rs_cbuf is not InvalidBuffer, we hold a pin on that buffer */
 	ParallelHeapScanDesc rs_parallel;	/* parallel scan information */
+	TupleTableSlot	*rs_scanslot;
+	MemoryContext	per_tuple_context;
 
 	/* these fields only used in page-at-a-time mode and for bitmap scans */
 	int			rs_cindex;		/* current tuple's index in vistuples */
diff --git a/src/include/access/valid.h b/src/include/access/valid.h
index 30af9df..b1d9bbb 100644
--- a/src/include/access/valid.h
+++ b/src/include/access/valid.h
@@ -21,14 +21,25 @@
  */
 #define HeapKeyTest(tuple, \
 					tupdesc, \
+					slot, \
+					buf, \
+					context, \
 					nkeys, \
 					keys, \
 					result) \
 do \
 { \
 	/* Use underscores to protect the variables passed in as parameters */ \
-	int			__cur_nkeys = (nkeys); \
-	ScanKey		__cur_keys = (keys); \
+	int				__cur_nkeys = (nkeys); \
+	ScanKey			__cur_keys = (keys); \
+	MemoryContext	__oldcontext;	\
+	\
+	if ((slot))	\
+			ExecStoreTuple(tuple,	/* tuple to store */	\
+					   (slot),	/* slot to store in */      \
+					   (buf),		/* buffer associated with this \
+												 * tuple */ \
+					   false);	/* don't pfree this pointer */ \
  \
 	(result) = true; /* may change */ \
 	for (; __cur_nkeys--; __cur_keys++) \
@@ -43,10 +54,13 @@ do \
 			break; \
 		} \
  \
-		__atp = heap_getattr((tuple), \
+ 	 	if (!(slot))	\
+			__atp = heap_getattr((tuple), \
 							 __cur_keys->sk_attno, \
 							 (tupdesc), \
 							 &__isnull); \
+		else \
+		    __atp = slot_getattr((slot), __cur_keys->sk_attno, &__isnull); \
  \
 		if (__isnull) \
 		{ \
@@ -54,9 +68,11 @@ do \
 			break; \
 		} \
  \
+ 	 	__oldcontext = MemoryContextSwitchTo(context);	\
 		__test = FunctionCall2Coll(&__cur_keys->sk_func, \
 								   __cur_keys->sk_collation, \
 								   __atp, __cur_keys->sk_argument); \
+		MemoryContextSwitchTo(__oldcontext);	\
  \
 		if (!DatumGetBool(__test)) \
 		{ \
TPCH_out.tarapplication/x-tar; name=TPCH_out.tarDownload
#21Robert Haas
robertmhaas@gmail.com
In reply to: Dilip Kumar (#20)
Re: Proposal: scan key push down to heap [WIP]

On Sun, Nov 13, 2016 at 12:16 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:

Problem1: As Andres has mentioned, HeapKeyTest uses heap_getattr,
whereas ExecQual use slot_getattr().So we can have worst case
performance problem when very less tuple are getting filter out and we
have table with many columns with qual on most of the columns.

Problem2. In HeapKeyTest we are under per_query_ctx, whereas in
ExecQual we are under per_tuple_ctx , so in former we can not afford
to have any palloc.

Couldn't we just change the current memory context before calling
heap_getnext()? And then change back?

Also, what if we abandoned the idea of pushing qual evaluation all the
way down into the heap and just tried to do HeapKeyTest in SeqNext
itself? Would that be almost as fast, or would it give up most of the
benefits?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#22Dilip Kumar
dilipbalaut@gmail.com
In reply to: Robert Haas (#21)
Re: Proposal: scan key push down to heap [WIP]

On Mon, Nov 14, 2016 at 9:30 PM, Robert Haas <robertmhaas@gmail.com> wrote:

Couldn't we just change the current memory context before calling
heap_getnext()? And then change back?

Right, seems like it will not have any problem..

Also, what if we abandoned the idea of pushing qual evaluation all the
way down into the heap and just tried to do HeapKeyTest in SeqNext
itself? Would that be almost as fast, or would it give up most of the
benefits?

This we can definitely test. I will test and post the data.

Thanks for the suggestion.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#23Dilip Kumar
dilipbalaut@gmail.com
In reply to: Dilip Kumar (#22)
Re: Proposal: scan key push down to heap [WIP]

On Mon, Nov 14, 2016 at 9:44 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:

Also, what if we abandoned the idea of pushing qual evaluation all the
way down into the heap and just tried to do HeapKeyTest in SeqNext
itself? Would that be almost as fast, or would it give up most of the
benefits?

This we can definitely test. I will test and post the data.

Thanks for the suggestion.

Here are some results with this approach...

create table test(a int, b varchar, c varchar, d int);
insert into test values(generate_series(1,10000000), repeat('x', 30),
repeat('y', 30), generate_series(1,10000000));
analyze test;

query: explain analyze select * from test where a < $1;

selectivity head patch1 patch2 patch3
10.00% 840 460 582 689
20.00% 885 563 665 752
50.00% 1076 786 871 910
80.00% 1247 988 1055 1099
100.00% 1386 1123 1193 1237

patch1: Original patch (heap_scankey_pushdown_v1.patch), only
supported for fixed length datatype and use heap_getattr.

patch2: Switches memory context in HeapKeyTest + Store tuple in slot
and use slot_getattr instead of heap_getattr.

patch3: call HeapKeyTest in SeqNext after storing slot, and also
switch memory context before calling HeapKeyTest.

Summary: (performance data)
----------------------------------------
1. At 10% selectivity patch1 shows > 40% gain which reduced to 30% in
patch2 and finally it drops to 17% in patch3.
2. I think patch1 wins over patch2, because patch1 avoid call to ExecStoreTuple.
3. Patch2 wins over patch3 because patch2 can reject tuple in
heapgettup_pagemode whereas patch3 can do it in SeqNext, so patch2 can
avoid some function calls instructions.

Summary (various patches and problems)
-----------------------------------------------------
Patch1:
problem1: This approach has performance problem in some cases (quals
on many columns of the table + high selectivity (>50%)).
problem2: HeapKeyTest uses ExecutorContext so we need to block any
datatype which needs memory allocation during eval functions.

Patch2: Patch2 solves both the problems of patch1, but exposes
executor items to heap and it's not a good design.

Patch3: This solves all the problems exists in patch1+patch2, but
performance is significantly less.

I haven't yet tested patch3 with TPCH, I will do that once machine is available.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#24Dilip Kumar
dilipbalaut@gmail.com
In reply to: Dilip Kumar (#23)
1 attachment(s)
Re: Proposal: scan key push down to heap [WIP]

On Sat, Nov 19, 2016 at 6:48 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:

patch1: Original patch (heap_scankey_pushdown_v1.patch), only
supported for fixed length datatype and use heap_getattr.

patch2: Switches memory context in HeapKeyTest + Store tuple in slot
and use slot_getattr instead of heap_getattr.

patch3: call HeapKeyTest in SeqNext after storing slot, and also
switch memory context before calling HeapKeyTest.

I haven't yet tested patch3 with TPCH, I will do that once machine is available.

As promised, I have taken the performance with TPCH benchmark and
still result are quite good. However this are less compared to older
version (which was exposing expr ctx and slot to heap).

Query Head [1]heap_scankey_pushdown_POC_V3.patch : attached with the mail. Patch3 Improvement
Q3 36122.425 32285.608 10%
Q4 6797 5763.871 15%
Q10 17996.104 15878.505 11%
Q12 12399.651 9969.489 19%

[1]: heap_scankey_pushdown_POC_V3.patch : attached with the mail.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachments:

heap_scankey_pushdown_POC_V3.patchapplication/octet-stream; name=heap_scankey_pushdown_POC_V3.patchDownload
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index b019bc1..ab95fcc 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -299,6 +299,7 @@ initscan(HeapScanDesc scan, ScanKey key, bool keep_startblock)
 	ItemPointerSetInvalid(&scan->rs_ctup.t_self);
 	scan->rs_cbuf = InvalidBuffer;
 	scan->rs_cblock = InvalidBlockNumber;
+	scan->rs_scanslot = NULL;
 
 	/* page-at-a-time fields are always invalid when not rs_inited */
 
@@ -660,7 +661,8 @@ heapgettup(HeapScanDesc scan,
 
 				if (valid && key != NULL)
 					HeapKeyTest(tuple, RelationGetDescr(scan->rs_rd),
-								nkeys, key, valid);
+							NULL, 0, NULL,
+							nkeys, key, valid);
 
 				if (valid)
 				{
@@ -955,7 +957,9 @@ heapgettup_pagemode(HeapScanDesc scan,
 				bool		valid;
 
 				HeapKeyTest(tuple, RelationGetDescr(scan->rs_rd),
-							nkeys, key, valid);
+						scan->rs_scanslot, scan->rs_cbuf,
+						scan->per_tuple_context,
+						nkeys, key, valid);
 				if (valid)
 				{
 					scan->rs_cindex = lineindex;
@@ -1649,7 +1653,8 @@ heap_parallelscan_initialize(ParallelHeapScanDesc target, Relation relation,
  * ----------------
  */
 HeapScanDesc
-heap_beginscan_parallel(Relation relation, ParallelHeapScanDesc parallel_scan)
+heap_beginscan_parallel(Relation relation, ParallelHeapScanDesc parallel_scan,
+						int nkeys, ScanKey key)
 {
 	Snapshot	snapshot;
 
@@ -1657,7 +1662,7 @@ heap_beginscan_parallel(Relation relation, ParallelHeapScanDesc parallel_scan)
 	snapshot = RestoreSnapshot(parallel_scan->phs_snapshot_data);
 	RegisterSnapshot(snapshot);
 
-	return heap_beginscan_internal(relation, snapshot, 0, NULL, parallel_scan,
+	return heap_beginscan_internal(relation, snapshot, nkeys, key, parallel_scan,
 								   true, true, true, false, false, true);
 }
 
diff --git a/src/backend/executor/execScan.c b/src/backend/executor/execScan.c
index fb0013d..3a12ec4 100644
--- a/src/backend/executor/execScan.c
+++ b/src/backend/executor/execScan.c
@@ -205,7 +205,7 @@ ExecScan(ScanState *node,
 		 * when the qual is nil ... saves only a few cycles, but they add up
 		 * ...
 		 */
-		if (!qual || ExecQual(qual, econtext, false))
+		if (!node->ps.qual || ExecQual(node->ps.qual, econtext, false))
 		{
 			/*
 			 * Found a satisfactory scan tuple.
diff --git a/src/backend/executor/nodeSeqscan.c b/src/backend/executor/nodeSeqscan.c
index 00bf3a5..695958f 100644
--- a/src/backend/executor/nodeSeqscan.c
+++ b/src/backend/executor/nodeSeqscan.c
@@ -30,9 +30,14 @@
 #include "executor/execdebug.h"
 #include "executor/nodeSeqscan.h"
 #include "utils/rel.h"
+#include "optimizer/clauses.h"
+#include "executor/executor.h"
+#include "access/valid.h"
 
 static void InitScanRelation(SeqScanState *node, EState *estate, int eflags);
 static TupleTableSlot *SeqNext(SeqScanState *node);
+static void get_scankey_from_qual(TupleDesc tupDesc, List *qual, PlanState *ps,
+							int *nkeys, ScanKey *rs_keys);
 
 /* ----------------------------------------------------------------
  *						Scan Support
@@ -53,6 +58,9 @@ SeqNext(SeqScanState *node)
 	EState	   *estate;
 	ScanDirection direction;
 	TupleTableSlot *slot;
+	int 		 nkeys = 0;
+	ScanKey		 keys = NULL;
+	bool		valid = false;
 
 	/*
 	 * get information from the estate and scan state
@@ -65,36 +73,76 @@ SeqNext(SeqScanState *node)
 	if (scandesc == NULL)
 	{
 		/*
+		 * Pull out scan key from qual list
+		 */
+		get_scankey_from_qual(RelationGetDescr(node->ss.ss_currentRelation),
+							 node->ss.ps.plan->qual, &node->ss.ps,
+							 &nkeys, &keys);
+
+		/*
 		 * We reach here if the scan is not parallel, or if we're executing a
 		 * scan that was intended to be parallel serially.
 		 */
 		scandesc = heap_beginscan(node->ss.ss_currentRelation,
 								  estate->es_snapshot,
 								  0, NULL);
+
+		node->ss.nkey = nkeys;
+		node->ss.keys = keys;
+
+
 		node->ss.ss_currentScanDesc = scandesc;
 	}
 
-	/*
-	 * get the next tuple from the table
-	 */
-	tuple = heap_getnext(scandesc, direction);
+	for (;;)
+	{
+		/*
+		 * get the next tuple from the table
+		 */
+		tuple = heap_getnext(scandesc, direction);
 
-	/*
-	 * save the tuple and the buffer returned to us by the access methods in
-	 * our scan tuple slot and return the slot.  Note: we pass 'false' because
-	 * tuples returned by heap_getnext() are pointers onto disk pages and were
-	 * not created with palloc() and so should not be pfree()'d.  Note also
-	 * that ExecStoreTuple will increment the refcount of the buffer; the
-	 * refcount will not be dropped until the tuple table slot is cleared.
-	 */
-	if (tuple)
-		ExecStoreTuple(tuple,	/* tuple to store */
-					   slot,	/* slot to store in */
-					   scandesc->rs_cbuf,		/* buffer associated with this
-												 * tuple */
-					   false);	/* don't pfree this pointer */
-	else
-		ExecClearTuple(slot);
+		/*
+		 * save the tuple and the buffer returned to us by the access methods in
+		 * our scan tuple slot and return the slot.  Note: we pass 'false' because
+		 * tuples returned by heap_getnext() are pointers onto disk pages and were
+		 * not created with palloc() and so should not be pfree()'d.  Note also
+		 * that ExecStoreTuple will increment the refcount of the buffer; the
+		 * refcount will not be dropped until the tuple table slot is cleared.
+		 *
+		 * If scandesc have valid scanslot means we would have already stored
+		 * tuple in slot, so no need to do it again.
+		 */
+		if (tuple)
+			ExecStoreTuple(tuple,	/* tuple to store */
+						   slot,	/* slot to store in */
+						   scandesc->rs_cbuf,		/* buffer associated with this
+													 * tuple */
+						   false);	/* don't pfree this pointer */
+		else
+			ExecClearTuple(slot);
+
+		if (tuple)
+		{
+			MemoryContext oldctx;
+
+			oldctx = MemoryContextSwitchTo(node->ss.ps.ps_ExprContext->ecxt_per_tuple_memory);
+			HeapKeyTest(tuple, RelationGetDescr(node->ss.ss_currentRelation),
+					slot, scandesc->rs_cbuf, NULL,
+					node->ss.nkey, node->ss.keys, valid);
+			if (valid != true)
+			{
+				MemoryContextSwitchTo(oldctx);
+				MemoryContextReset(node->ss.ps.ps_ExprContext->ecxt_per_tuple_memory);
+				continue;
+			}
+
+			MemoryContextSwitchTo(oldctx);
+
+			return slot;
+		}
+		else
+			break;
+	}
 
 	return slot;
 }
@@ -317,14 +365,24 @@ ExecSeqScanInitializeDSM(SeqScanState *node,
 {
 	EState	   *estate = node->ss.ps.state;
 	ParallelHeapScanDesc pscan;
+	int					 nkeys = 0;
+	ScanKey				 keys = NULL;
+
+	/*
+	 * Pull out scan key from qual list
+	 */
+	get_scankey_from_qual(RelationGetDescr(node->ss.ss_currentRelation),
+						 node->ss.ps.plan->qual, &node->ss.ps,
+						 &nkeys, &keys);
 
 	pscan = shm_toc_allocate(pcxt->toc, node->pscan_len);
 	heap_parallelscan_initialize(pscan,
 								 node->ss.ss_currentRelation,
 								 estate->es_snapshot);
 	shm_toc_insert(pcxt->toc, node->ss.ps.plan->plan_node_id, pscan);
-	node->ss.ss_currentScanDesc =
-		heap_beginscan_parallel(node->ss.ss_currentRelation, pscan);
+	node->ss.ss_currentScanDesc = heap_beginscan_parallel(
+										node->ss.ss_currentRelation, pscan,
+										nkeys, keys);
 }
 
 /* ----------------------------------------------------------------
@@ -337,8 +395,129 @@ void
 ExecSeqScanInitializeWorker(SeqScanState *node, shm_toc *toc)
 {
 	ParallelHeapScanDesc pscan;
+	int					 nkeys = 0;
+	ScanKey				 keys = NULL;
 
 	pscan = shm_toc_lookup(toc, node->ss.ps.plan->plan_node_id);
-	node->ss.ss_currentScanDesc =
-		heap_beginscan_parallel(node->ss.ss_currentRelation, pscan);
+
+	/*
+	 * Pull out scan key from qual list
+	 */
+	get_scankey_from_qual(RelationGetDescr(node->ss.ss_currentRelation),
+						 node->ss.ps.plan->qual, &node->ss.ps,
+						 &nkeys, &keys);
+
+	node->ss.ss_currentScanDesc = heap_beginscan_parallel(
+										node->ss.ss_currentRelation, pscan,
+										nkeys, keys);
+}
+
+/*
+ * get_scankey_from_qual
+ *
+ * process complete QUAL list and create scankey entry for pushable quals.
+ *
+ * quals which can be pushed down:
+ * 1. qual should be of form VAR op CONST.
+ * 2. VAR should be for datatype with fixed length.
+ */
+static void
+get_scankey_from_qual(TupleDesc tupDesc, List *qual, PlanState *ps,
+					int *nkeys, ScanKey *rs_keys)
+{
+	ListCell   		*l, *l1;
+	Expr 	   		*expr;
+	ListCell   		*prev = NULL;
+	ScanKey			 key;
+	Expr			*leftop;
+	Expr			*rightop;
+	AttrNumber	 	 varattno;
+	Datum		 	 scanvalue;
+
+	/*
+	 * Validate each qual whether this qual can be push down to heap node
+	 * or not, If this can be pushed down then create a ScanKey entry
+	 * and delete it from qual list of PlanState
+	 */
+	forboth(l, qual, l1, ps->qual)
+	{
+		expr = (Expr *) lfirst(l);
+
+		if (!IsA(expr, OpExpr))
+		{
+			prev = l1;
+			continue;
+		}
+
+
+		/* leftop should be the Var, possibly relabeled. */
+		leftop = (Expr *) get_leftop(expr);
+		if (leftop && IsA(leftop, RelabelType))
+			leftop = ((RelabelType *) leftop)->arg;
+
+		Assert(leftop != NULL);
+
+		/* If leftop is not var then continue and check next qual. */
+		if (!(IsA(leftop, Var)))
+		{
+			prev = l1;
+			continue;
+		}
+
+		varattno = ((Var *) leftop)->varattno;
+
+		/*
+		 * We don't want to push down the qual which can have variable
+		 * length data.
+		 */
+		if (varattno <= 0)
+		{
+			prev = l1;
+			continue;
+		}
+
+		rightop = (Expr *) get_rightop(expr);
+		if (rightop && IsA(rightop, RelabelType))
+			rightop = ((RelabelType *) rightop)->arg;
+
+		Assert(rightop != NULL);
+
+		/*
+		 * If right op is not constant then we can not push down this key
+		 * so move to next qual.
+		 */
+		if (!IsA(rightop, Const))
+		{
+			prev = l1;
+			continue;
+		}
+
+		scanvalue = ((Const *) rightop)->constvalue;
+
+		/* If we havn't yet allocated memory for scan keys, then do so. */
+		if (*rs_keys == NULL)
+		{
+			*rs_keys = (ScanKey) palloc(sizeof(ScanKeyData) *
+									list_length(qual));
+			key = *rs_keys;
+		}
+
+		/* Create scan key entry */
+		ScanKeyInit(key,
+				varattno,
+				InvalidStrategy,
+				((OpExpr *) expr)->opfuncid,
+				scanvalue);
+
+		key->sk_collation = ((OpExpr *) expr)->inputcollid;
+
+		key++;
+		(*nkeys)++;
+
+		/*
+		 * We have created scankey for this qual entry so remove it from
+		 * qual list.
+		 */
+		ps->qual = list_delete_cell(ps->qual, l1, prev);
+	}
 }
diff --git a/src/backend/utils/cache/catcache.c b/src/backend/utils/cache/catcache.c
index 6016d19..48e50ce 100644
--- a/src/backend/utils/cache/catcache.c
+++ b/src/backend/utils/cache/catcache.c
@@ -1168,6 +1168,7 @@ SearchCatCache(CatCache *cache,
 		 */
 		HeapKeyTest(&ct->tuple,
 					cache->cc_tupdesc,
+					NULL, 0, NULL,
 					cache->cc_nkeys,
 					cur_skey,
 					res);
@@ -1459,6 +1460,7 @@ SearchCatCacheList(CatCache *cache,
 			continue;
 		HeapKeyTest(&cl->tuple,
 					cache->cc_tupdesc,
+					NULL, 0, NULL,
 					nkeys,
 					cur_skey,
 					res);
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 0d12bbb..d02f44a 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -130,7 +130,9 @@ extern HeapTuple heap_getnext(HeapScanDesc scan, ScanDirection direction);
 extern Size heap_parallelscan_estimate(Snapshot snapshot);
 extern void heap_parallelscan_initialize(ParallelHeapScanDesc target,
 							 Relation relation, Snapshot snapshot);
-extern HeapScanDesc heap_beginscan_parallel(Relation, ParallelHeapScanDesc);
+extern HeapScanDesc heap_beginscan_parallel(Relation relation,
+						ParallelHeapScanDesc parallel_scan, int nkeys,
+						ScanKey key);
 
 extern bool heap_fetch(Relation relation, Snapshot snapshot,
 		   HeapTuple tuple, Buffer *userbuf, bool keep_buf,
diff --git a/src/include/access/relscan.h b/src/include/access/relscan.h
index de98dd6..f237ecd 100644
--- a/src/include/access/relscan.h
+++ b/src/include/access/relscan.h
@@ -20,7 +20,7 @@
 #include "access/itup.h"
 #include "access/tupdesc.h"
 #include "storage/spin.h"
-
+#include "executor/tuptable.h"
 /*
  * Shared state for parallel heap scan.
  *
@@ -70,6 +70,8 @@ typedef struct HeapScanDescData
 	Buffer		rs_cbuf;		/* current buffer in scan, if any */
 	/* NB: if rs_cbuf is not InvalidBuffer, we hold a pin on that buffer */
 	ParallelHeapScanDesc rs_parallel;	/* parallel scan information */
+	TupleTableSlot	*rs_scanslot;
+	MemoryContext	per_tuple_context;
 
 	/* these fields only used in page-at-a-time mode and for bitmap scans */
 	int			rs_cindex;		/* current tuple's index in vistuples */
diff --git a/src/include/access/valid.h b/src/include/access/valid.h
index 30af9df..bdcb390 100644
--- a/src/include/access/valid.h
+++ b/src/include/access/valid.h
@@ -21,15 +21,18 @@
  */
 #define HeapKeyTest(tuple, \
 					tupdesc, \
+					slot, \
+					buf, \
+					context, \
 					nkeys, \
 					keys, \
 					result) \
 do \
 { \
 	/* Use underscores to protect the variables passed in as parameters */ \
-	int			__cur_nkeys = (nkeys); \
-	ScanKey		__cur_keys = (keys); \
- \
+	int				__cur_nkeys = (nkeys); \
+	ScanKey			__cur_keys = (keys); \
+	\
 	(result) = true; /* may change */ \
 	for (; __cur_nkeys--; __cur_keys++) \
 	{ \
@@ -43,10 +46,13 @@ do \
 			break; \
 		} \
  \
-		__atp = heap_getattr((tuple), \
+ 	 	if (!(slot))	\
+			__atp = heap_getattr((tuple), \
 							 __cur_keys->sk_attno, \
 							 (tupdesc), \
 							 &__isnull); \
+		else \
+		    __atp = slot_getattr((slot), __cur_keys->sk_attno, &__isnull); \
  \
 		if (__isnull) \
 		{ \
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index f6f73f3..faad0a1 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -1256,6 +1256,8 @@ typedef struct ScanState
 	Relation	ss_currentRelation;
 	HeapScanDesc ss_currentScanDesc;
 	TupleTableSlot *ss_ScanTupleSlot;
+	int			nkey;
+	ScanKey		keys;
 } ScanState;
 
 /* ----------------
#25Dilip Kumar
dilipbalaut@gmail.com
In reply to: Dilip Kumar (#24)
Re: Proposal: scan key push down to heap [WIP]

On Mon, Nov 28, 2016 at 3:00 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:

As promised, I have taken the performance with TPCH benchmark and
still result are quite good. However this are less compared to older
version (which was exposing expr ctx and slot to heap).

Query Head [1] Patch3 Improvement
Q3 36122.425 32285.608 10%
Q4 6797 5763.871 15%
Q10 17996.104 15878.505 11%
Q12 12399.651 9969.489 19%

[1] heap_scankey_pushdown_POC_V3.patch : attached with the mail.

I forgot to mention the configuration parameter in last mail..

TPCH-scale factor 10.
work mem 20MB
Power, 4 socket machine
Shared Buffer 1GB

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#26Robert Haas
robertmhaas@gmail.com
In reply to: Dilip Kumar (#24)
Re: Proposal: scan key push down to heap [WIP]

On Mon, Nov 28, 2016 at 4:30 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Sat, Nov 19, 2016 at 6:48 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:

patch1: Original patch (heap_scankey_pushdown_v1.patch), only
supported for fixed length datatype and use heap_getattr.

patch2: Switches memory context in HeapKeyTest + Store tuple in slot
and use slot_getattr instead of heap_getattr.

patch3: call HeapKeyTest in SeqNext after storing slot, and also
switch memory context before calling HeapKeyTest.

I haven't yet tested patch3 with TPCH, I will do that once machine is available.

As promised, I have taken the performance with TPCH benchmark and
still result are quite good. However this are less compared to older
version (which was exposing expr ctx and slot to heap).

Query Head [1] Patch3 Improvement
Q3 36122.425 32285.608 10%
Q4 6797 5763.871 15%
Q10 17996.104 15878.505 11%
Q12 12399.651 9969.489 19%

[1] heap_scankey_pushdown_POC_V3.patch : attached with the mail.

I think we should go with this approach. I don't think it's a good
idea to have the heapam layer know about executor slots. Even though
it's a little sad to pass up an opportunity for a larger performance
improvement, this improvement is still quite good. However, there's a
fair amount of this patch that doesn't look right:

- The changes to heapam.c shouldn't be needed any more. Ditto
valid.h, relscan.h, catcache.c and maybe some other stuff.

- get_scankey_from_qual() should be done at plan time, not execution
time. Just as index scans already divide up quals between "Index
Cond" and "Filter" (see ExplainNode), I think Seq Scans are now going
to need to something similar. Obviously, "Index Cond" isn't an
appropriate name for things that we test via HeapKeyTest, but maybe
"Heap Cond" would be suitable. That's going to be a fair amount of
refactoring, since the "typedef Scan SeqScan" in plannodes.h is going
to need to be replaced by an actual new structure definition.

- get_scankey_from_qual()'s prohibition on variable-width columns is
presumably no longer necessary with this approach, right?

- Anything tested in SeqNext() will also need to be retested in
SeqRecheck(); otherwise, the test will be erroneously skipped during
EPQ rechecks.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#27Dilip Kumar
dilipbalaut@gmail.com
In reply to: Robert Haas (#26)
Re: Proposal: scan key push down to heap [WIP]

On Mon, Nov 28, 2016 at 8:25 PM, Robert Haas <robertmhaas@gmail.com> wrote:

I think we should go with this approach. I don't think it's a good
idea to have the heapam layer know about executor slots. Even though
it's a little sad to pass up an opportunity for a larger performance
improvement, this improvement is still quite good.

I agree.

However, there's a

fair amount of this patch that doesn't look right:

- The changes to heapam.c shouldn't be needed any more. Ditto
valid.h, relscan.h, catcache.c and maybe some other stuff.

Actually we want to call slot_getattr instead heap_getattr, because of
problem mentioned by Andres upthread and we also saw in test results.

Should we make a copy of HeapKeyTest lets say ExecKeyTest and keep it
under executor ?

ExecKeyTest will be same as HeapKeyTest but it will call slot_getattr
instead of heap_getattr.

- get_scankey_from_qual() should be done at plan time, not execution
time. Just as index scans already divide up quals between "Index
Cond" and "Filter" (see ExplainNode), I think Seq Scans are now going
to need to something similar. Obviously, "Index Cond" isn't an
appropriate name for things that we test via HeapKeyTest, but maybe
"Heap Cond" would be suitable. That's going to be a fair amount of
refactoring, since the "typedef Scan SeqScan" in plannodes.h is going
to need to be replaced by an actual new structure definition.

Okay.

- get_scankey_from_qual()'s prohibition on variable-width columns is
presumably no longer necessary with this approach, right?

Correct.

- Anything tested in SeqNext() will also need to be retested in
SeqRecheck(); otherwise, the test will be erroneously skipped during
EPQ rechecks.

Okay..

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#28Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#26)
Re: Proposal: scan key push down to heap [WIP]

On 2016-11-28 09:55:00 -0500, Robert Haas wrote:

I think we should go with this approach. I don't think it's a good
idea to have the heapam layer know about executor slots.

I agree that that's not pretty.

Even though
it's a little sad to pass up an opportunity for a larger performance
improvement, this improvement is still quite good.

It's imo not primarily about a larger performance improvement, but about
avoid possible regressions. Doubling deforming of wide tuples will have
noticeable impact on some workloads. I don't think we can disregard
that.

Andres

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#29Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#28)
Re: Proposal: scan key push down to heap [WIP]

On Mon, Nov 28, 2016 at 11:25 PM, Andres Freund <andres@anarazel.de> wrote:

On 2016-11-28 09:55:00 -0500, Robert Haas wrote:

I think we should go with this approach. I don't think it's a good
idea to have the heapam layer know about executor slots.

I agree that that's not pretty.

Even though
it's a little sad to pass up an opportunity for a larger performance
improvement, this improvement is still quite good.

It's imo not primarily about a larger performance improvement, but about
avoid possible regressions. Doubling deforming of wide tuples will have
noticeable impact on some workloads. I don't think we can disregard
that.

I wasn't proposing to disregard that. I'm saying hoist the tests up
into nodeSeqScan.c where they can use the slot without breaking the
abstraction. It's not quite as fast but it's a lot cleaner.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#30Robert Haas
robertmhaas@gmail.com
In reply to: Dilip Kumar (#27)
Re: Proposal: scan key push down to heap [WIP]

On Mon, Nov 28, 2016 at 11:17 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:

Actually we want to call slot_getattr instead heap_getattr, because of
problem mentioned by Andres upthread and we also saw in test results.

Ah, right.

Should we make a copy of HeapKeyTest lets say ExecKeyTest and keep it
under executor ?

Sure.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#31Dilip Kumar
dilipbalaut@gmail.com
In reply to: Robert Haas (#30)
1 attachment(s)
Re: Proposal: scan key push down to heap [WIP]

On Tue, Nov 29, 2016 at 11:21 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Nov 28, 2016 at 11:17 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:

Actually we want to call slot_getattr instead heap_getattr, because of
problem mentioned by Andres upthread and we also saw in test results.

Ah, right.

Should we make a copy of HeapKeyTest lets say ExecKeyTest and keep it
under executor ?

Sure.

I have worked on the idea you suggested upthread. POC patch is attached.

Todo:
1. Comments.
2. Test.
3. Some regress output will change as we are adding some extra
information to analyze output.

I need some suggestion..

1. As we decided to separate scankey and qual during planning time. So
I am doing it at create_seqscan_path. My question is currently we
don't have path node for seqscan, so where should we store scankey ?
In Path node, or create new SeqScanPath node ?. In attached patch I
have stored in Path node.

2. This is regarding instrumentation information for scan key, after
my changes currently explain analyze result will look like this.

postgres=# explain (analyze, costs off) select * from lineitem
where l_shipmode in ('MAIL', 'AIR')
and l_receiptdate >= date '1994-01-01';
QUERY PLAN
--------------------------------------------------------------------------
Seq Scan on lineitem (actual time=0.022..12179.946 rows=6238212 loops=1)
Scan Key: (l_receiptdate >= '1994-01-01'::date)
Filter: (l_shipmode = ANY ('{MAIL,AIR}'::bpchar[]))
Rows Removed by Scan Key: 8162088
Rows Removed by Filter: 15599495
Planning time: 0.182 ms
Execution time: 12410.529 ms

My question is, how should we show pushdown filters ?
In above plan I am showing as "Scan Key: ", does this look fine or we
should name it something else ?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachments:

heap_scankey_pushdown_POC_v4.patchapplication/octet-stream; name=heap_scankey_pushdown_POC_v4.patchDownload
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 0a669d9..1fc5068 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -1338,7 +1338,6 @@ ExplainNode(PlanState *planstate, List *ancestors,
 			show_tablesample(((SampleScan *) plan)->tablesample,
 							 planstate, ancestors, es);
 			/* FALL THRU to print additional fields the same as SeqScan */
-		case T_SeqScan:
 		case T_ValuesScan:
 		case T_CteScan:
 		case T_WorkTableScan:
@@ -1348,6 +1347,20 @@ ExplainNode(PlanState *planstate, List *ancestors,
 				show_instrumentation_count("Rows Removed by Filter", 1,
 										   planstate, es);
 			break;
+		case T_SeqScan:
+			{
+				SeqScan		*scan = (SeqScan*)plan;
+				show_scan_qual(scan->simplequal, "Scan Key", planstate,
+						ancestors, es);
+				show_scan_qual(plan->qual, "Filter", planstate, ancestors, es);
+				if (scan->simplequal)
+					show_instrumentation_count("Rows Removed by Scan Key", 2,
+											planstate, es);
+				if (plan->qual)
+					show_instrumentation_count("Rows Removed by Filter", 1,
+											planstate, es);
+			}
+			break;
 		case T_Gather:
 			{
 				Gather	   *gather = (Gather *) plan;
diff --git a/src/backend/executor/nodeSeqscan.c b/src/backend/executor/nodeSeqscan.c
index 00bf3a5..8da2884 100644
--- a/src/backend/executor/nodeSeqscan.c
+++ b/src/backend/executor/nodeSeqscan.c
@@ -30,9 +30,12 @@
 #include "executor/execdebug.h"
 #include "executor/nodeSeqscan.h"
 #include "utils/rel.h"
+#include "optimizer/clauses.h"
+#include "utils/memutils.h"
 
 static void InitScanRelation(SeqScanState *node, EState *estate, int eflags);
 static TupleTableSlot *SeqNext(SeqScanState *node);
+static void prepare_scankey(ScanState *node, List *qual);
 
 /* ----------------------------------------------------------------
  *						Scan Support
@@ -53,6 +56,8 @@ SeqNext(SeqScanState *node)
 	EState	   *estate;
 	ScanDirection direction;
 	TupleTableSlot *slot;
+	bool			valid = false;
+	MemoryContext 	oldctx;
 
 	/*
 	 * get information from the estate and scan state
@@ -74,27 +79,56 @@ SeqNext(SeqScanState *node)
 		node->ss.ss_currentScanDesc = scandesc;
 	}
 
-	/*
-	 * get the next tuple from the table
-	 */
-	tuple = heap_getnext(scandesc, direction);
+	for (;;)
+	{
+		/*
+		 * get the next tuple from the table
+		 */
+		tuple = heap_getnext(scandesc, direction);
 
-	/*
-	 * save the tuple and the buffer returned to us by the access methods in
-	 * our scan tuple slot and return the slot.  Note: we pass 'false' because
-	 * tuples returned by heap_getnext() are pointers onto disk pages and were
-	 * not created with palloc() and so should not be pfree()'d.  Note also
-	 * that ExecStoreTuple will increment the refcount of the buffer; the
-	 * refcount will not be dropped until the tuple table slot is cleared.
-	 */
-	if (tuple)
-		ExecStoreTuple(tuple,	/* tuple to store */
-					   slot,	/* slot to store in */
-					   scandesc->rs_cbuf,		/* buffer associated with this
-												 * tuple */
-					   false);	/* don't pfree this pointer */
-	else
-		ExecClearTuple(slot);
+		/*
+		 * save the tuple and the buffer returned to us by the access methods
+		 * in our scan tuple slot and return the slot.  Note: we pass 'false'
+		 * because tuples returned by heap_getnext() are pointers onto disk
+		 * pages and were not created with palloc() and so should not be
+		 * pfree()'d.  Note also that ExecStoreTuple will increment the
+		 * refcount of the buffer; the refcount will not be dropped until the
+		 * tuple table slot is cleared.
+		 */
+		if (tuple)
+			ExecStoreTuple(tuple,	/* tuple to store */
+						   slot,	/* slot to store in */
+						   scandesc->rs_cbuf,		/* buffer associated with this
+													 * tuple */
+						   false);	/* don't pfree this pointer */
+		else
+		{
+			ExecClearTuple(slot);
+			break;
+		}
+
+		oldctx = MemoryContextSwitchTo(
+						node->ss.ps.ps_ExprContext->ecxt_per_tuple_memory);
+
+		ExecKeyTest(tuple, RelationGetDescr(node->ss.ss_currentRelation),
+				slot, scandesc->rs_cbuf, NULL, node->ss.nkey,
+				node->ss.keys, valid);
+		if (valid != true)
+		{
+			/*
+			 * tuple fail qual so reset the expression context and move
+			 * to next tuple.
+			 */
+			MemoryContextSwitchTo(oldctx);
+			ResetExprContext(node->ss.ps.ps_ExprContext);
+			InstrCountFiltered2(node, 1);
+			continue;
+		}
+
+		MemoryContextSwitchTo(oldctx);
+
+		return slot;
+	}
 
 	return slot;
 }
@@ -195,6 +229,8 @@ ExecInitSeqScan(SeqScan *node, EState *estate, int eflags)
 		ExecInitExpr((Expr *) node->plan.qual,
 					 (PlanState *) scanstate);
 
+	prepare_scankey(&scanstate->ss, node->simplequal);
+
 	/*
 	 * tuple table initialization
 	 */
@@ -342,3 +378,63 @@ ExecSeqScanInitializeWorker(SeqScanState *node, shm_toc *toc)
 	node->ss.ss_currentScanDesc =
 		heap_beginscan_parallel(node->ss.ss_currentRelation, pscan);
 }
+
+/*
+ * prepare_scankey
+ *
+ * process complete simplequal list and create scankey entry.
+ */
+static void
+prepare_scankey(ScanState *node, List *qual)
+{
+	ListCell   		*l;
+	Expr 	   		*expr;
+	ScanKey			 keys;
+	Expr			*leftop;
+	Expr			*rightop;
+	AttrNumber	 	 varattno;
+	Datum		 	 scanvalue;
+
+	if (qual == NULL)
+		return;
+
+	node->nkey = list_length(qual);
+	keys = node->keys = (ScanKey) palloc(sizeof(ScanKeyData) * node->nkey);
+
+	foreach(l, qual)
+	{
+		expr = (Expr *) lfirst(l);
+
+		Assert(IsA(expr, OpExpr));
+
+		leftop = (Expr *) get_leftop(expr);
+		rightop = (Expr *) get_rightop(expr);
+
+		Assert(leftop && rightop);
+
+		/* Right op must be of RelabelType or Var type */
+		Assert(IsA(leftop, RelabelType) || IsA(leftop, Var));
+
+		/* Lest op must be of RelabelType or Const type*/
+		Assert(IsA(leftop, RelabelType) || IsA(leftop, Const));
+
+		if (IsA(leftop, RelabelType))
+			leftop = ((RelabelType *) leftop)->arg;
+
+		if (IsA(rightop, RelabelType))
+			rightop = ((RelabelType *) rightop)->arg;
+
+		varattno = ((Var *) leftop)->varattno;
+		scanvalue = ((Const *) rightop)->constvalue;
+
+		/* Create scan key entry */
+		ScanKeyInit(keys,
+				varattno,
+				InvalidStrategy,
+				((OpExpr *) expr)->opfuncid,
+				scanvalue);
+
+		keys->sk_collation = ((OpExpr *) expr)->inputcollid;
+		keys++;
+	}
+}
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 04e49b7..90bdfd8 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -354,6 +354,7 @@ CopyScanFields(const Scan *from, Scan *newnode)
 	CopyPlanFields((const Plan *) from, (Plan *) newnode);
 
 	COPY_SCALAR_FIELD(scanrelid);
+	COPY_NODE_FIELD(simplequal);
 }
 
 /*
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 748b687..32c1824 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -291,6 +291,7 @@ _outScanInfo(StringInfo str, const Scan *node)
 	_outPlanInfo(str, (const Plan *) node);
 
 	WRITE_UINT_FIELD(scanrelid);
+	WRITE_NODE_FIELD(simplequal);
 }
 
 /*
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 917e6c8..b8f32aa 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1604,6 +1604,7 @@ ReadCommonScan(Scan *local_node)
 	ReadCommonPlan(&local_node->plan);
 
 	READ_UINT_FIELD(scanrelid);
+	READ_NODE_FIELD(simplequal);
 }
 
 /*
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index ad49674..c795a7a 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -158,7 +158,8 @@ static void copy_generic_path_info(Plan *dest, Path *src);
 static void copy_plan_costsize(Plan *dest, Plan *src);
 static void label_sort_with_costsize(PlannerInfo *root, Sort *plan,
 						 double limit_tuples);
-static SeqScan *make_seqscan(List *qptlist, List *qpqual, Index scanrelid);
+static SeqScan *make_seqscan(List *qptlist, List *qpqual, List *simplequal,
+						Index scanrelid);
 static SampleScan *make_samplescan(List *qptlist, List *qpqual, Index scanrelid,
 				TableSampleClause *tsc);
 static IndexScan *make_indexscan(List *qptlist, List *qpqual, Index scanrelid,
@@ -2265,6 +2266,9 @@ create_seqscan_plan(PlannerInfo *root, Path *best_path,
 {
 	SeqScan    *scan_plan;
 	Index		scan_relid = best_path->parent->relid;
+	List	   *qpqual;
+	List	   *simplequal;
+	ListCell   *l;
 
 	/* it should be a base rel... */
 	Assert(scan_relid > 0);
@@ -2273,19 +2277,32 @@ create_seqscan_plan(PlannerInfo *root, Path *best_path,
 	/* Sort clauses into best execution order */
 	scan_clauses = order_qual_clauses(root, scan_clauses);
 
+	qpqual = NIL;
+	foreach(l, scan_clauses)
+	{
+		RestrictInfo *rinfo = (RestrictInfo *) lfirst(l);
+
+		Assert(IsA(rinfo, RestrictInfo));
+		if (list_member_ptr(best_path->simplequal, rinfo))
+			continue;			/* simple duplicate */
+		qpqual = lappend(qpqual, rinfo);
+	}
+
 	/* Reduce RestrictInfo list to bare expressions; ignore pseudoconstants */
-	scan_clauses = extract_actual_clauses(scan_clauses, false);
+	qpqual = extract_actual_clauses(qpqual, false);
+	simplequal = extract_actual_clauses(best_path->simplequal, false);
+
 
 	/* Replace any outer-relation variables with nestloop params */
 	if (best_path->param_info)
 	{
-		scan_clauses = (List *)
-			replace_nestloop_params(root, (Node *) scan_clauses);
+		qpqual = (List *) replace_nestloop_params(root, (Node *) qpqual);
 	}
 
 	scan_plan = make_seqscan(tlist,
-							 scan_clauses,
-							 scan_relid);
+							qpqual,
+							simplequal,
+							scan_relid);
 
 	copy_generic_path_info(&scan_plan->plan, best_path);
 
@@ -4656,6 +4673,7 @@ label_sort_with_costsize(PlannerInfo *root, Sort *plan, double limit_tuples)
 static SeqScan *
 make_seqscan(List *qptlist,
 			 List *qpqual,
+			 List *simplequal,
 			 Index scanrelid)
 {
 	SeqScan    *node = makeNode(SeqScan);
@@ -4666,7 +4684,7 @@ make_seqscan(List *qptlist,
 	plan->lefttree = NULL;
 	plan->righttree = NULL;
 	node->scanrelid = scanrelid;
-
+	node->simplequal = simplequal;
 	return node;
 }
 
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index d91bc3b..2dea1de 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -453,6 +453,8 @@ set_plan_refs(PlannerInfo *root, Plan *plan, int rtoffset)
 				SeqScan    *splan = (SeqScan *) plan;
 
 				splan->scanrelid += rtoffset;
+				splan->simplequal =
+					fix_scan_list(root, splan->simplequal, rtoffset);
 				splan->plan.targetlist =
 					fix_scan_list(root, splan->plan.targetlist, rtoffset);
 				splan->plan.qual =
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 6d3ccfd..0df72a0 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -46,6 +46,7 @@ typedef enum
 #define STD_FUZZ_FACTOR 1.01
 
 static List *translate_sub_tlist(List *tlist, int relid);
+static void create_simple_qual(RelOptInfo *rel, Path *path);
 
 
 /*****************************************************************************
@@ -957,6 +958,9 @@ create_seqscan_path(PlannerInfo *root, RelOptInfo *rel,
 
 	cost_seqscan(pathnode, root, rel, pathnode->param_info);
 
+	/* create simple qual from baserestrictinfo */
+	create_simple_qual(rel, pathnode);
+
 	return pathnode;
 }
 
@@ -3209,3 +3213,61 @@ reparameterize_path(PlannerInfo *root, Path *path,
 	}
 	return NULL;
 }
+
+/*
+ * create_simple_qual
+ *
+ *		Find the simple qual from baserestriction info.
+ *	Process baserestriction and find out all qual which is of form
+ *	var op const.
+ */
+static void
+create_simple_qual(RelOptInfo *rel, Path *path)
+{
+	ListCell	*lc;
+	List		*clauses = rel->baserestrictinfo;
+
+	foreach(lc, clauses)
+	{
+		RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);
+		Expr	     *clause = rinfo->clause;
+		Node	     *leftop,
+				     *rightop;
+
+		if (is_opclause(clause))
+		{
+			leftop = get_leftop(clause);
+			rightop = get_rightop(clause);
+
+			if (!leftop || !rightop)
+				continue;
+
+			if (leftop && IsA(leftop, RelabelType))
+				leftop = (Node*)(((RelabelType *) leftop)->arg);
+
+			Assert(leftop != NULL);
+
+			/* If leftop is not var then continue and check next qual. */
+			if (!(IsA(leftop, Var)))
+			{
+				continue;
+			}
+
+			if (rightop && IsA(rightop, RelabelType))
+				rightop = (Node*)(((RelabelType *) rightop)->arg);
+
+			Assert(rightop != NULL);
+
+			/*
+			 * If right op is not constant then this can not be executed as
+			 * simple qual.
+			 */
+			if (!IsA(rightop, Const))
+			{
+				continue;
+			}
+
+			path->simplequal = lappend(path->simplequal, rinfo);
+		}
+	}
+}
diff --git a/src/include/executor/nodeSeqscan.h b/src/include/executor/nodeSeqscan.h
index f2e61ff..44c00d9 100644
--- a/src/include/executor/nodeSeqscan.h
+++ b/src/include/executor/nodeSeqscan.h
@@ -27,4 +27,50 @@ extern void ExecSeqScanEstimate(SeqScanState *node, ParallelContext *pcxt);
 extern void ExecSeqScanInitializeDSM(SeqScanState *node, ParallelContext *pcxt);
 extern void ExecSeqScanInitializeWorker(SeqScanState *node, shm_toc *toc);
 
+/*
+ *		ExecKeyTest
+ *
+ *		Test a heap tuple to see if it satisfies a scan key.
+ */
+#define ExecKeyTest(tuple, \
+					tupdesc, \
+					slot, \
+					buf, \
+					context, \
+					nkeys, \
+					keys, \
+					result) \
+do \
+{ \
+	/* Use underscores to protect the variables passed in as parameters */ \
+	int				__cur_nkeys = (nkeys); \
+	ScanKey			__cur_keys = (keys); \
+	\
+	(result) = true; /* may change */ \
+	for (; __cur_nkeys--; __cur_keys++) \
+	{ \
+		Datum	__atp; \
+		bool	__isnull; \
+		Datum	__test; \
+\
+		__atp = slot_getattr((slot), __cur_keys->sk_attno, &__isnull); \
+ \
+		if (__isnull) \
+		{ \
+			(result) = false; \
+			break; \
+		} \
+ \
+		__test = FunctionCall2Coll(&__cur_keys->sk_func, \
+								   __cur_keys->sk_collation, \
+								   __atp, __cur_keys->sk_argument); \
+ \
+		if (!DatumGetBool(__test)) \
+		{ \
+			(result) = false; \
+			break; \
+		} \
+	} \
+} while (0)
+
 #endif   /* NODESEQSCAN_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index f6f73f3..faad0a1 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -1256,6 +1256,8 @@ typedef struct ScanState
 	Relation	ss_currentRelation;
 	HeapScanDesc ss_currentScanDesc;
 	TupleTableSlot *ss_ScanTupleSlot;
+	int			nkey;
+	ScanKey		keys;
 } ScanState;
 
 /* ----------------
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index e2fbc7d..fd736c5 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -282,8 +282,9 @@ typedef struct BitmapOr
  */
 typedef struct Scan
 {
-	Plan		plan;
-	Index		scanrelid;		/* relid is index into the range table */
+	Plan		 plan;
+	Index		 scanrelid;		/* relid is index into the range table */
+	List		*simplequal;	/* simple var op const qual of seq scan */
 } Scan;
 
 /* ----------------
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 3a1255a..62e96fa 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -895,7 +895,7 @@ typedef struct Path
 	double		rows;			/* estimated number of result tuples */
 	Cost		startup_cost;	/* cost expended before fetching any tuples */
 	Cost		total_cost;		/* total cost (assuming all tuples fetched) */
-
+	List	   *simplequal;		/* quals which can be converted to scankey */
 	List	   *pathkeys;		/* sort ordering of path's output */
 	/* pathkeys is a List of PathKey nodes; see above */
 } Path;
#32Robert Haas
robertmhaas@gmail.com
In reply to: Dilip Kumar (#31)
Re: Proposal: scan key push down to heap [WIP]

On Wed, Nov 30, 2016 at 5:41 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:

1. As we decided to separate scankey and qual during planning time. So
I am doing it at create_seqscan_path. My question is currently we
don't have path node for seqscan, so where should we store scankey ?
In Path node, or create new SeqScanPath node ?. In attached patch I
have stored in Path node.

Well, Path should obviously only contain those things that are common
across all kinds of paths, so I would think you'd need to invent
SeqScanPath.

2. This is regarding instrumentation information for scan key, after
my changes currently explain analyze result will look like this.

postgres=# explain (analyze, costs off) select * from lineitem
where l_shipmode in ('MAIL', 'AIR')
and l_receiptdate >= date '1994-01-01';
QUERY PLAN
--------------------------------------------------------------------------
Seq Scan on lineitem (actual time=0.022..12179.946 rows=6238212 loops=1)
Scan Key: (l_receiptdate >= '1994-01-01'::date)
Filter: (l_shipmode = ANY ('{MAIL,AIR}'::bpchar[]))
Rows Removed by Scan Key: 8162088
Rows Removed by Filter: 15599495
Planning time: 0.182 ms
Execution time: 12410.529 ms

My question is, how should we show pushdown filters ?
In above plan I am showing as "Scan Key: ", does this look fine or we
should name it something else ?

Seems OK to me, but I'm open to better ideas if anybody has any.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#33Andres Freund
andres@anarazel.de
In reply to: Dilip Kumar (#31)
Re: Proposal: scan key push down to heap [WIP]

On 2016-11-30 16:11:23 +0530, Dilip Kumar wrote:

On Tue, Nov 29, 2016 at 11:21 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Nov 28, 2016 at 11:17 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:

Actually we want to call slot_getattr instead heap_getattr, because of
problem mentioned by Andres upthread and we also saw in test results.

Ah, right.

Should we make a copy of HeapKeyTest lets say ExecKeyTest and keep it
under executor ?

Sure.

I have worked on the idea you suggested upthread. POC patch is
attached.

Hm. I'm more than a bit doubful about this approach. Shouldn't we just
*always* do this as part of expression evaluation, instead of
special-casing for seqscans?

I.e. during planning recognize that an OpExpr can be evaluated as a
scankey and then emit different qual evaluation instructions? Because
then the benefit can be everywhere, instead of just seqscans.

I'll post my new expression evaluation stuff - which doesn't do this
atm, but makes ExecQual faster in other ways - later this week. If we
could get the planner (or parse-analysis?) to set an OpExpr flag that
signals that the expression can be evaluated as a scankey, that'd be
easy.

Greetings,

Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#34Haribabu Kommi
kommi.haribabu@gmail.com
In reply to: Andres Freund (#33)
Re: Proposal: scan key push down to heap [WIP]

On Fri, Dec 2, 2016 at 8:41 AM, Andres Freund <andres@anarazel.de> wrote:

On 2016-11-30 16:11:23 +0530, Dilip Kumar wrote:

On Tue, Nov 29, 2016 at 11:21 PM, Robert Haas <robertmhaas@gmail.com>

wrote:

On Mon, Nov 28, 2016 at 11:17 PM, Dilip Kumar <dilipbalaut@gmail.com>

wrote:

Actually we want to call slot_getattr instead heap_getattr, because of
problem mentioned by Andres upthread and we also saw in test results.

Ah, right.

Should we make a copy of HeapKeyTest lets say ExecKeyTest and keep it
under executor ?

Sure.

I have worked on the idea you suggested upthread. POC patch is
attached.

Hm. I'm more than a bit doubful about this approach. Shouldn't we just
*always* do this as part of expression evaluation, instead of
special-casing for seqscans?

I.e. during planning recognize that an OpExpr can be evaluated as a
scankey and then emit different qual evaluation instructions? Because
then the benefit can be everywhere, instead of just seqscans.

I'll post my new expression evaluation stuff - which doesn't do this
atm, but makes ExecQual faster in other ways - later this week. If we
could get the planner (or parse-analysis?) to set an OpExpr flag that
signals that the expression can be evaluated as a scankey, that'd be
easy.

Moved to next CF with "waiting on author" status.

Regards,
Hari Babu
Fujitsu Australia

#35Dilip Kumar
dilipbalaut@gmail.com
In reply to: Andres Freund (#33)
Re: Proposal: scan key push down to heap [WIP]

On Fri, Dec 2, 2016 at 3:11 AM, Andres Freund <andres@anarazel.de> wrote:

Hm. I'm more than a bit doubful about this approach. Shouldn't we just
*always* do this as part of expression evaluation, instead of
special-casing for seqscans?

That make sense, we can actually do this as part of expression
evaluation and we can cover more cases.

I.e. during planning recognize that an OpExpr can be evaluated as a
scankey and then emit different qual evaluation instructions? Because
then the benefit can be everywhere, instead of just seqscans.

I will experiment with this..

I'll post my new expression evaluation stuff - which doesn't do this
atm, but makes ExecQual faster in other ways - later this week. If we
could get the planner (or parse-analysis?) to set an OpExpr flag that
signals that the expression can be evaluated as a scankey, that'd be
easy.

Isn't it better to directly make two separate lists during planning
itself, one for regular qual and other which can be converted to
scankey. Instead of keeping the flag in OpExpr ?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#36Robert Haas
robertmhaas@gmail.com
In reply to: Dilip Kumar (#35)
Re: Proposal: scan key push down to heap [WIP]

On Sat, Dec 3, 2016 at 10:30 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:

I'll post my new expression evaluation stuff - which doesn't do this
atm, but makes ExecQual faster in other ways - later this week. If we
could get the planner (or parse-analysis?) to set an OpExpr flag that
signals that the expression can be evaluated as a scankey, that'd be
easy.

Isn't it better to directly make two separate lists during planning
itself, one for regular qual and other which can be converted to
scankey. Instead of keeping the flag in OpExpr ?

Well, I certainly think that in the end we need them in two separate
lists. But it's possible that a flag in the OpExpr could somehow be
useful in constructing those lists cheaply. I'm not sure I see what
Andres has in mind, though.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers