Sort support for macaddr8

Started by Melanie Plagemanover 6 years ago23 messages

melanieplageman@gmail.com

over 6 years ago

2 attachment(s)

Peter and I implemented this small (attached) patch to extend
abbreviated key compare sort to macaddr8 datatype (currently supported
for macaddr).

I tried checking to see if there is a performance difference using the
attached DDL based on src/test/regress/sql/macaddr8.sql. I found
that the sort function is only exercised when creating an index (not,
for example, when doing some type of aggregation).

With the patch applied to current master and using the DDL attached,
the timing for creating the index hovered around 20 ms for master and
15 ms for the patched version.

Machine and version specs: PostgreSQL 12beta1 on x86_64-pc-linux-gnu
compiled by gcc (Ubuntu 8.3.0-6ubuntu1) 8.3.0, 64-bit

I think that that seems like an improvement. I was thinking of
registering this patch for the next commitfest. Is that okay?

I was just wondering what the accepted way to test and share
performance numbers is for a small patch like this. Is sharing DDL
enough? Do I need to use pg_bench?

--
Melanie Plageman

Attachments:

v1-0001-Implement-SortSupport-for-macaddr8-data-type.patchtext/x-patch; charset=US-ASCII; name=v1-0001-Implement-SortSupport-for-macaddr8-data-type.patchDownload

From 23f66c342924201fd4e67d249c835f8aa149ed87 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Fri, 31 May 2019 15:48:55 -0400
Subject: [PATCH v1 1/1] Implement SortSupport for macaddr8 data type

Introduces a scheme to produce abbreviated keys for the macaddr8 type.

Co-authored-by: Peter Geoghegan <pg@bowt.ie>
---
 src/backend/utils/adt/mac8.c      | 203 ++++++++++++++++++++++++++++++
 src/include/catalog/pg_amproc.dat |   3 +
 src/include/catalog/pg_proc.dat   |   3 +
 3 files changed, 209 insertions(+)

diff --git a/src/backend/utils/adt/mac8.c b/src/backend/utils/adt/mac8.c
index 0b1fe4978e..25568eb3eb 100644
--- a/src/backend/utils/adt/mac8.c
+++ b/src/backend/utils/adt/mac8.c
@@ -21,10 +21,14 @@
 
 #include "postgres.h"
 
+#include "lib/hyperloglog.h"
 #include "libpq/pqformat.h"
+#include "port/pg_bswap.h"
 #include "utils/builtins.h"
+#include "utils/guc.h"
 #include "utils/hashutils.h"
 #include "utils/inet.h"
+#include "utils/sortsupport.h"
 
 /*
  *	Utility macros used for sorting and comparing:
@@ -48,6 +52,20 @@ static const signed char hexlookup[128] = {
 	-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
 };
 
+/* sortsupport for macaddr8 */
+typedef struct
+{
+	int64		input_count;	/* number of non-null values seen */
+	bool		estimating;		/* true if estimating cardinality */
+
+	hyperLogLogState abbr_card; /* cardinality estimator */
+} macaddr8_sortsupport_state;
+
+static int32 macaddr8_cmp_internal(macaddr8 *a1, macaddr8 *a2);
+static int	macaddr8_fast_cmp(Datum x, Datum y, SortSupport ssup);
+static int	macaddr8_cmp_abbrev(Datum x, Datum y, SortSupport ssup);
+static bool macaddr8_abbrev_abort(int memtupcount, SortSupport ssup);
+static Datum macaddr8_abbrev_convert(Datum original, SortSupport ssup);
 /*
  * hex2_to_uchar - convert 2 hex digits to a byte (unsigned char)
  *
@@ -575,3 +593,188 @@ macaddr8tomacaddr(PG_FUNCTION_ARGS)
 
 	PG_RETURN_MACADDR_P(result);
 }
+
+/*
+ * SortSupport strategy function. Populates a SortSupport struct with the
+ * information necessary to use comparison by abbreviated keys.
+ */
+Datum
+macaddr8_sortsupport(PG_FUNCTION_ARGS)
+{
+	SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
+
+	ssup->comparator = macaddr8_fast_cmp;
+	ssup->ssup_extra = NULL;
+
+	if (ssup->abbreviate)
+	{
+		macaddr8_sortsupport_state *uss;
+		MemoryContext oldcontext;
+
+		oldcontext = MemoryContextSwitchTo(ssup->ssup_cxt);
+
+		uss = palloc(sizeof(macaddr8_sortsupport_state));
+		uss->input_count = 0;
+		uss->estimating = true;
+		initHyperLogLog(&uss->abbr_card, 10);
+
+		ssup->ssup_extra = uss;
+
+		ssup->comparator = macaddr8_cmp_abbrev;
+		ssup->abbrev_converter = macaddr8_abbrev_convert;
+		ssup->abbrev_abort = macaddr8_abbrev_abort;
+		ssup->abbrev_full_comparator = macaddr8_fast_cmp;
+
+		MemoryContextSwitchTo(oldcontext);
+	}
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * SortSupport "traditional" comparison function. Pulls two 8-byte MAC
+ * addresses from the heap and runs a standard comparison on them.
+ */
+static int
+macaddr8_fast_cmp(Datum x, Datum y, SortSupport ssup)
+{
+	macaddr8    *arg1 = DatumGetMacaddr8P(x);
+	macaddr8    *arg2 = DatumGetMacaddr8P(y);
+
+	return macaddr8_cmp_internal(arg1, arg2);
+}
+
+/*
+ * SortSupport abbreviated key comparison function. Compares two 8-byte MAC
+ * addresses quickly by treating them like integers, and without having to go
+ * to the heap.
+ */
+static int
+macaddr8_cmp_abbrev(Datum x, Datum y, SortSupport ssup)
+{
+	if (x > y)
+		return 1;
+	else if (x == y)
+		return 0;
+	else
+		return -1;
+}
+
+/*
+ * Callback for estimating effectiveness of abbreviated key optimization.
+ *
+ * We pay no attention to the cardinality of the non-abbreviated data, because
+ * there is no equality fast-path within authoritative macaddr8 comparator.
+ */
+static bool
+macaddr8_abbrev_abort(int memtupcount, SortSupport ssup)
+{
+	macaddr8_sortsupport_state *uss = ssup->ssup_extra;
+	double		abbr_card;
+
+	if (memtupcount < 10000 || uss->input_count < 10000 || !uss->estimating)
+		return false;
+
+	abbr_card = estimateHyperLogLog(&uss->abbr_card);
+
+	/*
+	 * If we have >100k distinct values, then even if we were sorting many
+	 * billion rows we'd likely still break even, and the penalty of undoing
+	 * that many rows of abbrevs would probably not be worth it. At this point
+	 * we stop counting because we know that we're now fully committed.
+	 */
+	if (abbr_card > 100000.0)
+	{
+#ifdef TRACE_SORT
+		if (trace_sort)
+			elog(LOG,
+				 "macaddr8_abbrev: estimation ends at cardinality %f"
+				 " after " INT64_FORMAT " values (%d rows)",
+				 abbr_card, uss->input_count, memtupcount);
+#endif
+		uss->estimating = false;
+		return false;
+	}
+
+	/*
+	 * Target minimum cardinality is 1 per ~2k of non-null inputs. 0.5 row
+	 * fudge factor allows us to abort earlier on genuinely pathological data
+	 * where we've had exactly one abbreviated value in the first 2k
+	 * (non-null) rows.
+	 */
+	if (abbr_card < uss->input_count / 2000.0 + 0.5)
+	{
+#ifdef TRACE_SORT
+		if (trace_sort)
+			elog(LOG,
+				 "macaddr8_abbrev: aborting abbreviation at cardinality %f"
+				 " below threshold %f after " INT64_FORMAT " values (%d rows)",
+				 abbr_card, uss->input_count / 2000.0 + 0.5, uss->input_count,
+				 memtupcount);
+#endif
+		return true;
+	}
+
+#ifdef TRACE_SORT
+	if (trace_sort)
+		elog(LOG,
+			 "macaddr8_abbrev: cardinality %f after " INT64_FORMAT
+			 " values (%d rows)", abbr_card, uss->input_count, memtupcount);
+#endif
+
+	return false;
+}
+
+/*
+ * SortSupport conversion routine. Converts original macaddr8 representation
+ * to abbreviated key representation.
+ *
+ * Packs the bytes of a 8-byte MAC address into a Datum and treats it as an
+ * unsigned integer for purposes of comparison. The integer is converted to
+ * native endianness to facilitate easy comparison.
+ */
+static Datum
+macaddr8_abbrev_convert(Datum original, SortSupport ssup)
+{
+	macaddr8_sortsupport_state *uss = ssup->ssup_extra;
+	macaddr8    *authoritative = DatumGetMacaddr8P(original);
+	Datum		res;
+
+#if SIZEOF_DATUM == 8
+	memset(&res, 0, SIZEOF_DATUM);
+	memcpy(&res, authoritative, sizeof(macaddr8));
+#else							/* SIZEOF_DATUM != 8 */
+	memcpy(&res, authoritative, SIZEOF_DATUM);
+#endif
+	uss->input_count += 1;
+
+	/*
+	 * Cardinality estimation. The estimate uses uint32, so on a 64-bit
+	 * architecture, XOR the two 32-bit halves together to produce slightly
+	 * more entropy.
+	 */
+	if (uss->estimating)
+	{
+		uint32		tmp;
+
+#if SIZEOF_DATUM == 8
+		tmp = (uint32) res ^ (uint32) ((uint64) res >> 32);
+#else							/* SIZEOF_DATUM != 8 */
+		tmp = (uint32) res;
+#endif
+
+		addHyperLogLog(&uss->abbr_card, DatumGetUInt32(hash_uint32(tmp)));
+	}
+
+	/*
+	 * Byteswap on little-endian machines.
+	 *
+	 * This is needed so that macaddr8_cmp_abbrev() (an unsigned integer 3-way
+	 * comparator) works correctly on all platforms. Without this, the
+	 * comparator would have to call memcmp() with a pair of pointers to the
+	 * first byte of each abbreviated key, which is slower.
+	 */
+	res = DatumBigEndianToNative(res);
+
+	return res;
+}
diff --git a/src/include/catalog/pg_amproc.dat b/src/include/catalog/pg_amproc.dat
index 020b7413cc..d77d98cbd1 100644
--- a/src/include/catalog/pg_amproc.dat
+++ b/src/include/catalog/pg_amproc.dat
@@ -214,6 +214,9 @@
   amprocrighttype => 'pg_lsn', amprocnum => '1', amproc => 'pg_lsn_cmp' },
 { amprocfamily => 'btree/macaddr8_ops', amproclefttype => 'macaddr8',
   amprocrighttype => 'macaddr8', amprocnum => '1', amproc => 'macaddr8_cmp' },
+{ amprocfamily => 'btree/macaddr8_ops', amproclefttype => 'macaddr8',
+  amprocrighttype => 'macaddr8', amprocnum => '2',
+  amproc => 'macaddr8_sortsupport' },
 { amprocfamily => 'btree/enum_ops', amproclefttype => 'anyenum',
   amprocrighttype => 'anyenum', amprocnum => '1', amproc => 'enum_cmp' },
 { amprocfamily => 'btree/tsvector_ops', amproclefttype => 'tsvector',
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 87335248a0..2fe573e418 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -3865,6 +3865,9 @@
 { oid => '4125', descr => 'set 7th bit in macaddr8',
   proname => 'macaddr8_set7bit', prorettype => 'macaddr8',
   proargtypes => 'macaddr8', prosrc => 'macaddr8_set7bit' },
+{ oid => '4142', descr => 'sort support',
+  proname => 'macaddr8_sortsupport', prorettype => 'void',
+  proargtypes => 'internal', prosrc => 'macaddr8_sortsupport' },
 
 # for inet type support
 { oid => '910', descr => 'I/O',
-- 
2.21.0

macaddr8_ddl.sqlapplication/sql; name=macaddr8_ddl.sqlDownload

Chapman Flack

chap@anastigmatix.net

over 6 years ago

In reply to: Melanie Plageman (#1)

Re: Sort support for macaddr8

On 6/3/19 3:23 PM, Melanie Plageman wrote:

Peter and I implemented this small (attached) patch to extend
abbreviated key compare sort to macaddr8 datatype (currently supported
for macaddr).

Am I going cross-eyed, or would the memset be serving more of a purpose
if it were in the SIZEOF_DATUM != 8 branch?

Regards,
-Chap

Peter Geoghegan

pg@bowt.ie

over 6 years ago

In reply to: Melanie Plageman (#1)

Re: Sort support for macaddr8

On Mon, Jun 3, 2019 at 1:17 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

I tried checking to see if there is a performance difference using the
attached DDL based on src/test/regress/sql/macaddr8.sql. I found
that the sort function is only exercised when creating an index (not,
for example, when doing some type of aggregation).

As you know, it's a bit weird that we're proposing adding sort support
with abbreviated keys for a type that is 8 bytes, since you'd expect
it to also be pass-by-value on most platforms, which largely defeats
the purpose of having abbreviated keys (though sort support could
still make sense, for the same reason it makes sense to have it for
int8). However, macaddr8 isn't actually pass-by-value, and it seems
too late to do anything about that now, so abbreviated keys actually
make sense.

With the patch applied to current master and using the DDL attached,
the timing for creating the index hovered around 20 ms for master and
15 ms for the patched version.

I would expect a sufficiently large sort to execute in about half the
time with abbreviation, based on previous experience. However, I think
that this patch can be justified in a relatively straightforward way.
It extends sort support for macaddr to macaddr8, since these two types
are almost identical in every other way. We should still validate the
performance out of an abundance of caution, but I would be very
surprised if there was much difference between the macaddr and
macaddr8 cases.

In short, users should not be surprised by the big gap in performance
between macaddr and macaddr8. It's worth being consistent there.

I think that that seems like an improvement. I was thinking of
registering this patch for the next commitfest. Is that okay?

Definitely, yes.

I was just wondering what the accepted way to test and share
performance numbers is for a small patch like this. Is sharing DDL
enough? Do I need to use pg_bench?

I've always used custom microbenchmarks for stuff like this.
Repeatedly executing a particular query and taking the median
execution time as representative seems to be the best approach.

--
Peter Geoghegan

Peter Geoghegan

pg@bowt.ie

over 6 years ago

In reply to: Chapman Flack (#2)

Re: Sort support for macaddr8

On Mon, Jun 3, 2019 at 2:03 PM Chapman Flack <chap@anastigmatix.net> wrote:

Am I going cross-eyed, or would the memset be serving more of a purpose
if it were in the SIZEOF_DATUM != 8 branch?

No, it wouldn't -- that's the correct place for it with the macaddr
type. However, it isn't actually necessary to memset() at the
equivalent point for macaddr8, since we cannot "run out of bytes from
the authoritative representation" that go in the Datum/abbreviated
key. I suppose that the memset() should simply be removed, since it is
superfluous here.

--
Peter Geoghegan

Chapman Flack

chap@anastigmatix.net

over 6 years ago

In reply to: Chapman Flack (#2)

Re: Sort support for macaddr8

On 6/3/19 5:03 PM, Chapman Flack wrote:

On 6/3/19 3:23 PM, Melanie Plageman wrote:

Peter and I implemented this small (attached) patch to extend
abbreviated key compare sort to macaddr8 datatype (currently supported
for macaddr).

Am I going cross-eyed, or would the memset be serving more of a purpose
if it were in the SIZEOF_DATUM != 8 branch?

It looks like a copy-pasto coming from mac.c, where the size of
the thing to be compared isn't itself 8 bytes.

With sizeof(macaddr) being 6, that original code may have had
these cases in mind:

- SIZEOF_DATUM is something smaller than 6 (likely 4). The whole key
doesn't fit, but that's ok, because abbreviated "equality" just means
to recheck with the authoritative routine.
- SIZEOF_DATUM is exactly 6. Probably not a thing.
- SIZEOF_DATUM is anything larger than 6 (likely 8). Needs the memset.
Also, in this case, abbreviated "equality" could be taken as true
equality, never needing the authoritative fallback.

For macaddr8, the cases morph into these:

- SIZEOF_DATUM is something smaller than 8 (likely 4). Ok; it's
just an abbreviation.
- SIZEOF_DATUM is exactly 8. Now an actual thing, even likely.
- SIZEOF_DATUM is larger than 8. Our flying cars run postgres, and
we need the memset to make sure they don't crash.

This leaves me with a couple of questions:

1. (This one seems like a bug.) In the little-endian case, if
SIZEOF_DATUM is smaller than the type, I'm not convinced by doing
the DatumBigEndianToNative() after the memcpy(). Seems like that's
too late to make sure the most-significant bytes got copied.

2. (This one seems like an API opportunity.) If it becomes common to
add abbreviation support for smallish types such that (as here,
when SIZEOF_DATUM >= 8), an abbreviated "equality" result is in fact
authoritative, would it be worthwhile to have some way for the sort
support routine to announce that fact to the caller? That could
spare the caller the effort of re-checking with the authoritative
routine. It could also (by making the equality case less costly)
end up changing the weight assigned to the cardinality estimate in
deciding whether to abbrev..

Regards,
-Chap

Chapman Flack

chap@anastigmatix.net

over 6 years ago

In reply to: Chapman Flack (#5)

Re: Sort support for macaddr8

On 6/3/19 5:59 PM, Chapman Flack wrote:

On 6/3/19 5:03 PM, Chapman Flack wrote:
1. (This one seems like a bug.) In the little-endian case, if
SIZEOF_DATUM is smaller than the type, I'm not convinced by doing
the DatumBigEndianToNative() after the memcpy(). Seems like that's
too late to make sure the most-significant bytes got copied.

Wait, I definitely was cross-eyed for that one. It's the abbreviated
copy whose endianness varies. Never mind.

Regards,
-Chap

Peter Geoghegan

pg@bowt.ie

over 6 years ago

In reply to: Chapman Flack (#5)

Re: Sort support for macaddr8

On Mon, Jun 3, 2019 at 2:59 PM Chapman Flack <chap@anastigmatix.net> wrote:

1. (This one seems like a bug.) In the little-endian case, if
SIZEOF_DATUM is smaller than the type, I'm not convinced by doing
the DatumBigEndianToNative() after the memcpy(). Seems like that's
too late to make sure the most-significant bytes got copied.

Uh, when else would you do it? Before the memcpy()?

2. (This one seems like an API opportunity.) If it becomes common to
add abbreviation support for smallish types such that (as here,
when SIZEOF_DATUM >= 8), an abbreviated "equality" result is in fact
authoritative, would it be worthwhile to have some way for the sort
support routine to announce that fact to the caller? That could
spare the caller the effort of re-checking with the authoritative
routine.

It's possible that that would make sense, but I don't think that this
patch needs to do that. There is at least one pre-existing cases that
does this -- macaddr.

--
Peter Geoghegan

Stephen Frost

sfrost@snowman.net

over 6 years ago

In reply to: Melanie Plageman (#1)

Re: Sort support for macaddr8

Greetings,

* Melanie Plageman (melanieplageman@gmail.com) wrote:

Peter and I implemented this small (attached) patch to extend
abbreviated key compare sort to macaddr8 datatype (currently supported
for macaddr).

Nice.

I think that that seems like an improvement. I was thinking of
registering this patch for the next commitfest. Is that okay?

Sure.

I was just wondering what the accepted way to test and share
performance numbers is for a small patch like this. Is sharing DDL
enough? Do I need to use pg_bench?

Detailed (enough... doesn't need to include timing of every indivudal
query, but something like the average timing across 5 runs or similar
would be good) results posted to this list, with enough information
about how to reproduce the tests, would be the way to go.

The idea is to let others also test and make sure that they come up with
similar results to yours, and if they don't, ideally have enough
information to narrow down what's happening / what's different.

Thanks!

Stephen

Melanie Plageman

melanieplageman@gmail.com

over 6 years ago

In reply to: Peter Geoghegan (#3)

1 attachment(s)

Re: Sort support for macaddr8

On Mon, Jun 3, 2019 at 2:39 PM Peter Geoghegan <pg@bowt.ie> wrote:

As you know, it's a bit weird that we're proposing adding sort support
with abbreviated keys for a type that is 8 bytes, since you'd expect
it to also be pass-by-value on most platforms, which largely defeats
the purpose of having abbreviated keys (though sort support could
still make sense, for the same reason it makes sense to have it for
int8). However, macaddr8 isn't actually pass-by-value, and it seems
too late to do anything about that now, so abbreviated keys actually
make sense.

so, if what you say is true and it is either not worth it or
potentially a breaking change to make macaddr8 pass-by-value and
adding abbreviated sort support has the potential to avoid "pointer
chasing" and guarantee equivalent relative performance for macaddr8
and macaddr, then that seems worth it.

With regard to macaddr8_abbrev_convert() and memset(), I attached a patch
with the memset() removed, since it is superfluous here.

macaddr8_cmp_internal() already existed before this patch and I noticed
that it explicitly returns int32 whereas the return type of
macaddr_cmp_internal() is just specified as an int. I was wondering why.

I also noticed that the prototype for macaddr8_cmp_internal() was not
at the top of the file with the other static function prototypes. I
added it there, but I wasn't sure if there was some reason that it was
like that to begin with.

--
Melanie Plageman

Attachments:

v2-0001-Implement-SortSupport-for-macaddr8-data-type.patchtext/x-patch; charset=US-ASCII; name=v2-0001-Implement-SortSupport-for-macaddr8-data-type.patchDownload

From 5ed80391a70bb4048d3cb4c66c8a3343318144e0 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Fri, 31 May 2019 15:48:55 -0400
Subject: [PATCH v2 1/1] Implement SortSupport for macaddr8 data type

Introduces a scheme to produce abbreviated keys for the macaddr8 type.

Co-authored-by: Peter Geoghegan <pg@bowt.ie>
---
 src/backend/utils/adt/mac8.c      | 202 ++++++++++++++++++++++++++++++
 src/include/catalog/pg_amproc.dat |   3 +
 src/include/catalog/pg_proc.dat   |   3 +
 3 files changed, 208 insertions(+)

diff --git a/src/backend/utils/adt/mac8.c b/src/backend/utils/adt/mac8.c
index 0b1fe4978e..0a869fe323 100644
--- a/src/backend/utils/adt/mac8.c
+++ b/src/backend/utils/adt/mac8.c
@@ -21,10 +21,14 @@
 
 #include "postgres.h"
 
+#include "lib/hyperloglog.h"
 #include "libpq/pqformat.h"
+#include "port/pg_bswap.h"
 #include "utils/builtins.h"
+#include "utils/guc.h"
 #include "utils/hashutils.h"
 #include "utils/inet.h"
+#include "utils/sortsupport.h"
 
 /*
  *	Utility macros used for sorting and comparing:
@@ -48,6 +52,20 @@ static const signed char hexlookup[128] = {
 	-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
 };
 
+/* sortsupport for macaddr8 */
+typedef struct
+{
+	int64		input_count;	/* number of non-null values seen */
+	bool		estimating;		/* true if estimating cardinality */
+
+	hyperLogLogState abbr_card; /* cardinality estimator */
+} macaddr8_sortsupport_state;
+
+static int32 macaddr8_cmp_internal(macaddr8 *a1, macaddr8 *a2);
+static int	macaddr8_fast_cmp(Datum x, Datum y, SortSupport ssup);
+static int	macaddr8_cmp_abbrev(Datum x, Datum y, SortSupport ssup);
+static bool macaddr8_abbrev_abort(int memtupcount, SortSupport ssup);
+static Datum macaddr8_abbrev_convert(Datum original, SortSupport ssup);
 /*
  * hex2_to_uchar - convert 2 hex digits to a byte (unsigned char)
  *
@@ -575,3 +593,187 @@ macaddr8tomacaddr(PG_FUNCTION_ARGS)
 
 	PG_RETURN_MACADDR_P(result);
 }
+
+/*
+ * SortSupport strategy function. Populates a SortSupport struct with the
+ * information necessary to use comparison by abbreviated keys.
+ */
+Datum
+macaddr8_sortsupport(PG_FUNCTION_ARGS)
+{
+	SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
+
+	ssup->comparator = macaddr8_fast_cmp;
+	ssup->ssup_extra = NULL;
+
+	if (ssup->abbreviate)
+	{
+		macaddr8_sortsupport_state *uss;
+		MemoryContext oldcontext;
+
+		oldcontext = MemoryContextSwitchTo(ssup->ssup_cxt);
+
+		uss = palloc(sizeof(macaddr8_sortsupport_state));
+		uss->input_count = 0;
+		uss->estimating = true;
+		initHyperLogLog(&uss->abbr_card, 10);
+
+		ssup->ssup_extra = uss;
+
+		ssup->comparator = macaddr8_cmp_abbrev;
+		ssup->abbrev_converter = macaddr8_abbrev_convert;
+		ssup->abbrev_abort = macaddr8_abbrev_abort;
+		ssup->abbrev_full_comparator = macaddr8_fast_cmp;
+
+		MemoryContextSwitchTo(oldcontext);
+	}
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * SortSupport "traditional" comparison function. Pulls two 8-byte MAC
+ * addresses from the heap and runs a standard comparison on them.
+ */
+static int
+macaddr8_fast_cmp(Datum x, Datum y, SortSupport ssup)
+{
+	macaddr8    *arg1 = DatumGetMacaddr8P(x);
+	macaddr8    *arg2 = DatumGetMacaddr8P(y);
+
+	return macaddr8_cmp_internal(arg1, arg2);
+}
+
+/*
+ * SortSupport abbreviated key comparison function. Compares two 8-byte MAC
+ * addresses quickly by treating them like integers, and without having to go
+ * to the heap.
+ */
+static int
+macaddr8_cmp_abbrev(Datum x, Datum y, SortSupport ssup)
+{
+	if (x > y)
+		return 1;
+	else if (x == y)
+		return 0;
+	else
+		return -1;
+}
+
+/*
+ * Callback for estimating effectiveness of abbreviated key optimization.
+ *
+ * We pay no attention to the cardinality of the non-abbreviated data, because
+ * there is no equality fast-path within authoritative macaddr8 comparator.
+ */
+static bool
+macaddr8_abbrev_abort(int memtupcount, SortSupport ssup)
+{
+	macaddr8_sortsupport_state *uss = ssup->ssup_extra;
+	double		abbr_card;
+
+	if (memtupcount < 10000 || uss->input_count < 10000 || !uss->estimating)
+		return false;
+
+	abbr_card = estimateHyperLogLog(&uss->abbr_card);
+
+	/*
+	 * If we have >100k distinct values, then even if we were sorting many
+	 * billion rows we'd likely still break even, and the penalty of undoing
+	 * that many rows of abbrevs would probably not be worth it. At this point
+	 * we stop counting because we know that we're now fully committed.
+	 */
+	if (abbr_card > 100000.0)
+	{
+#ifdef TRACE_SORT
+		if (trace_sort)
+			elog(LOG,
+				 "macaddr8_abbrev: estimation ends at cardinality %f"
+				 " after " INT64_FORMAT " values (%d rows)",
+				 abbr_card, uss->input_count, memtupcount);
+#endif
+		uss->estimating = false;
+		return false;
+	}
+
+	/*
+	 * Target minimum cardinality is 1 per ~2k of non-null inputs. 0.5 row
+	 * fudge factor allows us to abort earlier on genuinely pathological data
+	 * where we've had exactly one abbreviated value in the first 2k
+	 * (non-null) rows.
+	 */
+	if (abbr_card < uss->input_count / 2000.0 + 0.5)
+	{
+#ifdef TRACE_SORT
+		if (trace_sort)
+			elog(LOG,
+				 "macaddr8_abbrev: aborting abbreviation at cardinality %f"
+				 " below threshold %f after " INT64_FORMAT " values (%d rows)",
+				 abbr_card, uss->input_count / 2000.0 + 0.5, uss->input_count,
+				 memtupcount);
+#endif
+		return true;
+	}
+
+#ifdef TRACE_SORT
+	if (trace_sort)
+		elog(LOG,
+			 "macaddr8_abbrev: cardinality %f after " INT64_FORMAT
+			 " values (%d rows)", abbr_card, uss->input_count, memtupcount);
+#endif
+
+	return false;
+}
+
+/*
+ * SortSupport conversion routine. Converts original macaddr8 representation
+ * to abbreviated key representation.
+ *
+ * Packs the bytes of a 8-byte MAC address into a Datum and treats it as an
+ * unsigned integer for purposes of comparison. The integer is converted to
+ * native endianness to facilitate easy comparison.
+ */
+static Datum
+macaddr8_abbrev_convert(Datum original, SortSupport ssup)
+{
+	macaddr8_sortsupport_state *uss = ssup->ssup_extra;
+	macaddr8    *authoritative = DatumGetMacaddr8P(original);
+	Datum		res;
+
+#if SIZEOF_DATUM == 8
+	memcpy(&res, authoritative, sizeof(macaddr8));
+#else							/* SIZEOF_DATUM != 8 */
+	memcpy(&res, authoritative, SIZEOF_DATUM);
+#endif
+	uss->input_count += 1;
+
+	/*
+	 * Cardinality estimation. The estimate uses uint32, so on a 64-bit
+	 * architecture, XOR the two 32-bit halves together to produce slightly
+	 * more entropy.
+	 */
+	if (uss->estimating)
+	{
+		uint32		tmp;
+
+#if SIZEOF_DATUM == 8
+		tmp = (uint32) res ^ (uint32) ((uint64) res >> 32);
+#else							/* SIZEOF_DATUM != 8 */
+		tmp = (uint32) res;
+#endif
+
+		addHyperLogLog(&uss->abbr_card, DatumGetUInt32(hash_uint32(tmp)));
+	}
+
+	/*
+	 * Byteswap on little-endian machines.
+	 *
+	 * This is needed so that macaddr8_cmp_abbrev() (an unsigned integer 3-way
+	 * comparator) works correctly on all platforms. Without this, the
+	 * comparator would have to call memcmp() with a pair of pointers to the
+	 * first byte of each abbreviated key, which is slower.
+	 */
+	res = DatumBigEndianToNative(res);
+
+	return res;
+}
diff --git a/src/include/catalog/pg_amproc.dat b/src/include/catalog/pg_amproc.dat
index 020b7413cc..d77d98cbd1 100644
--- a/src/include/catalog/pg_amproc.dat
+++ b/src/include/catalog/pg_amproc.dat
@@ -214,6 +214,9 @@
   amprocrighttype => 'pg_lsn', amprocnum => '1', amproc => 'pg_lsn_cmp' },
 { amprocfamily => 'btree/macaddr8_ops', amproclefttype => 'macaddr8',
   amprocrighttype => 'macaddr8', amprocnum => '1', amproc => 'macaddr8_cmp' },
+{ amprocfamily => 'btree/macaddr8_ops', amproclefttype => 'macaddr8',
+  amprocrighttype => 'macaddr8', amprocnum => '2',
+  amproc => 'macaddr8_sortsupport' },
 { amprocfamily => 'btree/enum_ops', amproclefttype => 'anyenum',
   amprocrighttype => 'anyenum', amprocnum => '1', amproc => 'enum_cmp' },
 { amprocfamily => 'btree/tsvector_ops', amproclefttype => 'tsvector',
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 87335248a0..2fe573e418 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -3865,6 +3865,9 @@
 { oid => '4125', descr => 'set 7th bit in macaddr8',
   proname => 'macaddr8_set7bit', prorettype => 'macaddr8',
   proargtypes => 'macaddr8', prosrc => 'macaddr8_set7bit' },
+{ oid => '4142', descr => 'sort support',
+  proname => 'macaddr8_sortsupport', prorettype => 'void',
+  proargtypes => 'internal', prosrc => 'macaddr8_sortsupport' },
 
 # for inet type support
 { oid => '910', descr => 'I/O',
-- 
2.21.0

#10

Tomas Vondra

tomas.vondra@2ndquadrant.com

over 6 years ago

In reply to: Stephen Frost (#8)

Re: Sort support for macaddr8

On Tue, Jun 04, 2019 at 01:49:23PM -0400, Stephen Frost wrote:

Greetings,

* Melanie Plageman (melanieplageman@gmail.com) wrote:

Peter and I implemented this small (attached) patch to extend
abbreviated key compare sort to macaddr8 datatype (currently supported
for macaddr).

Nice.

I think that that seems like an improvement. I was thinking of
registering this patch for the next commitfest. Is that okay?

Sure.

I was just wondering what the accepted way to test and share
performance numbers is for a small patch like this. Is sharing DDL
enough? Do I need to use pg_bench?

Detailed (enough... doesn't need to include timing of every indivudal
query, but something like the average timing across 5 runs or similar
would be good) results posted to this list, with enough information
about how to reproduce the tests, would be the way to go.

The idea is to let others also test and make sure that they come up with
similar results to yours, and if they don't, ideally have enough
information to narrow down what's happening / what's different.

Yeah, there's no "approved way" to do performance tests the contributors
would have to follow. That applies both to tooling and how detailed the
data need/should be. Ultimately, the goal is to convince others (and
yourself) that the change is an improvement. Does a simple pgbench
script achieve that? Cool, use that. Do you need something more complex?
Sure, do a shell script or something like that.

As long as others can reasonably reproduce your tests, it's fine.

For me, the most critical part of benchmarking a change is deciding what
to test - which queries, data sets, what amounts of data, config, etc.

For example, the data set you used has ~12k rows. Does the behavior
change with 10x or 100x that? It probably does not make sense to go
above available RAM (the I/O costs are likely to make everything else
mostly irrelevant), but CPU caches may matter a lot. Different work_mem
(and maintenance_work_mem) values may be useful too.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#11

Alvaro Herrera

alvherre@2ndquadrant.com

over 6 years ago

In reply to: Peter Geoghegan (#3)

Re: Sort support for macaddr8

On 2019-Jun-03, Peter Geoghegan wrote:

As you know, it's a bit weird that we're proposing adding sort support
with abbreviated keys for a type that is 8 bytes, since you'd expect
it to also be pass-by-value on most platforms, which largely defeats
the purpose of having abbreviated keys (though sort support could
still make sense, for the same reason it makes sense to have it for
int8). However, macaddr8 isn't actually pass-by-value, and it seems
too late to do anything about that now, so abbreviated keys actually
make sense.

I'm not sure I understand why you say it's too late to change now.
Surely the on-disk representation doesn't actually change, so it is not
impossible to change? And you make it sound like doing that change is
worthwhile, performance-wise.

--
ï¿½lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#12

Andres Freund

andres@anarazel.de

over 6 years ago

In reply to: Alvaro Herrera (#11)

Re: Sort support for macaddr8

Hi,

On 2019-06-04 17:37:35 -0400, Alvaro Herrera wrote:

On 2019-Jun-03, Peter Geoghegan wrote:

As you know, it's a bit weird that we're proposing adding sort support
with abbreviated keys for a type that is 8 bytes, since you'd expect
it to also be pass-by-value on most platforms, which largely defeats
the purpose of having abbreviated keys (though sort support could
still make sense, for the same reason it makes sense to have it for
int8). However, macaddr8 isn't actually pass-by-value, and it seems
too late to do anything about that now, so abbreviated keys actually
make sense.

I'm not sure I understand why you say it's too late to change now.
Surely the on-disk representation doesn't actually change, so it is not
impossible to change? And you make it sound like doing that change is
worthwhile, performance-wise.

Yea, I don't immediately see a problem with doing that on a major
version boundary. Obviously that'd only be possible for sizeof(Datum) ==
8 == sizeof(macaddr8) platforms, but that's the vast majority these
days. And generally, I think it's just about *always* worth to go for a
pass-by-value for the cases where that doesn't imply space wastage.

I think it might be worthwhile to optimize things so that all typlen > 0
&& typlen <= sizeof(Datum) are allowed for byval datums.

SELECT typname, typlen FROM pg_type WHERE typlen > 0 AND typlen <= 8 AND NOT typbyval;
┌──────────┬────────┐
│ typname │ typlen │
├──────────┼────────┤
│ tid │ 6 │
│ macaddr │ 6 │
│ macaddr8 │ 8 │
└──────────┴────────┘
(3 rows)

Seems like adding byval support for sizes outside of 1/2/4/8 bytes would
be worthwhile for tid alone. Not sure whether there's extensions with
signifcant use that have fixed-width types <= 8 bytes.

Greetings,

Andres Freund

#13

Andres Freund

andres@anarazel.de

over 6 years ago

In reply to: Andres Freund (#12)

Re: Sort support for macaddr8

Hi,

On 2019-06-04 14:55:16 -0700, Andres Freund wrote:

On 2019-06-04 17:37:35 -0400, Alvaro Herrera wrote:
I think it might be worthwhile to optimize things so that all typlen > 0
&& typlen <= sizeof(Datum) are allowed for byval datums.

Maybe even just deprecate specifying byval at CREATE TYPE time, and
instead automatically infer it from the type length. We've had a number
of blunders around this, and I can't really see any reason for
specifying byval = false when we internally could treat it as a byval.

Greetings,

Andres Freund

#14

Peter Geoghegan

pg@bowt.ie

over 6 years ago

In reply to: Andres Freund (#12)

Re: Sort support for macaddr8

On Tue, Jun 4, 2019 at 2:55 PM Andres Freund <andres@anarazel.de> wrote:

Yea, I don't immediately see a problem with doing that on a major
version boundary. Obviously that'd only be possible for sizeof(Datum) ==
8 == sizeof(macaddr8) platforms, but that's the vast majority these
days. And generally, I think it's just about *always* worth to go for a
pass-by-value for the cases where that doesn't imply space wastage.

It would be faster to do it that way, I think. You would need a more
complicated comparator than a classic abbreviated comparator (i.e. a
3-way unsigned int comparator) that way, but it would very probably be
faster on balance.

I'm glad to hear that it isn't *obviously* a problem from a
compatibility perspective -- I really wasn't sure about that, since
retrofitting a type to be pass-by-value like this is something that
may never have been attempted before now (at least not since we
started to care about pg_upgrade).

I think it might be worthwhile to optimize things so that all typlen > 0
&& typlen <= sizeof(Datum) are allowed for byval datums.

SELECT typname, typlen FROM pg_type WHERE typlen > 0 AND typlen <= 8 AND NOT typbyval;
┌──────────┬────────┐
│ typname │ typlen │
├──────────┼────────┤
│ tid │ 6 │
│ macaddr │ 6 │
│ macaddr8 │ 8 │
└──────────┴────────┘
(3 rows)

This is half the reason why I ended up implementing itemptr_encode()
to accelerate the TID sort used by CREATE INDEX CONCURRENTLY some
years back -- TID is 6 bytes wide, which doesn't have the necessary
macro support within postgres.h. There is no reason why that couldn't
be added for the benefit of both TID and macaddr types, though it
probably wouldn't be worth it. And, as long as we're not going to
those lengths, there may be some value in keeping the macaddr8 code in
line with macaddr code -- the two types are currently almost the same
(the glaring difference is the lack of macaddr8 sort support).

We'll need to draw the line somewhere, and that is likely to be a bit
arbitrary. This was what I meant by "weird".

--
Peter Geoghegan

#15

Andres Freund

andres@anarazel.de

over 6 years ago

In reply to: Peter Geoghegan (#14)

Re: Sort support for macaddr8

Hi,

On 2019-06-04 15:10:07 -0700, Peter Geoghegan wrote:

On Tue, Jun 4, 2019 at 2:55 PM Andres Freund <andres@anarazel.de> wrote:

Yea, I don't immediately see a problem with doing that on a major
version boundary. Obviously that'd only be possible for sizeof(Datum) ==
8 == sizeof(macaddr8) platforms, but that's the vast majority these
days. And generally, I think it's just about *always* worth to go for a
pass-by-value for the cases where that doesn't imply space wastage.

It would be faster to do it that way, I think. You would need a more
complicated comparator than a classic abbreviated comparator (i.e. a
3-way unsigned int comparator) that way, but it would very probably be
faster on balance.

I'd be surprised if it weren't.

I'm glad to hear that it isn't *obviously* a problem from a
compatibility perspective -- I really wasn't sure about that, since
retrofitting a type to be pass-by-value like this is something that
may never have been attempted before now (at least not since we
started to care about pg_upgrade).

Obviously we have to test it, but I don't really see any compat
problems. Both have the same size on disk, after all. We couldn't make
such a change in a minor version, as DatumGetMacaddr*,
DatumGetItemPointer obviously need to change, but it ought to otherwise
be transparent. It would, I think, be different if we still supported
v0 calling conventions, but we don't...

This is half the reason why I ended up implementing itemptr_encode()
to accelerate the TID sort used by CREATE INDEX CONCURRENTLY some
years back -- TID is 6 bytes wide, which doesn't have the necessary
macro support within postgres.h. There is no reason why that couldn't
be added for the benefit of both TID and macaddr types, though it
probably wouldn't be worth it.

I think we should definitely do that. It seems not unlikely that other
people want to write new fixed width types, and we shouldn't have them
deal with all this complexity unnecessarily.

Greetings,

Andres Freund

#16

Peter Geoghegan

pg@bowt.ie

over 6 years ago

In reply to: Andres Freund (#15)

Re: Sort support for macaddr8

On Tue, Jun 4, 2019 at 3:23 PM Andres Freund <andres@anarazel.de> wrote:

This is half the reason why I ended up implementing itemptr_encode()
to accelerate the TID sort used by CREATE INDEX CONCURRENTLY some
years back -- TID is 6 bytes wide, which doesn't have the necessary
macro support within postgres.h. There is no reason why that couldn't
be added for the benefit of both TID and macaddr types, though it
probably wouldn't be worth it.

I think we should definitely do that. It seems not unlikely that other
people want to write new fixed width types, and we shouldn't have them
deal with all this complexity unnecessarily.

On second thought, maybe there is something to be said for being
exhaustive here.

It seems like there is a preference for making macaddr8 pass-by-value
instead of adding abbreviated keys support to macaddr8, and possibly
doing the same with the original macaddr type.

Do you think that you'll be able to work on the project with this
expanded scope, Melanie?

--
Peter Geoghegan

#17

Melanie Plageman

melanieplageman@gmail.com

over 6 years ago

In reply to: Peter Geoghegan (#16)

Re: Sort support for macaddr8

On Tue, Jun 4, 2019 at 3:50 PM Peter Geoghegan <pg@bowt.ie> wrote:

On Tue, Jun 4, 2019 at 3:23 PM Andres Freund <andres@anarazel.de> wrote:

This is half the reason why I ended up implementing itemptr_encode()
to accelerate the TID sort used by CREATE INDEX CONCURRENTLY some
years back -- TID is 6 bytes wide, which doesn't have the necessary
macro support within postgres.h. There is no reason why that couldn't
be added for the benefit of both TID and macaddr types, though it
probably wouldn't be worth it.

I think we should definitely do that. It seems not unlikely that other
people want to write new fixed width types, and we shouldn't have them
deal with all this complexity unnecessarily.

On second thought, maybe there is something to be said for being
exhaustive here.

It seems like there is a preference for making macaddr8 pass-by-value
instead of adding abbreviated keys support to macaddr8, and possibly
doing the same with the original macaddr type.

Do you think that you'll be able to work on the project with this
expanded scope, Melanie?

I can take on making macaddr8 pass-by-value
I tinkered a bit last night and got in/out mostly working (I think).
I'm not sure about macaddr, TID, and user-defined types.

--
Melanie Plageman

#18

Alvaro Herrera

alvherre@2ndquadrant.com

over 6 years ago

In reply to: Melanie Plageman (#17)

Re: Sort support for macaddr8

On 2019-Jun-05, Melanie Plageman wrote:

I can take on making macaddr8 pass-by-value
I tinkered a bit last night and got in/out mostly working (I think).
I'm not sure about macaddr, TID, and user-defined types.

Yeah, let's see what macaddr8 looks like, and we can move from there --
I suppose adapting for macaddr would not be terribly different, but we
don't have to do both in a single commit. I don't expect that TID would
necessarily be similar since we have lots of bespoke code for that in
lots of places; it might not affect anything (it should not!) but then
it might. No reason not to move forward incrementally.

--
ï¿½lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#19

Andres Freund

andres@anarazel.de

over 6 years ago

In reply to: Melanie Plageman (#17)

Re: Sort support for macaddr8

Hi,

On 2019-06-05 09:18:34 -0700, Melanie Plageman wrote:

On Tue, Jun 4, 2019 at 3:50 PM Peter Geoghegan <pg@bowt.ie> wrote:

On Tue, Jun 4, 2019 at 3:23 PM Andres Freund <andres@anarazel.de> wrote:

This is half the reason why I ended up implementing itemptr_encode()
to accelerate the TID sort used by CREATE INDEX CONCURRENTLY some
years back -- TID is 6 bytes wide, which doesn't have the necessary
macro support within postgres.h. There is no reason why that couldn't
be added for the benefit of both TID and macaddr types, though it
probably wouldn't be worth it.

I think we should definitely do that. It seems not unlikely that other
people want to write new fixed width types, and we shouldn't have them
deal with all this complexity unnecessarily.

On second thought, maybe there is something to be said for being
exhaustive here.

It seems like there is a preference for making macaddr8 pass-by-value
instead of adding abbreviated keys support to macaddr8, and possibly
doing the same with the original macaddr type.

Do you think that you'll be able to work on the project with this
expanded scope, Melanie?

I can take on making macaddr8 pass-by-value
I tinkered a bit last night and got in/out mostly working (I think).
I'm not sure about macaddr, TID, and user-defined types.

I'd much rather see this tackled in a general way than fiddling with
individual datatypes. I think we should:

1) make fetch_att(), store_att_byval() etc support datums of any length
between 1 and <= sizeof(Datum). Probably also convert them to inline
functions. There's a few more functions to be adjusted, but not many,
I think.

2) Remove ability to pass PASSEDBYVALUE to CREATE TYPE, but instead
compute whether attyval is possible, solely based on INTERNALLENGTH
(when INTERNALLENGTH > 0 obviously).

3) Fix the fallout, by fixing a few of the Datum<->type conversion
functions affected by this change. That'll require a bit of work, but
not too much. We should write those conversion routines in a way
that'll keep them working for the scenarios where the type is
actually passable by value, and not (required for > 4 byte datums).

Greetings,

Andres Freund

#20

Alvaro Herrera

alvherre@2ndquadrant.com

over 6 years ago

In reply to: Andres Freund (#19)

Re: Sort support for macaddr8

On 2019-Jun-05, Andres Freund wrote:

I'd much rather see this tackled in a general way than fiddling with
individual datatypes. I think we should:

1) make fetch_att(), store_att_byval() etc support datums of any length
between 1 and <= sizeof(Datum). Probably also convert them to inline
functions. There's a few more functions to be adjusted, but not many,
I think.

Does this mean that datatypes that are >4 and <=8 bytes need to handle
both cases? Is it possible for them to detect the current environment?

--
ï¿½lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#21

Andres Freund

andres@anarazel.de

over 6 years ago

In reply to: Alvaro Herrera (#20)

Re: Sort support for macaddr8

Hi,

On June 5, 2019 12:14:42 PM PDT, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

On 2019-Jun-05, Andres Freund wrote:

I'd much rather see this tackled in a general way than fiddling with
individual datatypes. I think we should:

1) make fetch_att(), store_att_byval() etc support datums of any

length

between 1 and <= sizeof(Datum). Probably also convert them to

inline

functions. There's a few more functions to be adjusted, but not

many,

I think.

Does this mean that datatypes that are >4 and <=8 bytes need to handle
both cases? Is it possible for them to detect the current environment?

Well, the conversion macros need to know. You can look at float8 for an example of the difference - it's pretty centralized. We should provide a few helper macros to abstract that away.

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

#22

Tom Lane

tgl@sss.pgh.pa.us

over 6 years ago

In reply to: Andres Freund (#21)

Re: Sort support for macaddr8

Andres Freund <andres@anarazel.de> writes:

On June 5, 2019 12:14:42 PM PDT, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

Does this mean that datatypes that are >4 and <=8 bytes need to handle
both cases? Is it possible for them to detect the current environment?

Well, the conversion macros need to know. You can look at float8 for an example of the difference - it's pretty centralized. We should provide a few helper macros to abstract that away.

FWIW, I disagree with Andres on this being a reasonable way to proceed.

The fact that we support both pass-by-value and pass-by-ref for float8
and int8 is because those are important data types that are worth taking
extra pains to optimize. It's a very long way from there to insisting
that every datatype between 5 and 8 bytes long must get the same
treatment, and even further to Andres' apparent position that we should
force third-party types to do it whether they care about
micro-optimization or not.

And I *entirely* fail to get the point of adding such support for
datatypes of 5 or 7 bytes. No such types exist, or are on the horizon
AFAIK.

Lastly, I don't think adding additional allowed widths of pass-by-value
types would be cost-free, because it would add cycles to the inner loops
of the tuple forming and deforming functions. (No, I don't believe that
JIT makes that an ignorable concern.)

I'm not really sure that either macaddr or macaddr8 are used widely
enough to justify expending optimization effort on them. But if they
are, let's just do that, not move the goal posts into the next stadium.

regards, tom lane

#23

Andres Freund

andres@anarazel.de

over 6 years ago

In reply to: Tom Lane (#22)

Re: Sort support for macaddr8

Hi,

On 2019-06-06 13:39:50 -0400, Tom Lane wrote:

Lastly, I don't think adding additional allowed widths of pass-by-value
types would be cost-free, because it would add cycles to the inner loops
of the tuple forming and deforming functions.

I'm not sure I quite buy that.

I think that we have branches over a fixed number of lengths is largely
unnecessary. att_addlength_pointer() doesn't care - it just uses the
length. And I think we should just consider doing the same for
fetch_att(). E.g. by using memcpy().

That'd also have the advantage that we'd not be *forced* to rely
alignment of byval types. The only reason we actually need that is the
heaptuple-to-struct mapping for catalogs. Outside of that we don't have
pointers to individual byval tuples, and waste a fair bit of padding due
to that.

Additionally we'd get rid of needing separate versions for SIZEOF_DATUM
!= 8/not.

(No, I don't believe that JIT makes that an ignorable concern.)

Obviously not.

Greetings,

Andres Freund