Questions about support function and abbreviate

Started by Han Wang over 4 years ago, 4 messages
#1 Han Wang
hanwgeek@gmail.com

Hi all,

I am trying to implement a sort support function for geometry data types in
PostGIS with the new feature `SortSupport`. However, I have a question
about this.

I think it is hard to apply a sort support function to a complex data
type without the `abbrev_converter` to simplify the data structure into a
single `Datum`. However, I do not know how the system determines when to
apply the converter.

I appreciate any answers or suggestions. I am looking forward to hearing
from you.

Best regards,
Han

#2 Darafei "Komяpa" Praliaskouski
me@komzpa.net
In reply to: Han Wang (#1)
Re: Questions about support function and abbreviate

Hello,

The abbrev_converter is applied whenever it is defined. The values are
first sorted by their shortened versions using the abbreviated comparator,
and if there is a tie the system asks the real full comparator to resolve
it.

This article seems to be rather comprehensive:
https://brandur.org/sortsupport
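
To make the two-stage comparison concrete, here is a minimal,
self-contained C sketch. It does not use PostgreSQL's headers:
`SketchTuple` and the `sketch_*` functions are hypothetical names, and the
8-byte string prefix is only an assumed abbreviation scheme chosen for
illustration. The abbreviated keys are compared first, and the full
comparator is consulted only on a tie.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a sort tuple: a fixed-size abbreviated key
 * plus a pointer to the full value. This is NOT PostgreSQL's SortTuple. */
typedef struct {
    uint64_t    abbrev;   /* first 8 bytes of the string, big-endian */
    const char *full;     /* full NUL-terminated value */
} SketchTuple;

/* Plays the role of abbrev_converter: pack the first 8 bytes so that
 * unsigned integer order matches byte-wise string order. */
static uint64_t sketch_abbreviate(const char *s)
{
    uint64_t key = 0;
    size_t   len = strlen(s);

    for (size_t i = 0; i < 8; i++) {
        unsigned char c = (i < len) ? (unsigned char) s[i] : 0;
        key = (key << 8) | c;
    }
    return key;
}

static SketchTuple sketch_make_tuple(const char *s)
{
    SketchTuple t = { sketch_abbreviate(s), s };
    return t;
}

/* Plays the role of the comparator over abbreviated keys: cheap. */
static int sketch_cmp_abbrev(uint64_t a, uint64_t b)
{
    return (a > b) - (a < b);
}

/* Plays the role of the full comparator: looks at the real values. */
static int sketch_cmp_full(const char *a, const char *b)
{
    int r = strcmp(a, b);
    return (r > 0) - (r < 0);
}

/* Two-stage comparison: abbreviated keys first, full values on a tie. */
static int sketch_compare(const SketchTuple *a, const SketchTuple *b)
{
    assert(a != NULL && b != NULL);

    int r = sketch_cmp_abbrev(a->abbrev, b->abbrev);
    if (r != 0)
        return r;
    return sketch_cmp_full(a->full, b->full);
}
```

The scheme only works if the abbreviated order never contradicts the full
order. In this sketch the big-endian packing keeps the integer order
consistent with `strcmp`, so "abcdefghX" and "abcdefghY" tie on the
abbreviated key and are separated by the full comparator.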

On Sat, Jun 12, 2021 at 9:51 AM Han Wang <hanwgeek@gmail.com> wrote:

--
Darafei "Komяpa" Praliaskouski
OSM BY Team - http://openstreetmap.by/

#3 Han Wang
hanwgeek@gmail.com
In reply to: Darafei "Komяpa" Praliaskouski (#2)
Re: Questions about support function and abbreviate

Hi Darafei,

Thanks for your reply.

However, I still don't get the full picture of this. Let me make my
question clearer.

First of all, `gistproc.c`
<https://github.com/postgres/postgres/blob/master/src/backend/access/gist/gistproc.c#L1761>
in Postgres shows that the `abbreviate` attribute should be set before the
`abbrev_converter` is defined. So I would like to know where to define a
`SortSupport` structure with `abbreviate` set to `true`.

Secondly, in the support functions of the built-in data type `Point`, the
`abbrev_full_comparator` just computes the z-order hash of each point
first, as the `abbrev_converter` does, and then compares the hash values.
So I don't see the difference between the `full_comparator` and the
`comparator` applied after `abbrev_converter`.

Best regards,
Han

On Sat, Jun 12, 2021 at 2:55 PM Darafei "Komяpa" Praliaskouski <
me@komzpa.net> wrote:

#4 Giuseppe Broccolo
g.broccolo.7@gmail.com
In reply to: Han Wang (#3)
Re: Questions about support function and abbreviate

Hi Han,

Darafei already provided a good answer to your question; I will just add a
few things in the hope of making them clearer for your use case.

The SortSupport implementation in PostgreSQL allows comparisons to be made
at the binary level in a dedicated region of memory, where data can be
accessed quickly through "sort tuples", which are references to the actual
data in the heap. Those references have room to hold data up to the size
of a native pointer, which is 8 bytes on 64-bit systems. Although that is
enough space for standard data types like integers or floats, it is not
enough for longer data types or for varlena data like geometries.

In this last case, we need to pass the sort tuples an abbreviated version
of the key, which should include its most representative part. This is the
purpose of the abbreviation attributes, which need to be provided in order
to create the abbreviated keys.
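
As a rough illustration of the size constraint described above, here is a
small C sketch. `SketchSortTuple` and `sketch_fits_in_slot` are
hypothetical names, not PostgreSQL's definitions; the only assumption
taken from the text is that each sort tuple carries one pointer-sized slot
for the key.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sort tuple: a reference to the full data plus ONE
 * pointer-sized slot that can hold either a small key by value or an
 * abbreviated version of a larger key. */
typedef struct {
    const void *tuple;   /* reference to the actual data */
    uintptr_t   key1;    /* by-value key, or abbreviated key */
} SketchSortTuple;

/* A fixed-length key can be stored directly in the slot only if it is
 * no larger than the slot itself. */
static bool sketch_fits_in_slot(size_t type_len)
{
    assert(type_len > 0);
    return type_len <= sizeof(((SketchSortTuple *) 0)->key1);
}
```

With this check, a 4-byte integer or float fits in the slot as it is,
while a 64-byte geometry-like value does not and would need an abbreviated
key instead.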

To answer your question more specifically, the four abbreviation
attributes represent:

* comparator --> the method used to compare abbreviated keys
* abbrev_converter --> the method that creates the abbreviations (NOTE: in
  src/backend/access/gist/gistproc.c it just considers the first 32 bits
  of the hash of a geometry)
* abbrev_abort --> the method that checks whether the abbreviation should
  be done at all, even in cases where the length is greater than the size
  of the native pointer (NOTE: it is not implemented in
  src/backend/access/gist/gistproc.c, which means that abbreviation is
  always considered worthwhile)
* abbrev_full_comparator --> the method used for comparisons when falling
  back to non-abbreviated keys (NOTE: this is the same function that is
  assigned to comparator when the abbreviate flag is set to false)
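
How the four attributes above fit together can be sketched with a
miniature, self-contained imitation of the structure. Only the four
attribute names and the `abbreviate` flag come from this discussion;
`SketchSortSupport`, `SketchPoint`, `SketchDatum` and the `sketch_*`
functions are hypothetical, and the integer z-order code is a
simplification. The sketch assumes, as far as I understand it, that the
`abbreviate` flag is set by the calling sort code before the support
function runs, and that the support function only reacts to it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint16_t x, y; } SketchPoint;   /* hypothetical 2-D key */
typedef uintptr_t SketchDatum;                   /* pointer-sized slot */

typedef struct {
    bool abbreviate;   /* assumed to be set by the caller, not by us */
    int  (*comparator)(SketchDatum a, SketchDatum b);
    SketchDatum (*abbrev_converter)(SketchDatum original);
    bool (*abbrev_abort)(int memtupcount);
    int  (*abbrev_full_comparator)(SketchDatum a, SketchDatum b);
} SketchSortSupport;

/* Interleave the bits of x and y into a z-order (Morton) code. */
static uint32_t sketch_zorder(const SketchPoint *p)
{
    uint32_t z = 0;

    for (int i = 0; i < 16; i++) {
        z |= (uint32_t) ((p->x >> i) & 1u) << (2 * i);
        z |= (uint32_t) ((p->y >> i) & 1u) << (2 * i + 1);
    }
    return z;
}

/* Full comparator: receives ORIGINAL datums (pointers to points) and
 * has to recompute both z-order codes on every call. */
static int sketch_cmp_full(SketchDatum a, SketchDatum b)
{
    uint32_t za = sketch_zorder((const SketchPoint *) a);
    uint32_t zb = sketch_zorder((const SketchPoint *) b);
    return (za > zb) - (za < zb);
}

/* Converter: runs once per value and stores the code in the slot. */
static SketchDatum sketch_abbrev_convert(SketchDatum original)
{
    return (SketchDatum) sketch_zorder((const SketchPoint *) original);
}

/* Abbreviated comparator: the datums already ARE the z-order codes. */
static int sketch_cmp_abbrev(SketchDatum a, SketchDatum b)
{
    return (a > b) - (a < b);
}

/* This sketch never gives up on abbreviation. */
static bool sketch_abbrev_abort(int memtupcount)
{
    (void) memtupcount;
    return false;
}

/* Support function: fills in the callbacks depending on the flag. */
static void sketch_point_sortsupport(SketchSortSupport *ssup)
{
    assert(ssup != NULL);

    if (ssup->abbreviate) {
        ssup->comparator = sketch_cmp_abbrev;
        ssup->abbrev_converter = sketch_abbrev_convert;
        ssup->abbrev_abort = sketch_abbrev_abort;
        ssup->abbrev_full_comparator = sketch_cmp_full;
    } else {
        ssup->comparator = sketch_cmp_full;
        ssup->abbrev_converter = NULL;
        ssup->abbrev_abort = NULL;
        ssup->abbrev_full_comparator = NULL;
    }
}
```

In this sketch both comparators produce the same ordering. The difference
lies in their inputs: the abbreviated comparator receives values that the
converter has already computed once, while the full comparator receives
the original values and repeats the conversion on every comparison.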

Hope it helps,
Giuseppe.

On Sat, Jun 12, 2021 at 08:43 Han Wang <hanwgeek@gmail.com> wrote:
