C based plugins, clocks, locks, and configuration variables

Started by Clifford Hammerschmidt about 9 years ago, 10 messages
#1 Clifford Hammerschmidt
tanglebones@gmail.com

Hi all,

Apologies in advance if this isn't the right place to be posting this.

I've started work on a plugin in C (https://github.com/tanglebones/pg_tuid)
for generating generally monotonically ascending UUIDs (aka TUIDs), and
after googling around I couldn't find any guidance on a few things. (It's
hard to google for anything in the postgres C api as most results coming
back are for using postgres itself, not developing plugins for postgres.)

I'm looking for the idiomatic (and portable) way of:

1) getting microseconds (or nanoseconds) from UTC epoch in a plugin
2) getting an exclusive lock for a user plugin to serialize access to its
shared state (I'm assuming that plugins must be reentrant)
3) creating a configuration variable for a plugin and accessing its values
in the plugin. (e.g. `set plugin.configuration_variable=1` or somesuch)

Thanks,

--
Clifford Hammerschmidt, P.Eng.

#2 Craig Ringer
craig.ringer@2ndquadrant.com
In reply to: Clifford Hammerschmidt (#1)
Re: C based plugins, clocks, locks, and configuration variables

On 4 Nov. 2016 06:05, "Clifford Hammerschmidt" <tanglebones@gmail.com>
wrote:

> Hi all,
>
> Apologies in advance if this isn't the right place to be posting this.
>
> I've started work on a plugin in C (https://github.com/tanglebones/pg_tuid)
> for generating generally monotonically ascending UUIDs (aka TUIDs), and
> after googling around I couldn't find any guidance on a few things. (It's
> hard to google for anything in the postgres C api as most results coming
> back are for using postgres itself, not developing plugins for postgres.)
>
> I'm looking for the idiomatic (and portable) way of:
>
> 1) getting microseconds (or nanoseconds) from UTC epoch in a plugin

GetCurrentIntegerTimestamp()

> 2) getting an exclusive lock for a user plugin to serialize access to its
> shared state (I'm assuming that plugins must be reentrant)

Allocate an LWLock in your shared memory segment and use it to arbitrate
access. Multiple examples in contrib. LWLock allocation info in developer
docs.

> 3) creating a configuration variable for a plugin and accessing its
> values in the plugin. (e.g. `set plugin.configuration_variable=1` or
> somesuch)

DefineCustomIntVariable() etc. (from memory, I'm on my phone). See guc.h.
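
On point 1, one detail is easy to miss: PostgreSQL timestamps count microseconds from 2000-01-01 00:00:00 UTC, not from the Unix epoch, so a value like the one GetCurrentIntegerTimestamp() returns needs a fixed offset before it can be read as microseconds since 1970. The sketch below is standalone C rather than extension code. `pg_usecs_to_unix_usecs` is a made-up helper name, and the constants are redefined locally under the same names PostgreSQL uses for its epoch Julian dates.

```c
#include <assert.h>
#include <stdint.h>

#define UNIX_EPOCH_JDATE     2440588               /* Julian day of 1970-01-01 */
#define POSTGRES_EPOCH_JDATE 2451545               /* Julian day of 2000-01-01 */
#define USECS_PER_DAY        INT64_C(86400000000)

/* Hypothetical helper: shift a PostgreSQL-epoch microsecond count to the Unix epoch. */
int64_t pg_usecs_to_unix_usecs(int64_t pg_usecs)
{
    /* 10957 days between the two epochs, expressed in microseconds. */
    const int64_t offset = (POSTGRES_EPOCH_JDATE - UNIX_EPOCH_JDATE) * USECS_PER_DAY;

    assert(pg_usecs <= INT64_MAX - offset);  /* guard against signed overflow */
    return pg_usecs + offset;
}
```

Inside an extension the input would come from the server's own clock function; only the offset arithmetic is shown here.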

#3 Craig Ringer
craig.ringer@2ndquadrant.com
In reply to: Clifford Hammerschmidt (#1)
Re: C based plugins, clocks, locks, and configuration variables

On 8 November 2016 at 07:41, Clifford Hammerschmidt
<tanglebones@gmail.com> wrote:

> Hi Craig,
>
> Thanks for the pointers; I made a stab at it in:
> https://github.com/tanglebones/pg_tuid
>
> I've no idea if the shmem and lwlock code is correct, or how to test it. It
> seems to work (requires loading via the shared_preload_libraries) on osx in
> that the tuid_ calls work and produce the expected results on my lightly
> loaded development box (not really a good test of shmem or locks in that I
> doubt either are being exercised).

Since that's a public github I took the liberty of replying to the
list. Please reply to the list, not just to me.

Good on you for giving it a go.

For concurrency testing, the isolation tester tool in
src/test/isolation is quite handy. Custom pgbench scripts can also be
useful, though they're really only useful if you can detect an
anomalous situation and Assert to crash the backend in an
--enable-cassert build when there's a problem.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4 Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Craig Ringer (#2)
Re: C based plugins, clocks, locks, and configuration variables

On 11/3/16 7:14 PM, Craig Ringer wrote:

>> 1) getting microseconds (or nanoseconds) from UTC epoch in a plugin
>
> GetCurrentIntegerTimestamp()

Since you're serializing generation anyway you might want to just forgo
the timestamp completely. It's not like the values you're generating are
globally unique anymore, or hard to guess.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532) mobile: 512-569-9461

#5 Clifford Hammerschmidt
tanglebones@gmail.com
In reply to: Jim Nasby (#4)
Re: C based plugins, clocks, locks, and configuration variables

Hi Jim,

The values are still globally unique. The odds of a collision are very very
low. Two instances with the same node_id generating on the same millisecond
(in their local view of time) have a 1:2^34 chance of collision. node_id
only repeats every 256 machines in a cluster (assuming you're configured
correctly), and the probability of the same millisecond being used on both
machines is also low (depends on generation rate and machine speed). The
only real concern is with clock replays (i.e. something sets the clock
backwards, like an admin or a badly implemented time sync system), which
does happen in rare instances and is why seq is there to extend that space
out and reduce the chance of a collision in that millisecond. (time replays
are a real problem with id systems like snowflake.)
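
For a rough feel of these odds: if k IDs land in the same (time, node_id, seq) slot and differ only in their random bits, the chance that any two of them match is at most k(k-1)/2 divided by the number of possible random values. The sketch below computes that upper bound. The function name is illustrative, and the random width is a parameter rather than a statement about the actual pg_tuid layout.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Upper bound (union bound over all pairs) on the probability that at
 * least two of k uniformly random values of `random_bits` bits collide.
 * Hypothetical helper, not part of pg_tuid.
 */
double collision_upper_bound(uint64_t k, unsigned random_bits)
{
    double pairs, space, bound;

    assert(random_bits >= 1 && random_bits <= 62);
    if (k < 2)
        return 0.0;  /* fewer than two values cannot collide */

    pairs = (double)k * (double)(k - 1) / 2.0;
    space = (double)(UINT64_C(1) << random_bits);
    bound = pairs / space;
    return bound > 1.0 ? 1.0 : bound;  /* a probability never exceeds 1 */
}
```

With k = 2 the bound is exactly one over the size of the random space, which matches the "1 in 2^n" way of stating it above.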

Also, the point of the timestamp isn't uniqueness, it's the generally
monotonically ascending aspect I want. This causes inserts to append to the
index (much faster than random inserts in large indexes because of cache
locality), and causes data generated around the same time to occupy near
nodes in the index (again, cache benefits, as related data tends to be
generated bunched up in time).

Thanks,
-Cliff.

--
Clifford Hammerschmidt, P.Eng.

#6 Clifford Hammerschmidt
tanglebones@gmail.com
In reply to: Clifford Hammerschmidt (#5)
Re: C based plugins, clocks, locks, and configuration variables

Looking closer at the bit math, I screwed it up.... it should be 64 bits
time, 6 bit uuid version, 8 node, 8 seq, and the rest random ... which is
42 bits of random. I'll find the code in a bit.
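
The field widths above add up to 128 bits (64 + 6 + 8 + 8 + 42). As a sketch of why such a layout sorts by time, the code below packs the fields most-significant first into 16 bytes, so a plain byte-wise comparison orders IDs by their time field. The field order, the `Tuid` type, and the function names are assumptions for illustration. This is not the pg_tuid code, and putting the 6 version bits directly after the time is not the RFC 4122 bit placement.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TUID_RANDOM_BITS 42
#define TUID_RANDOM_MASK ((UINT64_C(1) << TUID_RANDOM_BITS) - 1)

typedef struct { uint8_t bytes[16]; } Tuid;

/* Store v most-significant byte first so byte order matches numeric order. */
static void put_be64(uint8_t *dst, uint64_t v)
{
    for (int i = 7; i >= 0; i--) {
        dst[i] = (uint8_t)(v & 0xFF);
        v >>= 8;
    }
}

/* Hypothetical layout: time(64) | tag(6) | node(8) | seq(8) | random(42). */
Tuid tuid_pack(uint64_t time_ticks, uint8_t tag6, uint8_t node,
               uint8_t seq, uint64_t rnd)
{
    Tuid id;
    uint64_t low;

    assert(tag6 < 64);  /* the tag must fit in 6 bits */
    low = ((uint64_t)tag6 << 58)
        | ((uint64_t)node << 50)
        | ((uint64_t)seq << 42)
        | (rnd & TUID_RANDOM_MASK);  /* extra random bits are discarded */
    put_be64(id.bytes, time_ticks);
    put_be64(id.bytes + 8, low);
    return id;
}

/* Byte-wise comparison; an earlier time always compares lower. */
int tuid_cmp(const Tuid *a, const Tuid *b)
{
    return memcmp(a->bytes, b->bytes, sizeof a->bytes);
}
```

Because the time occupies the leading bytes, an ID with a later time compares greater regardless of the node, seq, or random fields.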

--
Clifford Hammerschmidt, P.Eng.

#7 Craig Ringer
craig.ringer@2ndquadrant.com
In reply to: Clifford Hammerschmidt (#6)
Re: C based plugins, clocks, locks, and configuration variables

On 9 Nov. 2016 02:48, "Clifford Hammerschmidt" <tanglebones@gmail.com>
wrote:

> Looking closer at the bit math, I screwed it up.... it should be 64 bits
> time, 6 bit uuid version, 8 node, 8 seq, and the rest random ... which is
> 42 bits of random. I'll find the code in a bit.

Huh, so that's what you are doing.

I just added the same thing to the 9.6 BDR development tree last week,
though using 64-bit values, based on a draft Petr wrote. Feel free to take
a look. bdr-plugin/dev-bdr96 branch in 2ndQuadrant/bdr github repo. The
main file is seq2.c .

#8 Clifford Hammerschmidt
tanglebones@gmail.com
In reply to: Craig Ringer (#7)
Re: C based plugins, clocks, locks, and configuration variables

On Tue, Nov 8, 2016 at 2:58 PM, Craig Ringer <craig.ringer@2ndquadrant.com>
wrote:

> 2ndQuadrant/bdr

That is similar. I'm not clear on the usage of OID for sequence
(`DirectFunctionCall1(nextval_oid, seqoid)`) ... does that imply a lock
around a sequence generation? Also different is that your sequence doesn't
reset on the time basis, it ascends and wraps independently of the time.

(also, you appear to modulo against the max (2^n-1), not the cardinality
(2^n), ... should that be an & ... i.e. take SEQUENCE_BITS of 1 ->
MAX_SEQ_ID of ((1<<1)-1) = 1 -> (seq % 1) = {0} ... not {0,1} as expected;
(seq & 1) = {0,1} as expected)
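
The difference is easy to check in isolation. The sketch below takes the bit width as a parameter and compares reducing by the maximum value (modulo) with masking by it. `SEQUENCE_BITS` and `MAX_SEQ_ID` are the names quoted above; the function names are invented for this illustration and none of this is the BDR code.

```c
#include <assert.h>
#include <stdint.h>

/* Largest value that fits in `bits` bits, i.e. (1 << bits) - 1. */
uint64_t max_seq_id(unsigned bits)
{
    assert(bits >= 1 && bits < 64);
    return (UINT64_C(1) << bits) - 1;
}

/* Modulo by the maximum: the maximum itself can never be produced. */
uint64_t wrap_mod_max(uint64_t seq, unsigned bits)
{
    return seq % max_seq_id(bits);
}

/* Mask by the maximum: every value from 0 to the maximum is reachable. */
uint64_t wrap_mask(uint64_t seq, unsigned bits)
{
    return seq & max_seq_id(bits);
}
```

For a power-of-two range, masking with 2^n - 1 gives the same result as taking the value modulo 2^n, so it is the cardinality, not the maximum, that belongs on the right of `%`.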

We tried 64-bit values for ids (based on twitter's snowflake), but found
that time-replay would cause collisions. We had a server have its time
corrected, going backwards, by an admin; leading to duplicate ids being
generated, leading to a fun day of debugging and a hard lesson about our
assumption that time always increases over time. Using node_id doesn't
protect against this, since it is the same node creating the colliding ids
as the original ids. By extending the ids to include a significant amount
of randomness, and requiring a restart of the db for the time value to move
backwards (by latching onto the last seen time), we narrow the window for
collisions to close enough to zero that winning the lottery is far more
likely (http://preshing.com/20110504/hash-collision-probabilities/ has the
exact math). We also increase the time scale for id wrap around to long
past the likely life expectancy of the software we're building today.
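
A minimal sketch of the latching idea, under stated assumptions: the generator remembers the highest clock reading it has used and never hands out an earlier one, bumping a small counter whenever the clock stalls or steps backwards. Resetting the counter when time advances is inferred from the remark above about resetting "on the time basis". The struct and function names are hypothetical, and in a real plugin this state would presumably live in shared memory behind the lock discussed earlier in the thread rather than in a plain struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t last_time;  /* highest clock reading handed out so far */
    uint8_t  seq;        /* counter within the current last_time */
} TuidClock;

/*
 * Produce the (time, seq) pair for the next ID. `now` is the raw clock
 * reading; the returned time never decreases while the state survives.
 */
void tuid_clock_next(TuidClock *c, uint64_t now,
                     uint64_t *time_out, uint8_t *seq_out)
{
    assert(c != NULL && time_out != NULL && seq_out != NULL);

    if (now > c->last_time) {
        c->last_time = now;  /* clock moved forward: adopt it */
        c->seq = 0;
    } else {
        c->seq++;            /* same or earlier reading: keep the latched
                                time, bump seq (wraps from 255 to 0) */
    }
    *time_out = c->last_time;
    *seq_out = c->seq;
}
```

Since the latch is lost when the state is reinitialized, a backwards clock step can only take effect across a restart, which is the narrowed window described above.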

--
Clifford Hammerschmidt, P.Eng.

#9 Craig Ringer
craig.ringer@2ndquadrant.com
In reply to: Clifford Hammerschmidt (#8)
Re: C based plugins, clocks, locks, and configuration variables

On 10 November 2016 at 07:18, Clifford Hammerschmidt
<tanglebones@gmail.com> wrote:

> On Tue, Nov 8, 2016 at 2:58 PM, Craig Ringer <craig.ringer@2ndquadrant.com>
> wrote:
>
>> 2ndQuadrant/bdr
>
> That is similar. I'm not clear on the usage of OID for sequence
> (`DirectFunctionCall1(nextval_oid, seqoid)`) ... does that imply a lock
> around a sequence generation?

No.

> also different is that your sequence doesn't
> reset on the time basis, it ascends and wraps independently of the time.

Right.

> (also, you appear to modulo against the max (2^n-1), not the cardinality
> (2^n), ... should that be an & ... i.e. take SEQUENCE_BITS of 1 ->
> MAX_SEQ_ID of ((1<<1)-1) = 1 -> (seq % 1) = {0} ... not {0,1} as expected;
> (seq & 1) = {0,1} as expected)

Hm. I think you're right there.

> We tried 64-bit values for ids (based on twitter's snowflake), but found
> that time-replay would cause collisions. We had a server have its time
> corrected, going backwards, by an admin; leading to duplicate ids being
> generated, leading to a fun day of debugging and a hard lesson about our
> assumption that time always increases over time.

That's a good point, but it's just going to have to be a documented
limitation. BDR expects you to use NTP and slew time when needed
anyway.

> Using node_id doesn't
> protect against this, since it is the same node creating the colliding ids
> as the original ids. By extending the ids to include a significant amount of
> randomness, and requiring a restart of the db for the time value to move
> backwards (by latching onto the last seen time), we narrow the window for
> collisions to close enough to zero that winning the lottery is far more
> likely (http://preshing.com/20110504/hash-collision-probabilities/ has the
> exact math). We also increase the time scale for id wrap around to long past
> the likely life expectancy of the software we're building today.

It's a good idea. I like what you're doing. I've run into too many
sites that can't or won't use 128-bit generated values though.

#10 Craig Ringer
craig.ringer@2ndquadrant.com
In reply to: Clifford Hammerschmidt (#8)
Re: C based plugins, clocks, locks, and configuration variables

On 10 November 2016 at 07:18, Clifford Hammerschmidt
<tanglebones@gmail.com> wrote:

> On Tue, Nov 8, 2016 at 2:58 PM, Craig Ringer <craig.ringer@2ndquadrant.com>
> wrote:
>
>> 2ndQuadrant/bdr
>
> That is similar. I'm not clear on the usage of OID for sequence
> (`DirectFunctionCall1(nextval_oid, seqoid)`) ... does that imply a lock
> around a sequence generation? also different is that your sequence doesn't
> reset on the time basis, it ascends and wraps independently of the time.

Meant to explain more here.

Most of the system identifies sequence relations by oid. All this does
is call nextval. By accepting and passing oid we reduce the number of
syscache/relcache lookups and memory allocations required to call
nextval vs calling it by name. That's about all, really.
