High-Concurrency GiST in PostgreSQL
Hello. This is my first post. As such, feedback on style and choice of
venue is especially welcome.
I am a regular but not especially expert user of a variety of databases,
including PostgreSQL.
I have only modest experience with spatial databases.
I have a new project[1] in which GiST could be very useful, provided I can
achieve high concurrency. Starting with some empirical evidence that R*
would be a good place to start, and after reading "High-Concurrency Locking
in R-Trees" [2], I went looking for an implementation of R-link trees
extended to R*. So I was very interested to read Hellerstein et al. [3],
where they wrote:
*High concurrency, recoverability, and degree-3 consistency are critical
factors in a full-fledged database system. We are considering extending
the results of Kornacker and Banks for R-trees [KB95] to our
implementation of GiSTs.*
Since this information may be somewhat dated, and GiST has obviously come a
long way in PostgreSQL, I am looking for current information and advice on
the state of concurrency in GiST in PostgreSQL. If someone has already
done an R*-link tree, then that could really help me. (I can wish, no?)
Thanks for reading and thanks for advice or pointers.
Carlos
[1]: It's not a GIS project, but it has some similarities:
(a) I need to manage up to 10 million three-dimensional "boxes" or as few
as 1000 "boxes".
(b) The distributions of sizes, aspect ratios, and locations in R3 are all
unknown a priori and may change during execution under insert/delete.
(c) Queries may arrive asynchronously and at high rate from hundreds (or
more?) of compute nodes.
(d) Successive queries from any node, viewed as a time-sequence, may have
very low (or at best sporadic) spatial correlation -- lots of page jumps.
(e) R* will be advantageous over R, but Priority R is probably not
especially useful since turnover may be greater than 20% during a "job."
(f) I would like to avoid the complications of distributed databases, again
because of the high turnover.
[2]: Marcel Kornacker and Douglas Banks. High-Concurrency Locking in R-Trees. (1995)
[3]: Hellerstein, Naughton, and Pfeffer. Generalized Search Trees for Database Systems. (1995)
On 12/5/2011 12:31 PM, C. Mundi wrote:
Hello. This is my first post. As such, feedback on style and choice of
venue is especially welcome. I am a regular but not especially expert user
of a variety of databases, including PostgreSQL. I have only modest
experience with spatial databases. I have a new project[1] in which GiST
could be very useful, provided I can achieve high concurrency. <SNIP>
Concurrency here can mean different things. One application hitting PG
which then uses multiple threads? (Not currently possible.) Or one app
with multiple threads, each having a database connection? (Which is
really the same as) multiple apps, each having a database connection?
PG limits one database connection to one CPU. Multiple connections will
use multiple CPUs.
Or, by concurrency, do you mean non-blocking? And if you mean
non-blocking, is that for reads, writes, or both?
In PG you can do non-blocking reads over multiple connections (i.e.
multiple CPUs) as much as you want.
Extending to indexes: many connections can read a GiST index at the same
time. Is that what you need?
-Andy
On Mon, Dec 5, 2011 at 12:26 PM, Andy Colson <andy@squeakycode.net> wrote:
<SNIP>
Thanks, Andy. You're quite right of course. I'm thinking of concurrent
clients. Lots of them. I envision thousands of actors (which could be
threads within a process or separate processes) behaving as clients, each
with its own connection to a single-threaded database server. So
concurrency here simply means that the database has to be able to handle a
lot of "simultaneous" connections coming at it fast and asynchronously.
(I'm prepared to deploy a thin queuing layer if it turns out that I
saturate the server.) The compute nodes are doing heavy physics which
depends on spatially organized data, and it is very difficult to predict
which rows an actor will need next. (In fact, knowing that would
presuppose that the problem to be solved could be factored at great savings
in computation.)
So what I really need is minimal locking, as in [Kornacker and Banks 1995].
The whole database can be pre-loaded before the start of the calculation.
Now about 80% of the data in the database will never change during a run.
But about 20% will change via "feedback" from the compute nodes. And the
nature of the problem is that we do not know *a priori* which data is in
the 80% and which is in the 20%. If we did, we could split the database to
ensure no block-on-write impact on reads for the 80%. Alas, we have to
assume that reads and writes are mixed with statistics yet unknown.
The 20% which changes, changes spatially, not just in content. This can
lead to the need to rebalance on inserts. And since splitting nodes in
"naive" R* trees is kind of expensive [1], I am wondering to what extent
the "sibling links" approach described by Kornacker and Banks has -- as
anticipated by Hellerstein *et al.* -- already been implemented in GiST in
PostgreSQL. If it has, then I win just by using the 'cube' contrib
extension. Hellerstein notes that sibling pointers have long been in
common use even in B-trees, so I am optimistic for GiST.
So that's my concern. I'm doing 80% reads which are all non-blocking with
20% writes mixed in, and I need to avoid the effect of writes blocking
queries which do not need to traverse branches affected by the write.
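For concreteness, the cube route I have in mind would look roughly like
this (the table and column names here are invented, just to illustrate):

```sql
CREATE EXTENSION cube;

-- hypothetical table of 3-D boxes
CREATE TABLE boxes (
    id   serial PRIMARY KEY,
    bbox cube
);

-- GiST index over the boxes
CREATE INDEX boxes_bbox_idx ON boxes USING gist (bbox);

-- a box spanning (0,0,0) to (1,2,3)
INSERT INTO boxes (bbox) VALUES (cube(ARRAY[0,0,0], ARRAY[1,2,3]));

-- overlap query: all boxes intersecting the query box
SELECT id FROM boxes
WHERE bbox && cube(ARRAY[0.5,0.5,0.5], ARRAY[2,2,2]);
```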
Thanks!
Carlos
[1]: Even if inserts were not potentially expensive for the database, the
prospect that an insert triggered by one compute node could occasionally
cause *all* the compute to stall when not logically necessary is horrifying.
On 12/05/11 1:34 PM, C. Mundi wrote:
So that's my concern. I'm doing 80% reads which are all non-blocking
with 20% writes mixed in, and I need to avoid the effect of writes
blocking queries which do not need to traverse branches affected by
the write.
Postgres does no blocking on inserts/updates. The commonest lock is when
you're doing a transaction and need to select something prior to updating
it; then you use a SELECT ... FOR UPDATE. This locks just the rows you're
going to update so no one else can update them (but other clients can
still read the existing value prior to your COMMIT).
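For example, with a hypothetical accounts table, that pattern looks like:

```sql
BEGIN;
-- locks only the row with id = 42 against other writers;
-- plain readers are unaffected and see the pre-COMMIT value
SELECT balance FROM accounts WHERE id = 42 FOR UPDATE;
UPDATE accounts SET balance = balance - 10 WHERE id = 42;
COMMIT;
```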
--
john r pierce N 37, W 122
santa cruz ca mid-left coast
On 12/05/11 1:34 PM, C. Mundi wrote:
Thanks, Andy. You're quite right of course. I'm thinking of
concurrent clients. Lots of them. I envision thousands of actors
(which could be threads within a process or separate processes)
behaving as clients, each with its own connection to a single-threaded
database server. So concurrency here simply means that the database
has to be able to handle a lot of "simultaneous" connections coming at
it fast and asynchronously. (I'm prepared to deploy a thin queuing
layer if it turns out that I saturate the server.) The compute nodes
are doing heavy physics which depends on spatially organized data, and
it is very difficult to predict which rows an actor will need next.
(In fact, knowing that would presuppose that the problem to be solved
could be factored at great savings in computation.)
for optimal performance, you typically don't want more active queries
than the number of CPU hardware threads if you're running purely from
cache, and maybe that number plus the number of physical disk drives if
you're doing lots of disk IO. I'm running OLTP style operations on a
dual 6-core 'Sandy Bridge' Xeon; this has 12 cores of 2 threads each, or
24 total hardware threads, and the server has 25 2.5" SAS drives, so we
find staying around 50 active queries is the sweet spot. we manage
this by using a message queueing system in which we can dynamically tune
the worker thread count; these worker threads hold the actual
database connections... that way 1000s of actual clients can lob
requests into the messaging system (AMQP, JMS, etc. would be suitable
candidates, but we have our own in-house system for legacy reasons), and
50 worker processes take these requests, handle them, and reply.
On 12/5/2011 3:41 PM, John R Pierce wrote:
<SNIP>
As an addition to this: reads and writes won't block each other, but
you'll need to watch the overlap if it's a problem. There are many ways
to go about it depending on what you want (transaction isolation levels,
locking, etc).
In general, I think it might look like:
connection1:
start transaction;
select * from table where the_geom && POINT(a, b);
connection2:
start transaction;
update table set the_geom = POLYGON(a, b, c, d) where rowid = 5;
connection1: (in the same transaction it started above)
select the_geom from table where rowid = 5;
-- gets the original geom, NOT the one from connection2!
There are transaction options for read committed, read uncommitted,
etc., etc. I don't rightly understand them all, but it sounds like
you'll want to.
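They're set per transaction, something like this (table and column names
are made up):

```sql
BEGIN;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
-- every query in this transaction now sees the same snapshot,
-- regardless of commits by other connections in the meantime
SELECT the_geom FROM shapes WHERE rowid = 5;
COMMIT;
```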
traverse branches affected by the write
I assume that's a reference to building an underlying tree structure.
You won't need to worry about it. On the other hand, if that's a
reference to some geo-boxing thing where one row is included in another
and you need to update multiple rows -- and I'm starting to confuse
myself -- then you might have a problem.
Also, as John points out, you'll want a connection pooler. I've heard
good things about pgPool. It'll also spread reads across multiple
computers just in case you need a faster response. (Writes go to all
computers; reads round-robin.)
-Andy
Agreed on the importance of understanding the transaction modes.
I was specifically pointing to the potential latency of reads blocked
during node splits on insert, when rebalancing. But as Paul points
out, Postgres does Ang/Tan splits. While less optimal than R* splits,
Ang/Tan is faster as I recall. So it might not be so bad overall.
And I appreciate the tip to look at pgPool, which I didn't know about and
will read up on.
Thanks,
Carlos
On Dec 5, 2011 3:26 PM, "Andy Colson" <andy@squeakycode.net> wrote:
<SNIP>
On 12/5/2011 12:31 PM, C. Mundi wrote:
Hello. This is my first post. As such, feedback on style and choice of
venue are especially welcome.
If I might add a personal request: way down the road, after you get
things working, would you mind dropping by and letting us know what
happened? Did PG work for you or not? Good/bad, etc.
We get to hear the problems, but hardly ever the solutions.
-Andy