Tree structure

Started by Kaare Rasmussenover 12 years ago12 messagesgeneral
Jump to latest
#1Kaare Rasmussen
kaare@jasonic.dk

Hi

I'm trying to determine the best way to represent a simple tree
structure (like a file/dir tree or a uri path). I guess that's done a
zillion times before; I just don't seem to be able to find the right
solution. I have one special request, that I'd like to find all
'shorter' paths, i.e. given 'a/b/c/d' it'll find

a
a/b
a/b/c
- but not
b
a/c
b/a

There are a number of options to test.

1. As strings
There's no dedicated function (@>)
WHERE clause should read something like 'a/b/c/d' LIKE column || '%',
which is both ugly and (I guess) non indexable
Perhaps regex indexes would work, but not efficient and not optimal

2. As array of strings
My favorite, would be elegant. A GIN index on the whole array would
make for fast performance
Alas @> treats the arrays as a set, not an array
WHERE col @> 'a/b/c/d' would find all of the above rows, including a,
a/c, b/a, etc.

3. ltree contrib
The only option that actually works and uses index
@> works as I want it to.
But the single segments can only be alphanumeric and underscore
ltree only supports GIST
there's a length limit.

The last option is the one I'm using right now, but I hope that somebody
can point out where I'm wrong with regards to the other options, or tell
me that there is a better solution somewhere I didn't look.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#2Alban Hertroys
haramrae@gmail.com
In reply to: Kaare Rasmussen (#1)
Re: Tree structure

1. As strings
There's no dedicated function (@>)
WHERE clause should read something like 'a/b/c/d' LIKE column || '%',
which is both ugly and (I guess) non indexable
Perhaps regex indexes would work, but not efficient and not optimal

2. As array of strings
My favorite, would be elegant. A GIN index on the whole array would make
for fast performance
Alas @> treats the arrays as a set, not an array
WHERE col @> 'a/b/c/d' would find all of the above rows, including a, a/c,
b/a, etc.

3. ltree contrib
The only option that actually works and uses index
@> works as I want it to.
But the single segments can only be alphanumeric and underscore
ltree only supports GIST
there's a length limit.

4. Using a recursive common table expression (CTE).
http://www.postgresql.org/docs/9.2/static/queries-with.html

--
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3Harald Fuchs
hari.fuchs@gmail.com
In reply to: Kaare Rasmussen (#1)
Re: Tree structure

Kaare Rasmussen <kaare@jasonic.dk> writes:

Hi

I'm trying to determine the best way to represent a simple tree
structure (like a file/dir tree or a uri path). I guess that's done a
zillion times before; I just don't seem to be able to find the right
solution. I have one special request, that I'd like to find all
shorter' paths, i.e. given 'a/b/c/d' it'll find

a
a/b
a/b/c
- but not
b
a/c
b/a

If I understand you correctly, you want a prefix match, and sure there's
a PostgreSQL extension for that:

CREATE EXTENSION prefix;

CREATE TABLE t1 (
id serial NOT NULL,
p prefix_range NOT NULL,
PRIMARY KEY (id)
);

CREATE INDEX pp ON t1 USING gist(p);

INSERT INTO t1 (p) VALUES
('a'),
('b'),
('a/c'),
('a/b'),
('b/a'),
('a/b/c');

EXPLAIN ANALYZE
SELECT id, p
FROM t1
WHERE p @> 'a/b/c/d'
;

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#4Kaare Rasmussen
kaare@jasonic.dk
In reply to: Alban Hertroys (#2)
Re: Tree structure

Hi Alban

4. Using a recursive common table expression (CTE).
http://www.postgresql.org/docs/9.2/static/queries-with.html

Yes, you're right. In fact that's what I'm testing a way to replace, as
I'm not confident in the performance in all situations. My fault
entirely; I should have told so from the start.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#5Rémi Cura
remi.cura@gmail.com
In reply to: Kaare Rasmussen (#4)
Re: Tree structure

BE carefull you have a number of limitation with recursive cte (I'm
thinking of update and so.)
You can work around with plpgsql but it might be painfull.

You forgot a solution :
if you need powerfull graph features,
use postgres as a database and a SPARQL speaking frontend.
It may be a bit of overkill ;-)

Cheers,
Rémi-C

2013/9/23 Kaare Rasmussen <kaare@jasonic.dk>

Show quoted text

Hi Alban

4. Using a recursive common table expression (CTE).

http://www.postgresql.org/**docs/9.2/static/queries-with.**html&lt;http://www.postgresql.org/docs/9.2/static/queries-with.html&gt;

Yes, you're right. In fact that's what I'm testing a way to replace, as
I'm not confident in the performance in all situations. My fault entirely;
I should have told so from the start.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/**mailpref/pgsql-general&lt;http://www.postgresql.org/mailpref/pgsql-general&gt;

#6Chris Travers
chris.travers@gmail.com
In reply to: Kaare Rasmussen (#4)
Re: Tree structure

On Sun, Sep 22, 2013 at 9:48 PM, Kaare Rasmussen <kaare@jasonic.dk> wrote:

Hi Alban

4. Using a recursive common table expression (CTE).

http://www.postgresql.org/**docs/9.2/static/queries-with.**html&lt;http://www.postgresql.org/docs/9.2/static/queries-with.html&gt;

Yes, you're right. In fact that's what I'm testing a way to replace, as
I'm not confident in the performance in all situations. My fault entirely;
I should have told so from the start.

It might be helpful for you to discuss what sorts of concerns you have and
how they fit into the specifics of your data. Trees are an area where
different uses may have different recommended solutions. I gave my
thoughts on performance on trees above. There are a few really bad areas I
can think of. For example, if you had a ten-layer deep scan where each
scan pulled around 10% or so of the table, you might be looking at 10
sequential scans and a fair bit of CPU time. If the result set was very
large, you might see things written to disk. There are a number of gotchas.

This being said, *usually* I find that recursive CTE's are one of the
better solutions out there for trees and I think they will perform better
in more situations than many of the other solutions.

--

Best Wishes,
Chris Travers

Efficito: Hosted Accounting and ERP. Robust and Flexible. No vendor
lock-in.
http://www.efficito.com/learn_more.shtml

#7Merlin Moncure
mmoncure@gmail.com
In reply to: Kaare Rasmussen (#1)
Re: Tree structure

On Fri, Sep 20, 2013 at 6:29 AM, Kaare Rasmussen <kaare@jasonic.dk> wrote:

Hi

I'm trying to determine the best way to represent a simple tree structure
(like a file/dir tree or a uri path). I guess that's done a zillion times
before; I just don't seem to be able to find the right solution. I have one
special request, that I'd like to find all 'shorter' paths, i.e. given
'a/b/c/d' it'll find

a
a/b
a/b/c
- but not
b
a/c
b/a

There are a number of options to test.

1. As strings
There's no dedicated function (@>)
WHERE clause should read something like 'a/b/c/d' LIKE column || '%',
which is both ugly and (I guess) non indexable
Perhaps regex indexes would work, but not efficient and not optimal

2. As array of strings
My favorite, would be elegant. A GIN index on the whole array would make
for fast performance
Alas @> treats the arrays as a set, not an array
WHERE col @> 'a/b/c/d' would find all of the above rows, including a, a/c,
b/a, etc.

3. ltree contrib
The only option that actually works and uses index
@> works as I want it to.
But the single segments can only be alphanumeric and underscore
ltree only supports GIST
there's a length limit.

The last option is the one I'm using right now, but I hope that somebody can
point out where I'm wrong with regards to the other options, or tell me that
there is a better solution somewhere I didn't look.

All materialized path approaches will face index limit so be warned.
For my money, I'd be using ltree if you need complex indexed searching
or some TEXT+btree for simple searching (LIKE '/foo/bar%') on the
'full path' side. maview type approaches are faster to search but
slower to update than a recursive type structure where you use
parent/child relationships + recursive CTE to query.

merlin

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#8Kaare Rasmussen
kaare@jasonic.dk
In reply to: Harald Fuchs (#3)
Re: Tree structure

Sorry, got tangled up in this thing called 'real life'.

If I understand you correctly, you want a prefix match, and sure there's
a PostgreSQL extension for that:

OK, that seems to do the job, thanks a lot. The only small quibble is
that it's an extension.

I'm quite surprised there seem to be no way in core to treat an array as
an array. Using @> treats it as a set, AFAICT.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#9Merlin Moncure
mmoncure@gmail.com
In reply to: Kaare Rasmussen (#8)
Re: Tree structure

On Thu, Oct 10, 2013 at 1:00 AM, Kaare Rasmussen <kaare@jasonic.dk> wrote:

Sorry, got tangled up in this thing called 'real life'.

If I understand you correctly, you want a prefix match, and sure there's
a PostgreSQL extension for that:

OK, that seems to do the job, thanks a lot. The only small quibble is that
it's an extension.

I'm quite surprised there seem to be no way in core to treat an array as an
array. Using @> treats it as a set, AFAICT.

can you elaborate on that?

merlin

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#10Kaare Rasmussen
kaare@jasonic.dk
In reply to: Merlin Moncure (#9)
Re: Tree structure

Hi Merlin

On Thu, Oct 10, 2013 at 1:00 AM, Kaare Rasmussen <kaare@jasonic.dk> wrote:

I'm quite surprised there seem to be no way in core to treat an array as an
array. Using @> treats it as a set, AFAICT.

can you elaborate on that?

merlin

To me, an array is a vector (or a vector of vectors). So I'm looking for
an operator where

ARRAY[1,4,3] doesn't contain ARRAY[3,1] and
ARRAY[2,7] isn't contained by ARRAY[1,7,4,2,6] (but ARRAY[1,7,4] is)

IOW order matters to me, but not to the array operators mentioned in
http://www.postgresql.org/docs/9.3/static/functions-array.html. Note
that index support is important.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#11Rémi Cura
remi.cura@gmail.com
In reply to: Kaare Rasmussen (#10)
Re: Tree structure

Hey sorry if my answer is stupid,
but there is an extension for array, even if it is limited to int (but int
could be indexes of row)
It's named http://www.postgresql.org/docs/9.3/static/intarray.html
It provides essential function, although lacking some (I re-implemented
union of array with disjoint result).
I think this extension uses indexes

Cheers,
Rémi-C

2013/10/10 Kaare Rasmussen <kaare@jasonic.dk>

Show quoted text

Hi Merlin

On Thu, Oct 10, 2013 at 1:00 AM, Kaare Rasmussen <kaare@jasonic.dk>

wrote:

I'm quite surprised there seem to be no way in core to treat an array as
an
array. Using @> treats it as a set, AFAICT.

can you elaborate on that?

merlin

To me, an array is a vector (or a vector of vectors). So I'm looking for
an operator where

ARRAY[1,4,3] doesn't contain ARRAY[3,1] and
ARRAY[2,7] isn't contained by ARRAY[1,7,4,2,6] (but ARRAY[1,7,4] is)

IOW order matters to me, but not to the array operators mentioned in
http://www.postgresql.org/**docs/9.3/static/functions-**array.html&lt;http://www.postgresql.org/docs/9.3/static/functions-array.html&gt;.
Note that index support is important.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/**mailpref/pgsql-general&lt;http://www.postgresql.org/mailpref/pgsql-general&gt;

#12Kaare Rasmussen
kaare@jasonic.dk
In reply to: Rémi Cura (#11)
Re: Tree structure

Hi Rémi

Hey sorry if my answer is stupid,
but there is an extension for array, even if it is limited to int (but
int could be indexes of row)
It's named http://www.postgresql.org/docs/9.3/static/intarray.html
It provides essential function, although lacking some (I
re-implemented union of array with disjoint result).
I think this extension uses indexes

Thanks for your answer. If there were a similar strarray, it would be
top nice :-)

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general