Re: [postgis-users] Union as an aggregate

Started by strkover 22 years ago3 messagesgeneral
Jump to latest
#1strk
strk@keybit.net

dblasby wrote:

strk wrote:

Dave, I made no special 'unite' code, just sql create command:
CREATE AGGREGATE unite (
sfunc = GeomUnion,
basetype = geometry,
stype = geometry
);

Hum - this shouldnt do anything.

The aggregate should have "null" as its initial value.

GeomUnion(null,<geometry>) --> null.

So, you'll end up with null *very* quickly because
GeomUnion(null,<geometry>) returns null without any actual execution.

It looks like postgres is behaving differently since I always get
a not-null result.

Are you running this on a big table? At worst I'd expect this to use up
memory the same size all the geometries in the table.

Geometry set size is about 2MB:
gis=# select sum(mem_size(the_geom)) from world where gid < 50;
2148316

Initial postmaster process status is:
PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
566 pgsql 15 0 3288 3288 2684 S 0.0 0.4 0:00 postmaster

After query:
gis=# select box(unite(the_geom)) from plmshp02_1 where gid < 50;
(180,83.1138763427734),(-180,-90)

The postmaster grew of about 66M (having touched a 78MB upper limit):
PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
566 pgsql 19 0 71472 69M 3396 S 0.0 9.2 3:35 postmaster

If I run that again, *exactly the same query*:
PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
566 pgsql 14 0 126M 126M 3396 S 0.0 16.8 7:13 postmaster

It looks like someone is leaking memory, either postgres, postgis or geos.
Does it sound correct ?

PS: postgres is 7.3.4, postgis and geos are latest cvs.
--strk;

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: strk (#1)

strk <strk@keybit.net> writes:

dblasby wrote:

Hum - this shouldnt do anything.
The aggregate should have "null" as its initial value.
GeomUnion(null,<geometry>) --> null.
So, you'll end up with null *very* quickly because
GeomUnion(null,<geometry>) returns null without any actual execution.

It looks like postgres is behaving differently since I always get
a not-null result.

Yes. There's a special definition for aggregate behavior when the
transition function is strict and the initial value is null. As Dave
observes, such a combination would have no useful purpose if we did
not special-case it. The special definition is "take the first
non-null input as the initial transition value, and then apply the
transition function for each subsequent non-null input". This does
exactly what's needed for MAX and MIN, and is probably about what
you'd want for a geometric union operator too.

If I run that again, *exactly the same query*:
PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
566 pgsql 14 0 126M 126M 3396 S 0.0 16.8 7:13 postmaster

It looks like someone is leaking memory, either postgres, postgis or geos.

On some platforms top's report of memory used can be misleading, because
it begins to count each page of shared memory against a process when the
process first touches that page.  So if you have a big scan that touches
more and more of the shared buffers, the reported process size goes up
--- but there's really no memory leak.  Try a plain "select count(*)"
against your table and see if you see the same change in reported size.
Alternatively, if the reported size continues to increase well beyond
your shared memory allocation, then I'd believe that as evidence of a
leak.

regards, tom lane

#3strk
strk@keybit.net
In reply to: Tom Lane (#2)

tgl wrote:

strk <strk@keybit.net> writes:

If I run that again, *exactly the same query*:
PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
566 pgsql 14 0 126M 126M 3396 S 0.0 16.8 7:13 postmaster

It looks like someone is leaking memory, either postgres, postgis or geos.

On some platforms top's report of memory used can be misleading, because
it begins to count each page of shared memory against a process when the
process first touches that page.  So if you have a big scan that touches
more and more of the shared buffers, the reported process size goes up
--- but there's really no memory leak.  Try a plain "select count(*)"
against your table and see if you see the same change in reported size.

No changes in size with count(*).
Testing platform is FreeBSD.

Alternatively, if the reported size continues to increase well beyond
your shared memory allocation, then I'd believe that as evidence of a
leak.

regards, tom lane

thanks for you answer.

--strk;