Query with LIMIT but as random result set?

Started by Stefan Kellerover 13 years ago10 messagesgeneral
Jump to latest
#1Stefan Keller
sfkeller@gmail.com

Hi

I have a query like this

SELECT ST_AsText(way) geom, name AS label
FROM osm_point
LIMIT 10;

When I repeatedly do this, the result set will be always the same.
I have observed this only empirically and I know that the ordering of
the result set is undefined without ORDER BY.
There are two indexes involved, one geospatial for way and one for name.

My question is: Does someone have an idea on how to randomize the
result set on every consecutive query?
And as an option the (limited) resultset should be spatially
distributed (not clustered).

Yours, Stefan

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#2Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Stefan Keller (#1)
Re: Query with LIMIT but as random result set?

On 01/08/2013 07:20 AM, Stefan Keller wrote:

Hi

I have a query like this

SELECT ST_AsText(way) geom, name AS label
FROM osm_point
LIMIT 10;

When I repeatedly do this, the result set will be always the same.
I have observed this only empirically and I know that the ordering of
the result set is undefined without ORDER BY.
There are two indexes involved, one geospatial for way and one for name.

My question is: Does someone have an idea on how to randomize the
result set on every consecutive query?
And as an option the (limited) resultset should be spatially
distributed (not clustered).

SELECT ST_AsText(way) geom, name AS label
FROM osm_point ORDER BY random()
LIMIT 10;

Yours, Stefan

--
Adrian Klaver
adrian.klaver@gmail.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3Stefan Keller
sfkeller@gmail.com
In reply to: Stefan Keller (#1)
Re: [postgis-users] Query with LIMIT but as random result set?

Hi,

Thanks a lot to Adrian, Nicolas and Brooks.
"... ORDER BY random() LIMIT 10;" works ok.

But with the following option it gets more tricky assume:

And as an option the (limited) resultset should be spatially
distributed (not clustered).

I'm thinking about some radial spatial distribution function.

Yours, Stefan

2013/1/8 Brooks Kehler <brookskehler@gmail.com>:

this should work -

order by random() limit 10;

On Tue, Jan 8, 2013 at 10:20 AM, Stefan Keller <sfkeller@gmail.com> wrote:

Hi

I have a query like this

SELECT ST_AsText(way) geom, name AS label
FROM osm_point
LIMIT 10;

When I repeatedly do this, the result set will be always the same.
I have observed this only empirically and I know that the ordering of
the result set is undefined without ORDER BY.
There are two indexes involved, one geospatial for way and one for name.

My question is: Does someone have an idea on how to randomize the
result set on every consecutive query?
And as an option the (limited) resultset should be spatially
distributed (not clustered).

Yours, Stefan
_______________________________________________
postgis-users mailing list
postgis-users@lists.osgeo.org
http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users

_______________________________________________
postgis-users mailing list
postgis-users@lists.osgeo.org
http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#4Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Stefan Keller (#3)
Re: [postgis-users] Query with LIMIT but as random result set?

Stefan Keller wrote:

"... ORDER BY random() LIMIT 10;" works ok.

But with the following option it gets more tricky assume:

And as an option the (limited) resultset should be spatially
distributed (not clustered).

I'm thinking about some radial spatial distribution function.

So, you explicitly *don't* want a random selection?  By "spatially
distributed" you mean that if you have already chosen one
particular location, other locations which are close to it should
be less probable (or impossible) to include in the limited result
set? How would you define the desired result? The one with the
highest best solution to the "traveling salesman" problem?

-Kevin

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#5Stefan Keller
sfkeller@gmail.com
In reply to: Kevin Grittner (#4)
Re: [postgis-users] Query with LIMIT but as random result set?

Hi Kevin

No; I'm thinking about some query (or function) that selects random
points (POIs) with certain characteristics like decreasing density. I
didn't find much theory about how to *create* such random points.

There seems to be more literature and implementation about measuring
geographic distribution (like in ArcGIS http://bit.ly/13lTFj9 ).

Under "radial distribution function" I understand a function which
describes how density varies depending on the distance from a
reference point (= the user).

Yours, Stefan

2013/1/10 Kevin Grittner <kgrittn@mail.com>:

Stefan Keller wrote:

"... ORDER BY random() LIMIT 10;" works ok.

But with the following option it gets more tricky assume:

And as an option the (limited) resultset should be spatially
distributed (not clustered).

I'm thinking about some radial spatial distribution function.

So, you explicitly *don't* want a random selection? By "spatially
distributed" you mean that if you have already chosen one
particular location, other locations which are close to it should
be less probable (or impossible) to include in the limited result
set? How would you define the desired result? The one with the
highest best solution to the "traveling salesman" problem?

-Kevin

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#6Gavin Flower
GavinFlower@archidevsys.co.nz
In reply to: Stefan Keller (#5)
Re: [postgis-users] Query with LIMIT but as random result set?

Hi Stefan,

Please do not to post! :-)
Here, we post replies at the end, so people can see the context before
the new stuff.

On 11/01/13 11:36, Stefan Keller wrote:

Hi Kevin

No; I'm thinking about some query (or function) that selects random
points (POIs) with certain characteristics like decreasing density. I
didn't find much theory about how to *create* such random points.

There seems to be more literature and implementation about measuring
geographic distribution (like in ArcGIS http://bit.ly/13lTFj9 ).

Under "radial distribution function" I understand a function which
describes how density varies depending on the distance from a
reference point (= the user).

Yours, Stefan

2013/1/10 Kevin Grittner <kgrittn@mail.com>:

Stefan Keller wrote:

"... ORDER BY random() LIMIT 10;" works ok.

But with the following option it gets more tricky assume:

And as an option the (limited) resultset should be spatially
distributed (not clustered).

I'm thinking about some radial spatial distribution function.

So, you explicitly *don't* want a random selection? By "spatially
distributed" you mean that if you have already chosen one
particular location, other locations which are close to it should
be less probable (or impossible) to include in the limited result
set? How would you define the desired result? The one with the
highest best solution to the "traveling salesman" problem?

-Kevin

Note that even for an even probability distribution,you have to be careful.

There are at least 2 ways to distribute points in a circle, both are
valid in different contexts.

(1) Select (x,y) in a square containing the circle, through away any
points outside the circle. This has an even probability density per
unit area of the circle.

(2) Select (theta, r) at random. This has a greater density of points
at the centre reducing lineraly towards the circumference.

I think something like the second fits your situation.

DROP TABLE IF EXISTS ranpoint;

CREATE TABLE ranpoint
(
id int,
theta float,
r float
);

-- theta in radians
-- for radius = 100

INSERT INTO ranpoint
(id, theta, r)
VALUES
(generate_series(1, 10), pi() * random(), 100 * random());

TABLE ranpoint;

Cheers,
Gavin

#7Bosco Rama
postgres@boscorama.com
In reply to: Gavin Flower (#6)
Re: [postgis-users] Query with LIMIT but as random result set?

On 01/11/13 09:31, Gavin Flower wrote:

-- theta in radians
-- for radius = 100

INSERT INTO ranpoint
(id, theta, r)
VALUES
(generate_series(1, 10), pi() * random(), 100 * random());

Shouldn't the value for theta be:
2 * pi() * random()

Bosco.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#8Gavin Flower
GavinFlower@archidevsys.co.nz
In reply to: Bosco Rama (#7)
Re: [postgis-users] Query with LIMIT but as random result set?

On 12/01/13 06:45, Bosco Rama wrote:

On 01/11/13 09:31, Gavin Flower wrote:

-- theta in radians
-- for radius = 100

INSERT INTO ranpoint
(id, theta, r)
VALUES
(generate_series(1, 10), pi() * random(), 100 * random());

Shouldn't the value for theta be:
2 * pi() * random()

Bosco.

Very definitely! :-)

Me bad, as the saying goes...

Cheers,
Gavin

#9Gavan Schneider
pg-gts@snkmail.com
In reply to: Gavin Flower (#8)
>

On Saturday, January 12, 2013 at 04:49, Gavin Flower wrote:

On 12/01/13 06:45, Bosco Rama wrote:

Shouldn't the value for theta be:
2 * pi() * random()

Bosco.

Very definitely! :-)

One could also ask if the value for theta shouldn't be:
tau() * random()

<http://tauday.com/&gt; :-)

Regards
Gavan

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#10Gavin Flower
GavinFlower@archidevsys.co.nz
In reply to: Gavan Schneider (#9)
Re: >

On 12/01/13 10:44, Gavan Schneider wrote:

On Saturday, January 12, 2013 at 04:49, Gavin Flower wrote:

On 12/01/13 06:45, Bosco Rama wrote:

Shouldn't the value for theta be:
2 * pi() * random()

Bosco.

Very definitely! :-)

One could also ask if the value for theta shouldn't be:
tau() * random()

<http://tauday.com/&gt; :-)

Regards
Gavan

Well Gavan,

I must bow down before your greater wisdom, as I am forced to agree with
you!

Especially as your name sorts before mine, yet our names are almost
exactly the same. :-)

Cheers,
Gavin

P.S. Is tau() a standard part of pg core- if not, when will it be?