Wording in TABLESAMPLE documentation

Started by Noname, over 9 years ago, 8 messages, docs
#1 Noname
paddor@gmail.com

The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/9.6/static/sql-select.html
Description:

Regarding the TABLESAMPLE documentation on [1], I think in the following
sentence

> If REPEATABLE is not given then a new random sample is selected for each
> query.

the word "sample" should be "seed". Of course it results in a new random
sample as well, but IMHO this sentence is about what happens to the seed in
case REPEATABLE (seed) is omitted.

Best regards,
Patrik Wenger

[1]: https://www.postgresql.org/docs/9.6/static/sql-select.html

--
Sent via pgsql-docs mailing list (pgsql-docs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-docs

#2 Simon Riggs
simon@2ndQuadrant.com
In reply to: Noname (#1)
Re: Wording in TABLESAMPLE documentation

On 11 August 2016 at 17:21, <paddor@gmail.com> wrote:

The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/9.6/static/sql-select.html
Description:

Regarding the TABLESAMPLE documentation on [1], I think in the following
sentence

> If REPEATABLE is not given then a new random sample is selected for each
> query.

the word "sample" should be "seed". Of course it results in a new random
sample as well, but IMHO this sentence is about what happens to the seed in
case REPEATABLE (seed) is omitted.

Corrected, thanks.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#2)
Re: Wording in TABLESAMPLE documentation

Simon Riggs <simon@2ndquadrant.com> writes:

On 11 August 2016 at 17:21, <paddor@gmail.com> wrote:

> If REPEATABLE is not given then a new random sample is selected for each
> query.

the word "sample" should be "seed". Of course it results in a new random
sample as well, but IMHO this sentence is about what happens to the seed in
case REPEATABLE (seed) is omitted.

Corrected, thanks.

I do not think this is an improvement. The sentence was specifically about
whether the sample (that is, the set of rows selected) would change. This
rewording essentially removes that user-visible behavior guarantee, and
for what? It's certainly not any clearer.

regards, tom lane

#4 Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#3)
Re: Wording in TABLESAMPLE documentation

On 12 August 2016 at 15:24, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Simon Riggs <simon@2ndquadrant.com> writes:

On 11 August 2016 at 17:21, <paddor@gmail.com> wrote:

> If REPEATABLE is not given then a new random sample is selected for each
> query.

the word "sample" should be "seed". Of course it results in a new random
sample as well, but IMHO this sentence is about what happens to the seed in
case REPEATABLE (seed) is omitted.

Corrected, thanks.

I do not think this is an improvement. The sentence was specifically about
whether the sample (that is, the set of rows selected) would change. This
rewording essentially removes that user-visible behavior guarantee, and
for what? It's certainly not any clearer.

It was supposed to be a correction, rather than an improvement. I saw
the use of the word "sample" as an error.

But now you mention it, I agree with you. Let's put it back to say
"sample" but also explain where that new sample comes from... my
attempt to explain this better is in square brackets

"If REPEATABLE is not given then a new random sample will be taken for
each query [based upon the global seed value for the current user.]"

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#4)
Re: Wording in TABLESAMPLE documentation

Simon Riggs <simon@2ndquadrant.com> writes:

But now you mention it, I agree with you. Let's put it back to say
"sample" but also explain where that new sample comes from... my
attempt to explain this better is in square brackets

"If REPEATABLE is not given then a new random sample will be taken for
each query [based upon the global seed value for the current user.]"

I think "global" might have implications we don't want. How about
adding ", based on a system-generated seed"?

regards, tom lane

#6 Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#5)
Re: Wording in TABLESAMPLE documentation

On 12 August 2016 at 16:23, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Simon Riggs <simon@2ndquadrant.com> writes:

But now you mention it, I agree with you. Let's put it back to say
"sample" but also explain where that new sample comes from... my
attempt to explain this better is in square brackets

"If REPEATABLE is not given then a new random sample will be taken for
each query [based upon the global seed value for the current user.]"

I think "global" might have implications we don't want. How about
adding ", based on a system-generated seed"?

What I was trying to express was that

SELECT setseed(dp);
SELECT * FROM foo TABLESAMPLE ...;
SELECT * FROM foo TABLESAMPLE ...;
SELECT * FROM foo TABLESAMPLE ...;

would yield a repeatable set of samples, similarly repeatable but not
same samples as

SELECT * FROM foo TABLESAMPLE ... REPEATABLE;
SELECT * FROM foo TABLESAMPLE ... REPEATABLE;
SELECT * FROM foo TABLESAMPLE ... REPEATABLE;

so that people understand there is some predictability even without REPEATABLE.

So I don't understand the "based on a system-generated seed", but
maybe I'm missing information.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#7 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#6)
Re: Wording in TABLESAMPLE documentation

Simon Riggs <simon@2ndquadrant.com> writes:

On 12 August 2016 at 16:23, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I think "global" might have implications we don't want. How about
adding ", based on a system-generated seed"?

What I was trying to express was that

SELECT setseed(dp);
SELECT * FROM foo TABLESAMPLE ...;
SELECT * FROM foo TABLESAMPLE ...;
SELECT * FROM foo TABLESAMPLE ...;

would yield a repeatable set of samples, similarly repeatable but not
same samples as

SELECT * FROM foo TABLESAMPLE ... REPEATABLE;
SELECT * FROM foo TABLESAMPLE ... REPEATABLE;
SELECT * FROM foo TABLESAMPLE ... REPEATABLE;

But that's *wrong*. Not all tablesample methods make any such guarantee.
In fact, neither of our contrib methods do. Only if you use REPEATABLE
(and the method allows it) is there any promise at all about repeatability.

regards, tom lane

#8 Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#7)
Re: Wording in TABLESAMPLE documentation

On 12 August 2016 at 18:54, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Simon Riggs <simon@2ndquadrant.com> writes:

On 12 August 2016 at 16:23, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I think "global" might have implications we don't want. How about
adding ", based on a system-generated seed"?

What I was trying to express was that

SELECT setseed(dp);
SELECT * FROM foo TABLESAMPLE ...;
SELECT * FROM foo TABLESAMPLE ...;
SELECT * FROM foo TABLESAMPLE ...;

would yield a repeatable set of samples, similarly repeatable but not
same samples as

SELECT * FROM foo TABLESAMPLE ... REPEATABLE;
SELECT * FROM foo TABLESAMPLE ... REPEATABLE;
SELECT * FROM foo TABLESAMPLE ... REPEATABLE;

But that's *wrong*. Not all tablesample methods make any such guarantee.
In fact, neither of our contrib methods do. Only if you use REPEATABLE
(and the method allows it) is there any promise at all about repeatability.

OK, fair enough. I'll just use your wording then. Thanks.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
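
For readers following the thread, the distinction it settles on (REPEATABLE is what promises a stable sample; without it each query takes a new random sample) can be sketched roughly as below. This is only an illustration, not the wording or test case used by the participants: the temporary table foo_demo, the choice of the built-in BERNOULLI method, the 10 percent argument, and the seed 42 are all assumptions made for the example.

```sql
-- Hypothetical demo table (not from the thread): 1000 rows with ids 1..1000.
CREATE TEMP TABLE foo_demo AS
SELECT g AS id
FROM generate_series(1, 1000) AS g;

-- With REPEATABLE and the same seed and argument, the built-in BERNOULLI
-- method should return the same rows both times, as long as the table
-- has not been changed in between.
SELECT id FROM foo_demo TABLESAMPLE BERNOULLI (10) REPEATABLE (42) ORDER BY id;
SELECT id FROM foo_demo TABLESAMPLE BERNOULLI (10) REPEATABLE (42) ORDER BY id;

-- Without REPEATABLE, each query takes a new random sample, so the two
-- result sets below will generally differ from each other.
SELECT id FROM foo_demo TABLESAMPLE BERNOULLI (10) ORDER BY id;
SELECT id FROM foo_demo TABLESAMPLE BERNOULLI (10) ORDER BY id;
```

As noted in message #7, this promise holds only when REPEATABLE is used and the sampling method allows it, so the first pair of queries should not be taken as a guarantee for every tablesample method.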
