Typos/Questions in bloom documentation
http://www.postgresql.org/docs/devel/static/bloom.html
F.4.3 Examples
Claims that the signature length is 80 bits - shouldn't it be 8?
Also, is it OK to link to wikipedia in our documentation? (the link to
bloom filter in the second paragraph)
F.4.4 "Opclass interface"
The "I" should be capitalized in a proper title
F.4.5 Limitation
Should be plural
Other:
The lack of a built-in boolean opclass seems odd. Can that be added easily? If
not, could a user do it themselves without resorting to C code?
Recent post on -performance inspires the last question.
/messages/by-id/CANcrS5pR1P1Tj=e-RQQ=FF3WPAy_fyruS0YJer-+iJHxR1JAiA@mail.gmail.com
David J.
On 2016/04/21 6:51, David G. Johnston wrote:
http://www.postgresql.org/docs/devel/static/bloom.html
F.4.3 Examples
Claims that the signature length is 80 bits - shouldn't it be 8?
In F.4.1. Introduction:
... The user can specify signature length (in uint16, default is 5)
So, it seems right to me.
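For reference, the arithmetic behind the 80-bit figure can be sketched as below. This is only an illustration in Python, assuming (as the quoted sentence says) that length counts uint16 words and defaults to 5; the function name signature_bits is made up here and is not part of the bloom module.

```python
# Hypothetical helper, not part of the bloom extension: converts a
# signature length given in uint16 words into a length in bits.
UINT16_BITS = 16  # one uint16 word holds 16 bits


def signature_bits(length_in_uint16: int = 5) -> int:
    """Return the signature size in bits for a length given in uint16 words."""
    return length_in_uint16 * UINT16_BITS


if __name__ == "__main__":
    # With the documented default of 5 words: 5 * 16 bits
    print(signature_bits())
```

Under that reading, the default gives 5 x 16 = 80 bits, which matches the number in F.4.3.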
Also, is it OK to link to wikipedia in our documentation? (the link to
bloom filter in the second paragraph)
grep wikipedia doc reveals at least some hits:
doc/src/sgml/release.sgml:26
doc/src/sgml/isn.sgml:361
doc/src/sgml/isn.sgml:367
doc/src/sgml/isn.sgml:369
doc/src/sgml/textsearch.sgml:2774
doc/src/sgml/bloom.sgml:21
doc/src/sgml/monitoring.sgml:2728
doc/src/sgml/pgcrypto.sgml:1289
doc/src/sgml/pgcrypto.sgml:1351
And then some:
doc/src/sgml/acronyms.sgml:16
doc/src/sgml/acronyms.sgml:26
doc/src/sgml/acronyms.sgml:35
doc/src/sgml/acronyms.sgml:54
...
F.4.4 "Opclass interface"
The "I" should be capitalized in a proper title
F.4.5 Limitation
Should be plural
Attached is a patch for these fixes.
Thanks,
Amit
Attachments:
bloom-doc-typos.patch (text/x-diff)
diff --git a/doc/src/sgml/bloom.sgml b/doc/src/sgml/bloom.sgml
index 7349095..d0cf317 100644
--- a/doc/src/sgml/bloom.sgml
+++ b/doc/src/sgml/bloom.sgml
@@ -160,7 +160,7 @@ SELECT pg_relation_size('btree_idx');
</sect2>
<sect2>
- <title>Opclass interface</title>
+ <title>Opclass Interface</title>
<para>
The Bloom opclass interface is simple. It requires 1 supporting function:
@@ -178,7 +178,7 @@ DEFAULT FOR TYPE text USING bloom AS
</sect2>
<sect2>
- <title>Limitation</title>
+ <title>Limitations</title>
<para>
<itemizedlist>
On Wednesday, April 20, 2016, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>
wrote:
On 2016/04/21 6:51, David G. Johnston wrote:
http://www.postgresql.org/docs/devel/static/bloom.html
F.4.3 Examples
Claims that the signature length is 80 bits - shouldn't it be 8?
In F.4.1. Introduction:
... The user can specify signature length (in uint16, default is 5)
So, it seems right to me.
Great. Maybe you can consider re-wording it so others can understand. I
have no clue how 80 bits is determined. The phrase you quote is obtuse to
the casual user as well. If that means 16x5=80 irrespective of columns, it
is not clear.
This may be a function of this not being considered user-space code but
something to exercise tests. But if we are going to publish it as an
extension, it seems worthy of helping people decide when and how to use
it. The docs as written fail to do that - and reading the Wikipedia page
doesn't cut it either.
David J.
On 2016/04/21 11:19, David G. Johnston wrote:
On Wednesday, April 20, 2016, Amit Langote wrote:
On 2016/04/21 6:51, David G. Johnston wrote:
http://www.postgresql.org/docs/devel/static/bloom.html
F.4.3 Examples
Claims that the signature length is 80 bits - shouldn't it be 8?
In F.4.1. Introduction:
... The user can specify signature length (in uint16, default is 5)
So, it seems right to me.
Great. Maybe you can consider re-wording it so others can understand. I
have no clue how 80 bits is determined. The phrase you quote is obtuse to
the casual user as well. If that means 16x5=80 irrespective of columns, it
is not clear.
I agree it's unclear. Does the following make it any better (updated
patch attached):
- The user can specify signature length (in uint16, default is 5) and the
- number of bits, which can be set per attribute (1 < colN < 2048).
+ The user can specify signature length in units of 16 bits (default is 5)
+ and the number of bits per indexed attribute.
By the way, now I am slightly confused as well about per-column bits
assignment thing:
In F.4.1. Introduction:
... and the number of bits, which can be set per attribute (1 < colN < 2048).
And then in F.4.2. Parameters:
bloom indexes accept the following parameters in the WITH clause.
length
Length of signature in uint16 type values
col1 — col16
Number of bits for corresponding column
Which is it: col1 - col2048 or col1 - col16? Or are they different things
altogether?
Thanks,
Amit
Attachments:
bloom-doc-typos-reword.patch (text/x-diff)
diff --git a/doc/src/sgml/bloom.sgml b/doc/src/sgml/bloom.sgml
index 7349095..ff0bf76 100644
--- a/doc/src/sgml/bloom.sgml
+++ b/doc/src/sgml/bloom.sgml
@@ -22,8 +22,8 @@
allows fast exclusion of non-candidate tuples via signatures.
Since a signature is a lossy representation of all indexed attributes,
search results must be rechecked using heap information.
- The user can specify signature length (in uint16, default is 5) and the
- number of bits, which can be set per attribute (1 < colN < 2048).
+ The user can specify signature length in units of 16 bits (default is 5)
+ and the number of bits per indexed attribute.
</para>
<para>
@@ -51,7 +51,7 @@
<term><literal>length</></term>
<listitem>
<para>
- Length of signature in uint16 type values
+ Length of signature in units of 16 bits
</para>
</listitem>
</varlistentry>
@@ -160,7 +160,7 @@ SELECT pg_relation_size('btree_idx');
</sect2>
<sect2>
- <title>Opclass interface</title>
+ <title>Opclass Interface</title>
<para>
The Bloom opclass interface is simple. It requires 1 supporting function:
@@ -178,7 +178,7 @@ DEFAULT FOR TYPE text USING bloom AS
</sect2>
<sect2>
- <title>Limitation</title>
+ <title>Limitations</title>
<para>
<itemizedlist>
On Wed, Apr 20, 2016 at 9:18 PM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>
wrote:
On 2016/04/21 11:19, David G. Johnston wrote:
On Wednesday, April 20, 2016, Amit Langote wrote:
On 2016/04/21 6:51, David G. Johnston wrote:
http://www.postgresql.org/docs/devel/static/bloom.html
F.4.3 Examples
Claims that the signature length is 80 bits - shouldn't it be 8?
In F.4.1. Introduction:
... The user can specify signature length (in uint16, default is 5)
So, it seems right to me.
Great. Maybe you can consider re-wording it so others can understand. I
have no clue how 80 bits is determined. The phrase you quote is obtuse to
the casual user as well. If that means 16x5=80 irrespective of columns, it
is not clear.
I agree it's unclear. Does the following make it any better (updated
patch attached):
- The user can specify signature length (in uint16, default is 5) and the
- number of bits, which can be set per attribute (1 < colN < 2048).
+ The user can specify signature length in units of 16 bits (default is 5)
+ and the number of bits per indexed attribute.
Better. The "and" is confusing. Is the signature length the sum of 16x5
+ (bits per indexed attribute)?
By the way, now I am slightly confused as well about per-column bits
assignment thing:
In F.4.1. Introduction:
... and the number of bits, which can be set per attribute (1 < colN < 2048).
And then in F.4.2. Parameters:
bloom indexes accept the following parameters in the WITH clause.
length
Length of signature in uint16 type values
How about: "Number of 16bit units to use for the signature"
col1 — col16
Number of bits for corresponding column
Which is it: col1 - col2048 or col1 - col16? Or are they different things
altogether?
Good question...
David J.
On Fri, Apr 22, 2016 at 1:25 AM, David G. Johnston
<david.g.johnston@gmail.com> wrote:
On Wed, Apr 20, 2016 at 9:18 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
I agree it's unclear. Does the following make it any better (updated
patch attached):
I have sent a patch to rework the docs here:
/messages/by-id/CAB7nPqQB8dcFmY1uodmiJOSZdhBFOx-us-uW6rfYrzhpEiBR2g@mail.gmail.com
This may interest people here.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2016/06/07 14:41, Michael Paquier wrote:
On Fri, Apr 22, 2016 at 1:25 AM, David G. Johnston
<david.g.johnston@gmail.com> wrote:
On Wed, Apr 20, 2016 at 9:18 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
I agree it's unclear. Does the following make it any better (updated
patch attached):
I have sent a patch to rework the docs here:
/messages/by-id/CAB7nPqQB8dcFmY1uodmiJOSZdhBFOx-us-uW6rfYrzhpEiBR2g@mail.gmail.com
This may interest people here.
Thanks, Michael.
Regards,
Amit