Cognitive dissonance
Unix is a text-based operating system with unbelievably helpful text
manipulation tools.
Postgres is a creature of Unix which happens to have unbelievable text
searching and manipulation tools.
Yet, the only one file edition of the Postgres documentation is
in...pdf format. Huh?
I know. I know. I have already brought this up. And various ways of
creating a one file text edition of the documentation have been
proposed to me. I know.
But either I am a visitor from the Crab Nebula, or there is someone
else out there who would like to have a text file of the entire
documentation.
Two examples from other applications.
I use Vim. Vim's documentation is as easy to access as any
documentation on earth...as long as you know exactly what you are
looking for. Otherwise, it is a tremendous pain.
I also use the National Library of Medicine's MeSH subject headings.
25,000 descriptors with definitions, synonyms and a lot of other
things. They give it to you in single files either as text, xml, or
other ways. Big files. Hundreds of megabytes. That makes it so that
you can do just about anything with it you want. It is one of the
seven wonders of the world.
I do suggest that a plain text file of the entire documentation be
made part of the documentation armamentarium.
Respectfully,
John Gage
John Gage wrote:
I also use the National Library of Medicine's MeSH subject headings.
25,000 descriptors with definitions, synonyms and a lot of other things.
They give it to you in single files either as text, xml, or other ways.
Big files. Hundreds of megabytes. That makes it so that you can do just
about anything with it you want. It is one of the seven wonders of the
world.
I do suggest that a plain text file of the entire documentation be made
part of the documentation armamentarium.
From <http://www.postgresql.org/docs/manuals/>:
"The DocBook SGML source for the manuals is available as part of the
PostgreSQL source download available in the FTP area."
--
Lew
On Tue, Jun 8, 2010 at 4:04 AM, John Gage <jsmgage@numericable.fr> wrote:
Unix is a text-based operating system with unbelievably helpful text
manipulation tools.
Postgres is a creature of Unix which happens to have unbelievable text
searching and manipulation tools.
Yet, the only one file edition of the Postgres documentation is in...pdf
format. Huh?
I suppose the next thing you'll be suggesting is that, because
Postgres is a database, the documentation should be stored as some
form of searchable table within the database itself....?
<runs and hides/>
--
Peter Hunsberger
* John Gage (jsmgage@numericable.fr) wrote:
But either I am a visitor from the Crab Nebula, or there is someone else
out there who would like to have a text file of the entire
documentation.
Soo.. there are quite a few man pages, and in-psql's help is also
pretty nice (\h <command> and \?). That's certainly what I typically
use. I admit that we don't include the full command description in the
\h (just the syntax), but that's still extremely useful.
Would a \h+ that gave you the text from the web-page be useful..? That,
plus the various man pages, would cover an awful lot of what's in SGML..
Thanks,
Stephen
On 6/8/2010 9:23 AM, Peter Hunsberger wrote:
On Tue, Jun 8, 2010 at 4:04 AM, John Gage<jsmgage@numericable.fr> wrote:
Unix is a text-based operating system with unbelievably helpful text
manipulation tools.
Postgres is a creature of Unix which happens to have unbelievable text
searching and manipulation tools.
Yet, the only one file edition of the Postgres documentation is in...pdf
format. Huh?
I suppose the next thing you'll be suggesting is that, because
Postgres is a database, the documentation should be stored as some
form of searchable table within the database itself....?
<runs and hides/>
It's also available in CHM (Windows help file) format, which I find a lot
more useful:
http://www.postgresql.org/docs/manuals/
You could print the CHM to a text file.
Also, it is not hard to dump a PDF document into a text file.
All legitimate Magwerks Corporation quotations are sent in a .PDF file attachment with a unique ID number generated by our proprietary quotation system. Quotations received via any other form of communication will not be honored.
CONFIDENTIALITY NOTICE: This e-mail, including attachments, may contain legally privileged, confidential or other information proprietary to Magwerks Corporation and is intended solely for the use of the individual to whom it addresses. If the reader of this e-mail is not the intended recipient or authorized agent, the reader is hereby notified that any unauthorized viewing, dissemination, distribution or copying of this e-mail is strictly prohibited. If you have received this e-mail in error, please notify the sender by replying to this message and destroy all occurrences of this e-mail immediately.
Thank you.
justin@magwerks.com (Justin Graf) writes:
It's also available in CHM (Windows help file) format, which I find a lot
more useful:
http://www.postgresql.org/docs/manuals/
You could print the CHM to a text file.
Also, it is not hard to dump a PDF document into a text file.
I wish I could find a converter that would generate one of the common
eBook formats (epub, mobi).
There do exist CHM readers on mobile platforms such as Android, but
they're much clumsier to work with than the rather more heavily used
eBook readers.
I have poked around for conversions; nothing particularly suitable has
emerged :-(.
--
select 'cbbrowne' || '@' || 'cbbrowne.com';
http://cbbrowne.com/info/internet.html
"MS apparently now has a team dedicated to tracking problems with
Linux and publicizing them. I guess eventually they'll figure out
this back fires... ;)" -- William Burrow <aa126@DELETE.fan.nb.ca>
Thank you all for your suggestions. Thank you very much.
John
1) I suppose the next thing you'll be suggesting is that, because
Postgres is a database, the documentation should be stored as some
form of searchable table within the database itself....?
<runs and hides/>
------Well, that is exactly what I have done with the MeSH subject
headings. And it works like a charm.
2) It's also available in CHM (Windows help file) format, which I find
a lot more useful:
http://www.postgresql.org/docs/manuals/
You could print the CHM to a text file.
------I'll have to boot over to XP, ugh. Will do.
3) Also, it is not hard to dump a PDF document into a text file.
------I would print out what the dump looks like, but this is a family
program.
4) Would a \h+ that gave you the text from the web-page be useful..? That,
plus the various man pages, would cover an awful lot of what's in SGML..
From <http://www.postgresql.org/docs/manuals/>:
"The DocBook SGML source for the manuals is available as part of the
PostgreSQL source download available in the FTP area."
-----I'm headed there. It's just that given the incredibly good
documentation and the fact that it's available in just about every
format except a text file, I was sort of hoping for a policy change on
the part of the powers that be.
***SNIP***
2) It's also available in CHM (Windows help file) format, which I find
a lot more useful:
http://www.postgresql.org/docs/manuals/
You could print the CHM to a text file.
------I'll have to boot over to XP, ugh. Will do.
There are Linux CHM readers:
http://www.linux.com/news/software/applications/8209-chm-viewers-for-linux
and one for Firefox:
https://addons.mozilla.org/en-US/firefox/addon/3235/
Justin Graf wrote:
There are linux chm readers
...
Note that even Microsoft deprecated CHM back in 2003 after it was
realized it was full of potential security exploits that couldn't
readily be abated.
On Tue, Jun 8, 2010 at 5:04 AM, John Gage <jsmgage@numericable.fr> wrote:
I do suggest that a plain text file of the entire documentation be made part
of the documentation armamentarium.
Not that I see a whole lot of utility in this endeavor, but it's
possible to do a decent PDF to plain text conversion. I tried some of
the online tools that do this and found <http://www.pdftextonline.com>
to do the best job with the Postgres manual. I didn't bother trying
any client-side applications which do the same job.
Attached is a short snippet of a text export of the PDF manual.
Josh
Attachments:
snippet_pdf_to_text.txt (text/plain; charset=UTF-8)
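For readers who would rather try a client-side conversion than an online tool, here is a minimal sketch. It assumes Poppler's pdftotext utility is installed and on the PATH; nobody in this thread named that tool, so treat it as one possible choice. The helper names and file names are hypothetical.

```python
import shutil
import subprocess


def pdftotext_command(pdf_path, txt_path, keep_layout=True):
    """Build the argument list for Poppler's pdftotext (assumed installed)."""
    cmd = ["pdftotext"]
    if keep_layout:
        # -layout asks pdftotext to keep the physical page layout as far as
        # it can, which tends to help with tables and syntax synopses.
        cmd.append("-layout")
    cmd += [pdf_path, txt_path]
    return cmd


def pdf_to_text(pdf_path, txt_path, keep_layout=True):
    """Run pdftotext; raise if the tool is missing or exits non-zero."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found on PATH (it ships with Poppler)")
    subprocess.run(pdftotext_command(pdf_path, txt_path, keep_layout), check=True)
    return txt_path
```

A call such as pdf_to_text("postgresql.pdf", "postgresql.txt") (hypothetical file names) would write a single text file. How readable the result is depends on the PDF, as the replies above suggest.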
1) On a list that howls with complaints when posts are in html, it is
surprising that there is resistance to the idea of documentation in
plain text.
2) Posters are correctly referred to the documentation as frequently
as possible. In fact, very frequently. The frequency might decrease
if the documentation were in plain text. It is easier to search a
single plain text file than any other source, except perhaps the
database itself.
3) Postgres is getting pushed off the map at the low end by MySQL, now
owned by Oracle. If Postgres ceased to exist, Ellison would be
thrilled. I chose A2 Hosting (with whom I am very happy) for my
website because they support Postgres. I'm writing CGI scripts in
Perl. I had to install the Postgres driver for DBI. It was not
pre-installed. There are about four buttons for MySQL on the cPanel and
two farther over on the right for Postgres.
An anecdote. I discovered the tsvector functionality a while back. I
have used it to create indices for my text files and several other
tasks. I recently was re-looking at my files and saw
"tsvector::text". I had forgotten that the double colon is one way to
cast a type. Double colon is not in the html index of the
documentation. I found it by searching my plain text version of the
pdf file. In my opinion, the html documentation is useful for reading
it like a novel or referencing it in these lists.
On Jun 8, 2010, at 9:56 PM, Josh Kupershmidt wrote:
Not that I see a whole lot of utility in this endeavor
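The anecdote about finding "::" comes down to a literal substring search over one text file, where punctuation needs no escaping. A minimal sketch of that kind of search follows; the function name and the sample lines are hypothetical, not taken from the manual.

```python
def find_literal(lines, needle, context=0):
    """Return (line_number, excerpt) pairs for lines containing needle.

    The match is a plain substring test, so punctuation such as "::" is
    searched for literally. `context` adds that many neighbouring lines
    on each side of the hit to the excerpt.
    """
    lines = list(lines)
    hits = []
    for i, line in enumerate(lines):
        if needle in line:
            lo = max(0, i - context)
            hi = min(len(lines), i + context + 1)
            excerpt = "\n".join(l.rstrip("\n") for l in lines[lo:hi])
            hits.append((i + 1, excerpt))
    return hits


# Hypothetical sample standing in for a plain-text export of the manual.
sample = [
    "Type casts\n",
    "A cast may be written as CAST(x AS text)\n",
    "or as x::text\n",
    "See also functions\n",
]
print(find_literal(sample, "::"))
```

To search a real file, pass an open file object, for example find_literal(open("postgresql.txt", encoding="utf-8"), "::", context=2), where postgresql.txt is a hypothetical name for the text export.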
John Gage wrote:
Posters are correctly referred to the documentation as frequently as
possible. In fact, very frequently. The frequency might decrease if
the documentation were in plain text. It is easier to search a single
plain text file than any other source, except perhaps the database
itself.
In reality searches are being done on the web, which combines the HTML
version of the official documentation with blog posts, presentation
materials, the wiki, and similar other resources. This is why I don't
actually care about a text version of the docs; I've just gotten used to
using Google to search the PostgreSQL documentation. The occasional
time when I know I just want to search the manual instead, I can search
the PDF version. Neither of those are great solutions, but they're good
enough that it's not worth fighting to build a text version over as I
see it. I'd use it if it were around, but there's little motivation for
most of us to work on it.
Postgres is getting pushed off the map at the low end by MySQL, now
owned by Oracle.
The dynamics are much more complicated than that. Big MySQL sites are
switching to NoSQL; medium sized MySQL sites are switching to PostgreSQL
to get rid of scaling and reliability issues (I personally have been
seeing a lot of this from Rails installs lately); small to medium size
Oracle shops are switching to PostgreSQL to lower licensing costs.
The idea that plain-text documentation for the database would be a
significant driver in any of these trends would be greatly exaggerating
the significance of a technical detail important to a pretty small
number of people. On my personal list of "things that could be improved
in the documentation", good plain text format is there, but there's a
whole lot of things above it.
--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com www.2ndQuadrant.us
On 09/06/2010, John Gage <jsmgage@numericable.fr> wrote:
1) On a list that howls with complaints when posts are in html, it is
surprising that there is resistance to the idea of documentation in
plain text.
2) Posters are correctly referred to the documentation as frequently
as possible. In fact, very frequently. The frequency might decrease
if the documentation were in plain text. It is easier to search a
single plain text file than any other source, except perhaps the
database itself.
3) Postgres is getting pushed off the map at the low end by MySQL, now
owned by Oracle. If Postgres ceased to exist, Ellison would be
thrilled. I chose A2 Hosting (with whom I am very happy) for my
website because they support Postgres. I'm writing CGI scripts in
Perl. I had to install the Postgres driver for DBI. It was not
pre-installed. There are about four buttons for MySQL on the cPanel and
two farther over on the right for Postgres.
An anecdote. I discovered the tsvector functionality a while back. I
have used it to create indices for my text files and several other
tasks. I recently was re-looking at my files and saw
"tsvector::text". I had forgotten that the double colon is one way to
cast a type. Double colon is not in the html index of the
documentation. I found it by searching my plain text version of the
pdf file. In my opinion, the html documentation is useful for reading
it like a novel or referencing it in these lists.
On Jun 8, 2010, at 9:56 PM, Josh Kupershmidt wrote:
Not that I see a whole lot of utility in this endeavor
Personally I like to use html docs, and it would be good if the
documentation were downloadable from the postgresql website in other
formats, for convenience...
But, what I use is this, which works pretty well
(e.g. to get the 8.1 docs):
mkdir postgresql
cd postgresql
wget -r -nH -l 10 -k -np \
  http://www.postgresql.org/docs/8.1/interactive/index.html
... then after it all downloads,
open the file docs/8.1/interactive/index.html
in your web browser, e.g.
links docs/8.1/interactive/index.html
HTML is "text", so you can search using grep, e.g.
grep -r "ALTER TABLE .* ADD COLUMN" docs/8.1
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
--
Brian Modra Land line: +27 23 5411 462
Mobile: +27 79 69 77 082
5 Jan Louw Str, Prince Albert, 6930
Postal: P.O. Box 2, Prince Albert 6930
South Africa
http://www.zwartberg.com/
Fax: +27865510467
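Once the HTML pages have been mirrored as described above, they could be flattened into the single text file the original poster asked for. The following is only a sketch using Python's standard html.parser module. The function names, the set of block-level tags, and the "==== path ====" separator are assumptions made for illustration, and whitespace inside code listings is collapsed, so formatted examples lose their indentation.

```python
from html.parser import HTMLParser
from pathlib import Path

# Tags treated as line breaks; an illustrative choice, not a complete list.
BLOCK_TAGS = {"p", "div", "br", "li", "tr", "pre", "dt", "dd", "title",
              "h1", "h2", "h3", "h4", "h5", "h6"}


class _TextExtractor(HTMLParser):
    """Collects text content, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag in BLOCK_TAGS:
            self._chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            if self._skip_depth > 0:
                self._skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self._chunks.append("\n")

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        return "".join(self._chunks)


def html_to_text(html):
    """Strip tags from one HTML page and return its text, one block per line."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace inside each line and drop empty lines.
    lines = (" ".join(line.split()) for line in parser.text().splitlines())
    return "\n".join(line for line in lines if line)


def build_single_text_file(docs_root, out_path):
    """Concatenate the text of every *.html file under docs_root into out_path."""
    pages = sorted(Path(docs_root).rglob("*.html"))
    with open(out_path, "w", encoding="utf-8") as out:
        for page in pages:
            html = page.read_text(encoding="utf-8", errors="replace")
            out.write(f"==== {page} ====\n")
            out.write(html_to_text(html))
            out.write("\n\n")
    return len(pages)
```

With the mirror from the wget command in place, build_single_text_file("docs/8.1", "postgres-8.1.txt") would produce one searchable file (the output name is hypothetical). Pages are concatenated in sorted path order, which is not necessarily the manual's chapter order.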
Brian Modra wrote:
Personally I like to use html docs, and it would be good if the
documentation were downloadable from the postgresql website in other
formats, for convenience...
But, what I use is this, which works pretty well
(e.g. to get the 8.1 docs):
mkdir postgresql
cd postgresql
wget -r -nH -l 10 -k -np \
  http://www.postgresql.org/docs/8.1/interactive/index.html
... then after it all downloads,
open the file docs/8.1/interactive/index.html
in your web browser, e.g.
links docs/8.1/interactive/index.html
HTML is "text", so you can search using grep, e.g.
grep -r "ALTER TABLE .* ADD COLUMN" docs/8.1
That's the way I do it too. A huge PDF is often not very helpful. In my
personal case I often program on a train, using my laptop. Searching a
PDF with more than 1,000 pages really hits my battery. With HTML files I
can preselect the items to search.
Also it's possible to import the HTML files into a Postgres DB and use
full-text search. ;)
Greetings,
Torsten
--
http://www.dddbl.de - a database layer that abstracts working with 8
different database systems,
separates queries from applications, and can automatically evaluate
query results.
My tupp'th:
Formatted text, whether PDF, HTML or (heaven forbid!) Word Documents,
is easier to read than unformatted plain text, and those of us without
the OP's very admirable proficiency in vi remain at the mercy of the
various readers and their associated search functions.
However, I'm sure that it's not too arduous a task to extract the text
in these documents and strip them of their formatting?
Or am I missing something?
Dave Coventry <dgcoventry@gmail.com> writes:
Formatted text, whether PDF, HTML or (heaven forbid!) Word Documents,
is easier to read than unformatted plain text, and those of us without
the OP's very admirable proficiency in vi remain at the mercy of the
various readers and their associated search functions.
However, I'm sure that it's not too arduous a task to extract the text
in these documents and strip them of their formatting?
Or am I missing something?
Info documentation format. Text based, super user aware, easy to
browse and search, has an index.
You can even produce postgres.info today, it's just not optimised to be
very friendly, it's missing mainly convenient table support and
index.
Regards,
--
dim
Excerpts from John Gage's message of Wed Jun 09 01:28:54 -0400 2010:
I recently was re-looking at my files and saw
"tsvector::text". I had forgotten that the double colon is one way to
cast a type. Double colon is not in the html index of the
documentation.
I just added an index entry for ::, thanks for pointing out that it was
missing.
If you notice other missing index entries, do not hesitate to point them
out on this mailing list or pgsql-docs.
--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Brian Modra wrote:
Personally I like to use html docs, and it would be good if the
documentation were downloadable from the postgresql website in other
formats, for convenience...
Good thing it is, then, albeit not in the most convenient format, i.e.,
DocBook. But then, from there you can generate pretty much any format you
want, right?
--
Lew
On Tue, 2010-06-08 at 11:04 +0200, John Gage wrote:
Yet, the only one file edition of the Postgres documentation is
in...pdf format. Huh?
I know. I know. I have already brought this up. And various ways of
creating a one file text edition of the documentation have been
proposed to me. I know.
But either I am a visitor from the Crab Nebula, or there is someone
else out there who would like to have a text file of the entire
documentation.
As I said back then, doing this is straightforward, but we kind of need
more than one user who asks for it before we make it part of a regular
service, which comes with maintenance costs.
Excerpts from Peter Eisentraut's message of Thu Jun 10 02:50:14 -0400 2010:
On Tue, 2010-06-08 at 11:04 +0200, John Gage wrote:
Yet, the only one file edition of the Postgres documentation is
in...pdf format. Huh?
I know. I know. I have already brought this up. And various ways of
creating a one file text edition of the documentation have been
proposed to me. I know.
But either I am a visitor from the Crab Nebula, or there is someone
else out there who would like to have a text file of the entire
documentation.
As I said back then, doing this is straightforward, but we kind of need
more than one user who asks for it before we make it part of a regular
service, which comes with maintenance costs.
Hey, count me as another interested person in a single-file plain-text
doc output format.
--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support