some points for FAQ

Started by Pavel Stehulealmost 19 years ago13 messageshackers

Jump to latest

Pavel Stehule

pavel.stehule@gmail.com

almost 19 years ago

4.1)

To SELECT a random row, use:
SELECT col
FROM tab
ORDER BY random()
LIMIT 1;

+ On bigger tables this solution is slow. Please, find smarter
solution on network.

4.6)

ILIKE is slow, specially on multibyte encodings. If is possible use
FULLTEXT. LIKE '%some%' is slow always .. thing about FULLTEXT.

4.11.2)

+ Alternatively (on PostgreSQL 8.2.0 and all later releases) you could
RETURNING clause for retrieving used SERIAL value, e.g.,

new_id = execute("SELECT INSERT INTO person(name) VALUES('Blaise
Pascal') RETURNING id");

4.19)

+ most of problems with invalid OIDs in cache are solved in PostgreSQL
8.3. Please remeber, so every replanning of SQL statements needs time.
Write your application, they can exist without cache invalidation.

I am sorry, I am not able create patch via my minimalistic english knowleage.

Regards
Pavel Stehule

p.s. can we create some general F.A.Q XML format and store FAQ there?

WIP Proposal:

Alvaro Herrera

alvherre@2ndquadrant.com

almost 19 years ago

In reply to: Pavel Stehule (#1)

Re: some points for FAQ

Pavel Stehule escribiï¿½:

p.s. can we create some general F.A.Q XML format and store FAQ there?

WIP Proposal:

<faq name = ..... language = >
<entry number="1.1.1">
<query>....</query>
<ansver>
...
we need some tags from html: <p><br><a><i><b><ul><li><table>

There is a DocBook spec for FAQ lists. Actually a friend of mine was
working on converting our FAQ into that kind of XML.

--
Alvaro Herrera http://www.amazon.com/gp/registry/CTMLCN8V17R4
"God is real, unless declared as int"

Pavel Stehule

pavel.stehule@gmail.com

almost 19 years ago

In reply to: Alvaro Herrera (#2)

Re: some points for FAQ

2007/10/9, Alvaro Herrera <alvherre@commandprompt.com>:

Pavel Stehule escribió:

p.s. can we create some general F.A.Q XML format and store FAQ there?

WIP Proposal:

<faq name = ..... language = >
<entry number="1.1.1">
<query>....</query>
<ansver>
...
we need some tags from html: <p><br><a><i><b><ul><li><table>

There is a DocBook spec for FAQ lists. Actually a friend of mine was
working on converting our FAQ into that kind of XML.

I'll look on it

Pavel

Chris Browne

cbbrowne@acm.org

almost 19 years ago

In reply to: Pavel Stehule (#1)

Re: some points for FAQ

alvherre@commandprompt.com (Alvaro Herrera) writes:

Pavel Stehule escribiï¿½:

p.s. can we create some general F.A.Q XML format and store FAQ there?

WIP Proposal:

<faq name = ..... language = >
<entry number="1.1.1">
<query>....</query>
<ansver>
...
we need some tags from html: <p><br><a><i><b><ul><li><table>

There is a DocBook spec for FAQ lists. Actually a friend of mine was
working on converting our FAQ into that kind of XML.

Yup, the structure is known as a <qandaset>

<http://www.docbook.org/tdg/en/html/qandaset.html>

There is an example of this in the Slony-I docs - the admin guide has
a FAQ defined using qandaset and its children.
--
"cbbrowne","@","acm.org"
http://www3.sympatico.ca/cbbrowne/faq.html
All extremists should be taken out and shot.

Bruce Momjian

bruce@momjian.us

almost 19 years ago

In reply to: Pavel Stehule (#1)

Re: some points for FAQ

Pavel Stehule wrote:

4.1)

To SELECT a random row, use:
SELECT col
FROM tab
ORDER BY random()
LIMIT 1;

+ On bigger tables this solution is slow. Please, find smarter
solution on network.

Well, give me a better example that works.

4.6)

ILIKE is slow, specially on multibyte encodings. If is possible use
FULLTEXT. LIKE '%some%' is slow always .. thing about FULLTEXT.

I added a mention of "full text indexing" for word searches.

4.11.2)

+ Alternatively (on PostgreSQL 8.2.0 and all later releases) you could
RETURNING clause for retrieving used SERIAL value, e.g.,

new_id = execute("SELECT INSERT INTO person(name) VALUES('Blaise
Pascal') RETURNING id");

Agreed. I have updated the text to suggest RETURNING be used and
reduced the other examples. The web site should have the updated
content shortly but CVS will have FAQ.html as well soon.

4.19)

+ most of problems with invalid OIDs in cache are solved in PostgreSQL
8.3. Please remeber, so every replanning of SQL statements needs time.
Write your application, they can exist without cache invalidation.

Agreed. Item removed.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

Pavel Stehule

pavel.stehule@gmail.com

almost 19 years ago

In reply to: Bruce Momjian (#5)

Re: some points for FAQ

2007/10/9, Bruce Momjian <bruce@momjian.us>:

Pavel Stehule wrote:

4.1)

To SELECT a random row, use:
SELECT col
FROM tab
ORDER BY random()
LIMIT 1;

+ On bigger tables this solution is slow. Please, find smarter
solution on network.

Well, give me a better example that works.

Better universal solution doesn't exist. Exists only unelegant
solutions - but mutch faster.

SELECT id, ...
FROM data
WHERE id = ANY(ARRAY(
SELECT (random()*:max_id)::int
FROM generate_series(1,20)))
LIMIT 1;

max_id is host variable ~ real max id + some

-- fast solution if id is PK of data

4.19)

+ most of problems with invalid OIDs in cache are solved in PostgreSQL
8.3. Please remeber, so every replanning of SQL statements needs time.
Write your application, they can exist without cache invalidation.

Agreed. Item removed.

Cache invalidation isn't 100% protection before this error message.
With specific using of EXECUTE statement, you can get this message
too. But all temp tables related problems are solved.

Regards
Pavel Stehule

Bruce Momjian

bruce@momjian.us

almost 19 years ago

In reply to: Pavel Stehule (#6)

Re: some points for FAQ

Pavel Stehule wrote:

2007/10/9, Bruce Momjian <bruce@momjian.us>:

Pavel Stehule wrote:

4.1)

To SELECT a random row, use:
SELECT col
FROM tab
ORDER BY random()
LIMIT 1;

+ On bigger tables this solution is slow. Please, find smarter
solution on network.

Well, give me a better example that works.

Better universal solution doesn't exist. Exists only unelegant
solutions - but mutch faster.

SELECT id, ...
FROM data
WHERE id = ANY(ARRAY(
SELECT (random()*:max_id)::int
FROM generate_series(1,20)))
LIMIT 1;

max_id is host variable ~ real max id + some

-- fast solution if id is PK of data

Right. We really only want general solutions in the FAQ.

4.19)

+ most of problems with invalid OIDs in cache are solved in PostgreSQL
8.3. Please remeber, so every replanning of SQL statements needs time.
Write your application, they can exist without cache invalidation.

Agreed. Item removed.

Cache invalidation isn't 100% protection before this error message.
With specific using of EXECUTE statement, you can get this message
too. But all temp tables related problems are solved.

OK, let's see how many bug reports we get and we can always re-add it.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

Pavel Stehule

pavel.stehule@gmail.com

almost 19 years ago

In reply to: Bruce Momjian (#7)

Re: some points for FAQ

2007/10/9, Bruce Momjian <bruce@momjian.us>:

Pavel Stehule wrote:

2007/10/9, Bruce Momjian <bruce@momjian.us>:

Pavel Stehule wrote:

4.1)

To SELECT a random row, use:
SELECT col
FROM tab
ORDER BY random()
LIMIT 1;

+ On bigger tables this solution is slow. Please, find smarter
solution on network.

Well, give me a better example that works.

Better universal solution doesn't exist. Exists only unelegant
solutions - but mutch faster.

SELECT id, ...
FROM data
WHERE id = ANY(ARRAY(
SELECT (random()*:max_id)::int
FROM generate_series(1,20)))
LIMIT 1;

max_id is host variable ~ real max id + some

-- fast solution if id is PK of data

Right. We really only want general solutions in the FAQ.

ok. I accept it. Can be some note there? Not this strange select.

4.19)

+ most of problems with invalid OIDs in cache are solved in PostgreSQL
8.3. Please remeber, so every replanning of SQL statements needs time.
Write your application, they can exist without cache invalidation.

Agreed. Item removed.

Cache invalidation isn't 100% protection before this error message.
With specific using of EXECUTE statement, you can get this message
too. But all temp tables related problems are solved.

OK, let's see how many bug reports we get and we can always re-add it.

It's true :). You have to try really wild things inside plpgsql procedures.

Pavel

Bruce Momjian

bruce@momjian.us

almost 19 years ago

In reply to: Pavel Stehule (#8)

Re: some points for FAQ

Pavel Stehule wrote:

Better universal solution doesn't exist. Exists only unelegant
solutions - but mutch faster.

SELECT id, ...
FROM data
WHERE id = ANY(ARRAY(
SELECT (random()*:max_id)::int
FROM generate_series(1,20)))
LIMIT 1;

max_id is host variable ~ real max id + some

-- fast solution if id is PK of data

Right. We really only want general solutions in the FAQ.

ok. I accept it. Can be some note there? Not this strange select.

Well, with 8.3 having this be faster I am thinking we should wait to see
if the hacks are needed.

Cache invalidation isn't 100% protection before this error message.
With specific using of EXECUTE statement, you can get this message
too. But all temp tables related problems are solved.

OK, let's see how many bug reports we get and we can always re-add it.

It's true :). You have to try really wild things inside plpgsql procedures.

Good.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#10

Pavel Stehule

pavel.stehule@gmail.com

almost 19 years ago

In reply to: Bruce Momjian (#9)

Re: some points for FAQ

ok. I accept it. Can be some note there? Not this strange select.

Well, with 8.3 having this be faster I am thinking we should wait to see
if the hacks are needed.

difference, on 10K lines (on small think table)

postgres=# select * from test where i = any(array(select
(random()*10000)::int from generate_series(1,20))) limit 1;
i | v
-----+-----
869 | 113
(1 row)

Time: 3,984 ms

postgres=# select * from test order by random() limit 1;
i | v
------+-----
3687 | 293
(1 row)

Time: 21,978 ms

8.2
postgres=# select * from test order by random() limit 1;
i | v
------+-----
4821 | 608
(1 row)

Time: 51,299 ms

postgres=# select * from test where i = any(array(select
(random()*10000)::int from generate_series(1,20))) limit 1;
i | v
-----+-----
762 | 254
(1 row)

Time: 4,530 ms

Results:

8.3 "fast solution' is 6x faster
8.2 'fast solution' is 11x faster .. it's minimum.

Pavel

for me, it's one from typical beginers mistakes

#11

Bruce Momjian

bruce@momjian.us

almost 19 years ago

In reply to: Pavel Stehule (#10)

Re: some points for FAQ

Pavel Stehule wrote:

ok. I accept it. Can be some note there? Not this strange select.

Well, with 8.3 having this be faster I am thinking we should wait to see
if the hacks are needed.

difference, on 10K lines (on small think table)

postgres=# select * from test where i = any(array(select
(random()*10000)::int from generate_series(1,20))) limit 1;
i | v
-----+-----
869 | 113
(1 row)

Time: 3,984 ms

postgres=# select * from test order by random() limit 1;
i | v
------+-----
3687 | 293
(1 row)

Time: 21,978 ms

8.2
postgres=# select * from test order by random() limit 1;
i | v
------+-----
4821 | 608
(1 row)

Time: 51,299 ms

postgres=# select * from test where i = any(array(select
(random()*10000)::int from generate_series(1,20))) limit 1;
i | v
-----+-----
762 | 254
(1 row)

Time: 4,530 ms

Results:

8.3 "fast solution' is 6x faster
8.2 'fast solution' is 11x faster .. it's minimum.

OK, how do we even explain this idea in the FAQ. It pulls 20 random
values from 1 to 10000? That seems pretty hard to code to me. Where do
you get the 10000 number from? How do you know you will hit a match in
20 tries?

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#12

Pavel Stehule

pavel.stehule@gmail.com

almost 19 years ago

In reply to: Bruce Momjian (#11)

Re: some points for FAQ

OK, how do we even explain this idea in the FAQ. It pulls 20 random
values from 1 to 10000? That seems pretty hard to code to me. Where do
you get the 10000 number from? How do you know you will hit a match in
20 tries?

Number 10000 you have to store in application .. it's magic constant.
It similar our statistics. And sometimes you have to actualise it.
This is stochastic methods, so it's possible so it doesn't return any
value, and you have to repeat it. Using this method expect knowledge
about generating random numbers. This method is far to ideal, but on
databases with big traffic only this is usable.

Pavel

#13

Bruce Momjian

bruce@momjian.us

almost 19 years ago

In reply to: Pavel Stehule (#12)

Re: some points for FAQ

Pavel Stehule wrote:

OK, how do we even explain this idea in the FAQ. It pulls 20 random
values from 1 to 10000? That seems pretty hard to code to me. Where do
you get the 10000 number from? How do you know you will hit a match in
20 tries?

Number 10000 you have to store in application .. it's magic constant.
It similar our statistics. And sometimes you have to actualise it.
This is stochastic methods, so it's possible so it doesn't return any
value, and you have to repeat it. Using this method expect knowledge
about generating random numbers. This method is far to ideal, but on
databases with big traffic only this is usable.

OK, but this is clearly something I can't just throw into the FAQ and
expect people to figure it out, and going into major detail to explain
it in the FAQ isn't logical either.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +