Regression test fails when BLCKSZ is 1kB
I compiled PostgreSQL with a 1kB block size and the regression tests fail. The main
problem is that the output is correct but in a different order. See attachment.
I think the affected tests should contain an ORDER BY clause.
Any comments?
Zdenek
Attachments:
regression.diffs (text/plain, +46 -46)
Am Montag, 21. April 2008 schrieb Zdenek Kotala:
I compiled PostgreSQL with a 1kB block size and the regression tests fail. The main
problem is that the output is correct but in a different order. See attachment.
This was previously reported:
http://archives.postgresql.org/pgsql-hackers/2006-11/msg00901.php
I think the affected tests should contain an ORDER BY clause.
For previously established reasons, we don't want to add ORDER BY clauses to
every test that might fail under exceptional circumstances so we test all
plan types equally. I think very small block sizes are fairly exceptional,
unless you have a reason up your sleeve why they are a good idea.
On Mon, Apr 21, 2008 at 5:55 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
For previously established reasons, we don't want to add ORDER BY clauses to
every test that might fail under exceptional circumstances so we test all
plan types equally. I think very small block sizes are fairly exceptional,
unless you have a reason up your sleeve why they are a good idea.
Now that we have autovacuum on by default, we might get random
failures because of re-ordering. Though I don't recall anybody
complaining yet, it could just be that we are lucky, or that our regression
suite doesn't have tests that run long enough to give autovacuum a chance
to recycle some of the dead tuples.
Thanks,
Pavan
--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com
"Pavan Deolasee" <pavan.deolasee@gmail.com> writes:
Now that we have autovacuum on by default, we might get random
failures because of re-ordering. Though I don't recall anybody
complaining yet, it could just be that we are lucky, or that our regression
suite doesn't have tests that run long enough to give autovacuum a chance
to recycle some of the dead tuples.
No, the reason you don't see that is that plain VACUUM doesn't move
tuples around.
regards, tom lane
Peter Eisentraut napsal(a):
Am Montag, 21. April 2008 schrieb Zdenek Kotala:
I compiled PostgreSQL with a 1kB block size and the regression tests fail. The main
problem is that the output is correct but in a different order. See attachment.
This was previously reported:
http://archives.postgresql.org/pgsql-hackers/2006-11/msg00901.php
I think the affected tests should contain an ORDER BY clause.
For previously established reasons, we don't want to add ORDER BY clauses to
every test that might fail under exceptional circumstances so we test all
plan types equally. I think very small block sizes are fairly exceptional,
unless you have a reason up your sleeve why they are a good idea.
I'm only testing behavior with different block sizes, and I think it is not a good
idea to support only 8kB for the regression tests. When 4kB is used, PG fails in the join
regression test, and with 16kB and 32kB it fails because:
*** ./expected/bitmapops.out Fri Apr 11 00:25:26 2008
--- ./results/bitmapops.out Mon Apr 21 15:30:18 2008
***************
*** 20,25 ****
--- 20,26 ----
set enable_seqscan=false;
-- Lower work_mem to trigger use of lossy bitmaps
set work_mem = 64;
+ ERROR: 64 is outside the valid range for parameter "work_mem" (256 .. 2097151)
-- Test bitmap-and.
SELECT count(*) FROM bmscantest WHERE a = 1 AND b = 1;
count
Zdenek
Am Montag, 21. April 2008 schrieb Zdenek Kotala:
I'm only testing behavior with different block sizes, and I think it is not a
good idea to support only 8kB for the regression tests. When 4kB is used, PG fails
in the join regression test, and with 16kB and 32kB it fails because:
*** ./expected/bitmapops.out Fri Apr 11 00:25:26 2008
--- ./results/bitmapops.out Mon Apr 21 15:30:18 2008
***************
*** 20,25 ****
--- 20,26 ----
set enable_seqscan=false;
-- Lower work_mem to trigger use of lossy bitmaps
set work_mem = 64;
+ ERROR: 64 is outside the valid range for parameter "work_mem" (256 .. 2097151)
-- Test bitmap-and.
SELECT count(*) FROM bmscantest WHERE a = 1 AND b = 1;
count
This should probably be fixed by using a unit specification on work_mem. Do
you want to prepare a patch?
On Mon, Apr 21, 2008 at 02:25:31PM +0200, Peter Eisentraut wrote:
I think the affected tests should contain an ORDER BY clause.
For previously established reasons, we don't want to add ORDER BY clauses to
every test that might fail under exceptional circumstances so we test all
plan types equally. I think very small block sizes are fairly exceptional,
unless you have a reason up your sleeve why they are a good idea.
I wonder if it would be feasible, whenever a regression test fails,
to sort both files and compare again. This should tell you automatically
whether the differences are *only* rearrangement, without having to
eyeball the output.
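A minimal sketch of that second comparison, not a proposal for pg_regress itself. The function name classify_diff and the three labels are made up for illustration, and the whole file is sorted at once, with no awareness of individual statements:

```python
def classify_diff(expected: str, actual: str) -> str:
    """Classify a test result by comparing the raw and the sorted line lists.

    Hypothetical helper: the name and the labels are illustrative only.
    """
    exp_lines = expected.splitlines()
    act_lines = actual.splitlines()
    if exp_lines == act_lines:
        return "ok"
    # Same multiset of lines, different order: only a rearrangement.
    if sorted(exp_lines) == sorted(act_lines):
        return "order-only"
    return "content-differs"


expected = "1 | 10\n2 | 20\n3 | 30\n"
print(classify_diff(expected, "3 | 30\n1 | 10\n2 | 20\n"))  # rows rearranged
print(classify_diff(expected, "1 | 10\n2 | 20\n3 | 31\n"))  # a value changed
```

The result is only a hint for whoever reads the diff; the test would still be reported as failed.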
Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
Please line up in a tree and maintain the heap invariant while
boarding. Thank you for flying nlogn airlines.
Am Montag, 21. April 2008 schrieb Martijn van Oosterhout:
I wonder if it would be feasible, whenever a regression test fails,
to sort both files and compare again. This should tell you automatically
whether the differences are *only* rearrangement, without having to
eyeball the output.
That sounds like it should be worth a try.
Peter Eisentraut wrote:
Am Montag, 21. April 2008 schrieb Martijn van Oosterhout:
I wonder if it would be feasible, whenever a regression test fails,
to sort both files and compare again. This should tell you automatically
whether the differences are *only* rearrangement, without having to
eyeball the output.
That sounds like it should be worth a try.
I think we need first to identify cases where we don't care that much
about output order. Teaching pg_regress the new check shouldn't be very
hard.
cheers
andrew
Peter Eisentraut <peter_e@gmx.net> writes:
Am Montag, 21. April 2008 schrieb Martijn van Oosterhout:
I wonder if it would be feasible, whenever a regression test fails,
to sort both files and compare again. This should tell you automatically
whether the differences are *only* rearrangement, without having to
eyeball the output.
That sounds like it should be worth a try.
That sounds like a pretty bad idea, since it would treat ordering
differences as insignificant even when they aren't --- for example,
an ordering difference in the output of a query that *has* an
ORDER BY is usually a bug.
regards, tom lane
On Mon, Apr 21, 2008 at 8:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
No, the reason you don't see that is that plain VACUUM doesn't move
tuples around.
I know. But plain VACUUM can free up dead space which can be used for
subsequent updates/inserts and that can cause reordering. For example:
Case 1.
Insert 100 records --- goes into block 1 .. 10
Delete 100 records
Insert 100 more records --- goes into 11 .. 20
Case 2.
Insert 100 records --- goes into block 1 .. 10
Delete 100 records
*Autovacuum triggers*
Insert 100 more records -- goes into block 1 .. 10
Thanks,
Pavan
--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com
On Mon, Apr 21, 2008 at 10:54 PM, Pavan Deolasee
<pavan.deolasee@gmail.com> wrote:
Case 1.
Insert 100 records --- goes into block 1 .. 10
Delete 100 records
Insert 100 more records --- goes into 11 .. 20
Case 2.
Insert 100 records --- goes into block 1 .. 10
Delete 100 records
*Autovacuum triggers*
Insert 100 more records -- goes into block 1 .. 10
It's probably not a very neat example because in this simplistic case
the ordering would still be the same, but we can easily construct a
slightly more complex example to prove the point.
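One such example, as a toy model only. The ToyHeap class below is hypothetical and is not how PostgreSQL's heap or free space map really works: it assumes fixed-size pages, that an insert takes the first free slot found, that a deleted slot stays unusable until vacuum() runs, and that a sequential scan returns tuples in physical order.

```python
FREE, DEAD = object(), object()  # sentinel slot states for the toy model


class ToyHeap:
    """Toy heap: fixed-size pages, inserts take the first FREE slot."""

    def __init__(self, slots_per_page=10):
        self.slots_per_page = slots_per_page
        self.pages = []

    def insert(self, value):
        for page in self.pages:
            for i, slot in enumerate(page):
                if slot is FREE:
                    page[i] = value
                    return
        # No reusable space anywhere: extend the table with a new page.
        page = [FREE] * self.slots_per_page
        page[0] = value
        self.pages.append(page)

    def delete(self, value):
        for page in self.pages:
            for i, slot in enumerate(page):
                if slot == value:
                    page[i] = DEAD  # dead tuple, space not yet reusable
                    return

    def vacuum(self):
        # Plain vacuum in this model: reclaim dead slots, move nothing.
        for page in self.pages:
            for i, slot in enumerate(page):
                if slot is DEAD:
                    page[i] = FREE

    def seq_scan(self):
        return [s for page in self.pages for s in page
                if s is not FREE and s is not DEAD]


def run(with_vacuum):
    heap = ToyHeap()
    for v in range(1, 21):       # fills two pages
        heap.insert(v)
    for v in range(2, 21, 2):    # delete every second tuple
        heap.delete(v)
    if with_vacuum:
        heap.vacuum()            # stands in for "autovacuum triggers"
    for v in range(21, 26):
        heap.insert(v)
    return heap.seq_scan()


print(run(with_vacuum=False))
print(run(with_vacuum=True))
```

In this model both runs return the same set of rows, but the run with the vacuum step interleaves the new rows with the old ones, because the new rows land in the reclaimed holes instead of on a new page.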
Thanks,
Pavan
--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com
Peter Eisentraut <peter_e@gmx.net> writes:
Am Montag, 21. April 2008 schrieb Zdenek Kotala:
set work_mem = 64;
+ ERROR: 64 is outside the valid range for parameter "work_mem" (256 .. 2097151)
-- Test bitmap-and.
This should probably be fixed by using a unit specification on work_mem. Do
you want to prepare a patch?
The problem is that guc.c enforces a lower limit of 8*BLCKSZ on the
work_mem setting. Unless we add an explicit unit specifier for "blocks"
to GUC's vocabulary, there doesn't seem to be any way to name that value
in the SET command. And it's not entirely clear that the SET would
still have the desired effect for this test, anyway, if it were getting
translated to 256K or more.
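For reference, the arithmetic behind that limit, as a small sketch. It assumes only what is stated above (a minimum of 8*BLCKSZ bytes) plus the fact that a bare work_mem number is read as kilobytes; the helper name min_work_mem_kb is made up for illustration and is not a guc.c symbol.

```python
def min_work_mem_kb(blcksz_bytes: int) -> int:
    """Lower limit of work_mem in kB, assuming a minimum of 8 * BLCKSZ bytes."""
    return 8 * blcksz_bytes // 1024


for block_kb in (1, 4, 8, 16, 32):
    minimum = min_work_mem_kb(block_kb * 1024)
    verdict = "accepted" if 64 >= minimum else "rejected"
    print(f"BLCKSZ={block_kb:>2}kB  min work_mem={minimum:>3}kB  "
          f"SET work_mem = 64 is {verdict}")
```

With 32kB blocks the minimum comes out as 256kB, which matches the lower bound in the error message quoted above; with 8kB blocks it is exactly 64kB, which is why the test passes in the default build.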
Another possible answer is to change the minimum to be just 64K always.
I'm not certain that it's really sensible to tie the minimum work_mem to
BLCKSZ --- I don't think we do anything where work_mem is controlling a
pool of page buffers, do we?
regards, tom lane
Am Montag, 21. April 2008 schrieb Tom Lane:
That sounds like a pretty bad idea, since it would treat ordering
differences as insignificant even when they aren't --- for example,
an ordering difference in the output of a query that *has* an
ORDER BY is usually a bug.
Well, we wouldn't treat ordering differences as OK, but we could print
foo ... FAILED (only ordering differences)
which might give a clue.
Then again, the effort to make this bulletproof might be more than continuing
to field the occasional question about the issue.
Peter Eisentraut napsal(a):
Am Montag, 21. April 2008 schrieb Tom Lane:
That sounds like a pretty bad idea, since it would treat ordering
differences as insignificant even when they aren't --- for example,
an ordering difference in the output of a query that *has* an
ORDER BY is usually a bug.
Well, we wouldn't treat ordering differences as OK, but we could print
foo ... FAILED (only ordering differences)
which might give a clue.
If you are able to detect an ordering difference, you are also able to check, without
any extra effort, whether it is important for the test. All we need is a flag on
the test saying that order is not important.
Then again, the effort to make this bulletproof might be more than continuing
to field the occasional question about the issue.
Regression tests MUST BE bulletproof. If you get an error, you must know that it is
really an error (in PostgreSQL or in the regression test) and must be fixed. Once you start
ignoring some errors because they can happen sometimes, you soon fall into a trap.
Zdenek
Andrew Dunstan napsal(a):
Peter Eisentraut wrote:
Am Montag, 21. April 2008 schrieb Martijn van Oosterhout:
I wonder if it would be feasible, whenever a regression test fails,
to sort both files and compare again. This should tell you automatically
whether the differences are *only* rearrangement, without having to
eyeball the output.
That sounds like it should be worth a try.
I think we need first to identify cases where we don't care that much
about output order. Teaching pg_regress the new check shouldn't be very
hard.
It seems to me that only queries with an ORDER BY clause must return rows in a
defined order. Or are there any other cases?
Zdenek
Tom Lane napsal(a):
Another possible answer is to change the minimum to be just 64K always.
I'm not certain that it's really sensible to tie the minimum work_mem to
BLCKSZ --- I don't think we do anything where work_mem is controlling a
pool of page buffers, do we?
Yeah, I tried to find all the uses, and it seems everything is related to tuplestore,
bitmaps, or hash joins. I think we can set a fixed 64K limit without any problem.
By the way, is there any reason to have work_mem * 1024 "everywhere" when we have unit
support in GUC?
Zdenek
On Tue, Apr 22, 2008 at 10:31:53AM +0200, Zdenek Kotala wrote:
If you are able to detect an ordering difference, you are also able to check, without
any extra effort, whether it is important for the test. All we
need is a flag on the test saying that order is not important.
Not true. Sorting the file is going to jumble all the results together.
Since we perform many tests in one file, you're not going to be able to
separate them.
Regression tests MUST BE bulletproof. If you get an error, you must know that
it is really an error (in PostgreSQL or in the regression test) and must be fixed. Once you
start ignoring some errors because they can happen sometimes, you soon fall into a
trap.
I think people are misunderstanding. You posted a bunch of diffs with
the comment that they *appeared* to only be ordering differences. How
carefully did you check? If an 8 became a 9, chances are you'd miss it.
Having a second test checking the sorted results would at least
preclude the chance that there really is something wrong.
It was a guide, not a way of getting out of tests.
Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
Please line up in a tree and maintain the heap invariant while
boarding. Thank you for flying nlogn airlines.
Martijn van Oosterhout napsal(a):
On Tue, Apr 22, 2008 at 10:31:53AM +0200, Zdenek Kotala wrote:
If you are able to detect an ordering difference, you are also able to check, without
any extra effort, whether it is important for the test. All we
need is a flag on the test saying that order is not important.
Not true. Sorting the file is going to jumble all the results together.
Since we perform many tests in one file, you're not going to be able to
separate them.
Each statement's result must be sorted separately; otherwise sorting could hide
problems. For example, one statement returns A instead of B and a second returns B
instead of A. If sort is applied to the whole file, that will be reported as only an
ordering problem.
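A small sketch of that A/B case. It assumes the output has already been split into one list of rows per statement, which pg_regress does not do today; both function names are made up for illustration.

```python
def same_after_whole_file_sort(expected, actual):
    """Flatten all statements' rows, sort once, and compare."""
    flat_expected = sorted(row for result in expected for row in result)
    flat_actual = sorted(row for result in actual for row in result)
    return flat_expected == flat_actual


def same_after_per_statement_sort(expected, actual):
    """Sort and compare each statement's rows on their own."""
    return len(expected) == len(actual) and all(
        sorted(exp) == sorted(act) for exp, act in zip(expected, actual)
    )


# Statement 1 should return A, statement 2 should return B,
# but the server returned them the other way round.
expected = [["A"], ["B"]]
actual = [["B"], ["A"]]

print(same_after_whole_file_sort(expected, actual))     # True: the bug is masked
print(same_after_per_statement_sort(expected, actual))  # False: the bug is caught
```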
Regression tests MUST BE bulletproof. If you get an error, you must know that
it is really an error (in PostgreSQL or in the regression test) and must be fixed. Once you
start ignoring some errors because they can happen sometimes, you soon fall into a
trap.
I think people are misunderstanding. You posted a bunch of diffs with
the comment that they *appeared* to only be ordering differences. How
carefully did you check? If an 8 became a 9, chances are you'd miss it.
Having a second test checking the sorted results would at least
preclude the chance that there really is something wrong.
It was a guide, not a way of getting out of tests.
Have a nice day,
On Tue, Apr 22, 2008 at 4:25 PM, Martijn van Oosterhout <kleptog@svana.org>
wrote:
On Tue, Apr 22, 2008 at 10:31:53AM +0200, Zdenek Kotala wrote:
If you are able to detect an ordering difference, you are also able to check, without
any extra effort, whether it is important for the test. All we
need is a flag on the test saying that order is not important.
Not true. Sorting the file is going to jumble all the results together.
Since we perform many tests in one file, you're not going to be able to
separate them.
Regression tests MUST BE bulletproof. If you get an error, you must know that
it is really an error (in PostgreSQL or in the regression test) and must be fixed. Once you
start ignoring some errors because they can happen sometimes, you soon fall into a
trap.
I think people are misunderstanding. You posted a bunch of diffs with
the comment that they *appeared* to only be ordering differences. How
carefully did you check? If an 8 became a 9, chances are you'd miss it.
Having a second test checking the sorted results would at least
preclude the chance that there really is something wrong.
It was a guide, not a way of getting out of tests.
In the past, I faced this exact problem and tried to work on it. Here's
what I had in mind:
In the .expected file, we would demarcate the section of lines we expect to
come in any order by using two special markers. Then, when comparing the
actual output with the expected output, we would take the demarcated group of
lines and the corresponding lines from the actual output, and compare them
after sorting.
For example:
foo.expected:
select * from tenk where col1 <= 3 limit 3;
col1 | col2 | col3
-------------------------
?unsorted_result_start
1 | 10 | 100
2 | 20 | 200
3 | 30 | 300
?unsorted_result_end
foo.out:
select * from tenk where col1 <= 3 limit 3;
col1 | col2 | col3
-------------------------
3 | 30 | 300
2 | 20 | 200
1 | 10 | 100
So, the diff program should discard the lines beginning with '?' (meta
character), and then sort and match exactly the same number of lines.
There's another option of putting these '?' lines in a separate file
with corresponding begin/end line numbers of the unsorted group, and using
this as a parameter to the diffing program.
Of course, this needs a change in the (standard) diff that we use from
pg_regress!
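A rough sketch of such a comparison, using the two marker strings from the example above. Everything else is an assumption made for illustration: the function name matches_expected, exact string matching of lines, and a plain true/false result instead of a readable diff.

```python
START = "?unsorted_result_start"
END = "?unsorted_result_end"


def matches_expected(expected_lines, actual_lines):
    """Compare line lists, ignoring row order inside marked groups."""
    pos = 0  # current position in actual_lines
    i = 0    # current position in expected_lines
    while i < len(expected_lines):
        line = expected_lines[i]
        if line == START:
            # Raises ValueError if the closing marker is missing.
            end = expected_lines.index(END, i + 1)
            block = expected_lines[i + 1:end]
            got = actual_lines[pos:pos + len(block)]
            # Same number of lines, same content, any order.
            if sorted(block) != sorted(got):
                return False
            pos += len(block)
            i = end + 1
        else:
            # Outside a marked group, lines must match in place.
            if pos >= len(actual_lines) or actual_lines[pos] != line:
                return False
            pos += 1
            i += 1
    return pos == len(actual_lines)


header = [
    "select * from tenk where col1 <= 3 limit 3;",
    "col1 | col2 | col3",
    "-------------------------",
]
rows = ["1 | 10 | 100", "2 | 20 | 200", "3 | 30 | 300"]

expected = header + [START] + rows + [END]
actual = header + list(reversed(rows))

print(matches_expected(expected, actual))
```

Because the marker lines are consumed rather than compared, the actual output needs no changes; only the .expected file and the comparison step would be affected.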
Best regards,
--
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | indiatimes | yahoo }.com
EnterpriseDB http://www.enterprisedb.com
Mail sent from my BlackLaptop device