Minmax indexes
Hi,
Here's a reviewable version of what I've dubbed Minmax indexes. Some
people said they would like to use some other name for this feature, but
I have yet to hear usable ideas, so for now I will keep calling them
that. I'm open to proposals, but if you pick something that cannot
be abbreviated "mm" I might have you prepare a rebased version which
renames the files and structs.
The implementation here has been simplified from what I originally
proposed at 20130614222805.GZ5491@eldon.alvh.no-ip.org -- in particular,
I noticed that there's no need to involve aggregate functions at all; we
can just use inequality operators. So the pg_amproc entries are gone;
only the pg_amop entries are necessary.
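To illustrate the "inequality operators only" point, here is a hypothetical Python sketch (invented names, not the patch's C code): each page range stores just a (min, max) pair, and deciding which ranges might contain matches needs nothing beyond ordinary comparisons, which is why pg_amop entries suffice.

```python
# Hypothetical sketch of the minmax design: one (min, max) summary per
# page range, consulted with plain comparison operators only -- no
# aggregate functions are needed at scan time.

PAGES_PER_RANGE = 2

def summarize(table, pages_per_range=PAGES_PER_RANGE):
    """Build one (min, max) summary per page range; 'table' is a list of
    pages, each page a list of values."""
    summaries = []
    for i in range(0, len(table), pages_per_range):
        values = [v for page in table[i:i + pages_per_range] for v in page]
        summaries.append((min(values), max(values)))
    return summaries

def candidate_ranges(summaries, lo, hi):
    """Page ranges that may contain values in [lo, hi]; only <= and >=
    comparisons are used, mirroring the pg_amop-only catalog entries."""
    return [i for i, (mn, mx) in enumerate(summaries) if mn <= hi and mx >= lo]

table = [[1, 3], [2, 4], [10, 12], [11, 13], [20, 22], [21, 23]]
summaries = summarize(table)                # [(1, 4), (10, 13), (20, 23)]
print(candidate_ranges(summaries, 11, 12))  # only the middle range qualifies
```

Note that min()/max() appear only while building the summary, never while scanning with it.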
I've somewhat punted on the question of doing resummarization separately
from vacuuming. Right now, resummarization (as well as other necessary
index cleanup) takes place in amvacuumcleanup. This is not optimal; I
have stated elsewhere that I'd like to create separate maintenance
actions that can be carried out by autovacuum. That would be useful
both for Minmax indexes and GIN indexes (pending insertion list); maybe
others. That's not part of this patch, however.
The design of this stuff is in the file "minmax-proposal" at the top of
the tree. That file is up to date, though it still contains some open
questions that were present in the original proposal. (I have not fixed
some bogosities pointed out by Noah, for instance. I will do that
shortly.) In a final version, that file would be applied as
src/backend/access/minmax/README, most likely.
One area on which I needed to modify core code is IndexBuildHeapScan. I
needed a version that was able to scan only a certain range of pages,
not the entire table, so I introduced a new IndexBuildHeapRangeScan, and
added a quick "heap_scansetlimits" function. I haven't tested that this
works outside of the HeapRangeScan thingy, so it's probably completely
bogus; I'm open to suggestions if people think this should be
implemented differently. In any case, keeping that implementation
together with vanilla IndexBuildHeapScan makes a lot of sense.
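For illustration, a hypothetical Python sketch (invented names, not the actual heap_scansetlimits API) of why a block-range-limited scan matters: resummarizing one page range should read only that range's blocks, not the whole table.

```python
# Hypothetical sketch of a block-range-limited heap scan: rebuilding the
# summary for one page range touches only its own blocks.

def scan_block_range(table, start_block, nblocks):
    """Yield values only from blocks [start_block, start_block + nblocks),
    loosely analogous to heap_scansetlimits() restricting the scan."""
    for page in table[start_block:start_block + nblocks]:
        yield from page

def resummarize_range(table, range_no, pages_per_range):
    """Recompute (min, max) for a single page range."""
    values = list(scan_block_range(table, range_no * pages_per_range,
                                   pages_per_range))
    return (min(values), max(values)) if values else None

table = [[5, 9], [7, 2], [40, 41], [44, 39]]
print(resummarize_range(table, 1, 2))  # (39, 44); blocks 0-1 are never read
```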
One thing still to tackle is when to mark ranges as unsummarized. Right
now, any new tuple on a page range would cause a new index entry to be
created and a new revmap update. This would cause huge index bloat if,
say, a page is emptied and vacuumed and filled with new tuples with
increasing values outside the original range; each new tuple would
create a new index tuple. I have two ideas about this (1. mark range as
unsummarized if 3rd time we touch the same page range; 2. vacuum the
affected index page if it's full, so we can maintain the index always up
to date without causing undue bloat), but I haven't implemented
anything yet.
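Idea 1 above can be sketched as follows (hypothetical Python with invented names; the real patch would express this via index tuples and revmap updates): widen the summary on the first couple of out-of-range inserts, then mark the range unsummarized on the third touch so further inserts stop generating index churn until vacuum resummarizes.

```python
# Hypothetical sketch of idea 1: widen a range's summary on out-of-range
# inserts, but after the 3rd touch of the same page range mark it
# unsummarized so repeated inserts stop creating new index entries.

UNSUMMARIZE_AFTER = 3

class MinmaxSummary:
    def __init__(self, mn, mx):
        self.mn, self.mx, self.touches = mn, mx, 0
        self.summarized = True

    def note_insert(self, value):
        if not self.summarized:
            return                      # range already awaits resummarization
        if self.mn <= value <= self.mx:
            return                      # summary still valid, nothing to do
        self.touches += 1
        if self.touches >= UNSUMMARIZE_AFTER:
            self.summarized = False     # let vacuum rebuild it later
        else:
            self.mn = min(self.mn, value)   # widen in place (a new index
            self.mx = max(self.mx, value)   # tuple + revmap update in the patch)

s = MinmaxSummary(10, 20)
for v in (25, 30, 35, 40):
    s.note_insert(v)
print(s.summarized, s.mn, s.mx)  # False 10 30: the 3rd touch unsummarized it
```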
The "amcostestimate" routine is completely bogus; right now it returns
constant 0, meaning the index is always chosen if it exists.
There are opclasses for int4, numeric and text. The latter doesn't work
at all, because collation info is not passed down at all. I will have
to figure that out (even if I find it unlikely that minmax indexes have any
usefulness on top of text columns). I admit that numeric hasn't been
tested, and it's quite likely that it won't work, mainly because of
lack of some datumCopy() calls, about which the code contains some
/* XXX */ lines. I think this should be relatively straightforward.
Ideally, the final version of this patch would contain opclasses for all
supported datatypes (i.e. the same that have got btree opclasses).
I have messed up the opclass information, as evidenced by failures in
opr_sanity regression test. I will research that later.
There's working contrib/pageinspect support; pg_xlogdump (and wal_debug)
seems to work sanely too.
This patch compiles cleanly under -Werror.
The research leading to these results has received funding from the
European Union's Seventh Framework Programme (FP7/2007-2013) under
grant agreement n° 318633
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachments:
minmax-1.patch (text/x-diff, +3649/-19)
On Sat, 2013-09-14 at 21:14 -0300, Alvaro Herrera wrote:
Here's a reviewable version of what I've dubbed Minmax indexes.
Please fix duplicate OID 3177.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 15 September 2013 01:14, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Hi,
Here's a reviewable version of what I've dubbed Minmax indexes.
[...]
Thanks for the patch, but I seem to have immediately hit a snag:
pgbench=# CREATE INDEX minmaxtest ON pgbench_accounts USING minmax (aid);
PANIC: invalid xlog record length 0
--
Thom
On 15.09.2013 03:14, Alvaro Herrera wrote:
+ Partial indexes are not supported; since an index is concerned with minimum and
+ maximum values of the involved columns across all the pages in the table, it
+ doesn't make sense to exclude values. Another way to see "partial" indexes
+ here would be those that only considered some pages in the table instead of all
+ of them; but this would be difficult to implement and manage and, most likely,
+ pointless.
Something like this seems completely sensible to me:
create index i_accounts on accounts using minmax (ts) where valid = true;
The situation where that would be useful is if 'valid' accounts are
fairly well clustered, but invalid ones are scattered all over the
table. The minimum and maximum stored in the index would only concern
valid accounts.
- Heikki
On 16 September 2013 at 11:03 Heikki Linnakangas <hlinnakangas@vmware.com>
wrote:
Something like this seems completely sensible to me:
create index i_accounts on accounts using minmax (ts) where valid = true;
The situation where that would be useful is if 'valid' accounts are
fairly well clustered, but invalid ones are scattered all over the
table. The minimum and maximum stored in the index would only concern
valid accounts.
Here's one that occurs to me:
CREATE INDEX i_billing_id_mm ON billing(id) WHERE paid_in_full IS NOT TRUE;
Note that this would be a frequently moving target and over years of billing,
the subset would be quite small compared to the full system (imagine, say, 50k
rows out of 20M).
Best Wishes,
Chris Travers
On 2013-09-16 11:19:19 +0100, Chris Travers wrote:
On 16 September 2013 at 11:03 Heikki Linnakangas <hlinnakangas@vmware.com>
wrote:
Something like this seems completely sensible to me:
create index i_accounts on accounts using minmax (ts) where valid = true;
The situation where that would be useful is if 'valid' accounts are
fairly well clustered, but invalid ones are scattered all over the
table. The minimum and maximum stored in the index would only concern
valid accounts.
Yes, I wondered the same myself.
Here's one that occurs to me:
CREATE INDEX i_billing_id_mm ON billing(id) WHERE paid_in_full IS NOT TRUE;
Note that this would be a frequently moving target and over years of billing,
the subset would be quite small compared to the full system (imagine, say, 50k
rows out of 20M).
In that case you'd just use a normal btree index, no?
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Sep 16, 2013 at 3:47 AM, Thom Brown <thom@linux.com> wrote:
On 15 September 2013 01:14, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Hi,
Here's a reviewable version of what I've dubbed Minmax indexes.
Thanks for the patch, but I seem to have immediately hit a snag:
pgbench=# CREATE INDEX minmaxtest ON pgbench_accounts USING minmax (aid);
PANIC: invalid xlog record length 0
fwiw, this seems to be triggered by ANALYZE.
At least I can trigger it by executing ANALYZE on the table (attached
is a stacktrace of a backend exhibiting the failure)
Another thing is these messages I got when compiling:
"""
mmxlog.c: In function ‘minmax_xlog_revmap_set’:
mmxlog.c:161:14: warning: unused variable ‘blkno’ [-Wunused-variable]
bufpage.c: In function ‘PageIndexDeleteNoCompact’:
bufpage.c:1066:18: warning: ‘lastused’ may be used uninitialized in
this function [-Wmaybe-uninitialized]
"""
--
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación
Phone: +593 4 5107566 Cell: +593 987171157
Attachments:
stacktrace.txt (text/plain)
On 17 September 2013 07:20, Jaime Casanova <jaime@2ndquadrant.com> wrote:
On Mon, Sep 16, 2013 at 3:47 AM, Thom Brown <thom@linux.com> wrote:
On 15 September 2013 01:14, Alvaro Herrera <alvherre@2ndquadrant.com>
wrote:
Hi,
Here's a reviewable version of what I've dubbed Minmax indexes.
Thanks for the patch, but I seem to have immediately hit a snag:
pgbench=# CREATE INDEX minmaxtest ON pgbench_accounts USING minmax (aid);
PANIC: invalid xlog record length 0
fwiw, this seems to be triggered by ANALYZE.
At least i can trigger it by executing ANALYZE on the table (attached
is a stacktrace of a backend exhibiting the failure)
Another thing is these messages I got when compiling:
"""
mmxlog.c: In function ‘minmax_xlog_revmap_set’:
mmxlog.c:161:14: warning: unused variable ‘blkno’ [-Wunused-variable]
bufpage.c: In function ‘PageIndexDeleteNoCompact’:
bufpage.c:1066:18: warning: ‘lastused’ may be used uninitialized in
this function [-Wmaybe-uninitialized]
"""
I'm able to run ANALYSE manually without it dying:
pgbench=# analyse pgbench_accounts;
ANALYZE
pgbench=# analyse pgbench_accounts;
ANALYZE
pgbench=# create index minmaxtest on pgbench_accounts using minmax (aid);
PANIC: invalid xlog record length 0
--
Thom
On Tue, Sep 17, 2013 at 3:30 AM, Thom Brown <thom@linux.com> wrote:
On 17 September 2013 07:20, Jaime Casanova <jaime@2ndquadrant.com> wrote:
On Mon, Sep 16, 2013 at 3:47 AM, Thom Brown <thom@linux.com> wrote:
On 15 September 2013 01:14, Alvaro Herrera <alvherre@2ndquadrant.com>
wrote:
Hi,
Here's a reviewable version of what I've dubbed Minmax indexes.
Thanks for the patch, but I seem to have immediately hit a snag:
pgbench=# CREATE INDEX minmaxtest ON pgbench_accounts USING minmax
(aid);
PANIC: invalid xlog record length 0
fwiw, this seems to be triggered by ANALYZE.
At least i can trigger it by executing ANALYZE on the table (attached
is a stacktrace of a backend exhibiting the failure)
I'm able to run ANALYSE manually without it dying:
try inserting some data before the ANALYZE, that will force a
resummarization which is mentioned in the stack trace of the failure
--
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación
Phone: +593 4 5107566 Cell: +593 987171157
On 17 September 2013 14:37, Jaime Casanova <jaime@2ndquadrant.com> wrote:
On Tue, Sep 17, 2013 at 3:30 AM, Thom Brown <thom@linux.com> wrote:
On 17 September 2013 07:20, Jaime Casanova <jaime@2ndquadrant.com>
wrote:
On Mon, Sep 16, 2013 at 3:47 AM, Thom Brown <thom@linux.com> wrote:
On 15 September 2013 01:14, Alvaro Herrera <alvherre@2ndquadrant.com>
wrote:
Hi,
Here's a reviewable version of what I've dubbed Minmax indexes.
Thanks for the patch, but I seem to have immediately hit a snag:
pgbench=# CREATE INDEX minmaxtest ON pgbench_accounts USING minmax
(aid);
PANIC: invalid xlog record length 0
fwiw, this seems to be triggered by ANALYZE.
At least i can trigger it by executing ANALYZE on the table (attached
is a stacktrace of a backend exhibiting the failure)
I'm able to run ANALYSE manually without it dying:
try inserting some data before the ANALYZE, that will force a
resummarization which is mentioned in the stack trace of the failure
I've tried inserting 1 row then ANALYSE and 10,000 rows then ANALYSE, and
in both cases there's no error. But then trying to create the index again
results in my original error.
--
Thom
On Tue, Sep 17, 2013 at 8:43 AM, Thom Brown <thom@linux.com> wrote:
On 17 September 2013 14:37, Jaime Casanova <jaime@2ndquadrant.com> wrote:
On Tue, Sep 17, 2013 at 3:30 AM, Thom Brown <thom@linux.com> wrote:
On 17 September 2013 07:20, Jaime Casanova <jaime@2ndquadrant.com>
wrote:
On Mon, Sep 16, 2013 at 3:47 AM, Thom Brown <thom@linux.com> wrote:
On 15 September 2013 01:14, Alvaro Herrera <alvherre@2ndquadrant.com>
wrote:
Hi,
Here's a reviewable version of what I've dubbed Minmax indexes.
Thanks for the patch, but I seem to have immediately hit a snag:
pgbench=# CREATE INDEX minmaxtest ON pgbench_accounts USING minmax
(aid);
PANIC: invalid xlog record length 0
fwiw, this seems to be triggered by ANALYZE.
At least i can trigger it by executing ANALYZE on the table (attached
is a stacktrace of a backend exhibiting the failure)
I'm able to run ANALYSE manually without it dying:
try inserting some data before the ANALYZE, that will force a
resumarization which is mentioned in the stack trace of the failureI've tried inserting 1 row then ANALYSE and 10,000 rows then ANALYSE, and in
both cases there's no error. But then trying to create the index again
results in my original error.
Ok
So, please confirm if this is the pattern you are following:
CREATE TABLE t1(i int);
INSERT INTO t1 SELECT generate_series(1, 10000);
CREATE INDEX idx1 ON t1 USING minmax (i);
if that, then the attached stack trace (index_failure_thom.txt) should
correspond to the failure you are seeing.
My test was slightly different:
CREATE TABLE t1(i int);
CREATE INDEX idx1 ON t1 USING minmax (i);
INSERT INTO t1 SELECT generate_series(1, 10000);
ANALYZE t1;
and the failure happened at a different time, in resummarization
(attached index_failure_jcm.txt)
but in the end, both failures seem to happen for the same reason: a
record of length 0... at XLogInsert time
#4 XLogInsert at xlog.c:966
#5 mmSetHeapBlockItemptr at mmrevmap.c:169
#6 mm_doinsert at minmax.c:1410
actually, if you create a temp table, both tests work fine
--
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación
Phone: +593 4 5107566 Cell: +593 987171157
Thom Brown wrote:
Thanks for testing.
Thanks for the patch, but I seem to have immediately hit a snag:
pgbench=# CREATE INDEX minmaxtest ON pgbench_accounts USING minmax (aid);
PANIC: invalid xlog record length 0
Silly mistake I had already made in another patch. Here's an
incremental patch which fixes this bug. Apply this on top of previous
minmax-1.patch.
I also renumbered the duplicate OID pointed out by Peter, and fixed the
two compiler warnings reported by Jaime.
Note you'll need to re-initdb in order to get the right catalog entries.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachments:
minmax-2-incr.patch (text/x-diff, +42/-37)
On 17 September 2013 22:03, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Thom Brown wrote:
Thanks for testing.
Thanks for the patch, but I seem to have immediately hit a snag:
pgbench=# CREATE INDEX minmaxtest ON pgbench_accounts USING minmax (aid);
PANIC: invalid xlog record length 0
Silly mistake I had already made in another patch. Here's an
incremental patch which fixes this bug. Apply this on top of previous
minmax-1.patch.
Thanks.
Hit another issue with exactly the same procedure:
pgbench=# create index minmaxtest on pgbench_accounts using minmax (aid);
ERROR: lock 176475 is not held
--
Thom
On Tue, September 17, 2013 23:03, Alvaro Herrera wrote:
[minmax-1.patch. + minmax-2-incr.patch. (and initdb)]
The patches apply and compile OK.
I've not yet really tested; I just wanted to mention that make check gives the following differences:
*** /home/aardvark/pg_stuff/pg_sandbox/pgsql.minmax/src/test/regress/expected/opr_sanity.out 2013-09-17 23:18:31.427356703
+0200
--- /home/aardvark/pg_stuff/pg_sandbox/pgsql.minmax/src/test/regress/results/opr_sanity.out 2013-09-17 23:20:48.208150824
+0200
***************
*** 1076,1081 ****
--- 1076,1086 ----
2742 | 2 | @@@
2742 | 3 | <@
2742 | 4 | =
+ 3847 | 1 | <
+ 3847 | 2 | <=
+ 3847 | 3 | =
+ 3847 | 4 | >=
+ 3847 | 5 | >
4000 | 1 | <<
4000 | 1 | ~<~
4000 | 2 | &<
***************
*** 1098,1104 ****
4000 | 15 | >
4000 | 16 | @>
4000 | 18 | =
! (62 rows)
-- Check that all opclass search operators have selectivity estimators.
-- This is not absolutely required, but it seems a reasonable thing
--- 1103,1109 ----
4000 | 15 | >
4000 | 16 | @>
4000 | 18 | =
! (67 rows)
-- Check that all opclass search operators have selectivity estimators.
-- This is not absolutely required, but it seems a reasonable thing
***************
*** 1272,1280 ****
WHERE am.amname <> 'btree' AND am.amname <> 'gist' AND am.amname <> 'gin'
GROUP BY amname, amsupport, opcname, amprocfamily
HAVING count(*) != amsupport OR amprocfamily IS NULL;
! amname | opcname | count
! --------+---------+-------
! (0 rows)
SELECT amname, opcname, count(*)
FROM pg_am am JOIN pg_opclass op ON opcmethod = am.oid
--- 1277,1288 ----
WHERE am.amname <> 'btree' AND am.amname <> 'gist' AND am.amname <> 'gin'
GROUP BY amname, amsupport, opcname, amprocfamily
HAVING count(*) != amsupport OR amprocfamily IS NULL;
! amname | opcname | count
! --------+-------------+-------
! minmax | int4_ops | 1
! minmax | text_ops | 1
! minmax | numeric_ops | 1
! (3 rows)
SELECT amname, opcname, count(*)
FROM pg_am am JOIN pg_opclass op ON opcmethod = am.oid
======================================================================
Erik Rijkers
Thom Brown wrote:
Hit another issue with exactly the same procedure:
pgbench=# create index minmaxtest on pgbench_accounts using minmax (aid);
ERROR: lock 176475 is not held
That's what I get for restructuring the way buffers are acquired to use
the FSM, and then neglecting to test creation on decently-sized indexes.
Fix attached.
I just realized that xlog replay is also broken.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachments:
minmax-3-incr.patch (text/x-diff, +2/-4)
Erik Rijkers wrote:
On Tue, September 17, 2013 23:03, Alvaro Herrera wrote:
[minmax-1.patch. + minmax-2-incr.patch. (and initdb)]
The patches apply and compile OK.
I've not yet really tested; I just wanted to mention that make check gives the following differences:
Oops, I forgot to update the expected file. I had to comment on this
when submitting minmax-2-incr.patch and forgot. First, those extra five
operators are supposed to be there; expected file needs an update. As
for this:
--- 1277,1288 ----
WHERE am.amname <> 'btree' AND am.amname <> 'gist' AND am.amname <> 'gin'
GROUP BY amname, amsupport, opcname, amprocfamily
HAVING count(*) != amsupport OR amprocfamily IS NULL;
! amname | opcname | count
! --------+-------------+-------
! minmax | int4_ops | 1
! minmax | text_ops | 1
! minmax | numeric_ops | 1
! (3 rows)
I think the problem is that the query is wrong. This is the complete query:
SELECT amname, opcname, count(*)
FROM pg_am am JOIN pg_opclass op ON opcmethod = am.oid
LEFT JOIN pg_amproc p ON amprocfamily = opcfamily AND
amproclefttype = amprocrighttype AND amproclefttype = opcintype
WHERE am.amname <> 'btree' AND am.amname <> 'gist' AND am.amname <> 'gin'
GROUP BY amname, amsupport, opcname, amprocfamily
HAVING count(*) != amsupport OR amprocfamily IS NULL;
It should be, instead, this:
SELECT amname, opcname, count(*)
FROM pg_am am JOIN pg_opclass op ON opcmethod = am.oid
LEFT JOIN pg_amproc p ON amprocfamily = opcfamily AND
amproclefttype = amprocrighttype AND amproclefttype = opcintype
WHERE am.amname <> 'btree' AND am.amname <> 'gist' AND am.amname <> 'gin'
GROUP BY amname, amsupport, opcname, amprocfamily
HAVING count(*) != amsupport AND (amprocfamily IS NOT NULL);
This query is supposed to check that there are no opclasses with
mismatching number of support procedures; but if the left join returns a
null-extended row for pg_amproc, that means there is no support proc,
yet count(*) will return 1. So count(*) will not match amsupport, and
the row is supposed to be excluded by the amprocfamily IS NULL clause in
HAVING.
Both queries return empty in HEAD, but only the second one correctly
returns empty with the patch applied.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Sep 17, 2013 at 4:03 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
Thom Brown wrote:
Thanks for testing.
Thanks for the patch, but I seem to have immediately hit a snag:
pgbench=# CREATE INDEX minmaxtest ON pgbench_accounts USING minmax (aid);
PANIC: invalid xlog record length 0
Silly mistake I had already made in another patch. Here's an
incremental patch which fixes this bug. Apply this on top of previous
minmax-1.patch.
I also renumbered the duplicate OID pointed out by Peter, and fixed the
two compiler warnings reported by Jaime.
Note you'll need to re-initdb in order to get the right catalog entries.
Hi,
Found another problem with these steps:
create table t1 (i int);
create index idx_t1_i on t1 using minmax(i);
insert into t1 select generate_series(1, 2000000);
ERROR: could not read block 1 in file "base/12645/16397_vm": read
only 0 of 8192 bytes
STATEMENT: insert into t1 select generate_series(1, 2000000);
ERROR: could not read block 1 in file "base/12645/16397_vm": read
only 0 of 8192 bytes
After that, i keep receiving these messages (when autovacuum tries to
vacuum this table):
ERROR: could not truncate file "base/12645/16397_vm" to 2 blocks:
it's only 1 blocks now
CONTEXT: automatic vacuum of table "postgres.public.t1"
ERROR: could not truncate file "base/12645/16397_vm" to 2 blocks:
it's only 1 blocks now
CONTEXT: automatic vacuum of table "postgres.public.t1"
--
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación
Phone: +593 4 5107566 Cell: +593 987171157
Jaime Casanova wrote:
Found another problem with these steps:
create table t1 (i int);
create index idx_t1_i on t1 using minmax(i);
insert into t1 select generate_series(1, 2000000);
ERROR: could not read block 1 in file "base/12645/16397_vm": read
only 0 of 8192 bytes
Thanks. This was a trivial off-by-one bug; fixed in the attached patch.
While studying it, I noticed that I was also failing to notice extension
of the fork by another process. I have tried to fix that also in the
current patch, but I'm afraid that a fully robust solution for this will
involve having a cached fork size in the index's relcache entry -- just
like we have smgr_vm_nblocks. In fact, since the revmap fork is
currently reusing the VM forknum, I might even be able to use the same
variable to keep track of the fork size. But I don't really like this
bit of reusing the VM forknum for revmap, so I've refrained from
extending that assumption into further code for the time being.
There was also a bug that we would try to initialize a revmap page twice
during recovery, if two backends thought they needed to extend it; that
would cause the data written by the first extender to be lost.
This patch applies on top of the two previous incremental patches. I
will send a full patch later, including all those fixes and the fix for
the opr_sanity regression test.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachments:
minmax-4-incr.patch (text/x-diff, +57/-50)
On Wed, September 25, 2013 00:14, Alvaro Herrera wrote:
[minmax-4-incr.patch]
After a --data-checksums initdb (successful), the following error came up:
after the statement: create index t_minmax_idx on t using minmax (r);
WARNING: page verification failed, calculated checksum 25951 but expected 0
ERROR: invalid page in block 1 of relation base/21324/26267_vm
It happens reliably, every time I run the program.
Below is the whole program that I used.
Thanks,
Erik Rijkers
#!/bin/bash
t=t
if [[ 1 -eq 1 ]]; then
echo "
drop table if exists $t ;
create table $t
as
select i, cast( random() * 10^9 as integer ) as r
from generate_series(1, 1000000) as f(i) ;
analyze $t;
table $t limit 5;
select count(*) from $t;
explain analyze select min(r), max(r) from $t;
select min(r), max(r) from $t;
create index ${t}_minmax_idx on $t using minmax (r);
analyze $t;
explain analyze select min(r), max(r) from $t;
select min(r), max(r) from $t;
" | psql
fi
On Sun, Sep 15, 2013 at 5:44 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
Hi,
[...]
One thing still to tackle is when to mark ranges as unsummarized. Right
now, any new tuple on a page range would cause a new index entry to be
created and a new revmap update. This would cause huge index bloat if,
say, a page is emptied and vacuumed and filled with new tuples with
increasing values outside the original range; each new tuple would
create a new index tuple. I have two ideas about this (1. mark range as
unsummarized if 3rd time we touch the same page range;
Why only on the 3rd time?
Doesn't it need to be precise? If someone inserts a row having a
value greater than the max value of the corresponding index tuple,
then that index tuple's max value needs to be updated,
and I think it is updated with the help of the validity map.
For example:
Considering we need to store the below info for each index tuple:
In each index tuple (corresponding to one page range), we store:
- first block this tuple applies to
- last block this tuple applies to
- for each indexed column:
* min() value across all tuples in the range
* max() value across all tuples in the range
Assume the first and last block for the index tuple are the same (say
block 'x'), and the min value is 5 and the max is 10.
Now a user inserts/updates a value in block 'x' such that the max value
of the indexed column becomes 11. If we don't update the corresponding
index tuple, or at least invalidate it, won't it lead to wrong results?
2. vacuum the
affected index page if it's full, so we can maintain the index always up
to date without causing undue bloat), but I haven't implemented
anything yet.

The "amcostestimate" routine is completely bogus; right now it returns
constant 0, meaning the index is always chosen if it exists.
I think for the first version you might want to keep things simple, but
there should be some way for the optimizer to select this index.
So rather than choosing it whenever it is present, we could make the
optimizer choose it only when someone sets enable_minmax to true.
How about keeping this up-to-date during foreground operations?
Vacuum/maintenance tasks that maintain such things usually have
problems with bloat, and then we need to optimize or work around
those issues.
Lots of people have raised this or a similar point previously, and what
I read is that you are of the opinion that it would be too slow.
I really don't think it can be so slow that adding so much
handling to get it up-to-date via a maintenance task is worthwhile.
Currently there are
systems like Oracle where index clean-up is mainly done during
foreground operations, so this alone cannot be the reason for slowness.
Comparing the logic with index-only scans is also not completely right,
as for IOS we need to know each tuple's visibility, which is not the
case here.
Now it can happen that the min and max values are sometimes not exact
because the operation is later rolled back, but I think such cases will
be rare, and we can find some way to handle them, perhaps in a
maintenance task only, but the handling would be much simpler.
On Windows, the patch gives the below compilation errors:

src\backend\access\minmax\mmtuple.c(96): error C2057: expected constant expression
src\backend\access\minmax\mmtuple.c(96): error C2466: cannot allocate an array of constant size 0
src\backend\access\minmax\mmtuple.c(96): error C2133: 'values' : unknown size
src\backend\access\minmax\mmtuple.c(97): error C2057: expected constant expression
src\backend\access\minmax\mmtuple.c(97): error C2466: cannot allocate an array of constant size 0
src\backend\access\minmax\mmtuple.c(97): error C2133: 'nulls' : unknown size
src\backend\access\minmax\mmtuple.c(102): error C2057: expected constant expression
src\backend\access\minmax\mmtuple.c(102): error C2466: cannot allocate an array of constant size 0
src\backend\access\minmax\mmtuple.c(102): error C2133: 'phony_nullbitmap' : unknown size
src\backend\access\minmax\mmtuple.c(110): warning C4034: sizeof returns 0
src\backend\access\minmax\mmtuple.c(246): error C2057: expected constant expression
src\backend\access\minmax\mmtuple.c(246): error C2466: cannot allocate an array of constant size 0
src\backend\access\minmax\mmtuple.c(246): error C2133: 'values' : unknown size
src\backend\access\minmax\mmtuple.c(247): error C2057: expected constant expression
src\backend\access\minmax\mmtuple.c(247): error C2466: cannot allocate an array of constant size 0
src\backend\access\minmax\mmtuple.c(247): error C2133: 'allnulls' : unknown size
src\backend\access\minmax\mmtuple.c(248): error C2057: expected constant expression
src\backend\access\minmax\mmtuple.c(248): error C2466: cannot allocate an array of constant size 0
src\backend\access\minmax\mmtuple.c(248): error C2133: 'hasnulls' : unknown size
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)