Let's drop two obsolete features which are bear-traps for novices

Started by Josh Berkusabout 11 years ago68 messages
#1Josh Berkus
josh@agliodbs.com

PostgreSQL has two bits of obsolete, incomplete functionality which
entrap and frustrate new users in large numbers. Both of these features
share the following characteristics:

* added more than 10 years ago
* have the same names as useful features from other databases
* were never finished and lack critical functionality
* have not seen significant work in the last 4 releases

Every other day on IRC I run into a newbie who has used one of these
features under the mistaken impression that it is useful, and then had
to be guided in how to get their data out of this broken feature at some
length. Unknown are the number of users who didn't ask for help but
simply chose to use a different database instead.

Of course, I'm talking about the MONEY type and hash indexes (not the
hash ops class, which is useful, just the index type). It's time to put
both of these features out to pasture. Certainly neither of theise
features would be accepted into PostgreSQL today given the shape they're in.

Having these broken features around is like leaving an armed bear-trap
in a public park.

Now, I know the first thing someone will do is jump up and claim that
they were just about to fix WAL-logging on hash indexes, or add casts to
the money type. But if that hasn't happened in the last 5 years, it's
not going to happen.

We'd be doing our users a huge favor by just removing them in 9.5.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#1)
Re: Let's drop two obsolete features which are bear-traps for novices

Josh Berkus <josh@agliodbs.com> writes:

Of course, I'm talking about the MONEY type and hash indexes (not the
hash ops class, which is useful, just the index type). It's time to put
both of these features out to pasture. Certainly neither of theise
features would be accepted into PostgreSQL today given the shape they're in.

I don't care one way or the other about the money type, but I will defend
hash indexes, especially seeing that we've already added a pretty
in-your-face warning as of 9.5:

regression=# create table foo(f1 int);
CREATE TABLE
regression=# create index on foo using hash (f1);
WARNING: hash indexes are not WAL-logged and their use is discouraged
CREATE INDEX

Now, I know the first thing someone will do is jump up and claim that
they were just about to fix WAL-logging on hash indexes,

I don't know if/when that will happen as such, but Simon was making noises
about writing code to treat hash indexes as unlogged automatically, which
would more or less fix the worst risks. That's not just a special case
for hash indexes, but any index AM that lacks WAL support, as third-party
AMs might well do.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Eric Ridge
eebbrr@gmail.com
In reply to: Tom Lane (#2)
Re: Let's drop two obsolete features which are bear-traps for novices

On Fri, Oct 31, 2014 at 6:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I don't know if/when that will happen as such, but Simon was making noises
about writing code to treat hash indexes as unlogged automatically, which
would more or less fix the worst risks. That's not just a special case
for hash indexes, but any index AM that lacks WAL support, as third-party
AMs might well do.

As someone writing a 3rd-party AM, literally right this moment, do you have
a link to that thread? While I follow this list fairly closely I don't
remember seeing this. I'd love to understand the thoughts around handling
extension-based AMs.

eric

#4Amit Langote
amitlangote09@gmail.com
In reply to: Eric Ridge (#3)
Re: Let's drop two obsolete features which are bear-traps for novices

Hi,

On Sat, Nov 1, 2014 at 1:21 PM, Eric Ridge <eebbrr@gmail.com> wrote:

On Fri, Oct 31, 2014 at 6:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I don't know if/when that will happen as such, but Simon was making noises
about writing code to treat hash indexes as unlogged automatically, which
would more or less fix the worst risks. That's not just a special case
for hash indexes, but any index AM that lacks WAL support, as third-party
AMs might well do.

As someone writing a 3rd-party AM, literally right this moment, do you have
a link to that thread? While I follow this list fairly closely I don't
remember seeing this. I'd love to understand the thoughts around handling
extension-based AMs.

You are looking for this:

/messages/by-id/CAPpHfdsXwZmojm6Dx+TJnpYk27kT4o7Ri6X_4OSWcByu1Rm+VA@mail.gmail.com

Thanks,
Amit

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Josh Berkus
josh@agliodbs.com
In reply to: Josh Berkus (#1)
Re: Let's drop two obsolete features which are bear-traps for novices

On 10/31/2014 03:07 PM, Tom Lane wrote:

I don't care one way or the other about the money type, but I will defend
hash indexes, especially seeing that we've already added a pretty
in-your-face warning as of 9.5:

regression=# create table foo(f1 int);
CREATE TABLE
regression=# create index on foo using hash (f1);
WARNING: hash indexes are not WAL-logged and their use is discouraged
CREATE INDEX

Yes, and I'm arguing that is the wrong decision. If hash indexes are
"discouraged", then they shouldn't be in core in the first place.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Andres Freund
andres@2ndquadrant.com
In reply to: Josh Berkus (#5)
Re: Let's drop two obsolete features which are bear-traps for novices

On 2014-11-01 10:18:03 -0700, Josh Berkus wrote:

On 10/31/2014 03:07 PM, Tom Lane wrote:

I don't care one way or the other about the money type, but I will defend
hash indexes, especially seeing that we've already added a pretty
in-your-face warning as of 9.5:

regression=# create table foo(f1 int);
CREATE TABLE
regression=# create index on foo using hash (f1);
WARNING: hash indexes are not WAL-logged and their use is discouraged
CREATE INDEX

Yes, and I'm arguing that is the wrong decision. If hash indexes are
"discouraged", then they shouldn't be in core in the first place.

Last time we discussed it there were people (IIRC Andrew was one of
them) commenting that they use hash indexes *precisely* because they're
not WAL logged and that they can live with the dangers that creates. I
don't think that's sufficient justification for introducing the feature
at all. But it's nothing new that removing a feature has to fit quite
different criteria than adding one.

So, by that argument we could remove hash indexes once we have unlogged
indexes on logged tables. But then there's no need to remove them
anymore...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#5)
Re: Let's drop two obsolete features which are bear-traps for novices

Josh Berkus <josh@agliodbs.com> writes:

On 10/31/2014 03:07 PM, Tom Lane wrote:

I don't care one way or the other about the money type, but I will defend
hash indexes, especially seeing that we've already added a pretty
in-your-face warning as of 9.5:

regression=# create table foo(f1 int);
CREATE TABLE
regression=# create index on foo using hash (f1);
WARNING: hash indexes are not WAL-logged and their use is discouraged
CREATE INDEX

Yes, and I'm arguing that is the wrong decision. If hash indexes are
"discouraged", then they shouldn't be in core in the first place.

There's an awful lot of stuff in core that probably shouldn't be there,
but we've not made a move to rip it out, especially not if there wasn't
a way to make it an external module.

In the case of hash indexes, because we still have to have the hash
opclasses in core, there's no way that it could be pushed out as an
extension module even if we otherwise had full support for AMs as
extensions. So what I hear you proposing is "let's break this so
thoroughly that it *can't* be fixed". I'm not on board with that.
I think the WARNING will do just fine to discourage novices who are
not familiar with the state of the hash AM. In the meantime, we
could push forward with the idea of making hash indexes automatically
unlogged, so that recovering from a crash wouldn't be quite so messy/
dangerous.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#6)
Re: Let's drop two obsolete features which are bear-traps for novices

Andres Freund <andres@2ndquadrant.com> writes:

On 2014-11-01 10:18:03 -0700, Josh Berkus wrote:

Yes, and I'm arguing that is the wrong decision. If hash indexes are
"discouraged", then they shouldn't be in core in the first place.

Last time we discussed it there were people (IIRC Andrew was one of
them) commenting that they use hash indexes *precisely* because they're
not WAL logged and that they can live with the dangers that creates. I
don't think that's sufficient justification for introducing the feature
at all. But it's nothing new that removing a feature has to fit quite
different criteria than adding one.

So, by that argument we could remove hash indexes once we have unlogged
indexes on logged tables. But then there's no need to remove them
anymore...

Yeah. When we last discussed this, the difficulty was around how to make
that combination work. An unlogged index on an unlogged table is no
problem: the init-fork mechanism is able to make them both go to empty
after a crash. But for an unlogged index on a logged table, just making
the index go to empty is not the correct recovery action.

We'd been wondering how to make crash recovery change the index's pg_index
entry into not indisvalid/indisready status. That's hard to do. But
maybe we don't have to. What about having the init-fork mechanism restore
a hash index into a state with the metapage marked as invalid? hashinsert
etc could simply do nothing when they see this metapage state.
hashgettuple could throw an error saying the index isn't usable until it's
been REINDEXed.

This is not quite as nice as an indisvalid-based solution, because it
requires some extra cooperation from the index AM's code. But setting up
an init fork requires work from the index AM anyway, so that objection
doesn't seem terribly strong to me.

Thoughts?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Andres Freund
andres@2ndquadrant.com
In reply to: Tom Lane (#8)
Re: Let's drop two obsolete features which are bear-traps for novices

On 2014-11-01 13:58:02 -0400, Tom Lane wrote:

Yeah. When we last discussed this, the difficulty was around how to make
that combination work. An unlogged index on an unlogged table is no
problem: the init-fork mechanism is able to make them both go to empty
after a crash. But for an unlogged index on a logged table, just making
the index go to empty is not the correct recovery action.

Just removing the relfilenode would probably be better than what we have
today. It sure is pretty unsatisfactory to get a ERROR after a
crash/immediate restart, but it's far better than getting wrong results
back (or even crash).

We'd been wondering how to make crash recovery change the index's pg_index
entry into not indisvalid/indisready status. That's hard to do.

It really is hard.

What I've been wondering about whether we should be boring and do it on
backend startup. Roughly similar to the way the relcache init file is
handled. The first backend that starts up for a certain database* scans
pg_index and rechecks all unlogged indexes of logged tables and marks
them as invalid.

* There's a couple ways we could detect that. The most trivial is to
just have a 'initialized' marker file in the database whose presence we
test on startup. That's probably acceptable costwise in comparison to
all the other stuff happening at backend init. We could get more fancy,
but it's probably not needed.

But
maybe we don't have to. What about having the init-fork mechanism restore
a hash index into a state with the metapage marked as invalid? hashinsert
etc could simply do nothing when they see this metapage state.
hashgettuple could throw an error saying the index isn't usable until it's
been REINDEXed.

This is not quite as nice as an indisvalid-based solution, because it
requires some extra cooperation from the index AM's code. But setting up
an init fork requires work from the index AM anyway, so that objection
doesn't seem terribly strong to me.

The most annoying thing I see with that kind of approach is that we'd
need to read the metapage pretty early during planning. The second most
annoying is that it's harder to for the user to detect which indexes are
in that state - we'd probably need to provide a SQL level function to
detect it.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Andrew Dunstan
andrew@dunslane.net
In reply to: Andres Freund (#6)
Re: Let's drop two obsolete features which are bear-traps for novices

On 11/01/2014 01:26 PM, Andres Freund wrote:

On 2014-11-01 10:18:03 -0700, Josh Berkus wrote:

On 10/31/2014 03:07 PM, Tom Lane wrote:

I don't care one way or the other about the money type, but I will defend
hash indexes, especially seeing that we've already added a pretty
in-your-face warning as of 9.5:

regression=# create table foo(f1 int);
CREATE TABLE
regression=# create index on foo using hash (f1);
WARNING: hash indexes are not WAL-logged and their use is discouraged
CREATE INDEX

Yes, and I'm arguing that is the wrong decision. If hash indexes are
"discouraged", then they shouldn't be in core in the first place.

Last time we discussed it there were people (IIRC Andrew was one of
them) commenting that they use hash indexes *precisely* because they're
not WAL logged and that they can live with the dangers that creates. I
don't think that's sufficient justification for introducing the feature
at all. But it's nothing new that removing a feature has to fit quite
different criteria than adding one.

So, by that argument we could remove hash indexes once we have unlogged
indexes on logged tables. But then there's no need to remove them
anymore...

Yes, although there might not be much reason to use them either. I think
Tom's first step of making hash indexes automatically unlogged makes
sense. Longer term I'd like to see unlogged as an option for all AMs -
especially btree. It makes plenty of sense to me to be able to make the
data resilient even if the indexes, which can after all be recreated in
the unlikely event of a crash, are not.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Andres Freund
andres@2ndquadrant.com
In reply to: Andrew Dunstan (#10)
Re: Let's drop two obsolete features which are bear-traps for novices

On 2014-11-01 14:13:02 -0400, Andrew Dunstan wrote:

Yes, although there might not be much reason to use them either. I think
Tom's first step of making hash indexes automatically unlogged makes sense.
Longer term I'd like to see unlogged as an option for all AMs - especially
btree.

If we implement unlogged indexes, it should definitely work for all
builtin indexes.

It makes plenty of sense to me to be able to make the data resilient
even if the indexes, which can after all be recreated in the unlikely event
of a crash, are not.

While I agree that it can be useful, I think the fact that replicas/HS
can't use such index makes them far less useful.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#8)
Re: Let's drop two obsolete features which are bear-traps for novices

On 11/01/2014 01:58 PM, Tom Lane wrote:

Andres Freund <andres@2ndquadrant.com> writes:

On 2014-11-01 10:18:03 -0700, Josh Berkus wrote:

Yes, and I'm arguing that is the wrong decision. If hash indexes are
"discouraged", then they shouldn't be in core in the first place.

Last time we discussed it there were people (IIRC Andrew was one of
them) commenting that they use hash indexes *precisely* because they're
not WAL logged and that they can live with the dangers that creates. I
don't think that's sufficient justification for introducing the feature
at all. But it's nothing new that removing a feature has to fit quite
different criteria than adding one.
So, by that argument we could remove hash indexes once we have unlogged
indexes on logged tables. But then there's no need to remove them
anymore...

Yeah. When we last discussed this, the difficulty was around how to make
that combination work. An unlogged index on an unlogged table is no
problem: the init-fork mechanism is able to make them both go to empty
after a crash. But for an unlogged index on a logged table, just making
the index go to empty is not the correct recovery action.

We'd been wondering how to make crash recovery change the index's pg_index
entry into not indisvalid/indisready status. That's hard to do. But
maybe we don't have to. What about having the init-fork mechanism restore
a hash index into a state with the metapage marked as invalid? hashinsert
etc could simply do nothing when they see this metapage state.
hashgettuple could throw an error saying the index isn't usable until it's
been REINDEXed.

This is not quite as nice as an indisvalid-based solution, because it
requires some extra cooperation from the index AM's code. But setting up
an init fork requires work from the index AM anyway, so that objection
doesn't seem terribly strong to me.

Thoughts?

Isn't the planner still going to try to use the index in that case? If
it's not then I'd be OK with it, but if it's going to make the table
largely unusable until it's reindexed that would be rather sad.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#9)
Re: Let's drop two obsolete features which are bear-traps for novices

Andres Freund <andres@2ndquadrant.com> writes:

On 2014-11-01 13:58:02 -0400, Tom Lane wrote:

maybe we don't have to. What about having the init-fork mechanism restore
a hash index into a state with the metapage marked as invalid? hashinsert
etc could simply do nothing when they see this metapage state.
hashgettuple could throw an error saying the index isn't usable until it's
been REINDEXed.

The most annoying thing I see with that kind of approach is that we'd
need to read the metapage pretty early during planning.

No, I was specifically *not* proposing that. What I proposed was if the
planner picks the index for use in a query, you get an error.

Yeah, if we were trying to duplicate the behavior of indisvalid, there'd
need to be a way to detect the invalid index at plan time and not use it.
But I'm not sure that that's actually an improvement from the user's
standpoint: what they'd see is queries suddenly, and silently, performing
a lot worse than they expect. An explicit complaint about the necessary
REINDEX seems more user-friendly from where I sit.

However, if the consensus is that silently ignoring the index is the best
behavior, I would not be too concerned about the cost of checking the
metapage to see if the index is valid. A hash index's metapage would tend
to stay pinned in shared buffers anyway, because it's necessarily touched
on each use of the index. If that opinion proves wrong, the AM could take
steps to cache the state in the index's relcache entry (btree already
maintains similar cached state).

The second most
annoying is that it's harder to for the user to detect which indexes are
in that state - we'd probably need to provide a SQL level function to
detect it.

Sure, we could do that.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14Andres Freund
andres@2ndquadrant.com
In reply to: Andrew Dunstan (#12)
Re: Let's drop two obsolete features which are bear-traps for novices

On 2014-11-01 14:19:22 -0400, Andrew Dunstan wrote:

Isn't the planner still going to try to use the index in that case? If it's
not then I'd be OK with it, but if it's going to make the table largely
unusable until it's reindexed that would be rather sad.

Both the planner (for querying) and the executor (to avoid inserting
tuples into the index) would have to query the state of such indexes. I
don't think it can reasonably work otherwise.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#14)
Re: Let's drop two obsolete features which are bear-traps for novices

Andres Freund <andres@2ndquadrant.com> writes:

On 2014-11-01 14:19:22 -0400, Andrew Dunstan wrote:

Isn't the planner still going to try to use the index in that case? If it's
not then I'd be OK with it, but if it's going to make the table largely
unusable until it's reindexed that would be rather sad.

Both the planner (for querying) and the executor (to avoid inserting
tuples into the index) would have to query the state of such indexes. I
don't think it can reasonably work otherwise.

The executor doesn't need to know anything, since the AM can trivially
make aminsert be a no-op if the index is internally invalid. The planner
only needs to know something if we think that silently being slow for a
query meant to search the index is better than throwing an error reminding
the user that the index needs to be reindexed.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16Andres Freund
andres@2ndquadrant.com
In reply to: Tom Lane (#13)
Re: Let's drop two obsolete features which are bear-traps for novices

On 2014-11-01 14:23:07 -0400, Tom Lane wrote:

Andres Freund <andres@2ndquadrant.com> writes:

On 2014-11-01 13:58:02 -0400, Tom Lane wrote:

maybe we don't have to. What about having the init-fork mechanism restore
a hash index into a state with the metapage marked as invalid? hashinsert
etc could simply do nothing when they see this metapage state.
hashgettuple could throw an error saying the index isn't usable until it's
been REINDEXed.

The most annoying thing I see with that kind of approach is that we'd
need to read the metapage pretty early during planning.

No, I was specifically *not* proposing that. What I proposed was if the
planner picks the index for use in a query, you get an error.

Ugh. I think that'll be pretty ugly. At the very least we'd need to
provide a way to mark such indexes as 'actually invalid' in the sense of
indisready. It'll not only block queries that sensibly would end up
using that index, but also all modifications - making the relation
essentially read only.

If we just want this, we can just remove the main fork of such relations
on startup, and not bother with index specific stuff.

Yeah, if we were trying to duplicate the behavior of indisvalid, there'd
need to be a way to detect the invalid index at plan time and not use it.
But I'm not sure that that's actually an improvement from the user's
standpoint: what they'd see is queries suddenly, and silently, performing
a lot worse than they expect. An explicit complaint about the necessary
REINDEX seems more user-friendly from where I sit.

A REINDEX is imo unlikely to be acceptable. It takes long (why would you
bother on a small table?) and locks the relation/indexes.

However, if the consensus is that silently ignoring the index is the best
behavior, I would not be too concerned about the cost of checking the
metapage to see if the index is valid. A hash index's metapage would tend
to stay pinned in shared buffers anyway, because it's necessarily touched
on each use of the index. If that opinion proves wrong, the AM could take
steps to cache the state in the index's relcache entry (btree already
maintains similar cached state).

We also could just put it in the generic relcache code...

Yea.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#16)
Re: Let's drop two obsolete features which are bear-traps for novices

Andres Freund <andres@2ndquadrant.com> writes:

A REINDEX is imo unlikely to be acceptable. It takes long (why would you
bother on a small table?) and locks the relation/indexes.

I think the goalposts just took a vacation to Acapulco.

What exactly do you think is going to make a crashed unlogged index valid
again without a REINDEX? Certainly the people who are currently using
hash indexes in the way Andrew describes are expecting to have to REINDEX
them after a crash.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18Andres Freund
andres@2ndquadrant.com
In reply to: Tom Lane (#17)
Re: Let's drop two obsolete features which are bear-traps for novices

On 2014-11-01 14:39:21 -0400, Tom Lane wrote:

Andres Freund <andres@2ndquadrant.com> writes:

A REINDEX is imo unlikely to be acceptable. It takes long (why would you
bother on a small table?) and locks the relation/indexes.

I think the goalposts just took a vacation to Acapulco.

I think that might be caused by a misunderstanding.

What exactly do you think is going to make a crashed unlogged index valid
again without a REINDEX? Certainly the people who are currently using
hash indexes in the way Andrew describes are expecting to have to REINDEX
them after a crash.

Obviously that individual index needs to be recreated. What I mean is
that I don't think it'll be acceptable that the table essentially can't
be queried before that's done. The situations in which I'd found
unlogged indexes useful is where there's some indexes are critical for
the OLTP business (those would continue to be logged), but some other
large ones are for things that aren't absolutely essential. Reports and
such.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19Andrew Dunstan
andrew@dunslane.net
In reply to: Andres Freund (#16)
Re: Let's drop two obsolete features which are bear-traps for novices

On 11/01/2014 02:34 PM, Andres Freund wrote:

Yeah, if we were trying to duplicate the behavior of indisvalid, there'd
need to be a way to detect the invalid index at plan time and not use it.
But I'm not sure that that's actually an improvement from the user's
standpoint: what they'd see is queries suddenly, and silently, performing
a lot worse than they expect. An explicit complaint about the necessary
REINDEX seems more user-friendly from where I sit.

A REINDEX is imo unlikely to be acceptable. It takes long (why would you
bother on a small table?) and locks the relation/indexes.

It's a bit of a pity we don't have REINDEX CONCURRENTLY.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#18)
Re: Let's drop two obsolete features which are bear-traps for novices

Andres Freund <andres@2ndquadrant.com> writes:

On 2014-11-01 14:39:21 -0400, Tom Lane wrote:

What exactly do you think is going to make a crashed unlogged index valid
again without a REINDEX? Certainly the people who are currently using
hash indexes in the way Andrew describes are expecting to have to REINDEX
them after a crash.

Obviously that individual index needs to be recreated. What I mean is
that I don't think it'll be acceptable that the table essentially can't
be queried before that's done. The situations in which I'd found
unlogged indexes useful is where there's some indexes are critical for
the OLTP business (those would continue to be logged), but some other
large ones are for things that aren't absolutely essential. Reports and
such.

Sure. And as long as you aren't issuing queries that would want to scan
the crashed index, it won't matter either way. The question is whether
you'd rather that your "inessential reporting queries" fail without the
broken index, or that they take extreme amounts of time/resources.
I don't think it's obvious that the first alternative is bad.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21Andres Freund
andres@2ndquadrant.com
In reply to: Andrew Dunstan (#19)
Re: Let's drop two obsolete features which are bear-traps for novices

On 2014-11-01 14:45:47 -0400, Andrew Dunstan wrote:

On 11/01/2014 02:34 PM, Andres Freund wrote:

Yeah, if we were trying to duplicate the behavior of indisvalid, there'd
need to be a way to detect the invalid index at plan time and not use it.
But I'm not sure that that's actually an improvement from the user's
standpoint: what they'd see is queries suddenly, and silently, performing
a lot worse than they expect. An explicit complaint about the necessary
REINDEX seems more user-friendly from where I sit.

A REINDEX is imo unlikely to be acceptable. It takes long (why would you
bother on a small table?) and locks the relation/indexes.

It's a bit of a pity we don't have REINDEX CONCURRENTLY.

We essentially don't have it because people opined towards the end of
9.4 that a brief (as in two pg_class updates) AccessExclusive lock
window makes the feature moot. I still think that's quite heavily
disregarding the practial reality.
Luckily opinion seems to have shifted a bit again.

It'd also be really helpful if REINDEX CONCURRENTLY had a way to only
reindex invalid indexes. But that probably is just a smop.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#19)
Re: Let's drop two obsolete features which are bear-traps for novices

Andrew Dunstan <andrew@dunslane.net> writes:

It's a bit of a pity we don't have REINDEX CONCURRENTLY.

Well, that's an entirely different thread. Let's not get distracted
by that topic.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#23Andres Freund
andres@2ndquadrant.com
In reply to: Tom Lane (#20)
Re: Let's drop two obsolete features which are bear-traps for novices

On 2014-11-01 14:48:20 -0400, Tom Lane wrote:

Andres Freund <andres@2ndquadrant.com> writes:

On 2014-11-01 14:39:21 -0400, Tom Lane wrote:

What exactly do you think is going to make a crashed unlogged index valid
again without a REINDEX? Certainly the people who are currently using
hash indexes in the way Andrew describes are expecting to have to REINDEX
them after a crash.

Obviously that individual index needs to be recreated. What I mean is
that I don't think it'll be acceptable that the table essentially can't
be queried before that's done. The situations in which I'd found
unlogged indexes useful is where there's some indexes are critical for
the OLTP business (those would continue to be logged), but some other
large ones are for things that aren't absolutely essential. Reports and
such.

Sure. And as long as you aren't issuing queries that would want to scan
the crashed index, it won't matter either way. The question is whether
you'd rather that your "inessential reporting queries" fail without the
broken index, or that they take extreme amounts of time/resources.
I don't think it's obvious that the first alternative is bad.

In some of these cases the unlogged index would still be used for a
subset of the OLTP workload, e.g. because they're smaller. We e.g. have
a client that has smaller (as in 50GB instead of 600GB) indexes for rows
of a certain type in the table, but also one that spans the whole thing.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#24Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#17)
Re: Let's drop two obsolete features which are bear-traps for novices

On 11/01/2014 02:39 PM, Tom Lane wrote:

Andres Freund <andres@2ndquadrant.com> writes:

A REINDEX is imo unlikely to be acceptable. It takes long (why would you
bother on a small table?) and locks the relation/indexes.

I think the goalposts just took a vacation to Acapulco.

What exactly do you think is going to make a crashed unlogged index valid
again without a REINDEX? Certainly the people who are currently using
hash indexes in the way Andrew describes are expecting to have to REINDEX
them after a crash.

That's certainly true. They were warned of the risks and found them
acceptable.

The real question here is whether the table should continue to be usable
in a degraded state until it's reindexed.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#25Andres Freund
andres@2ndquadrant.com
In reply to: Andrew Dunstan (#24)
Re: Let's drop two obsolete features which are bear-traps for novices

On 2014-11-01 14:56:35 -0400, Andrew Dunstan wrote:

On 11/01/2014 02:39 PM, Tom Lane wrote:

Andres Freund <andres@2ndquadrant.com> writes:

A REINDEX is imo unlikely to be acceptable. It takes long (why would you
bother on a small table?) and locks the relation/indexes.

I think the goalposts just took a vacation to Acapulco.

What exactly do you think is going to make a crashed unlogged index valid
again without a REINDEX? Certainly the people who are currently using
hash indexes in the way Andrew describes are expecting to have to REINDEX
them after a crash.

That's certainly true. They were warned of the risks and found them
acceptable.

The real question here is whether the table should continue to be usable in
a degraded state until it's reindexed.

One argument in that direction imo is HS. We certainly would just
generally ignore unlogged indexes for querying while InRecovery, right?
Because otherwise HS would become pretty useless. And I think it'd be
pretty wierd if things worked on HS and not on the primary (or the HS
after promotion).

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#26Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#23)
Re: Let's drop two obsolete features which are bear-traps for novices

Andres Freund <andres@2ndquadrant.com> writes:

On 2014-11-01 14:48:20 -0400, Tom Lane wrote:

Sure. And as long as you aren't issuing queries that would want to scan
the crashed index, it won't matter either way. The question is whether
you'd rather that your "inessential reporting queries" fail without the
broken index, or that they take extreme amounts of time/resources.
I don't think it's obvious that the first alternative is bad.

In some of these cases the unlogged index would still be used for a
subset of the OLTP workload, e.g. because they're smaller. We e.g. have
a client that has smaller (as in 50GB instead of 600GB) indexes for rows
of a certain type in the table, but also one that spans the whole thing.

Meh. There are no existing use-cases where anyone would dare use a hash
index in that way, because they couldn't be sure it would be valid.

However, it seems like we're arguing over not much. We'd certainly need a
new AM entry point to test whether the index is internally valid, so as to
support a SQL-level function that can report that. This argument boils
down to whether or not the planner should expend a few cycles to call
that, which in the end is unlikely to cost enough to notice (especially if
the AM arranges to cache that state in indexes' relcache entries).

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#27Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#24)
Re: Let's drop two obsolete features which are bear-traps for novices

Andrew Dunstan <andrew@dunslane.net> writes:

The real question here is whether the table should continue to be usable
in a degraded state until it's reindexed.

It certainly will be, as long as your notion of "usable in a degraded
state" doesn't include issuing queries that would prefer to use the broken
index. The discussion is about what exactly should happen if you do that.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#28Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#25)
Re: Let's drop two obsolete features which are bear-traps for novices

Andres Freund <andres@2ndquadrant.com> writes:

One argument in that direction imo is HS. We certainly would just
generally ignore unlogged indexes for querying while InRecovery, right?
Because otherwise HS would become pretty useless. And I think it'd be
pretty wierd if things worked on HS and not on the primary (or the HS
after promotion).

I don't see how HS has anything to do with this discussion. We would
certainly have the index marked as unlogged in the catalogs, and we
would therefore not use it while InRecovery. Once you promote a
slave to live, it would be in the same state as a post-crash-recovery
master.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#29Andres Freund
andres@2ndquadrant.com
In reply to: Tom Lane (#28)
Re: Let's drop two obsolete features which are bear-traps for novices

On 2014-11-01 15:11:40 -0400, Tom Lane wrote:

Andres Freund <andres@2ndquadrant.com> writes:

One argument in that direction imo is HS. We certainly would just
generally ignore unlogged indexes for querying while InRecovery, right?
Because otherwise HS would become pretty useless. And I think it'd be
pretty wierd if things worked on HS and not on the primary (or the HS
after promotion).

I don't see how HS has anything to do with this discussion. We would
certainly have the index marked as unlogged in the catalogs, and we
would therefore not use it while InRecovery.

Consider:
SELECT * FROM tbl WHERE active AND value = $1;
that can be satisfied by two indexes. One on (value), and an unlogged
index on (value) WHERE active. While in HS only the logged one will be
used. But if we don't silently ignore invalid unlogged indexes, hell
will break loose after promition because suddenly the predicated index
will be used in plans.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#30Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#29)
Re: Let's drop two obsolete features which are bear-traps for novices

Andres Freund <andres@2ndquadrant.com> writes:

On 2014-11-01 15:11:40 -0400, Tom Lane wrote:

I don't see how HS has anything to do with this discussion.

Consider:
SELECT * FROM tbl WHERE active AND value = $1;
that can be satisfied by two indexes. One on (value), and an unlogged
index on (value) WHERE active. While in HS only the logged one will be
used. But if we don't silently ignore invalid unlogged indexes, hell
will break loose after promition because suddenly the predicated index
will be used in plans.

This example is utterly unrelated to HS: the same hazard would exist
on the master after a crash, with or without HS.

With current usage of hash indexes, nobody would ever construct such an
arrangement, or at least they would not expect to be able to start up
querying without having reindexed first. My proposal to throw errors
was based on the assumption that unlogged indexes would only be used to
support a readily identified subset of inessential queries. If we think
examples like the above are things people would actually want to do, then
yeah we would want to silently skip broken indexes. I personally find
this example less than compelling and think it's far more likely that
we'll get complaints about the system *not* complaining when a broken
index would be needed. However, it's only one line of code either way,
so I'm not going to argue about it further.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#31Andres Freund
andres@2ndquadrant.com
In reply to: Tom Lane (#30)
Re: Let's drop two obsolete features which are bear-traps for novices

On 2014-11-01 15:33:27 -0400, Tom Lane wrote:

With current usage of hash indexes, nobody would ever construct such an
arrangement

Do you think this should only be implemented for hash indexes? I'd think
we'd do it for all existing index AMs?

I'm not all that excited about unlogged hash indexes. Yes, that'll
remove a annoying hazard, but I probably won't use them anyway. I am
somewhat excited about the more general unlogged indexes feature.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#32Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#31)
Re: Let's drop two obsolete features which are bear-traps for novices

Andres Freund <andres@2ndquadrant.com> writes:

On 2014-11-01 15:33:27 -0400, Tom Lane wrote:

With current usage of hash indexes, nobody would ever construct such an
arrangement

Do you think this should only be implemented for hash indexes? I'd think
we'd do it for all existing index AMs?

I think we should eventually get to that point, perhaps. I'd want to do
it first for hash, since that's what direly needs it. I can see the value
of doing it for btree as well, but I'm less sure that the other index
types would have enough usage to justify the work.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#33Peter Geoghegan
pg@heroku.com
In reply to: Andres Freund (#31)
Re: Let's drop two obsolete features which are bear-traps for novices

On Sat, Nov 1, 2014 at 12:40 PM, Andres Freund <andres@2ndquadrant.com> wrote:

I'm not all that excited about unlogged hash indexes. Yes, that'll
remove a annoying hazard, but I probably won't use them anyway. I am
somewhat excited about the more general unlogged indexes feature.

I strongly agree.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#34Josh Berkus
josh@agliodbs.com
In reply to: Josh Berkus (#1)
Re: Let's drop two obsolete features which are bear-traps for novices

All,

While there's argument about hash indexes, it looks like nobody minds if
the MONEY type goes bye-bye. So, time for a patch ...

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#35David Rowley
dgrowleyml@gmail.com
In reply to: Josh Berkus (#34)
Re: Let's drop two obsolete features which are bear-traps for novices

On Sun, Nov 2, 2014 at 12:59 PM, Josh Berkus <josh@agliodbs.com> wrote:

All,

While there's argument about hash indexes, it looks like nobody minds if
the MONEY type goes bye-bye. So, time for a patch ...

Will the patch move the feature to a contrib module?

Or will our release notes state that all MONEY columns must be changed to
an appropriate NUMERIC type pre-upgrade?

Regards

David Rowley

#36Andres Freund
andres@2ndquadrant.com
In reply to: Josh Berkus (#34)
Re: Let's drop two obsolete features which are bear-traps for novices

On 2014-11-01 16:59:35 -0700, Josh Berkus wrote:

All,

While there's argument about hash indexes, it looks like nobody minds if
the MONEY type goes bye-bye. So, time for a patch ...

FWIW there have been somewhat recent patches for money and it was
undeprecated not too long ago. So apparently there's users for it out
there. I personally would never use money, but I doubt the gain is worth
the cost of breaking existing setups without a warning period.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#37Jaime Casanova
jaime@2ndquadrant.com
In reply to: Andres Freund (#36)
Re: Let's drop two obsolete features which are bear-traps for novices

El nov 2, 2014 7:54 AM, "Andres Freund" <andres@2ndquadrant.com> escribió:

On 2014-11-01 16:59:35 -0700, Josh Berkus wrote:

All,

While there's argument about hash indexes, it looks like nobody minds if
the MONEY type goes bye-bye. So, time for a patch ...

FWIW there have been somewhat recent patches for money and it was
undeprecated not too long ago. So apparently there's users for it out
there. I personally would never use money, but I doubt the gain is worth
the cost of breaking existing setups without a warning period.

Not knowing how difficult it could be maybe a fair compromise is to move
MONEY datatype to a contrib. And documenting its limitations.

--
Jaime Casanova
2ndQuadrant Consultant
PostgreSQL Development, 24x7 Support, Training & Services

#38Andrew Dunstan
andrew@dunslane.net
In reply to: Jaime Casanova (#37)
Re: Let's drop two obsolete features which are bear-traps for novices

On 11/02/2014 10:01 AM, Jaime Casanova wrote:

El nov 2, 2014 7:54 AM, "Andres Freund" <andres@2ndquadrant.com
<mailto:andres@2ndquadrant.com>> escribió:

On 2014-11-01 16:59:35 -0700, Josh Berkus wrote:

All,

While there's argument about hash indexes, it looks like nobody

minds if

the MONEY type goes bye-bye. So, time for a patch ...

FWIW there have been somewhat recent patches for money and it was
undeprecated not too long ago. So apparently there's users for it out
there. I personally would never use money, but I doubt the gain is worth
the cost of breaking existing setups without a warning period.

Not knowing how difficult it could be maybe a fair compromise is to
move MONEY datatype to a contrib. And documenting its limitations.

That's pretty much dead in the water, I think. Builtin types and
functions have Oid values in different ranges from non-builtin types
such as those in contrib. It's one reason we have no chance of bringing
hstore into core as people have previously asked for. And for the same
reason I think moving a core type out to contrib won't work. In any
case, contrib shouldn't be a rubbish heap of old deprecated features.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#39Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#38)
Re: Let's drop two obsolete features which are bear-traps for novices

Andrew Dunstan <andrew@dunslane.net> writes:

On 11/02/2014 10:01 AM, Jaime Casanova wrote:

Not knowing how difficult it could be maybe a fair compromise is to
move MONEY datatype to a contrib. And documenting its limitations.

That's pretty much dead in the water, I think. Builtin types and
functions have Oid values in different ranges from non-builtin types
such as those in contrib. It's one reason we have no chance of bringing
hstore into core as people have previously asked for. And for the same
reason I think moving a core type out to contrib won't work.

Well, the OID compatibility issue could be dodged by saying that we can't
do a pg_upgrade (in-place upgrade) of a database containing MONEY
columns. In fact, we might be able to just reject databases containing
MONEY[] (array) columns, which seems like it might be only a minor hazard.
Either way, requiring a dump/reload for upgrade is surely a better answer
for users of the type than just summarily screwing them.

In any
case, contrib shouldn't be a rubbish heap of old deprecated features.

There's a fair amount of contrib that was never anything else, so I don't
agree with that reasoning too much.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#40Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#39)
Re: Let's drop two obsolete features which are bear-traps for novices

I wrote:

Either way, requiring a dump/reload for upgrade is surely a better answer
for users of the type than just summarily screwing them.

BTW, after reflecting a bit more I'm less than convinced that this
datatype is completely useless. Even if you prefer to store currency
values in numeric columns, casting to or from money provides a way to
accept or emit values in whatever monetary format the LC_MONETARY locale
setting specifies. That seems like a useful feature, and it's one you
could not easily duplicate using to_char/to_number (not to mention that
those functions aren't without major shortcomings of their own).

A quick trawl through the git history for cash.c indicates that most of
the bug fixes it's gotten recently have concerned doing formatting
correctly for non-US-like LC_MONETARY settings, so I think there are
indeed people out there who care about that capability.

Certainly we could package that capability in another way besides a
datatype, but I think we should actually do so (and give users a release
or two to switch over) before we consider nuking the type.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#41Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#39)
Re: Let's drop two obsolete features which are bear-traps for novices

On 11/02/2014 11:53 AM, Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

On 11/02/2014 10:01 AM, Jaime Casanova wrote:

Not knowing how difficult it could be maybe a fair compromise is to
move MONEY datatype to a contrib. And documenting its limitations.

That's pretty much dead in the water, I think. Builtin types and
functions have Oid values in different ranges from non-builtin types
such as those in contrib. It's one reason we have no chance of bringing
hstore into core as people have previously asked for. And for the same
reason I think moving a core type out to contrib won't work.

Well, the OID compatibility issue could be dodged by saying that we can't
do a pg_upgrade (in-place upgrade) of a database containing MONEY
columns. In fact, we might be able to just reject databases containing
MONEY[] (array) columns, which seems like it might be only a minor hazard.
Either way, requiring a dump/reload for upgrade is surely a better answer
for users of the type than just summarily screwing them.

Well, OK, yes, if we're prepared to abandon pg_upgrade-ability.

In any
case, contrib shouldn't be a rubbish heap of old deprecated features.

There's a fair amount of contrib that was never anything else, so I don't
agree with that reasoning too much.

Maybe my memory is failing. What in contrib is stuff that used to be in
core?

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#42Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#41)
Re: Let's drop two obsolete features which are bear-traps for novices

Andrew Dunstan <andrew@dunslane.net> writes:

On 11/02/2014 11:53 AM, Tom Lane wrote:

Well, the OID compatibility issue could be dodged by saying that we can't
do a pg_upgrade (in-place upgrade) of a database containing MONEY
columns. In fact, we might be able to just reject databases containing
MONEY[] (array) columns, which seems like it might be only a minor hazard.
Either way, requiring a dump/reload for upgrade is surely a better answer
for users of the type than just summarily screwing them.

Well, OK, yes, if we're prepared to abandon pg_upgrade-ability.

Not following your point? Removing the type entirely would certainly
break pg_upgrade-ability as well.

In any
case, contrib shouldn't be a rubbish heap of old deprecated features.

There's a fair amount of contrib that was never anything else, so I don't
agree with that reasoning too much.

Maybe my memory is failing. What in contrib is stuff that used to be in
core?

Nothing that I recall at the moment, but there is certainly plenty of
stuff of dubious quality in there. I'd argue that chkpass, intagg,
intarray, isn, spi, and xml2 are all in worse shape than the money type.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#43Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#42)
Re: Let's drop two obsolete features which are bear-traps for novices

On 11/02/2014 02:41 PM, Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

On 11/02/2014 11:53 AM, Tom Lane wrote:

Well, the OID compatibility issue could be dodged by saying that we can't
do a pg_upgrade (in-place upgrade) of a database containing MONEY
columns. In fact, we might be able to just reject databases containing
MONEY[] (array) columns, which seems like it might be only a minor hazard.
Either way, requiring a dump/reload for upgrade is surely a better answer
for users of the type than just summarily screwing them.

Well, OK, yes, if we're prepared to abandon pg_upgrade-ability.

Not following your point? Removing the type entirely would certainly
break pg_upgrade-ability as well.

I'm not entirely convinced that we should remove it.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#44Michael Banck
michael.banck@credativ.de
In reply to: Tom Lane (#40)
Re: Let's drop two obsolete features which are bear-traps for novices

Am Sonntag, den 02.11.2014, 12:41 -0500 schrieb Tom Lane:

BTW, after reflecting a bit more I'm less than convinced that this
datatype is completely useless. Even if you prefer to store currency
values in numeric columns, casting to or from money provides a way to
accept or emit values in whatever monetary format the LC_MONETARY locale
setting specifies. That seems like a useful feature, and it's one you
could not easily duplicate using to_char/to_number (not to mention that
those functions aren't without major shortcomings of their own).

As an additional datapoint, Vitesse Data changed the DB schema from
NUMERIC to MONEY for their TPCH benchmark for performance reasons: "The
modification to data types is easy to understand -- money and double
types are faster than Numeric (and no one on this planet has a bank
account that overflows the money type, not any time soon)."[1]http://vitesse-timing-on.blogspot.de/2014/10/running-tpch-on-postgresql-part-1.html And
"Replaced NUMERIC fields representing currency with MONEY"[2]http://vitessedata.com/benchmark/.

Not sure whether they modified/optimized PostgreSQL with respect to the
MONEY data type and/or how much performance that gained, so CCing CK Tan
as well.

Michael

[1]: http://vitesse-timing-on.blogspot.de/2014/10/running-tpch-on-postgresql-part-1.html
[2]: http://vitessedata.com/benchmark/

--
Michael Banck
Projektleiter / Berater
Tel.: +49 (2161) 4643-171
Fax: +49 (2161) 4643-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Hohenzollernstr. 133, 41061 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#45Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Andrew Dunstan (#19)
Re: Let's drop two obsolete features which are bear-traps for novices

On 11/1/14, 1:45 PM, Andrew Dunstan wrote:

On 11/01/2014 02:34 PM, Andres Freund wrote:

Yeah, if we were trying to duplicate the behavior of indisvalid, there'd
need to be a way to detect the invalid index at plan time and not use it.
But I'm not sure that that's actually an improvement from the user's
standpoint: what they'd see is queries suddenly, and silently, performing
a lot worse than they expect. An explicit complaint about the necessary
REINDEX seems more user-friendly from where I sit.

A REINDEX is imo unlikely to be acceptable. It takes long (why would you
bother on a small table?) and locks the relation/indexes.

It's a bit of a pity we don't have REINDEX CONCURRENTLY.

Reviews welcome: https://commitfest.postgresql.org/action/patch_view?id=1563
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#46David Fetter
david@fetter.org
In reply to: Michael Banck (#44)
Re: Let's drop two obsolete features which are bear-traps for novices

On Mon, Nov 03, 2014 at 01:54:09PM +0100, Michael Banck wrote:

Am Sonntag, den 02.11.2014, 12:41 -0500 schrieb Tom Lane:

BTW, after reflecting a bit more I'm less than convinced that this
datatype is completely useless. Even if you prefer to store currency
values in numeric columns, casting to or from money provides a way to
accept or emit values in whatever monetary format the LC_MONETARY locale
setting specifies. That seems like a useful feature, and it's one you
could not easily duplicate using to_char/to_number (not to mention that
those functions aren't without major shortcomings of their own).

As an additional datapoint, Vitesse Data changed the DB schema from
NUMERIC to MONEY for their TPCH benchmark for performance reasons: "The
modification to data types is easy to understand -- money and double
types are faster than Numeric (and no one on this planet has a bank
account that overflows the money type, not any time soon)."[1] And
"Replaced NUMERIC fields representing currency with MONEY"[2].

Not sure whether they modified/optimized PostgreSQL with respect to the
MONEY data type and/or how much performance that gained, so CCing CK Tan
as well.

How does our NUMERIC type's performance compare to other systems'
precise types? I realize that some of those systems might have
restrictions on publishing numbers they don't authorize, but they
might have pushed some authorized numbers if those numbers make them
look good.

Also, just a general three-state comparison, "we beat them handily,"
"we're somewhere near them" or "we're not even close" would be enough
to work from.

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#47Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: David Fetter (#46)
Re: Let's drop two obsolete features which are bear-traps for novices

David Fetter wrote:

On Mon, Nov 03, 2014 at 01:54:09PM +0100, Michael Banck wrote:

As an additional datapoint, Vitesse Data changed the DB schema from
NUMERIC to MONEY for their TPCH benchmark for performance reasons: "The
modification to data types is easy to understand -- money and double
types are faster than Numeric (and no one on this planet has a bank
account that overflows the money type, not any time soon)."[1] And
"Replaced NUMERIC fields representing currency with MONEY"[2].

Not sure whether they modified/optimized PostgreSQL with respect to the
MONEY data type and/or how much performance that gained, so CCing CK Tan
as well.

How does our NUMERIC type's performance compare to other systems'
precise types? I realize that some of those systems might have
restrictions on publishing numbers they don't authorize, but they
might have pushed some authorized numbers if those numbers make them
look good.

There is a real advantage of money over numeric in the performance
front. I haven't measured it, but suffice to say that money uses
integer operations which map almost directly to CPU instructions,
whereas numeric needs to decode from our varlena base-10000 digit
format, operate on digits at a time, then encode back. No matter how
much you optimize numeric, it's never going to outperform stuff that
runs practically on bare electrons. As far as I recall, some TPCH
queries run aggregations on currency columns.

Now, whether this makes a measurable difference or not in TPCH terms, I
have no idea.

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#48Jeff Janes
jeff.janes@gmail.com
In reply to: Andres Freund (#6)
Re: Let's drop two obsolete features which are bear-traps for novices

On Sat, Nov 1, 2014 at 10:26 AM, Andres Freund <andres@2ndquadrant.com>
wrote:

On 2014-11-01 10:18:03 -0700, Josh Berkus wrote:

On 10/31/2014 03:07 PM, Tom Lane wrote:

I don't care one way or the other about the money type, but I will

defend

hash indexes, especially seeing that we've already added a pretty
in-your-face warning as of 9.5:

regression=# create table foo(f1 int);
CREATE TABLE
regression=# create index on foo using hash (f1);
WARNING: hash indexes are not WAL-logged and their use is discouraged
CREATE INDEX

Yes, and I'm arguing that is the wrong decision. If hash indexes are
"discouraged", then they shouldn't be in core in the first place.

Last time we discussed it there were people (IIRC Andrew was one of
them) commenting that they use hash indexes *precisely* because they're
not WAL logged and that they can live with the dangers that creates. I
don't think that's sufficient justification for introducing the feature
at all. But it's nothing new that removing a feature has to fit quite
different criteria than adding one.

So, by that argument we could remove hash indexes once we have unlogged
indexes on logged tables. But then there's no need to remove them
anymore...

I would object to removing hash indexes as long as the alternative way to
index oversized value is to write all my SQL to look like:

select count(*) from foo where substr(x,1,2700)=substr($1,1,2700) and x=$1

Now, if the planner were smart enough to realize that x=$1 implies
substr(x,1,2700)=substr($1,1,2700), that might be a different matter. But
it is not.

Or, if there were a way to create a view on foo which would do this
implication automatically, but again as far as I know there is not a way to
do that either.

Cheers,

Jeff

#49Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Alvaro Herrera (#47)
Re: Let's drop two obsolete features which are bear-traps for novices

On 11/3/14, 1:34 PM, Alvaro Herrera wrote:

David Fetter wrote:

On Mon, Nov 03, 2014 at 01:54:09PM +0100, Michael Banck wrote:

As an additional datapoint, Vitesse Data changed the DB schema from
NUMERIC to MONEY for their TPCH benchmark for performance reasons: "The
modification to data types is easy to understand -- money and double
types are faster than Numeric (and no one on this planet has a bank
account that overflows the money type, not any time soon)."[1] And
"Replaced NUMERIC fields representing currency with MONEY"[2].

Not sure whether they modified/optimized PostgreSQL with respect to the
MONEY data type and/or how much performance that gained, so CCing CK Tan
as well.

How does our NUMERIC type's performance compare to other systems'
precise types? I realize that some of those systems might have
restrictions on publishing numbers they don't authorize, but they
might have pushed some authorized numbers if those numbers make them
look good.

There is a real advantage of money over numeric in the performance
front. I haven't measured it, but suffice to say that money uses
integer operations which map almost directly to CPU instructions,
whereas numeric needs to decode from our varlena base-10000 digit
format, operate on digits at a time, then encode back. No matter how
much you optimize numeric, it's never going to outperform stuff that
runs practically on bare electrons. As far as I recall, some TPCH
queries run aggregations on currency columns.

Now, whether this makes a measurable difference or not in TPCH terms, I
have no idea.

The performance of our numeric vs Oracle's was a common complaint when I was at EnterpriseDB (in 2007).

Perhaps numeric's performance could be greatly improved in cases where the precision is low enough to map to an int/bigint. That would get us closer to eliminating money as well as give other uses a big win. Of course, how to do that and not break pg_upgrade is a huge question...
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#50Tatsuo Ishii
ishii@postgresql.org
In reply to: Jim Nasby (#49)
Re: Let's drop two obsolete features which are bear-traps for novices

The performance of our numeric vs Oracle's was a common complaint when
I was at EnterpriseDB (in 2007).

Perhaps numeric's performance could be greatly improved in cases where
the precision is low enough to map to an int/bigint. That would get us
closer to eliminating money as well as give other uses a big win. Of
course, how to do that and not break pg_upgrade is a huge question...

Just out of curiosity, why is Oracle's NUMBER (I assume you are
talking about this) so fast?

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#51Kevin Grittner
kgrittn@ymail.com
In reply to: Alvaro Herrera (#47)
Re: Let's drop two obsolete features which are bear-traps for novices

Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

There is a real advantage of money over numeric in the performance
front. I haven't measured it, but suffice to say that money uses
integer operations which map almost directly to CPU instructions,
whereas numeric needs to decode from our varlena base-10000 digit
format, operate on digits at a time, then encode back. No matter how
much you optimize numeric, it's never going to outperform stuff that
runs practically on bare electrons. As far as I recall, some TPCH
queries run aggregations on currency columns.

Now, whether this makes a measurable difference or not in TPCH terms, I
have no idea.

Best of 3 tries on each statement...

A count(*) as a baseline:

test=# do $$ begin perform count(*) from generate_series(1,10000000); end; $$;
DO
Time: 3260.920 ms

A sum of money:

test=# do $$ begin perform sum('10000.01'::money) from generate_series(1,10000000); end; $$;
DO
Time: 3572.709 ms

A sum of numeric:

test=# do $$ begin perform sum('10000.01'::numeric) from generate_series(1,10000000); end; $$;
DO
Time: 4805.559 ms

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#52David Fetter
david@fetter.org
In reply to: Tatsuo Ishii (#50)
Re: Let's drop two obsolete features which are bear-traps for novices

On Tue, Nov 04, 2014 at 07:51:06AM +0900, Tatsuo Ishii wrote:

The performance of our numeric vs Oracle's was a common complaint when
I was at EnterpriseDB (in 2007).

Perhaps numeric's performance could be greatly improved in cases where
the precision is low enough to map to an int/bigint. That would get us
closer to eliminating money as well as give other uses a big win. Of
course, how to do that and not break pg_upgrade is a huge question...

Just out of curiosity, why is Oracle's NUMBER (I assume you are
talking about this) so fast?

I suspect that what happens is that NUMBER is stored as a native type
(int2, int4, int8, int16) that depends on its size and then cast to
the next upward thing as needed, taking any performance hits at that
point. The documentation hints (38 decimal places) at a 128-bit
internal representation as the maximum. I don't know what happens
when you get past what 128 bits can represent.

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#53Josh Berkus
josh@agliodbs.com
In reply to: Josh Berkus (#1)
Re: Let's drop two obsolete features which are bear-traps for novices

On 11/02/2014 11:41 AM, Tom Lane wrote:

Nothing that I recall at the moment, but there is certainly plenty of
stuff of dubious quality in there. I'd argue that chkpass, intagg,
intarray, isn, spi, and xml2 are all in worse shape than the money type.

Why are we holding on to xml2 again?

FWIW, I'd be fine with moving ISN, intarray and intagg to PGXN. In
fact, I think it would be better for them to be there. And they're not
core types, so there shouldn't be an issue with that.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#54Michael Paquier
michael.paquier@gmail.com
In reply to: Tom Lane (#7)
Re: Let's drop two obsolete features which are bear-traps for novices

On Sun, Nov 2, 2014 at 2:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

In the case of hash indexes, because we still have to have the hash
opclasses in core, there's no way that it could be pushed out as an
extension module even if we otherwise had full support for AMs as
extensions. So what I hear you proposing is "let's break this so
thoroughly that it *can't* be fixed". I'm not on board with that.
I think the WARNING will do just fine to discourage novices who are
not familiar with the state of the hash AM. In the meantime, we
could push forward with the idea of making hash indexes automatically
unlogged, so that recovering from a crash wouldn't be quite so messy/
dangerous.

There is as well another way: finally support WAL-logging for hash indexes.
--
Michael

#55Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#53)
Re: Let's drop two obsolete features which are bear-traps for novices

Josh Berkus <josh@agliodbs.com> writes:

On 11/02/2014 11:41 AM, Tom Lane wrote:

Nothing that I recall at the moment, but there is certainly plenty of
stuff of dubious quality in there. I'd argue that chkpass, intagg,
intarray, isn, spi, and xml2 are all in worse shape than the money type.

Why are we holding on to xml2 again?

IIRC, there's some xpath-related functionality in there that's not
yet available in core (and needs some redesign before it'd ever get
accepted into core, so there's not a real quick fix to be had).

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#56Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#54)
Re: Let's drop two obsolete features which are bear-traps for novices

Michael Paquier <michael.paquier@gmail.com> writes:

There is as well another way: finally support WAL-logging for hash indexes.

Sure, but that's a bit easier said than done. I think Robert looked into
that a year or two back and found some stumbling blocks that it wasn't
obvious how to surmount. I do hope it will happen eventually, but we
should assume that it's a long-term answer not a short-term one.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#57Feng Tian
ftian@vitessedata.com
In reply to: Michael Banck (#44)
Re: Let's drop two obsolete features which are bear-traps for novices

Hi,

This is Feng from Vitesse. Performance different between Money and
Numeric is *HUGE*. For TPCH Q1, the performance difference is 5x for
stock postgres, and ~20x for vitesse.

Stock postgres, for my laptop, TPCH 1G, Q1, use money type ~ 9s, use
Numeric (15, 2) is ~53s.

Kevin,
test=# do $$ begin perform sum('10000.01'::numeric) from
generate_series(1,10000000); end; $$;

This may not reflect the difference of the two data type. One aggregate
is not where most of the time is spent. TPCH Q1 has many more computing.

On Mon, Nov 3, 2014 at 4:54 AM, Michael Banck <michael.banck@credativ.de>
wrote:

Show quoted text

Am Sonntag, den 02.11.2014, 12:41 -0500 schrieb Tom Lane:

BTW, after reflecting a bit more I'm less than convinced that this
datatype is completely useless. Even if you prefer to store currency
values in numeric columns, casting to or from money provides a way to
accept or emit values in whatever monetary format the LC_MONETARY locale
setting specifies. That seems like a useful feature, and it's one you
could not easily duplicate using to_char/to_number (not to mention that
those functions aren't without major shortcomings of their own).

As an additional datapoint, Vitesse Data changed the DB schema from
NUMERIC to MONEY for their TPCH benchmark for performance reasons: "The
modification to data types is easy to understand -- money and double
types are faster than Numeric (and no one on this planet has a bank
account that overflows the money type, not any time soon)."[1] And
"Replaced NUMERIC fields representing currency with MONEY"[2].

Not sure whether they modified/optimized PostgreSQL with respect to the
MONEY data type and/or how much performance that gained, so CCing CK Tan
as well.

Michael

[1]
http://vitesse-timing-on.blogspot.de/2014/10/running-tpch-on-postgresql-part-1.html
[2] http://vitessedata.com/benchmark/

--
Michael Banck
Projektleiter / Berater
Tel.: +49 (2161) 4643-171
Fax: +49 (2161) 4643-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Hohenzollernstr. 133, 41061 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#58Craig Ringer
craig@2ndquadrant.com
In reply to: Tom Lane (#42)
Re: Let's drop two obsolete features which are bear-traps for novices

On 11/03/2014 03:41 AM, Tom Lane wrote:

Nothing that I recall at the moment, but there is certainly plenty of
stuff of dubious quality in there. I'd argue that chkpass, intagg,
intarray, isn, spi, and xml2 are all in worse shape than the money type.

What's wrong with intarray?

I'm aware of issues with the rest (and I think citext has a few of its
own too) but I didn't think intarray had any real issues.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#59CK Tan
cktan@vitessedata.com
In reply to: Feng Tian (#57)
Re: Let's drop two obsolete features which are bear-traps for novices

Josh,

Do you have a list of what needs to be done to keep the MONEY type?
What is wrong with it?

Thanks,
-cktan

On Mon, Nov 3, 2014 at 10:30 PM, Feng Tian <ftian@vitessedata.com> wrote:

Hi,

This is Feng from Vitesse. Performance different between Money and Numeric
is *HUGE*. For TPCH Q1, the performance difference is 5x for stock
postgres, and ~20x for vitesse.

Stock postgres, for my laptop, TPCH 1G, Q1, use money type ~ 9s, use Numeric
(15, 2) is ~53s.

Kevin,
test=# do $$ begin perform sum('10000.01'::numeric) from
generate_series(1,10000000); end; $$;

This may not reflect the difference of the two data type. One aggregate is
not where most of the time is spent. TPCH Q1 has many more computing.

On Mon, Nov 3, 2014 at 4:54 AM, Michael Banck <michael.banck@credativ.de>
wrote:

Am Sonntag, den 02.11.2014, 12:41 -0500 schrieb Tom Lane:

BTW, after reflecting a bit more I'm less than convinced that this
datatype is completely useless. Even if you prefer to store currency
values in numeric columns, casting to or from money provides a way to
accept or emit values in whatever monetary format the LC_MONETARY locale
setting specifies. That seems like a useful feature, and it's one you
could not easily duplicate using to_char/to_number (not to mention that
those functions aren't without major shortcomings of their own).

As an additional datapoint, Vitesse Data changed the DB schema from
NUMERIC to MONEY for their TPCH benchmark for performance reasons: "The
modification to data types is easy to understand -- money and double
types are faster than Numeric (and no one on this planet has a bank
account that overflows the money type, not any time soon)."[1] And
"Replaced NUMERIC fields representing currency with MONEY"[2].

Not sure whether they modified/optimized PostgreSQL with respect to the
MONEY data type and/or how much performance that gained, so CCing CK Tan
as well.

Michael

[1]
http://vitesse-timing-on.blogspot.de/2014/10/running-tpch-on-postgresql-part-1.html
[2] http://vitessedata.com/benchmark/

--
Michael Banck
Projektleiter / Berater
Tel.: +49 (2161) 4643-171
Fax: +49 (2161) 4643-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Hohenzollernstr. 133, 41061 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#60Albe Laurenz
laurenz.albe@wien.gv.at
In reply to: David Fetter (#52)
Re: Let's drop two obsolete features which are bear-traps for novices

David Fetter wrote:

On Tue, Nov 04, 2014 at 07:51:06AM +0900, Tatsuo Ishii wrote:

Just out of curiosity, why is Oracle's NUMBER (I assume you are
talking about this) so fast?

I suspect that what happens is that NUMBER is stored as a native type
(int2, int4, int8, int16) that depends on its size and then cast to
the next upward thing as needed, taking any performance hits at that
point. The documentation hints (38 decimal places) at a 128-bit
internal representation as the maximum. I don't know what happens
when you get past what 128 bits can represent.

No, Oracle stores NUMBERs as variable length field (up to 22 bytes),
where the first byte encodes the sign and the comma position and the
remaining bytes encode the digits, each byte representing two digits
in base-100 notation (see Oracle Metalink note 1007641.6).

So it's not so different from PostgreSQL.
No idea why their arithmetic should be faster.

Yours,
Laurenz Albe

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#61ktm@rice.edu
ktm@rice.edu
In reply to: Michael Paquier (#54)
Re: Let's drop two obsolete features which are bear-traps for novices

On Tue, Nov 04, 2014 at 11:44:22AM +0900, Michael Paquier wrote:

On Sun, Nov 2, 2014 at 2:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

In the case of hash indexes, because we still have to have the hash
opclasses in core, there's no way that it could be pushed out as an
extension module even if we otherwise had full support for AMs as
extensions. So what I hear you proposing is "let's break this so
thoroughly that it *can't* be fixed". I'm not on board with that.
I think the WARNING will do just fine to discourage novices who are
not familiar with the state of the hash AM. In the meantime, we
could push forward with the idea of making hash indexes automatically
unlogged, so that recovering from a crash wouldn't be quite so messy/
dangerous.

There is as well another way: finally support WAL-logging for hash indexes.

+1

Ken

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#62Kevin Grittner
kgrittn@ymail.com
In reply to: Feng Tian (#57)
Re: Let's drop two obsolete features which are bear-traps for novices

Feng Tian <ftian@vitessedata.com> wrote:

Performance different between Money and Numeric is *HUGE*. For
TPCH Q1, the performance difference is 5x for stock postgres, and
~20x for vitesse.

Stock postgres, for my laptop, TPCH 1G, Q1, use money type ~ 9s,
use Numeric (15, 2) is ~53s.

test=# do $$ begin perform sum('10000.01'::numeric) from generate_series(1,10000000); end; $$;

This may not reflect the difference of the two data type. One
aggregate is not where most of the time is spent. TPCH Q1 has
many more computing.

That's why I gave the count(*) baseline. If you subtract that from
the sum() timings for money and numeric, numeric takes 5x as long
as money in my quick-and-dirty benchmark. I have no problem
believing that some applications can see that level of performance
hit from using numeric instead of money. It's a little surprising
to me to see more than a factor of five slowdown from using numeric
instead of money; it leaves me a little curious how that happens.
Perhaps that workload is more conducive to keeping those money
amounts in registers and doing register arithmetic.

In any event, I'm against removing or re-deprecating the money
type. There are some problems with money; there are other problems
with numeric. If the docs are failing to make the trade-offs
clear, we should fix the docs. And we can always look at improving
either or both types. The fact that numeric is 5x to 20x slower
than money in important applications, combined with the fact that
it performs far worse than a similar type in another product,
suggest that numeric could stand a serious optimization pass.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#63Andreas Karlsson
andreas@proxel.se
In reply to: Tom Lane (#40)
Re: Let's drop two obsolete features which are bear-traps for novices

On 11/02/2014 06:41 PM, Tom Lane wrote:

I wrote:

Either way, requiring a dump/reload for upgrade is surely a better answer
for users of the type than just summarily screwing them.

BTW, after reflecting a bit more I'm less than convinced that this
datatype is completely useless. Even if you prefer to store currency
values in numeric columns, casting to or from money provides a way to
accept or emit values in whatever monetary format the LC_MONETARY locale
setting specifies. That seems like a useful feature, and it's one you
could not easily duplicate using to_char/to_number (not to mention that
those functions aren't without major shortcomings of their own).

I personally find that one of the money type's missfeatures, since the
currency symbol does not vary just per language/locale but also for
which currency you are working with. For databases with multiple
currencies the output format of money is just confusing.

Andreas

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#64Merlin Moncure
mmoncure@gmail.com
In reply to: Kevin Grittner (#62)
Re: Let's drop two obsolete features which are bear-traps for novices

On Tue, Nov 4, 2014 at 8:16 AM, Kevin Grittner <kgrittn@ymail.com> wrote:

In any event, I'm against removing or re-deprecating the money
type. There are some problems with money; there are other problems
with numeric. If the docs are failing to make the trade-offs
clear, we should fix the docs. And we can always look at improving
either or both types. The fact that numeric is 5x to 20x slower
than money in important applications, combined with the fact that
it performs far worse than a similar type in another product,
suggest that numeric could stand a serious optimization pass.

Money should stay. It is the only fixed point integer based numeric
that comes with the database. Should a non locale dependent fixed
point type that offers the same performance come along, then you have
a good argument to deprecate.

numeric is definitely a dog which the fundamental problem IMNSHO. I'm
guessing optimization paths are going to revolve around utilizing
binary integer ops in cases where it's known to be safe to do so.

merlin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#65Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#55)
Re: Let's drop two obsolete features which are bear-traps for novices

On 11/03/2014 10:11 PM, Tom Lane wrote:

Josh Berkus<josh@agliodbs.com> writes:

On 11/02/2014 11:41 AM, Tom Lane wrote:

Nothing that I recall at the moment, but there is certainly plenty of
stuff of dubious quality in there. I'd argue that chkpass, intagg,
intarray, isn, spi, and xml2 are all in worse shape than the money type.

Why are we holding on to xml2 again?

IIRC, there's some xpath-related functionality in there that's not
yet available in core (and needs some redesign before it'd ever get
accepted into core, so there's not a real quick fix to be had).

Yes, xpath_table is badly broken, as Robert Haas documented and I
expanded on a while back. See for example
</messages/by-id/4D6BCC1E.3010406@dunslane.net&gt; I
don't have any time of my own to work on this any time soon. But I know
that it has users, so just throwing it out would upset some people.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#66Tom Lane
tgl@sss.pgh.pa.us
In reply to: Craig Ringer (#58)
Re: Let's drop two obsolete features which are bear-traps for novices

Craig Ringer <craig@2ndquadrant.com> writes:

On 11/03/2014 03:41 AM, Tom Lane wrote:

Nothing that I recall at the moment, but there is certainly plenty of
stuff of dubious quality in there. I'd argue that chkpass, intagg,
intarray, isn, spi, and xml2 are all in worse shape than the money type.

What's wrong with intarray?

The single biggest practical problem with it is that it creates new
definitions of the <@ and @> operators, which cause confusion with the
core operators of those names. For example, we've repeatedly seen bug
reports along the lines of "why does this query not use this index"
because of people trying to use the contrib @> operator with a core GIN
index or vice versa. (In fairness, I think intarray might be older
than the core operators, but that doesn't make this less of a problem.)

Another thing that grates on me is the mostly-arbitrary restriction
to non-null-containing arrays. That's an implementation artifact rather
than something semantically required by the operations.

More generally, it seems like a grab bag of not terribly well designed
features, and the features that do seem well designed seem like they
ought to be more generic than just for int4 arrays. So to me it feels
like proof-of-concept experimentation rather than a production-grade
thing that we could feel good about moving into core.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#67Josh Berkus
josh@agliodbs.com
In reply to: Josh Berkus (#1)
Re: Let's drop two obsolete features which are bear-traps for novices

On 11/04/2014 07:33 AM, Tom Lane wrote:

More generally, it seems like a grab bag of not terribly well designed
features, and the features that do seem well designed seem like they
ought to be more generic than just for int4 arrays. So to me it feels
like proof-of-concept experimentation rather than a production-grade
thing that we could feel good about moving into core.

Yah, as a user of intarray, I think it belongs as an external extension.
Heck, I think it might get more attention as an external extension, and
if it were external then folks could fork it and create versions that
don't have operator conflicts.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#68David Fetter
david@fetter.org
In reply to: Albe Laurenz (#60)
Re: Let's drop two obsolete features which are bear-traps for novices

On Tue, Nov 04, 2014 at 08:30:21AM +0000, Laurenz Albe wrote:

David Fetter wrote:

On Tue, Nov 04, 2014 at 07:51:06AM +0900, Tatsuo Ishii wrote:

Just out of curiosity, why is Oracle's NUMBER (I assume you are
talking about this) so fast?

I suspect that what happens is that NUMBER is stored as a native
type (int2, int4, int8, int16) that depends on its size and then
cast to the next upward thing as needed, taking any performance
hits at that point. The documentation hints (38 decimal places)
at a 128-bit internal representation as the maximum. I don't know
what happens when you get past what 128 bits can represent.

No, Oracle stores NUMBERs as variable length field (up to 22 bytes),
where the first byte encodes the sign and the comma position and the
remaining bytes encode the digits, each byte representing two digits
in base-100 notation (see Oracle Metalink note 1007641.6).

Thanks for clearing that up, and sorry for spreading misinformed
guesses.

So it's not so different from PostgreSQL.
No idea why their arithmetic should be faster.

I have an idea, but this time, I think it's right. They have at least
one team of people whose job it is to make sure that it is fast.

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers