hstore improvements?

Started by Andrew Gierthabout 17 years ago30 messageshackers
Jump to latest
#1Andrew Gierth
andrew@tao11.riddles.org.uk

I have a patch almost done that adds some obvious but currently
missing functionality to hstore, specifically the ability to construct
an hstore from a record, and the ability to construct one from a pair
of arrays.

If there's any other features that people find notably missing from
hstore, I could stick them in too; any requests?

Also, hstore has an (undocumented) limit of 65535 bytes for keys and
values, and it does not behave very cleanly when given longer values
(it truncates them mod 2^16, rather than erroring). That gives rise to
two obvious questions: (1) are those lengths reasonable? they strike
me as being rather long for keys and rather short for values; and (2)
should exceeding the lengths throw an error?

--
Andrew (irc:RhodiumToad)

#2Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Andrew Gierth (#1)
Re: hstore improvements?

Andrew Gierth wrote:

I have a patch almost done that adds some obvious but currently
missing functionality to hstore, specifically the ability to construct
an hstore from a record, and the ability to construct one from a pair
of arrays.

If there's any other features that people find notably missing from
hstore, I could stick them in too; any requests?

Optionally compressing the values would be nice ...

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#3Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Alvaro Herrera (#2)
Re: hstore improvements?

Alvaro Herrera wrote:

Andrew Gierth wrote:

I have a patch almost done that adds some obvious but currently
missing functionality to hstore, specifically the ability to construct
an hstore from a record, and the ability to construct one from a pair
of arrays.

If there's any other features that people find notably missing from
hstore, I could stick them in too; any requests?

Optionally compressing the values would be nice ...

The whole Datum will be toasted/compressed if it gets large enough. Is
that not enough?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#4Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Heikki Linnakangas (#3)
Re: hstore improvements?

Heikki Linnakangas wrote:

Alvaro Herrera wrote:

Andrew Gierth wrote:

I have a patch almost done that adds some obvious but currently
missing functionality to hstore, specifically the ability to construct
an hstore from a record, and the ability to construct one from a pair
of arrays.

If there's any other features that people find notably missing from
hstore, I could stick them in too; any requests?

Optionally compressing the values would be nice ...

The whole Datum will be toasted/compressed if it gets large enough. Is
that not enough?

It doesn't always gets large enough, and there's no way to control that.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#5David E. Wheeler
david@kineticode.com
In reply to: Andrew Gierth (#1)
Re: hstore improvements?

On Mar 13, 2009, at 11:06 AM, Andrew Gierth wrote:

If there's any other features that people find notably missing from
hstore, I could stick them in too; any requests?

Can you create slices? That is, create a new hstore as a subset of an
existing hstore?

Also, hstore has an (undocumented) limit of 65535 bytes for keys and
values, and it does not behave very cleanly when given longer values
(it truncates them mod 2^16, rather than erroring). That gives rise to
two obvious questions: (1) are those lengths reasonable? they strike
me as being rather long for keys and rather short for values; and (2)
should exceeding the lengths throw an error?

I agree. The keys can be much shorter without any threat of loss. Can
the value not essentially be TEXT, and thus theoretically unlimited in
size?

Best,

David

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#4)
Re: hstore improvements?

Alvaro Herrera <alvherre@commandprompt.com> writes:

Heikki Linnakangas wrote:

Alvaro Herrera wrote:

Optionally compressing the values would be nice ...

The whole Datum will be toasted/compressed if it gets large enough. Is
that not enough?

It doesn't always gets large enough, and there's no way to control that.

Maybe not, but putting compression into a datatype is NOT NOT NOT the
answer. You only want one layer of compression in a system, and that
means if you want more control you need to speak to the TOAST code.

regards, tom lane

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: David E. Wheeler (#5)
Re: hstore improvements?

"David E. Wheeler" <david@kineticode.com> writes:

On Mar 13, 2009, at 11:06 AM, Andrew Gierth wrote:

Also, hstore has an (undocumented) limit of 65535 bytes for keys and
values, and it does not behave very cleanly when given longer values
(it truncates them mod 2^16, rather than erroring). That gives rise to
two obvious questions: (1) are those lengths reasonable? they strike
me as being rather long for keys and rather short for values; and (2)
should exceeding the lengths throw an error?

I agree. The keys can be much shorter without any threat of loss. Can
the value not essentially be TEXT, and thus theoretically unlimited in
size?

Well, TEXT is limited to 1GB by the toastable-datum rules, as is the
whole hstore datum, so there's no point in worrying about "huge"
values. I agree though that 64K is on the small side for a data limit.
If we wanted to keep the lengths in the same 32 bits they presumably
occupy now, what about splitting 8/24 (=> 255 bytes for key, 24MB for
value)?

As for truncation rather than throwing an error, I'd argue that that's
a flat-out bug and the fix deserves back-patching.

regards, tom lane

#8Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: David E. Wheeler (#5)
Re: hstore improvements?

"David" == "David E Wheeler" <david@kineticode.com> writes:

On Mar 13, 2009, at 11:06 AM, Andrew Gierth wrote:

If there's any other features that people find notably missing
from hstore, I could stick them in too; any requests?

David> Can you create slices? That is, create a new hstore as a
David> subset of an existing hstore?

ooh. good point. What would be a good operator for that?

I'm thinking that (hstore -> text[]) should probably return text[],
and maybe (hstore => text[]) returning hstore?

i.e.

select ('a=>1,b=>2,c=>3'::hstore) -> ARRAY['a','b'];
-- returns '{1,2}'

select ('a=>1,b=>2,c=>3'::hstore) => ARRAY['a','b'];
-- returns 'a=>1,b=>2'

(by analogy to the existing operators -> for lookup and => for
construction)

--
Andrew (irc:RhodiumToad)

#9Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Tom Lane (#7)
Re: hstore improvements?

"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

Tom> As for truncation rather than throwing an error, I'd argue that
Tom> that's a flat-out bug and the fix deserves back-patching.

Separate patch for that part then?

--
Andrew (irc:RhodiumToad)

#10David E. Wheeler
david@kineticode.com
In reply to: Andrew Gierth (#8)
Re: hstore improvements?

On Mar 13, 2009, at 1:21 PM, Andrew Gierth wrote:

I'm thinking that (hstore -> text[]) should probably return
text[],

and maybe (hstore => text[]) returning hstore?

i.e.

select ('a=>1,b=>2,c=>3'::hstore) -> ARRAY['a','b'];
-- returns '{1,2}'

select ('a=>1,b=>2,c=>3'::hstore) => ARRAY['a','b'];
-- returns 'a=>1,b=>2'

(by analogy to the existing operators -> for lookup and => for
construction)

Is a more Perlish syntax out of the question?

SELECT ('a=>1,b=>2,c=>3'::hstore)['a', 'b'];
-- returns '{1,2}'
select ('a=>1,b=>2,c=>3'::hstore){'a','b'};
-- returns 'a=>1,b=>2'

Best,

David

#11Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: David E. Wheeler (#10)
Re: hstore improvements?

"David" == "David E Wheeler" <david@kineticode.com> writes:

select ('a=>1,b=>2,c=>3'::hstore) -> ARRAY['a','b'];
-- returns '{1,2}'

select ('a=>1,b=>2,c=>3'::hstore) => ARRAY['a','b'];
-- returns 'a=>1,b=>2'

(by analogy to the existing operators -> for lookup and => for
construction)

David> Is a more Perlish syntax out of the question?

Yes. Sorry.

David> SELECT ('a=>1,b=>2,c=>3'::hstore)['a', 'b'];
David> -- returns '{1,2}'

That would require integrating hstore into core - array subscripting
isn't a user-definable operation.

David> select ('a=>1,b=>2,c=>3'::hstore){'a','b'};
David> -- returns 'a=>1,b=>2'

And that would require changing the parser...

--
Andrew (irc:RhodiumToad)

#12David E. Wheeler
david@kineticode.com
In reply to: Andrew Gierth (#11)
Re: hstore improvements?

On Mar 13, 2009, at 2:26 PM, Andrew Gierth wrote:

David> Is a more Perlish syntax out of the question?

Yes. Sorry.

David> SELECT ('a=>1,b=>2,c=>3'::hstore)['a', 'b'];
David> -- returns '{1,2}'

That would require integrating hstore into core - array subscripting
isn't a user-definable operation.

David> select ('a=>1,b=>2,c=>3'::hstore){'a','b'};
David> -- returns 'a=>1,b=>2'

And that would require changing the parser...

How functionS, then?

SELECT slice(('a=>1,b=>2,c=>3'::hstore), ARRAY['a', 'b']);
-- returns '{1,2}'

SELECT hslice(('a=>1,b=>2,c=>3'::hstore), ARRAY['a','b']);
-- returns 'a=>1,b=>2'

Better names welcome, of course. But there isn't even a slice
interface for array, is there?

SELECT slice(ARRAY[ 'a', 'b', 'c' ], ARRAY[1, 3]);
-- returns '{a,c}'

Best,

David

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: David E. Wheeler (#10)
Re: hstore improvements?

"David E. Wheeler" <david@kineticode.com> writes:

Is a more Perlish syntax out of the question?

Yes. SQL is not Perl.

regards, tom lane

#14David E. Wheeler
david@kineticode.com
In reply to: Tom Lane (#13)
Re: hstore improvements?

On Mar 13, 2009, at 2:35 PM, Tom Lane wrote:

Is a more Perlish syntax out of the question?

Yes. SQL is not Perl.

You mean all this time I thought I was writing Perl when I was using
PostgreSQL, and it turns out that it's *not* Perl? That explains the
strange lack of sigils.

Thanks for setting me straight, Tom.

:-P

D

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#7)
Re: hstore improvements?

I wrote:

If we wanted to keep the lengths in the same 32 bits they presumably
occupy now, what about splitting 8/24 (=> 255 bytes for key, 24MB for
value)?

Sigh, fingers faster than brain today. A 24-bit length field could
represent lengths up to 16MB, not 24MB. Still, it seems like a
reasonable maximum.

Or we could increase the size of hstore values so as to provide more
than 32 bits total for this, but that would presumably be pessimal for
all existing applications; there is evidently no one using more than
64K, or we'd have heard complaints before.

regards, tom lane

#16Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#15)
Re: hstore improvements?

Tom Lane wrote:

Or we could increase the size of hstore values so as to provide more
than 32 bits total for this, but that would presumably be pessimal for
all existing applications; there is evidently no one using more than
64K, or we'd have heard complaints before.

Would it work to allow storing toast pointers for values?

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#16)
Re: hstore improvements?

Alvaro Herrera <alvherre@commandprompt.com> writes:

Tom Lane wrote:

Or we could increase the size of hstore values so as to provide more
than 32 bits total for this, but that would presumably be pessimal for
all existing applications; there is evidently no one using more than
64K, or we'd have heard complaints before.

Would it work to allow storing toast pointers for values?

Given that there is nobody using this for values > 64K, that seems like
far too much complication. (Hint: how you gonna vacuum toast pointers
embedded within datums? Especially within a datatype that isn't even
known to the core code?)

regards, tom lane

#18Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Tom Lane (#15)
Re: hstore improvements?

On Mar 13, 2009, at 4:47 PM, Tom Lane wrote:

Or we could increase the size of hstore values so as to provide more
than 32 bits total for this, but that would presumably be pessimal for
all existing applications; there is evidently no one using more than
64K, or we'd have heard complaints before.

Unless they haven't realized that we've been pulling a MySQL and
silently truncating their data. :(

On another point, I agree that compression would be nice, and the way
to fix that is to expose knobs for controlling TOAST thresholds
(something I've wanted forever).
--
Decibel!, aka Jim C. Nasby, Database Architect decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828

#19Josh Berkus
josh@agliodbs.com
In reply to: Tom Lane (#15)
Re: hstore improvements?

Tom Lane wrote:

I wrote:

If we wanted to keep the lengths in the same 32 bits they presumably
occupy now, what about splitting 8/24 (=> 255 bytes for key, 24MB for
value)?

Sigh, fingers faster than brain today. A 24-bit length field could
represent lengths up to 16MB, not 24MB. Still, it seems like a
reasonable maximum.

Or we could increase the size of hstore values so as to provide more
than 32 bits total for this, but that would presumably be pessimal for
all existing applications; there is evidently no one using more than
64K, or we'd have heard complaints before.

Yeah, I have to say that it would never have occurred to me to use
hstore for large values like that; 64K is pretty much a whole page of
text. If you need to store that much data, use a real table. Or maybe
CouchDB.

As an hstore user, I'd be fine with simply limiting it to 64K (or, heck,
8K) and throwing an error. I'd also be fine with limiting keys to 255
bytes, although we'd have to warn people.

--Josh

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#19)
Re: hstore improvements?

Josh Berkus <josh@agliodbs.com> writes:

As an hstore user, I'd be fine with simply limiting it to 64K (or, heck,
8K) and throwing an error. I'd also be fine with limiting keys to 255
bytes, although we'd have to warn people.

Yeah, 255 might well be more of a problem than the other limit. We
could move to something like 10/22 or 12/20 split, which would give
us 1KB/4MB or 4KB/1MB limits.

regards, tom lane

#21Josh Berkus
josh@agliodbs.com
In reply to: Tom Lane (#20)
#22Ron Mayer
rm_pg@cheapcomplexdevices.com
In reply to: Andrew Gierth (#1)
#23Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Ron Mayer (#22)
#24Josh Berkus
josh@agliodbs.com
In reply to: Ron Mayer (#22)
#25Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Andrew Gierth (#1)
#26Josh Berkus
josh@agliodbs.com
In reply to: Andrew Gierth (#25)
#27Josh Berkus
josh@agliodbs.com
In reply to: Josh Berkus (#26)
#28Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Josh Berkus (#26)
#29Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Josh Berkus (#21)
#30Josh Berkus
josh@agliodbs.com
In reply to: Dimitri Fontaine (#28)