sortsupport for text

Started by Robert Haas, about 14 years ago, 75 messages, pgsql-hackers
#1 Robert Haas
robertmhaas@gmail.com

I decided to investigate the possible virtues of allowing "text" to
use the sortsupport infrastructure, since strings are something people
often want to sort. I generated 100,000 random alphanumeric strings,
each 30 characters in length, and loaded them into a single-column
table, froze it, ran SELECT SUM(1) FROM tablename on it until it was
fully cached, and then repeatedly quicksorted the table contents using
my default locale (en_US.UTF8). I repeated this test a number of
times, removing and recreating the data directory via initdb each
time. The test was performed on my home desktop, which is running
Fedora 14 (yeah, I know I should reinstall) and equipped with an AMD
Athlon 5000 Dual-Core Processor. Here's the exact test query:

SELECT SUM(1) FROM (SELECT * FROM randomtext ORDER BY t) x;

On unpatched master, this takes about 416 ms (plus or minus a few).
With the attached patch, it takes about 389 ms (plus or minus a very
few), a speedup of about 7%.

I repeated the experiment using the C locale, like this:

SELECT SUM(1) FROM (SELECT * FROM randomtext ORDER BY t COLLATE "C") x;

Here, it takes about 202 ms with the patch, and about 231 ms on
unpatched master, a savings of about 13%.
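The gap between the two locales comes down to what strcoll() has to do: in the "C" locale it is required to order byte-wise, i.e. to agree with strcmp(), which is why a C-collation sort can skip the locale machinery entirely. A standalone sketch of that property (illustrative C, not part of the patch):

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

/* Sign of an int, so comparator results can be compared portably. */
static int sign(int x) { return (x > 0) - (x < 0); }

/*
 * In the "C" locale the collation order is plain byte order, so
 * strcoll() must agree with strcmp().  This is the property that lets
 * a C-collation sort bypass strcoll_l() and its setup costs.
 */
static int c_locale_agrees(const char *a, const char *b)
{
    setlocale(LC_COLLATE, "C");
    return sign(strcoll(a, b)) == sign(strcmp(a, b));
}
```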

I also tried on a larger data set of 5 million strings, with a heap
sort using work_mem=256MB. Unfortunately, there was so much noise
there that it was hard to get any useful measurements: the exact same
code base, using the exact same test script (that started with an
initdb) could perform quite differently on consecutive runs, perhaps
because the choice of blocks chosen to contain the database itself
affected the efficiency of reading and writing temporary files. I
think it may be faster, and the results on the smaller data set argue
that it should be faster, but I was unable to gather reproducible
numbers. I did observe the following oprofile results on a run on
this larger data set, on master:

12789  28.2686  libc-2.13.so  strcoll_l
6802   15.0350  postgres      text_cmp
5081   11.2310  postgres      comparetup_heap
3510    7.7584  postgres      comparison_shim
2892    6.3924  postgres      lc_collate_is_c
2722    6.0167  no-vmlinux    /no-vmlinux
2596    5.7382  postgres      varstr_cmp
2517    5.5635  libc-2.13.so  __strlen_sse2
2515    5.5591  libc-2.13.so  __memcpy_sse2
968     2.1397  postgres      tuplesort_heap_siftup
710     1.5694  postgres      bttextcmp
664     1.4677  postgres      pg_detoast_datum_packed

Clearly, a lot of that is unnecessary. Doing lc_collate_is_c for
every tuple is a complete waste, as is translating the collation OID
to a locale_t; this patch arranges to do those things just once per
sort. The comparison_shim is also a waste. Considering all that, I
had hoped for more like a 15-20% gain from this approach, but it
didn't happen, I suppose because some of the instructions saved just
resulted in more processor stalls. All the same, I'm inclined to
think it's still worth doing.

I didn't attempt to handle the weirdness that is UTF-8 on Windows,
since I don't develop on Windows. I thought when I wrote this code
that I could just leave the comparator uninitialized and let the
caller default to the shim implementation if the sort-support function
didn't do anything. But I see now that
PrepareSortSupportFromOrderingOp() feels that it's entitled to assume
that the sort-support function will always fill in a comparator.
Either that assumption needs to be changed, or the corresponding
Windows code needs to be written, or the sort support function needs
to call PrepareSortSupportComparisonShim() in this case.
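The fallback arrangement described above can be sketched with a simplified stand-in for the real API; SortSupportMock and the function names below are illustrative inventions, not PostgreSQL's actual definitions, which live in utils/sortsupport.h:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a sort-support state struct. */
typedef struct SortSupportMock
{
    int (*comparator)(int a, int b);   /* fast type-specific comparator */
} SortSupportMock;

/* Generic fallback, standing in for the fmgr "comparison shim". */
static int shim_comparator(int a, int b) { return (a > b) - (a < b); }

/* Fast path a type-specific sort-support function might install. */
static int fast_comparator(int a, int b) { return a - b; }

/*
 * Type-specific setup: may decline to install a comparator (as the
 * text patch might on Windows/UTF-8), leaving the field untouched.
 */
static void type_prepare(SortSupportMock *ssup, int supported)
{
    if (supported)
        ssup->comparator = fast_comparator;
}

/*
 * Caller-side preparation.  The open question in the mail is whether
 * this layer should tolerate a NULL comparator and install the shim
 * itself, as sketched here, or insist the type always fills it in.
 */
static void prepare_sort_support(SortSupportMock *ssup, int supported)
{
    ssup->comparator = NULL;
    type_prepare(ssup, supported);
    if (ssup->comparator == NULL)
        ssup->comparator = shim_comparator;  /* default to the shim */
}
```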

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

sortsupport-text-v1.patch (application/octet-stream, +203 -10)
#2 Noah Misch
noah@leadboat.com
In reply to: Robert Haas (#1)
Re: sortsupport for text

On Fri, Mar 02, 2012 at 03:45:38PM -0500, Robert Haas wrote:

SELECT SUM(1) FROM (SELECT * FROM randomtext ORDER BY t) x;

On unpatched master, this takes about 416 ms (plus or minus a few).
With the attached patch, it takes about 389 ms (plus or minus a very
few), a speedup of about 7%.

I repeated the experiment using the C locale, like this:

SELECT SUM(1) FROM (SELECT * FROM randomtext ORDER BY t COLLATE "C") x;

Here, it takes about 202 ms with the patch, and about 231 ms on
unpatched master, a savings of about 13%.

[oprofile report, further discussion]

Thanks for looking into this. Your patch is also a nice demonstration of
sortsupport's ability to help with more than just fmgr overhead.

Considering all that, I
had hoped for more like a 15-20% gain from this approach, but it
didn't happen, I suppose because some of the instructions saved just
resulted in more processor stalls. All the same, I'm inclined to
think it's still worth doing.

This is a border case, but I suggest that a 13% speedup on a narrowly-tailored
benchmark, degrading to 7% in common configurations, is too meager to justify
adopting this patch.

nm

#3 Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Noah Misch (#2)
Re: sortsupport for text

Noah Misch <noah@leadboat.com> wrote:

On Fri, Mar 02, 2012 at 03:45:38PM -0500, Robert Haas wrote:

SELECT SUM(1) FROM (SELECT * FROM randomtext ORDER BY t) x;

[13% faster with patch for C collation; 7% faster for UTF8]

I had hoped for more like a 15-20% gain from this approach, but
it didn't happen, I suppose because some of the instructions
saved just resulted in more processor stalls. All the same, I'm
inclined to think it's still worth doing.

This is a border case, but I suggest that a 13% speedup on a
narrowly-tailored benchmark, degrading to 7% in common
configurations, is too meager to justify adopting this patch.

We use the C collation and have character strings in most indexes
and ORDER BY clauses. Unless there are significant
contraindications, I'm in favor of adopting this patch.

-Kevin

#4 Greg Stark
stark@mit.edu
In reply to: Robert Haas (#1)
Re: sortsupport for text

On Fri, Mar 2, 2012 at 8:45 PM, Robert Haas <robertmhaas@gmail.com> wrote:

12789    28.2686  libc-2.13.so             strcoll_l
6802     15.0350  postgres                 text_cmp

I'm still curious how it would compare to call strxfrm and sort the
resulting binary blobs. I don't think the sortsupport stuff actually
makes this any easier though. Since using it requires storing the
binary blob somewhere I think the support would have to be baked into
tuplesort (or hacked into the sortkey as an expr that was evaluated
earlier somehow).

It's a tradeoff and not an obvious one. The binary blobs are larger
and it would mean reading and copying more data around memory. But it
would mean doing the work that strcoll_l does only n times instead of
nlogn times. That might be a pretty significant gain.

--
greg

#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Greg Stark (#4)
Re: sortsupport for text

Greg Stark <stark@mit.edu> writes:

I'm still curious how it would compare to call strxfrm and sort the
resulting binary blobs.

In principle that should be a win; it's hard to believe that strxfrm
would have gotten into the standards if it were not a win for sorting
applications.

I don't think the sortsupport stuff actually
makes this any easier though. Since using it requires storing the
binary blob somewhere I think the support would have to be baked into
tuplesort (or hacked into the sortkey as an expr that was evaluated
earlier somehow).

Well, obviously something has to be done, but I think it might be
possible to express this as another sortsupport API function rather than
doing anything as ugly as hardwiring strxfrm into the callers.

However, it occurred to me that we could pretty easily jury-rig
something that would give us an idea about the actual benefit available
here. To wit: make a C function that wraps strxfrm, basically
strxfrm(text) returns bytea. Then compare the performance of
ORDER BY text_col to ORDER BY strxfrm(text_col).

(You would need to have either both or neither of text and bytea
using the sortsupport code paths for this to be a fair comparison.)
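The property Tom's jury-rig relies on is the defining guarantee of strxfrm(): comparing two transformed strings with strcmp()/memcmp() orders them the same way strcoll() orders the originals, so a bytea wrapper would let the existing binary comparator do collation-aware ordering. A minimal standalone check (illustrative C, with fixed-size buffers for brevity):

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

static int sign(int x) { return (x > 0) - (x < 0); }

/*
 * Verify the contract: strcmp() on strxfrm() output must give the
 * same ordering as strcoll() on the original strings, in whatever
 * the current LC_COLLATE locale is.
 */
static int xfrm_order_matches_strcoll(const char *a, const char *b)
{
    char ta[256], tb[256];

    /* Assume the transformed strings fit; real code must check the
     * return value against the buffer size (see strxfrm(3)). */
    strxfrm(ta, a, sizeof ta);
    strxfrm(tb, b, sizeof tb);
    return sign(strcmp(ta, tb)) == sign(strcoll(a, b));
}
```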

One other thing I've always wondered about in this connection is the
general performance of sorting toasted datums. Is it better to detoast
them in every comparison, or pre-detoast to save comparison cycles at
the cost of having to push much more data around? I didn't see any
discussion of this point in Robert's benchmarks, but I don't think we
should go very far towards enabling sortsupport for text until we
understand the issue and know whether we need to add more infrastructure
for it. If you cross your eyes a little bit, this is very much like
the strxfrm question...

regards, tom lane

#6 Robert Haas
robertmhaas@gmail.com
In reply to: Greg Stark (#4)
Re: sortsupport for text

On Sat, Mar 17, 2012 at 6:58 PM, Greg Stark <stark@mit.edu> wrote:

On Fri, Mar 2, 2012 at 8:45 PM, Robert Haas <robertmhaas@gmail.com> wrote:

12789    28.2686  libc-2.13.so             strcoll_l
6802     15.0350  postgres                 text_cmp

I'm still curious how it would compare to call strxfrm and sort the
resulting binary blobs. I don't think the sortsupport stuff actually
makes this any easier though. Since using it requires storing the
binary blob somewhere I think the support would have to be baked into
tuplesort (or hacked into the sortkey as an expr that was evaluated
earlier somehow).

Well, the real problem here is that the strxfrm'd representations
aren't just bigger - they are huge. On my MacBook Pro, if the input
representation is n characters, the strxfrm'd representation is 9n+3
characters. If we're sorting very wide tuples of which the sort
key is only a small part, maybe that would be acceptable, but if the
sort key makes up the bulk of the tuple then caching the strxfrm()
representation works out to slashing work_mem tenfold. That might be
just fine if the sort is going to fit in work_mem either way, but if
it turns a quicksort into a heap sort then I feel pretty confident
that it's going to be a loser. Keep in mind that even if the
strxfrm'd representation were no larger at all, it would still amount
to an additional copy of the data, so you'd still potentially be
eating up lots of work_mem that way. The fact that it's an order of
magnitude larger is just making a bad problem worse.
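The expansion factor is easy to probe on any given platform, since strxfrm() called with a size of 0 simply reports the space it would need without writing anything; a small helper (illustrative, results vary by libc and locale):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Bytes needed for the strxfrm'd form of s in the current LC_COLLATE
 * locale, excluding the terminating NUL.  Per C99, when the size
 * argument is 0 the destination may be NULL and strxfrm() just
 * returns the required length; dividing by strlen(s) gives the
 * platform's expansion factor (about 9n+3 on the Mac in question,
 * roughly 4n on the glibc Greg mentions below).
 */
static size_t xfrm_len(const char *s)
{
    return strxfrm(NULL, s, 0);
}
```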

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#7 Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#5)
Re: sortsupport for text

On Sun, Mar 18, 2012 at 11:08 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

However, it occurred to me that we could pretty easily jury-rig
something that would give us an idea about the actual benefit available
here.  To wit: make a C function that wraps strxfrm, basically
strxfrm(text) returns bytea.  Then compare the performance of
ORDER BY text_col to ORDER BY strxfrm(text_col).

(You would need to have either both or neither of text and bytea
using the sortsupport code paths for this to be a fair comparison.)

Since the index will be ~9x bigger at least on this machine, I think I
know the answer, but I suppose it doesn't hurt to test it. It's not
that much work.

One other thing I've always wondered about in this connection is the
general performance of sorting toasted datums.  Is it better to detoast
them in every comparison, or pre-detoast to save comparison cycles at
the cost of having to push much more data around?  I didn't see any
discussion of this point in Robert's benchmarks, but I don't think we
should go very far towards enabling sortsupport for text until we
understand the issue and know whether we need to add more infrastructure
for it.  If you cross your eyes a little bit, this is very much like
the strxfrm question...

It would be surprising to me if there is a one-size-fits-all answer to
this question. For example, if the tuple got toasted because it's got
lots of columns and we had to take desperate measures to get it to fit
into an 8kB block at all, chances are that detoasting will work out
well. We'll use a bit more memory, but hopefully that'll be repaid by
much faster comparisons. OTOH, if you have a data set with many
relatively short strings and a few very long ones, detoasting up-front
could turn a quicksort into a heapsort. Since only a small fraction
of the comparisons would have involved one of the problematic long
strings anyway, it's unlikely to be worth the expense of keeping those
strings around in detoasted form for the entire sort (unless maybe
reconstructing them even a few times is problematic because we're
under heavy cache pressure and we get lots of disk seeks as a result).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#8 Martijn van Oosterhout
kleptog@svana.org
In reply to: Robert Haas (#6)
Re: sortsupport for text

On Mon, Mar 19, 2012 at 12:19:53PM -0400, Robert Haas wrote:

On Sat, Mar 17, 2012 at 6:58 PM, Greg Stark <stark@mit.edu> wrote:

On Fri, Mar 2, 2012 at 8:45 PM, Robert Haas <robertmhaas@gmail.com> wrote:

12789    28.2686  libc-2.13.so             strcoll_l
6802     15.0350  postgres                 text_cmp

I'm still curious how it would compare to call strxfrm and sort the
resulting binary blobs. I don't think the sortsupport stuff actually
makes this any easier though. Since using it requires storing the
binary blob somewhere I think the support would have to be baked into
tuplesort (or hacked into the sortkey as an expr that was evaluated
earlier somehow).

Well, the real problem here is that the strxfrm'd representations
aren't just bigger - they are huge. On my MacBook Pro, if the input
representation is n characters, the strxfrm'd representation is 9n+3
characters.

Ouch. I was holding out hope that you could get a meaningful
improvement if we could use the first X bytes of the strxfrm output so
you only need to do a strcoll on strings that actually nearly match.
But with an information density of 9 bytes per character it
doesn't seem worthwhile.

That and this gem in the strxfrm manpage:

RETURN VALUE
The strxfrm() function returns the number of bytes required to
store the transformed string in dest excluding the terminating
'\0' character. If the value returned is n or more, the
contents of dest are indeterminate.

Which means that you have to take the entire transformed string, you
can't just ask for the first bit. I think that kind of leaves the whole
idea dead in the water.

Just for interest I looked at the ICU API for this and they have the
same restriction. There is another function which you can use to
return partial sort keys (ucol_nextSortKeyPart) but it produces
"uncompressed sortkeys", which it seems is what Mac OSX is doing, which
seems useless for our purposes. Either this is a hard problem or we're
nowhere near a target use case.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

He who writes carelessly confesses thereby at the very outset that he does
not attach much importance to his own thoughts.

-- Arthur Schopenhauer

#9 Greg Stark
stark@mit.edu
In reply to: Martijn van Oosterhout (#8)
Re: sortsupport for text

On Mon, Mar 19, 2012 at 9:23 PM, Martijn van Oosterhout
<kleptog@svana.org> wrote:

Ouch. I was holding out hope that you could get a meaningful
improvement if we could use the first X bytes of the strxfrm output so
you only need to do a strcoll on strings that actually nearly match.
But with an information density of 9 bytes per character it
doesn't seem worthwhile.

When I was playing with glibc it was 4n. I think what they do is have
n bytes for the high order bits, then n bytes for low order bits like
capitalization or whitespace differences. I suspect they used to use
16 bits for each and have gone to some larger size.

That and this gem in the strxfrm manpage:

RETURN VALUE
      The  strxfrm()  function returns the number of bytes required to
      store the transformed string in dest excluding the terminating
      '\0' character.  If the value returned is n or more, the
      contents of dest are indeterminate.

Which means that you have to take the entire transformed string, you
can't just ask for the first bit. I think that kind of leaves the whole
idea dead in the water.

I believe the intended API is that you allocate a buffer with your
guess of the right size, call strxfrm and if it returns a larger
number you realloc your buffer and call it again.
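That guess-then-retry protocol might look like this in plain C (an illustrative sketch, not from any patch; xfrm_dup is an invented name):

```c
#include <stdlib.h>
#include <string.h>

/*
 * Transform src with strxfrm() into a freshly malloc'd buffer, using
 * the guess-then-retry protocol: if strxfrm() reports that more space
 * was needed than we offered, grow the buffer to the reported size
 * and call it again.  Returns NULL on allocation failure; the caller
 * frees the result.
 */
static char *xfrm_dup(const char *src, size_t guess)
{
    char *buf = malloc(guess);
    size_t needed;

    if (buf == NULL)
        return NULL;
    needed = strxfrm(buf, src, guess);
    if (needed >= guess)        /* didn't fit: contents indeterminate */
    {
        char *newbuf = realloc(buf, needed + 1);

        if (newbuf == NULL)
        {
            free(buf);
            return NULL;
        }
        buf = newbuf;
        (void) strxfrm(buf, src, needed + 1);   /* guaranteed to fit now */
    }
    return buf;
}
```

Note that the return value of the first, too-small call is still valid even though the buffer contents are not, which is what makes the single-retry version sufficient.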

--
greg

#10 Peter Geoghegan
peter@2ndquadrant.com
In reply to: Tom Lane (#5)
Re: sortsupport for text

On 18 March 2012 15:08, Tom Lane <tgl@sss.pgh.pa.us> wrote:

One other thing I've always wondered about in this connection is the
general performance of sorting toasted datums.  Is it better to detoast
them in every comparison, or pre-detoast to save comparison cycles at
the cost of having to push much more data around?  I didn't see any
discussion of this point in Robert's benchmarks, but I don't think we
should go very far towards enabling sortsupport for text until we
understand the issue and know whether we need to add more infrastructure
for it.  If you cross your eyes a little bit, this is very much like
the strxfrm question...

I see the parallels. I note that glibc's strcoll_l() is implemented
entirely in C (strcoll() itself is implemented in terms of
strcoll_l()), whereas the various strcmp.S are written in hand-optimized
assembler, with SSE3 instructions in the "Highly optimized version for
x86-64", for example. I wonder just how important a factor that is. I
suppose the reason why the glibc guys haven't just done something
equivalent internally might be that they much prefer to perform the
comparison in-place, due to the need to target a conservative lowest
common denominator...or it could be because it just doesn't matter
that much.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#11 Robert Haas
robertmhaas@gmail.com
In reply to: Peter Geoghegan (#10)
Re: sortsupport for text

On Thu, Jun 14, 2012 at 11:36 AM, Peter Geoghegan <peter@2ndquadrant.com> wrote:

On 18 March 2012 15:08, Tom Lane <tgl@sss.pgh.pa.us> wrote:

One other thing I've always wondered about in this connection is the
general performance of sorting toasted datums.  Is it better to detoast
them in every comparison, or pre-detoast to save comparison cycles at
the cost of having to push much more data around?  I didn't see any
discussion of this point in Robert's benchmarks, but I don't think we
should go very far towards enabling sortsupport for text until we
understand the issue and know whether we need to add more infrastructure
for it.  If you cross your eyes a little bit, this is very much like
the strxfrm question...

I see the parallels.

The problem with pre-detoasting to save comparison cycles is that you
can now fit many, many fewer tuples in work_mem. There might be cases
where it wins (for example, because the entire data set fits even
after decompressing everything) but in most cases it seems like a
loser.

Also, my guess is that most values people sort by are pretty short,
making this concern mostly academic. Suppose you are sorting a bunch
of strings which might be either 100 characters in length or 1MB. If
they're all 100 characters, you probably don't need to detoast. If
they're all 1MB, you probably can't detoast without eating up a ton of
memory (and even if you have it, this might not be the best use for
it). If you have a mix, detoasting might be affordable provided that
the percentage of long strings is small, but it's also not going to
save you much, because if the percentage of long strings is small,
then most comparisons will be between two short strings where we don't
save anything anyway.

All things considered, this seems to me to be aiming at a pretty
narrow target, but maybe I'm just not thinking about it creatively
enough.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#12 Peter Geoghegan
peter@2ndquadrant.com
In reply to: Robert Haas (#11)
Re: sortsupport for text

On 14 June 2012 17:35, Robert Haas <robertmhaas@gmail.com> wrote:

The problem with pre-detoasting to save comparison cycles is that you
can now fit many, many fewer tuples in work_mem.  There might be cases
where it wins (for example, because the entire data set fits even
after decompressing everything) but in most cases it seems like a
loser.

If I had to guess, I'd say you're probably right about that -
optimising sorting toasted text doesn't seem like a terribly sensible
use of your time.

What about the strxfrm suggestion of Greg's? You might find that the
added benefit of being able to avail of a highly optimised strcmp()
tipped the balance in favour of that idea, beyond the simple fact that
there's only a linear number of what you might loosely call "strcoll_l
units of work" rather than as many as O(n ^ 2). Furthermore, I'd
speculate that if you were to interlace the strxfrm() calls with
copying each text string, somewhere like within a specialised
datumCopy(), that would make the approach more efficient still, as you
specify a location for the blob in the just-palloc()'d leading-key
private memory directly, rather than just using memcpy.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#13 Peter Geoghegan
peter@2ndquadrant.com
In reply to: Robert Haas (#1)
Re: sortsupport for text

On 2 March 2012 20:45, Robert Haas <robertmhaas@gmail.com> wrote:

I decided to investigate the possible virtues of allowing "text" to
use the sortsupport infrastructure, since strings are something people
often want to sort.

I should mention up-front that I agree with the idea that it is worth
optimising text sorting because it is a very common thing to have to
do, and therefore the standard for inclusion ought to be lower. I
don't intend to talk about tapesort though - that isn't really fair,
not least because I have some serious doubts about the quality of our
implementation. Furthermore, I think that it is logical that doing
things like resolving collations occur within a preparatory function
in advance of sorting, rather than redundantly doing that for each and
every comparison.

Why have you made the reusable buffer managed by sortsupport
TEXTBUFLEN-aligned? The existing rationale for that constant (whose
value is 1024) does not seem to carry forward here:

* This should be large enough that most strings will fit, but small
* enough that we feel comfortable putting it on the stack.

ISTM it would be on average worth the hit of having to repalloc a few
more times for larger strings by making that buffer much smaller
initially, and doubling its size each time that proved insufficient,
rather than increasing its size to the smallest possible
TEXTBUFLEN-aligned size that you can get away with for the immediately
subsequent memcpy. Realistically, any database I've ever worked with
would probably be able to fit a large majority of its text strings
into 16 chars of memory - you yourself said that sorting toasted text
isn't at all common.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#14 Robert Haas
robertmhaas@gmail.com
In reply to: Peter Geoghegan (#12)
Re: sortsupport for text

On Thu, Jun 14, 2012 at 1:56 PM, Peter Geoghegan <peter@2ndquadrant.com> wrote:

On 14 June 2012 17:35, Robert Haas <robertmhaas@gmail.com> wrote:

The problem with pre-detoasting to save comparison cycles is that you
can now fit many, many fewer tuples in work_mem.  There might be cases
where it wins (for example, because the entire data set fits even
after decompressing everything) but in most cases it seems like a
loser.

If I had to guess, I'd say you're probably right about that -
optimising sorting toasted text doesn't seem like a terribly sensible
use of your time.

What about the strxfrm suggestion of Greg's? You might find that the
added benefit of being able to avail of a highly optimised strcmp()
tipped the balance in favour of that idea, beyond the simple fact that
there's only a linear number of what you might loosely call "strcoll_l
units of work" rather than as many as O(n ^ 2). Furthermore, I'd
speculate that if you were to interlace the strxfrm() calls with
copying each text string, somewhere like within a specialised
datumCopy(), that would make the approach more efficient still, as you
specify a location for the blob in the just-palloc()'d leading-key
private memory directly, rather than just using memcpy.

Well, it's still got the problem of blowing up memory usage. I just
can't get excited about optimizing for the case where we can consume
10x the memory and still fit in work_mem. If we've got that case, the
sort is gonna be pretty fast anyway. The case where preprocessing
wins is when there are going to be a large number of comparisons
against each tuple - i.e. lg(N) is large. But the cases where we
could pre-transform with strxfrm are those where the data fits in a
small percentage of work mem - i.e. lg(N) is small. I'm open to
somebody showing up with a test result that demonstrates that it's
worthwhile, but to me it seems like it's chasing diminishing returns.

The point of this patch isn't really to improve things for the
collation-aware case, although it's nice that it does. The point is
rather to shave off a double-digit percentage off the time it takes to
do the sort required to build a C-collation index, which is what
people should be using when they don't care about < and >, which most
don't. Despite Tom's concerns, I don't think there's anything in this
patch that can't be fairly easily revised at a later date if we decide
we want a different API. I think it's worth picking the low-hanging
fruit in the meantime.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#15 Robert Haas
robertmhaas@gmail.com
In reply to: Peter Geoghegan (#13)
Re: sortsupport for text

On Thu, Jun 14, 2012 at 2:10 PM, Peter Geoghegan <peter@2ndquadrant.com> wrote:

Why have you made the reusable buffer managed by sortsupport
TEXTBUFLEN-aligned? The existing rationale for that constant (whose
value is 1024) does not seem to carry forward here:

 * This should be large enough that most strings will fit, but small
 * enough that we feel comfortable putting it on the stack.

Well, as the comments explain:

+       /*
+        * We avoid repeated palloc/pfree on long strings by keeping the buffers
+        * we allocate around for the duration of the sort.  When we expand them,
+        * we round off to the next multiple of TEXTBUFLEN in order to avoid
+        * repeatedly expanding them by very small amounts.
+        */

ISTM it would be on average worth the hit of having to repalloc a few
more times for larger strings by making that buffer much smaller
initially, and doubling its size each time that proved insufficient,
rather than increasing its size to the smallest possible
TEXTBUFLEN-aligned size that you can get away with for the immediately
subsequent memcpy. Realistically, any database I've ever worked with
would probably be able to fit a large majority of its text strings
into 16 chars of memory - you yourself said that sorting toasted text
isn't at all common.

I thought that doubling repeatedly would be overly aggressive in terms
of memory usage. Blowing the buffers out to 8kB because we hit a
string that's a bit over 4kB isn't so bad, but blowing them out to
256MB because we hit a string that's a bit over 128MB seems a bit
excessive.
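The round-off policy being defended here can be sketched as follows (illustrative, using TEXTBUFLEN's documented value of 1024): a string a bit over 128MB grows the buffer to the next 1kB boundary above it, not to 256MB.

```c
#include <stddef.h>

#define TEXTBUFLEN 1024

/*
 * Round a requested buffer size up to the next multiple of TEXTBUFLEN,
 * i.e. grow by the smallest aligned step that accommodates the string,
 * rather than doubling.
 */
static size_t round_up_textbuflen(size_t needed)
{
    return ((needed + TEXTBUFLEN - 1) / TEXTBUFLEN) * TEXTBUFLEN;
}
```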

Also, I don't think it really saves anything from a performance point
of view. The worst case for this is if we're asked to repeatedly
compare strings that expand the buffer by a kilobyte each time.
First, how likely is that to happen in a real world test case? In
many cases, most of the input strings will be of approximately equal
length; also in many cases, that length will be short; even if the
lengths take the worst possible distribution, we have to hit them in
the worst possible order for this to be a problem. Second, even if it
does happen, does it really matter? Suppose we expand the buffer a
kilobyte at a time from an initial size of 1kB all the way out to
256MB. That's 256,000 palloc calls, so we must be sorting at least
256,000 datums, at least 128,000 of which are longer than 128MB. I
think the cost of calling memcpy() and strcoll() repeatedly on all
those long datums - not to mention repeatedly detoasting them - is
going to bludgeon the palloc overhead into complete insignificance.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#16 Peter Geoghegan
peter@2ndquadrant.com
In reply to: Robert Haas (#15)
Re: sortsupport for text

On 14 June 2012 19:28, Robert Haas <robertmhaas@gmail.com> wrote:

I thought that doubling repeatedly would be overly aggressive in terms
of memory usage.  Blowing the buffers out to 8kB because we hit a
string that's a bit over 4kB isn't so bad, but blowing them out to
256MB because we hit a string that's a bit over 128MB seems a bit
excessive.

That's pretty much what all popular dynamic array implementations do,
from C++'s std::vector to Python's list (it's a misnomer). Having to
allocate 256MB for a buffer to contain a string a bit over 128MB may
seem excessive, until you later get a 250MB string. Even if doubling
is generally excessive, which I doubt, that's beside the point, which
is that expanding the array by some constant proportion results in
each insertion taking amortized constant time.
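The amortization argument in miniature: counting how many expansions each policy needs to grow a 1kB buffer out to 256MB (an illustrative sketch; the trade-off Robert raises is that each doubling can overshoot the needed size by nearly 2x).

```c
#include <stddef.h>

/*
 * Expansions needed to reach target capacity, doubling from start each
 * time - the std::vector-style geometric policy, which makes the total
 * bytes copied across all expansions linear in the final size.
 */
static int grows_doubling(size_t start, size_t target)
{
    int n = 0;
    size_t cap = start;

    while (cap < target) { cap *= 2; n++; }
    return n;
}

/*
 * Same, growing by a fixed step - the round-to-TEXTBUFLEN policy -
 * which in the worst case makes total copying quadratic.
 */
static int grows_fixed(size_t start, size_t step, size_t target)
{
    int n = 0;
    size_t cap = start;

    while (cap < target) { cap += step; n++; }
    return n;
}
```

From 1kB to 256MB, doubling needs 18 expansions where fixed 1kB steps need 262,143; the counter-argument in this thread is that the surrounding memcpy/strcoll/detoast work dwarfs either number.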

 Suppose we expand the buffer a
kilobyte at a time from an initial size of 1kB all the way out to
256MB.  That's 256,000 palloc calls, so we must be sorting at least
256,000 datums, at least 128,000 of which are longer than 128MB.  I
think the cost of calling memcpy() and strcoll() repeatedly on all
those long datums - not to mention repeatedly detoasting them - is
going to bludgeon the palloc overhead into complete insignificance.

I fail to understand how this sortsupport buffer fundamentally differs
from a generic dynamic array abstraction built to contain chars. That
being the case, I see no reason not to just do what everyone else does
when expanding dynamic arrays, and no reason why we shouldn't make
essentially the same time-space trade-off here as others do elsewhere.

Another concern is that it seems fairly pointless to have two buffers.
Wouldn't it be more sensible to have a single buffer that was
partitioned to make two logical, equally-sized buffers, given that in
general each buffer is expected to grow at exactly the same rate?

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#17 Robert Haas
robertmhaas@gmail.com
In reply to: Peter Geoghegan (#16)
Re: sortsupport for text

On Thu, Jun 14, 2012 at 3:24 PM, Peter Geoghegan <peter@2ndquadrant.com> wrote:

On 14 June 2012 19:28, Robert Haas <robertmhaas@gmail.com> wrote:

I thought that doubling repeatedly would be overly aggressive in terms
of memory usage.  Blowing the buffers out to 8kB because we hit a
string that's a bit over 4kB isn't so bad, but blowing them out to
256MB because we hit a string that's a bit over 128MB seems a bit
excessive.

That's pretty much what all popular dynamic array implementations do,
from C++'s std::vector to Python's list (it's a misnomer). Having to
allocate 256MB for a buffer to contain a string a bit over 128MB may
seem excessive, until you later get a 250MB string. Even if doubling
is generally excessive, which I doubt, that's beside the point, which
is that expanding the array by some constant proportion results in
each insertion taking amortized constant time.

Yeah, but *it doesn't matter*. If you test this on strings that are
long enough that they get pushed out to TOAST, you'll find that it
doesn't measurably improve performance, because the overhead of
detoasting so completely dominates any savings on the palloc side that
you can't pick them out of the inter-run noise. Risking eating up an
extra 100MB of memory that we don't really need in order to obtain a
performance optimization that is far too small to measure does not
make sense. The case with std::vector is not analogous; they don't
have any way of knowing what other overhead you are incurring between
insertions into the vector, so it's reasonable to suppose that the
cost of the vector insertions themselves might be material. Here we
know that it doesn't matter, so the application of Knuth's first law
of optimization is appropriate.

 Suppose we expand the buffer a
kilobyte at a time from an initial size of 1kB all the way out to
256MB.  That's 256,000 palloc calls, so we must be sorting at least
256,000 datums, at least 128,000 of which are longer than 128MB.  I
think the cost of calling memcpy() and strcoll() repeatedly on all
those long datums - not to mention repeatedly detoasting them - is
going to bludgeon the palloc overhead into complete insignificance.

I fail to understand how this sortsupport buffer fundamentally differs
from a generic dynamic array abstraction built to contain chars. That
being the case, I see no reason not to just do what everyone else does
when expanding dynamic arrays, and no reason why we shouldn't make
essentially the same time-space trade-off here as others do elsewhere.

Another concern is that it seems fairly pointless to have two buffers.
Wouldn't it be more sensible to have a single buffer that was
partitioned to make two logical, equally-sized buffers, given that in
general each buffer is expected to grow at exactly the same rate?

Sure, but it would be making the code more complicated in return for
no measurable performance benefit. We generally avoid that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#18Peter Geoghegan
peter@2ndquadrant.com
In reply to: Robert Haas (#17)
Re: sortsupport for text

On 14 June 2012 20:32, Robert Haas <robertmhaas@gmail.com> wrote:

Yeah, but *it doesn't matter*.  If you test this on strings that are
long enough that they get pushed out to TOAST, you'll find that it
doesn't measurably improve performance, because the overhead of
detoasting so completely dominates any savings on the palloc side that
you can't pick them out of the inter-run noise.

That's probably true, but it's also beside the point. As recently as a
few hours ago, you yourself said "my guess is that most values people
sort by are pretty short, making this concern mostly academic". Why
are you getting hung up on toasting now?

Here we know that it doesn't matter, so the application of Knuth's first law
of optimization is appropriate.

I'm not advocating some Byzantine optimisation, or even something that
could reasonably be described as an optimisation at all here. I'm
questioning why you've unnecessarily complicated the code by having
the buffer size just big enough to fit the biggest value seen so far,
but arbitrarily aligned to a value that is completely irrelevant to
bttextfastcmp_locale(), rather than using simple geometric expansion,
which is more or less the standard way of managing the growth of a
dynamic array.

You have to grow the array in some way. The basic approach I've
outlined has something to recommend it - why does it make sense to
align the size of the buffer to TEXTBUFLEN in particular though? It's
quite easy to imagine what you've done here resulting in an excessive
number of allocations (and pfree()s), which *could* be expensive. If
you're so conservative about allocating memory, don't grow the array
at quite so aggressive a rate as doubling it each time.

There is a trade-off between space and time to be made here, but I
don't know why you think that the right choice is to use almost the
smallest possible amount of memory in all cases.

Another concern is that it seems fairly pointless to have two buffers.
Wouldn't it be more sensible to have a single buffer that was
partitioned to make two logical, equally-sized buffers, given that in
general each buffer is expected to grow at exactly the same rate?

Sure, but it would be making the code more complicated in return for
no measurable performance benefit.  We generally avoid that.

Fair enough.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#19Robert Haas
robertmhaas@gmail.com
In reply to: Peter Geoghegan (#18)
Re: sortsupport for text

On Thu, Jun 14, 2012 at 6:30 PM, Peter Geoghegan <peter@2ndquadrant.com> wrote:

Here we know that it doesn't matter, so the application of Knuth's first law
of optimization is appropriate.

I'm not advocating some Byzantine optimisation, or even something that
could reasonably be described as an optimisation at all here. I'm
questioning why you've unnecessarily complicated the code by having
the buffer size just big enough to fit the biggest value seen so far,
but arbitrarily aligned to a value that is completely irrelevant to
bttextfastcmp_locale(), rather than using simple geometric expansion,
which is more or less the standard way of managing the growth of a
dynamic array.

Well, so... palloc, and most malloc implementations, don't actually
just double every time forever. They double up to some relatively
small number like 1MB, and then they expand in 1MB chunks.
std::vector may well behave as you're describing, but that's doing
something completely different. We're not accumulating a string of
unknown length into a buffer, with the risk of having to copy the
whole buffer every time we reallocate. We know the exact size of the
buffer we need for any given string, so the cost of a pfree/palloc is
only the cost of releasing and allocating memory; there is no
additional copying cost, as there would be for std::vector.

Look at it another way. Right now, the worst case number of temporary
buffer allocations is about 2N lg N, assuming we do about N lg N
comparisons during a sort. If we simply implemented the most naive
buffer reallocation algorithm possible, we might in the worst case
reallocate each buffer up to N times. That is already better than
what we do now by a factor of lg N. If we suppose N is about a
million, that's an improvement of 20x *in the worst case* - typically,
the improvement will be much larger, because all the strings will be
of similar length or we won't hit them in exactly increasing order of
length. The algorithm I've actually implemented bounds the worst case
1024 times more tightly - given N strings, we can't need to enlarge
the buffer more than N/1024 times. So with N of around a million,
this algorithm should eliminate *at least* 99.995% of the pallocs we
currently do. How much better does it need to be?

You have to grow the array in some way. The basic approach I've
outlined has something to recommend it - why does it make sense to
align the size of the buffer to TEXTBUFLEN in particular though?

There's nothing magic about TEXTBUFLEN as far as that goes. We could
round to the nearest multiple of 8kB or whatever if that seemed
better. But if we rounded to, say, the nearest 1MB, then someone
sorting strings that are 2kB long would use 2MB of memory for these
buffers. Considering that we ship with work_mem = 1MB, that could
easily end up wasting almost twice work_mem. I don't see how you can
view that as a trivial problem. It might be worth sucking that up if
it seemed likely to dramatically improve performance, but it doesn't.

It's
quite easy to imagine what you've done here resulting in an excessive
number of allocations (and pfree()s), which *could* be expensive.

Actually, I can't imagine that at all, per the above analysis. If I
could, I might agree with you. :-)

There is a trade-off between space and time to be made here, but I
don't know why you think that the right choice is to use almost the
smallest possible amount of memory in all cases.

Because I think that with the current implementation I can have my
cake and eat it, too. I believe I've gotten essentially all the
available speedup for essentially no memory wastage. If you can
convince me that I've missed something and there is still a meaningful
amount of palloc overhead left to be squeezed out, I'm all ears.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Geoghegan (#16)
Re: sortsupport for text

Peter Geoghegan <peter@2ndquadrant.com> writes:

On 14 June 2012 19:28, Robert Haas <robertmhaas@gmail.com> wrote:

I thought that doubling repeatedly would be overly aggressive in terms
of memory usage.

I fail to understand how this sortsupport buffer fundamentally differs
from a generic dynamic array abstraction built to contain chars. That
being the case, I see no reason not to just do what everyone else does
when expanding dynamic arrays, and no reason why we shouldn't make
essentially the same time-space trade-off here as others do elsewhere.

I agree with Peter on this one; not only is double-each-time the most
widespread plan, but it is what we do in just about every other place
in Postgres that needs a dynamically expansible buffer. If you do it
randomly differently here, readers of the code will be constantly
stopping to wonder why it's different here and if that's a bug or not.
(And from a performance standpoint, I'm not entirely convinced it's not
a bug, anyway. Worst-case behavior could be pretty bad.)

regards, tom lane
