GSOC 2018 Project - A New Sorting Routine

Started by Kefan Yangalmost 8 years ago15 messageshackers
Jump to latest
#1Kefan Yang
starordust@gmail.com

Hello, Hackers!

I am working on my project in Google Summer of Code 2018
<https://wiki.postgresql.org/wiki/GSoC_2018#Sorting_algorithms_benchmark_and_implementation_.282018.29&gt;.
In this project, I am trying to improve the in-memory sorting routine in
PostgreSQL. Now I am very excited to share my progress with you guys.

Originally, PostgreSQL is using the QuickSort implemented by J. L. Bentley
and M. D. McIlroy in "Engineering a sort function" with some modifications.
This sorting routine is very fast, yet may fall to O(n^2) time complexity
in the worst case scenario. We are trying to find faster sorting algorithms
with guaranteed O(nlogn) time complexity.

In this patch, I

1. Use IntroSort to implement pg_qsort. IntroSort is a hybrid sorting
algorithm. It uses Quicksort most of the time, but switch to insertion sort
when the array is small and heapsort when the recursion exceeds depth limit.
2. Only check if the array is preordered once on the whole array to get
better overall performance. Previously the sorting routine checks if the
array is preordered on every recursion.

After some performance test, I find the new sorting routine

1. Slightly faster on sorting random arrays.
2. Much faster on worst case scenario since it has O(nlogn) worst case
complexity.
3. Has nearly the same performance on mostly sorted arrays.

I use both standalone tests and pgbench to show the result. A more detailed
report is in the attachment, along with the patch and some scripts to
reproduce the result.

*Notice:* this patch may fail a test case in ‘make check’ because QuickSort
and HeapSort are unstable. The two sorting routines may give different
results if multiple entries in the array have the same key value.

Generally speaking, I think the new sorting routine seems to be an
improvement in many ways. Please let me know if you have any thoughts or
suggestions.

Regards,

Kefan Yang

Attachments:

NewSorting.zipapplication/x-zip-compressed; name=NewSorting.zipDownload+14-17
#2Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Kefan Yang (#1)
Re: GSOC 2018 Project - A New Sorting Routine

Hi Kefan,

On 07/10/2018 11:02 PM, Kefan Yang wrote:

Hello, Hackers!

I am working on my project in Google Summer of Code 2018
<https://wiki.postgresql.org/wiki/GSoC_2018#Sorting_algorithms_benchmark_and_implementation_.282018.29&gt;.
In this project, I am trying to improve the in-memory sorting routine in
PostgreSQL. Now I am very excited to share my progress with you guys.

Originally, PostgreSQL is using the QuickSort implemented by J. L.
Bentley and M. D. McIlroy in "Engineering a sort function" with some
modifications. This sorting routine is very fast, yet may fall to O(n^2)
time complexity in the worst case scenario. We are trying to find faster
sorting algorithms with guaranteed O(nlogn) time complexity.

Time complexity is nice, but it merely estimates the number of
comparisons needed by the sort algorithm. It entirely ignores other
factors that are quite important - behavior with caches, for example.
And quicksort works really well in this regard, I think.

The worst-case complexity may be an issue, but we're already dealing
with it by using median-of-three (actually, median-of-nine, IIRC) in
pg_qsort. Hitting the worst-case accidentally is possible, but it should
be quite unlikely. It's still deterministic so an adversary might
construct a data set triggering it and use it for a DDoS, but if that
was a real issue in practice - I assume we'd already hear about it. But
even if it was, I guess the easiest way to deal with it would be to
randomize the selection of pivots.

In other words, replacing quicksort with an algorithm that is slower on
average but has better worst-case behavior is unlikely to be accepted
with joy, when the worst case is unlikely / bordering with impossible.

In this patch, I

1. Use IntroSort to implement pg_qsort. IntroSort is a hybrid sorting
algorithm. It uses Quicksort most of the time, but switch to
insertion sort when the array is small and heapsort when the
recursion exceeds depth limit.
2. Only check if the array is preordered once on the whole array to get
better overall performance. Previously the sorting routine checks if
the array is preordered on every recursion.

After some performance test, I find the new sorting routine

1. Slightly faster on sorting random arrays.
2. Much faster on worst case scenario since it has O(nlogn) worst case
complexity.
3. Has nearly the same performance on mostly sorted arrays.

I use both standalone tests and pgbench to show the result. A more
detailed report is in the attachment, along with the patch and some
scripts to reproduce the result.

I find those results rather unconvincing.

First of all, testing this on t2.micro is *insane* considering that this
instance type is subject to throttling (depending on CPU credits). I
don't know if this happened to be an issue during your tests, of course.
Furthermore, the instance only has 1 virtual core, so there's likely a
lot of noise due to other tasks (kernel of whatever needs to run).

Secondly, I see the PDF includes results for various data set types
(random, reversed, mostly random, ...) but the archive you provided only
includes the random + killer cases.

And finally, I see the PDF reports "CPU clocks" but I'm not sure what
that actually is? Is that elapsed time in milliseconds or something else?

So I've done a bit of benchmarking by running the battery of tests I've
previously used for sort-related patches, and those results seem much
less optimistic. I've done this on two different x86 machines (one with
an old i5-2500K CPU, the other one with rather new e5-2620v4). Full
results and scripts are available at [1]https://bitbucket.org/tvondra/sort-intro-sort-i5/src/master/ and [2]https://bitbucket.org/tvondra/sort-intro-sort/src/master/, a summary of the
results is attached here.

Each spreadsheet has a couple of "comparison N" sheets, where N is the
number of rows on the test. The last set of columns is comparison to
unpatched master, where values below 100% mean "faster than master" and
above 100% "slower than master".

On the (quite old) i5-2500k CPU, there's pretty much no difference
between master with and without the patch.

On the (much newer) e5-2620v4 system, the results seem somewhat more
variable - ~10% regressions on CREATE INDEX cases, ~5% gains on the
other cases, for the smallest data set (10k rows). But as the data set
grows, the regressions pretty clearly prevail. Not great, I guess :-(

I don't want to discourage you from working on sorting, and I'm sure
significant improvements in this area are possible (and needed). But my
guess is that those optimizations will happen at higher level, not by
tweaking the low-level algorithm.

regards

[1]: https://bitbucket.org/tvondra/sort-intro-sort-i5/src/master/
[2]: https://bitbucket.org/tvondra/sort-intro-sort/src/master/

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

i5-2500k.odsapplication/vnd.oasis.opendocument.spreadsheet; name=i5-2500k.odsDownload+29-13
e5-2620-v4.odsapplication/vnd.oasis.opendocument.spreadsheet; name=e5-2620-v4.odsDownload+19-20
#3Kefan Yang
starordust@gmail.com
In reply to: Kefan Yang (#1)
Fwd: GSOC 2018 Project - A New Sorting Routine

---------- Forwarded message ----------
From: Kefan Yang <starordust@gmail.com>
Date: 2018-07-13 15:02 GMT-07:00
Subject: Re: GSOC 2018 Project - A New Sorting Routine
To: Tomas Vondra <tomas.vondra@2ndquadrant.com>

Hey Tomas,

Thanks for your reply!

First I’d like to make some clarification about my test result.

First of all, testing this on t2.micro is *insane* considering that this
instance type is subject to throttling (depending on CPU credits). I
don't know if this happened to be an issue during your tests, of course.
Furthermore, the instance only has 1 virtual core, so there's likely a
lot of noise due to other tasks (kernel of whatever needs to run).

I agree that testing this on t2.micro instance might not be a very great
idea. But I have tried 100 rounds for each test case to minimize the
influence of other processes. In fact, I think the result is
consistent with yours on i5-2500k (the older CPU): intro sort with only one
preordered check is slightly better on random cases. There's no big
difference if we also take the other cases into consideration. (I would say
intro sort is still better by simply counting the number of improved test
cases)

Secondly, I see the PDF includes results for various data set types

(random, reversed, mostly random, ...) but the archive you provided only
includes the random + killer cases.

I tried data of all patterns in the standalone test, but only random and
killer cases in pg_bench. This is a problem but I would try other cases in
pg_bench once I write a shell script to do it automatically.

And finally, I see the PDF reports "CPU clocks" but I'm not sure what

that actually is? Is that elapsed time in milliseconds or something else?

Sorry for the confusion, but "CPU clocks" actually means CPU clock ticks,
which are units of time of a constant but system-specific length.

After reading your test results, I think the main problems of intro sort are

1. Slow on CREATE INDEX cases.

I am still trying to figure out where the bottleneck is. Is the data
pattern in index creation very different from other cases? Also, pg_qsort
has 10%-20% advantage at creating index even on sorted data (faster CPU, N
= 1000000). This is very strange to me since the two sorting routines
execute exactly the same code when the input data is sorted.

2. Slow on faster CPU and large data set.

The main difference is still on CREATE INDEX. But there are several SELECT
cases(faster CPU, line 179, 206, 377) where pg_qsort can have more than 60%
advantage, which is crazy. All these cases are dealing with mostly sorted
data.

Personally, I would say intro sort is good as long as it has nearly the
same performance as pg_qsort on average cases because of the better
worst-case complexity. In fact, we can make the two sorting routines as
similar as we want by increasing the threshold that intro sort switches to
heap sort. Therefore, I think by no means could pg_qsort be 10%-20% faster
than a well-optimized intro sort because they execute the same code most of
the time. There must be something special about CREATE INDEX test cases,
and I am trying to figure it out. Also, I am considering performing
the preordered check on every recursion, like what pg_qsort does, and see
how it goes.

Finally, you've mentioned

But even if it was, I guess the easiest way to deal with it would be to

randomize the selection of pivots.

I am wondering if pg_sort with random pivot selecting could be faster than
intro sort. I've just done some simple tests, which shows that intro sort
in faster in all cases. But I guess it depends on how we would implement
the random number generation.

In conclusion, I still believe intro sort is good if we optimized the
implementation and select the parameters more carefully.

Thanks again for your time. I am hoping to see your thoughts on this:)

Regards,

Kefan

2018-07-12 17:50 GMT-07:00 Tomas Vondra <tomas.vondra@2ndquadrant.com>:

Show quoted text

Hi Kefan,

On 07/10/2018 11:02 PM, Kefan Yang wrote:

Hello, Hackers!

I am working on my project in Google Summer of Code 2018
<https://wiki.postgresql.org/wiki/GSoC_2018#Sorting_algorith

ms_benchmark_and_implementation_.282018.29>.

In this project, I am trying to improve the in-memory sorting routine in
PostgreSQL. Now I am very excited to share my progress with you guys.

Originally, PostgreSQL is using the QuickSort implemented by J. L.
Bentley and M. D. McIlroy in "Engineering a sort function" with some
modifications. This sorting routine is very fast, yet may fall to O(n^2)
time complexity in the worst case scenario. We are trying to find faster
sorting algorithms with guaranteed O(nlogn) time complexity.

Time complexity is nice, but it merely estimates the number of
comparisons needed by the sort algorithm. It entirely ignores other
factors that are quite important - behavior with caches, for example.
And quicksort works really well in this regard, I think.

The worst-case complexity may be an issue, but we're already dealing
with it by using median-of-three (actually, median-of-nine, IIRC) in
pg_qsort. Hitting the worst-case accidentally is possible, but it should
be quite unlikely. It's still deterministic so an adversary might
construct a data set triggering it and use it for a DDoS, but if that
was a real issue in practice - I assume we'd already hear about it. But
even if it was, I guess the easiest way to deal with it would be to
randomize the selection of pivots.

In other words, replacing quicksort with an algorithm that is slower on
average but has better worst-case behavior is unlikely to be accepted
with joy, when the worst case is unlikely / bordering with impossible.

In this patch, I

1. Use IntroSort to implement pg_qsort. IntroSort is a hybrid sorting
algorithm. It uses Quicksort most of the time, but switch to
insertion sort when the array is small and heapsort when the
recursion exceeds depth limit.
2. Only check if the array is preordered once on the whole array to get
better overall performance. Previously the sorting routine checks if
the array is preordered on every recursion.

After some performance test, I find the new sorting routine

1. Slightly faster on sorting random arrays.
2. Much faster on worst case scenario since it has O(nlogn) worst case
complexity.
3. Has nearly the same performance on mostly sorted arrays.

I use both standalone tests and pgbench to show the result. A more
detailed report is in the attachment, along with the patch and some
scripts to reproduce the result.

I find those results rather unconvincing.

First of all, testing this on t2.micro is *insane* considering that this
instance type is subject to throttling (depending on CPU credits). I
don't know if this happened to be an issue during your tests, of course.
Furthermore, the instance only has 1 virtual core, so there's likely a
lot of noise due to other tasks (kernel of whatever needs to run).

Secondly, I see the PDF includes results for various data set types
(random, reversed, mostly random, ...) but the archive you provided only
includes the random + killer cases.

And finally, I see the PDF reports "CPU clocks" but I'm not sure what
that actually is? Is that elapsed time in milliseconds or something else?

So I've done a bit of benchmarking by running the battery of tests I've
previously used for sort-related patches, and those results seem much
less optimistic. I've done this on two different x86 machines (one with
an old i5-2500K CPU, the other one with rather new e5-2620v4). Full
results and scripts are available at [1] and [2], a summary of the
results is attached here.

Each spreadsheet has a couple of "comparison N" sheets, where N is the
number of rows on the test. The last set of columns is comparison to
unpatched master, where values below 100% mean "faster than master" and
above 100% "slower than master".

On the (quite old) i5-2500k CPU, there's pretty much no difference
between master with and without the patch.

On the (much newer) e5-2620v4 system, the results seem somewhat more
variable - ~10% regressions on CREATE INDEX cases, ~5% gains on the
other cases, for the smallest data set (10k rows). But as the data set
grows, the regressions pretty clearly prevail. Not great, I guess :-(

I don't want to discourage you from working on sorting, and I'm sure
significant improvements in this area are possible (and needed). But my
guess is that those optimizations will happen at higher level, not by
tweaking the low-level algorithm.

regards

[1] https://bitbucket.org/tvondra/sort-intro-sort-i5/src/master/
[2] https://bitbucket.org/tvondra/sort-intro-sort/src/master/

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In reply to: Kefan Yang (#3)
Re: GSOC 2018 Project - A New Sorting Routine

On Fri, Jul 13, 2018 at 3:04 PM, Kefan Yang <starordust@gmail.com> wrote:

1. Slow on CREATE INDEX cases.

I am still trying to figure out where the bottleneck is. Is the data pattern
in index creation very different from other cases? Also, pg_qsort has
10%-20% advantage at creating index even on sorted data (faster CPU, N =
1000000). This is very strange to me since the two sorting routines execute
exactly the same code when the input data is sorted.

Yes. CREATE INDEX uses heap TID as a tie-breaker, so it's impossible
for any two index tuples to compare as equal within tuplesort.c, even
though they may be equal in other contexts. This is likely to defeat
things like the Bentley-McIlroy optimization where equal keys are
swapped, which is very effective in the event of many equal keys.

(Could also be parallelism, though I suppose you probably accounted for that.)

--
Peter Geoghegan

#5Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Kefan Yang (#3)
Re: Fwd: GSOC 2018 Project - A New Sorting Routine

On 07/14/2018 12:04 AM, Kefan Yang wrote:

...

And finally, I see the PDF reports "CPU clocks" but I'm not sure what
that actually is? Is that elapsed time in milliseconds or something
else?

Sorry for the confusion, but "CPU clocks" actually means CPU clock
ticks, which are units of time of a constant but system-specific length.

OK, how do you measure this metric?

After reading your test results, I think the main problems of intro sort are

1. Slow on CREATE INDEX cases.

I am still trying to figure out where the bottleneck is. Is the data
pattern in index creation very different from other cases? Also,
pg_qsort has 10%-20% advantage at creating index even on sorted data
(faster CPU, N = 1000000). This is very strange to me since the two
sorting routines execute exactly the same code when the input data is
sorted.

2. Slow on faster CPU and large data set.

The main difference is still on CREATE INDEX.  But there are several
SELECT cases(faster CPU, line 179, 206, 377) where pg_qsort can have
more than 60% advantage, which is crazy. All these cases are dealing
with mostly sorted data. 

Personally, I would say intro sort is good as long as it has nearly the
same performance as pg_qsort on average cases because of the better
worst-case complexity. In fact, we can make the two sorting routines as
similar as we want by increasing the threshold that intro sort switches
to heap sort. Therefore, I think by no means could pg_qsort be 10%-20%
faster than a well-optimized intro sort because they execute the same
code most of the time. There must be something special about CREATE
INDEX test cases, and I am trying to figure it out. Also, I am
considering performing the preordered check on every recursion, like
what pg_qsort does, and see how it goes.

I don't know. All I have is the results I shared. I suggest you try
reproducing the tests on your system - the scripts are in the git
repositories.

Finally, you've mentioned

But even if it was, I guess the easiest way to deal with it would be to
randomize the selection of pivots.

I am wondering if pg_sort with random pivot selecting could be faster
than intro sort. I've just done some simple tests, which shows that
intro sort in faster in all cases. But I guess it depends on how we
would implement the random number generation.

Unlikely. The pivot randomization is merely a way to defeat an adversary
attempting to perform DoS by triggering sorts on a killer sequence.
Randomization makes it much harder/impossible, because the killer
sequence changes over time. It's not a regular performance optimization.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In reply to: Tomas Vondra (#5)
Re: Fwd: GSOC 2018 Project - A New Sorting Routine

On Fri, Jul 13, 2018 at 6:03 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

Unlikely. The pivot randomization is merely a way to defeat an adversary
attempting to perform DoS by triggering sorts on a killer sequence.
Randomization makes it much harder/impossible, because the killer
sequence changes over time. It's not a regular performance optimization.

+1. The importance of the quadratic worst case for an industrial
strength quicksort seems to often be overstated.

Robert Sedgewick's Algorithms book has *excellent* analysis of
quicksort's worst case, which might be worth a read.

--
Peter Geoghegan

#7Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Peter Geoghegan (#4)
Re: GSOC 2018 Project - A New Sorting Routine

On 07/14/2018 12:10 AM, Peter Geoghegan wrote:

On Fri, Jul 13, 2018 at 3:04 PM, Kefan Yang <starordust@gmail.com> wrote:

1. Slow on CREATE INDEX cases.

I am still trying to figure out where the bottleneck is. Is the data pattern
in index creation very different from other cases? Also, pg_qsort has
10%-20% advantage at creating index even on sorted data (faster CPU, N =
1000000). This is very strange to me since the two sorting routines execute
exactly the same code when the input data is sorted.

Yes. CREATE INDEX uses heap TID as a tie-breaker, so it's impossible
for any two index tuples to compare as equal within tuplesort.c, even
though they may be equal in other contexts. This is likely to defeat
things like the Bentley-McIlroy optimization where equal keys are
swapped, which is very effective in the event of many equal keys.

(Could also be parallelism, though I suppose you probably accounted for that.)

Hmmm. Those scripts are older than max_parallel_maintenance_workers, so
were only setting the regular max_parallel_workers_per_gather GUCs. OTOH
these tests were done on fairly small data sets, starting from 10k rows
and the 10-20% regression is clearly visible for all scales (we don't
use parallel CREATE INDEX for tiny tables, right?). And it's not visible
on the i5 CPU at all, which would be a bit strange if it's
parallelism-related.

So I doubt it's this, but I've tweaked the scripts to also set this GUC
and restarted the tests on both machines. Let's see what that does.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#8Andrey Borodin
amborodin@acm.org
In reply to: Tomas Vondra (#7)
Re: GSOC 2018 Project - A New Sorting Routine

Hi, Tomas!

15 июля 2018 г., в 1:20, Tomas Vondra <tomas.vondra@2ndquadrant.com> написал(а):

So I doubt it's this, but I've tweaked the scripts to also set this GUC
and restarted the tests on both machines. Let's see what that does.

Do you observe any different results?

Thanks!

Best regards, Andrey Borodin.

#9Kefan Yang
starordust@gmail.com
In reply to: Kefan Yang (#1)
RE: GSOC 2018 Project - A New Sorting Routine

Hey Tomas!

I am trying to reproduce the results on my machine. Could you please share the script to generate .ods files?

Regards,
Kefan

From: Tomas Vondra
Sent: July 18, 2018 2:05 AM
To: Andrey Borodin
Cc: Peter Geoghegan; Kefan Yang; PostgreSQL Hackers
Subject: Re: GSOC 2018 Project - A New Sorting Routine

On 07/18/2018 07:06 AM, Andrey Borodin wrote:

Hi, Tomas!

15 июля 2018 г., в 1:20, Tomas Vondra <tomas.vondra@2ndquadrant.com
<mailto:tomas.vondra@2ndquadrant.com>> написал(а):

So I doubt it's this, but I've tweaked the scripts to also set this GUC
and restarted the tests on both machines. Let's see what that does.

Do you observe any different results?

It did change the CREATE INDEX results, depending on the scale. The full
data is available at [1]https://bitbucket.org/tvondra/sort-intro-sort-xeon/src/master/ and [2]https://bitbucket.org/tvondra/sort-intro-sort-i5/src/master/, attached is a spreadsheet summary from
the Xeon box.

For the largest scale (1M rows) the regressions for CREATE INDEX queries
mostly disappeared. For 10k rows it still affects CREATE INDEX with a
text column, and the 100k case behaves just like before (so significant
regressions for CREATE INDEX).

I don't have time to investigate this further at the moment, but I'm
still of the opinion that there's little to gain by replacing our
current sort algorithm with this.

[1]: https://bitbucket.org/tvondra/sort-intro-sort-xeon/src/master/
[2]: https://bitbucket.org/tvondra/sort-intro-sort-i5/src/master/

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#10Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Kefan Yang (#9)
Re: GSOC 2018 Project - A New Sorting Routine

I don't have any script for that - load the files into a spreadsheet,
create pivot tables and you're done.

regards

On 07/18/2018 11:13 PM, Kefan Yang wrote:

Hey Tomas!

 

I am trying to reproduce the results on my machine. Could you please
share the script to generate .ods files?

 

Regards,

Kefan

 

*From: *Tomas Vondra <mailto:tomas.vondra@2ndquadrant.com>
*Sent: *July 18, 2018 2:05 AM
*To: *Andrey Borodin <mailto:x4mmm@yandex-team.ru>
*Cc: *Peter Geoghegan <mailto:pg@bowt.ie>; Kefan Yang
<mailto:starordust@gmail.com>; PostgreSQL Hackers
<mailto:pgsql-hackers@lists.postgresql.org>
*Subject: *Re: GSOC 2018 Project - A New Sorting Routine

 

 

 

On 07/18/2018 07:06 AM, Andrey Borodin wrote:

Hi, Tomas!

15 июля 2018 г., в 1:20, Tomas Vondra <tomas.vondra@2ndquadrant.com

<mailto:tomas.vondra@2ndquadrant.com>> написал(а):

 

So I doubt it's this, but I've tweaked the scripts to also set this GUC

and restarted the tests on both machines. Let's see what that does.

Do you observe any different results?

 

It did change the CREATE INDEX results, depending on the scale. The full

data is available at [1] and [2], attached is a spreadsheet summary from

the Xeon box.

 

For the largest scale (1M rows) the regressions for CREATE INDEX queries

mostly disappeared. For 10k rows it still affects CREATE INDEX with a

text column, and the 100k case behaves just like before (so significant

regressions for CREATE INDEX).

 

I don't have time to investigate this further at the moment, but I'm

still of the opinion that there's little to gain by replacing our

current sort algorithm with this.

 

 

[1] https://bitbucket.org/tvondra/sort-intro-sort-xeon/src/master/

[2] https://bitbucket.org/tvondra/sort-intro-sort-i5/src/master/

 

regards

 

--

Tomas Vondra                  http://www.2ndQuadrant.com

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

 

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#11Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tomas Vondra (#10)
Re: GSOC 2018 Project - A New Sorting Routine

On 2018-Jul-18, Tomas Vondra wrote:

I don't have any script for that - load the files into a spreadsheet,
create pivot tables and you're done.

What!? You don't use psql's \crosstabview !?

... walks away disappointed ...

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#12Kefan Yang
starordust@gmail.com
In reply to: Tomas Vondra (#10)
RE: GSOC 2018 Project - A New Sorting Routine

Hi Tomas!

I did a few tests on my own Linux machine, but the problem is that my resources on AWS(CPU, RAM and even Disk space) are very limited. I considered establishing virtual machine on my own PC but the performance is even worse.

My original patch has two main optimizations: (1) switch to heap sort when depth limit exceeded (2) check whether the array is presorted only once at the beginning. Now I want to test these optimizations separately. On AWS EC2 instance, regressions on CREATE INDEX cases seems to be less significant if we use (1) only, but I can only test up to 100000 records and 512MB memory using your scripts.

So would you mind re-running the tests using the two patches I provided in the attachment? That will be very helpful

Regards,
Kefan

From: Tomas Vondra
Sent: July 18, 2018 2:26 PM
To: Kefan Yang
Cc: Andrey Borodin; Peter Geoghegan; PostgreSQL Hackers
Subject: Re: GSOC 2018 Project - A New Sorting Routine

I don't have any script for that - load the files into a spreadsheet,
create pivot tables and you're done.

regards

On 07/18/2018 11:13 PM, Kefan Yang wrote:

Hey Tomas!

 

I am trying to reproduce the results on my machine. Could you please
share the script to generate .ods files?

 

Regards,

Kefan

 

*From: *Tomas Vondra <mailto:tomas.vondra@2ndquadrant.com>
*Sent: *July 18, 2018 2:05 AM
*To: *Andrey Borodin <mailto:x4mmm@yandex-team.ru>
*Cc: *Peter Geoghegan <mailto:pg@bowt.ie>; Kefan Yang
<mailto:starordust@gmail.com>; PostgreSQL Hackers
<mailto:pgsql-hackers@lists.postgresql.org>
*Subject: *Re: GSOC 2018 Project - A New Sorting Routine

 

 

 

On 07/18/2018 07:06 AM, Andrey Borodin wrote:

Hi, Tomas!

15 июля 2018 г., в 1:20, Tomas Vondra <tomas.vondra@2ndquadrant.com

<mailto:tomas.vondra@2ndquadrant.com>> написал(а):

 

So I doubt it's this, but I've tweaked the scripts to also set this GUC

and restarted the tests on both machines. Let's see what that does.

Do you observe any different results?

 

It did change the CREATE INDEX results, depending on the scale. The full

data is available at [1] and [2], attached is a spreadsheet summary from

the Xeon box.

 

For the largest scale (1M rows) the regressions for CREATE INDEX queries

mostly disappeared. For 10k rows it still affects CREATE INDEX with a

text column, and the 100k case behaves just like before (so significant

regressions for CREATE INDEX).

 

I don't have time to investigate this further at the moment, but I'm

still of the opinion that there's little to gain by replacing our

current sort algorithm with this.

 

 

[1] https://bitbucket.org/tvondra/sort-intro-sort-xeon/src/master/

[2] https://bitbucket.org/tvondra/sort-intro-sort-i5/src/master/

 

regards

 

--

Tomas Vondra                  http://www.2ndQuadrant.com

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

 

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

check_once.diffapplication/octet-stream; name=check_once.diffDownload+127-98
use_heap.diffapplication/octet-stream; name=use_heap.diffDownload+286-65
#13Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Kefan Yang (#12)
Re: GSOC 2018 Project - A New Sorting Routine

On 07/24/2018 12:21 AM, Kefan Yang wrote:

Hi Tomas!

I did a few tests on my own Linux machine, but the problem is that my
resources on AWS(CPU, RAM and even Disk space) are very limited. I
considered establishing virtual machine on my own PC but the performance
is even worse.

My original patch has two main optimizations: (1) switch to heap sort
when depth limit exceeded (2) check whether the array is presorted only
once at the beginning. Now I want to test these optimizations
separately. On AWS EC2 instance, regressions on CREATE INDEX cases seems
to be less significant if we use (1) only, but I can only test up to
100000 records and 512MB memory using your scripts.

So would you mind re-running the tests using the two patches I provided
in the attachment? That will be very helpful

I can do that, but it'll have to wait a couple of days. I'm currently
using the boxes for some other tests.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#14Kefan Yang
starordust@gmail.com
In reply to: Tomas Vondra (#13)
RE: GSOC 2018 Project - A New Sorting Routine

Hey Tomas!

Sorry to bother but it would be great if we can get the test results this week.

Regards,
Kefan

From: Tomas Vondra
Sent: July 24, 2018 8:16 AM
To: Kefan Yang
Cc: Andrey Borodin; Peter Geoghegan; alvherre@2ndquadrant.com; PostgreSQL Hackers
Subject: Re: GSOC 2018 Project - A New Sorting Routine

On 07/24/2018 12:21 AM, Kefan Yang wrote:

Hi Tomas!

I did a few tests on my own Linux machine, but the problem is that my
resources on AWS(CPU, RAM and even Disk space) are very limited. I
considered establishing virtual machine on my own PC but the performance
is even worse.

My original patch has two main optimizations: (1) switch to heap sort
when depth limit exceeded (2) check whether the array is presorted only
once at the beginning. Now I want to test these optimizations
separately. On AWS EC2 instance, regressions on CREATE INDEX cases seems
to be less significant if we use (1) only, but I can only test up to
100000 records and 512MB memory using your scripts.

So would you mind re-running the tests using the two patches I provided
in the attachment? That will be very helpful

I can do that, but it'll have to wait a couple of days. I'm currently
using the boxes for some other tests.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#15Kefan Yang
starordust@gmail.com
In reply to: Kefan Yang (#1)
RE: GSOC 2018 Project - A New Sorting Routine

Thanks for your time!

From: Tomas Vondra
Sent: August 1, 2018 6:30 AM
To: Kefan Yang
Cc: Andrey Borodin; Peter Geoghegan; alvherre@2ndquadrant.com; PostgreSQL Hackers
Subject: Re: GSOC 2018 Project - A New Sorting Routine

On 07/30/2018 11:21 PM, Kefan Yang wrote:

Hey Tomas!

Sorry to bother but it would be great if we can get the test results
this week.

Attached are results from the i5 machine. I'm unable to rerun the tests
on the xeon box at the moment (which is IMHO the more interesting one).

Complete test data is available at

https://bitbucket.org/tvondra/sort-intro-sort-i5-2/src

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services