Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Tue, 26 Jun 2012 17:04:42 -0400 Robert Haas wrote:
I see you posted up a follow-up email asking Tom what he had in mind.
Personally, I don't think this needs incredibly complicated testing.
I think you should just test a workload involving inserting and/or
updating rows with lots of trailing NULL columns, and then another
workload with a table of similar width that... doesn't. If we can't
find a regression - or, better, we find a win in one or both cases -
then I think we're done here.
As per the last discussion for this patch, performance data needs to be provided before this patch's Review can proceed further.
So as per your suggestion and from the discussions about this patch, I have collected the performance data as below:
Results are taken with following configuration.
1. Schema - UNLOGGED TABLE with 2,000,000 records having all columns are INT type.
2. shared_buffers = 10GB
3. All the performance result are taken with single connection.
4. Performance is collected for INSERT operation (insert into temptable select * from inittable)
Platform details:
Operating System: Suse-Linux 10.2 x86_64
Hardware : 4 core (Intel(R) Xeon(R) CPU L5408 @ 2.13GHz)
RAM : 24GB
Documents Attached:
init.sh : Which will create the schema
sql_used.sql : sql's used for taking results
Trim_Nulls_Perf_Report.html : Performance data
Observations from Performance Results
------------------------------------------------
1. There is no performance change for cloumns that have all valid values(non- NULLs).
2. There is a visible performance increase when number of columns containing NULLS are more than > 60~70% in table have large number of columns.
3. There are visible space savings when number of columns containing NULLS are more than > 60~70% in table have large number of columns.
Let me know if there is more performance data needs to be collected for this patch?
With Regards,
Amit Kapila.
On Saturday, October 13, 2012 1:24 PM Amit kapila wrote:
Tue, 26 Jun 2012 17:04:42 -0400 Robert Haas wrote:
I see you posted up a follow-up email asking Tom what he had in mind.
Personally, I don't think this needs incredibly complicated testing.
I think you should just test a workload involving inserting and/or
updating rows with lots of trailing NULL columns, and then another
workload with a table of similar width that... doesn't. If we can't
find a regression - or, better, we find a win in one or both cases -
then I think we're done here.
As per the last discussion for this patch, performance data needs to be provided before this patch's Review can proceed >further.
So as per your suggestion and from the discussions about this patch, I have collected the performance data as below:
Results are taken with following configuration.
1. Schema - UNLOGGED TABLE with 2,000,000 records having all columns are INT type.
2. shared_buffers = 10GB
3. All the performance result are taken with single connection.
4. Performance is collected for INSERT operation (insert into temptable select * from inittable)
Platform details:
Operating System: Suse-Linux 10.2 x86_64
Hardware : 4 core (Intel(R) Xeon(R) CPU L5408 @ 2.13GHz)
RAM : 24GB
Further to Performance data, I have completed the review of the Patch.
Basic stuff:
------------
- Rebase of Patch is required.
As heap_fill_tuple function prototype is moved to different file [htup.h to htup_details.h]
- Compiles cleanly without any errors/warnings
- Regression tests pass.
Code Review comments:
---------------------
1. There is possibility of memory growth in case of toast table, if trailing toasted columns are updated to NULLs;
i.e. In Function toast_insert_or_update, for tuples when 'need_change' variable is true, numAttrs are modified to last non null column values,
and in old tuple de-toasted columns are not getting freed, if this repeats for more number of tuples there is chance of out of memory.
if ( need_change)
{
numAttrs = lastNonNullValOffset + 1;
....
}
if (need_delold)
for (i = 0; i < numAttrs; i++) <== Tailing toasted value wouldn't be freed as updated to NULL and numAttrs is modified to smaller value.
if (toast_delold[i])
toast_delete_datum(rel, toast_oldvalues[i]);
2. Comments need to updated in following functions; how ending null columns are skipped in header part.
heap_fill_tuple - function header
heap_form_tuple, heap_form_minimal_tuple, heap_form_minimal_tuple.
3. Why following change is required in function toast_flatten_tuple_attribute
- numAttrs = tupleDesc->natts;
+ numAttrs = HeapTupleHeaderGetNatts(olddata);
Detailed Performance Report for Insert and Update Operations is attached with this mail.
Observations from Performance Results
------------------------------------------------
1. There is no performance change for cloumns that have all valid values(non- NULLs).
2. There is a visible performance increase when number of columns containing NULLS are more than > 60~70% in table have large number of columns.
3. There are visible space savings when number of columns containing NULLS are more than > 60~70% in table have large number of columns.
With Regards,
Amit Kapila.
Attachments:
Trim_Tailing_Nulls_Perf_Report.htmltext/html; name=Trim_Tailing_Nulls_Perf_Report.htmlDownload
From: pgsql-hackers-owner@postgresql.org [pgsql-hackers-owner@postgresql.org] on behalf of Amit kapila [amit.kapila@huawei.com]
Sent: Monday, October 15, 2012 7:28 PM
To: robertmhaas@gmail.com; josh@agliodbs.com
Cc: pgsql-hackers@postgresql.org
Subject: [HACKERS] Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
On Monday, October 15, 2012 7:28 PM Amit kapila wrote:
On Saturday, October 13, 2012 1:24 PM Amit kapila wrote:
Tue, 26 Jun 2012 17:04:42 -0400 Robert Haas wrote:
I see you posted up a follow-up email asking Tom what he had in mind.
Personally, I don't think this needs incredibly complicated testing.
I think you should just test a workload involving inserting and/or
updating rows with lots of trailing NULL columns, and then another
workload with a table of similar width that... doesn't. If we can't
find a regression - or, better, we find a win in one or both cases -
then I think we're done here.
As per the last discussion for this patch, performance data needs to be provided before this patch's Review can proceed >further.
So as per your suggestion and from the discussions about this patch, I have collected the performance data as below:
Results are taken with following configuration.
1. Schema - UNLOGGED TABLE with 2,000,000 records having all columns are INT type.
2. shared_buffers = 10GB
3. All the performance result are taken with single connection.
4. Performance is collected for INSERT operation (insert into temptable select * from inittable)
Platform details:
Operating System: Suse-Linux 10.2 x86_64
Hardware : 4 core (Intel(R) Xeon(R) CPU L5408 @ 2.13GHz)
RAM : 24GB
Further to Performance data, I have completed the review of the Patch.
Please find the patch to address Review Comments attached with this mail.
IMO, now its ready for a committer.
With Regards,
Amit Kapila.
Attachments:
Truncate-trailing-null-columns-from-heap-rows.v2.patchapplication/octet-stream; name=Truncate-trailing-null-columns-from-heap-rows.v2.patchDownload+481-69
On 13 October 2012 08:54, Amit kapila <amit.kapila@huawei.com> wrote:
As per the last discussion for this patch, performance data needs to be
provided before this patch's Review can proceed further.So as per your suggestion and from the discussions about this patch, I have
collected the performance data as below:Results are taken with following configuration.
1. Schema - UNLOGGED TABLE with 2,000,000 records having all columns are INT
type.
2. shared_buffers = 10GB
3. All the performance result are taken with single connection.
4. Performance is collected for INSERT operation (insert into temptable
select * from inittable)Platform details:
Operating System: Suse-Linux 10.2 x86_64
Hardware : 4 core (Intel(R) Xeon(R) CPU L5408 @ 2.13GHz)
RAM : 24GBDocuments Attached:
init.sh : Which will create the schema
sql_used.sql : sql's used for taking resultsTrim_Nulls_Perf_Report.html : Performance data
Observations from Performance Results
------------------------------------------------
1. There is no performance change for cloumns that have all valid
values(non- NULLs).2. There is a visible performance increase when number of columns containing
NULLS are more than > 60~70% in table have large number of columns.3. There are visible space savings when number of columns containing NULLS
are more than > 60~70% in table have large number of columns.Let me know if there is more performance data needs to be collected for this
patch?
I can't make sense of your performance report. Because of that I can't
derive the same conclusions from it you do.
Can you explain the performance results in more detail, so we can see
what they mean? Like which are the patched, which are the unpatched
results? Which results are comparable, what the percentages mean etc..
We might then move quickly towards commit, or at least more tests.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thursday, December 20, 2012 5:46 PM Simon Riggs wrote:
On 13 October 2012 08:54, Amit kapila <amit.kapila@huawei.com> wrote:
As per the last discussion for this patch, performance data needs to
be
provided before this patch's Review can proceed further.
So as per your suggestion and from the discussions about this patch,
I have
------------------------------------------------
1. There is no performance change for cloumns that have all valid
values(non- NULLs).2. There is a visible performance increase when number of columns
containing
NULLS are more than > 60~70% in table have large number of columns.
3. There are visible space savings when number of columns containing
NULLS
are more than > 60~70% in table have large number of columns.
Let me know if there is more performance data needs to be collected
for this
patch?
I can't make sense of your performance report. Because of that I can't
derive the same conclusions from it you do.Can you explain the performance results in more detail, so we can see
what they mean? Like which are the patched, which are the unpatched
results?
On the extreme let it is mentioned Original Code/ Trim Triling Nulls Patch.
In any case I have framed the results again as below:
1. Table with 800 columns
A. INSERT tuples with 600 trailing nulls
B. UPDATE last column value to "non-null"
C. UPDATE last column value to "null"
---------------------+---------------------+---------------------
Original Code | Trim Tailing NULLs | Improvement (%)
TPS space used| TPS space used | Results
(pages) | (pages) |
---------------------+---------------------+----------------------
1A: 0.2068 250000 | 0.2302 222223 | 10.1% tps, 11.1% space
1B: 0.0448 500000 | 0.0481 472223 | 6.8% tps, 5.6% space
1C: 0.0433 750000 | 0.0493 694445 | 12.2% tps, 7.4% space
2. Table with 800 columns
A. INSERT tuples with 300 trailing nulls
B. UPDATE last column value to "non-null"
C. UPDATE last column value to "null"
---------------------+---------------------+---------------------
Original Code | Trim Tailing NULLs | Improvement (%)
TPS space used| TPS space used | Results
(pages) | (pages) |
---------------------+---------------------+----------------------
2A: 0.0280 666667 | 0.0287 666667 | 2.3% tps, 0% space
2B: 0.0143 1333334 | 0.0152 1333334 | 5.3% tps, 0% space
2C: 0.0145 2000000 | 0.0149 2000000 | 2.9% tps, 0% space
3. Table with 300 columns
A. INSERT tuples with 150 trailing nulls
B. UPDATE last column value to "non-null"
C. UPDATE last column value to "null"
---------------------+---------------------+--------------------
Original Code | Trim Tailing NULLs | Improvement (%)
TPS space used| TPS space used | Results
(pages) | (pages) |
---------------------+---------------------+--------------------
3A: 0.2815 166667 | 0.2899 166667 | 2.9% tps, 0% space
3B: 0.0851 333334 | 0.0870 333334 | 2.2% tps, 0% space
3C: 0.0846 500000 | 0.0852 500000 | 0.7% tps, 0% space
4. Table with 300 columns
A. INSERT tuples with 250 trailing nulls
B. UPDATE last column value to "non-null"
C. UPDATE last column value to "null"
---------------------+---------------------+-------------------------
Original Code | Trim Tailing NULLs | Improvement (%)
TPS space used| TPS space used | Results
(pages) | (pages) |
---------------------+---------------------+-------------------------
4A: 0.5447 66667 | 0.5996 58824 | 09.2% tps, 11.8% space
4B: 0.1251 135633 | 0.1232 127790 | -01.5% tps, 5.8% space
4C: 0.1223 202299 | 0.1361 186613 | 10.1% tps, 7.5% space
Please let me know, if still it is not clear.
With Regards,
Amit Kapila.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 20 December 2012 14:56, Amit Kapila <amit.kapila@huawei.com> wrote:
1. There is no performance change for cloumns that have all valid
values(non- NULLs).
I don't see any tests (at all) that measure this.
I'm particularly interested in lower numbers of columns, so we can
show no regression for the common case.
2. There is a visible performance increase when number of columns
containing
NULLS are more than > 60~70% in table have large number of columns.
3. There are visible space savings when number of columns containing
NULLS
are more than > 60~70% in table have large number of columns.
Agreed.
I would call that quite disappointing though and was expecting better.
Are we sure the patch works and the tests are correct?
The lack of any space saving for lower % values is strange and
somewhat worrying. There should be a 36? byte saving for 300 null
columns in an 800 column table - how does that not show up at all?
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Import Notes
Reply to msg id not found: -1503838346036575320@unknownmsgid
Simon Riggs <simon@2ndQuadrant.com> writes:
The lack of any space saving for lower % values is strange and
somewhat worrying. There should be a 36? byte saving for 300 null
columns in an 800 column table - how does that not show up at all?
You could only fit about 4 such rows in an 8K page (assuming the columns
are all int4s). Unless the savings is enough to allow 5 rows to fit in
a page, the effective savings will be zilch.
This may well mean that the whole thing is a waste of time in most
scenarios --- the more likely it is to save anything, the more likely
that the savings will be lost anyway due to page alignment
considerations, because wider rows inherently pack less efficiently.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 23 December 2012 17:38, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Simon Riggs <simon@2ndQuadrant.com> writes:
The lack of any space saving for lower % values is strange and
somewhat worrying. There should be a 36? byte saving for 300 null
columns in an 800 column table - how does that not show up at all?You could only fit about 4 such rows in an 8K page (assuming the columns
are all int4s). Unless the savings is enough to allow 5 rows to fit in
a page, the effective savings will be zilch.
If that's the case, the use case is tiny, especially considering how
sensitive the saving is to the exact location of the NULLs.
This may well mean that the whole thing is a waste of time in most
scenarios --- the more likely it is to save anything, the more likely
that the savings will be lost anyway due to page alignment
considerations, because wider rows inherently pack less efficiently.
ISTM that we'd get a better gain and a wider use case by compressing
the whole block, with some bits masked out to allow updates/deletes.
The string of zeroes in the null bitmap would compress easily, but so
would other aspects also.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Sunday, December 23, 2012 8:11 PM Simon Riggs wrote:
On 20 December 2012 14:56, Amit Kapila <amit.kapila@huawei.com> wrote:
1. There is no performance change for cloumns that have all valid
values(non- NULLs).I don't see any tests (at all) that measure this.
I'm particularly interested in lower numbers of columns, so we can
show no regression for the common case.
For now I have taken for 300 columns, I can take for 10~30 columns reading
as well if required
1. Table with 300 columns (all integer columns)
A. INSERT tuples without trailing nulls
B. UPDATE last column value to "null"
----------------+---------------------+------------------
Original Code | Trim Tailing NULLs | Improvement (%)
TPS | TPS | Results
----------------+---------------------+------------------
1A: 0.1348 | 0.1352 | 0.3%
1B: 0.0495 | 0.0495 | 0.0%
2. There is a visible performance increase when number of columns
containing
NULLS are more than > 60~70% in table have large number of
columns.
3. There are visible space savings when number of columns
containing
NULLS
are more than > 60~70% in table have large number of columns.
Agreed.
I would call that quite disappointing though and was expecting better.
Are we sure the patch works and the tests are correct?The lack of any space saving for lower % values is strange and
somewhat worrying. There should be a 36? byte saving for 300 null
columns in an 800 column table - how does that not show up at all?
300 NULL's case will save approximately 108 bytes, as 3 tuples will be
accommodated in such case.
So now the total space left in page will be approximately 1900 bytes
(including 108 bytes saved by optimization).
Now the point is that in existing test case all rows are same (approx 2100
bytes), so no space saving is shown, but incase the last row is such that it
can get accommodated in space saved (remaining space of page + space saved
due to NULLS optimization), then it can show space savings as well.
In anycase there is a performance gain for 300 NULLS case as well.
Apart from above, the performance data for less number of columns (where the
trailing nulls are such that they cross word boundary) also show similar
gains:
The below cases (2 & 3) can give benefit as it will save 4 bytes per tuple
2. Table with 12 columns (first 3 integer followed by 9 Boolean columns)
A. INSERT tuples with 9 trailing nulls
B. UPDATE last column value to "non-null"
C. UPDATE last column value to "null"
---------------------+---------------------+---------------------
Original Code | Trim Tailing NULLs | Improvement (%)
TPS space used| TPS space used | Results
(pages) | (pages) |
---------------------+---------------------+----------------------
2A: 0.8485 12739 | 0.8524 10811 | 0.4% 15.1%
2B: 0.5847 25478 | 0.5749 23550 | -1.5% 7.5%
2C: 0.5591 38217 | 0.5545 34361 | 0.8% 10.0%
3. Table with 12 columns (first 3 integer followed by 9 Boolean columns)
A. INSERT tuples with 4 trailing nulls
B. UPDATE last column value to "non-null"
C. UPDATE last column value to "null"
---------------------+---------------------+---------------------
Original Code | Trim Tailing NULLs | Improvement (%)
TPS space used| TPS space used | Results
(pages) | (pages) |
---------------------+---------------------+----------------------
3A: 0.8443 14706 | 0.8626 12739 | 2.3% 13.3%
3B: 0.5307 29412 | 0.5272 27445 | -0.6% 6.7%
3C: 0.5102 44118 | 0.5218 40184 | 2.2% 8.9%
As a conclusion point, I would like to say that this patch doesn't have
performance regression for most used scenario's
and it gives benefit in some of the trailing null's cases.
With Regards,
Amit Kapila.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 24 December 2012 13:13, Amit Kapila <amit.kapila@huawei.com> wrote:
Apart from above, the performance data for less number of columns (where the
trailing nulls are such that they cross word boundary) also show similar
gains:The below cases (2 & 3) can give benefit as it will save 4 bytes per tuple
2. Table with 12 columns (first 3 integer followed by 9 Boolean columns)
A. INSERT tuples with 9 trailing nulls
B. UPDATE last column value to "non-null"
C. UPDATE last column value to "null"
---------------------+---------------------+---------------------
Original Code | Trim Tailing NULLs | Improvement (%)
TPS space used| TPS space used | Results
(pages) | (pages) |
---------------------+---------------------+----------------------
2A: 0.8485 12739 | 0.8524 10811 | 0.4% 15.1%
2B: 0.5847 25478 | 0.5749 23550 | -1.5% 7.5%
2C: 0.5591 38217 | 0.5545 34361 | 0.8% 10.0%3. Table with 12 columns (first 3 integer followed by 9 Boolean columns)
A. INSERT tuples with 4 trailing nulls
B. UPDATE last column value to "non-null"
C. UPDATE last column value to "null"
---------------------+---------------------+---------------------
Original Code | Trim Tailing NULLs | Improvement (%)
TPS space used| TPS space used | Results
(pages) | (pages) |
---------------------+---------------------+----------------------
3A: 0.8443 14706 | 0.8626 12739 | 2.3% 13.3%
3B: 0.5307 29412 | 0.5272 27445 | -0.6% 6.7%
3C: 0.5102 44118 | 0.5218 40184 | 2.2% 8.9%As a conclusion point, I would like to say that this patch doesn't have
performance regression for most used scenario's
and it gives benefit in some of the trailing null's cases.
Not really sure about the 100s of columns use case.
But showing gain in useful places in these more common cases wins my vote.
Thanks for testing. Barring objections, will commit.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Import Notes
Reply to msg id not found: -998245857536193953@unknownmsgid
Simon Riggs wrote:
Not really sure about the 100s of columns use case.
But showing gain in useful places in these more common cases wins
my vote.Thanks for testing. Barring objections, will commit.
Do we have any results on just a plain, old stock pgbench run, with
the default table definitions?
That would be a reassuring complement to the other tests.
-Kevin
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Import Notes
Resolved by subject fallback
On Monday, December 24, 2012 7:58 PM Kevin Grittner wrote:
Simon Riggs wrote:
Not really sure about the 100s of columns use case.
But showing gain in useful places in these more common cases wins
my vote.Thanks for testing. Barring objections, will commit.
Do we have any results on just a plain, old stock pgbench run, with
the default table definitions?That would be a reassuring complement to the other tests.
Shall run pgbench tpc_b with scalefactor - 50 for 10 mins with/without patch and will post the results.
Kindly let me know if you feel any other test needs to be run.
With Regards,
Amit Kapila.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Monday, December 24, 2012 7:58 PM Kevin Grittner wrote:
Simon Riggs wrote:
Not really sure about the 100s of columns use case.
But showing gain in useful places in these more common cases wins
my vote.Thanks for testing. Barring objections, will commit.
Do we have any results on just a plain, old stock pgbench run, with
the default table definitions?That would be a reassuring complement to the other tests.
Sever Configuration:
The database cluster will be initialized with locales
COLLATE: C
CTYPE: C
MESSAGES: en_US.UTF-8
MONETARY: en_US.UTF-8
NUMERIC: en_US.UTF-8
TIME: en_US.UTF-8
shared_buffers = 1GB
checkpoint_segments = 255
checkpoint_timeout = 15min
pgbench:
transaction type: TPC-B (sort of)
scaling factor: 75
query mode: simple
number of clients: 8
number of threads: 8
duration: 600 s
Performance: Average of 3 runs of pgbench in tps
9.3devel | with trailing null patch
----------+--------------------------
578.9872 | 573.4980
With Regards,
Amit Kapila.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 24 December 2012 16:57, Amit Kapila <amit.kapila@huawei.com> wrote:
Performance: Average of 3 runs of pgbench in tps
9.3devel | with trailing null patch
----------+--------------------------
578.9872 | 573.4980
On balance, it would seem optimizing for this special case would
affect everybody negatively; not much, but enough. Which means we
should rekect this patch.
Do you have a reason why a different view might be taken?
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Import Notes
Reply to msg id not found: 7068109204275850807@unknownmsgid
On Wednesday, January 09, 2013 4:52 PM Simon Riggs wrote:
On 24 December 2012 16:57, Amit Kapila <amit.kapila@huawei.com> wrote:
Performance: Average of 3 runs of pgbench in tps
9.3devel | with trailing null patch
----------+--------------------------
578.9872 | 573.4980On balance, it would seem optimizing for this special case would
affect everybody negatively; not much, but enough. Which means we
should rekect this patch.Do you have a reason why a different view might be taken?
I have tried to dig why this gap is coming. I have observed that there is
very less change in normal path.
I wanted to give it some more time to exactly find if something can be done
to avoid performance dip in normal execution.
Right now I am busy in certain other work. But definitely in coming week or
so, I shall spare time to work on it again.
With Regards,
Amit Kapila.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 9 January 2013 12:06, Amit Kapila <amit.kapila@huawei.com> wrote:
On Wednesday, January 09, 2013 4:52 PM Simon Riggs wrote:
On 24 December 2012 16:57, Amit Kapila <amit.kapila@huawei.com> wrote:
Performance: Average of 3 runs of pgbench in tps
9.3devel | with trailing null patch
----------+--------------------------
578.9872 | 573.4980On balance, it would seem optimizing for this special case would
affect everybody negatively; not much, but enough. Which means we
should rekect this patch.Do you have a reason why a different view might be taken?
I have tried to dig why this gap is coming. I have observed that there is
very less change in normal path.
I wanted to give it some more time to exactly find if something can be done
to avoid performance dip in normal execution.Right now I am busy in certain other work. But definitely in coming week or
so, I shall spare time to work on it again.
Perhaps. Not every idea produces useful outcomes. Even after your
excellent research, it appears we haven't made this work yet. It's a
shame. Should we invest more time? It's considered rude to advise
others how to spend their time, but let me say this: we simply don't
have enough time to do everything and we need to be selective,
prioritising our time on to the things that look to give the best
benefit.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Import Notes
Reply to msg id not found: -8059625272450518027@unknownmsgid
On Wednesday, January 09, 2013 5:45 PM Simon Riggs wrote:
On 9 January 2013 12:06, Amit Kapila <amit.kapila@huawei.com> wrote:
On Wednesday, January 09, 2013 4:52 PM Simon Riggs wrote:
On 24 December 2012 16:57, Amit Kapila <amit.kapila@huawei.com>
wrote:
Performance: Average of 3 runs of pgbench in tps
9.3devel | with trailing null patch
----------+--------------------------
578.9872 | 573.4980On balance, it would seem optimizing for this special case would
affect everybody negatively; not much, but enough. Which means we
should rekect this patch.Do you have a reason why a different view might be taken?
I have tried to dig why this gap is coming. I have observed that
there is
very less change in normal path.
I wanted to give it some more time to exactly find if something canbe done
to avoid performance dip in normal execution.
Right now I am busy in certain other work. But definitely in coming
week or
so, I shall spare time to work on it again.
Perhaps. Not every idea produces useful outcomes. Even after your
excellent research, it appears we haven't made this work yet. It's a
shame. Should we invest more time? It's considered rude to advise
others how to spend their time, but let me say this: we simply don't
have enough time to do everything and we need to be selective,
prioritising our time on to the things that look to give the best
benefit.
I think we can reject this patch.
With Regards,
Amit Kapila.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Jan 09, 2013 at 11:22:06AM +0000, Simon Riggs wrote:
On 24 December 2012 16:57, Amit Kapila <amit.kapila@huawei.com> wrote:
Performance: Average of 3 runs of pgbench in tps
9.3devel | with trailing null patch
----------+--------------------------
578.9872 | 573.4980On balance, it would seem optimizing for this special case would
affect everybody negatively; not much, but enough. Which means we
should rekect this patch.Do you have a reason why a different view might be taken?
We've mostly ignored performance changes of a few percentage points, because
the global consequences of adding or removing code to the binary image can
easily change performance that much. It would be great to get to the point
where we can reason about 1% performance changes, but we're not there. If a
measured +1% performance change would have yielded a different decision, we
should rethink the decision based on more-robust criteria.
(Most of this was said in initial April 2012 discussion.) This patch is a
tough one, because it will rarely help the most-common workloads. If it can
reduce the tuple natts from 9 to 8, the tuple shrinks by a respectable eight
bytes. But if it reduces natts from 72 to 9, we save nothing. It pays off
nicely for the widest, sparsest tables: say, a table with 1000 columns, all
but ten are NULL, and those non-NULL columns are near the front of the table.
I've never seen such a design, but a few folks seemed to find it credible.
I've failed to come up with a non-arbitrary reason to recommend for or against
this patch, so I profess neutrality on the ultimate issue.
Thanks,
nm
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Noah Misch <noah@leadboat.com> writes:
On Wed, Jan 09, 2013 at 11:22:06AM +0000, Simon Riggs wrote:
On balance, it would seem optimizing for this special case would
affect everybody negatively; not much, but enough. Which means we
should rekect this patch.
I've failed to come up with a non-arbitrary reason to recommend for or against
this patch, so I profess neutrality on the ultimate issue.
I think if we can't convincingly show an attractive performance gain,
we should reject the patch on the grounds of added complexity. Now,
the amount of added code surely isn't much, but nonetheless this patch
eliminates an invariant that's always been there (ie, that a tuple's
natts matches the tupdesc it was built with). That will at the very
least complicate forensic work, and it might well break third-party
code, or corners of the core system that we haven't noticed yet.
I'd be okay with this patch if it had more impressive performance
results, but as it is I think we're better off without it.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thursday, January 24, 2013 7:42 AM Noah Misch wrote:
On Wed, Jan 09, 2013 at 11:22:06AM +0000, Simon Riggs wrote:
On 24 December 2012 16:57, Amit Kapila <amit.kapila@huawei.com>
wrote:
Performance: Average of 3 runs of pgbench in tps
9.3devel | with trailing null patch
----------+--------------------------
578.9872 | 573.4980On balance, it would seem optimizing for this special case would
affect everybody negatively; not much, but enough. Which means we
should rekect this patch.Do you have a reason why a different view might be taken?
We've mostly ignored performance changes of a few percentage points,
because
the global consequences of adding or removing code to the binary image
can
easily change performance that much. It would be great to get to the
point
where we can reason about 1% performance changes, but we're not there.
If a
measured +1% performance change would have yielded a different
decision, we
should rethink the decision based on more-robust criteria.
Today, I had taken one more set of readings of original pg_bench which are
below in this mail. The difference is that this set of readings are on Suse
11 and with Shared buffers - 4G
IMO, the changes in this patch are not causing any regression, however it
gives performance/size reduction
in some of the cases as shown by data in previous mails.
So the point to decide is whether the usecases in which it is giving benefit
are worth to have this Patch?
As Tom had already raised some concern's about the code, I think the Patch
can only have merit if the usecase
makes sense to people.
System Configuration:
Hardware : 4 core (Intel(R) Xeon(R) CPU L5408 @ 2.13GHz) & RAM : 24GB
Operating System: Suse-Linux 11.2 x86_64
Sever Configuration:
The database cluster will be initialized with locales
COLLATE: C
CTYPE: C
MESSAGES: en_US.UTF-8
MONETARY: en_US.UTF-8
NUMERIC: en_US.UTF-8
TIME: en_US.UTF-8
shared_buffers = 4GB
checkpoint_segments = 255
checkpoint_timeout = 10min
pgbench:
transaction type: TPC-B (sort of)
scaling factor: 75
query mode: simple
number of clients: 4
number of threads: 4
duration: 300 s
original patch
Run-1: 311 312
Run-2: 307 302
Run-3: 300 312
Run-4: 285 295
Run-5: 308 305
Avg : 302.2 305.2
With Regards,
Amit Kapila.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers