SET work_mem = '1TB';

Started by Simon Riggsover 12 years ago12 messages
#1Simon Riggs
simon@2ndQuadrant.com
1 attachment(s)

I worked up a small patch to support Terabyte setting for memory.
Which is OK, but it only works for 1TB, not for 2TB or above.

Which highlights that since we measure things in kB, we have an
inherent limit of 2047GB for our memory settings. It isn't beyond
belief we'll want to go that high, or at least won't be by end 2014
and will be annoying sometime before 2020.

Solution seems to be to support something potentially bigger than INT
for GUCs. So we can reclassify GUC_UNIT_MEMORY according to the
platform we're on.

Opinions?

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

terabyte_work_mem.v1.patchapplication/octet-stream; name=terabyte_work_mem.v1.patchDownload
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 22ba35f..843d7e1 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -4837,7 +4837,7 @@ parse_int(const char *value, int *result, int flags, const char **hintmsg)
 		{
 			/* Set hint for use if no match or trailing garbage */
 			if (hintmsg)
-				*hintmsg = gettext_noop("Valid units for this parameter are \"kB\", \"MB\", and \"GB\".");
+				*hintmsg = gettext_noop("Valid units for this parameter are \"kB\", \"MB\", \"GB\" and \"TB\".");
 
 #if BLCKSZ < 1024 || BLCKSZ > (1024*1024)
 #error BLCKSZ must be between 1KB and 1MB
@@ -4891,6 +4891,22 @@ parse_int(const char *value, int *result, int flags, const char **hintmsg)
 						break;
 				}
 			}
+			else if (strncmp(endptr, "TB", 2) == 0)
+			{
+				endptr += 2;
+				switch (flags & GUC_UNIT_MEMORY)
+				{
+					case GUC_UNIT_KB:
+						val *= 1024 * KB_PER_GB;
+						break;
+					case GUC_UNIT_BLOCKS:
+						val *= 1024 * KB_PER_GB / (BLCKSZ / 1024);
+						break;
+					case GUC_UNIT_XBLOCKS:
+						val *= 1024 * KB_PER_GB / (XLOG_BLCKSZ / 1024);
+						break;
+				}
+			}
 		}
 		else if (flags & GUC_UNIT_TIME)
 		{
#2Gavin Flower
GavinFlower@archidevsys.co.nz
In reply to: Simon Riggs (#1)
Re: SET work_mem = '1TB';

On 22/05/13 09:13, Simon Riggs wrote:

I worked up a small patch to support Terabyte setting for memory.
Which is OK, but it only works for 1TB, not for 2TB or above.

Which highlights that since we measure things in kB, we have an
inherent limit of 2047GB for our memory settings. It isn't beyond
belief we'll want to go that high, or at least won't be by end 2014
and will be annoying sometime before 2020.

Solution seems to be to support something potentially bigger than INT
for GUCs. So we can reclassify GUC_UNIT_MEMORY according to the
platform we're on.

Opinions?

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

I suspect it should be fixed before it starts being a problem, for 2
reasons:

1. best to panic early while we have time
(or more prosaically: doing it soon gives us more time to get it
right without undue pressure)

2. not able to cope with 2TB and above might put off companies with
seriously massive databases from moving to Postgres

Probably an idea to check what other values should be increased as well.

Cheers,
Gavin

#3Jeff Janes
jeff.janes@gmail.com
In reply to: Simon Riggs (#1)
1 attachment(s)
Re: SET work_mem = '1TB';

On Tuesday, May 21, 2013, Simon Riggs wrote:

I worked up a small patch to support Terabyte setting for memory.
Which is OK, but it only works for 1TB, not for 2TB or above.

I've incorporated my review into a new version, attached.

Added "TB" to the docs, added the macro KB_PER_TB, and made "show" to print
"1TB" rather than "1024GB".

I tested several of the memory settings to see that it can be set and
retrieved. I haven't tested actual execution as I don't have that kind of
RAM.

I don't see how it could have a performance impact, it passes make check
etc., and I don't think it warrants a new regression test.

I'll set it to ready for committer.

Cheers,

Jeff

Attachments:

terabyte_work_mem.JJ.v2.patchtext/x-patch; charset=US-ASCII; name=terabyte_work_mem.JJ.v2.patchDownload
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
new file mode 100644
index c7d84b5..940ed6e
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
***************
*** 39,45 ****
       For convenience,
       a different unit can also be specified explicitly.  Valid memory units
       are <literal>kB</literal> (kilobytes), <literal>MB</literal>
!      (megabytes), and <literal>GB</literal> (gigabytes); valid time units
       are <literal>ms</literal> (milliseconds), <literal>s</literal>
       (seconds), <literal>min</literal> (minutes), <literal>h</literal>
       (hours), and <literal>d</literal> (days).  Note that the multiplier
--- 39,45 ----
       For convenience,
       a different unit can also be specified explicitly.  Valid memory units
       are <literal>kB</literal> (kilobytes), <literal>MB</literal>
!      (megabytes), <literal>GB</literal> (gigabytes), and <literal>TB</literal> (terabytes); valid time units
       are <literal>ms</literal> (milliseconds), <literal>s</literal>
       (seconds), <literal>min</literal> (minutes), <literal>h</literal>
       (hours), and <literal>d</literal> (days).  Note that the multiplier
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
new file mode 100644
index ea16c64..0e5b0c9
*** a/src/backend/utils/misc/guc.c
--- b/src/backend/utils/misc/guc.c
***************
*** 105,110 ****
--- 105,111 ----
  
  #define KB_PER_MB (1024)
  #define KB_PER_GB (1024*1024)
+ #define KB_PER_TB (1024*1024*1024)
  
  #define MS_PER_S 1000
  #define S_PER_MIN 60
*************** parse_int(const char *value, int *result
*** 4837,4843 ****
  		{
  			/* Set hint for use if no match or trailing garbage */
  			if (hintmsg)
! 				*hintmsg = gettext_noop("Valid units for this parameter are \"kB\", \"MB\", and \"GB\".");
  
  #if BLCKSZ < 1024 || BLCKSZ > (1024*1024)
  #error BLCKSZ must be between 1KB and 1MB
--- 4838,4844 ----
  		{
  			/* Set hint for use if no match or trailing garbage */
  			if (hintmsg)
! 				*hintmsg = gettext_noop("Valid units for this parameter are \"kB\", \"MB\", \"GB\" and \"TB\".");
  
  #if BLCKSZ < 1024 || BLCKSZ > (1024*1024)
  #error BLCKSZ must be between 1KB and 1MB
*************** parse_int(const char *value, int *result
*** 4891,4896 ****
--- 4892,4913 ----
  						break;
  				}
  			}
+ 			else if (strncmp(endptr, "TB", 2) == 0)
+ 			{
+ 				endptr += 2;
+ 				switch (flags & GUC_UNIT_MEMORY)
+ 				{
+ 					case GUC_UNIT_KB:
+ 						val *= KB_PER_TB;
+ 						break;
+ 					case GUC_UNIT_BLOCKS:
+ 						val *= KB_PER_TB / (BLCKSZ / 1024);
+ 						break;
+ 					case GUC_UNIT_XBLOCKS:
+ 						val *= KB_PER_TB / (XLOG_BLCKSZ / 1024);
+ 						break;
+ 				}
+ 			}
  		}
  		else if (flags & GUC_UNIT_TIME)
  		{
*************** _ShowOption(struct config_generic * reco
*** 7384,7390 ****
  								break;
  						}
  
! 						if (result % KB_PER_GB == 0)
  						{
  							result /= KB_PER_GB;
  							unit = "GB";
--- 7401,7412 ----
  								break;
  						}
  
! 						if (result % KB_PER_TB == 0)
! 						{
! 							result /= KB_PER_TB;
! 							unit = "TB";
! 						}
! 						else if (result % KB_PER_GB == 0)
  						{
  							result /= KB_PER_GB;
  							unit = "GB";
#4Fujii Masao
masao.fujii@gmail.com
In reply to: Jeff Janes (#3)
1 attachment(s)
Re: SET work_mem = '1TB';

On Tue, Jun 18, 2013 at 1:06 PM, Jeff Janes <jeff.janes@gmail.com> wrote:

On Tuesday, May 21, 2013, Simon Riggs wrote:

I worked up a small patch to support Terabyte setting for memory.
Which is OK, but it only works for 1TB, not for 2TB or above.

I've incorporated my review into a new version, attached.

Added "TB" to the docs, added the macro KB_PER_TB, and made "show" to print
"1TB" rather than "1024GB".

Looks good to me. But I found you forgot to change postgresql.conf.sample,
so I changed it and attached the updated version of the patch.

Barring any objection to this patch and if no one picks up this, I
will commit this.

Regards,

--
Fujii Masao

Attachments:

terabyte_work_mem_fujii_v3.patchapplication/octet-stream; name=terabyte_work_mem_fujii_v3.patchDownload
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
***************
*** 39,45 ****
       For convenience,
       a different unit can also be specified explicitly.  Valid memory units
       are <literal>kB</literal> (kilobytes), <literal>MB</literal>
!      (megabytes), and <literal>GB</literal> (gigabytes); valid time units
       are <literal>ms</literal> (milliseconds), <literal>s</literal>
       (seconds), <literal>min</literal> (minutes), <literal>h</literal>
       (hours), and <literal>d</literal> (days).  Note that the multiplier
--- 39,45 ----
       For convenience,
       a different unit can also be specified explicitly.  Valid memory units
       are <literal>kB</literal> (kilobytes), <literal>MB</literal>
!      (megabytes), <literal>GB</literal> (gigabytes), and <literal>TB</literal> (terabytes); valid time units
       are <literal>ms</literal> (milliseconds), <literal>s</literal>
       (seconds), <literal>min</literal> (minutes), <literal>h</literal>
       (hours), and <literal>d</literal> (days).  Note that the multiplier
*** a/src/backend/utils/misc/guc.c
--- b/src/backend/utils/misc/guc.c
***************
*** 105,110 ****
--- 105,111 ----
  
  #define KB_PER_MB (1024)
  #define KB_PER_GB (1024*1024)
+ #define KB_PER_TB (1024*1024*1024)
  
  #define MS_PER_S 1000
  #define S_PER_MIN 60
***************
*** 4837,4843 **** parse_int(const char *value, int *result, int flags, const char **hintmsg)
  		{
  			/* Set hint for use if no match or trailing garbage */
  			if (hintmsg)
! 				*hintmsg = gettext_noop("Valid units for this parameter are \"kB\", \"MB\", and \"GB\".");
  
  #if BLCKSZ < 1024 || BLCKSZ > (1024*1024)
  #error BLCKSZ must be between 1KB and 1MB
--- 4838,4844 ----
  		{
  			/* Set hint for use if no match or trailing garbage */
  			if (hintmsg)
! 				*hintmsg = gettext_noop("Valid units for this parameter are \"kB\", \"MB\", \"GB\", and \"TB\".");
  
  #if BLCKSZ < 1024 || BLCKSZ > (1024*1024)
  #error BLCKSZ must be between 1KB and 1MB
***************
*** 4891,4896 **** parse_int(const char *value, int *result, int flags, const char **hintmsg)
--- 4892,4913 ----
  						break;
  				}
  			}
+ 			else if (strncmp(endptr, "TB", 2) == 0)
+ 			{
+ 				endptr += 2;
+ 				switch (flags & GUC_UNIT_MEMORY)
+ 				{
+ 					case GUC_UNIT_KB:
+ 						val *= KB_PER_TB;
+ 						break;
+ 					case GUC_UNIT_BLOCKS:
+ 						val *= KB_PER_TB / (BLCKSZ / 1024);
+ 						break;
+ 					case GUC_UNIT_XBLOCKS:
+ 						val *= KB_PER_TB / (XLOG_BLCKSZ / 1024);
+ 						break;
+ 				}
+ 			}
  		}
  		else if (flags & GUC_UNIT_TIME)
  		{
***************
*** 7384,7390 **** _ShowOption(struct config_generic * record, bool use_units)
  								break;
  						}
  
! 						if (result % KB_PER_GB == 0)
  						{
  							result /= KB_PER_GB;
  							unit = "GB";
--- 7401,7412 ----
  								break;
  						}
  
! 						if (result % KB_PER_TB == 0)
! 						{
! 							result /= KB_PER_TB;
! 							unit = "TB";
! 						}
! 						else if (result % KB_PER_GB == 0)
  						{
  							result /= KB_PER_GB;
  							unit = "GB";
*** a/src/backend/utils/misc/postgresql.conf.sample
--- b/src/backend/utils/misc/postgresql.conf.sample
***************
*** 27,33 ****
  # Memory units:  kB = kilobytes        Time units:  ms  = milliseconds
  #                MB = megabytes                     s   = seconds
  #                GB = gigabytes                     min = minutes
! #                                                   h   = hours
  #                                                   d   = days
  
  
--- 27,33 ----
  # Memory units:  kB = kilobytes        Time units:  ms  = milliseconds
  #                MB = megabytes                     s   = seconds
  #                GB = gigabytes                     min = minutes
! #                TB = terabytes                     h   = hours
  #                                                   d   = days
  
  
#5Simon Riggs
simon@2ndQuadrant.com
In reply to: Fujii Masao (#4)
Re: SET work_mem = '1TB';

On 18 June 2013 17:10, Fujii Masao <masao.fujii@gmail.com> wrote:

On Tue, Jun 18, 2013 at 1:06 PM, Jeff Janes <jeff.janes@gmail.com> wrote:

On Tuesday, May 21, 2013, Simon Riggs wrote:

I worked up a small patch to support Terabyte setting for memory.
Which is OK, but it only works for 1TB, not for 2TB or above.

I've incorporated my review into a new version, attached.

Added "TB" to the docs, added the macro KB_PER_TB, and made "show" to print
"1TB" rather than "1024GB".

Looks good to me. But I found you forgot to change postgresql.conf.sample,
so I changed it and attached the updated version of the patch.

Barring any objection to this patch and if no one picks up this, I
will commit this.

In truth, I hadn't realised somebody had added this to the CF. It was
meant to be an exploration and demonstration that further work was/is
required rather than a production quality submission. AFAICS it is
still limited to '1 TB' only...

Thank you both for adding to this patch. Since you've done that, it
seems churlish of me to interrupt that commit.

I will make a note to extend the support to higher values of TBs later.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Josh Berkus
josh@agliodbs.com
In reply to: Simon Riggs (#1)
Re: SET work_mem = '1TB';

In truth, I hadn't realised somebody had added this to the CF. It was
meant to be an exploration and demonstration that further work was/is
required rather than a production quality submission. AFAICS it is
still limited to '1 TB' only...

At the beginning of the CF, I do a sweep of patch files emailed to
-hackers and not in the CF. I believe there were three such of yours,
take a look at the CF list. Like I said, better to track them
unnecessarily than to lose them.

Thank you both for adding to this patch. Since you've done that, it
seems churlish of me to interrupt that commit.

Well, I think that someone needs to actually test doing a sort with,
say, 100GB of RAM and make sure it doesn't crash. Anyone have a machine
they can try that on?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Stephen Frost
sfrost@snowman.net
In reply to: Josh Berkus (#6)
Re: SET work_mem = '1TB';

* Josh Berkus (josh@agliodbs.com) wrote:

Well, I think that someone needs to actually test doing a sort with,
say, 100GB of RAM and make sure it doesn't crash. Anyone have a machine
they can try that on?

It can be valuable to bump up work_mem well beyond the amount of system
memory actually available on the system to get the 'right' plan to be
chosen (which often ends up needing much less actual memory to run).

I've used that trick on a box w/ 512GB of RAM and had near-100G PG
backend processes which were doing hashjoins. Don't think I've ever had
it try doing a sort w/ a really big work_mem.

Thanks,

Stephen

#8Simon Riggs
simon@2ndQuadrant.com
In reply to: Josh Berkus (#6)
Re: SET work_mem = '1TB';

On 18 June 2013 18:45, Josh Berkus <josh@agliodbs.com> wrote:

In truth, I hadn't realised somebody had added this to the CF. It was
meant to be an exploration and demonstration that further work was/is
required rather than a production quality submission. AFAICS it is
still limited to '1 TB' only...

At the beginning of the CF, I do a sweep of patch files emailed to
-hackers and not in the CF. I believe there were three such of yours,
take a look at the CF list. Like I said, better to track them
unnecessarily than to lose them.

Thanks. Please delete the patch marked "Batch API for After Triggers".
All others are submissions by me.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Josh Berkus
josh@agliodbs.com
In reply to: Simon Riggs (#1)
Re: SET work_mem = '1TB';

On 06/18/2013 10:59 AM, Simon Riggs wrote:

Thanks. Please delete the patch marked "Batch API for After Triggers".
All others are submissions by me.

The CF app doesn't permit deletion of patches, so I marked it "returned
with feedback".

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Fujii Masao
masao.fujii@gmail.com
In reply to: Simon Riggs (#5)
Re: SET work_mem = '1TB';

On Wed, Jun 19, 2013 at 2:40 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

On 18 June 2013 17:10, Fujii Masao <masao.fujii@gmail.com> wrote:

On Tue, Jun 18, 2013 at 1:06 PM, Jeff Janes <jeff.janes@gmail.com> wrote:

On Tuesday, May 21, 2013, Simon Riggs wrote:

I worked up a small patch to support Terabyte setting for memory.
Which is OK, but it only works for 1TB, not for 2TB or above.

I've incorporated my review into a new version, attached.

Added "TB" to the docs, added the macro KB_PER_TB, and made "show" to print
"1TB" rather than "1024GB".

Looks good to me. But I found you forgot to change postgresql.conf.sample,
so I changed it and attached the updated version of the patch.

Barring any objection to this patch and if no one picks up this, I
will commit this.

In truth, I hadn't realised somebody had added this to the CF. It was
meant to be an exploration and demonstration that further work was/is
required rather than a production quality submission. AFAICS it is
still limited to '1 TB' only...

Yes.

Thank you both for adding to this patch. Since you've done that, it
seems churlish of me to interrupt that commit.

I was thinking that this is the infrastructure patch for your future
proposal, i.e., support higher values of TBs. But if it interferes with
your future proposal, of course I'm okay to drop this patch. Thought?

Regards,

--
Fujii Masao

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Simon Riggs
simon@2ndQuadrant.com
In reply to: Fujii Masao (#10)
Re: SET work_mem = '1TB';

On 18 June 2013 22:57, Fujii Masao <masao.fujii@gmail.com> wrote:

On Wed, Jun 19, 2013 at 2:40 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

On 18 June 2013 17:10, Fujii Masao <masao.fujii@gmail.com> wrote:

On Tue, Jun 18, 2013 at 1:06 PM, Jeff Janes <jeff.janes@gmail.com> wrote:

On Tuesday, May 21, 2013, Simon Riggs wrote:

I worked up a small patch to support Terabyte setting for memory.
Which is OK, but it only works for 1TB, not for 2TB or above.

I've incorporated my review into a new version, attached.

Added "TB" to the docs, added the macro KB_PER_TB, and made "show" to print
"1TB" rather than "1024GB".

Looks good to me. But I found you forgot to change postgresql.conf.sample,
so I changed it and attached the updated version of the patch.

Barring any objection to this patch and if no one picks up this, I
will commit this.

In truth, I hadn't realised somebody had added this to the CF. It was
meant to be an exploration and demonstration that further work was/is
required rather than a production quality submission. AFAICS it is
still limited to '1 TB' only...

Yes.

Thank you both for adding to this patch. Since you've done that, it
seems churlish of me to interrupt that commit.

I was thinking that this is the infrastructure patch for your future
proposal, i.e., support higher values of TBs. But if it interferes with
your future proposal, of course I'm okay to drop this patch. Thought?

Yes, please commit.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Fujii Masao
masao.fujii@gmail.com
In reply to: Simon Riggs (#11)
Re: SET work_mem = '1TB';

On Wed, Jun 19, 2013 at 4:47 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

On 18 June 2013 22:57, Fujii Masao <masao.fujii@gmail.com> wrote:

On Wed, Jun 19, 2013 at 2:40 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

On 18 June 2013 17:10, Fujii Masao <masao.fujii@gmail.com> wrote:

On Tue, Jun 18, 2013 at 1:06 PM, Jeff Janes <jeff.janes@gmail.com> wrote:

On Tuesday, May 21, 2013, Simon Riggs wrote:

I worked up a small patch to support Terabyte setting for memory.
Which is OK, but it only works for 1TB, not for 2TB or above.

I've incorporated my review into a new version, attached.

Added "TB" to the docs, added the macro KB_PER_TB, and made "show" to print
"1TB" rather than "1024GB".

Looks good to me. But I found you forgot to change postgresql.conf.sample,
so I changed it and attached the updated version of the patch.

Barring any objection to this patch and if no one picks up this, I
will commit this.

In truth, I hadn't realised somebody had added this to the CF. It was
meant to be an exploration and demonstration that further work was/is
required rather than a production quality submission. AFAICS it is
still limited to '1 TB' only...

Yes.

Thank you both for adding to this patch. Since you've done that, it
seems churlish of me to interrupt that commit.

I was thinking that this is the infrastructure patch for your future
proposal, i.e., support higher values of TBs. But if it interferes with
your future proposal, of course I'm okay to drop this patch. Thought?

Yes, please commit.

Committed.

Regards,

--
Fujii Masao

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers