PSA: If you are running Precise/12.04 upgrade your kernel.

Started by Joshua D. Drake, almost 13 years ago, 12 messages, general
#1Joshua D. Drake
jd@commandprompt.com

Hello,

I had the distinct displeasure of staying up entirely too late with a
customer this week because they upgraded to 12.04 and immediately
experienced a huge performance regression. In the process they also
upgraded to PostgreSQL 9.1 from 8.4. There were a lot of knobs to
change/fix/modify because of this. However, nothing I did fixed the
problem. Until... I upgraded the kernel.

Upgrading from the 3.2 kernel shipped with Precise to the 3.9.4 kernel
produced the following results:

http://www.commandprompt.com/blogs/joshua_drake/2013/06/the_steaming_pile_that_is_precise_with_kernel_32/

I have since verified this on more than one machine as well. Upgrading
the kernel has drastically reduced overall IOWAIT times.
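
A quick way to see which kernel series a machine is running is to parse the release string. The Python sketch below is an illustration only and was not part of the original report: the helper names are hypothetical, and it flags nothing more than the 3.2 series discussed here. It cannot tell whether a distribution kernel carries a backported fix.

```python
import platform
import re


def kernel_series(release):
    """Return (major, minor) parsed from a release string such as '3.5.0-32-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def is_reported_slow_series(release):
    """True only for the 3.2 series, the one this report describes as affected."""
    return kernel_series(release) == (3, 2)


if __name__ == "__main__":
    release = platform.release()  # the same string that `uname -r` prints
    if is_reported_slow_series(release):
        print(f"{release}: 3.2 series, consider testing a newer kernel")
    else:
        print(f"{release}: not the 3.2 series")
```

The check is deliberately narrow: it answers "am I on 3.2?" and leaves the question of whether a newer kernel actually helps to a benchmark on the machine itself.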

Sincerely,

JD

--
Command Prompt, Inc. - http://www.commandprompt.com/ 509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
For my dreams of your image that blossoms
a rose in the deeps of my heart. - W.B. Yeats

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#2Scott Marlowe
scott.marlowe@gmail.com
In reply to: Joshua D. Drake (#1)
Re: PSA: If you are running Precise/12.04 upgrade your kernel.

On Thu, Jun 6, 2013 at 4:35 PM, Joshua D. Drake <jd@commandprompt.com> wrote:

Upgrading from the 3.2 kernel shipped with Precise to the 3.9.4 kernel
produced the following results:

I've since heard that 3.4 also fixes this issue.

What are you using for your IO on these boxes?


#3Joshua D. Drake
jd@commandprompt.com
In reply to: Scott Marlowe (#2)
Re: PSA: If you are running Precise/12.04 upgrade your kernel.

On 06/06/2013 03:48 PM, Scott Marlowe wrote:

I've since heard that 3.4 also fixes this issue.

What are you using for your IO on these boxes?

I was able to demonstrate it over iSCSI to a Nimble Storage SAN as well
as DAS with 2 drive RAID 1 for xlogs and 8 drive RAID 10 for data (DL385
G7).

JD


#4Toby Corkindale
toby.corkindale@strategicdata.com.au
In reply to: Joshua D. Drake (#1)
Re: PSA: If you are running Precise/12.04 upgrade your kernel.

On 07/06/13 08:35, Joshua D. Drake wrote:

I have since verified this on more than one machine as well. Upgrading
the kernel has drastically reduced overall IOWAIT times.

I'd be curious to hear whether the same problem applies to the 3.2 kernel
in the recently released Debian "Wheezy".

(My Ubuntu Precise boxes have been running the backported kernels for a
while as it is, but some Debian Squeeze boxes are due to be upgraded to
Debian Wheezy soon.)


#5Nikhil G. Daddikar
ngd@celoxis.com
In reply to: Scott Marlowe (#2)
Re: PSA: If you are running Precise/12.04 upgrade your kernel.

Folks,

This is bad news as I run Ubuntu 12.04 LTS. However, my Ubuntu 12.04 LTS
boxes have been updated to "3.5.0-32-generic" (official update). Any
idea whether PostgreSQL has problems with this kernel? I'd like to
follow the "official" LTS updates because I am not sure what other
surprises I could face if I move to an unofficial one.

Thanks!
Nikhil


#6Toby Corkindale
toby.corkindale@strategicdata.com.au
In reply to: Nikhil G. Daddikar (#5)
Re: PSA: If you are running Precise/12.04 upgrade your kernel.

Perhaps someone with a spare server floating around could install Ubuntu
LTS and run some pgbench benchmarks with the various kernel options?

Like you, I'd have to stick to official updates for production systems.

-Toby
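
One way such a comparison could be scripted is sketched below in Python. This is an assumption-laden illustration, not something anyone in the thread ran: the function names are hypothetical, the client, thread and duration values are arbitrary, and the target database is assumed to have been initialised already with `pgbench -i`. It runs pgbench once and pulls the `tps = ...` figures out of the report so they can be logged next to the kernel release.

```python
import platform
import re
import subprocess


def parse_tps(pgbench_output):
    """Return every 'tps = <number>' value found in pgbench's report text."""
    return [float(v) for v in re.findall(r"tps = (\d+(?:\.\d+)?)", pgbench_output)]


def run_pgbench(dbname, clients=8, threads=2, seconds=60):
    """Run pgbench against an existing, initialised database and return its tps values."""
    cmd = ["pgbench", "-c", str(clients), "-j", str(threads), "-T", str(seconds), dbname]
    completed = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_tps(completed.stdout)


def report(dbname):
    """Print the running kernel release alongside the measured tps values."""
    print(platform.release(), run_pgbench(dbname))
```

Calling `report("bench")` after booting each kernel in turn, with `bench` standing in for whatever database was initialised, gives one line per kernel that can be compared directly.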


#7Bosco Rama
postgres@boscorama.com
In reply to: Joshua D. Drake (#1)
Re: PSA: If you are running Precise/12.04 upgrade your kernel.

On 06/06/13 15:35, Joshua D. Drake wrote:

I had the distinct displeasure of staying up entirely too late with a
customer this week because they upgraded to 12.04 and immediately
experienced a huge performance regression. In the process they also
upgraded to PostgreSQL 9.1 from 8.4. There were a lot of knobs to
change/fix/modify because of this. However, nothing I did fixed the
problem. Until... I upgraded the kernel.

We ran head-long into this problem the day after you posted this. We
are in the process of moving from PG 8.4 on UB Server 10.0 LTS onto
PG 9.2 on UB Server 12.04 LTS and encountered this very issue during the
pg_upgradecluster process.

A colleague mentioned this LKML thread:
<http://lkml.indiana.edu/hypermail/linux/kernel/1210.1/00725.html>

Seems it was fixed in 3.9.x. I'm wondering if there is any way to easily
determine whether the fix was back-ported to the various Ubuntu-maintained
kernels for Precise?

Bosco.


#8Joshua D. Drake
jd@commandprompt.com
In reply to: Bosco Rama (#7)
Re: PSA: If you are running Precise/12.04 upgrade your kernel.

On 06/14/2013 09:12 AM, Bosco Rama wrote:

A colleague mentioned this LKML thread:
<http://lkml.indiana.edu/hypermail/linux/kernel/1210.1/00725.html>

Seems it was fixed in 3.9.x. I'm wondering if there is any way to easily
determine whether the fix was back-ported to the various Ubuntu-maintained
kernels for Precise?

It is pretty easy to test for using iozone with multiple threads.
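
For readers without iozone at hand, here is a rough Python stand-in. It is not iozone and not the test used above: it simply has several threads write and fsync separate files at the same time and reports aggregate throughput. The function names and default sizes are assumptions chosen for illustration, and the numbers are only meaningful when comparing runs on the same hardware under different kernels.

```python
import os
import tempfile
import threading
import time


def write_and_sync(path, block, count):
    """Write `count` copies of `block` to `path`, then fsync before closing."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        for _ in range(count):
            os.write(fd, block)
        os.fsync(fd)  # force data to storage so the timing includes real I/O
    finally:
        os.close(fd)


def run_write_test(directory, threads=4, block_size=8192, blocks_per_thread=1024):
    """Return (elapsed_seconds, bytes_per_second) for `threads` concurrent writers."""
    block = b"\0" * block_size
    paths = [os.path.join(directory, f"iotest_{i}.dat") for i in range(threads)]
    workers = [
        threading.Thread(target=write_and_sync, args=(path, block, blocks_per_thread))
        for path in paths
    ]
    start = time.perf_counter()
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    elapsed = time.perf_counter() - start
    for path in paths:
        os.remove(path)  # clean up the scratch files
    return elapsed, threads * block_size * blocks_per_thread / elapsed


if __name__ == "__main__":
    # Uses the system temp directory; point `dir=` at the volume under test instead.
    with tempfile.TemporaryDirectory() as scratch:
        seconds, rate = run_write_test(scratch)
        print(f"{seconds:.3f} s, {rate / 1e6:.1f} MB/s")
```

Running it with a thread count of 1 and then with several threads, on each kernel, shows whether throughput degrades specifically when writers run concurrently.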

JD


#9Stuart Bishop
stuart@stuartbishop.net
In reply to: Joshua D. Drake (#3)
Re: PSA: If you are running Precise/12.04 upgrade your kernel.

On Fri, Jun 7, 2013 at 5:51 AM, Joshua D. Drake <jd@commandprompt.com> wrote:

I was able to demonstrate it over iSCSI to a Nimble Storage SAN as well as
DAS with 2 drive RAID 1 for xlogs and 8 drive RAID 10 for data (DL385 G7).

This might sound familiar:

http://postgresql.1045698.n5.nabble.com/Ubuntu-12-04-3-2-Kernel-Bad-for-PostgreSQL-Performance-td5735284.html

The tl;dr for that thread seems to be a driver problem (FusionIO?); I'm
unsure whether it is Ubuntu-specific or in the upstream kernel.

--
Stuart Bishop <stuart@stuartbishop.net>
http://www.stuartbishop.net/


#10Joshua D. Drake
jd@commandprompt.com
In reply to: Stuart Bishop (#9)
Re: PSA: If you are running Precise/12.04 upgrade your kernel.

On 06/17/2013 01:34 PM, Stuart Bishop wrote:

This might sound familiar:

http://postgresql.1045698.n5.nabble.com/Ubuntu-12-04-3-2-Kernel-Bad-for-PostgreSQL-Performance-td5735284.html

The tl;dr for that thread seems to be a driver problem (FusionIO?); I'm
unsure whether it is Ubuntu-specific or in the upstream kernel.

If it is a driver problem, then two different drivers were buggy: the
Nimble Storage SAN driver (iSCSI) as well as the DL385 DAS (LSI). Anyway,
the upgrade to 3.9 makes the problem disappear. There are other insights
in the comments on the blog post.

JD


#11Shaun Thomas
sthomas@optionshouse.com
In reply to: Joshua D. Drake (#10)
Re: PSA: If you are running Precise/12.04 upgrade your kernel.

On 06/17/2013 04:00 PM, Joshua D. Drake wrote:

http://postgresql.1045698.n5.nabble.com/Ubuntu-12-04-3-2-Kernel-Bad-for-PostgreSQL-Performance-td5735284.html

The tl;dr for that thread seems to be a driver problem (FusionIO?); I'm
unsure whether it is Ubuntu-specific or in the upstream kernel.

That instance wasn't a driver problem. The problem was that the FusionIO
driver uses kernel threads to perform IO, and it seems that several of
the 3.x kernels have issues with task migration using the new CFS CPU
scheduler which replaced the O(1) one.

The next thread related to this that fixed our particular case was this one:

/messages/by-id/50E4AAB1.9040902@optionshouse.com

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-676-8870
sthomas@optionshouse.com

______________________________________________

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email


#12Scott Marlowe
scott.marlowe@gmail.com
In reply to: Shaun Thomas (#11)
Re: PSA: If you are running Precise/12.04 upgrade your kernel.

Good to know. I've got a few spare machines I might be able to test
3.2 kernels on in the next few months.


--
To understand recursion, one must first understand recursion.
