Optimise PostgreSQL for fast testing

Started by Dmytrii Nagirniakabout 14 years ago26 messagesgeneral
Jump to latest
#1Dmytrii Nagirniak
dnagir@gmail.com

Hi guys,

I wonder if you can suggest me how to speed-up PG when running specs.
I asked it at SO here:

http://stackoverflow.com/questions/9407442/optimise-postgresql-for-fast-testing

But briefly. PG specs are >2x slower than SQLite.
I want it to be on par (don't care about reliability or anything, just need
fast specs).

Would appreciate some suggestions.

Cheers,
Dmytrii Nagirniak
http://ApproachE.com <http://www.ApproachE.com&gt;

#2Jan Kesten
jan@dafuer.de
In reply to: Dmytrii Nagirniak (#1)
Re: Optimise PostgreSQL for fast testing

Hi Dmytrii,

just as short idea, put "fsync = off" in your postgres.conf. That turns off that after a commit data is forcilby written to disk - if the database crashes there might be dataloss.

Von meinem iPhone gesendet

Am 23.02.2012 um 06:13 schrieb Dmytrii Nagirniak <dnagir@gmail.com>:

Show quoted text

Hi guys,

I wonder if you can suggest me how to speed-up PG when running specs.
I asked it at SO here:

http://stackoverflow.com/questions/9407442/optimise-postgresql-for-fast-testing

But briefly. PG specs are >2x slower than SQLite.
I want it to be on par (don't care about reliability or anything, just need fast specs).

Would appreciate some suggestions.

Cheers,
Dmytrii Nagirniak
http://ApproachE.com

#3Dmytrii Nagirniak
dnagir@gmail.com
In reply to: Jan Kesten (#2)
Re: Optimise PostgreSQL for fast testing

On 23/02/2012, at 4:38 PM, Jan Kesten wrote:

Hi Dmytrii,

just as short idea, put "fsync = off" in your postgres.conf. That turns off that after a commit data is forcilby written to disk - if the database crashes there might be dataloss.

Thanks. So far I tried:

fsync = off
full_page_writes = off

It seems it got a *little* faster (down to ~65 seconds from ~76) but is till too far from my target of ~34 secs.

#4Pavel Stehule
pavel.stehule@gmail.com
In reply to: Dmytrii Nagirniak (#3)
Re: Optimise PostgreSQL for fast testing

Hello

SQLite should be faster in single user test - it is optimized for this
purpose. So you cannot to get same speed from PostgreSQL

Pavel

2012/2/23 Dmytrii Nagirniak <dnagir@gmail.com>:

Show quoted text

On 23/02/2012, at 4:38 PM, Jan Kesten wrote:

Hi Dmytrii,

just as short idea, put "fsync = off" in your postgres.conf. That turns off
that after a commit data is forcilby written to disk - if the database
crashes there might be dataloss.

Thanks. So far I tried:

fsync = off
full_page_writes = off

It seems it got a *little* faster (down to ~65 seconds from ~76) but is till
too far from my target of ~34 secs.

#5Dmytrii Nagirniak
dnagir@gmail.com
In reply to: Pavel Stehule (#4)
Re: Optimise PostgreSQL for fast testing

On 23/02/2012, at 5:05 PM, Pavel Stehule wrote:

SQLite should be faster in single user test - it is optimized for this
purpose. So you cannot to get same speed from PostgreSQL

That's unfortunate to hear.
But hoped with a bit of tuning to get PG close to SQLite by the fact that I can change the settings in such a way so it runs more like a single user DB.

#6Pavel Stehule
pavel.stehule@gmail.com
In reply to: Dmytrii Nagirniak (#5)
Re: Optimise PostgreSQL for fast testing

2012/2/23 Dmytrii Nagirniak <dnagir@gmail.com>:

On 23/02/2012, at 5:05 PM, Pavel Stehule wrote:

SQLite should be faster in single user test - it is optimized for this
purpose. So you cannot to get same speed from PostgreSQL

That's unfortunate to hear.
But hoped with a bit of tuning to get PG close to SQLite by the fact that I can change the settings in such a way so it runs more like a single user DB.

It depends on test queries and data set size - but with simple
queries and small dataset SQLite should be 2x - 10x faster than
PostgreSQL.

PostgreSQL is optimized for complex queries and multi user environment

Pavel

Show quoted text
#7Marti Raudsepp
marti@juffo.org
In reply to: Dmytrii Nagirniak (#3)
Re: Optimise PostgreSQL for fast testing

On Thu, Feb 23, 2012 at 08:02, Dmytrii Nagirniak <dnagir@gmail.com> wrote:

Thanks. So far I tried:

fsync = off
full_page_writes = off

It seems it got a *little* faster (down to ~65 seconds from ~76) but is till
too far from my target of ~34 secs.

If you have lots of very simple queries, then usually much of the
overhead is in query planning. There are a few tricks to dumb down the
planner to make it faster -- although that may come at the cost of
slowing down execution.

* If your queries use joins at all, you can reduce planning overhead
by setting join_collapse_limit=1 and from_collapse_limit=1 or some
other low number.
* Set default_statistics_target=5 or another low number and run
ANALYZE on the whole database.
* Set enable_bitmapscan=off, enable_material=off,
enable_mergejoin=off, enable_hashjoin=off -- this will prevent the
planner for trying certain kinds of plans in the first place.

Just for the heck of it, you might gain a bit by setting
track_activities=off, update_process_title=off

Regards,
Marti

#8Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Dmytrii Nagirniak (#1)
Re: Optimise PostgreSQL for fast testing

On Wednesday, February 22, 2012 9:13:06 pm Dmytrii Nagirniak wrote:

Hi guys,

I wonder if you can suggest me how to speed-up PG when running specs.
I asked it at SO here:

http://stackoverflow.com/questions/9407442/optimise-postgresql-for-fast-tes
ting

But briefly. PG specs are >2x slower than SQLite.
I want it to be on par (don't care about reliability or anything, just need
fast specs).

Would appreciate some suggestions.

Not enough information to make suggestions.
Why are you switching databases?
What are the specs?
What is the application?
What is the use case?
Single user?
Networked, multiple user?
Do you see the application/database growing?

At this point you are comparing apples and oranges. Sqlite is basically a single
user embedded database, Postgres a multi user, networked database. They both
work well for the use they are designed for, it is a matter of determining which
is a better fit for your anticipated use.

Cheers,
Dmytrii Nagirniak
http://ApproachE.com <http://www.ApproachE.com&gt;

--
Adrian Klaver
adrian.klaver@gmail.com

#9Simon Riggs
simon@2ndQuadrant.com
In reply to: Dmytrii Nagirniak (#1)
Re: Optimise PostgreSQL for fast testing

On Thu, Feb 23, 2012 at 5:13 AM, Dmytrii Nagirniak <dnagir@gmail.com> wrote:

I wonder if you can suggest me how to speed-up PG when running specs.
I asked it at SO here:

http://stackoverflow.com/questions/9407442/optimise-postgresql-for-fast-testing

But briefly. PG specs are >2x slower than SQLite.
I want it to be on par (don't care about reliability or anything, just need
fast specs).

Would appreciate some suggestions.

You really need to explain why this matters...

You mention a "typical Ruby on Rails app" and then discuss SQLite.
Well, typical web apps have more than 1 user, so fairly obviously
using SQLite isn't appropriate. If SQLite isn't appropriate, why are
you testing with it? How does a test run on a database you aren't
using in production tell you anything about the success or otherwise
of your program. It doesn't, so saying it runs quicker is irrelevant,
surely?

Perhaps just run half the test, that would make it twice as quick and
still just as valid.

If Postgres tests run in ~1 minute, what benefit have you gained from
saving 30 seconds? How often are you running tests?

So please explain a little more.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

#10Jack Christensen
jackc@hylesanderson.edu
In reply to: Simon Riggs (#9)
Re: Optimise PostgreSQL for fast testing

On 2/23/2012 9:22 AM, Simon Riggs wrote:

On Thu, Feb 23, 2012 at 5:13 AM, Dmytrii Nagirniak<dnagir@gmail.com> wrote:

I wonder if you can suggest me how to speed-up PG when running specs.
I asked it at SO here:

http://stackoverflow.com/questions/9407442/optimise-postgresql-for-fast-testing

But briefly. PG specs are>2x slower than SQLite.
I want it to be on par (don't care about reliability or anything, just need
fast specs).

Would appreciate some suggestions.

You really need to explain why this matters...

You mention a "typical Ruby on Rails app" and then discuss SQLite.
Well, typical web apps have more than 1 user, so fairly obviously
using SQLite isn't appropriate. If SQLite isn't appropriate, why are
you testing with it? How does a test run on a database you aren't
using in production tell you anything about the success or otherwise
of your program. It doesn't, so saying it runs quicker is irrelevant,
surely?

Perhaps just run half the test, that would make it twice as quick and
still just as valid.

If Postgres tests run in ~1 minute, what benefit have you gained from
saving 30 seconds? How often are you running tests?

So please explain a little more.

As another Rails developer using PostgreSQL I think I can explain the
use case. In standard Rails usage, the ORM handles all SQL query
generation and thus the application is database agnostic. It is typical
to use SQLite in development and testing and MySQL or PostgreSQL in
production. However, if any PostgreSQL specific functionality is used
then obviously PostgreSQL must also be used in development and testing.

Another common practice is test-driven development. So the test suite
for the application may run scores or hundreds of times per day per
developer. So the speed of the test suite is of vital importance to
developers. A 30 second difference 100's of times per day really can add
up.

--
Jack Christensen
jackc@hylesanderson.edu

#11David Salisbury
salisbury@globe.gov
In reply to: Jack Christensen (#10)
Re: Optimise PostgreSQL for fast testing

On 2/23/12 9:06 AM, Jack Christensen wrote:

As another Rails developer using PostgreSQL I think I can explain the
use case. In standard Rails usage, the ORM handles all SQL query
generation and thus the application is database agnostic. It is typical
to use SQLite in development and testing and MySQL or PostgreSQL in
production. However, if any PostgreSQL specific functionality is used
then obviously PostgreSQL must also be used in development and testing.

Another common practice is test-driven development. So the test suite
for the application may run scores or hundreds of times per day per
developer. So the speed of the test suite is of vital importance to
developers. A 30 second difference 100's of times per day really can add
up.

Perhaps the emphasis should be on the tests themselves, and not PG cycles.
Is he using Factory or Factory.build?.. that sort of thing. Is he running
the entire test suite, when in fact just running one test would do until
final checkin?

And I'm curious as to why anyone would need to run tests 100s of times a day.
How much code can ya write, or is he simply writing tests themselves all day?

-ds

#12Scott Marlowe
scott.marlowe@gmail.com
In reply to: David Salisbury (#11)
Re: Optimise PostgreSQL for fast testing

On Thu, Feb 23, 2012 at 10:05 AM, David Salisbury <salisbury@globe.gov> wrote:

On 2/23/12 9:06 AM, Jack Christensen wrote:

As another Rails developer using PostgreSQL I think I can explain the
use case. In standard Rails usage, the ORM handles all SQL query
generation and thus the application is database agnostic. It is typical
to use SQLite in development and testing and MySQL or PostgreSQL in
production. However, if any PostgreSQL specific functionality is used
then obviously PostgreSQL must also be used in development and testing.

Another common practice is test-driven development. So the test suite
for the application may run scores or hundreds of times per day per
developer. So the speed of the test suite is of vital importance to
developers. A 30 second difference 100's of times per day really can add
up.

Perhaps the emphasis should be on the tests themselves, and not PG cycles.
Is he using Factory or Factory.build?.. that sort of thing.  Is he running
the entire test suite, when in fact just running one test would do until
final checkin?

And I'm curious as to why anyone would need to run tests 100s of times a
day.
How much code can ya write, or is he simply writing tests themselves all
day?

He's probably doing automated continuous integration testing. Two
jobs ago we had a setup to do that and had 40k tests. The whole test
suite took about 30 minutes to runm and kicked off automatically when
the last one finished and anyone touched any code.

#13Simon Riggs
simon@2ndQuadrant.com
In reply to: Scott Marlowe (#12)
Re: Optimise PostgreSQL for fast testing

On Thu, Feb 23, 2012 at 5:20 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:

He's probably doing automated continuous integration testing.  Two
jobs ago we had a setup to do that and had 40k tests.  The whole test
suite took about 30 minutes to runm and kicked off automatically when
the last one finished and anyone touched any code.

Having lots of tests is a good thing. Bring 'em on.

If you use SQLite for that, then it all runs in a single thread and it
could easily take 30 minutes or longer.

Now all you have to do is parallelise the tests and everything can
work 10 times quicker and it would be much faster than the time SQLite
produced.

So using PostgreSQL for testing would be both quicker and more
accurate, if you set the tests up right.

The PostgreSQL regression tests are parallelised - if they weren't
we'd produce a lot less work

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

#14Scott Marlowe
scott.marlowe@gmail.com
In reply to: Simon Riggs (#13)
Re: Optimise PostgreSQL for fast testing

On Thu, Feb 23, 2012 at 11:15 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

On Thu, Feb 23, 2012 at 5:20 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:

He's probably doing automated continuous integration testing.  Two
jobs ago we had a setup to do that and had 40k tests.  The whole test
suite took about 30 minutes to runm and kicked off automatically when
the last one finished and anyone touched any code.

Having lots of tests is a good thing. Bring 'em on.

If you use SQLite for that, then it all runs in a single thread and it
could easily take 30 minutes or longer.

Now all you have to do is parallelise the tests and everything can
work 10 times quicker and it would be much faster than the time SQLite
produced.

It's funny how once you start thinking of how to optimize to run 8 or
16 or more concurrent tests, you sometimes forget that doing that same
thing to some simpler tools might result in very poor performance til
you have to run the tests on the old system and start wondering why
you ever thought it was fast.

#15Dmytrii Nagirniak
dnagir@gmail.com
In reply to: Marti Raudsepp (#7)
Re: Optimise PostgreSQL for fast testing

On 23/02/2012, at 7:35 PM, Marti Raudsepp wrote:

If you have lots of very simple queries, then usually much of the
overhead is in query planning. There are a few tricks to dumb down the
planner to make it faster -- although that may come at the cost of
slowing down execution.

* If your queries use joins at all, you can reduce planning overhead
by setting join_collapse_limit=1 and from_collapse_limit=1 or some
other low number.
* Set default_statistics_target=5 or another low number and run
ANALYZE on the whole database.
* Set enable_bitmapscan=off, enable_material=off,
enable_mergejoin=off, enable_hashjoin=off -- this will prevent the
planner for trying certain kinds of plans in the first place.

Just for the heck of it, you might gain a bit by setting
track_activities=off, update_process_title=off

Thanks a lot Marti.
After I've applied these settings I couldn't see major improvement over fsync=off. It was pretty much within the error margin.

But you're right that most of the queries are simple and I hope this will work a little bit faster on small runs (like a single test suite).
Will definitely keep an eye on these settings.

So far fsync=off is the best I could get.

Thanks one more time for the help.

Cheers.

#16Dmytrii Nagirniak
dnagir@gmail.com
In reply to: Adrian Klaver (#8)
Re: Optimise PostgreSQL for fast testing

On 24/02/2012, at 2:06 AM, Adrian Klaver wrote:

Would appreciate some suggestions.

Not enough information to make suggestions.

Jack Christensen pretty nailed it very well. But I'll answer the particulars here too:

Why are you switching databases?

Need FTS from PG. No other reasons yet.

What are the specs?

A typical DB spec (test) does the following:
1. Creates a number of records (commonly about 5-ish, but may vary from 1 to ~40 across all tables).
2. Executes some queries against the dataset (**MOST** of them are pretty simple, with only 1-2 joins; only some queries use 5-ish joins, sorting, distinct etc).
3. May update couple of records too (we are talking about a couple only, so it in the range of 1-5, very rarely ~20-30).
4. At the end a spec truncates all the tables (uses truncate, not delete).

This repeats on every spec/test (hundreds of those).

What is the application?

As I said, it is a typical Rails web application.
But for the purpose of this thread it is irrelevant since we're basically talking about single user, development/test environment where the only user of the app is the "spec/test" and no concurrency.

What is the use case?
Single user?
Networked, multiple user?

See above.

Do you see the application/database growing?

No. As I said before the database is used ONLY for running tests and can be recreated at any time.
Generally it will never have any data in it (except when specs are running).

Cheers.

#17Dmytrii Nagirniak
dnagir@gmail.com
In reply to: Simon Riggs (#9)
Re: Optimise PostgreSQL for fast testing

On 24/02/2012, at 2:22 AM, Simon Riggs wrote:

Would appreciate some suggestions.

You really need to explain why this matters...

I've just replied to Adrian with more details. I suppose you don't mind to take a look there so I won't copy-paste it :)

You mention a "typical Ruby on Rails app" and then discuss SQLite.
Well, typical web apps have more than 1 user, so fairly obviously
using SQLite isn't appropriate.

This is rally irrelevant to this thread, but I believe you are not right here: http://www.sqlite.org/whentouse.html
Websites

SQLite usually will work great as the database engine for low to medium traffic websites (which is to say, 99.9% of all websites).

If SQLite isn't appropriate, why are
you testing with it?

It is appropriate in the first place. But another reason to use SQLite for testing is that is fast and easy to maintain and setup.
Since I need to use PG FTS (and that's the only reason), I have to use PG for testing too.
And this is where this thread comes in.

Perhaps just run half the test, that would make it twice as quick and
still just as valid.

If Postgres tests run in ~1 minute, what benefit have you gained from
saving 30 seconds? How often are you running tests?

I think it is irrelevant to the this thread, but here's my math:

I run full suite aprox every 5 mins. Which is ~ 100 times a day.
Now, 30s*100 = 50mins vs 60s=100mins.
This is another ~hour lost a day.

On top of that I would run a single test file probably every couple of minutes or so.
(So even if it is a couple of seconds slower, it all adds up a lot).

This is just a common TDD style. Watch the https://www.destroyallsoftware.com/ for example to see what I mean.

Don't get me wrong, I realise that PG and SQLite are totally different beasts.

That's totally fine if PG can't beat SQLite on speed in **this particular case**.
I just want to try to tune it to be as fast as it can (for **this particular case**, see my reply to Adrian).

Cheers.

#18Dmytrii Nagirniak
dnagir@gmail.com
In reply to: Simon Riggs (#13)
Re: Optimise PostgreSQL for fast testing

On 24/02/2012, at 5:15 AM, Simon Riggs wrote:

Now all you have to do is parallelise the tests and everything can
work 10 times quicker and it would be much faster than the time SQLite
produced.

So using PostgreSQL for testing would be both quicker and more
accurate, if you set the tests up right.

That is certainly true. And there are number of techniques to make tests faster.
But that is outside of the scope of this thread I believe.

I only want to make it run locally (don't care about CI yet) faster.
So far I could get the most out of with fsync=off (which is ~15% improvement).

#19Greg Smith
gsmith@gregsmith.com
In reply to: Dmytrii Nagirniak (#17)
Re: Optimise PostgreSQL for fast testing

On 02/23/2012 07:16 PM, Dmytrii Nagirniak wrote:

That's totally fine if PG can't beat SQLite on speed in **this
particular case**.
I just want to try to tune it to be as fast as it can (for **this
particular case**, see my reply to Adrian).

You can find all of the big tunable parameters at
http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server That's as
good of a general "how do I make this faster by tweaking the server"
guide as you'll get.

Once you've hit the big tunables--shared_buffers, checkpoint_segments,
work_mem, effective_cache_size, and tweaking either synchronous_commit
or fsync--there's not too much else you can do except dig into what's
slow in individual queries. Only other thing that might help is running
ANALYZE against the whole database after any major loading of test data,
just to make sure the queries are being executed with good statistics.

If you can extract the SQL from the test cases so they can be executed
directly with the psql client, you could add "\timing" before them to
see how long each individual query runs, to look for the long running
ones. It's possible that every statement is a little slower, which
would be unsurprising and not something you can really resolve if so.
It could just be a small number that are being executed poorly though,
in which case specific query tweaking might be possible. You might get
further insight by posting the EXPLAIN ANALYZE plans of whatever the
slowest single query is. More on that subject at
http://wiki.postgresql.org/wiki/Slow_Query_Questions

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com

#20Scott Marlowe
scott.marlowe@gmail.com
In reply to: Dmytrii Nagirniak (#18)
Re: Optimise PostgreSQL for fast testing

On Thu, Feb 23, 2012 at 5:22 PM, Dmytrii Nagirniak <dnagir@gmail.com> wrote:

On 24/02/2012, at 5:15 AM, Simon Riggs wrote:

Now all you have to do is parallelise the tests and everything can
work 10 times quicker and it would be much faster than the time SQLite
produced.

So using PostgreSQL for testing would be both quicker and more
accurate, if you set the tests up right.

That is certainly true. And there are number of techniques to make tests
faster.
But that is outside of the scope of this thread I believe.

Is there a reaon why you can't parallelize your tests? If you could
run 10 or so at a time it would be worth benchmarking.

Also, look into automating your testing, so that you don't need to run
the tests all the time, they run in the background, and if something
breaks they send you an alert with the code that broke things etc.

#21Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Dmytrii Nagirniak (#16)
#22Alban Hertroys
haramrae@gmail.com
In reply to: Dmytrii Nagirniak (#16)
#23Simon Riggs
simon@2ndQuadrant.com
In reply to: Dmytrii Nagirniak (#17)
#24Dmytrii Nagirniak
dnagir@gmail.com
In reply to: Simon Riggs (#23)
#25Dmytrii Nagirniak
dnagir@gmail.com
In reply to: Dmytrii Nagirniak (#24)
#26Simon Riggs
simon@2ndQuadrant.com
In reply to: Dmytrii Nagirniak (#25)