non-WAL btree?

Started by Alex Vinogradovsover 17 years ago14 messagesgeneral
Jump to latest
#1Alex Vinogradovs
AVinogradovs@Clearpathnet.com

Guys,

I was wondering if there is a btree indexing implementation that
is not WAL-logged. I'm loading data in bulks, and index logging
is an unnecessary overhead for me (easier to rebuild on crash).

Thanks!

best regards,
Alex Vinogradovs

#2Joshua D. Drake
jd@commandprompt.com
In reply to: Alex Vinogradovs (#1)
Re: non-WAL btree?

On Fri, 01 Aug 2008 12:41:12 -0700
Alex Vinogradovs <AVinogradovs@Clearpathnet.com> wrote:

Guys,

I was wondering if there is a btree indexing implementation that
is not WAL-logged. I'm loading data in bulks, and index logging
is an unnecessary overhead for me (easier to rebuild on crash).

Drop the index during load?

Thanks!

best regards,
Alex Vinogradovs

--
The PostgreSQL Company since 1997: http://www.commandprompt.com/
PostgreSQL Community Conference: http://www.postgresqlconference.org/
United States PostgreSQL Association: http://www.postgresql.us/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate

#3Alex Vinogradovs
AVinogradovs@Clearpathnet.com
In reply to: Joshua D. Drake (#2)
Re: non-WAL btree?

By loading in bulks, I mean I load some 40-50 thousand
rows at once into a table that already has some millions.
Index rebuild on that table after each 50k inserts will
be even less efficient ;)

Alex.

Show quoted text

On Fri, 2008-08-01 at 12:57 -0700, Joshua Drake wrote:

On Fri, 01 Aug 2008 12:41:12 -0700
Alex Vinogradovs <AVinogradovs@Clearpathnet.com> wrote:

Guys,

I was wondering if there is a btree indexing implementation that
is not WAL-logged. I'm loading data in bulks, and index logging
is an unnecessary overhead for me (easier to rebuild on crash).

Drop the index during load?

Thanks!

best regards,
Alex Vinogradovs

#4David Wilson
david.t.wilson@gmail.com
In reply to: Alex Vinogradovs (#3)
Re: non-WAL btree?

On Fri, Aug 1, 2008 at 4:07 PM, Alex Vinogradovs
<AVinogradovs@clearpathnet.com> wrote:

By loading in bulks, I mean I load some 40-50 thousand
rows at once into a table that already has some millions.
Index rebuild on that table after each 50k inserts will
be even less efficient ;)

How many indexes do you have on this...? I do this pretty regularly
(actually, I do 4k batches with COPY, 4-10 concurrent batches every 10
seconds, for 2-3 days at a time) and, having testing dropping indices,
nothing to do with the index has a particularly strong performance
impact.

That said, a significant increase in checkpoint segments was required
to get good performance out of the above use case. If you haven't
tried that, I'd say that's a good place to start.

What makes you think it's specifically index WAL work, though?

--
- David T. Wilson
david.t.wilson@gmail.com

#5Joshua D. Drake
jd@commandprompt.com
In reply to: Alex Vinogradovs (#3)
Re: non-WAL btree?

On Fri, 01 Aug 2008 13:07:18 -0700
Alex Vinogradovs <AVinogradovs@Clearpathnet.com> wrote:

By loading in bulks, I mean I load some 40-50 thousand
rows at once into a table that already has some millions.
Index rebuild on that table after each 50k inserts will
be even less efficient ;)

Fair enough, to answer your question, "no". We don't have any non wal
table tables or index types.

Sincerely,

Joshua D. Drake
--
The PostgreSQL Company since 1997: http://www.commandprompt.com/
PostgreSQL Community Conference: http://www.postgresqlconference.org/
United States PostgreSQL Association: http://www.postgresql.us/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate

#6David Wilson
david.t.wilson@gmail.com
In reply to: David Wilson (#4)
Re: non-WAL btree?

On Fri, Aug 1, 2008 at 4:16 PM, David Wilson <david.t.wilson@gmail.com> wrote:

On Fri, Aug 1, 2008 at 4:07 PM, Alex Vinogradovs
<AVinogradovs@clearpathnet.com> wrote:

By loading in bulks, I mean I load some 40-50 thousand
rows at once into a table that already has some millions.
Index rebuild on that table after each 50k inserts will
be even less efficient ;)

How many indexes do you have on this...? I do this pretty regularly
(actually, I do 4k batches with COPY, 4-10 concurrent batches every 10
seconds, for 2-3 days at a time) and, having testing dropping indices,
nothing to do with the index has a particularly strong performance
impact.

Sorry, as I hit send, I realized I should clarify this: I do my bulk
loads with the indexes active; I don't rebuild them (as they're
necessary during the batch calculations). Dropping the indexes and
using test data didn't show a significant performance improvement over
leaving the indexes enabled.

--
- David T. Wilson
david.t.wilson@gmail.com

#7Alex Vinogradovs
AVinogradovs@Clearpathnet.com
In reply to: Joshua D. Drake (#5)
Re: non-WAL btree?

Isn't hash indexing implementation non-WAL ?

Alex.

Show quoted text

On Fri, 2008-08-01 at 13:16 -0700, Joshua Drake wrote:

On Fri, 01 Aug 2008 13:07:18 -0700
Alex Vinogradovs <AVinogradovs@Clearpathnet.com> wrote:

By loading in bulks, I mean I load some 40-50 thousand
rows at once into a table that already has some millions.
Index rebuild on that table after each 50k inserts will
be even less efficient ;)

Fair enough, to answer your question, "no". We don't have any non wal
table tables or index types.

Sincerely,

Joshua D. Drake

#8Alex Vinogradovs
AVinogradovs@Clearpathnet.com
In reply to: David Wilson (#6)
Re: non-WAL btree?

It's not that I expect a lot of improvement by having non-WAL
indexing, it just sounds logical to me to have that, since
index can be re-created fast enough during recovery, and it
would reduce my IO to some extent.

Alex.

Show quoted text

Sorry, as I hit send, I realized I should clarify this: I do my bulk
loads with the indexes active; I don't rebuild them (as they're
necessary during the batch calculations). Dropping the indexes and
using test data didn't show a significant performance improvement over
leaving the indexes enabled.

#9Francisco Reyes
lists@stringsutils.com
In reply to: Alex Vinogradovs (#8)
Re: non-WAL btree?

On 4:36 pm 08/01/08 Alex Vinogradovs <AVinogradovs@Clearpathnet.com> wrote:

It's not that I expect a lot of improvement by having non-WAL
indexing

Have you tried using a fill factor less than 90%?
That is on my list of things to test, but have not done yet.. In particular
you need to find a balance where the speed gained justifies the increase in
size of the index.. and the potential slowdowns because of larger indexes.

#10Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Alex Vinogradovs (#7)
Re: non-WAL btree?

On Fri, Aug 1, 2008 at 3:32 PM, Alex Vinogradovs
<AVinogradovs@clearpathnet.com> wrote:

Isn't hash indexing implementation non-WAL ?

yes, but that's because no one thinks is worth the effort of making
them WAL logged while they keep slower than btree...

--
regards,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Guayaquil - Ecuador
Cel. (593) 87171157

#11Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Alex Vinogradovs (#8)
Re: non-WAL btree?

On Fri, Aug 1, 2008 at 3:36 PM, Alex Vinogradovs
<AVinogradovs@clearpathnet.com> wrote:

It's not that I expect a lot of improvement by having non-WAL
indexing, it just sounds logical to me to have that, since
index can be re-created fast enough during recovery,

and why you think that? if they are non WAL logged the only way to
re-create them after a recovery is with a REINDEX... dropping the
index and create after the bulk is just the same, i think...

--
regards,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Guayaquil - Ecuador
Cel. (593) 87171157

#12Alex Vinogradovs
AVinogradovs@Clearpathnet.com
In reply to: Jaime Casanova (#11)
Re: non-WAL btree?

It's all about number of repetions. If say I load my table
with 50k every minute, and run reindex every minute, how
long do you think it would take by end of the day, when
my table (it's daily partition actually) is at maximum
capacity ? And database may actually never crash, and
I won't have to run reindex at all ;)

Btw, SELECT INTO is also a non-WAL operation when
archiving is disabled, or am I missing something ?

Alex.

Show quoted text

On Fri, 2008-08-01 at 16:43 -0500, Jaime Casanova wrote:

On Fri, Aug 1, 2008 at 3:36 PM, Alex Vinogradovs
<AVinogradovs@clearpathnet.com> wrote:

It's not that I expect a lot of improvement by having non-WAL
indexing, it just sounds logical to me to have that, since
index can be re-created fast enough during recovery,

and why you think that? if they are non WAL logged the only way to
re-create them after a recovery is with a REINDEX... dropping the
index and create after the bulk is just the same, i think...

#13Glen Parker
glenebob@nwlink.com
In reply to: Jaime Casanova (#11)
Re: non-WAL btree?

Jaime Casanova wrote:

On Fri, Aug 1, 2008 at 3:36 PM, Alex Vinogradovs
<AVinogradovs@clearpathnet.com> wrote:

It's not that I expect a lot of improvement by having non-WAL
indexing, it just sounds logical to me to have that, since
index can be re-created fast enough during recovery,

and why you think that? if they are non WAL logged the only way to
re-create them after a recovery is with a REINDEX... dropping the
index and create after the bulk is just the same, i think...

They don't all have to be non-WAL, first off; it could be optional per
index. Second, non-WAL would provide a benefit in the case the OP
mentioned, and the only time it would be a detriment is in the event of
a fault. Reindexing of non-WAL indexes could be automatic during recovery.

Non-WAL indexing is an option I would almost certainly take advantage of
if it existed.

-Glen

#14Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Alex Vinogradovs (#12)
Re: non-WAL btree?

On Fri, Aug 1, 2008 at 4:49 PM, Alex Vinogradovs
<AVinogradovs@clearpathnet.com> wrote:

It's all about number of repetions. If say I load my table
with 50k every minute, and run reindex every minute, how
long do you think it would take by end of the day, when
my table (it's daily partition actually) is at maximum
capacity ? And database may actually never crash, and
I won't have to run reindex at all ;)

maybe http://www.postgresql.org/docs/8.3/static/wal-async-commit.html
is what you need...

begin;
set local synchronous_commit to off;
insert...
insert...
...
commit;

or

set synchronous_commit to off;
copy command;
set synchronous_commit to on;

--
regards,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Guayaquil - Ecuador
Cel. (593) 87171157