Index location patch for review

Started by Jim Buttafuocoover 24 years ago18 messageshackers
Jump to latest
#1Jim Buttafuoco
jim@buttafuoco.net

Hi all,

Attached is a patch that adds support for specifying a location for
indexes via the "create database" command.

I believe this patch is complete, but it is my first .

Thanks
Jim

Attachments:

location.diffsapplication/octet-stream; name=location.diffsDownload+310-106
#2Bruce Momjian
bruce@momjian.us
In reply to: Jim Buttafuoco (#1)
Re: Index location patch for review

Hi all,

Attached is a patch that adds support for specifying a location for
indexes via the "create database" command.

I believe this patch is complete, but it is my first .

This patch allows index locations to be specified as different from data
locations. Is this a feature direction we want to go in? Comments?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#3Thomas Lockhart
lockhart@fourpalms.org
In reply to: Bruce Momjian (#2)
Re: Index location patch for review

Attached is a patch that adds support for specifying a location for
indexes via the "create database" command.

This patch allows index locations to be specified as different from data
locations. Is this a feature direction we want to go in? Comments?

I have not looked at the patch, either implementation or proposed
syntax, but in general we certainly want to head in a direction where we
have full control over placement of storage resources.

- Thomas

#4Darren King
DarrenK@Routescape.com
In reply to: Thomas Lockhart (#3)
Re: Index location patch for review

Attached is a patch that adds support for specifying a location for
indexes via the "create database" command.

I believe this patch is complete, but it is my first .

This patch allows index locations to be specified as
different from data locations. Is this a feature direction
we want to go in? Comments?

Having the table and index on separate drives can do wonders for i/o
performance. :)

darrenk

#5Stefan Rindeskar
sr@globecom.net
In reply to: Bruce Momjian (#2)
Re: Index location patch for review

I am very new to this mailinglist so I apologize if I start talking early but
I've been working as a sysadmin and that kind of problems for a long while
now and my suggestion is that it is a start but I think that we should aim a
little higher than this and use something more like the Oracle approach
instead. Where they introduce an abstraction layer in the form of a
tablespace. And this tablespace is then referenced from the create table or
create index instead.
eg:
table -> tablespace -> path to physical storage
index -> tablespace -> path to physical storage

Advantages:
Changes can be done to storage whithout need to change create scripts for db,
tables and so on.
Designers can specify in which tablespace tables/indexes should reside based
on usage.
Sysadmins can work with tablespaces and change paths without changing
anything in the database/table/index definitions.

The alternative is symlinks to distribute the load and that is not a pretty
sight dba-wise.

Hope you can bare with me on this, since I think it is an very important
issue.
I'm unfortunately not a fast coder yet (but I'm getting faster :-) ). But I
could start writing a spec if someone is interrested.

Bruce Momjian wrote:

Show quoted text

Hi all,

Attached is a patch that adds support for specifying a location for
indexes via the "create database" command.

I believe this patch is complete, but it is my first .

This patch allows index locations to be specified as different from data
locations. Is this a feature direction we want to go in? Comments?

--
Bruce Momjian                        |  http://candle.pha.pa.us
pgman@candle.pha.pa.us               |  (610) 853-3000
+  If your life is a hard drive,     |  830 Blythe Avenue
+  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://www.postgresql.org/search.mpl

#6Mikheev, Vadim
vmikheev@SECTORBASE.COM
In reply to: Stefan Rindeskar (#5)
Re: Index location patch for review

Attached is a patch that adds support for specifying a
location for indexes via the "create database" command.

I believe this patch is complete, but it is my first .

This patch allows index locations to be specified as
different from data locations. Is this a feature direction
we want to go in? Comments?

The more general and "standard" way to go are TABLESPACEs.
But probably proposed feature will be compatible with
tablespaces, when we'll got them: we could use new "create
database" syntax to specify default tablespace for indices.
Unfortunately I removed message with patch, can you send it
to me, Bruce?

Vadim

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Mikheev, Vadim (#6)
Re: Index location patch for review

"Mikheev, Vadim" <vmikheev@SECTORBASE.COM> writes:

The more general and "standard" way to go are TABLESPACEs.
But probably proposed feature will be compatible with
tablespaces, when we'll got them:

Will it be? I'm afraid of creating a backwards-compatibility
problem for ourselves when it comes time to implement tablespaces.

At the very least I'd like to see some information demonstrating
how much benefit there is to this proposed patch, before we
consider whether to adopt it. If there's a significant performance
benefit to splitting a PG database along the table-vs-index divide,
then it's interesting as a short-term improvement ... but Jim didn't
even make that assertion, let alone provide evidence to back it up.

regards, tom lane

#8Jim Buttafuoco
jim@buttafuoco.net
In reply to: Tom Lane (#7)
Re: Index location patch for review

I agree that groups of objects in separate data storage areas are needed
and that is what I am trying to get to. Don't you think that Postgresql
with locations/files is the same as Oracle tablespaces. I don't think
we want to invent our own filesystem (which is what a tablespace really
is...).

Jim

Show quoted text

Attached is a patch that adds support for specifying a
location for indexes via the "create database" command.

I believe this patch is complete, but it is my first .

This patch allows index locations to be specified as
different from data locations. Is this a feature direction
we want to go in? Comments?

The more general and "standard" way to go are TABLESPACEs.
But probably proposed feature will be compatible with
tablespaces, when we'll got them: we could use new "create
database" syntax to specify default tablespace for indices.
Unfortunately I removed message with patch, can you send it
to me, Bruce?

Vadim

#9Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#7)
Re: Index location patch for review

"Mikheev, Vadim" <vmikheev@SECTORBASE.COM> writes:

The more general and "standard" way to go are TABLESPACEs.
But probably proposed feature will be compatible with
tablespaces, when we'll got them:

Will it be? I'm afraid of creating a backwards-compatibility
problem for ourselves when it comes time to implement tablespaces.

At the very least I'd like to see some information demonstrating
how much benefit there is to this proposed patch, before we
consider whether to adopt it. If there's a significant performance
benefit to splitting a PG database along the table-vs-index divide,
then it's interesting as a short-term improvement ... but Jim didn't
even make that assertion, let alone provide evidence to back it up.

If that is your only concern, I can tell you for sure that if the
locations are on different drives, there will be a performance benefit.
It is standard database practice to put indexes on different drives than
data. In fact, sometimes you want to put two tables that are frequently
joined on separate drives.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#10Thomas Lockhart
lockhart@fourpalms.org
In reply to: Mikheev, Vadim (#6)
Re: Index location patch for review

...

At the very least I'd like to see some information demonstrating
how much benefit there is to this proposed patch, before we
consider whether to adopt it. If there's a significant performance
benefit to splitting a PG database along the table-vs-index divide,
then it's interesting as a short-term improvement ... but Jim didn't
even make that assertion, let alone provide evidence to back it up.

Clearly there can be a *storage management* benefit to having control
over what gets put where, so this does not need to be justified strictly
on a performance basis.

For features like this, we will feel free to evolve them or
revolutionize them with further development, so I'm not worried about
the backward compatibility issue for cases like this.

Comments?

- Thomas

#11Jim Buttafuoco
jim@buttafuoco.net
In reply to: Thomas Lockhart (#10)
Re: Index location patch for review

just change the work tablespace below to location and that is exactly
what this patch is trying to do. You can think of the LOCATION and
INDEX_LOCATION provided to the create database command as the default
storage locations for these objects. In the future, I want to enable
the DBA to specify LOCATIONS any object just like Oracle. I am also
planning on a pg_locations table and "create location" command which
will do what the current initlocation script does and more.

Jim

I am very new to this mailinglist so I apologize if I start talking

early but

I've been working as a sysadmin and that kind of problems for a long

while

now and my suggestion is that it is a start but I think that we should

aim a

little higher than this and use something more like the Oracle

approach

instead. Where they introduce an abstraction layer in the form of a
tablespace. And this tablespace is then referenced from the create

table or

create index instead.
eg:
table -> tablespace -> path to physical storage
index -> tablespace -> path to physical storage

Advantages:
Changes can be done to storage whithout need to change create scripts

for db,

tables and so on.
Designers can specify in which tablespace tables/indexes should reside

based

on usage.
Sysadmins can work with tablespaces and change paths without changing
anything in the database/table/index definitions.

The alternative is symlinks to distribute the load and that is not a

pretty

sight dba-wise.

Hope you can bare with me on this, since I think it is an very

important

issue.
I'm unfortunately not a fast coder yet (but I'm getting faster :-) ).

But I

could start writing a spec if someone is interrested.

Bruce Momjian wrote:

Hi all,

Attached is a patch that adds support for specifying a location

for

indexes via the "create database" command.

I believe this patch is complete, but it is my first .

This patch allows index locations to be specified as different from

data

locations. Is this a feature direction we want to go in? Comments?

--
Bruce Momjian                        |  http://candle.pha.pa.us
pgman@candle.pha.pa.us               |  (610) 853-3000
+  If your life is a hard drive,     |  830 Blythe Avenue
+  Christ can be your backup.        |  Drexel Hill, Pennsylvania

19026

---------------------------(end of

broadcast)---------------------------

TIP 6: Have you searched our list archives?

http://www.postgresql.org/search.mpl

---------------------------(end of

broadcast)---------------------------

TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to

majordomo@postgresql.org)

Show quoted text
#12Bruce Momjian
bruce@momjian.us
In reply to: Thomas Lockhart (#10)
Re: Index location patch for review

...

At the very least I'd like to see some information demonstrating
how much benefit there is to this proposed patch, before we
consider whether to adopt it. If there's a significant performance
benefit to splitting a PG database along the table-vs-index divide,
then it's interesting as a short-term improvement ... but Jim didn't
even make that assertion, let alone provide evidence to back it up.

Clearly there can be a *storage management* benefit to having control
over what gets put where, so this does not need to be justified strictly
on a performance basis.

For features like this, we will feel free to evolve them or
revolutionize them with further development, so I'm not worried about
the backward compatibility issue for cases like this.

Comments?

Agreed.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#13Mikheev, Vadim
vmikheev@SECTORBASE.COM
In reply to: Bruce Momjian (#12)
Re: Index location patch for review

The more general and "standard" way to go are TABLESPACEs.
But probably proposed feature will be compatible with
tablespaces, when we'll got them:

Will it be? I'm afraid of creating a backwards-compatibility
problem for ourselves when it comes time to implement tablespaces.

As I said, INDEX_LOCATION in CREATE DATABASE could mean location
of default tablespace for indices in future and one will be able
to override tablespace for particular index with TABLESPACE
clause in CREATE INDEX command.

At the very least I'd like to see some information demonstrating
how much benefit there is to this proposed patch, before we
consider whether to adopt it. If there's a significant performance
benefit to splitting a PG database along the table-vs-index divide,
then it's interesting as a short-term improvement ... but Jim didn't
even make that assertion, let alone provide evidence to back it up.

Agreed. He mentioned significant performance difference but it would
be great to see results of pgbench tests with scaling factor of >= 10.
Jim?

Also, after reviewing patch I have to say that it will NOT work
with WAL. Jim, please do not name index' dir as "<TBL_NODE>_index".
Instead, just use different TBL_NODE for indices (different number).
It's not good to put if(reln->rd_rel->relkind == RELKIND_INDEX)
stuff into storage manager - only two numbers (tblnode & relnode)
must be used to identify file, no any other logical information
totally unrelated to storage issues.

Vadim

#14Jim Buttafuoco
jim@buttafuoco.net
In reply to: Mikheev, Vadim (#13)
Re: Index location patch for review

Vadim,

I don't understand the WAL issue below, can you explain. The dir name
is the same name as the database with _index added to it. This is how
the current datpath stuff works. I really just copied the datpath code
to get this patch to work...

Also I have been running this patch (both 7.1.3 and 7.2devel) against
some of my companies applications. I have loaded a small database 10G
data and 15G indexes both with and without the patch. There seems to be
between 5% and 10% performance gain doing most common db commands
(selects, selects with joins and inserts). The system is a DUAL P3 733
with 3 IDE disks. One for PGDATA, second for APPDATA and third for
APPIDX. As you can see I have seperated WAL files, GLOBAL, Application
data and application indexes over 3 disks. Our production systems have
around 50k queries/day ( not including data loads), so I believe that
when this patch get put into production, with 20 disks and 10 database
the performance increase should go up.

I should also add, that I have been working on the second part of this
patch, which will allow tables and indexes to be put into LOCATIONS
also. I am going planning on having a PG_LOCATIONS table and
CREATE|DROP|ALTER location SQL command instead of the initlocation shell
script we currently have. The only thing stopping me now is 7.2 testing
I am planning on doing once the beta begins and problems adding a
location column to the pg_class table with the necessary support code in
heap.c...

Thanks for all the comments (keep them comming)

Jim

The more general and "standard" way to go are TABLESPACEs.
But probably proposed feature will be compatible with
tablespaces, when we'll got them:

Will it be? I'm afraid of creating a backwards-compatibility
problem for ourselves when it comes time to implement tablespaces.

As I said, INDEX_LOCATION in CREATE DATABASE could mean location
of default tablespace for indices in future and one will be able
to override tablespace for particular index with TABLESPACE
clause in CREATE INDEX command.

At the very least I'd like to see some information demonstrating
how much benefit there is to this proposed patch, before we
consider whether to adopt it. If there's a significant performance
benefit to splitting a PG database along the table-vs-index divide,
then it's interesting as a short-term improvement ... but Jim didn't
even make that assertion, let alone provide evidence to back it up.

Agreed. He mentioned significant performance difference but it would
be great to see results of pgbench tests with scaling factor of >= 10.
Jim?

Also, after reviewing patch I have to say that it will NOT work
with WAL. Jim, please do not name index' dir as "<TBL_NODE>_index".
Instead, just use different TBL_NODE for indices (different number).
It's not good to put if(reln->rd_rel->relkind == RELKIND_INDEX)
stuff into storage manager - only two numbers (tblnode & relnode)
must be used to identify file, no any other logical information
totally unrelated to storage issues.

Vadim

---------------------------(end of

broadcast)---------------------------

Show quoted text

TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html

#15Mikheev, Vadim
vmikheev@SECTORBASE.COM
In reply to: Jim Buttafuoco (#14)
Re: Index location patch for review

I don't understand the WAL issue below, can you explain. The dir name
is the same name as the database with _index added to it. This is how
the current datpath stuff works. I really just copied the datpath
code to get this patch to work...

At the time of after crash recovery WAL is not able to read relation
description from catalog and so only relfilenode is provided for
storage manager in relation structure (look backend/access/transam/
xlogutils.c:XLogOpenRelation). Well, we could add Index/Table
file type identifier to RmgrData (rmgr.c in the same dir) to set
relkind in relation structure, but I don't see any reason to
do so when we can just use different tblnode number for indices and
name index dirs just like other dirs under 'base' named - ie
only tblnode number is used for dir names, without any additions
unrelated to storage issues.

Vadim

#16Mikheev, Vadim
vmikheev@SECTORBASE.COM
In reply to: Mikheev, Vadim (#15)
Re: Index location patch for review

Also I have been running this patch (both 7.1.3 and 7.2devel) against
some of my companies applications. I have loaded a small database 10G

We are not familiar with your applications. It would be better to see
results of test suit available to the community. pgbench is first to
come in mind. Such tests would be more valuable.

Vadim

#17Jim Buttafuoco
jim@buttafuoco.net
In reply to: Mikheev, Vadim (#16)
Re: Index location patch for review

I could also symlink all index files back to the tblnode directory?

I don't understand the WAL issue below, can you explain. The dir

name

is the same name as the database with _index added to it. This is

how

the current datpath stuff works. I really just copied the datpath
code to get this patch to work...

At the time of after crash recovery WAL is not able to read relation
description from catalog and so only relfilenode is provided for
storage manager in relation structure (look backend/access/transam/
xlogutils.c:XLogOpenRelation). Well, we could add Index/Table
file type identifier to RmgrData (rmgr.c in the same dir) to set
relkind in relation structure, but I don't see any reason to
do so when we can just use different tblnode number for indices and
name index dirs just like other dirs under 'base' named - ie
only tblnode number is used for dir names, without any additions
unrelated to storage issues.

Vadim

---------------------------(end of

broadcast)---------------------------

TIP 1: subscribe and unsubscribe commands go to

majordomo@postgresql.org

Show quoted text
#18Jim Buttafuoco
jim@buttafuoco.net
In reply to: Jim Buttafuoco (#17)
Re: Index location patch for review

Here is my pgbench results. As you can see the I am getting 2X tps with
the 2 directories. I believe this is a BIG win for Postgresql if we can
figure out the WAL recovery issues.

Can someone other than me apply the patch and verify the pgbench
results.

My hardward setup is a dual processor P3/733 running Redhat 7.1 with 512
megs of memory. The postgresql.conf file is the installed version with
NO changes.

Jim

template1=# create database one_dir with location='PGDATA1';
template1=# create database two_dir with location='PGDATA1'
index_location='PGIDX1';
for X in 1 2 3 4 5 6 7 8 9 10
do
pgbench -i -s 10 one_dir >>one_dir.log
pgbench -i -s 10 two_dir >>two_dir.log
done

bash-2.04$ grep 'excluding' one_dir.log
tps = 44.319306(excluding connections establishing)
tps = 34.641020(excluding connections establishing)
tps = 50.516889(excluding connections establishing)
tps = 52.747039(excluding connections establishing)
tps = 16.203821(excluding connections establishing)
tps = 36.902861(excluding connections establishing)
tps = 52.511769(excluding connections establishing)
tps = 53.479882(excluding connections establishing)
tps = 54.599429(excluding connections establishing)
tps = 36.780419(excluding connections establishing)
tps = 48.048279(excluding connections establishing)

bash-2.04$ grep 'excluding' two_dir.log
tps = 58.739049(excluding connections establishing)
tps = 100.259270(excluding connections establishing)
tps = 103.156166(excluding connections establishing)
tps = 110.829358(excluding connections establishing)
tps = 111.929690(excluding connections establishing)
tps = 106.840118(excluding connections establishing)
tps = 101.563159(excluding connections establishing)
tps = 102.877060(excluding connections establishing)
tps = 103.784717(excluding connections establishing)
tps = 53.056309(excluding connections establishing)
tps = 73.842428(excluding connections establishing)

Also I have been running this patch (both 7.1.3 and 7.2devel)

against

some of my companies applications. I have loaded a small database

10G

We are not familiar with your applications. It would be better to see
results of test suit available to the community. pgbench is first to
come in mind. Such tests would be more valuable.

Vadim

---------------------------(end of

broadcast)---------------------------

Show quoted text

TIP 4: Don't 'kill -9' the postmaster