Quick question regarding tablespaces

Started by Mike Rylander · about 22 years ago · 9 messages · hackers

miker@purplefrog.com

about 22 years ago

Now that PG will have tablespaces I can stick my really high I/O data on a
fiberchannel array, and save some money by putting the rest of it (also the
majority of it) on less expensive SCSI RAID sets. Will I also be able to
tune individual tablespaces with the likes of random_page_cost? Sorry if I
missed this somewhere...

TIA

--miker

Gavin Sherry

swm@linuxworld.com.au

about 22 years ago

In reply to: Mike Rylander (#1)

Re: Quick question regarding tablespaces

Hi Mike,

In this release, unfortunately not.

I had some idea early on of putting rand_page_cost in pg_tablespace and
having the planner have access to it for costing. I didn't actually get
around to it but. :-(

Gavin

On Mon, 28 Jun 2004, Mike Rylander wrote:

Now that PG will have tablespaces I can stick my really high I/O data on a
fiberchannel array, and save some money by putting the rest of it (also the
majority of it) on less expensive SCSI RAID sets. Will I also be able to
tune individual tablespaces with the likes of random_page_cost? Sorry if I
missed this somewhere...

TIA

--miker

Mike Rylander

miker@purplefrog.com

about 22 years ago

In reply to: Gavin Sherry (#2)

Re: Quick question regarding tablespaces

On Thursday 01 July 2004 06:43 pm, Gavin Sherry wrote:

Hi Mike,

In this release, unfortunately not.

That's too bad, but it's not that urgent I suppose.

I had some idea early on of putting rand_page_cost in pg_tablespace and
having the planner have access to it for costing. I didn't actually get
around to it but. :-(

Well, I haven't looked at the PG source before, but if you have some specific
design ideas I would be glad to help out. I'm just not sure where (or when,
with the official release coming (sort of) soon) to start, but with some
pointers I'll do what I can!

-miker

Gavin Sherry

swm@linuxworld.com.au

about 22 years ago

In reply to: Mike Rylander (#3)

Re: Quick question regarding tablespaces

On Thu, 1 Jul 2004, Mike Rylander wrote:

On Thursday 01 July 2004 06:43 pm, Gavin Sherry wrote:

Hi Mike,

In this release, unfortunately not.

That's too bad, but it's not that urgent I suppose.

I had some idea early on of putting rand_page_cost in pg_tablespace and
having the planner have access to it for costing. I didn't actually get
around to it but. :-(

Well, I haven't looked at the PG source before, but if you have some specific
design ideas I would be glad to help out. I'm just not sure where (or when,
with the official release coming (sort of) soon) to start, but with some
pointers I'll do what I can!

Well, it won't be in 7.5. Feel free to start looking at how
random_page_cost is used in cost_index(). It might be worthwhile introducing a per
tablespace performance factor so that we could say that the cost of
fetching an index tuple from tablespace A is half that of fetching an
index tuple from tablespace B. That idea might not actually turn out to be
a very good one once I look at it closely though.

Gavin

Scott Marlowe

smarlowe@qwest.net

about 22 years ago

In reply to: Gavin Sherry (#4)

Re: Quick question regarding tablespaces

On Thu, 2004-07-01 at 18:54, Gavin Sherry wrote:

On Thu, 1 Jul 2004, Mike Rylander wrote:

On Thursday 01 July 2004 06:43 pm, Gavin Sherry wrote:

Hi Mike,

In this release, unfortunately not.

That's too bad, but it's not that urgent I suppose.

I had some idea early on of putting rand_page_cost in pg_tablespace and
having the planner have access to it for costing. I didn't actually get
around to it but. :-(

Well, I haven't looked at the PG source before, but if you have some specific
design ideas I would be glad to help out. I'm just not sure where (or when,
with the official release coming (sort of) soon) to start, but with some
pointers I'll do what I can!

Well, it won't be in 7.5. Feel free to start looking at how
random_page_cost is used in cost_index(). It might be worthwhile introducing a per
tablespace performance factor so that we could say that the cost of
fetching an index tuple from tablespace A is half that of fetching an
index tuple from tablespace B. That idea might not actually turn out to be
a very good one once I look at it closely though.

How about having a per cluster / database / tablespace / table type
setup that goes in a hierarchy, if they're there. I.e. if the database
doesn't have its own random_page_cost, it inherits from cluster, if a
tablespace doesn't have one, it inherits from cluster->database, and so
on to individual tables / indexes. It may be that it's easier to
implement for them all now while doing it for tablespaces. Just
wondering. I'm a user, not a hacker, so I have no idea how much that
idea makes any sense, but I would certainly love to be able to set an
index to have a random_page_cost effect of 1.1 while the table it lives
in is 1.3, the tablespace 1.4, and so on. But not required, because it
always inherits from the parent if it doesn't have one, like stats
target.
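
The inheritance described above can be sketched roughly as follows. This is only an illustration of the lookup order, not anything that exists in PostgreSQL: the function name `resolve_random_page_cost`, the `CLUSTER_DEFAULT` constant, and its value of 4.0 are all assumptions made for the example. An unset level is represented as `None` and simply falls through to its parent.

```python
from typing import Optional

# Assumed cluster-wide random_page_cost; hypothetical value for illustration.
CLUSTER_DEFAULT = 4.0


def resolve_random_page_cost(*overrides: Optional[float],
                             cluster_default: float = CLUSTER_DEFAULT) -> float:
    """Return the most specific setting that is actually set.

    Overrides are passed most-specific first, for example
    (index, table, tablespace, database). None means "inherit".
    """
    for value in overrides:
        if value is not None:
            return value
    # Nothing was set at any level, so the cluster default applies.
    return cluster_default


# Index set to 1.1, its table to 1.3, the tablespace to 1.4, database unset.
index_cost = resolve_random_page_cost(1.1, 1.3, 1.4, None)
# A table with no setting of its own inherits from its tablespace.
table_cost = resolve_random_page_cost(None, 1.4, None)
# Nothing set anywhere below the cluster.
default_cost = resolve_random_page_cost(None, None, None)
```

With this shape no level is required to carry a value, which matches the "like stats target" behaviour asked for in the message.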

Mike Rylander

miker@purplefrog.com

about 22 years ago

In reply to: Gavin Sherry (#4)

Re: Quick question regarding tablespaces

On Thursday 01 July 2004 08:54 pm, Gavin Sherry wrote:

On Thu, 1 Jul 2004, Mike Rylander wrote:

On Thursday 01 July 2004 06:43 pm, Gavin Sherry wrote:

Hi Mike,

In this release, unfortunately not.

That's too bad, but it's not that urgent I suppose.

I had some idea early on of putting rand_page_cost in pg_tablespace and
having the planner have access to it for costing. I didn't actually get
around to it but. :-(

Well, I haven't looked at the PG source before, but if you have some
specific design ideas I would be glad to help out. I'm just not sure
where (or when, with the official release coming (sort of) soon) to
start, but with some pointers I'll do what I can!

Well, it won't be in 7.5. Feel free to start looking at how
random_page_cost is used in cost_index().

I will start looking there.

It might be worthwhile introducing a per
tablespace performance factor so that we could say that the cost of
fetching an index tuple from tablespace A is half that of fetching an
index tuple from tablespace B.

As random_page_cost is tied directly to the performance of a filesystem, my
thought was to leave the setting from the config file as a cluster-wide (and
default tablespace) setting that would be overridden by a tablespace specific
setting... i.e.

ALTER TABLESPACE ... SET RANDOM PAGE COST x.x;

or even setting a scaling factor that would shift the global random page cost.
This scaling factor would be set on all tablespaces and would have a default
of 1. Then it could be set lower (0.5 means that tablespace is 2 times
faster than the default tablespace, or global setting). Is that more what
you were thinking?
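
To make the two options concrete, here is a minimal sketch of how an effective per-tablespace value might be derived. The names `effective_random_page_cost` and `GLOBAL_RANDOM_PAGE_COST`, the global value of 4.0, and the rule that a direct setting wins over a scaling factor are all assumptions for illustration; the thread does not settle how the two would combine.

```python
from typing import Optional

# Assumed value of random_page_cost from the config file (hypothetical).
GLOBAL_RANDOM_PAGE_COST = 4.0


def effective_random_page_cost(direct: Optional[float] = None,
                               scale: float = 1.0,
                               global_cost: float = GLOBAL_RANDOM_PAGE_COST) -> float:
    """Effective random page cost for one tablespace.

    direct: absolute per-tablespace setting (first option); used if set.
    scale:  multiplier applied to the global value (second option).
    """
    if direct is not None:
        return direct
    return global_cost * scale


default_ts = effective_random_page_cost()           # unmodified global value
fast_ts = effective_random_page_cost(scale=0.5)     # "2 times faster" tablespace
pinned_ts = effective_random_page_cost(direct=1.5)  # absolute override
```

The scaling form keeps every tablespace tied to the global setting, so retuning the global value shifts them all; the direct form decouples a tablespace from it entirely.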

That idea might not actually turn out to be
a very good one once I look at it closely though.

If the latter is what you were thinking, I tend to agree. But I think a
direct setting for each tablespace would be a very big benefit. At least I'm
pretty sure I would use it :)

--miker

Mike Rylander

miker@purplefrog.com

about 22 years ago

In reply to: Scott Marlowe (#5)

Re: Quick question regarding tablespaces

On Thursday 01 July 2004 09:26 pm, Scott Marlowe wrote:

On Thu, 2004-07-01 at 18:54, Gavin Sherry wrote:

On Thu, 1 Jul 2004, Mike Rylander wrote:

On Thursday 01 July 2004 06:43 pm, Gavin Sherry wrote:

Hi Mike,

In this release, unfortunately not.

That's too bad, but it's not that urgent I suppose.

I had some idea early on of putting rand_page_cost in pg_tablespace
and having the planner have access to it for costing. I didn't
actually get around to it but. :-(

Well, I haven't looked at the PG source before, but if you have some
specific design ideas I would be glad to help out. I'm just not sure
where (or when, with the official release coming (sort of) soon) to
start, but with some pointers I'll do what I can!

Well, it won't be in 7.5. Feel free to start looking at how
random_page_cost is used in cost_index(). It might be worthwhile introducing a
per tablespace performance factor so that we could say that the
cost of fetching an index tuple from tablespace A is half that of
fetching an index tuple from tablespace B. That idea might not actually
turn out to be a very good one once I look at it closely though.

How about having a per cluster / database / tablespace / table type
setup that goes in a hierarchy, if they're there. I.e. if the database
doesn't have its own random_page_cost, it inherits from cluster, if a
tablespace doesn't have one, it inherits from cluster->database, and so
on to individual tables / indexes.

I was thinking of purely tablespace-based random_page_cost, as that variable
is tied to the access time of a particular filesystem.

It may be that it's easier to
implement for them all now while doing it for tablespaces. Just
wondering. I'm a user, not a hacker, so I have no idea how much that
idea makes any sense, but I would certainly love to be able to set an
index to have a random_page_cost effect of 1.1 while the table it lives
in is 1.3, the tablespace 1.4, and so on. But not required, because it
always inherits from the parent if it doesn't have one, like stats
target.

I have been thinking about something along the lines of a 'user_cost_push'
index attribute. This would default to 1 (if not set) and would be
multiplied against the cost of the plan node for the index to help or hurt
the use of the index in cases where the planner consistently makes the wrong
choice regarding the use of the index (due to funky stats, etc.).
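
A rough sketch of what such a multiplier would do to plan selection. Only the attribute name `user_cost_push` comes from the message; the functions, the two-way choice between a sequential and an index scan, and all cost figures are made up for illustration and do not reflect how the planner is actually structured.

```python
def pushed_cost(planner_cost: float, user_cost_push: float = 1.0) -> float:
    """Scale the planner's estimate for an index plan node (hypothetical)."""
    return planner_cost * user_cost_push


def choose_plan(seq_scan_cost: float, index_scan_cost: float,
                user_cost_push: float = 1.0) -> str:
    """Pick the cheaper plan after applying the push factor to the index."""
    adjusted = pushed_cost(index_scan_cost, user_cost_push)
    return "index scan" if adjusted < seq_scan_cost else "seq scan"


# Default factor of 1: the planner's own estimates decide.
unpushed = choose_plan(seq_scan_cost=100.0, index_scan_cost=120.0)
# A factor below 1 helps the index win despite a pessimistic estimate.
favoured = choose_plan(100.0, 120.0, user_cost_push=0.5)
# A factor above 1 hurts the index even when it looked cheaper.
penalised = choose_plan(100.0, 80.0, user_cost_push=2.0)
```

Because the factor is applied after estimation, it masks bad statistics rather than correcting them, which is the "thinking around the problem" concern raised next.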

Though perhaps I am just thinking around the problem. I know there has been
some pretty big work done on the stats collector recently.

--miker

Bruce Momjian

bruce@momjian.us

about 22 years ago

In reply to: Scott Marlowe (#5)

Re: Quick question regarding tablespaces

I would like to see some tool that reported a semi-accurate value for
random page cost before adding the value per tablespace.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Manfred Koizar

mkoi-pg@aon.at

almost 22 years ago

In reply to: Mike Rylander (#7)

Re: Quick question regarding tablespaces

On Thu, 1 Jul 2004 22:55:56 -0400, Mike Rylander <miker@purplefrog.com>
wrote:

I was thinking of purely tablespace-based random_page_cost, as that variable
is tied to the access time of a particular filesystem.

Strictly speaking we'd also need tablespace-based sequential_page_cost.
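
The reason both values matter: random_page_cost is expressed relative to the cost of a sequential page fetch, so if two tablespaces also differ in sequential speed, a per-tablespace random cost alone leaves their estimates on different scales. The sketch below illustrates this with a hypothetical `TablespaceCosts` type and made-up numbers. For what it is worth, later PostgreSQL releases (9.0 onward, as far as I know) do accept both `seq_page_cost` and `random_page_cost` as tablespace options via `ALTER TABLESPACE ... SET (...)`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TablespaceCosts:
    """Per-tablespace page costs (hypothetical type, illustrative values)."""
    seq_page_cost: float
    random_page_cost: float

    def scan_cost(self, seq_pages: int, random_pages: int) -> float:
        # Simplified I/O cost: pages fetched times the per-page cost.
        return (seq_pages * self.seq_page_cost
                + random_pages * self.random_page_cost)


# Assumed figures: the fibre channel array is faster for both access types.
scsi = TablespaceCosts(seq_page_cost=1.0, random_page_cost=4.0)
fibre = TablespaceCosts(seq_page_cost=0.5, random_page_cost=1.0)

scsi_cost = scsi.scan_cost(seq_pages=100, random_pages=10)
fibre_cost = fibre.scan_cost(seq_pages=100, random_pages=10)
```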

Servus
Manfred