[Q] single image Table across multiple PG servers

Started by V S Pover 16 years ago12 messagesgeneral
Jump to latest
#1V S P
toreason@fastmail.fm

Hello

I am researching how to store the data for easy 'user-driven'
reporting (where I do not need to develop application for
every user request).

The data will typically be number ranges and text strings with
unique Id for each row

I hope there will be a lot of data :-).

So in that anticipation I am looking for a way
to allow
SQL/ODBC access to the data

but in a way that each table resides on more than one
PG server

for example:

table 1 lives in 3 PG instances (I can partition the data by date range)
table 2 lives in the same 3 instances plus another one (because it's
bigger)

and I would like users to be able to issue SQL from within ODBC that
joins them.

I do not mind if I have to for example name tables like

tb1_srv1 and so on
but some how then have the joined with the global view -- if such
functionality exists.

That is, the users with SQL/ODBC will only be doing reads -- not updates
and some how they have to see a 'single' tables that resides on multiple
hosts.

I do not mind the complexity for the inserts (because I can program for
it) -- but selects need to be easy from things like MS Access and
Crystal reports, or for more sophisticated users from packages like 'R'

The reason why I think the data will not fit into one database,
is because I just do not have money for servers (everything is coming
out
of my small pocket) so I just want to deploy inexpensive computers
but add them as I get more data to serve.

I looked at Hypertable but there is no ODBC driver for it.

Alternatively I also looked making the host operating systems into
a single Image OS. Which appears to be possible with

http://www.kerrighed.org/wiki/index.php/Main_Page

However, I do not know if PG will even run there.
Obviously top-of the line performance for me is not necessary
but has to be not unusable.

thank you in advance
--
Vlad P
author of C++ ORM http://github.com/vladp/CppOrm/tree/master

--
http://www.fastmail.fm - A fast, anti-spam email service.

#2Scott Marlowe
scott.marlowe@gmail.com
In reply to: V S P (#1)
Re: [Q] single image Table across multiple PG servers

On Mon, Jul 13, 2009 at 11:16 PM, V S P<toreason@fastmail.fm> wrote:

Hello

I am researching how to store the data for easy 'user-driven'
reporting (where I do not need to develop application for
every user request).

The data will typically be number ranges and text strings with
unique Id for each row

I hope there will be a lot of data :-).

So in that anticipation I am looking for a way
to allow
SQL/ODBC access to the data

but in a way that each table resides on more than one
PG server

for example:

table 1 lives in 3 PG instances (I can partition the data by date range)
table 2 lives in the same 3 instances plus another one (because it's
bigger)

and I would like users to be able to issue SQL from within ODBC that
joins them.

I think that skype's skytools could be used to create such a solution,
in particular pl/proxy.

#3Ransika de Silva
ransika@gmail.com
In reply to: Scott Marlowe (#2)
Re: [Q] single image Table across multiple PG servers

Hello,
We were also in search of having a table split across multiple databases but
then found out about skypetools and at the same time the following article;
http://www.jurriaanpersyn.com/archives/2009/02/12/database-sharding-at-netlog-with-mysql-and-php/,
true that it's not done with PG, but the same thing can be done with PG as
well.

Assume this will be helpful for you.

Regards,
Ransika

On Tue, Jul 14, 2009 at 4:20 PM, Scott Marlowe <scott.marlowe@gmail.com>wrote:

On Mon, Jul 13, 2009 at 11:16 PM, V S P<toreason@fastmail.fm> wrote:

Hello

I am researching how to store the data for easy 'user-driven'
reporting (where I do not need to develop application for
every user request).

The data will typically be number ranges and text strings with
unique Id for each row

I hope there will be a lot of data :-).

So in that anticipation I am looking for a way
to allow
SQL/ODBC access to the data

but in a way that each table resides on more than one
PG server

for example:

table 1 lives in 3 PG instances (I can partition the data by date range)
table 2 lives in the same 3 instances plus another one (because it's
bigger)

and I would like users to be able to issue SQL from within ODBC that
joins them.

I think that skype's skytools could be used to create such a solution,
in particular pl/proxy.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

--
Ransika De Silva
SCMAD 1.0, SCJP 1.4,
BSc.(Hons) Information Systems

#4V S P
toreason@fastmail.fm
In reply to: Ransika de Silva (#3)
Re: [Q] single image Table across multiple PG servers

Hi,
thank you for the links

I read through the presentation

and they did not solve the issue for me -- which presenting a e
table from multiple
shards as one single table (at least for reads) for ODBC clients.

I also do not think that skypetools do that

they have implemented essentially an API on top of their shards
(separate db servers)
that does the table querying. That means that I have to write a
separate API for every time
a user decides on a new query.

May be I misunderstood the approaches, but none of them actually
figures out how to
utilize the computing power/memory of multiple servers to satisfy
requests that spawn across
servers.

I think Oracle supports the ability to at least reference a table
in another server, I do not think
PG does that

It is possible that I have to look into the free DB2 server
offering (as the free version is exactly meant
to run on underpowered computers)

I just wanted to ask the list first.

thank you

On Tue, 14 Jul 2009 13:04 +0530, "Ransika de Silva"
<ransika@gmail.com> wrote:

Hello,

We were also in search of having a table split across multiple
databases but then found out about skypetools and at the same
time the following
article; [1]http://www.jurriaanpersyn.com/archives/2009/02/12/dat
abase-sharding-at-netlog-with-mysql-and-php/, true that it's not
done with PG, but the same thing can be done with PG as well.

Assume this will be helpful for you.

Regards,

Ransika

On Tue, Jul 14, 2009 at 4:20 PM, Scott Marlowe
<[2]scott.marlowe@gmail.com> wrote:

On Mon, Jul 13, 2009 at 11:16 PM, V S P<[3]toreason@fastmail.fm>
wrote:

Hello

I am researching how to store the data for easy 'user-driven'
reporting (where I do not need to develop application for
every user request).

The data will typically be number ranges and text strings with
unique Id for each row

I hope there will be a lot of data :-).

So in that anticipation I am looking for a way
to allow
SQL/ODBC access to the data

but in a way that each table resides on more than one
PG server

for example:

table 1 lives in 3 PG instances (I can partition the data by

date range)

table 2 lives in the same 3 instances plus another one (because

it's

bigger)

and I would like users to be able to issue SQL from within ODBC

that

joins them.

I think that skype's skytools could be used to create such a
solution,
in particular pl/proxy.

--
Sent via pgsql-general mailing list
([4]pgsql-general@postgresql.org)
To make changes to your subscription:
[5]: http://www.postgresql.org/mailpref/pgsql-general

--
Ransika De Silva
SCMAD 1.0, SCJP 1.4,
BSc.(Hons) Information Systems

References

1. http://www.jurriaanpersyn.com/archives/2009/02/12/database-sharding-at-netlog-with-mysql-and-php/
2. mailto:scott.marlowe@gmail.com
3. mailto:toreason@fastmail.fm
4. mailto:pgsql-general@postgresql.org
5. http://www.postgresql.org/mailpref/pgsql-general
--
Vlad P
author of C++ ORM http://github.com/vladp/CppOrm/tree/master

--
http://www.fastmail.fm - The professional email service

#5Kevin Kempter
kevink@consistentstate.com
In reply to: V S P (#4)
Re: [Q] single image Table across multiple PG servers

On Tuesday 14 July 2009 12:38:27 V S P wrote:

Hi,
thank you for the links

I read through the presentation

and they did not solve the issue for me -- which presenting a e
table from multiple
shards as one single table (at least for reads) for ODBC clients.

I also do not think that skypetools do that

they have implemented essentially an API on top of their shards
(separate db servers)
that does the table querying. That means that I have to write a
separate API for every time
a user decides on a new query.

May be I misunderstood the approaches, but none of them actually
figures out how to
utilize the computing power/memory of multiple servers to satisfy
requests that spawn across
servers.

I think Oracle supports the ability to at least reference a table
in another server, I do not think
PG does that

It is possible that I have to look into the free DB2 server
offering (as the free version is exactly meant
to run on underpowered computers)

I just wanted to ask the list first.

thank you

On Tue, 14 Jul 2009 13:04 +0530, "Ransika de Silva"
<ransika@gmail.com> wrote:

Hello,

We were also in search of having a table split across multiple
databases but then found out about skypetools and at the same
time the following
article; [1]http://www.jurriaanpersyn.com/archives/2009/02/12/dat
abase-sharding-at-netlog-with-mysql-and-php/, true that it's not
done with PG, but the same thing can be done with PG as well.

Assume this will be helpful for you.

Regards,

Ransika

On Tue, Jul 14, 2009 at 4:20 PM, Scott Marlowe
<[2]scott.marlowe@gmail.com> wrote:

On Mon, Jul 13, 2009 at 11:16 PM, V S P<[3]toreason@fastmail.fm>

wrote:

Hello

I am researching how to store the data for easy 'user-driven'
reporting (where I do not need to develop application for
every user request).

The data will typically be number ranges and text strings with
unique Id for each row

I hope there will be a lot of data :-).

So in that anticipation I am looking for a way
to allow
SQL/ODBC access to the data

but in a way that each table resides on more than one
PG server

for example:

table 1 lives in 3 PG instances (I can partition the data by

date range)

table 2 lives in the same 3 instances plus another one (because

it's

bigger)

and I would like users to be able to issue SQL from within ODBC

that

joins them.

I think that skype's skytools could be used to create such a
solution,
in particular pl/proxy.

--
Sent via pgsql-general mailing list
([4]pgsql-general@postgresql.org)
To make changes to your subscription:
[5]http://www.postgresql.org/mailpref/pgsql-general

--
Ransika De Silva
SCMAD 1.0, SCJP 1.4,
BSc.(Hons) Information Systems

References

1.
http://www.jurriaanpersyn.com/archives/2009/02/12/database-sharding-at-netl
og-with-mysql-and-php/ 2. mailto:scott.marlowe@gmail.com
3. mailto:toreason@fastmail.fm
4. mailto:pgsql-general@postgresql.org
5. http://www.postgresql.org/mailpref/pgsql-general
--
Vlad P
author of C++ ORM http://github.com/vladp/CppOrm/tree/master

Maybe this is what you're looking for?

http://www.enterprisedb.com/community/projects/gridsql.do

#6Kevin Kempter
kevink@consistentstate.com
In reply to: V S P (#1)
Re: [Q] single image Table across multiple PG servers

On Tuesday 14 July 2009 13:35:23 you wrote:

yes -- thank you

that's in the direction of what I am looking for

ODBC connectivity and joins across databases!

It looks like it cannot support
a single image table across databases

It's effectively a single image across nodes - in that the grid allows you to
partition a table across nodes (i.e. physical servers) so any sql see's it as
a single table so long as you send the sql to the grid controller

But do you know if supports viewes that combines tables
from different servers (this way
I can prefix a table on each server with server_id and then
just combine them in the view in a single Image table)

no need for this, the system presents the clients (via the controller) a
single table - even though its actually partitioned across nodes

It's quite similar to the Informix XPS product if that helps

Show quoted text

I am downloading it now and will try out on my windows dev
machine.

thanks

Maybe this is what you're looking for?

http://www.enterprisedb.com/community/projects/gridsql.do

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

--
Vlad P
author of C++ ORM http://github.com/vladp/CppOrm/tree/master

#7Scott Mead
scott.lists@enterprisedb.com
In reply to: V S P (#1)
Re: [Q] single image Table across multiple PG servers

On Tue, Jul 14, 2009 at 1:16 AM, V S P <toreason@fastmail.fm> wrote:

Hello

I am researching how to store the data for easy 'user-driven'
reporting (where I do not need to develop application for
every user request).

The data will typically be number ranges and text strings with
unique Id for each row

I hope there will be a lot of data :-).

So in that anticipation I am looking for a way
to allow
SQL/ODBC access to the data

but in a way that each table resides on more than one
PG server

Take a look at open-source GridSQL:http://sourceforge.net/projects/gridsql/

It'll do what you want and provide parallel query across your nodes.

--Scott

#8Greg Sabino Mullane
greg@turnstep.com
In reply to: V S P (#1)
Re: [Q] single image Table across multiple PG servers

-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

The reason why I think the data will not fit into one database,
is because I just do not have money for servers (everything is coming
out of my small pocket) so I just want to deploy inexpensive computers
but add them as I get more data to serve.

I think you might be suffering from premature optimization. Or just wildly
optimistic on your hardware needs. Postgres scales vertically extremely
well, so I'd try out a single server and add sharding complexity only
as a last resort. For scaling, you can use pgbouncer, spread the
tables and indexes across different tablespaces, and other tricks. If
money is tight, you might look into using something like EC2.

- --
Greg Sabino Mullane greg@turnstep.com
End Point Corporation
PGP Key: 0x14964AC8 200907141552
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAkpc4ioACgkQvJuQZxSWSsga/gCfQUp+AHnX1myAO6hYpnmIMk+8
5ZIAoKAOsJepWnavWwVMkdb2h4eOfYt5
=En0J
-----END PGP SIGNATURE-----

#9Scott Marlowe
scott.marlowe@gmail.com
In reply to: V S P (#4)
Re: [Q] single image Table across multiple PG servers

On Tue, Jul 14, 2009 at 12:38 PM, V S P<toreason@fastmail.fm> wrote:

Hi,
thank you for the links

I read through the presentation

and they did not solve the issue for me -- which presenting a e table from
multiple
shards as one single table (at least for reads) for ODBC clients.

I also do not think that skypetools do that

they have implemented essentially an API on top of their shards (separate db
servers)
that does the table querying.  That means that I have to write a separate
API for every time
a user decides on a new query.

Just build a view that encloses a plproxy function.

#10Kevin Kempter
kevink@consistentstate.com
In reply to: V S P (#1)
Re: [Q] single image Table across multiple PG servers

On Tuesday 14 July 2009 14:16:33 you wrote:

This is great,
thank you

I have to say, if this is a free-of-charge
add on to postgres that works and utilizes the hardware on each server
to perform
the join,

-- why would anybody need to use mapreduce/hadoop/etc?
for database-like selects ?

It's new - the grid project just moved out of beta i believe. However I know
of at least one commercial company using it in production with no issues

Show quoted text

On Tue, 14 Jul 2009 09:45 -0600, "Kevin Kempter"

<kevink@consistentstate.com> wrote:

But do you know if supports viewes that combines tables
from different servers (this way
I can prefix a table on each server with server_id and then
just combine them in the view in a single Image table)

no need for this, the system presents the clients (via the controller) a
single table - even though its actually partitioned across nodes

It's quite similar to the Informix XPS product if that helps

--
Vlad P
author of C++ ORM http://github.com/vladp/CppOrm/tree/master

#11Michael Gould
mgould@intermodalsoftwaresolutions.net
In reply to: Kevin Kempter (#10)
Custom Class variables

I have created the following in my postgres.conf file

custom_variable_classes = 'iss'

In a SQL session I've tried

Set iss.one = '1'
set iss.two = '2'

Select * from iss;

How do I access the values from the custom class in sql code?

Best Regards

Michael Gould
--
Michael Gould, Managing Partner
Intermodal Software Solutions, LLC
904.226.0978
904.592.5250 fax

#12Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Michael Gould (#11)
Re: Custom Class variables

Michael Gould wrote:

I have created the following in my postgres.conf file

custom_variable_classes = 'iss'

In a SQL session I've tried

Set iss.one = '1'
set iss.two = '2'

Select * from iss;

How do I access the values from the custom class in sql code?

show iss.one;
select current_setting('iss.one');

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support