Database Scalability

Started by Saurav Sarkarover 4 years ago8 messagesgeneral
Jump to latest
#1Saurav Sarkar
saurav.sarkar1@gmail.com

Hi All,

We are using Amazon RDS PostgreSQL.
We have some multi-tenant solutions which are separating the tenant data in
Postgresql mainly in the following manner.

1. Using different schemas
2. Using different tables for different tenants.

I was just wondering what would be the best way to scale in case no. of
schemas or no. of tables limit increase and probably cross the limits of
PostgreSQL.

I understand that no. of schemas are unlimited in DB as per

/messages/by-id/24478.1109618520@sss.pgh.pa.us

and some limits on tables are mentioned here

https://www.postgresql.org/docs/12/limits.html

In case of table separated tenancy some tables might have more data than
other tables. In the case of the table itself growing in size I can look
for partitioning the data.

In the case of limits being reached in terms of tables and schemas (if any
) I understand that database sharding or horizontal scaling over multiple
database instances could be one solution. It will also help in having
more database connections/resources for an ever increasing number of
tenants . Obviously with the headache of defining a balanced sharding
strategy and maintaining it.

Thanks and Best Regards,

Saurav

#2Ben
bench@silentmedia.com
In reply to: Saurav Sarkar (#1)
Re: Database Scalability

Saurav Sarkar wrote on 11/29/21 10:13 PM:

Hi All,

We have some multi-tenant solutions which are separating the tenant
data in Postgresql mainly in the following manner.

1. Using different schemas
2. Using different tables for different tenants.

Without more details it's impossible to give you a detailed answer, so,
in general.... if you are breaking out your client data by schema, you
will likely be fine. We have used this method with great success to
scale our customer workload - each customer gets their own schema, which
we can then rebalance between databases as those clients drive more load
over time. For those tables that we want partitioned, we simply
partition them in every schema. That's arguably inefficient but we find
the schema consistency to be an overall win.

(For context, when I say we have used this method with great success, we
have over 13k customers, almost a PB of data, peak around 1.5M iops, and
it's all painless to operate.)

#3Saurav Sarkar
saurav.sarkar1@gmail.com
In reply to: Ben (#2)
Re: Database Scalability

Hi Ben,

Thanks a lot for your reply.

So are all the schemas on one DB or are distributed/sharded across multiple
DBs ?

Best Regards,
Saurav

On Tue, Nov 30, 2021 at 11:43 PM Ben Chobot <bench@silentmedia.com> wrote:

Show quoted text

Saurav Sarkar wrote on 11/29/21 10:13 PM:

Hi All,

We have some multi-tenant solutions which are separating the tenant data
in Postgresql mainly in the following manner.

1. Using different schemas
2. Using different tables for different tenants.

Without more details it's impossible to give you a detailed answer, so, in
general.... if you are breaking out your client data by schema, you will
likely be fine. We have used this method with great success to scale our
customer workload - each customer gets their own schema, which we can then
rebalance between databases as those clients drive more load over time. For
those tables that we want partitioned, we simply partition them in every
schema. That's arguably inefficient but we find the schema consistency to
be an overall win.

(For context, when I say we have used this method with great success, we
have over 13k customers, almost a PB of data, peak around 1.5M iops, and
it's all painless to operate.)

#4Mladen Gogala
gogala.mladen@gmail.com
In reply to: Saurav Sarkar (#3)
Re: Database Scalability

On 11/30/21 22:08, Saurav Sarkar wrote:

Hi Ben,

Thanks a lot for your reply.

So are all the schemas on one DB or are distributed/sharded across
multiple DBs ?

Best Regards,
Saurav

To my knowledge PostgreSQL doesn't support sharding, which is well and
good because sharding is mostly useless, at least in my opinion.

--
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com

#5Michael Stephenson
domehead100@gmail.com
In reply to: Mladen Gogala (#4)
Re: Database Scalability

Store a connection string for each tenant or the metadata to build one on demand. Then each tenant is in its own schema on whatever database instance the connection string points at. Then it doesn’t really matter how you spread your tenants across one database or many; just do whatever works best and adjust as needed.

~ Mike

#6Ben
bench@silentmedia.com
In reply to: Saurav Sarkar (#3)
Re: Database Scalability

Saurav Sarkar wrote on 11/30/21 7:08 PM:

So are all the schemas on one DB or are distributed/sharded across
multiple DBs ?

In our use case, every db entirely homes one or more schemas. Some dbs
host many schemas for small customers, some dbs host a handful of
schemas for medium customers, and some dbs are dedicated to a single
large customer. If the customer is very large, we get a bigger db than
normal for them. So theoretically we still have a problem of vertical
scale but in practice it's rarely an issue and when it is, we can
usually split that single customer into multiple, at the application layer.

#7Ben
bench@silentmedia.com
In reply to: Mladen Gogala (#4)
Re: Database Scalability

Mladen Gogala wrote on 11/30/21 7:52 PM:

To my knowledge PostgreSQL doesn't support sharding, which is well and
good because sharding is mostly useless, at least in my opinion.

OK I'll bite.

Not only does PostgreSQL natively support table partitioning (which is
absolutely a form of sharding), there multiple well-regarded extensions
that can help with sharding, all of which are orthogonal to how you can
configure your application to use Postgres in the first place. So to say
Postgres doesn't support sharding is.... misleading, at best.

Also, the general concept of sharding to move your scaling challenges
from vertical ones to horizontal ones has multiple self-evident
advantages. If your work history has all happened to fit on a single
server, then bully for you, but not everybody has it so easy.

#8SQL Padawan
sql_padawan@protonmail.com
In reply to: Ben (#7)
Re: Database Scalability

To my knowledge PostgreSQL doesn't support sharding, which is well and

good because sharding is mostly useless, at least in my opinion.

Not only does PostgreSQL natively support table partitioning (which is

absolutely a form of sharding), there multiple well-regarded extensions

that can help with sharding, all of which are orthogonal to how you can

configure your application to use Postgres in the first place. So to say

Postgres doesn't support sharding is.... misleading, at best.

Also, the general concept of sharding to move your scaling challenges

from vertical ones to horizontal ones has multiple self-evident

advantages. If your work history has all happened to fit on a single

server, then bully for you, but not everybody has it so easy.

It supports partitioning out of the box - not sharding where different tables reside on different machines!

CitusData and TimescaleDB provide sharding as extensions - both of which appear useful for TimeSeries data. There was PostgresXL which was a general sharding (multi-machine) solution that appears to have died.

SQLP!