which is better- storing data as array or json?

Started by zach cruiseabout 11 years ago5 messagesgeneral
Jump to latest
#1zach cruise
zachc1980@gmail.com

for indexing, accessing, filtering and searching?

as simple array-
first name | last name | nicknames
tom | jerry | {cat}, {mouse}

as multi-dimensional array-
first name | last name | nicknames
tom | jerry | {cat, kat}, {mouse, mice}

as simple json-
first name | last name | nicknames
tom | jerry | {"public": "cat", "private": "mouse"}

as multi-nested json-
first name | last name | nicknames
tom | jerry | {"public": {"first": "cat", "second": "kat"},
"private": {"first": "mouse", "second": "mice"}}

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#2David G. Johnston
david.g.johnston@gmail.com
In reply to: zach cruise (#1)
Re: which is better- storing data as array or json?

zach cruise wrote

for indexing, accessing, filtering and searching?

as simple array-
first name | last name | nicknames
tom | jerry | {cat}, {mouse}

as multi-dimensional array-
first name | last name | nicknames
tom | jerry | {cat, kat}, {mouse, mice}

as simple json-
first name | last name | nicknames
tom | jerry | {"public": "cat", "private": "mouse"}

as multi-nested json-
first name | last name | nicknames
tom | jerry | {"public": {"first": "cat", "second": "kat"},
"private": {"first": "mouse", "second": "mice"}}

The choice of proper model depends on how you intend to make use of it.

That said, I'd go with "none of the above" by default.

My first reaction in this scenario would be to create a nicknames table:

[nick_person_id, person_id, nick_name, nick_scope, nick_scope_order]

You could maybe normalize further by having a nickname table with integer
keys that then end up as FKs on this many-to-many relation.

An array is too complicated given the fact you need to track attributes on
the nicknames. You could possible do an array over a composite type but I'm
not sure how indexing and searching would fare in that setup.

Why are you even considering storing the information in JSON? The answer to
that question would make it more obvious whether that solution is viable but
do you really want any application that makes use of this data to have to
speak JSON to do so when the time-tested relational model can likely give
you everything you need - and probably more. Even if you had to serialize
the data to and from JSON I would say that storing the data in that format
to avoid the serializing is an instance of pre-mature optimization.

David J.

--
View this message in context: http://postgresql.nabble.com/which-is-better-storing-data-as-array-or-json-tp5838358p5838362.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3zach cruise
zachc1980@gmail.com
In reply to: David G. Johnston (#2)
Re: which is better- storing data as array or json?

i can't keep creating tables or adding columns every time i need to
add a nickname- this happens a lot.

so i want to put everything in an array or json.

remember rows can have different number of nicknames.

On 2/17/15, David G Johnston <david.g.johnston@gmail.com> wrote:

zach cruise wrote

for indexing, accessing, filtering and searching?

as simple array-
first name | last name | nicknames
tom | jerry | {cat}, {mouse}

as multi-dimensional array-
first name | last name | nicknames
tom | jerry | {cat, kat}, {mouse, mice}

as simple json-
first name | last name | nicknames
tom | jerry | {"public": "cat", "private": "mouse"}

as multi-nested json-
first name | last name | nicknames
tom | jerry | {"public": {"first": "cat", "second": "kat"},
"private": {"first": "mouse", "second": "mice"}}

The choice of proper model depends on how you intend to make use of it.

That said, I'd go with "none of the above" by default.

My first reaction in this scenario would be to create a nicknames table:

[nick_person_id, person_id, nick_name, nick_scope, nick_scope_order]

You could maybe normalize further by having a nickname table with integer
keys that then end up as FKs on this many-to-many relation.

An array is too complicated given the fact you need to track attributes on
the nicknames. You could possible do an array over a composite type but
I'm
not sure how indexing and searching would fare in that setup.

Why are you even considering storing the information in JSON? The answer
to
that question would make it more obvious whether that solution is viable
but
do you really want any application that makes use of this data to have to
speak JSON to do so when the time-tested relational model can likely give
you everything you need - and probably more. Even if you had to serialize
the data to and from JSON I would say that storing the data in that format
to avoid the serializing is an instance of pre-mature optimization.

David J.

--
View this message in context:
http://postgresql.nabble.com/which-is-better-storing-data-as-array-or-json-tp5838358p5838362.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#4John R Pierce
pierce@hogranch.com
In reply to: zach cruise (#3)
Re: which is better- storing data as array or json?

On 2/17/2015 8:00 PM, zach cruise wrote:

i can't keep creating tables or adding columns every time i need to
add a nickname- this happens a lot.

so i want to put everything in an array or json.

remember rows can have different number of nicknames.

david was suggesting a join table.

one table with names (id, first name, last name, other attributes...)
with one row per person

and another table with (id, names_id, nickname, nickname attribute) with
as many rows as there are nicks and users...

you'd join these to get all the nicks for a given name, or to get all
the names for a given nickname, etc.

--
john r pierce 37N 122W
somewhere on the middle of the left coast

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#5David G. Johnston
david.g.johnston@gmail.com
In reply to: zach cruise (#3)
Re: which is better- storing data as array or json?

On Tue, Feb 17, 2015 at 9:00 PM, zach cruise <zachc1980@gmail.com> wrote:

i can't keep creating tables

​Where did you get this idea?​

or adding columns

​Of course not...​

every time i need to

add a nickname- this happens a lot.

​OK​...

so i want to put everything in an array or json.

​Those are not the only two solutions...​

remember rows can have different number of nicknames.

​Which is why you setup a "one-to-many (optional)" relationship between two
tables; the first table's row (i.e., person)​ can have many related rows
(i.e., nicknames) on the second table.

​Each nickname a person has is represented by a single row on the
"person-nickname" table with whatever attributes you wish to keep track
of. New nickname means you add a new row - just like you would add a new
array item to your JSON model.

If you need a truly dynamic representation (i.e, you do not know what
nickname attributes you want to keep track of - like visibility and order
as already shown - or want to allow users to add their own) you'd have to
go with JSON (or add an hstore column to the nickname table); the array
will not get you want you need​

​because multi-dimensional arrays are not a solution.

The solution proposed solves the "variable number of nicknames per person"
need that you describe. If there are other constraints you are dealing
with you need to list them.​​


​David J.​