kind of a bag of attributes in a DB . . .
Say, you get lots of data and their corresponding metadata, which in
some cases may be undefined or undeclared (left as an empty string).
Think of YouTube JSON files or the output of the "file" command.
I need to be able to search that metadata "instantly" and get some
metrics out of it, and I think DBs are best for such jobs.
I know this is not exactly a kosher way to deal with data which can't
be represented in a nice tabular form, but I don't find the idea that
far off either.
What is the pattern, anti-pattern or whatever relating to such design?
Do you know of such implementations with such data?
lbrtchx
On 9/7/19 5:45 AM, Albretch Mueller wrote:
> Say, you get lots of data and their corresponding metadata, which in
> some cases may be undefined or undeclared (left as an empty string).
> Think of YouTube JSON files or the output of the "file" command.
> I need to be able to search that metadata "instantly" and get some
> metrics out of it, and I think DBs are best for such jobs.
Is the metadata uniform or are you dealing with a variety of different data?
> I know this is not exactly a kosher way to deal with data which can't
> be represented in a nice tabular form, but I don't find the idea that
> far off either.
> What is the pattern, anti-pattern or whatever relating to such design?
> Do you know of such implementations with such data?
> lbrtchx
--
Adrian Klaver
adrian.klaver@aklaver.com
On Sat, Sep 7, 2019 at 5:17 PM Albretch Mueller <lbrtchx@gmail.com> wrote:
> Say, you get lots of data and their corresponding metadata, which in
> some cases may be undefined or undeclared (left as an empty string).
> Think of YouTube JSON files or the output of the "file" command.
> I need to be able to search that metadata "instantly" and get some
> metrics out of it, and I think DBs are best for such jobs.
> I know this is not exactly a kosher way to deal with data which can't
> be represented in a nice tabular form, but I don't find the idea that
> far off either.
> What is the pattern, anti-pattern or whatever relating to such design?
> Do you know of such implementations with such data?
We do this for debug logs, storing them as JSONB with some indexing. It
works in some limited cases, but you need a good sense of the index
possibilities and of how the indexes actually work.
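
To make the "index possibilities" point concrete, here is a minimal sketch, not a description of the setup mentioned above. The table name file_meta, its columns, and the helper containment_query are hypothetical, and the %s placeholder assumes a psycopg2-style driver. A plain GIN index on a jsonb column helps with containment (@>) and key-existence (?) tests, but not with range comparisons on values inside the document.

```python
import json

# Hypothetical schema: fixed columns for what every file has, plus a JSONB
# "bag" for everything else.
SCHEMA_SQL = """
CREATE TABLE IF NOT EXISTS file_meta (
    id       bigserial PRIMARY KEY,
    filename text   NOT NULL,
    size     bigint NOT NULL,
    attrs    jsonb  NOT NULL DEFAULT '{}'::jsonb
);
-- A GIN index over the whole document supports @> (containment) and
-- ? (key exists), but not range comparisons on the values.
CREATE INDEX IF NOT EXISTS file_meta_attrs_gin ON file_meta USING gin (attrs);
"""


def containment_query(wanted):
    """Return (sql, params) selecting rows whose attrs contain `wanted`."""
    sql = "SELECT filename, size FROM file_meta WHERE attrs @> %s::jsonb"
    return sql, (json.dumps(wanted, sort_keys=True),)


sql, params = containment_query({"mime": "video/mp4"})
print(sql)
print(params)
```

A query shaped like this can use the GIN index; something like "duration greater than 600" cannot, and would need an expression index on that specific key.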
> lbrtchx
--
Best Wishes,
Chris Travers
Efficito: Hosted Accounting and ERP. Robust and Flexible. No vendor
lock-in.
http://www.efficito.com/learn_more
On 9/7/19, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
> Is the metadata uniform or are you dealing with a variety of different
> data?
You can expect all files to have a filename and a size, but their
kinds (the metadata describing them) can be really colorful and wild
when it comes to formatting.
lbrtchx
On 9/10/19 9:59 AM, Albretch Mueller wrote:
> On 9/7/19, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>> Is the metadata uniform or are you dealing with a variety of different
>> data?
> You can expect all files to have a filename and a size, but their
> kinds (the metadata describing them) can be really colorful and wild
> when it comes to formatting.
If there is no rhyme or reason to the metadata I am not sure how you
could come up with an efficient search strategy. Seems it would be a
brute-force search over everything.
> lbrtchx
--
Adrian Klaver
adrian.klaver@aklaver.com
On 9/10/19, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
> If there is no rhyme or reason to the metadata I am not sure how you
> could come up with an efficient search strategy. Seems it would be a
> brute-force search over everything.
Not exactly. Say some things have colours but no weight. You could
still group the ones that do have a weight as "weighty" and then tell how
heavy they are; with the colourful ones you could specify the colours and
then see if there is some correlation between weights and colours ...
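
The grouping idea can be sketched in a few lines of plain Python, before any database is involved. The sample records and the helper names (present, attribute_counts, having) are made up for illustration; an empty string is treated as "undeclared", as described at the start of the thread.

```python
from collections import Counter

# Made-up sample records; the keys are illustrative only.
items = [
    {"name": "a", "colour": "red"},
    {"name": "b", "weight": 2.5},
    {"name": "c", "colour": "red", "weight": 4.0},
    {"name": "d", "colour": ""},  # declared but left empty
]


def present(item, key):
    """True if the attribute exists and is neither None nor an empty string."""
    return item.get(key) not in (None, "")


def attribute_counts(records):
    """Count how many records actually carry each attribute."""
    counts = Counter()
    for record in records:
        for key in record:
            if present(record, key):
                counts[key] += 1
    return counts


def having(records, *keys):
    """Records that carry every one of the given attributes."""
    return [r for r in records if all(present(r, k) for k in keys)]


counts = attribute_counts(items)
weighty = having(items, "weight")
both = having(items, "colour", "weight")  # candidates for a correlation check
```

Only the records in both can say anything about a relation between weight and colour, so the attribute counts are a useful first metric on how sparse the bag really is.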
lbrtchx
On 9/11/19 9:46 AM, Albretch Mueller wrote:
> On 9/10/19, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>> If there is no rhyme or reason to the metadata I am not sure how you
>> could come up with an efficient search strategy. Seems it would be a
>> brute-force search over everything.
> Not exactly. Say some things have colours but no weight. You could
> still group the ones that do have a weight as "weighty" and then tell how
> heavy they are; with the colourful ones you could specify the colours and
> then see if there is some correlation between weights and colours ...
It would help to see some sample data, otherwise any answer would be
pure speculation.
> lbrtchx
--
Adrian Klaver
adrian.klaver@aklaver.com
Just download a bunch of JSON info files from YouTube data feeds.
Actually, does PostgreSQL have a JSON driver or import feature?
The metadata contained in JSON files would require more than one
small database, but such an import feature should be trivial.
C
On 9/14/19 2:06 AM, Albretch Mueller wrote:
> Just download a bunch of JSON info files from YouTube data feeds.
> Actually, does PostgreSQL have a JSON driver or import feature?
Not sure what you mean by the above.
Postgres has json(b) data types that you can import JSON into:
https://www.postgresql.org/docs/11/datatype-json.html
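
As a rough sketch of what such an import could look like on the client side, the following reads JSON files and prepares one row per file. The function name rows_from_json_files and the target table file_meta are hypothetical; the filename and size columns follow the earlier remark that every file has at least those two. The %s placeholders assume a psycopg2-style driver.

```python
import json
import os


def rows_from_json_files(paths):
    """Yield (filename, size_in_bytes, json_text) for each JSON file."""
    for path in paths:
        with open(path, "r", encoding="utf-8") as fh:
            doc = json.load(fh)  # raises ValueError on invalid JSON
        # Re-serialise so that what reaches the database is known-valid JSON.
        yield os.path.basename(path), os.path.getsize(path), json.dumps(doc)


# Hypothetical target table; the server casts the text parameter to jsonb.
INSERT_SQL = (
    "INSERT INTO file_meta (filename, size, attrs) "
    "VALUES (%s, %s, %s::jsonb)"
)
```

Each yielded tuple can be passed as the parameters of INSERT_SQL through whatever driver is in use.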
> The metadata contained in JSON files would require more than one
> small database, but such an import feature should be trivial.
Again, not sure I understand why small databases are required?
> C
--
Adrian Klaver
adrian.klaver@aklaver.com
On 9/14/19 2:06 AM, Albretch Mueller wrote:
> Just download a bunch of JSON info files from YouTube data feeds.
> Actually, does PostgreSQL have a JSON driver or import feature?
I'm working without a net (coffee), so I forgot to mention that for
Python there is:
http://initd.org/psycopg/docs/extras.html?highlight=json
Not sure if this is what you are looking for or not.
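
A minimal sketch of using that adapter, assuming psycopg2 is installed and that a table like the hypothetical file_meta (filename, size, attrs jsonb) exists. The helper names drop_empty and insert_metadata are made up; psycopg2.extras.Json is the wrapper documented at the link above.

```python
def drop_empty(meta):
    """Remove attributes left undeclared (None or an empty string)."""
    return {k: v for k, v in meta.items() if v not in (None, "")}


def insert_metadata(conn, filename, size, meta):
    """Insert one row; `conn` is assumed to be an open psycopg2 connection."""
    from psycopg2.extras import Json  # third-party; imported only when used

    with conn.cursor() as cur:
        cur.execute(
            "INSERT INTO file_meta (filename, size, attrs) VALUES (%s, %s, %s)",
            (filename, size, Json(drop_empty(meta))),
        )
    conn.commit()
```

Dropping empty values before the insert is a design choice, not a requirement: it keeps "key present" tests meaningful, at the cost of no longer recording that a field was declared but blank.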
> The metadata contained in JSON files would require more than one
> small database, but such an import feature should be trivial.
> C
--
Adrian Klaver
adrian.klaver@aklaver.com
On Sat, Sep 14, 2019 at 5:11 PM Albretch Mueller <lbrtchx@gmail.com> wrote:
> Just download a bunch of JSON info files from YouTube data feeds.
> Actually, does PostgreSQL have a JSON driver or import feature?
Sort of.... There are a bunch of features around JSON and JSONB data
types which could be useful.
> The metadata contained in JSON files would require more than one
> small database, but such an import feature should be trivial.
It is not at all trivial, for a bunch of reasons inherent to the JSON
specification: how to handle duplicate keys, for example.
However, writing an import for JSON objects into a particular database is
indeed trivial.
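
The duplicate-key problem is easy to reproduce. In the sketch below the sample document and the helper reject_duplicates are made up for illustration. Python's json module silently keeps the last value, which is also what PostgreSQL's jsonb type does, while the json type stores the input text as-is. An importer has to pick a policy, and a strict one might look like this:

```python
import json


def reject_duplicates(pairs):
    """object_pairs_hook that raises if a JSON object repeats a key."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen


text = '{"title": "first", "title": "second"}'

# Default behaviour: the last value silently wins.
last_wins = json.loads(text)

# Strict behaviour: refuse the document instead.
try:
    json.loads(text, object_pairs_hook=reject_duplicates)
    strict_error = None
except ValueError as exc:
    strict_error = str(exc)
```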
> C
--
Best Wishes,
Chris Travers