using hstore to store documents

Started by Ritaabout 9 years ago11 messagesgeneral
Jump to latest
#1Rita
rmorgan466@gmail.com

Is it possible to use hstore to store xml document?

I would like to have

"status" => "<xml>..."::xml

-- 
--- Get your facts first, then you can distort them as you please.--
#2Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Rita (#1)
Re: using hstore to store documents

On 01/26/2017 09:08 AM, Rita wrote:

Is it possible to use hstore to store xml document?

I would like to have

"status" => "<xml>..."::xml

Keys/values in hstore are simply text strings, so you can store anything
you want formatted as a string.

Not sure if that's what you're looking for though, you'll probably need
to explain what you're trying to achieve to get better answers.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3Rita
rmorgan466@gmail.com
In reply to: Tomas Vondra (#2)
Re: using hstore to store documents

of course, sorry for being vague.

I have an external process generating a XML file (every 30 secs) which is
about 10MB. I would like to store the file as XML type for me to query
using xpath. I plan to query it every few seconds by few hundred clients.
so, it maybe easier for me create a separate table of my xpath results and
have clients query that table (xpath can be expensive).

So, I was wondering if hstore will help me for fast access.

Please let me know if anything is unclear.

On Thu, Jan 26, 2017 at 3:58 AM, Tomas Vondra <tomas.vondra@2ndquadrant.com>
wrote:

On 01/26/2017 09:08 AM, Rita wrote:

Is it possible to use hstore to store xml document?

I would like to have

"status" => "<xml>..."::xml

Keys/values in hstore are simply text strings, so you can store anything
you want formatted as a string.

Not sure if that's what you're looking for though, you'll probably need to
explain what you're trying to achieve to get better answers.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
--- Get your facts first, then you can distort them as you please.--
#4Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Rita (#3)
Re: using hstore to store documents

On 01/26/2017 04:37 AM, Rita wrote:

of course, sorry for being vague.

I have an external process generating a XML file (every 30 secs) which
is about 10MB. I would like to store the file as XML type for me to

Does this mean you are using this:

https://www.postgresql.org/docs/9.6/static/datatype-xml.html

to store the actual XML?

query using xpath. I plan to query it every few seconds by few hundred
clients. so, it maybe easier for me create a separate table of my xpath
results and have clients query that table (xpath can be expensive).

And you want to use hstore to store the search results?

So, I was wondering if hstore will help me for fast access.

Please let me know if anything is unclear.

On Thu, Jan 26, 2017 at 3:58 AM, Tomas Vondra
<tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>> wrote:

On 01/26/2017 09:08 AM, Rita wrote:

Is it possible to use hstore to store xml document?

I would like to have

"status" => "<xml>..."::xml

Keys/values in hstore are simply text strings, so you can store
anything you want formatted as a string.

Not sure if that's what you're looking for though, you'll probably
need to explain what you're trying to achieve to get better answers.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
--- Get your facts first, then you can distort them as you please.--

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#5Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Rita (#3)
Re: using hstore to store documents

On 01/26/2017 04:37 AM, Rita wrote:

of course, sorry for being vague.

I have an external process generating a XML file (every 30 secs) which
is about 10MB. I would like to store the file as XML type for me to
query using xpath. I plan to query it every few seconds by few hundred
clients. so, it maybe easier for me create a separate table of my xpath
results and have clients query that table (xpath can be expensive).

So, I was wondering if hstore will help me for fast access.

Please let me know if anything is unclear.

Realized it might help if you provide the schema of the tables you are
using or plan to use.

On Thu, Jan 26, 2017 at 3:58 AM, Tomas Vondra
<tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>> wrote:

On 01/26/2017 09:08 AM, Rita wrote:

Is it possible to use hstore to store xml document?

I would like to have

"status" => "<xml>..."::xml

Keys/values in hstore are simply text strings, so you can store
anything you want formatted as a string.

Not sure if that's what you're looking for though, you'll probably
need to explain what you're trying to achieve to get better answers.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
--- Get your facts first, then you can distort them as you please.--

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#6David G. Johnston
david.g.johnston@gmail.com
In reply to: Rita (#3)
Re: using hstore to store documents

On Thu, Jan 26, 2017 at 5:37 AM, Rita <rmorgan466@gmail.com> wrote:

of course, sorry for being vague.

I have an external process generating a XML file (every 30 secs) which is
about 10MB. I would like to store the file as XML type for me to query
using xpath. I plan to query it every few seconds by few hundred clients.
so, it maybe easier for me create a separate table of my xpath results and
have clients query that table (xpath can be expensive).

​If the XML being generated has a fixed structure/schema I personally would
treat the XML as a serialization format and de-serialize and store it in a
database as one or more relationally linked tables.​ If you have to deal
with the possibility of dynamic structure I would still try to put the
fixed items into individual columns and then and then any dynamic items
could be stuffed into an hstore typed table.

My answer to your stated question is: what happened when you tried doing
that? Documentation and a bit of experimentation goes a long ways in
learning.

David J.

#7Rita
rmorgan466@gmail.com
In reply to: David G. Johnston (#6)
Re: using hstore to store documents

sorry for the late reply.

My table schema is very simple

DROP TABLE xmltest;

create table xmltest(

id serial, -- dont really need the serial

data xml
NOT null

);

INSERT INTO xmltest (data, id) VALUES ('

<attendee>

<bio>

<name>John Doe</name>

<birthYear>1986</birthYear>

</bio>

<languages>

<lang level="5">php</lang>

<lang level="4">python</lang>

<lang level="2">java</lang>

</languages>

</attendee>', 1);

I really don't need the serial but every 30 seconds or so I plan to
overwrite the data portion. so, I suppose I wanted a simple key/value thats
where my orignal question stemmed from.

After xmltest has been populated, I can run xpath and unest to get my data
into a row but I would like to store that result in another table, I am
guessing I should look into triggers for something like that? Eventually, I
plan to have 5-6 downstream tables which will have xmltest as my head. The
application will be accessing the downstream tables and rarely be touching
xmltest (head table).

On Thu, Jan 26, 2017 at 10:38 AM, David G. Johnston <
david.g.johnston@gmail.com> wrote:

On Thu, Jan 26, 2017 at 5:37 AM, Rita <rmorgan466@gmail.com> wrote:

of course, sorry for being vague.

I have an external process generating a XML file (every 30 secs) which is
about 10MB. I would like to store the file as XML type for me to query
using xpath. I plan to query it every few seconds by few hundred clients.
so, it maybe easier for me create a separate table of my xpath results and
have clients query that table (xpath can be expensive).

​If the XML being generated has a fixed structure/schema I personally
would treat the XML as a serialization format and de-serialize and store it
in a database as one or more relationally linked tables.​ If you have to
deal with the possibility of dynamic structure I would still try to put the
fixed items into individual columns and then and then any dynamic items
could be stuffed into an hstore typed table.

My answer to your stated question is: what happened when you tried doing
that? Documentation and a bit of experimentation goes a long ways in
learning.

David J.

-- 
--- Get your facts first, then you can distort them as you please.--
#8Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Rita (#7)
Re: using hstore to store documents

On 01/28/2017 05:57 PM, Rita wrote:

sorry for the late reply.

My table schema is very simple

DROP TABLE xmltest;

create table xmltest(

id serial, -- dont really need the serial

Maybe not a serial id, but a Primary Key of some sort would help with
what you say you want to do below, I think.

data xml NOT null

);

INSERT INTO xmltest (data, id) VALUES ('

<attendee>

<bio>

<name>John Doe</name>

<birthYear>1986</birthYear>

</bio>

<languages>

<lang level="5">php</lang>

<lang level="4">python</lang>

<lang level="2">java</lang>

</languages>

</attendee>', 1);

I really don't need the serial but every 30 seconds or so I plan to
overwrite the data portion. so, I suppose I wanted a simple key/value
thats where my orignal question stemmed from.

Why do you want to overwrite the data if you plan to refer to it below?

After xmltest has been populated, I can run xpath and unest to get my
data into a row but I would like to store that result in another table,
I am guessing I should look into triggers for something like that?
Eventually, I plan to have 5-6 downstream tables which will have xmltest

Why 5-6 tables?

Are they each holding some subset of data?

A schematic representation of what you are thinking of doing would help
with developing an answer to your question.

as my head. The application will be accessing the downstream tables and
rarely be touching xmltest (head table).

On Thu, Jan 26, 2017 at 10:38 AM, David G. Johnston
<david.g.johnston@gmail.com <mailto:david.g.johnston@gmail.com>> wrote:

On Thu, Jan 26, 2017 at 5:37 AM, Rita <rmorgan466@gmail.com
<mailto:rmorgan466@gmail.com>>wrote:

of course, sorry for being vague.

I have an external process generating a XML file (every 30 secs)
which is about 10MB. I would like to store the file as XML type
for me to query using xpath. I plan to query it every few
seconds by few hundred clients. so, it maybe easier for me
create a separate table of my xpath results and have clients
query that table (xpath can be expensive).

​If the XML being generated has a fixed structure/schema I
personally would treat the XML as a serialization format and
de-serialize and store it in a database as one or more relationally
linked tables.​ If you have to deal with the possibility of dynamic
structure I would still try to put the fixed items into individual
columns and then and then any dynamic items could be stuffed into an
hstore typed table.

My answer to your stated question is: what happened when you tried
doing that? Documentation and a bit of experimentation goes a long
ways in learning.

David J.

--
--- Get your facts first, then you can distort them as you please.--

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#9Rita
rmorgan466@gmail.com
In reply to: Adrian Klaver (#8)
Re: using hstore to store documents

I want to over write the data because I don't need to keep its history.
Yes, the 5-6 table will be the subset of the data. The subset of the data
will all come from xpath. I don't want to keep doing an xpath query when a
client requests the data.

On Sun, Jan 29, 2017 at 5:42 PM, Adrian Klaver <adrian.klaver@aklaver.com>
wrote:

On 01/28/2017 05:57 PM, Rita wrote:

sorry for the late reply.

My table schema is very simple

DROP TABLE xmltest;

create table xmltest(

id serial, -- dont really need the serial

Maybe not a serial id, but a Primary Key of some sort would help with what
you say you want to do below, I think.

data xml NOT null

);

INSERT INTO xmltest (data, id) VALUES ('

<attendee>

<bio>

<name>John Doe</name>

<birthYear>1986</birthYear>

</bio>

<languages>

<lang level="5">php</lang>

<lang level="4">python</lang>

<lang level="2">java</lang>

</languages>

</attendee>', 1);

I really don't need the serial but every 30 seconds or so I plan to
overwrite the data portion. so, I suppose I wanted a simple key/value
thats where my orignal question stemmed from.

Why do you want to overwrite the data if you plan to refer to it below?

After xmltest has been populated, I can run xpath and unest to get my
data into a row but I would like to store that result in another table,
I am guessing I should look into triggers for something like that?
Eventually, I plan to have 5-6 downstream tables which will have xmltest

Why 5-6 tables?

Are they each holding some subset of data?

A schematic representation of what you are thinking of doing would help
with developing an answer to your question.

as my head. The application will be accessing the downstream tables and

rarely be touching xmltest (head table).

On Thu, Jan 26, 2017 at 10:38 AM, David G. Johnston
<david.g.johnston@gmail.com <mailto:david.g.johnston@gmail.com>> wrote:

On Thu, Jan 26, 2017 at 5:37 AM, Rita <rmorgan466@gmail.com
<mailto:rmorgan466@gmail.com>>wrote:

of course, sorry for being vague.

I have an external process generating a XML file (every 30 secs)
which is about 10MB. I would like to store the file as XML type
for me to query using xpath. I plan to query it every few
seconds by few hundred clients. so, it maybe easier for me
create a separate table of my xpath results and have clients
query that table (xpath can be expensive).

​If the XML being generated has a fixed structure/schema I
personally would treat the XML as a serialization format and
de-serialize and store it in a database as one or more relationally
linked tables.​ If you have to deal with the possibility of dynamic
structure I would still try to put the fixed items into individual
columns and then and then any dynamic items could be stuffed into an
hstore typed table.

My answer to your stated question is: what happened when you tried
doing that? Documentation and a bit of experimentation goes a long
ways in learning.

David J.

--
--- Get your facts first, then you can distort them as you please.--

--
Adrian Klaver
adrian.klaver@aklaver.com

-- 
--- Get your facts first, then you can distort them as you please.--
#10David G. Johnston
david.g.johnston@gmail.com
In reply to: Rita (#7)

On Saturday, January 28, 2017, Rita <rmorgan466@gmail.com> wrote:

After xmltest has been populated, I can run xpath and unest to get my data
into a row but I would like to store that result in another table, I am
guessing I should look into triggers for something like that?

I suspect that using xpath in the database is not the right tool for doing
what you describe. Whatever middleware layer receives the XML should be
considered for the logic of deserialization and storage to the database in
normalized form. If you do want something like that in the database I'd
probably write a volatile function the receives xml and does whatever it
needs to do. I cannot imagine the indirection of a trigger would be
gainful here.

In particular, at scale you'd probably be better off with using a streaming
parser instead of a DOM one.

David J.

#11Rita
rmorgan466@gmail.com
In reply to: David G. Johnston (#10)
Re: using hstore to store documents

I think Xpath in postgresql is fast enough. I am dumping the raw xml file
into a table and then generating data from it. I want all the data to be
consistent thats why I am using a database.

I am planning to use triggers to generate content of
table1_master...table2_master, etc...

master -> table 1_master
-> table 2_master
-> table 3_master -> table 1_table3

On Sun, Jan 29, 2017 at 9:45 PM, David G. Johnston <
david.g.johnston@gmail.com> wrote:

On Saturday, January 28, 2017, Rita <rmorgan466@gmail.com> wrote:

After xmltest has been populated, I can run xpath and unest to get my
data into a row but I would like to store that result in another table, I
am guessing I should look into triggers for something like that?

I suspect that using xpath in the database is not the right tool for doing
what you describe. Whatever middleware layer receives the XML should be
considered for the logic of deserialization and storage to the database in
normalized form. If you do want something like that in the database I'd
probably write a volatile function the receives xml and does whatever it
needs to do. I cannot imagine the indirection of a trigger would be
gainful here.

In particular, at scale you'd probably be better off with using
a streaming parser instead of a DOM one.

David J.

-- 
--- Get your facts first, then you can distort them as you please.--