design for multiple time series
Hi,
I'm working on the design of a database for time series data collected
by a variety of meteorological sensors. Many sensors share the same
sampling scheme, but not all. I initially thought it would be a good
idea to have a table identifying each parameter (variable) that the
sensors report on:
CREATE TABLE parameters (
parameter_id serial PRIMARY KEY,
parameter_name character_varying(200) NOT NULL,
...
)
and then store the data in a table referencing it:
CREATE TABLE series (
record_id serial PRIMARY KEY,
parameter_id integer REFERENCES parameters,
reading ????
...
)
but of course, the data type for the parameters may vary, so it's
impossible to assign a data type to the "reading" column. The number of
variables measured by the sensors is quite large and may grow or
decrease over time, and grouping them into subjects (tables) is not
clear, so it's not simple to just assign them to different columns.
I've been trying to search for solutions in various sources, but am
having trouble finding relevant material. I'd appreciate any advice.
Cheers,
--
Seb
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On Fri, Dec 13, 2013 at 12:15 AM, Seb <spluque@gmail.com> wrote:
Hi,
I'm working on the design of a database for time series data collected
by a variety of meteorological sensors. Many sensors share the same
sampling scheme, but not all. I initially thought it would be a good
idea to have a table identifying each parameter (variable) that the
sensors report on:CREATE TABLE parameters (
parameter_id serial PRIMARY KEY,
parameter_name character_varying(200) NOT NULL,
...
)and then store the data in a table referencing it:
CREATE TABLE series (
record_id serial PRIMARY KEY,
parameter_id integer REFERENCES parameters,
reading ????
...
)but of course, the data type for the parameters may vary, so it's
impossible to assign a data type to the "reading" column. The number of
variables measured by the sensors is quite large and may grow or
decrease over time, and grouping them into subjects (tables) is not
clear, so it's not simple to just assign them to different columns.I've been trying to search for solutions in various sources, but am
having trouble finding relevant material. I'd appreciate any advice.Cheers,
--
Seb--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
If you are not keen on using PostgreSQL, you could have a look at
http://opentsdb.net/
That was one project we found interesting when we were faced with a similar
problem a couple of years ago. In the end, many other factors made us opt
for Cassandra. We started with PostgreSQL. But our requirements included,
among others, ability to add new devices/parameters quickly. So the
persistence layer was mostly a data sink and we planned to move
cleansed/aggregated data to PostgreSQL for analysis. Most of the master
data was also in PostgreSQL - devicies, parameters, units.
O
n Thu, 2013-12-12 at 12:45 -0600, Seb wrote:I
'm working on the design of a database for time series data collected
--
Et in Arcadia, ego.
Floripa -- city of Land Rovers and alligators swimming in creeks.
sampling scheme, but not all. I initially thought it would be a good
idea to have a table identifying each parameter (variable) that the
sensors report on:CREATE TABLE parameters (
parameter_id serial PRIMARY KEY,
parameter_name character_varying(200) NOT NULL,
...
)and then store the data in a table referencing it:
CREATE TABLE series (
record_id serial PRIMARY KEY,
parameter_id integer REFERENCES parameters,
reading ????
...
)but of course, the data type for the parameters may vary, so it's
impossible to assign a data type to the "reading" column. The number of
variables measured by the sensors is quite large and may grow or
decrease over time, and grouping them into subjects (tables) is not
clear, so it's not simple to just assign them to different columns.I've been trying to search for solutions in various sources, but am
having trouble finding relevant material. I'd appreciate any advice.Cheers,
--
Seb
Hello Seb,
I am not a meteorologist and don't know "how" your sensors function, so
please bear with me. I am assuming each sensor reading consists of an
identifier, a timestamp and a numeric value.
As a first cut:-
1) a table to hold type of sensor and its reading metric. E.g, degrees
celsius, kph, litres/minute, etc. Maybe also hold min and max ranges for
validation purposes. E.g. wind direction would have a range of zero to
359.
2) a table to hold geographic location of sensor with a FK pointing to
its type. You could hold its latitude and longitude. Its "identifier"
matches the identifier returned by a reading.
3) a table to hold the readings with a FK pointing to its geographical
location with the actual reading held in a NUMBER(7,3) column, say?
4) a view over these tables using straightforward cartesian joins.
Use the view for analysis.
E.g., if you were going to build an airfield you know the location so
you can graph your wind rose by obtaining wind direction and velocity
each day for 'n' days.
The only fly in the ointment with this is a rain gauge. If you don't
empty it each day the actual rainfall is the difference between
readings.
HTH.
Cheers,
Robert
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On 12/13/2013 4:46 AM, rob stone wrote:
The only fly in the ointment with this is a rain gauge. If you don't
empty it each day the actual rainfall is the difference between
readings.
(somewhat off topic)
The electronic rain gauges I've seen have all been tip-bucket. they
measure each 0.01" (or equiv metric unit) at a time, and the rain total
is counter based, the weather station software automatically rolls over
daily, weekly, monthly totals, also tracks 'last 24 hours' which is a
rolling total.
the one I have has a 'teeter totter' rocker, each side of it is a
'bucket', when 0.01" of precip. comes down the funnel into the currently
high side of the teeter, it flips over to the other side, dumping that
0.01" and a magnet on the side of the teeter registers a pulse on a
nearby coil. now the other side fills, and it tips back
we've had 1.40" total at my house since the rainy season started in
August 1. we should be getting that much or more weekly this time of
year. this will be the 3rd year of draught on the central coast ;(((
best design for multiple time series *I've* seen is RRDTOOL, which is an
updated version of the classic MRTG. this is /not/ SQL.
--
john r pierce 37N 122W
somewhere on the middle of the left coast
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On 12/13/2013 4:46 AM, rob stone wrote:
The only fly in the ointment with this is a rain gauge. If you don't
empty it each day the actual rainfall is the difference between
readings.
(somewhat off topic)
The electronic rain gauges I've seen have all been tip-bucket. they
measure each 0.01" (or equiv metric unit) at a time, and the rain total
is counter based, the weather station software automatically rolls over
daily, weekly, monthly totals, also tracks 'last 24 hours' which is a
rolling total.
the one I have has a 'teeter totter' rocker, each side of it is a
'bucket', when 0.01" of precip. comes down the funnel into the currently
high side of the teeter, it flips over to the other side, dumping that
0.01" and a magnet on the side of the teeter registers a pulse on a
nearby coil. now the other side fills, and it tips back
we've had 1.40" total at my house since the rainy season started in
August 1. we should be getting that much or more weekly this time of
year. this will be the 3rd year of draught on the central coast ;(((
best design for multiple time series *I've* seen is RRDTOOL, which is an
updated version of the classic MRTG. this is /not/ SQL.
Hello,
(out of the topic also)
NetCDF , SCIDB
Regards
On Friday, December 13, 2013 7:55 PM, John R Pierce <pierce@hogranch.com> wrote:
On 12/13/2013 4:46 AM, rob stone wrote:
The only fly in the ointment with this is a rain gauge. If you don't
empty it each day the actual rainfall is the difference between
readings.
(somewhat off topic)
The electronic rain gauges I've seen have all been tip-bucket. they
measure each 0.01" (or equiv metric unit) at a time, and the rain total
is counter based, the weather station software automatically rolls over
daily, weekly, monthly totals, also tracks 'last 24 hours' which is a
rolling total.
the one I have has a 'teeter totter' rocker, each side of it is a
'bucket', when 0.01" of precip. comes down the funnel into the currently
high side of the teeter, it flips over to the other side, dumping that
0.01" and a magnet on the side of the teeter registers a pulse on a
nearby coil. now the other side fills, and it tips back
we've had 1.40" total at my house since the rainy season started in
August 1. we should be getting that much or more weekly this time of
year. this will be the 3rd year of draught on the central coast ;(((
best design for multiple time series *I've* seen is RRDTOOL, which is an
updated version of the classic MRTG. this is /not/ SQL.
--
john r pierce 37N 122W
somewhere on the middle of the left coast
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general