Normalization and regexp

Started by MTover 22 years ago3 messagesgeneral
Jump to latest
#1MT
m_tessier@sympatico.ca

Hi,

Since Canada Post hasn't programmed their automated web tools for calculating transportation costs to work with anything outside of Windows Internet Explorer, I'm obliged to write a web based program optimized for all those *other* browsers, myself. Part of this task requires that I set up tables in postgresql that match postal codes to transportation cost.

Canada Post provides a booklet for calculating transportation cost based on package weight and the first three characters of the postal code sequence. For instance, if I want to send a package to an address that includes G8F 1X1 as the postal code, I take the first 3 characters G8F and look them up in table 1.

Table 1

PostalCode Tarrif number
---------------------------
G4V 14
G8E-G8G 14
G4R-G4S 13

Since G8F falls in the G8E-G8G range, I now know that the tarrif number is 14. Taking the number 14, I go to table 2

Table 2

For tarrif Code 14
Weight(kg) Price
----------------------
1.0 5.37
1.5 5.61
2.0 5.82

If the weight of my package is 1kg, the price is 5.37 to send the package to the address bearing G8F as the first 3 characters of the postal code.

To render this in the database, I have done the following:

_____________________________
p_code |
=============================
pcode_id | tarrif |
-----------------------------
G4V | 14 |
-----------------------------
G8E | 14 |
-----------------------------
G8F | 14 |
-----------------------------
G8G | 14 |
-----------------------------
G4R | 13 |
-----------------------------
G4S | 13 |
-----------------------------

__________________________________
price_weight |
==================================
tarrif | weight(kg)| price |
----------------------------------
14 | 1.0 | 5.37 |
----------------------------------
14 | 1.5 | 5.61 |
----------------------------------
14 | 2.0 | 5.82 |
----------------------------------
13 | 1.0 | 5.20 |
----------------------------------
13 | 1.5 | 5.32 |
----------------------------------
13 | 2.0 | 5.42 |

Therefore my sql statement would look something like this:

SELECT price FROM price_weight
WHERE p_code.tarrif = price_weight.tarrif
AND pcode = 'G8F'
AND weight = '1.0';

I think this will work, but before I proceed, I'd like to ask 2 questions:

1.
Is it possible to further normalize the data in the p_code and price_weight tables above?

2.
Is it possible to abbreviate the number of records in the p_code table using regular expressions. For instance, to avoid repetition, I thought I'd use regular expressions, so that instead of entering the postal code into separate rows as such:

G4V 14
G8E 14
G8F 14
G8G 14

I could do something like this:

(G4V | G8[E-G]) 14

Somehow I don't think this is possible, but I'm looking for any way to minimize the number of postal codes that I have to enter, since there's a lot of them.

Anyway, I realize these questions may have more to do with database design than postgresql per se. If there's a better place to ask them, please point me in the right direction.

Thanks,

Mark

#2John Sidney-Woollett
johnsw@wardbrook.com
In reply to: MT (#1)
Re: Normalization and regexp

In order to reduce the number of rows in the p_code table, you could
modify it so that it had the following columns:

create table p_code (
pCode_Start varchar(3),
pCode_End varchar(3),
tariff integer,
primary key(pCode_Start, pCode_End)
);

If you need to insert a single code, use the following:

insert into pCode('G4V','G4V',14);

and a range as follows:

insert into pCode('G8E', 'G8G',14);

Your select statement is now:

SELECT price FROM price_weight pw, p_code pc
WHERE pc.tarrif = pw.tarrif
AND pCode_Start <= 'G8F'
AND pCode_End >= 'G8F'
AND weight = '1.0';

I think that should work OK...

John Sidney-Woollett

MT said:

Show quoted text

Hi,

Since Canada Post hasn't programmed their automated web tools for
calculating transportation costs to work with anything outside of Windows
Internet Explorer, I'm obliged to write a web based program optimized for
all those *other* browsers, myself. Part of this task requires that I set
up tables in postgresql that match postal codes to transportation cost.

Canada Post provides a booklet for calculating transportation cost based
on package weight and the first three characters of the postal code
sequence. For instance, if I want to send a package to an address that
includes G8F 1X1 as the postal code, I take the first 3 characters G8F and
look them up in table 1.

Table 1

PostalCode Tarrif number
---------------------------
G4V 14
G8E-G8G 14
G4R-G4S 13

Since G8F falls in the G8E-G8G range, I now know that the tarrif number is
14. Taking the number 14, I go to table 2

Table 2

For tarrif Code 14
Weight(kg) Price
----------------------
1.0 5.37
1.5 5.61
2.0 5.82

If the weight of my package is 1kg, the price is 5.37 to send the package
to the address bearing G8F as the first 3 characters of the postal code.

To render this in the database, I have done the following:

_____________________________
p_code |
=============================
pcode_id | tarrif |
-----------------------------
G4V | 14 |
-----------------------------
G8E | 14 |
-----------------------------
G8F | 14 |
-----------------------------
G8G | 14 |
-----------------------------
G4R | 13 |
-----------------------------
G4S | 13 |
-----------------------------

__________________________________
price_weight |
==================================
tarrif | weight(kg)| price |
----------------------------------
14 | 1.0 | 5.37 |
----------------------------------
14 | 1.5 | 5.61 |
----------------------------------
14 | 2.0 | 5.82 |
----------------------------------
13 | 1.0 | 5.20 |
----------------------------------
13 | 1.5 | 5.32 |
----------------------------------
13 | 2.0 | 5.42 |

Therefore my sql statement would look something like this:

SELECT price FROM price_weight
WHERE p_code.tarrif = price_weight.tarrif
AND pcode = 'G8F'
AND weight = '1.0';

I think this will work, but before I proceed, I'd like to ask 2 questions:

1.
Is it possible to further normalize the data in the p_code and
price_weight tables above?

2.
Is it possible to abbreviate the number of records in the p_code table
using regular expressions. For instance, to avoid repetition, I thought
I'd use regular expressions, so that instead of entering the postal code
into separate rows as such:

G4V 14
G8E 14
G8F 14
G8G 14

I could do something like this:

(G4V | G8[E-G]) 14

Somehow I don't think this is possible, but I'm looking for any way to
minimize the number of postal codes that I have to enter, since there's a
lot of them.

Anyway, I realize these questions may have more to do with database design
than postgresql per se. If there's a better place to ask them, please
point me in the right direction.

Thanks,

Mark

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

#3ezra epstein
ee_newsgroup_post@prajnait.com
In reply to: MT (#1)
Re: Normalization and regexp

There's no reason you couldn't store a regex string into a column. The
trick has only to do with proper escaping for the various steps, namely the
loading method (regular SQL has some escaping requirements and so you need
double back-slashes and escaped quotes). A work-around for this is to use
other input methods like COPY. Then there's the escaping required for the
regex tool of choice: either client-side or server(db)-side.

If you're doing the regex processing client-side this might be easier.
Depends on usage.

In short, the principal of what you propose seems completely sound.
However, there might be tricks to handling character escaping correctly.

== Ezra Epstein

"MT" <m_tessier@sympatico.ca> wrote in message
news:20031222181239.476cfa47.m_tessier@sympatico.ca...

Hi,

Since Canada Post hasn't programmed their automated web tools for

calculating transportation costs to work with anything outside of Windows
Internet Explorer, I'm obliged to write a web based program optimized for
all those *other* browsers, myself. Part of this task requires that I set up
tables in postgresql that match postal codes to transportation cost.

Canada Post provides a booklet for calculating transportation cost based

on package weight and the first three characters of the postal code
sequence. For instance, if I want to send a package to an address that
includes G8F 1X1 as the postal code, I take the first 3 characters G8F and
look them up in table 1.

Table 1

PostalCode Tarrif number
---------------------------
G4V 14
G8E-G8G 14
G4R-G4S 13

Since G8F falls in the G8E-G8G range, I now know that the tarrif number is

14. Taking the number 14, I go to table 2

Table 2

For tarrif Code 14
Weight(kg) Price
----------------------
1.0 5.37
1.5 5.61
2.0 5.82

If the weight of my package is 1kg, the price is 5.37 to send the package

to the address bearing G8F as the first 3 characters of the postal code.

To render this in the database, I have done the following:

_____________________________
p_code |
=============================
pcode_id | tarrif |
-----------------------------
G4V | 14 |
-----------------------------
G8E | 14 |
-----------------------------
G8F | 14 |
-----------------------------
G8G | 14 |
-----------------------------
G4R | 13 |
-----------------------------
G4S | 13 |
-----------------------------

__________________________________
price_weight |
==================================
tarrif | weight(kg)| price |
----------------------------------
14 | 1.0 | 5.37 |
----------------------------------
14 | 1.5 | 5.61 |
----------------------------------
14 | 2.0 | 5.82 |
----------------------------------
13 | 1.0 | 5.20 |
----------------------------------
13 | 1.5 | 5.32 |
----------------------------------
13 | 2.0 | 5.42 |

Therefore my sql statement would look something like this:

SELECT price FROM price_weight
WHERE p_code.tarrif = price_weight.tarrif
AND pcode = 'G8F'
AND weight = '1.0';

I think this will work, but before I proceed, I'd like to ask 2 questions:

1.
Is it possible to further normalize the data in the p_code and

price_weight tables above?

2.
Is it possible to abbreviate the number of records in the p_code table

using regular expressions. For instance, to avoid repetition, I thought I'd
use regular expressions, so that instead of entering the postal code into
separate rows as such:

G4V 14
G8E 14
G8F 14
G8G 14

I could do something like this:

(G4V | G8[E-G]) 14

Somehow I don't think this is possible, but I'm looking for any way to

minimize the number of postal codes that I have to enter, since there's a
lot of them.

Anyway, I realize these questions may have more to do with database design

than postgresql per se. If there's a better place to ask them, please point
me in the right direction.

Show quoted text

Thanks,

Mark

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings