Normalization and regexp

Started by MTover 22 years ago3 messagesgeneral

m_tessier@sympatico.ca

over 22 years ago

Hi,

Since Canada Post hasn't programmed their automated web tools for calculating transportation costs to work with anything outside of Windows Internet Explorer, I'm obliged to write a web based program optimized for all those *other* browsers, myself. Part of this task requires that I set up tables in postgresql that match postal codes to transportation cost.

Canada Post provides a booklet for calculating transportation cost based on package weight and the first three characters of the postal code sequence. For instance, if I want to send a package to an address that includes G8F 1X1 as the postal code, I take the first 3 characters G8F and look them up in table 1.

Table 1

PostalCode Tarrif number
---------------------------
G4V 14
G8E-G8G 14
G4R-G4S 13

Since G8F falls in the G8E-G8G range, I now know that the tarrif number is 14. Taking the number 14, I go to table 2

Table 2

For tarrif Code 14
Weight(kg) Price
----------------------
1.0 5.37
1.5 5.61
2.0 5.82

If the weight of my package is 1kg, the price is 5.37 to send the package to the address bearing G8F as the first 3 characters of the postal code.

To render this in the database, I have done the following:

_____________________________
p_code |
=============================
pcode_id | tarrif |
-----------------------------
G4V | 14 |
-----------------------------
G8E | 14 |
-----------------------------
G8F | 14 |
-----------------------------
G8G | 14 |
-----------------------------
G4R | 13 |
-----------------------------
G4S | 13 |
-----------------------------

__________________________________
price_weight |
==================================
tarrif | weight(kg)| price |
----------------------------------
14 | 1.0 | 5.37 |
----------------------------------
14 | 1.5 | 5.61 |
----------------------------------
14 | 2.0 | 5.82 |
----------------------------------
13 | 1.0 | 5.20 |
----------------------------------
13 | 1.5 | 5.32 |
----------------------------------
13 | 2.0 | 5.42 |

Therefore my sql statement would look something like this:

SELECT price FROM price_weight
WHERE p_code.tarrif = price_weight.tarrif
AND pcode = 'G8F'
AND weight = '1.0';

I think this will work, but before I proceed, I'd like to ask 2 questions:

1.
Is it possible to further normalize the data in the p_code and price_weight tables above?

2.
Is it possible to abbreviate the number of records in the p_code table using regular expressions. For instance, to avoid repetition, I thought I'd use regular expressions, so that instead of entering the postal code into separate rows as such:

G4V 14
G8E 14
G8F 14
G8G 14

I could do something like this:

(G4V | G8[E-G]) 14

Somehow I don't think this is possible, but I'm looking for any way to minimize the number of postal codes that I have to enter, since there's a lot of them.

Anyway, I realize these questions may have more to do with database design than postgresql per se. If there's a better place to ask them, please point me in the right direction.

Thanks,

Mark

John Sidney-Woollett

johnsw@wardbrook.com

over 22 years ago

In reply to: MT (#1)

Re: Normalization and regexp

In order to reduce the number of rows in the p_code table, you could
modify it so that it had the following columns:

create table p_code (
pCode_Start varchar(3),
pCode_End varchar(3),
tariff integer,
primary key(pCode_Start, pCode_End)
);

If you need to insert a single code, use the following:

insert into pCode('G4V','G4V',14);

and a range as follows:

insert into pCode('G8E', 'G8G',14);

Your select statement is now:

SELECT price FROM price_weight pw, p_code pc
WHERE pc.tarrif = pw.tarrif
AND pCode_Start <= 'G8F'
AND pCode_End >= 'G8F'
AND weight = '1.0';

I think that should work OK...

John Sidney-Woollett

MT said:

Show quoted text

Hi,

Since Canada Post hasn't programmed their automated web tools for
calculating transportation costs to work with anything outside of Windows
Internet Explorer, I'm obliged to write a web based program optimized for
all those *other* browsers, myself. Part of this task requires that I set
up tables in postgresql that match postal codes to transportation cost.

Canada Post provides a booklet for calculating transportation cost based
on package weight and the first three characters of the postal code
sequence. For instance, if I want to send a package to an address that
includes G8F 1X1 as the postal code, I take the first 3 characters G8F and
look them up in table 1.

Table 1

PostalCode Tarrif number
---------------------------
G4V 14
G8E-G8G 14
G4R-G4S 13

Since G8F falls in the G8E-G8G range, I now know that the tarrif number is
14. Taking the number 14, I go to table 2

Table 2

For tarrif Code 14
Weight(kg) Price
----------------------
1.0 5.37
1.5 5.61
2.0 5.82

If the weight of my package is 1kg, the price is 5.37 to send the package
to the address bearing G8F as the first 3 characters of the postal code.

To render this in the database, I have done the following:

_____________________________
p_code |
=============================
pcode_id | tarrif |
-----------------------------
G4V | 14 |
-----------------------------
G8E | 14 |
-----------------------------
G8F | 14 |
-----------------------------
G8G | 14 |
-----------------------------
G4R | 13 |
-----------------------------
G4S | 13 |
-----------------------------

__________________________________
price_weight |
==================================
tarrif | weight(kg)| price |
----------------------------------
14 | 1.0 | 5.37 |
----------------------------------
14 | 1.5 | 5.61 |
----------------------------------
14 | 2.0 | 5.82 |
----------------------------------
13 | 1.0 | 5.20 |
----------------------------------
13 | 1.5 | 5.32 |
----------------------------------
13 | 2.0 | 5.42 |

Therefore my sql statement would look something like this:

SELECT price FROM price_weight
WHERE p_code.tarrif = price_weight.tarrif
AND pcode = 'G8F'
AND weight = '1.0';

I think this will work, but before I proceed, I'd like to ask 2 questions:

1.
Is it possible to further normalize the data in the p_code and
price_weight tables above?

2.
Is it possible to abbreviate the number of records in the p_code table
using regular expressions. For instance, to avoid repetition, I thought
I'd use regular expressions, so that instead of entering the postal code
into separate rows as such:

G4V 14
G8E 14
G8F 14
G8G 14

I could do something like this:

(G4V | G8[E-G]) 14

Somehow I don't think this is possible, but I'm looking for any way to
minimize the number of postal codes that I have to enter, since there's a
lot of them.

Anyway, I realize these questions may have more to do with database design
than postgresql per se. If there's a better place to ask them, please
point me in the right direction.

Thanks,

Mark

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

ezra epstein

ee_newsgroup_post@prajnait.com

over 22 years ago

In reply to: MT (#1)

Re: Normalization and regexp

There's no reason you couldn't store a regex string into a column. The
trick has only to do with proper escaping for the various steps, namely the
loading method (regular SQL has some escaping requirements and so you need
double back-slashes and escaped quotes). A work-around for this is to use
other input methods like COPY. Then there's the escaping required for the
regex tool of choice: either client-side or server(db)-side.

If you're doing the regex processing client-side this might be easier.
Depends on usage.

In short, the principal of what you propose seems completely sound.
However, there might be tricks to handling character escaping correctly.

== Ezra Epstein

"MT" <m_tessier@sympatico.ca> wrote in message
news:20031222181239.476cfa47.m_tessier@sympatico.ca...

Hi,

Since Canada Post hasn't programmed their automated web tools for

calculating transportation costs to work with anything outside of Windows
Internet Explorer, I'm obliged to write a web based program optimized for
all those *other* browsers, myself. Part of this task requires that I set up
tables in postgresql that match postal codes to transportation cost.

Canada Post provides a booklet for calculating transportation cost based

on package weight and the first three characters of the postal code
sequence. For instance, if I want to send a package to an address that
includes G8F 1X1 as the postal code, I take the first 3 characters G8F and
look them up in table 1.

Table 1

PostalCode Tarrif number
---------------------------
G4V 14
G8E-G8G 14
G4R-G4S 13

Since G8F falls in the G8E-G8G range, I now know that the tarrif number is

14. Taking the number 14, I go to table 2

Table 2

For tarrif Code 14
Weight(kg) Price
----------------------
1.0 5.37
1.5 5.61
2.0 5.82

If the weight of my package is 1kg, the price is 5.37 to send the package

to the address bearing G8F as the first 3 characters of the postal code.

To render this in the database, I have done the following:

_____________________________
p_code |
=============================
pcode_id | tarrif |
-----------------------------
G4V | 14 |
-----------------------------
G8E | 14 |
-----------------------------
G8F | 14 |
-----------------------------
G8G | 14 |
-----------------------------
G4R | 13 |
-----------------------------
G4S | 13 |
-----------------------------

__________________________________
price_weight |
==================================
tarrif | weight(kg)| price |
----------------------------------
14 | 1.0 | 5.37 |
----------------------------------
14 | 1.5 | 5.61 |
----------------------------------
14 | 2.0 | 5.82 |
----------------------------------
13 | 1.0 | 5.20 |
----------------------------------
13 | 1.5 | 5.32 |
----------------------------------
13 | 2.0 | 5.42 |

Therefore my sql statement would look something like this:

SELECT price FROM price_weight
WHERE p_code.tarrif = price_weight.tarrif
AND pcode = 'G8F'
AND weight = '1.0';

I think this will work, but before I proceed, I'd like to ask 2 questions:

1.
Is it possible to further normalize the data in the p_code and

price_weight tables above?

2.
Is it possible to abbreviate the number of records in the p_code table

using regular expressions. For instance, to avoid repetition, I thought I'd
use regular expressions, so that instead of entering the postal code into
separate rows as such:

G4V 14
G8E 14
G8F 14
G8G 14

I could do something like this:

(G4V | G8[E-G]) 14

Somehow I don't think this is possible, but I'm looking for any way to

minimize the number of postal codes that I have to enter, since there's a
lot of them.

Anyway, I realize these questions may have more to do with database design

than postgresql per se. If there's a better place to ask them, please point
me in the right direction.

Show quoted text

Thanks,

Mark

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings