How to split up phone numbers?
Hi,
is there a way to split up phone numbers?
I know that's a tricky topic and it depends on the national phone number
format.
I'm especially interested in a solution for Germany, Swizerland and Austria.
I've got everything in a phone number column that makes hardly sense like:
+49432156780
0049 4321 5678 0
04321/5678-0
and so on...
Those 3 samples are actually the same number in different notations.
Aim would be to get a normalized number split up in 4 seperate columns
nr_nation
nr_city
nr_main
nr_individual
so I end up with
49 4321 5678 0 for central
49 4321 5678 42 for Mr. Smith
Is this doable?
It would be a start to at least split off nr_nation and nr_city.
On 02/20/2012 08:49 AM, Andreas wrote:
Hi,
is there a way to split up phone numbers?
I know that's a tricky topic and it depends on the national phone number format.
I'm especially interested in a solution for Germany, Swizerland and Austria.I've got everything in a phone number column that makes hardly sense like:
+49432156780
0049 4321 5678 0
04321/5678-0
and so on...
Those 3 samples are actually the same number in different notations.Aim would be to get a normalized number split up in 4 seperate columns
nr_nation
nr_city
nr_main
nr_individualso I end up with
49 4321 5678 0 for central
49 4321 5678 42 for Mr. SmithIs this doable?
It would be a start to at least split off nr_nation and nr_city.
I would do it in multiple passes. Trim everything out (spaces, slashes, etc) to try and get a constant number, then use the length to determin the different types of numbers, then use substring to pull out the parts.
-- for shorter phone numbers
select '49' as nr_nation,
substring(phone from 1 for 4) as nr_city,
etc...
where length( regexp_replace(phone, '[^\d]', '', 'g') ) = 10
and nr_nation is null;
-- for longer phone numbers
select substring(phone from 1 for 2) as nr_nation,
substring(phone from 3 for 4) as nr_city,
etc...
where length( regexp_replace(phone, '[^\d]', '', 'g') ) = 12
and nr_nation is null;
-- etc
-Andy
Dear Andreas,
this will basically require some external knowledge about the numbers logic.
The Format is not so a problem in the first place.
Numbers that start with national prefix will either start with "00" or
"+" followed by the 2 or 3 digit international prefix.
City part is of variable length (2 to 5 digits for Germany) (will
require a list of city prefix numbers for correct identification).
"main" part also is of variable length, thus separation of main and
individual part only is possible with explicit knowledge about a
specific number.
(e.g.: 12345678 could be 123456-78 or 12345-678 or 1234-5678, other
splittings are unlike but not completely impossible,
only restriction is the maximum length of internationally reachable
number length - 12 digits including internat. prefix if I remember
correctly, but could also be 14, sorry)
So, while it usually is easy to produce a valid dial string from any
such number given,
splitting such number into its logical parts requires additional
information to deliver reasonable results.
Rainer
Show quoted text
On 20.02.2012 15:49, Andreas wrote:
Hi,
is there a way to split up phone numbers?
I know that's a tricky topic and it depends on the national phone
number format.
I'm especially interested in a solution for Germany, Swizerland and
Austria.I've got everything in a phone number column that makes hardly sense
like:
+49432156780
0049 4321 5678 0
04321/5678-0
and so on...
Those 3 samples are actually the same number in different notations.Aim would be to get a normalized number split up in 4 seperate columns
nr_nation
nr_city
nr_main
nr_individualso I end up with
49 4321 5678 0 for central
49 4321 5678 42 for Mr. SmithIs this doable?
It would be a start to at least split off nr_nation and nr_city.
Maybe some processing using external libraries? For example
http://code.google.com/p/libphonenumber/ and you may try it there
http://libphonenumber.appspot.com/
Vojta
Dne 20.2.2012 15:49, Andreas napsal(a):
Show quoted text
Hi,
is there a way to split up phone numbers?
I know that's a tricky topic and it depends on the national phone
number format.
I'm especially interested in a solution for Germany, Swizerland and
Austria.I've got everything in a phone number column that makes hardly sense
like:
+49432156780
0049 4321 5678 0
04321/5678-0
and so on...
Those 3 samples are actually the same number in different notations.Aim would be to get a normalized number split up in 4 seperate columns
nr_nation
nr_city
nr_main
nr_individualso I end up with
49 4321 5678 0 for central
49 4321 5678 42 for Mr. SmithIs this doable?
It would be a start to at least split off nr_nation and nr_city.
On 02/20/2012 10:12 AM, Rainer Pruy wrote:
...only restriction is the maximum length of internationally reachable
number length - 12 digits including internat. prefix if I remember
correctly, but could also be 14, sorry)...
Per ITU-T E.164 the *recommended* maximum is 15.
6.1 International ITU-T E.164-number length
ITU-T recommends that the maximum number of digits for the international
geographic, global
services, Network and groups of countries applications should be 15
(excluding the international
prefix). Administrations are invited to do their utmost to limit the
digits to be dialled to the degree
possible consistent with the service needs.
Cheers,
Steve
On 02/20/2012 06:49 AM, Andreas wrote:
Hi,
is there a way to split up phone numbers?
Yes.
I know that's a tricky topic and it depends on the national phone
number format.
I'm especially interested in a solution for Germany, Swizerland and
Austria....
It really depends on *why* you want to split them. To determine
line-type (land/mobile/pager/special...)? For toll calculations? For
do-not-call compliance? For geolocation? Time-zone determination?
Determine the carrier or local switch ID?
In the US, the obvious first choice is NPA-NXX-#### (NPA/NXX commonly
referred to as area-code/prefix). But you may also want the 7th digit
which is the thousands-block (When a block of 10k numbers is split into
blocks for use by different carriers).
At least we have a standard length and format. Many countries I've
checked in South America have numbers of varying length but checking
certain digits in the middle can sometimes tell you the carrier or
land/mobile.
ITU E.164 recommendations are a place to start:
http://www.itu.int/rec/T-REC-E.164/en
But I'm afraid you ultimately need to study the formats for each country
as it relates to your needs.
Cheers,
Steve