Pre-proposal: unicode normalized text

Started by Jeff Davis over 2 years ago, 77 messages
#1Jeff Davis
pgsql@j-davis.com

One of the frustrations with using the "C" locale (or any deterministic
locale) is that the following returns false:

SELECT 'á' = 'á'; -- false

because those are the unicode sequences U&'\0061\0301' and U&'\00E1',
respectively, so memcmp() returns non-zero. But it's really the same
character with just a different representation, and if you normalize
them they are equal:

SELECT normalize('á') = normalize('á'); -- true

The idea is to have a new data type, say "UTEXT", that normalizes the
input so that it can have an improved notion of equality while still
using memcmp().
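
As a rough illustration of the idea (a sketch in Python rather than PostgreSQL, with unicodedata.normalize standing in for the normalize() function; the helper name utext_in is hypothetical and not part of the proposal), normalizing on input is enough to make a plain byte comparison agree:

```python
import unicodedata

def utext_in(s: str) -> bytes:
    """Hypothetical input function: normalize to NFC, then store UTF-8 bytes."""
    return unicodedata.normalize("NFC", s).encode("utf-8")

decomposed = "a\u0301"   # U+0061 U+0301
precomposed = "\u00e1"   # U+00E1

# The raw bytes (what memcmp() would see) differ before normalization...
print(decomposed.encode("utf-8") == precomposed.encode("utf-8"))
# ...but are identical once both inputs were normalized on the way in.
print(utext_in(decomposed) == utext_in(precomposed))
```

The comparison itself stays a byte comparison; all of the extra work happens once, at input time.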

Unicode guarantees that "the results of normalizing a string on one
version will always be the same as normalizing it on any other version,
as long as the string contains only assigned characters according to
both versions"[1]. It also guarantees that it "will not reallocate,
remove, or reassign" characters[2]. That means that we can normalize in
a forward-compatible way as long as we don't allow the use of
unassigned code points.

I looked at the standard to see what it had to say, and it discusses
normalization, but a standard UCS string with an unassigned code point
is not an error. Without a data type to enforce the constraint that
there are no unassigned code points, we can't guarantee forward
compatibility. Some other systems support NVARCHAR, but I didn't see
any guarantee of normalization or blocking unassigned code points
there, either.

UTEXT benefits:
* slightly better natural language semantics than TEXT with
deterministic collation
* still deterministic=true
* fast memcmp()-based comparisons
* no breaking semantic changes as unicode evolves

TEXT allows unassigned code points, and generally returns the same byte
sequences that were originally entered; therefore UTEXT is not a
replacement for TEXT.

UTEXT could be built-in or it could be an extension or in contrib. If
an extension, we'd probably want to at least expose a function that can
detect unassigned code points, so that it's easy to be consistent with
the auto-generated unicode tables. I also notice that there already is
an unassigned code points table in saslprep.c, but it seems to be
frozen as of Unicode 3.2, and I'm not sure why.

Questions:

* Would this be useful enough to justify a new data type? Would it be
confusing about when to choose one versus the other?
* Would cross-type comparisons between TEXT and UTEXT become a major
problem that would reduce the utility?
* Should "some_utext_value = some_text_value" coerce the LHS to TEXT
or the RHS to UTEXT?
* Other comments or am I missing something?

Regards,
Jeff Davis

[1]: https://unicode.org/reports/tr15/
[2]: https://www.unicode.org/policies/stability_policy.html

#2Peter Eisentraut
peter@eisentraut.org
In reply to: Jeff Davis (#1)
Re: Pre-proposal: unicode normalized text

On 13.09.23 00:47, Jeff Davis wrote:

The idea is to have a new data type, say "UTEXT", that normalizes the
input so that it can have an improved notion of equality while still
using memcmp().

I think a new type like this would obviously be suboptimal because it's
nonstandard and most people wouldn't use it.

I think a better direction here would be to work toward making
nondeterministic collations usable on the global/database level and then
encouraging users to use those.

It's also not clear which way the performance tradeoffs would fall.

Nondeterministic collations are obviously going to be slower, but by how
much? People have accepted moving from C locale to "real" locales
because they needed those semantics. Would it be any worse moving from
real locales to "even realer" locales?

On the other hand, a utext type would either require a large set of its
own functions and operators, or you would have to inject text-to-utext
casts in places, which would also introduce overhead.

#3Robert Haas
robertmhaas@gmail.com
In reply to: Peter Eisentraut (#2)
Re: Pre-proposal: unicode normalized text

On Mon, Oct 2, 2023 at 3:42 PM Peter Eisentraut <peter@eisentraut.org> wrote:

I think a better direction here would be to work toward making
nondeterministic collations usable on the global/database level and then
encouraging users to use those.

It seems to me that this overlooks one of the major points of Jeff's
proposal, which is that we don't reject text input that contains
unassigned code points. That decision turns out to be really painful.
Here, Jeff mentions normalization, but I think it's a major issue with
collation support. If new code points are added, users can put them
into the database before they are known to the collation library, and
then when they become known to the collation library the sort order
changes and indexes break. Would we endorse a proposal to make
pg_catalog.text with encoding UTF-8 reject code points that aren't yet
known to the collation library? To do so would tighten things up
considerably from where they stand today, and the way things stand
today is already rigid enough to cause problems for some users. But if
we're not willing to do that then I find it easy to understand why
Jeff wants an alternative type that does.

Now, there is still the question of whether such a data type would
properly belong in core or even contrib rather than being an
out-of-core project. It's not obvious to me that such a data type
would get enough traction that we'd want it to be part of PostgreSQL
itself. But at the same time I can certainly understand why Jeff finds
the status quo problematic.

--
Robert Haas
EDB: http://www.enterprisedb.com

#4Nico Williams
nico@cryptonector.com
In reply to: Jeff Davis (#1)
Re: Pre-proposal: unicode normalized text

On Tue, Sep 12, 2023 at 03:47:10PM -0700, Jeff Davis wrote:

One of the frustrations with using the "C" locale (or any deterministic
locale) is that the following returns false:

SELECT 'á' = 'á'; -- false

because those are the unicode sequences U&'\0061\0301' and U&'\00E1',
respectively, so memcmp() returns non-zero. But it's really the same
character with just a different representation, and if you normalize
them they are equal:

SELECT normalize('á') = normalize('á'); -- true

I think you misunderstand Unicode normalization and equivalence. There
is no standard Unicode `normalize()` that would cause the above equality
predicate to be true. If you normalize to NFD (normal form decomposed)
then a _prefix_ of those two strings will be equal, but that's clearly
not what you're looking for.

PostgreSQL already has Unicode normalization support, though it would be
nice to also have form-insensitive indexing and equality predicates.

There are two ways to write 'á' in Unicode: one is pre-composed (one
codepoint) and the other is decomposed (two codepoints in this specific
case), and it would be nice to be able to preserve input form when
storing strings but then still be able to index and match them
form-insensitively (in the case of 'á' both equivalent representations
should be considered equal, and for UNIQUE indexes they should be
considered the same).

You could also have functions that perform lossy normalization in the
sort of way that soundex does, such as first normalizing to NFD then
dropping all combining codepoints which then could allow 'á' to be eq to
'a'. But this would not be a Unicode normalization function.
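
A minimal sketch of such a lossy function, assuming Python's unicodedata module and a hypothetical helper name strip_marks. Here "combining codepoints" is taken to mean code points with a nonzero canonical combining class, which is one possible reading:

```python
import unicodedata

def strip_marks(s: str) -> str:
    """Lossy folding: decompose to NFD, then drop code points whose
    canonical combining class is nonzero (accents and similar marks)."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(ch for ch in decomposed if unicodedata.combining(ch) == 0)

print(strip_marks("\u00e1") == "a")          # precomposed input
print(strip_marks("a\u0301") == "a")         # decomposed input
print(strip_marks("\u00e1") == "\u00e1")     # folding changed the string
```

Because information is thrown away, the result is not any of the Unicode normalization forms; it is closer to a search key.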

Nico
--

#5Jeff Davis
pgsql@j-davis.com
In reply to: Nico Williams (#4)
Re: Pre-proposal: unicode normalized text

On Mon, 2023-10-02 at 15:27 -0500, Nico Williams wrote:

I think you misunderstand Unicode normalization and equivalence. There
is no standard Unicode `normalize()` that would cause the above equality
predicate to be true. If you normalize to NFD (normal form decomposed)
then a _prefix_ of those two strings will be equal, but that's clearly
not what you're looking for.

From [1]:

"Unicode Normalization Forms are formally defined normalizations of
Unicode strings which make it possible to determine whether any two
Unicode strings are equivalent to each other. Depending on the
particular Unicode Normalization Form, that equivalence can either be a
canonical equivalence or a compatibility equivalence... A binary
comparison of the transformed strings will then determine equivalence."

NFC and NFD are based on Canonical Equivalence.

"Canonical equivalence is a fundamental equivalency between characters
or sequences of characters which represent the same abstract character,
and which when correctly displayed should always have the same visual
appearance and behavior."

Can you explain why NFC (the default form of normalization used by the
postgres normalize() function), followed by memcmp(), is not the right
thing to use to determine Canonical Equivalence?

Or are you saying that Canonical Equivalence is not a useful thing to
test?

What do you mean about the "prefix"?

In Postgres today:

SELECT normalize(U&'\0061\0301', nfc)::bytea; -- \xc3a1
SELECT normalize(U&'\00E1', nfc)::bytea; -- \xc3a1

SELECT normalize(U&'\0061\0301', nfd)::bytea; -- \x61cc81
SELECT normalize(U&'\00E1', nfd)::bytea; -- \x61cc81

which looks useful to me, but I assume you are saying that it doesn't
generalize well to other cases?

[1]: https://unicode.org/reports/tr15/

There are two ways to write 'á' in Unicode: one is pre-composed (one
codepoint) and the other is decomposed (two codepoints in this specific
case), and it would be nice to be able to preserve input form when
storing strings but then still be able to index and match them
form-insensitively (in the case of 'á' both equivalent representations
should be considered equal, and for UNIQUE indexes they should be
considered the same).

Sometimes preserving input differences is a good thing, other times
it's not, depending on the context. Almost any data type has some
aspects of the input that might not be preserved -- leading zeros in a
number, or whitespace in jsonb, etc.

If text is stored as normalized with NFC, it could be frustrating if
the retrieved string has a different binary representation than the
source data. But it could also be frustrating to look at two strings
made up of ordinary characters that look identical and for the database
to consider them unequal.

Regards,
Jeff Davis

#6Jeff Davis
pgsql@j-davis.com
In reply to: Robert Haas (#3)
Re: Pre-proposal: unicode normalized text

On Mon, 2023-10-02 at 16:06 -0400, Robert Haas wrote:

It seems to me that this overlooks one of the major points of Jeff's
proposal, which is that we don't reject text input that contains
unassigned code points. That decision turns out to be really painful.

Yeah, because we lose forward-compatibility of some useful operations.

Here, Jeff mentions normalization, but I think it's a major issue with
collation support. If new code points are added, users can put them
into the database before they are known to the collation library, and
then when they become known to the collation library the sort order
changes and indexes break.

The collation version number may reflect the change in understanding
about assigned code points that may affect collation -- though I'd like
to understand whether this is guaranteed or not.

Regardless, given that (a) we don't have a good story for migrating to
new collation versions; and (b) it would be painful to rebuild indexes
even if we did; then you are right that it's a problem.

Would we endorse a proposal to make
pg_catalog.text with encoding UTF-8 reject code points that aren't yet
known to the collation library? To do so would tighten things up
considerably from where they stand today, and the way things stand
today is already rigid enough to cause problems for some users.

What problems exist today due to the rigidity of text?

I assume you mean because we reject invalid byte sequences? Yeah, I'm
sure that causes a problem for some (especially migrations), but it's
difficult for me to imagine a database working well with no rules at
all for the basic data types.

Now, there is still the question of whether such a data type would
properly belong in core or even contrib rather than being an
out-of-core project. It's not obvious to me that such a data type
would get enough traction that we'd want it to be part of PostgreSQL
itself.

At minimum I think we need to have some internal functions to check for
unassigned code points. That belongs in core, because we generate the
unicode tables from a specific version.

I also think we should expose some SQL functions to check for
unassigned code points. That sounds useful, especially since we already
expose normalization functions.

One could easily imagine a domain with CHECK(NOT
contains_unassigned(a)). Or an extension with a data type that uses the
internal functions.
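
To make the check concrete, here is a sketch in Python, where contains_unassigned is the hypothetical function name used above. It assumes "unassigned" means general category Cn, which also covers noncharacters, and it can only answer for the Unicode version bundled with the running Python build:

```python
import unicodedata

def contains_unassigned(s: str) -> bool:
    """True if any code point has general category Cn (not assigned) in
    the Unicode version shipped with this Python build."""
    return any(unicodedata.category(ch) == "Cn" for ch in s)

# The answer depends on this version, just as the server's answer would
# depend on the version its Unicode tables were generated from.
print(unicodedata.unidata_version)

print(contains_unassigned("caf\u00e9"))    # only assigned code points
print(contains_unassigned("abc\u0378"))    # U+0378 is a reserved gap in the Greek block
```

A domain or input function built on such a check would reject the second string until a later Unicode version assigns that code point.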

Whether we ever get to a core data type -- and more importantly,
whether anyone uses it -- I'm not sure.

But at the same time I can certainly understand why Jeff finds
the status quo problematic.

Yeah, I am looking for a better compromise between:

* everything is memcmp() and 'á' sometimes doesn't equal 'á'
(depending on code point sequence)
* everything is constantly changing, indexes break, and text
comparisons are slow

A stable idea of unicode normalization based on using only assigned
code points is very tempting.

Regards,
Jeff Davis

#7Nico Williams
nico@cryptonector.com
In reply to: Jeff Davis (#5)
Re: Pre-proposal: unicode normalized text

On Tue, Oct 03, 2023 at 12:15:10PM -0700, Jeff Davis wrote:

On Mon, 2023-10-02 at 15:27 -0500, Nico Williams wrote:

I think you misunderstand Unicode normalization and equivalence. 
There is no standard Unicode `normalize()` that would cause the
above equality predicate to be true.  If you normalize to NFD
(normal form decomposed) then a _prefix_ of those two strings will
be equal, but that's clearly not what you're looking for.

Ugh, My client is not displaying 'a' correctly, thus I misunderstood your
post.

From [1]:

Here's what you wrote in your post:

| [...] But it's really the same
| character with just a different representation, and if you normalize
| them they are equal:
|
| SELECT normalize('á') = normalize('á'); -- true

but my client is not displaying 'a' correctly! (It displays like 'a' but
it should display like 'á'.)

Bah. So I'd (mis)interpreted you as saying that normalize('a') should
equal normalize('á'). Please disregard that part of my reply.

There are two ways to write 'á' in Unicode: one is pre-composed (one
codepoint) and the other is decomposed (two codepoints in this
specific case), and it would be nice to be able to preserve input
form when storing strings but then still be able to index and match
them form-insensitively (in the case of 'á' both equivalent
representations should be considered equal, and for UNIQUE indexes
they should be considered the same).

Sometimes preserving input differences is a good thing, other times
it's not, depending on the context. Almost any data type has some
aspects of the input that might not be preserved -- leading zeros in a
number, or whitespace in jsonb, etc.

Almost every Latin input mode out there produces precomposed characters
and so they effectively produce NFC. I'm not sure if the same is true
for, e.g., Hangul (Korean) and various other scripts.

But there are things out there that produce NFD. Famously Apple's HFS+
uses NFD (or something very close to NFD). So if you cut-n-paste things
that got normalized to NFD and paste them into contexts where
normalization isn't done, then you might start wanting to alter those
contexts to either normalize or be form-preserving/form-insensitive.
Sometimes you don't get to normalize, so you have to pick form-
preserving/form-insensitive behavior.

If text is stored as normalized with NFC, it could be frustrating if
the retrieved string has a different binary representation than the
source data. But it could also be frustrating to look at two strings
made up of ordinary characters that look identical and for the database
to consider them unequal.

Exactly. If you have such a case you might like the option to make your
database form-preserving and form-insensitive. That means that indices
need to normalize strings, but tables need to store unnormalized
strings.

ZFS (filesystems are a bit like databases) does just that!
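
A toy sketch of what "form-preserving and form-insensitive" could look like, using an in-memory Python dict in place of a real table and index. The class name FormInsensitiveSet is made up for this illustration, and the sketch says nothing about how ZFS or PostgreSQL would implement it:

```python
import unicodedata

class FormInsensitiveSet:
    """Stores strings exactly as given, but looks them up by NFC key."""

    def __init__(self):
        self._by_key = {}  # NFC key -> string as originally entered

    def add(self, s: str) -> bool:
        key = unicodedata.normalize("NFC", s)
        if key in self._by_key:
            return False   # behaves like a UNIQUE violation
        self._by_key[key] = s
        return True

    def get(self, s: str):
        return self._by_key.get(unicodedata.normalize("NFC", s))

names = FormInsensitiveSet()
print(names.add("a\u0301"))                 # decomposed form is stored as-is
print(names.add("\u00e1"))                  # equivalent form counts as a duplicate
print(names.get("\u00e1") == "a\u0301")     # lookup returns the original form
```

Only the key is normalized; the stored value keeps whatever form the client sent.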

Nico
--

#8Jeff Davis
pgsql@j-davis.com
In reply to: Nico Williams (#7)
Re: Pre-proposal: unicode normalized text

On Tue, 2023-10-03 at 15:15 -0500, Nico Williams wrote:

Ugh, My client is not displaying 'a' correctly

Ugh. Is that an argument in favor of normalization or against?

I've also noticed that some fonts render the same character a bit
differently depending on the constituent code points. For instance, if
the accent is its own code point, it seems to be more prominent than if
a single code point represents both the base character and the accent.
That seems to be a violation, but I can understand why that might be
useful.

Almost every Latin input mode out there produces precomposed
characters and so they effectively produce NFC.

The problem is not the normal case, the problem will be things like
obscure input methods, some kind of software that's being too clever,
or some kind of malicious user trying to confuse the database.

That means that indices
need to normalize strings, but tables need to store unnormalized
strings.

That's an interesting idea. Would the equality operator normalize
first, or are you saying that the index would need to recheck the
results?

Regards,
Jeff Davis

#9Jeff Davis
pgsql@j-davis.com
In reply to: Peter Eisentraut (#2)
Re: Pre-proposal: unicode normalized text

On Mon, 2023-10-02 at 10:47 +0200, Peter Eisentraut wrote:

I think a better direction here would be to work toward making
nondeterministic collations usable on the global/database level and then
encouraging users to use those.

It's also not clear which way the performance tradeoffs would fall.

Nondeterministic collations are obviously going to be slower, but by how
much? People have accepted moving from C locale to "real" locales
because they needed those semantics. Would it be any worse moving from
real locales to "even realer" locales?

If you normalize first, then you can get some semantic improvements
without giving up on the stability and performance of memcmp(). That
seems like a win with zero costs in terms of stability or performance
(except perhaps some extra text->utext casts).

Going to a "real" locale gives more semantic benefits but at a very
high cost: depending on a collation provider library, dealing with
collation changes, and performance costs. While supporting the use of
nondeterministic collations at the database level may be a good idea,
it's not helping to reach the compromise that I'm trying to reach in
this thread.

Regards,
Jeff Davis

#10Nico Williams
nico@cryptonector.com
In reply to: Jeff Davis (#8)
Re: Pre-proposal: unicode normalized text

On Tue, Oct 03, 2023 at 03:34:44PM -0700, Jeff Davis wrote:

On Tue, 2023-10-03 at 15:15 -0500, Nico Williams wrote:

Ugh, My client is not displaying 'a' correctly

Ugh. Is that an argument in favor of normalization or against?

Heheh, well, it's an argument in favor of more software getting this
right (darn it).

It's also an argument for building a time machine so HFS+ can just
always have used NFC. But the existence of UTF-16 is proof that time
machines don't exist (or that only bad actors have them).

I've also noticed that some fonts render the same character a bit
differently depending on the constituent code points. For instance, if
the accent is its own code point, it seems to be more prominent than if
a single code point represents both the base character and the accent.
That seems to be a violation, but I can understand why that might be
useful.

Yes, that happens. Did you know that the ASCII character set was
designed with overstrike in mind for typing of accented Latin
characters? Unicode combining sequences are kinda like that, but more
complex.

Yes, the idea really was that you could write a<BS>' (or '<BS>a) to get á.
That's how people did it with typewriters anyways.

Almost every Latin input mode out there produces precomposed
characters and so they effectively produce NFC.

The problem is not the normal case, the problem will be things like
obscure input methods, some kind of software that's being too clever,
or some kind of malicious user trying to confuse the database.

_HFS+ enters the chat_

That means that indices
need to normalize strings, but tables need to store unnormalized
strings.

That's an interesting idea. Would the equality operator normalize
first, or are you saying that the index would need to recheck the
results?

You can optimize this to avoid having to normalize first. Most strings
are not equal, and they tend to differ early. And most strings will
likely be ASCII-mostly or in the same form anyways. So you can just
walk a cursor down each string looking at two bytes, and if they are
both ASCII then you move each cursor forward by one byte, and if they
are not both ASCII then you take a slow path where you normalize one
grapheme cluster at each cursor (if necessary) and compare that. (ZFS
does this.)
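
A simplified sketch of that fast-path/slow-path comparison, in Python. It is not the ZFS code: it walks Python code points rather than bytes, and its slow path normalizes the whole remainder of both strings instead of one grapheme cluster at a time. The function name nfc_equal is hypothetical.

```python
import unicodedata

def nfc_equal(a: str, b: str) -> bool:
    """Form-insensitive equality that only normalizes when it has to."""
    i = 0
    n = min(len(a), len(b))
    # Fast path: advance while both strings hold the same ASCII character.
    while i < n and a[i] == b[i] and ord(a[i]) < 0x80:
        i += 1
    if i == len(a) and i == len(b):
        return True  # identical, pure-ASCII strings
    # Slow path: back up one character so that a base letter stays together
    # with any combining marks that follow it, then compare the normalized
    # remainders.
    start = max(i - 1, 0)
    return (unicodedata.normalize("NFC", a[start:])
            == unicodedata.normalize("NFC", b[start:]))

print(nfc_equal("caf\u00e9", "cafe\u0301"))  # equivalent forms
print(nfc_equal("abc", "abd"))               # differ in plain ASCII
print(nfc_equal("a", "a\u0301"))             # extra accent on one side
```

Skipping the shared ASCII prefix is safe here because ASCII characters never reorder with, or compose onto, the characters around them during normalization.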

You can also assume ASCII-mostly, load as many bits of each string
(padding as needed) as will fit in SIMD registers, compare and check
that they're all ASCII, and if not then jump to the slow path.

You can also normalize one grapheme cluster at a time when hashing
(e.g., for hash indices), thus avoiding a large allocation if the string
is large.

Nico
--

#11Robert Haas
robertmhaas@gmail.com
In reply to: Jeff Davis (#6)
Re: Pre-proposal: unicode normalized text

On Tue, Oct 3, 2023 at 3:54 PM Jeff Davis <pgsql@j-davis.com> wrote:

I assume you mean because we reject invalid byte sequences? Yeah, I'm
sure that causes a problem for some (especially migrations), but it's
difficult for me to imagine a database working well with no rules at
all for the basic data types.

There's a very popular commercial database where, or so I have been
led to believe, any byte sequence at all is accepted when you try to
put values into the database. The rumors I've heard -- I have not
played with it myself -- are that when you try to do anything, byte
sequences that are not valid in the configured encoding are treated as
single-byte characters or something of that sort. So like if you had
UTF-8 as the encoding and the first byte of the string is something
that can only appear as a continuation byte in UTF-8, I think that
byte is just treated as a separate character. I don't quite know how
you make all of the operations work that way, but it seems like
they've come up with a somewhat-consistent set of principles that are
applied across the board. Very different from the PG philosophy, of
course. And I'm not saying it's better. But it does eliminate the
problem of being unable to load data into the database, because in
such a model there's no such thing as invalidly-encoded data. Instead,
an encoding like UTF-8 is effectively extended so that every byte
sequence represents *something*. Whether that something is what you
wanted is another story.
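
For a feel of what "every byte sequence represents something" can mean, Python's "surrogateescape" error handler is a loosely comparable mechanism. This is an analogy only and makes no claim about how the database mentioned above behaves:

```python
raw = b"\x80abc"   # 0x80 can only appear as a continuation byte in UTF-8

# Strict decoding rejects the input, much like loading it into text would.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False
print(strict_ok)

# With surrogateescape, the stray byte becomes its own placeholder
# "character", and the original bytes can be recovered on the way out.
text = raw.decode("utf-8", errors="surrogateescape")
print(len(text))
print(text.encode("utf-8", errors="surrogateescape") == raw)
```

Nothing fails at load time, but the placeholder is not a real character, so every later operation has to decide what to do with it.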

At any rate, if we were to go in the direction of rejecting code
points that aren't yet assigned, or aren't yet known to the collation
library, that's another way for data loading to fail. Which feels like
very defensible behavior, but not what everyone wants, or is used to.

At minimum I think we need to have some internal functions to check for
unassigned code points. That belongs in core, because we generate the
unicode tables from a specific version.

That's a good idea.

I also think we should expose some SQL functions to check for
unassigned code points. That sounds useful, especially since we already
expose normalization functions.

That's a good idea, too.

One could easily imagine a domain with CHECK(NOT
contains_unassigned(a)). Or an extension with a data type that uses the
internal functions.

Yeah.

Whether we ever get to a core data type -- and more importantly,
whether anyone uses it -- I'm not sure.

Same here.

Yeah, I am looking for a better compromise between:

* everything is memcmp() and 'á' sometimes doesn't equal 'á'
(depending on code point sequence)
* everything is constantly changing, indexes break, and text
comparisons are slow

A stable idea of unicode normalization based on using only assigned
code points is very tempting.

The fact that there are multiple types of normalization and multiple
notions of equality doesn't make this easier.

--
Robert Haas
EDB: http://www.enterprisedb.com

#12Nico Williams
nico@cryptonector.com
In reply to: Jeff Davis (#1)
Re: Pre-proposal: unicode normalized text

On Tue, Sep 12, 2023 at 03:47:10PM -0700, Jeff Davis wrote:

The idea is to have a new data type, say "UTEXT", that normalizes the
input so that it can have an improved notion of equality while still
using memcmp().

A UTEXT type would be helpful for specifying that the text must be
Unicode (in which transform?) even if the character data encoding for
the database is not UTF-8.

Maybe UTF8 might be a better name for the new type, since it would
denote the transform (and would allow for UTF16 and UTF32 some day,
though it's doubtful those would ever happen).

But it's one thing to specify Unicode (and transform) in the type and
another to specify an NF to normalize to on insert or on lookup.

How about new column constraint keywords, such as NORMALIZE (meaning
normalize on insert) and NORMALIZED (meaning reject non-canonical form
text), with an optional parenthetical by which to specify a non-default
form? (These would apply to TEXT as well when the default encoding for
the DB is UTF-8.)
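
The difference between the two keywords can be sketched in Python as follows. The function names are hypothetical stand-ins for the proposed constraints, and unicodedata.is_normalized requires Python 3.8 or later:

```python
import unicodedata

def normalize_on_insert(value: str, form: str = "NFC") -> str:
    """NORMALIZE-style behavior: silently convert the input to the form."""
    return unicodedata.normalize(form, value)

def check_normalized(value: str, form: str = "NFC") -> str:
    """NORMALIZED-style behavior: reject input that is not already in the form."""
    if not unicodedata.is_normalized(form, value):
        raise ValueError(f"value is not in {form}")
    return value

print(normalize_on_insert("a\u0301") == "\u00e1")   # converted on the way in
print(check_normalized("\u00e1") == "\u00e1")       # already NFC, accepted
try:
    check_normalized("a\u0301")                     # decomposed, rejected
except ValueError as exc:
    print(exc)
```

The first variant never fails but may change the stored bytes; the second never changes the bytes but may fail.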

One could then ALTER TABLE to add this to existing tables.

This would also make it easier to add a form-preserving/form-insensitive
mode later if it turns out to be useful or necessary, maybe making it
the default for Unicode text in new tables.

Questions:

* Would this be useful enough to justify a new data type? Would it be
confusing about when to choose one versus the other?

Yes. See above. I think I'd rather have it be called UTF8, and the
normalization properties of it to be specified as column constraints.

* Would cross-type comparisons between TEXT and UTEXT become a major
problem that would reduce the utility?

Maybe when the database's encoding is UTF-8 then UTEXT (or UTF8) can be
an alias of TEXT.

* Should "some_utext_value = some_text_value" coerce the LHS to TEXT
or the RHS to UTEXT?

Ooh, this is nice! If the TEXT is _not_ UTF-8 then it could be
converted to UTF-8. So I think which is RHS and which is LHS doesn't
matter -- it's which is UTF-8, and if both are then the only thing left
to do is normalize, and for that I'd take the LHS' form if the LHS is
UTF-8, else the RHS'.

Nico
--

#13Robert Haas
robertmhaas@gmail.com
In reply to: Nico Williams (#12)
Re: Pre-proposal: unicode normalized text

On Wed, Oct 4, 2023 at 1:27 PM Nico Williams <nico@cryptonector.com> wrote:

A UTEXT type would be helpful for specifying that the text must be
Unicode (in which transform?) even if the character data encoding for
the database is not UTF-8.

That's actually pretty thorny ... because right now client_encoding
specifies the encoding to be used for all data sent to the client. So
would we convert the data from UTF8 to the selected client encoding?
Or what?

--
Robert Haas
EDB: http://www.enterprisedb.com

#14Chapman Flack
chap@anastigmatix.net
In reply to: Robert Haas (#13)
Re: Pre-proposal: unicode normalized text

On 2023-10-04 13:47, Robert Haas wrote:

On Wed, Oct 4, 2023 at 1:27 PM Nico Williams <nico@cryptonector.com>
wrote:

A UTEXT type would be helpful for specifying that the text must be
Unicode (in which transform?) even if the character data encoding for
the database is not UTF-8.

That's actually pretty thorny ... because right now client_encoding
specifies the encoding to be used for all data sent to the client. So
would we convert the data from UTF8 to the selected client encoding?

The SQL standard would have me able to:

CREATE TABLE foo (
a CHARACTER VARYING CHARACTER SET UTF8,
b CHARACTER VARYING CHARACTER SET LATIN1
)

and so on, and write character literals like

_UTF8'Hello, world!' and _LATIN1'Hello, world!'

and have those columns and data types independently contain what
they can contain, without constraints imposed by one overall
database encoding.

Obviously, we're far from being able to do that. But should it
become desirable to get closer, would it be worthwhile to also
try to follow how the standard would have it look?

Clearly, part of the job would involve making the wire protocol
able to transmit binary values and identify their encodings.

Regards,
-Chap

#15Robert Haas
robertmhaas@gmail.com
In reply to: Chapman Flack (#14)
Re: Pre-proposal: unicode normalized text

On Wed, Oct 4, 2023 at 2:02 PM Chapman Flack <chap@anastigmatix.net> wrote:

Clearly, part of the job would involve making the wire protocol
able to transmit binary values and identify their encodings.

Right. Which unfortunately is moving the goal posts into the
stratosphere compared to any other work mentioned so far. I agree it
would be great. But not if you want concrete progress any time soon.

--
Robert Haas
EDB: http://www.enterprisedb.com

#16Isaac Morland
isaac.morland@gmail.com
In reply to: Chapman Flack (#14)
Re: Pre-proposal: unicode normalized text

On Wed, 4 Oct 2023 at 14:05, Chapman Flack <chap@anastigmatix.net> wrote:

On 2023-10-04 13:47, Robert Haas wrote:

The SQL standard would have me able to:

CREATE TABLE foo (
a CHARACTER VARYING CHARACTER SET UTF8,
b CHARACTER VARYING CHARACTER SET LATIN1
)

and so on, and write character literals like

_UTF8'Hello, world!' and _LATIN1'Hello, world!'

and have those columns and data types independently contain what
they can contain, without constraints imposed by one overall
database encoding.

Obviously, we're far from being able to do that. But should it
become desirable to get closer, would it be worthwhile to also
try to follow how the standard would have it look?

Clearly, part of the job would involve making the wire protocol
able to transmit binary values and identify their encodings.

I would go in the other direction (note: I’m ignoring all backward
compatibility considerations related to the current design of Postgres).

Always store only UTF-8 in the database, and send only UTF-8 on the wire
protocol. If we still want to have a concept of "client encoding", have the
client libpq take care of translating the bytes between the bytes used by
the caller and the bytes sent on the wire.

Note that you could still define columns as you say, but the character set
specification would effectively act simply as a CHECK constraint on the
characters allowed, essentially CHECK (column_name ~ '^[...all characters
in encoding...]*$'). We don't allow different on-disk representations of
dates or other data types, except when we really need to, and then we have
multiple data types (e.g. int vs. float) rather than different ways of
storing the same datatype.
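
A sketch of the "character set as a CHECK constraint" idea, in Python. The helper name fits_charset is hypothetical, and Python's "latin-1" codec (ISO 8859-1) is assumed to correspond to LATIN1:

```python
def fits_charset(value: str, charset: str) -> bool:
    """True if every character of value can be represented in charset.

    The data itself stays Unicode; the character set only restricts
    which characters are allowed.
    """
    try:
        value.encode(charset)
        return True
    except UnicodeEncodeError:
        return False

print(fits_charset("caf\u00e9", "latin-1"))   # e-acute exists in Latin-1
print(fits_charset("\u20ac", "latin-1"))      # the euro sign does not
print(fits_charset("\u20ac", "utf-8"))        # UTF-8 can encode it
```

Under this model a LATIN1 column would store the same bytes as any other column and simply refuse values for which the check is false.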

What about characters not in UTF-8? If a character is important enough for
us to worry about in Postgres, it’s important enough to get a U+ number
from the Unicode Consortium, which automatically puts it in UTF-8. In the
modern context, "plain text" means "UTF-8 encoded text", as far as I'm
concerned.

#17Jeff Davis
pgsql@j-davis.com
In reply to: Robert Haas (#11)
Re: Pre-proposal: unicode normalized text

On Wed, 2023-10-04 at 13:16 -0400, Robert Haas wrote:

any byte sequence at all is accepted when you try to
put values into the database.

We support SQL_ASCII, which allows something similar.

At any rate, if we were to go in the direction of rejecting code
points that aren't yet assigned, or aren't yet known to the collation
library, that's another way for data loading to fail.

A failure during data loading is either a feature or a bug, depending
on whether you are the one loading the data or the one trying to make
sense of it later ;-)

Which feels like
very defensible behavior, but not what everyone wants, or is used to.

Yeah, there are many reasons someone might want to accept unassigned
code points. An obvious one is if their application is on a newer
version of unicode where the codepoint *is* assigned.

The fact that there are multiple types of normalization and multiple
notions of equality doesn't make this easier.

NFC is really the only one that makes sense.

NFD is semantically the same as NFC, but expanded into a larger
representation. NFKC/NFKD are based on a more relaxed notion of
equality -- kind of like non-deterministic collations. These other
forms might make sense in certain cases, but not general use.
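
To make the difference between the forms concrete, here is a small sketch using Python's unicodedata module (not anything in Postgres). It assumes only the standard library; the sample strings are chosen for illustration.

```python
import unicodedata

decomposed = "a\u0301"  # 'a' + U+0301 COMBINING ACUTE ACCENT
composed = "\u00e1"     # U+00E1 LATIN SMALL LETTER A WITH ACUTE

# The raw strings differ, but they agree once normalized to the same form.
assert decomposed != composed
assert unicodedata.normalize("NFC", decomposed) == composed
assert unicodedata.normalize("NFD", composed) == decomposed

# NFD is the same text in a larger representation.
nfc_size = len(unicodedata.normalize("NFC", composed).encode("utf-8"))
nfd_size = len(unicodedata.normalize("NFD", composed).encode("utf-8"))
print(nfc_size, nfd_size)  # 2 3

# NFKC uses a looser equivalence: the "fi" ligature U+FB01 folds to "fi",
# whereas NFC leaves it alone.
ligature = "\ufb01"
assert unicodedata.normalize("NFC", ligature) == ligature
assert unicodedata.normalize("NFKC", ligature) == "fi"
```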

I believe that having a kind of text data type where it's stored in NFC
and compared with memcmp() would be a good place for many users to be --
probably most users. It's got all the performance and stability
benefits of memcmp(), with slightly richer semantics. It's less likely
that someone malicious can confuse the database by using different
representations of the same character.

The problem is that it's not universally better for everyone: there are
certainly users who would prefer that the codepoints they send to the
database are preserved exactly, and also users who would like to be
able to use unassigned code points.
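
A rough sketch of what the input step of such a type might do, in Python for illustration only. The function name utext_in is hypothetical, and "unassigned" is approximated by general category Cn in whatever Unicode version the running Python ships with; that category also covers noncharacters, which a real implementation might treat separately.

```python
import unicodedata


def utext_in(value: str) -> str:
    """Reject unassigned code points, then return the NFC form.

    Assumption: general category "Cn" stands in for "unassigned"
    according to this interpreter's Unicode tables.
    """
    for ch in value:
        if unicodedata.category(ch) == "Cn":
            raise ValueError(f"unassigned code point U+{ord(ch):04X}")
    return unicodedata.normalize("NFC", value)


# Two representations of the same character become byte-identical,
# so a memcmp()-style comparison of the stored bytes treats them as equal.
a = utext_in("a\u0301").encode("utf-8")
b = utext_in("\u00e1").encode("utf-8")
assert a == b

# U+0378 is unassigned, so loading it fails instead of being stored.
try:
    utext_in("\u0378")
except ValueError as err:
    print(err)
```

Users who need exact preservation of the input, or who need unassigned code points, would keep using a type without these checks.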

Regards,
Jeff Davis

#18Jeff Davis
pgsql@j-davis.com
In reply to: Chapman Flack (#14)
Re: Pre-proposal: unicode normalized text

On Wed, 2023-10-04 at 14:02 -0400, Chapman Flack wrote:

The SQL standard would have me able to:

CREATE TABLE foo (
   a CHARACTER VARYING CHARACTER SET UTF8,
   b CHARACTER VARYING CHARACTER SET LATIN1
)

and so on, and write character literals like

_UTF8'Hello, world!' and _LATIN1'Hello, world!'

Is there a use case for that? UTF-8 is able to encode any unicode code
point, it's relatively compact, and it's backwards-compatible with 7-
bit ASCII. If you have a variety of text data in your system (and in
many cases even if not), then UTF-8 seems like the right solution.

Text data encoded 17 different ways requires a lot of bookkeeping in
the type system, and it also requires injecting a bunch of fallible
transcoding operators around just to compare strings.
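
As an illustration of why those transcoding steps are needed and why they can fail, here is a short Python sketch. The helpers same_text and transcode are hypothetical names, and Python codecs stand in for server-side conversions.

```python
def same_text(a: bytes, a_encoding: str, b: bytes, b_encoding: str) -> bool:
    """Compare two encoded strings by decoding both to Unicode first.

    Raises UnicodeDecodeError if either input is invalid in its encoding.
    """
    return a.decode(a_encoding) == b.decode(b_encoding)


def transcode(data: bytes, source: str, target: str) -> bytes:
    """Convert between encodings; fails if target cannot represent the text."""
    return data.decode(source).encode(target)


latin1_bytes = "caf\u00e9".encode("latin-1")  # b'caf\xe9'
utf8_bytes = "caf\u00e9".encode("utf-8")      # b'caf\xc3\xa9'

# The same text has different bytes, so a plain byte comparison is wrong.
assert latin1_bytes != utf8_bytes
assert same_text(latin1_bytes, "latin-1", utf8_bytes, "utf-8")

# Transcoding is fallible: the euro sign has no LATIN1 representation.
try:
    transcode("\u20ac".encode("utf-8"), "utf-8", "latin-1")
except UnicodeEncodeError as err:
    print("cannot transcode:", err.reason)
```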

Regards,
Jeff Davis

#19Nico Williams
nico@cryptonector.com
In reply to: Jeff Davis (#18)
Re: Pre-proposal: unicode normalized text

On Wed, Oct 04, 2023 at 01:38:15PM -0700, Jeff Davis wrote:

On Wed, 2023-10-04 at 14:02 -0400, Chapman Flack wrote:

The SQL standard would have me able to:

[...]
_UTF8'Hello, world!' and _LATIN1'Hello, world!'

Is there a use case for that? UTF-8 is able to encode any unicode code
point, it's relatively compact, and it's backwards-compatible with 7-
bit ASCII. If you have a variety of text data in your system (and in
many cases even if not), then UTF-8 seems like the right solution.

Text data encoded 17 different ways requires a lot of bookkeeping in
the type system, and it also requires injecting a bunch of fallible
transcoding operators around just to compare strings.

Better that than TEXT blobs w/ the encoding given by the `CREATE
DATABASE` or `initdb` default!

It'd be a lot _less_ fragile to have all text tagged with an encoding
(indirectly, via its type which then denotes the encoding).

That would be a lot of work, but starting with just a UTF-8 text type
would be an improvement.

Nico
--

#20Chapman Flack
chap@anastigmatix.net
In reply to: Jeff Davis (#18)
Re: Pre-proposal: unicode normalized text

On 2023-10-04 16:38, Jeff Davis wrote:

On Wed, 2023-10-04 at 14:02 -0400, Chapman Flack wrote:

The SQL standard would have me able to:

CREATE TABLE foo (
   a CHARACTER VARYING CHARACTER SET UTF8,
   b CHARACTER VARYING CHARACTER SET LATIN1
)

and so on

Is there a use case for that? UTF-8 is able to encode any unicode code
point, it's relatively compact, and it's backwards-compatible with 7-
bit ASCII. If you have a variety of text data in your system (and in
many cases even if not), then UTF-8 seems like the right solution.

Well, for what reason does anybody run PG now with the encoding set
to anything besides UTF-8? I don't really have my finger on that pulse.
Could it be that it bloats common strings in their local script, and
with enough of those to store, it could matter to use the local
encoding that stores them more economically?

Also, while any Unicode transfer format can encode any Unicode code
point, I'm unsure whether it's yet the case that {any Unicode code
point} is a superset of every character repertoire associated with
every non-Unicode encoding.

The cheap glaring counterexample is SQL_ASCII. Half those code points
are *nobody knows what Unicode character* (or even *whether*). I'm not
insisting that's a good thing, but it is a thing.

It might be a very tidy future to say all text is Unicode and all
server encodings are UTF-8, but I'm not sure it wouldn't still
be a good step on the way to be able to store some things in
their own encodings. We have JSON and XML now, two data types
that are *formally defined* to accept any Unicode content, and
we hedge and mumble and say (well, as long as it goes in the
server encoding) and that makes me sad. Things like that should
be easy to handle even without declaring UTF-8 as a server-wide
encoding ... they already are their own distinct data types, and
could conceivably know their own encodings.

But there again, it's possible that going with unconditional
UTF-8 for JSON or XML documents could, in some regions, bloat them.

Regards,
-Chap

#21Jeff Davis
pgsql@j-davis.com
In reply to: Isaac Morland (#16)
Re: Pre-proposal: unicode normalized text

On Wed, 2023-10-04 at 14:14 -0400, Isaac Morland wrote:

Always store only UTF-8 in the database

What problem does that solve? I don't see our encoding support as a big
source of problems, given that database-wide UTF-8 already works fine.
In fact, some postgres features only work with UTF-8.

I agree that we shouldn't add a bunch of bookkeeping and type system
support for per-column encodings without a clear use case, because that
would have a cost. But right now it's just a database-wide thing.

I don't see encodings as a major area to solve problems or innovate. At
the end of the day, encodings have little semantic significance, and
therefore limited upside and limited downside. Collations and
normalization get more interesting, but those are happening at a higher
layer than the encoding.

What about characters not in UTF-8?

Honestly I'm not clear on this topic. Are the "private use" areas in
unicode enough to cover use cases for characters not recognized by
unicode? Which encodings in postgres can represent characters that
can't be automatically transcoded (without failure) to unicode?

Obviously if we have some kind of unicode-based type, it would only
work with encodings that are a subset of unicode.

Regards,
Jeff Davis

#22Nico Williams
nico@cryptonector.com
In reply to: Chapman Flack (#20)
Re: Pre-proposal: unicode normalized text

On Wed, Oct 04, 2023 at 05:32:50PM -0400, Chapman Flack wrote:

Well, for what reason does anybody run PG now with the encoding set
to anything besides UTF-8? I don't really have my finger on that pulse.

Because they still have databases that didn't use UTF-8 10 or 20 years
ago that they haven't migrated to UTF-8?

It's harder to think of why one might _want_ to store text in any
encoding other than UTF-8 for _new_ databases.

Though there's also no reason that it should be impossible, other than
lack of developer interest: as long as text is tagged with its encoding,
it should be possible to store text in any number of encodings.

Could it be that it bloats common strings in their local script, and
with enough of those to store, it could matter to use the local
encoding that stores them more economically?

UTF-8 bloat is not likely worth the trouble. UTF-8 is only clearly
bloaty when compared to encodings with 1-byte code units, like
ISO-8859-*. For CJK, UTF-8 is not much more bloaty than native
non-Unicode encodings like SHIFT_JIS.

UTF-8 is not much bloatier than UTF-16 in general either.

Bloat is not really a good reason to avoid Unicode or any specific TF.
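
For a sense of scale, a small Python sketch that measures encoded sizes for two short sample strings. The samples are arbitrary, and real data that mixes in ASCII (markup, digits, punctuation) would shift the ratios.

```python
def encoded_sizes(text: str, encodings: list[str]) -> dict[str, int]:
    """Return the number of bytes text occupies in each encoding."""
    return {enc: len(text.encode(enc)) for enc in encodings}


# French "ete" with acute accents: 1-byte code units vs. UTF-8 vs. UTF-16.
french = encoded_sizes("\u00e9t\u00e9", ["latin-1", "utf-8", "utf-16-le"])
print(french)    # {'latin-1': 3, 'utf-8': 5, 'utf-16-le': 6}

# Three kanji: Shift_JIS vs. UTF-8 vs. UTF-16.
japanese = encoded_sizes("\u65e5\u672c\u8a9e", ["shift_jis", "utf-8", "utf-16-le"])
print(japanese)  # {'shift_jis': 6, 'utf-8': 9, 'utf-16-le': 6}
```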

Also, while any Unicode transfer format can encode any Unicode code
point, I'm unsure whether it's yet the case that {any Unicode code
point} is a superset of every character repertoire associated with
every non-Unicode encoding.

It's not always been the case that Unicode is a strict superset of all
currently-in-use human scripts. Making Unicode a strict superset of all
currently-in-use human scripts seems to be the Unicode Consortium's aim.

I think you're asking why not just use UTF-8 for everything, all the
time. It's a fair question. I don't have a reason to answer in the
negative (maybe someone else does). But that doesn't mean that one
couldn't want to store text in many encodings (e.g., for historical
reasons).

Nico
--

#23Jeff Davis
pgsql@j-davis.com
In reply to: Nico Williams (#19)
Re: Pre-proposal: unicode normalized text

On Wed, 2023-10-04 at 16:15 -0500, Nico Williams wrote:

Better that than TEXT blobs w/ the encoding given by the `CREATE
DATABASE` or `initdb` default!

From an engineering perspective, yes, per-column encodings would be
more flexible. But I still don't understand who exactly would use that,
and why.

It would take an awful lot of effort to implement and make the code
more complex, so we'd really need to see some serious demand for that.

Regards,
Jeff Davis

#24Nico Williams
nico@cryptonector.com
In reply to: Jeff Davis (#23)
Re: Pre-proposal: unicode normalized text

On Wed, Oct 04, 2023 at 04:01:26PM -0700, Jeff Davis wrote:

On Wed, 2023-10-04 at 16:15 -0500, Nico Williams wrote:

Better that than TEXT blobs w/ the encoding given by the `CREATE
DATABASE` or `initdb` default!

From an engineering perspective, yes, per-column encodings would be
more flexible. But I still don't understand who exactly would use that,
and why.

Say you have a bunch of text files in different encodings for reasons
(historical). And now say you want to store them in a database so you
can index them and search them. Sure, you could use a filesystem, but
you want an RDBMS. Well, the answer to this is "convert all those files
to UTF-8".

It would take an awful lot of effort to implement and make the code
more complex, so we'd really need to see some serious demand for that.

Yes, it's better to just use UTF-8.

The DB could implement conversions to/from other codesets and encodings
for clients that insist on it. Why would clients insist anyways?
Better to do the conversions at the clients.

In the middle it's best to just have Unicode, and specifically UTF-8,
then push all conversions to the edges of the system.

Nico
--

#25Isaac Morland
isaac.morland@gmail.com
In reply to: Jeff Davis (#21)
Re: Pre-proposal: unicode normalized text

On Wed, 4 Oct 2023 at 17:37, Jeff Davis <pgsql@j-davis.com> wrote:

On Wed, 2023-10-04 at 14:14 -0400, Isaac Morland wrote:

Always store only UTF-8 in the database

What problem does that solve? I don't see our encoding support as a big
source of problems, given that database-wide UTF-8 already works fine.
In fact, some postgres features only work with UTF-8.

My idea is in the context of a suggestion that we support specifying the
encoding per column. I don't mean to suggest eliminating the ability to set
a server-wide encoding, although I doubt there is any use case for using
anything other than UTF-8 except for an old database that hasn’t been
converted yet.

I see no reason to write different strings using different encodings in the
data files, depending on what column they belong to. The various text types
are all abstract data types which store sequences of characters (not
bytes); if one wants bytes, then one has to encode them. Of course, if one
wants UTF-8 bytes, then the encoding is, under the covers, the identity
function, but conceptually it is still taking the characters stored in the
database and converting them to bytes according to a specific encoding.

By contrast, although I don’t see it as a top-priority use case, I can
imagine somebody wanting to restrict the characters stored in a particular
column to characters that can be encoded in a particular encoding. That is
what "CHARACTER SET LATIN1" and so on should mean.

What about characters not in UTF-8?

Honestly I'm not clear on this topic. Are the "private use" areas in
unicode enough to cover use cases for characters not recognized by
unicode? Which encodings in postgres can represent characters that
can't be automatically transcoded (without failure) to unicode?

Here I’m just anticipating a hypothetical objection, “what about characters
that can’t be represented in UTF-8?” to my suggestion to always use UTF-8
and I’m saying we shouldn’t care about them. I believe the answers to your
questions in this paragraph are “yes”, and “none”.

#26Robert Haas
robertmhaas@gmail.com
In reply to: Isaac Morland (#25)
Re: Pre-proposal: unicode normalized text

On Wed, Oct 4, 2023 at 9:02 PM Isaac Morland <isaac.morland@gmail.com> wrote:

What about characters not in UTF-8?

Honestly I'm not clear on this topic. Are the "private use" areas in
unicode enough to cover use cases for characters not recognized by
unicode? Which encodings in postgres can represent characters that
can't be automatically transcoded (without failure) to unicode?

Here I’m just anticipating a hypothetical objection, “what about characters that can’t be represented in UTF-8?” to my suggestion to always use UTF-8 and I’m saying we shouldn’t care about them. I believe the answers to your questions in this paragraph are “yes”, and “none”.

Years ago, I remember SJIS being cited as an example of an encoding
that had characters which weren't part of Unicode. I don't know
whether this is still a live issue.

But I do think that sometimes users are reluctant to perform encoding
conversions on the data that they have. Sometimes they're not
completely certain what encoding their data is in, and sometimes
they're worried that the encoding conversion might fail or produce
wrong answers. In theory, if your existing data is validly encoded and
you know what encoding it's in and it's easily mapped onto UTF-8,
there's no problem. You can just transcode it and be done. But a lot
of times the reality is a lot messier than that.

Which gives me some sympathy with the idea of wanting multiple
character sets within a database. Such a feature exists in some other
database systems and is, presumably, useful to some people. On the
other hand, to do that in PostgreSQL, we'd need to propagate the
character set/encoding information into all of the places that
currently get the typmod and collation, and that is not a small number
of places. It's a lot of infrastructure for the project to carry
around for a feature that's probably only going to continue to become
less relevant.

I suppose you never know, though. Maybe the Unicode consortium will
explode in a tornado of fiery rage and there will be dueling standards
making war over the proper way of representing an emoji of a dog
eating broccoli for decades to come. In that case, our hypothetical
multi-character-set feature might seem prescient.

--
Robert Haas
EDB: http://www.enterprisedb.com

#27Isaac Morland
isaac.morland@gmail.com
In reply to: Robert Haas (#26)
Re: Pre-proposal: unicode normalized text

On Thu, 5 Oct 2023 at 07:32, Robert Haas <robertmhaas@gmail.com> wrote:

But I do think that sometimes users are reluctant to perform encoding
conversions on the data that they have. Sometimes they're not
completely certain what encoding their data is in, and sometimes
they're worried that the encoding conversion might fail or produce
wrong answers. In theory, if your existing data is validly encoded and
you know what encoding it's in and it's easily mapped onto UTF-8,
there's no problem. You can just transcode it and be done. But a lot
of times the reality is a lot messier than that.

In the case you describe, the users don’t have text at all; they have
bytes, and a vague belief about what encoding the bytes might be in and
therefore what characters they are intended to represent. The correct way
to store that in the database is using bytea. Text types should be for when
you know what characters you want to store. In this scenario, the
implementation detail of what encoding the database uses internally to
write the data on the disk doesn't matter, any more than it matters to a
casual user how a table is stored on disk.
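
A short Python sketch of the "bytes plus a vague belief" situation: the same byte string yields different characters, or no characters at all, depending on which encoding is assumed. The sample bytes are made up for illustration.

```python
raw = b"\x93quoted\x94"  # bytes of uncertain origin

# Read as Windows-1252, the outer bytes are curly quotation marks.
as_cp1252 = raw.decode("cp1252")
assert as_cp1252 == "\u201cquoted\u201d"

# Read as LATIN1, the same bytes are C1 control characters.
as_latin1 = raw.decode("latin-1")
assert as_latin1 == "\u0093quoted\u0094"

# Read as UTF-8, the bytes are not valid text at all.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
print(valid_utf8)  # False
```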

Similarly, I don't believe we have a "YMD" data type which stores year,
month, and day, without being specific as to whether it's Gregorian or
Julian; if you have that situation, make a 3-tuple type or do something
else. "Date" is for when you actually know what day you want to record.

#28Jeff Davis
pgsql@j-davis.com
In reply to: Robert Haas (#26)
Re: Pre-proposal: unicode normalized text

On Thu, 2023-10-05 at 07:31 -0400, Robert Haas wrote:

It's a lot of infrastructure for the project to carry
around for a feature that's probably only going to continue to become
less relevant.

Agreed, at least until we understand the set of users per-column
encoding is important to. I acknowledge that the presence of per-column
encoding in the standard is some kind of signal there, but not enough
by itself to justify something so invasive.

I suppose you never know, though.

On balance I think it's better to keep the code clean enough that we
can adapt to whatever unanticipated things happen in the future; rather
than to make the code very complicated trying to anticipate everything,
and then being completely unable to adapt it when something
unanticipated happens anyway.

Regards,
Jeff Davis

#29Nico Williams
nico@cryptonector.com
In reply to: Robert Haas (#26)
Re: Pre-proposal: unicode normalized text

On Thu, Oct 05, 2023 at 07:31:54AM -0400, Robert Haas wrote:

[...] On the other hand, to do that in PostgreSQL, we'd need to
propagate the character set/encoding information into all of the
places that currently get the typmod and collation, and that is not a
small number of places. It's a lot of infrastructure for the project
to carry around for a feature that's probably only going to continue
to become less relevant.

Text+encoding can be just like bytea with a one- or two-byte prefix
indicating what codeset+encoding it's in. That'd be how to encode
such text values on the wire, though on disk the column's type should
indicate the codeset+encoding, so no need to add a prefix to the value.
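
A toy sketch of the prefix idea in Python. The tag numbers and function names are invented for illustration and do not correspond to anything in the Postgres wire protocol.

```python
# Hypothetical one-byte tags; not real protocol values.
ENCODING_TAGS = {1: "utf-8", 2: "latin-1", 3: "shift_jis"}
TAG_FOR = {name: tag for tag, name in ENCODING_TAGS.items()}


def pack_text(text: str, encoding: str) -> bytes:
    """Encode text and prepend a one-byte tag naming its encoding."""
    return bytes([TAG_FOR[encoding]]) + text.encode(encoding)


def unpack_text(wire: bytes) -> tuple[str, str]:
    """Read the tag, then decode the remaining bytes accordingly."""
    encoding = ENCODING_TAGS[wire[0]]
    return wire[1:].decode(encoding), encoding


wire = pack_text("caf\u00e9", "latin-1")
print(wire)               # b'\x02caf\xe9'
print(unpack_text(wire))  # ('café', 'latin-1')
```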

Complexity would creep in around when and whether to perform automatic
conversions. The easy answer would be "never, on the server side", but
on the client side it might be useful to convert to/from the locale's
codeset+encoding when displaying to the user or accepting user input.

If there's no automatic server-side codeset/encoding conversions then
the server-side cost of supporting non-UTF-8 text should not be too high
dev-wise -- it's just (famous last words) a generic text type
parameterized by codeset+encoding type. There would not even be a hard
need for functions for conversions, though there would be demand for
them.

But I agree that if there's no need, there's no need. UTF-8 is great,
and if only all PG users would just switch then there's not much more to
do.

Nico
--

#30Jeff Davis
pgsql@j-davis.com
In reply to: Isaac Morland (#27)
Re: Pre-proposal: unicode normalized text

On Thu, 2023-10-05 at 09:10 -0400, Isaac Morland wrote:

In the case you describe, the users don’t have text at all; they have
bytes, and a vague belief about what encoding the bytes might be in
and therefore what characters they are intended to represent. The
correct way to store that in the database is using bytea.

I wouldn't be so absolute. It's text data to the user, and is
presumably working fine for them now, and if they switched to bytea
today then 'foo' would show up as '\x666f6f' in psql.
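
That rendering is just a backslash-x prefix followed by the hex digits of each byte. A tiny Python sketch that mimics it (the function name bytea_hex_output is hypothetical):

```python
def bytea_hex_output(data: bytes) -> str:
    """Mimic the hex text form of a bytea value: \\x then two digits per byte."""
    return "\\x" + data.hex()


print(bytea_hex_output(b"foo"))  # \x666f6f
```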

The point is that this is a somewhat messy problem because there's so
much software out there that treats byte strings and textual data
interchangeably. Rust goes the extra mile to organize all of this, and
it ends up with:

* String -- always UTF-8, never NUL-terminated
* CString -- NUL-terminated byte sequence with no internal NULs
* OsString[3] -- needed to make a Path[4], which is needed to open a
file[5]
* Vec<u8> -- any byte sequence

and I suppose we could work towards offering better support for these
different types, the casts between them, and delivering them in a form
the client can understand. But I wouldn't describe it as a solved
problem with one "correct" solution.

One takeaway from this discussion is that it would be useful to provide
more flexibility in how values are represented to the client in a more
general way. In addition to encoding, representational issues have come
up with binary formats, bytea, extra_float_digits, etc.

The collection of books by CJ Date & Hugh Darwen, et al. (sorry I don't
remember exactly which books), made the theoretical case for explicitly
distinguishing values from representations at the language level. We're
starting to see that representational issues can't be satisfied with a
few special cases and hacks -- it's worth thinking about a general
solution to that problem. There was also a lot of relevant discussion
about how to think about overlapping domains (e.g. ASCII is valid in
any of these text domains).

Text types should be for when you know what characters you want to
store. In this scenario, the implementation detail of what encoding
the database uses internally to write the data on the disk doesn't
matter, any more than it matters to a casual user how a table is
stored on disk.

Perhaps the user and application do know, and there's some kind of
subtlety that we're missing, or some historical artefact that we're not
accounting for, and that somehow makes UTF-8 unsuitable. Surely there
are applications that treat certain byte sequences in non-standard
ways, and perhaps not all of those byte sequences can be reproduced by
transcoding from UTF-8 to the client_encoding. In any case, I would
want to understand in detail why a user thinks UTF8 is not good enough
before I make too strong of a statement here.

Even the terminal font that I use renders some "identical" unicode
characters slightly differently depending on the code points from which
they are composed. I believe that's an intentional convenience to make
it more apparent why the "diff" command (or other byte-based tool) is
showing a difference between two textually identical strings, but it's
also a violation of unicode. (This is another reason why normalization
might not be for everyone, but I believe it's still good in typical
cases.)

Regards,
Jeff Davis

#31Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nico Williams (#29)
Re: Pre-proposal: unicode normalized text

Nico Williams <nico@cryptonector.com> writes:

Text+encoding can be just like bytea with a one- or two-byte prefix
indicating what codeset+encoding it's in. That'd be how to encode
such text values on the wire, though on disk the column's type should
indicate the codeset+encoding, so no need to add a prefix to the value.

The precedent of BOMs (byte order marks) suggests strongly that
such a solution would be horrible to use.

regards, tom lane

#32Nico Williams
nico@cryptonector.com
In reply to: Tom Lane (#31)
Re: Pre-proposal: unicode normalized text

On Thu, Oct 05, 2023 at 03:49:37PM -0400, Tom Lane wrote:

Nico Williams <nico@cryptonector.com> writes:

Text+encoding can be just like bytea with a one- or two-byte prefix
indicating what codeset+encoding it's in. That'd be how to encode
such text values on the wire, though on disk the column's type should
indicate the codeset+encoding, so no need to add a prefix to the value.

The precedent of BOMs (byte order marks) suggests strongly that
such a solution would be horrible to use.

This is just how you encode the type of the string. You have any number
of options. The point is that PG can already encode binary data, so if
the question is how to encode text of disparate encodings on the wire,
building on top of the encoding of bytea is an option.

#33Peter Eisentraut
peter@eisentraut.org
In reply to: Jeff Davis (#6)
Re: Pre-proposal: unicode normalized text

On 03.10.23 21:54, Jeff Davis wrote:

Here, Jeff mentions normalization, but I think it's a major issue with
collation support. If new code points are added, users can put them
into the database before they are known to the collation library, and
then when they become known to the collation library the sort order
changes and indexes break.

The collation version number may reflect the change in understanding
about assigned code points that may affect collation -- though I'd like
to understand whether this is guaranteed or not.

This is correct. The collation version number produced by ICU contains
the UCA version, which is effectively the Unicode version (14.0, 15.0,
etc.). Since new code point assignments can only come from new Unicode
versions, a new assigned code point will always result in a different
collation version.

For example, with ICU 70 / CLDR 40 / Unicode 14:

select collversion from pg_collation where collname = 'unicode';
= 153.112

With ICU 72 / CLDR 42 / Unicode 15:
= 153.120

At minimum I think we need to have some internal functions to check for
unassigned code points. That belongs in core, because we generate the
unicode tables from a specific version.

If you want to be rigid about it, you also need to consider whether the
Unicode version used by the ICU library in use matches the one used by
the in-core tables.

#34Peter Eisentraut
peter@eisentraut.org
In reply to: Jeff Davis (#28)
Re: Pre-proposal: unicode normalized text

On 05.10.23 19:30, Jeff Davis wrote:

Agreed, at least until we understand the set of users per-column
encoding is important to. I acknowledge that the presence of per-column
encoding in the standard is some kind of signal there, but not enough
by itself to justify something so invasive.

The per-column encoding support in SQL is clearly a legacy feature from
before Unicode. If one were to write something like SQL today, one
would most likely just specify, "everything is Unicode".

#35Jeff Davis
pgsql@j-davis.com
In reply to: Peter Eisentraut (#33)
Re: Pre-proposal: unicode normalized text

On Fri, 2023-10-06 at 09:58 +0200, Peter Eisentraut wrote:

If you want to be rigid about it, you also need to consider whether the
Unicode version used by the ICU library in use matches the one used by
the in-core tables.

What problem are you concerned about here? I thought about it and I
didn't see an obvious issue.

If the ICU unicode version is ahead of the Postgres unicode version,
and no unassigned code points are used according to the Postgres
version, then there's no problem.

And in the other direction, there might be some code points that are
assigned according to the postgres unicode version but unassigned
according to the ICU version. But that would be tracked by the
collation version as you pointed out earlier, so upgrading ICU would be
like any other ICU upgrade (with the same risks). Right?

Regards,
Jeff Davis

#36Robert Haas
robertmhaas@gmail.com
In reply to: Nico Williams (#29)
Re: Pre-proposal: unicode normalized text

On Thu, Oct 5, 2023 at 3:15 PM Nico Williams <nico@cryptonector.com> wrote:

Text+encoding can be just like bytea with a one- or two-byte prefix
indicating what codeset+encoding it's in. That'd be how to encode
such text values on the wire, though on disk the column's type should
indicate the codeset+encoding, so no need to add a prefix to the value.

Well, that would be making the encoding a per-value property, rather
than a per-column property like collation as I proposed. I can't see
that working out very nicely, because encodings are
collation-specific. It wouldn't make any sense if the column collation
were en_US.UTF8 or ko_KR.eucKR or en_CA.ISO8859-1 (just to pick a few
values that are legal on my machine) while data stored in the column
was from a whole bunch of different encodings, at most one of which
could be the one to which the column's collation applied. That would
end up meaning, for example, that such a column was very hard to sort.
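
A small Python sketch of the sorting problem with mixed encodings in one column. The three values are chosen for illustration; byte order stands in for a memcmp()-style comparison.

```python
# (bytes, encoding) pairs as they might sit in a mixed-encoding column.
values = [
    ("\u00e9".encode("latin-1"), "latin-1"),  # e-acute      -> b'\xe9'
    ("\u00ea".encode("utf-8"), "utf-8"),      # e-circumflex -> b'\xc3\xaa'
    ("\u00e9".encode("utf-8"), "utf-8"),      # e-acute      -> b'\xc3\xa9'
]

# Sorting by raw bytes separates the two e-acute values.
by_bytes = [raw for raw, _ in sorted(values, key=lambda v: v[0])]
print(by_bytes)  # [b'\xc3\xa9', b'\xc3\xaa', b'\xe9']

# Sorting sensibly requires decoding every value first.
by_text = sorted(raw.decode(enc) for raw, enc in values)
assert by_text == ["\u00e9", "\u00e9", "\u00ea"]
```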

For that and other reasons, I suspect that the utility of storing data
from a variety of different encodings in the same database column is
quite limited. What I think people really want is a whole column in
some encoding that isn't the normal one for that database. That's not
to say we should add such a feature, but if we do, I think it should
be that, not a different encoding for every individual value.

--
Robert Haas
EDB: http://www.enterprisedb.com

#37Nico Williams
nico@cryptonector.com
In reply to: Robert Haas (#36)
Re: Pre-proposal: unicode normalized text

On Fri, Oct 06, 2023 at 01:33:06PM -0400, Robert Haas wrote:

On Thu, Oct 5, 2023 at 3:15 PM Nico Williams <nico@cryptonector.com> wrote:

Text+encoding can be just like bytea with a one- or two-byte prefix
indicating what codeset+encoding it's in. That'd be how to encode
such text values on the wire, though on disk the column's type should
indicate the codeset+encoding, so no need to add a prefix to the value.

Well, that would be making the encoding a per-value property, rather
than a per-column property like collation as I proposed. I can't see

On-disk it would be just a property of the type, not part of the value.

Nico
--

#38Jeff Davis
pgsql@j-davis.com
In reply to: Nico Williams (#32)
Re: Pre-proposal: unicode normalized text

On Thu, 2023-10-05 at 14:52 -0500, Nico Williams wrote:

This is just how you encode the type of the string. You have any number
of options. The point is that PG can already encode binary data, so if
the question is how to encode text of disparate encodings on the wire,
building on top of the encoding of bytea is an option.

There's another significant discussion going on here:

/messages/by-id/CA+TgmoZ8r8xb_73WzKHGb00cV3tpHV_U0RHuzzMFKvLepdu2Jw@mail.gmail.com

about how to handle binary formats better, so it's not clear to me that
it's a great precedent to expand upon. At least not yet.

I think it would be interesting to think more generally about these
representational issues in a way that accounts for binary formats,
extra_float_digits, client_encoding, etc. But I see that as more of an
issue with how the client expects to receive the data -- nobody has
presented a reason in this thread that we need per-column encodings on
the server.

Regards,
Jeff Davis

#39Robert Haas
robertmhaas@gmail.com
In reply to: Nico Williams (#37)
Re: Pre-proposal: unicode normalized text

On Fri, Oct 6, 2023 at 1:38 PM Nico Williams <nico@cryptonector.com> wrote:

On Fri, Oct 06, 2023 at 01:33:06PM -0400, Robert Haas wrote:

On Thu, Oct 5, 2023 at 3:15 PM Nico Williams <nico@cryptonector.com> wrote:

Text+encoding can be just like bytea with a one- or two-byte prefix
indicating what codeset+encoding it's in. That'd be how to encode
such text values on the wire, though on disk the column's type should
indicate the codeset+encoding, so no need to add a prefix to the value.

Well, that would be making the encoding a per-value property, rather
than a per-column property like collation as I proposed. I can't see

On-disk it would be just a property of the type, not part of the value.

I mean, that's not how it works.

--
Robert Haas
EDB: http://www.enterprisedb.com

#40Nico Williams
nico@cryptonector.com
In reply to: Robert Haas (#39)
Re: Pre-proposal: unicode normalized text

On Fri, Oct 06, 2023 at 02:17:32PM -0400, Robert Haas wrote:

On Fri, Oct 6, 2023 at 1:38 PM Nico Williams <nico@cryptonector.com> wrote:

On Fri, Oct 06, 2023 at 01:33:06PM -0400, Robert Haas wrote:

On Thu, Oct 5, 2023 at 3:15 PM Nico Williams <nico@cryptonector.com> wrote:

Text+encoding can be just like bytea with a one- or two-byte prefix
indicating what codeset+encoding it's in. That'd be how to encode
such text values on the wire, though on disk the column's type should
indicate the codeset+encoding, so no need to add a prefix to the value.

Well, that would be making the encoding a per-value property, rather
than a per-column property like collation as I proposed. I can't see

On-disk it would be just a property of the type, not part of the value.

I mean, that's not how it works.

Sure, because TEXT in PG doesn't have codeset+encoding as part of it --
it's whatever the database's encoding is. Collation can and should be a
property of a column, since for Unicode it wouldn't be reasonable to
make that part of the type. But codeset+encoding should really be a
property of the type if PG were to support more than one. IMO.

#41Robert Haas
robertmhaas@gmail.com
In reply to: Nico Williams (#40)
Re: Pre-proposal: unicode normalized text

On Fri, Oct 6, 2023 at 2:25 PM Nico Williams <nico@cryptonector.com> wrote:

Well, that would be making the encoding a per-value property, rather
than a per-column property like collation as I proposed. I can't see

On-disk it would be just a property of the type, not part of the value.

I mean, that's not how it works.

Sure, because TEXT in PG doesn't have codeset+encoding as part of it --
it's whatever the database's encoding is. Collation can and should be a
property of a column, since for Unicode it wouldn't be reasonable to
make that part of the type. But codeset+encoding should really be a
property of the type if PG were to support more than one. IMO.

No, what I mean is, you can't just be like "oh, the varlena will be
different in memory than on disk" as if that were no big deal.

I agree that, as an alternative to encoding being a column property,
it could instead be completely a type property, meaning that if you
want to store, say, LATIN1 text in your UTF-8 database, you first
create a latin1text data type and then use it, rather than, as in the
model I proposed, creating a text column and then applying a setting
like ENCODING latin1 to it. I think that there might be some problems
with that model, but it could also have some benefits. If someone were
going to make a run at implementing this, they might want to consider
both designs and evaluate the tradeoffs.

But, even if we were all convinced that this kind of feature was good
to add, I think it would almost certainly be wrong to invent new
varlena features along the way.

--
Robert Haas
EDB: http://www.enterprisedb.com

#42Jeff Davis
pgsql@j-davis.com
In reply to: Robert Haas (#36)
Re: Pre-proposal: unicode normalized text

On Fri, 2023-10-06 at 13:33 -0400, Robert Haas wrote:

What I think people really want is a whole column in
some encoding that isn't the normal one for that database.

Do people really want that? I'd be curious to know why.

A lot of modern projects are simply declaring UTF-8 to be the "one true
way". I am not suggesting that we do that, but it seems odd to go in
the opposite direction and have greater flexibility for many encodings.

Regards,
Jeff Davis

#43Isaac Morland
isaac.morland@gmail.com
In reply to: Jeff Davis (#42)
Re: Pre-proposal: unicode normalized text

On Fri, 6 Oct 2023 at 15:07, Jeff Davis <pgsql@j-davis.com> wrote:

On Fri, 2023-10-06 at 13:33 -0400, Robert Haas wrote:

What I think people really want is a whole column in
some encoding that isn't the normal one for that database.

Do people really want that? I'd be curious to know why.

A lot of modern projects are simply declaring UTF-8 to be the "one true
way". I am not suggesting that we do that, but it seems odd to go in
the opposite direction and have greater flexibility for many encodings.

And even if they want it, we can give it to them when we send/accept the
data from the client; just because they want to store ISO-8859-1 doesn't
mean the actual bytes on the disk need to be that. And by "client" maybe I
mean the client end of the network connection, and maybe I mean the program
that is calling in to libpq.

If they try to submit data that cannot possibly be encoded in the stated
encoding because the bytes they submit don't correspond to any string in
that encoding, then that is unambiguously an error, just as trying to put
February 30 in a date column is an error.
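
To make the "cannot possibly be encoded" case concrete for UTF-8, here is a minimal sketch of a well-formedness check. It is not PostgreSQL's own verification code; the function name utf8_is_valid is made up for illustration, and the rules it applies (no overlong forms, no surrogates, nothing above U+10FFFF) are the standard UTF-8 constraints.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: return true if buf[0..len) is well-formed UTF-8.
 * Rejects truncated sequences, stray continuation bytes, overlong forms,
 * surrogates (U+D800..U+DFFF), and values above U+10FFFF.
 */
bool
utf8_is_valid(const unsigned char *buf, size_t len)
{
	size_t		i = 0;

	while (i < len)
	{
		unsigned char b = buf[i];
		size_t		need;		/* number of continuation bytes expected */
		unsigned int cp;		/* decoded code point */
		unsigned int min;		/* smallest value allowed for this length */

		if (b < 0x80)
		{
			i++;				/* plain ASCII byte */
			continue;
		}
		else if ((b & 0xE0) == 0xC0)
		{
			need = 1; cp = b & 0x1F; min = 0x80;
		}
		else if ((b & 0xF0) == 0xE0)
		{
			need = 2; cp = b & 0x0F; min = 0x800;
		}
		else if ((b & 0xF8) == 0xF0)
		{
			need = 3; cp = b & 0x07; min = 0x10000;
		}
		else
			return false;		/* stray continuation or invalid lead byte */

		if (len - i - 1 < need)
			return false;		/* sequence is cut off by end of input */

		for (size_t k = 1; k <= need; k++)
		{
			unsigned char c = buf[i + k];

			if ((c & 0xC0) != 0x80)
				return false;	/* not a continuation byte */
			cp = (cp << 6) | (c & 0x3F);
		}

		if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
			return false;		/* overlong, out of range, or surrogate */

		i += need + 1;
	}
	return true;
}
```

For instance, the single byte 0xE1 is "á" in ISO-8859-1 but is rejected here, because in UTF-8 it announces a three-byte sequence that never arrives. That is the kind of input that is unambiguously an error.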

Is there a single other data type where anybody is even discussing letting
the client tell us how to write the data on disk?

#44Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Jeff Davis (#42)
Re: Pre-proposal: unicode normalized text

On Fri, 6 Oct 2023, 21:08 Jeff Davis, <pgsql@j-davis.com> wrote:

On Fri, 2023-10-06 at 13:33 -0400, Robert Haas wrote:

What I think people really want is a whole column in
some encoding that isn't the normal one for that database.

Do people really want that? I'd be curious to know why.

One reason someone would like this is that a database cluster may have
been initialized with something like --no-locale (thus defaulting to
LC_COLLATE=C, which is desired behaviour and gets fast strcmp operations
for indexing, and to the SQL_ASCII encoding, which is not exactly expected
but can be sufficient for some workloads), but now that the data has grown
they want to use en_US.utf8 collations in some of the fields of their new
and modern tables.
Or, a user wants to maintain literal translation tables, where different
encodings would need to be used for different languages to cover the full
script when Unicode might not cover the full character set yet.
Additionally, I'd imagine specialized encodings like Shift_JIS could be
more space efficient than UTF-8 for e.g. Japanese text, which might be
useful for someone who wants to be a bit more frugal with storage when they
know text is guaranteed to be in some encoding's native language:
compression can do the same work, but also adds significant overhead.
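
As rough arithmetic for the space argument: UTF-8 needs three bytes for every code point from U+0800 to U+FFFF, which covers kana and the common kanji, while Shift_JIS stores those characters in two bytes each. The sketch below only computes the UTF-8 side; the function names are hypothetical, and the two-byte Shift_JIS figure is stated as an assumption in the comment rather than computed, since that would need a mapping table.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of bytes UTF-8 needs for one code point (0 if not encodable). */
size_t
utf8_encoded_len(uint32_t cp)
{
	if (cp < 0x80)
		return 1;
	if (cp < 0x800)
		return 2;
	if (cp < 0x10000)
		return (cp >= 0xD800 && cp <= 0xDFFF) ? 0 : 3;	/* surrogates excluded */
	if (cp <= 0x10FFFF)
		return 4;
	return 0;
}

/*
 * Total UTF-8 size of a sequence of code points.
 *
 * Assumption for comparison: a kanji or kana from JIS X 0208 takes 2 bytes
 * in Shift_JIS, so n such characters take 2 * n bytes there.
 */
size_t
utf8_total_len(const uint32_t *cps, size_t n)
{
	size_t		total = 0;

	for (size_t i = 0; i < n; i++)
		total += utf8_encoded_len(cps[i]);
	return total;
}
```

For the three kanji U+65E5, U+672C, U+8A9E this gives 9 bytes in UTF-8 against 6 bytes in Shift_JIS, so the saving for pure Japanese text is about one third before any compression.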

I've certainly experienced situations where I forgot to explicitly include
the encoding in initdb --no-locale and then only much later noticed that my
big data load is useless due to an inability to create UTF-8 collated
indexes.
I often use --no-locale to make string indexing fast (locales/collation are
not often important to my workload) and to block any environment variables
from being carried over into the installation. An ability to set or update
the encoding of columns would help reduce the pain: I would no longer have
to re-initialize the database or cluster from 0.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)

#45Jeff Davis
pgsql@j-davis.com
In reply to: Robert Haas (#11)
1 attachment(s)
Re: Pre-proposal: unicode normalized text

On Wed, 2023-10-04 at 13:16 -0400, Robert Haas wrote:

At minimum I think we need to have some internal functions to check for
unassigned code points. That belongs in core, because we generate the
unicode tables from a specific version.

That's a good idea.

Patch attached.

I added a new perl script to parse UnicodeData.txt and generate a
lookup table (of ranges, which can be binary-searched).
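
For readers who want the idea without opening the patch, here is a minimal standalone sketch of a binary search over a sorted table of code point ranges. The names (demo_range, demo_lookup, the DEMO_* values) are made up for illustration and are not the ones the patch uses. The rows are copied from the first entries of the generated table, so the demo only covers U+0000 through U+0039.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative subset of categories; not the enum from the patch. */
typedef enum
{
	DEMO_NOT_COVERED = -1,		/* code point outside the demo table */
	DEMO_CONTROL,
	DEMO_SPACE,
	DEMO_OTHER_PUNCT,
	DEMO_CURRENCY,
	DEMO_START_PUNCT,
	DEMO_END_PUNCT,
	DEMO_MATH,
	DEMO_DASH,
	DEMO_DIGIT
} demo_cat_t;

typedef struct
{
	uint32_t	first;			/* first code point in range, inclusive */
	uint32_t	last;			/* last code point in range, inclusive */
	demo_cat_t	category;
} demo_range;

/* Must stay sorted by code point and non-overlapping. */
static const demo_range demo_ranges[] = {
	{0x000000, 0x00001f, DEMO_CONTROL},
	{0x000020, 0x000020, DEMO_SPACE},
	{0x000021, 0x000023, DEMO_OTHER_PUNCT},
	{0x000024, 0x000024, DEMO_CURRENCY},
	{0x000025, 0x000027, DEMO_OTHER_PUNCT},
	{0x000028, 0x000028, DEMO_START_PUNCT},
	{0x000029, 0x000029, DEMO_END_PUNCT},
	{0x00002a, 0x00002a, DEMO_OTHER_PUNCT},
	{0x00002b, 0x00002b, DEMO_MATH},
	{0x00002c, 0x00002c, DEMO_OTHER_PUNCT},
	{0x00002d, 0x00002d, DEMO_DASH},
	{0x00002e, 0x00002f, DEMO_OTHER_PUNCT},
	{0x000030, 0x000039, DEMO_DIGIT},
};

/* Binary search over the half-open index interval [lo, hi). */
demo_cat_t
demo_lookup(uint32_t cp)
{
	size_t		lo = 0;
	size_t		hi = sizeof(demo_ranges) / sizeof(demo_ranges[0]);

	while (lo < hi)
	{
		size_t		mid = lo + (hi - lo) / 2;

		if (cp > demo_ranges[mid].last)
			lo = mid + 1;		/* look in the upper half */
		else if (cp < demo_ranges[mid].first)
			hi = mid;			/* look in the lower half */
		else
			return demo_ranges[mid].category;
	}
	return DEMO_NOT_COVERED;
}
```

Storing ranges rather than one entry per code point keeps the table at a few thousand rows, and a lookup costs a dozen or so comparisons.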

The C entry point does the same thing as u_charType(), and I also
matched the enum numeric values for convenience. I didn't use
u_charType() because I don't think this kind of unicode functionality
should depend on ICU, and I think it should match other Postgres
Unicode functionality.

Strictly speaking, I only needed to know whether it's unassigned or
not, not the general category. But it seemed easy enough to return the
general category, and it will be easier to create other potentially-
useful functions on top of this.
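
As a sketch of what "functions on top of this" could look like, the predicates below take a general category and answer narrower questions. The names (demo_cat, category_is_assigned and so on) are hypothetical and are not part of the patch. The numeric values are copied from the pg_unicode_category enum in the patch, with the categories not needed here left out.

```c
#include <stdbool.h>

/* Subset of general categories; values match the enum in the patch. */
typedef enum
{
	CAT_UNASSIGNED = 0,			/* Cn */
	CAT_UPPERCASE_LETTER = 1,	/* Lu */
	CAT_LOWERCASE_LETTER = 2,	/* Ll */
	CAT_TITLECASE_LETTER = 3,	/* Lt */
	CAT_MODIFIER_LETTER = 4,	/* Lm */
	CAT_OTHER_LETTER = 5,		/* Lo */
	CAT_NON_SPACING_MARK = 6,	/* Mn */
	CAT_ENCLOSING_MARK = 7,		/* Me */
	CAT_COMBINING_SPACING_MARK = 8, /* Mc */
	CAT_DECIMAL_DIGIT_NUMBER = 9	/* Nd */
	/* remaining categories omitted in this sketch */
} demo_cat;

/* The only question the UTEXT idea strictly needs answered. */
bool
category_is_assigned(demo_cat c)
{
	return c != CAT_UNASSIGNED;
}

/* Any of the letter categories Lu, Ll, Lt, Lm, Lo. */
bool
category_is_letter(demo_cat c)
{
	return c >= CAT_UPPERCASE_LETTER && c <= CAT_OTHER_LETTER;
}

/* Any of the mark categories Mn, Me, Mc. */
bool
category_is_mark(demo_cat c)
{
	return c >= CAT_NON_SPACING_MARK && c <= CAT_COMBINING_SPACING_MARK;
}
```

The range comparisons work only because related categories have adjacent numeric values in the enum; a caller would pass in the result of the category lookup for a code point.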

The tests do require ICU though, because I compare with the results of
u_charType().

Regards,
Jeff Davis

Attachments:

v1-0001-Internal-functions-for-determining-Unicode-genera.patch (text/x-patch; charset=UTF-8)
From 174b4099045ba573e7e4c6895c4f59d9c095e10c Mon Sep 17 00:00:00 2001
From: Jeff Davis <jeff@j-davis.com>
Date: Thu, 5 Oct 2023 17:01:03 -0700
Subject: [PATCH v1] Internal functions for determining Unicode general
 category.

Add perl script to generate general category lookup tables from
UnicodeData.txt, and create entry points to look up the general
category for a given code point.

Discussion: https://postgr.es/m/CA+TgmoYzYR-yhU6k1XFCADeyj=Oyz2PkVsa3iKv+keM8wp-F_A@mail.gmail.com
---
 src/common/Makefile                           |    1 +
 src/common/meson.build                        |    1 +
 src/common/unicode/Makefile                   |   16 +-
 src/common/unicode/category_test.c            |  114 +
 .../generate-unicode_category_table.pl        |  202 +
 src/common/unicode/meson.build                |   31 +
 src/common/unicode/norm_test.c                |    2 +-
 src/common/unicode_category.c                 |  197 +
 src/include/common/unicode_category.h         |   57 +
 src/include/common/unicode_category_table.h   | 4039 +++++++++++++++++
 10 files changed, 4657 insertions(+), 3 deletions(-)
 create mode 100644 src/common/unicode/category_test.c
 create mode 100644 src/common/unicode/generate-unicode_category_table.pl
 create mode 100644 src/common/unicode_category.c
 create mode 100644 src/include/common/unicode_category.h
 create mode 100644 src/include/common/unicode_category_table.h

diff --git a/src/common/Makefile b/src/common/Makefile
index cc5c54dcee..fbce73f407 100644
--- a/src/common/Makefile
+++ b/src/common/Makefile
@@ -78,6 +78,7 @@ OBJS_COMMON = \
 	scram-common.o \
 	string.o \
 	stringinfo.o \
+	unicode_category.o \
 	unicode_norm.o \
 	username.o \
 	wait_error.o \
diff --git a/src/common/meson.build b/src/common/meson.build
index 3b97497d1a..8106d8ad6c 100644
--- a/src/common/meson.build
+++ b/src/common/meson.build
@@ -30,6 +30,7 @@ common_sources = files(
   'scram-common.c',
   'string.c',
   'stringinfo.c',
+  'unicode_category.c',
   'unicode_norm.c',
   'username.c',
   'wait_error.c',
diff --git a/src/common/unicode/Makefile b/src/common/unicode/Makefile
index 382da476cf..22c606f654 100644
--- a/src/common/unicode/Makefile
+++ b/src/common/unicode/Makefile
@@ -15,11 +15,15 @@ include $(top_builddir)/src/Makefile.global
 override CPPFLAGS := -DFRONTEND -I. $(CPPFLAGS)
 LIBS += $(PTHREAD_LIBS)
 
+LDFLAGS_INTERNAL += $(ICU_LIBS)
+CPPFLAGS += $(ICU_CFLAGS)
+
 # By default, do nothing.
 all:
 
-update-unicode: unicode_norm_table.h unicode_nonspacing_table.h unicode_east_asian_fw_table.h unicode_normprops_table.h unicode_norm_hashfunc.h
+update-unicode: unicode_category_table.h unicode_norm_table.h unicode_nonspacing_table.h unicode_east_asian_fw_table.h unicode_normprops_table.h unicode_norm_hashfunc.h
 	mv $^ $(top_srcdir)/src/include/common/
+	$(MAKE) category-check
 	$(MAKE) normalization-check
 
 # These files are part of the Unicode Character Database. Download
@@ -28,6 +32,9 @@ update-unicode: unicode_norm_table.h unicode_nonspacing_table.h unicode_east_asi
 UnicodeData.txt EastAsianWidth.txt DerivedNormalizationProps.txt CompositionExclusions.txt NormalizationTest.txt: $(top_builddir)/src/Makefile.global
 	$(DOWNLOAD) https://www.unicode.org/Public/$(UNICODE_VERSION)/ucd/$(@F)
 
+unicode_category_table.h: generate-unicode_category_table.pl UnicodeData.txt
+	$(PERL) $<
+
 # Generation of conversion tables used for string normalization with
 # UTF-8 strings.
 unicode_norm_hashfunc.h: unicode_norm_table.h
@@ -45,9 +52,14 @@ unicode_normprops_table.h: generate-unicode_normprops_table.pl DerivedNormalizat
 	$(PERL) $^ >$@
 
 # Test suite
+category-check: category_test
+	./category_test $(UNICODE_VERSION)
+
 normalization-check: norm_test
 	./norm_test
 
+category_test: category_test.o ../unicode_category.o | submake-common
+
 norm_test: norm_test.o ../unicode_norm.o | submake-common
 
 norm_test.o: norm_test_table.h
@@ -64,7 +76,7 @@ norm_test_table.h: generate-norm_test_table.pl NormalizationTest.txt
 
 
 clean:
-	rm -f $(OBJS) norm_test norm_test.o
+	rm -f $(OBJS) category_test category_test.o norm_test norm_test.o
 
 distclean: clean
 	rm -f UnicodeData.txt EastAsianWidth.txt CompositionExclusions.txt NormalizationTest.txt norm_test_table.h unicode_norm_table.h
diff --git a/src/common/unicode/category_test.c b/src/common/unicode/category_test.c
new file mode 100644
index 0000000000..adf0d2848a
--- /dev/null
+++ b/src/common/unicode/category_test.c
@@ -0,0 +1,114 @@
+/*-------------------------------------------------------------------------
+ * category_test.c
+ *		Program to test Unicode general category functions.
+ *
+ * Portions Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  src/common/unicode/category_test.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+#ifdef USE_ICU
+#include <unicode/uchar.h>
+#endif
+#include "common/unicode_category.h"
+
+/*
+ * Parse X.Y[.Z] into integer composed from X and Y.
+ */
+static int
+parse_unicode_version(const char *version)
+{
+	int n, major, minor;
+
+	n = sscanf(version, "%d.%d", &major, &minor);
+
+	Assert(n == 2);
+	Assert(major < 0xff && minor < 0xff);
+
+	return (major << 8) | minor;
+}
+
+/*
+ * Exhaustively test that the Unicode category for each codepoint matches that
+ * returned by ICU.
+ */
+int
+main(int argc, char **argv)
+{
+#ifdef USE_ICU
+	int		pg_unicode_version;
+	int		icu_unicode_version;
+	int		pg_skipped_codepoints  = 0;
+	int		icu_skipped_codepoints = 0;
+
+	/* argument is the version of Unicode data imported by Postgres */
+	if (argc != 2)
+	{
+		printf("Expected one argument, got %d.\n", argc - 1);
+		exit(1);
+	}
+
+	pg_unicode_version = parse_unicode_version(argv[1]);
+	icu_unicode_version = parse_unicode_version(U_UNICODE_VERSION);
+
+	printf("Postgres Unicode Version:\t%d.%d\n", pg_unicode_version >> 8,
+		   pg_unicode_version & 0xff);
+	printf("ICU Unicode Version:\t\t%d.%d\n", icu_unicode_version >> 8,
+		   icu_unicode_version & 0xff);
+
+	for (UChar32 code = 0; code <= 0x10ffff; code++)
+	{
+		uint8_t pg_category = unicode_category(code);
+		uint8_t icu_category = u_charType(code);
+		if (pg_category != icu_category)
+		{
+			/*
+			 * A version mismatch means that some assigned codepoints in the
+			 * newer version may be unassigned in the older version. That's
+			 * OK, though the test will not cover those codepoints marked
+			 * unassigned in the older version (that is, it will no longer be
+			 * an exhaustive test).
+			 */
+			if (pg_category == PG_U_UNASSIGNED &&
+				pg_unicode_version < icu_unicode_version)
+				pg_skipped_codepoints++;
+			else if (icu_category == PG_U_UNASSIGNED &&
+					 icu_unicode_version < pg_unicode_version)
+				icu_skipped_codepoints++;
+			else
+			{
+				printf("FAILURE for codepoint %06x\n", code);
+				printf("Postgres category:	%02d %s %s\n", pg_category,
+					   unicode_category_short(pg_category),
+					   unicode_category_string(pg_category));
+				printf("ICU category:		%02d %s %s\n", icu_category,
+					   unicode_category_short(icu_category),
+					   unicode_category_string(icu_category));
+				printf("\n");
+				exit(1);
+			}
+		}
+	}
+
+	if (pg_skipped_codepoints > 0)
+		printf("Skipped %d codepoints unassigned in Postgres due to Unicode version mismatch.\n",
+			   pg_skipped_codepoints);
+	if (icu_skipped_codepoints > 0)
+		printf("Skipped %d codepoints unassigned in ICU due to Unicode version mismatch.\n",
+			   icu_skipped_codepoints);
+
+	printf("category_test: All tests successful!\n");
+	exit(0);
+#else
+	printf("ICU support required for test; skipping.\n");
+	exit(0);
+#endif
+}
diff --git a/src/common/unicode/generate-unicode_category_table.pl b/src/common/unicode/generate-unicode_category_table.pl
new file mode 100644
index 0000000000..bec34d591d
--- /dev/null
+++ b/src/common/unicode/generate-unicode_category_table.pl
@@ -0,0 +1,202 @@
+#!/usr/bin/perl
+#
+# Generate a code point category table and its lookup utilities, using
+# Unicode data files as input.
+#
+# Input: UnicodeData.txt
+# Output: unicode_category_table.h
+#
+# Copyright (c) 2000-2023, PostgreSQL Global Development Group
+
+use strict;
+use warnings;
+use Getopt::Long;
+
+use FindBin;
+use lib "$FindBin::RealBin/../../tools/";
+
+my $CATEGORY_UNASSIGNED = 'Cn';
+
+my $output_path = '.';
+
+GetOptions('outdir:s' => \$output_path);
+
+my $output_table_file = "$output_path/unicode_category_table.h";
+
+my $FH;
+
+# Read entries from UnicodeData.txt into a list of codepoint ranges
+# and their general category.
+my @category_ranges = ();
+my $range_start = undef;
+my $range_end = undef;
+my $range_category = undef;
+
+# If between a "<..., First>" entry and a "<..., Last>" entry, the gap in
+# codepoints represents a range, and $gap_category is equal to the
+# category for both (which must match). Otherwise, the gap represents
+# unassigned code points.
+my $gap_category = undef;
+
+open($FH, '<', "$output_path/UnicodeData.txt")
+  or die "Could not open $output_path/UnicodeData.txt: $!.";
+while (my $line = <$FH>)
+{
+	my @elts = split(';', $line);
+	my $code = hex($elts[0]);
+	my $name = $elts[1];
+	my $category = $elts[2];
+
+	die "codepoint out of range" if $code > 0x10FFFF;
+	die "unassigned codepoint in UnicodeData.txt" if $category eq $CATEGORY_UNASSIGNED;
+
+	if (!defined($range_start)) {
+		my $code_str = sprintf "0x%06x", $code;
+		die if defined($range_end) || defined($range_category) || defined($gap_category);
+		die "unexpected first entry <..., Last>" if ($name =~ /Last>/);
+		die "expected 0x000000 for first entry, got $code_str" if $code != 0x000000;
+
+		# initialize
+		$range_start = $code;
+		$range_end = $code;
+		$range_category = $category;
+		if ($name =~ /<.*, First>$/) {
+			$gap_category = $category;
+		} else {
+			$gap_category = $CATEGORY_UNASSIGNED;
+		}
+		next;
+	}
+
+	# Gap in codepoints detected. If it's a different category than
+	# the current range, emit the current range and initialize a new
+	# range representing the gap.
+	if ($range_end + 1 != $code && $range_category ne $gap_category) {
+		push(@category_ranges, {start => $range_start, end => $range_end, category => $range_category});
+		$range_start = $range_end + 1;
+		$range_end = $code - 1;
+		$range_category = $gap_category;
+	}
+
+	# different category; new range
+	if ($range_category ne $category) {
+		push(@category_ranges, {start => $range_start, end => $range_end, category => $range_category});
+		$range_start = $code;
+		$range_end = $code;
+		$range_category = $category;
+	}
+
+	if ($name =~ /<.*, First>$/) {
+		die "<..., First> entry unexpectedly follows <..., Last> entry"
+		  if $gap_category ne $CATEGORY_UNASSIGNED;
+		$gap_category = $category;
+	}
+	elsif ($name =~ /<.*, Last>$/) {
+		die "<..., First> and <..., Last> entries have mismatching general category"
+		  if $gap_category ne $category;
+		$gap_category = $CATEGORY_UNASSIGNED;
+	}
+	else {
+		die "unexpected entry found between <..., First> and <..., Last>"
+		  if $gap_category ne $CATEGORY_UNASSIGNED;
+	}
+
+	$range_end = $code;
+}
+close $FH;
+
+die "<..., First> entry with no corresponding <..., Last> entry"
+  if $gap_category ne $CATEGORY_UNASSIGNED;
+
+# emit final range
+push(@category_ranges, {start => $range_start, end => $range_end, category => $range_category});
+
+# emit range for any unassigned code points after last entry
+if ($range_end < 0x10FFFF) {
+	$range_start = $range_end + 1;
+	$range_end = 0x10FFFF;
+	$range_category = $CATEGORY_UNASSIGNED;
+	push(@category_ranges, {start => $range_start, end => $range_end, category => $range_category});
+}
+
+my $num_ranges = scalar @category_ranges;
+
+# See: https://www.unicode.org/reports/tr44/#General_Category_Values
+my $categories = {
+	Cn => 'PG_U_UNASSIGNED',
+	Lu => 'PG_U_UPPERCASE_LETTER',
+	Ll => 'PG_U_LOWERCASE_LETTER',
+	Lt => 'PG_U_TITLECASE_LETTER',
+	Lm => 'PG_U_MODIFIER_LETTER',
+	Lo => 'PG_U_OTHER_LETTER',
+	Mn => 'PG_U_NON_SPACING_MARK',
+	Me => 'PG_U_ENCLOSING_MARK',
+	Mc => 'PG_U_COMBINING_SPACING_MARK',
+	Nd => 'PG_U_DECIMAL_DIGIT_NUMBER',
+	Nl => 'PG_U_LETTER_NUMBER',
+	No => 'PG_U_OTHER_NUMBER',
+	Zs => 'PG_U_SPACE_SEPARATOR',
+	Zl => 'PG_U_LINE_SEPARATOR',
+	Zp => 'PG_U_PARAGRAPH_SEPARATOR',
+	Cc => 'PG_U_CONTROL_CHAR',
+	Cf => 'PG_U_FORMAT_CHAR',
+	Co => 'PG_U_PRIVATE_USE_CHAR',
+	Cs => 'PG_U_SURROGATE',
+	Pd => 'PG_U_DASH_PUNCTUATION',
+	Ps => 'PG_U_START_PUNCTUATION',
+	Pe => 'PG_U_END_PUNCTUATION',
+	Pc => 'PG_U_CONNECTOR_PUNCTUATION',
+	Po => 'PG_U_OTHER_PUNCTUATION',
+	Sm => 'PG_U_MATH_SYMBOL',
+	Sc => 'PG_U_CURRENCY_SYMBOL',
+	Sk => 'PG_U_MODIFIER_SYMBOL',
+	So => 'PG_U_OTHER_SYMBOL',
+	Pi => 'PG_U_INITIAL_PUNCTUATION',
+	Pf => 'PG_U_FINAL_PUNCTUATION'
+};
+
+# Start writing out the output files
+open my $OT, '>', $output_table_file
+  or die "Could not open output file $output_table_file: $!\n";
+
+print $OT <<HEADER;
+/*-------------------------------------------------------------------------
+ *
+ * unicode_category_table.h
+ *	  Category table for Unicode character classification.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/common/unicode_category_table.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+/*
+ * File auto-generated by src/common/unicode/generate-unicode_category_table.pl,
+ * do not edit. There is deliberately not an #ifndef PG_UNICODE_CATEGORY_TABLE_H
+ * here.
+ */
+typedef struct
+{
+	uint32		first;	/* Unicode codepoint */
+	uint32		last;		/* Unicode codepoint */
+	uint8		category;		/* General Category of the range */
+} pg_category_range;
+
+/* table of Unicode codepoint ranges and their categories */
+static const pg_category_range unicode_categories[$num_ranges] =
+{
+HEADER
+
+my $firsttime = 1;
+foreach my $range (@category_ranges) {
+	printf $OT ",\n" unless $firsttime;
+	$firsttime = 0;
+
+	my $category = $categories->{$range->{category}};
+	die "category missing: $range->{category}" unless $category;
+	printf $OT "\t{0x%06x, 0x%06x, %s}", $range->{start}, $range->{end}, $category;
+}
+print $OT "\n};\n\n";
diff --git a/src/common/unicode/meson.build b/src/common/unicode/meson.build
index 357ca2f9fb..e37e9d07be 100644
--- a/src/common/unicode/meson.build
+++ b/src/common/unicode/meson.build
@@ -24,6 +24,16 @@ endforeach
 
 update_unicode_targets = []
 
+update_unicode_targets += \
+  custom_target('unicode_category_table.h',
+    input: [unicode_data['UnicodeData.txt']],
+    output: ['unicode_category_table.h'],
+    command: [
+      perl, files('generate-unicode_category_table.pl'),
+      '--outdir', '@OUTDIR@', '@INPUT@'],
+    build_by_default: false,
+  )
+
 update_unicode_targets += \
   custom_target('unicode_norm_table.h',
     input: [unicode_data['UnicodeData.txt'], unicode_data['CompositionExclusions.txt']],
@@ -73,6 +83,17 @@ norm_test_table = custom_target('norm_test_table.h',
 
 inc = include_directories('.')
 
+category_test = executable('category_test',
+  ['category_test.c'],
+  dependencies: [frontend_port_code, icu],
+  include_directories: inc,
+  link_with: [common_static, pgport_static],
+  build_by_default: false,
+  kwargs: default_bin_args + {
+    'install': false,
+  }
+)
+
 norm_test = executable('norm_test',
   ['norm_test.c', norm_test_table],
   dependencies: [frontend_port_code],
@@ -86,6 +107,16 @@ norm_test = executable('norm_test',
 
 update_unicode_dep = []
 
+if not meson.is_cross_build()
+  update_unicode_dep += custom_target('category_test.run',
+    output: 'category_test.run',
+    input: update_unicode_targets,
+    command: [category_test, UNICODE_VERSION],
+    build_by_default: false,
+    build_always_stale: true,
+  )
+endif
+
 if not meson.is_cross_build()
   update_unicode_dep += custom_target('norm_test.run',
     output: 'norm_test.run',
diff --git a/src/common/unicode/norm_test.c b/src/common/unicode/norm_test.c
index 809a6dee54..b6097b912a 100644
--- a/src/common/unicode/norm_test.c
+++ b/src/common/unicode/norm_test.c
@@ -81,6 +81,6 @@ main(int argc, char **argv)
 		}
 	}
 
-	printf("All tests successful!\n");
+	printf("norm_test: All tests successful!\n");
 	exit(0);
 }
diff --git a/src/common/unicode_category.c b/src/common/unicode_category.c
new file mode 100644
index 0000000000..b8b8ee8a4e
--- /dev/null
+++ b/src/common/unicode_category.c
@@ -0,0 +1,197 @@
+/*-------------------------------------------------------------------------
+ * unicode_category.c
+ *		Determine general category of Unicode characters.
+ *
+ * Portions Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  src/common/unicode_category.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef FRONTEND
+#include "postgres.h"
+#else
+#include "postgres_fe.h"
+#endif
+
+#include "common/unicode_category.h"
+#include "common/unicode_category_table.h"
+
+/*
+ * Unicode general category for the given codepoint.
+ */
+pg_unicode_category
+unicode_category(pg_wchar ucs)
+{
+	int	min = 0;
+	int	mid;
+	int max = (sizeof(unicode_categories) / sizeof(pg_category_range)) - 1;
+
+	Assert (ucs >= unicode_categories[0].first &&
+			ucs <= unicode_categories[max].last);
+
+	while (max >= min)
+	{
+		mid = (min + max) / 2;
+		if (ucs > unicode_categories[mid].last)
+			min = mid + 1;
+		else if (ucs < unicode_categories[mid].first)
+			max = mid - 1;
+		else
+			return unicode_categories[mid].category;
+	}
+
+	Assert(false);
+	return (pg_unicode_category) -1;
+}
+
+/*
+ * Description of Unicode general category.
+ *
+ * See: https://www.unicode.org/reports/tr44/#General_Category_Values
+ */
+const char *
+unicode_category_string(pg_unicode_category category)
+{
+	switch (category)
+	{
+		case PG_U_UNASSIGNED:
+			return "Unassigned";
+		case PG_U_UPPERCASE_LETTER:
+			return "Uppercase_Letter";
+		case PG_U_LOWERCASE_LETTER:
+			return "Lowercase_Letter";
+		case PG_U_TITLECASE_LETTER:
+			return "Titlecase_Letter";
+		case PG_U_MODIFIER_LETTER:
+			return "Modifier_Letter";
+		case PG_U_OTHER_LETTER:
+			return "Other_Letter";
+		case PG_U_NON_SPACING_MARK:
+			return "Nonspacing_Mark";
+		case PG_U_ENCLOSING_MARK:
+			return "Enclosing_Mark";
+		case PG_U_COMBINING_SPACING_MARK:
+			return "Spacing_Mark";
+		case PG_U_DECIMAL_DIGIT_NUMBER:
+			return "Decimal_Number";
+		case PG_U_LETTER_NUMBER:
+			return "Letter_Number";
+		case PG_U_OTHER_NUMBER:
+			return "Other_Number";
+		case PG_U_SPACE_SEPARATOR:
+			return "Space_Separator";
+		case PG_U_LINE_SEPARATOR:
+			return "Line_Separator";
+		case PG_U_PARAGRAPH_SEPARATOR:
+			return "Paragraph_Separator";
+		case PG_U_CONTROL_CHAR:
+			return "Control";
+		case PG_U_FORMAT_CHAR:
+			return "Format";
+		case PG_U_PRIVATE_USE_CHAR:
+			return "Private_Use";
+		case PG_U_SURROGATE:
+			return "Surrogate";
+		case PG_U_DASH_PUNCTUATION:
+			return "Dash_Punctuation";
+		case PG_U_START_PUNCTUATION:
+			return "Open_Punctuation";
+		case PG_U_END_PUNCTUATION:
+			return "Close_Punctuation";
+		case PG_U_CONNECTOR_PUNCTUATION:
+			return "Connector_Punctuation";
+		case PG_U_OTHER_PUNCTUATION:
+			return "Other_Punctuation";
+		case PG_U_MATH_SYMBOL:
+			return "Math_Symbol";
+		case PG_U_CURRENCY_SYMBOL:
+			return "Currency_Symbol";
+		case PG_U_MODIFIER_SYMBOL:
+			return "Modifier_Symbol";
+		case PG_U_OTHER_SYMBOL:
+			return "Other_Symbol";
+		case PG_U_INITIAL_PUNCTUATION:
+			return "Initial_Punctuation";
+		case PG_U_FINAL_PUNCTUATION:
+			return "Final_Punctuation";
+		default:
+			return "Unrecognized";
+	}
+}
+
+/*
+ * Short code for Unicode general category.
+ *
+ * See: https://www.unicode.org/reports/tr44/#General_Category_Values
+ */
+const char *
+unicode_category_short(pg_unicode_category category)
+{
+	switch (category)
+	{
+		case PG_U_UNASSIGNED:
+			return "Cn";
+		case PG_U_UPPERCASE_LETTER:
+			return "Lu";
+		case PG_U_LOWERCASE_LETTER:
+			return "Ll";
+		case PG_U_TITLECASE_LETTER:
+			return "Lt";
+		case PG_U_MODIFIER_LETTER:
+			return "Lm";
+		case PG_U_OTHER_LETTER:
+			return "Lo";
+		case PG_U_NON_SPACING_MARK:
+			return "Mn";
+		case PG_U_ENCLOSING_MARK:
+			return "Me";
+		case PG_U_COMBINING_SPACING_MARK:
+			return "Mc";
+		case PG_U_DECIMAL_DIGIT_NUMBER:
+			return "Nd";
+		case PG_U_LETTER_NUMBER:
+			return "Nl";
+		case PG_U_OTHER_NUMBER:
+			return "No";
+		case PG_U_SPACE_SEPARATOR:
+			return "Zs";
+		case PG_U_LINE_SEPARATOR:
+			return "Zl";
+		case PG_U_PARAGRAPH_SEPARATOR:
+			return "Zp";
+		case PG_U_CONTROL_CHAR:
+			return "Cc";
+		case PG_U_FORMAT_CHAR:
+			return "Cf";
+		case PG_U_PRIVATE_USE_CHAR:
+			return "Co";
+		case PG_U_SURROGATE:
+			return "Cs";
+		case PG_U_DASH_PUNCTUATION:
+			return "Pd";
+		case PG_U_START_PUNCTUATION:
+			return "Ps";
+		case PG_U_END_PUNCTUATION:
+			return "Pe";
+		case PG_U_CONNECTOR_PUNCTUATION:
+			return "Pc";
+		case PG_U_OTHER_PUNCTUATION:
+			return "Po";
+		case PG_U_MATH_SYMBOL:
+			return "Sm";
+		case PG_U_CURRENCY_SYMBOL:
+			return "Sc";
+		case PG_U_MODIFIER_SYMBOL:
+			return "Sk";
+		case PG_U_OTHER_SYMBOL:
+			return "So";
+		case PG_U_INITIAL_PUNCTUATION:
+			return "Pi";
+		case PG_U_FINAL_PUNCTUATION:
+			return "Pf";
+		default:
+			return "??";
+	}
+}
diff --git a/src/include/common/unicode_category.h b/src/include/common/unicode_category.h
new file mode 100644
index 0000000000..e4301be726
--- /dev/null
+++ b/src/include/common/unicode_category.h
@@ -0,0 +1,57 @@
+/*-------------------------------------------------------------------------
+ *
+ * unicode_category.h
+ *	  Routines for determining the category of Unicode characters.
+ *
+ * These definitions can be used by both frontend and backend code.
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * src/include/common/unicode_category.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef UNICODE_CATEGORY_H
+#define UNICODE_CATEGORY_H
+
+#include "mb/pg_wchar.h"
+
+/* matches corresponding numeric values of UCharCategory, defined by ICU */
+typedef enum pg_unicode_category {
+	PG_U_UNASSIGNED = 0,
+	PG_U_UPPERCASE_LETTER = 1,
+	PG_U_LOWERCASE_LETTER = 2,
+	PG_U_TITLECASE_LETTER = 3,
+	PG_U_MODIFIER_LETTER = 4,
+	PG_U_OTHER_LETTER = 5,
+	PG_U_NON_SPACING_MARK = 6,
+	PG_U_ENCLOSING_MARK = 7,
+	PG_U_COMBINING_SPACING_MARK = 8,
+	PG_U_DECIMAL_DIGIT_NUMBER = 9,
+	PG_U_LETTER_NUMBER = 10,
+	PG_U_OTHER_NUMBER = 11,
+	PG_U_SPACE_SEPARATOR = 12,
+	PG_U_LINE_SEPARATOR = 13,
+	PG_U_PARAGRAPH_SEPARATOR = 14,
+	PG_U_CONTROL_CHAR = 15,
+	PG_U_FORMAT_CHAR = 16,
+	PG_U_PRIVATE_USE_CHAR = 17,
+	PG_U_SURROGATE = 18,
+	PG_U_DASH_PUNCTUATION = 19,
+	PG_U_START_PUNCTUATION = 20,
+	PG_U_END_PUNCTUATION = 21,
+	PG_U_CONNECTOR_PUNCTUATION = 22,
+	PG_U_OTHER_PUNCTUATION = 23,
+	PG_U_MATH_SYMBOL = 24,
+	PG_U_CURRENCY_SYMBOL = 25,
+	PG_U_MODIFIER_SYMBOL = 26,
+	PG_U_OTHER_SYMBOL = 27,
+	PG_U_INITIAL_PUNCTUATION = 28,
+	PG_U_FINAL_PUNCTUATION = 29
+} pg_unicode_category;
+
+extern pg_unicode_category unicode_category(pg_wchar ucs);
+extern const char *unicode_category_string(pg_unicode_category category);
+extern const char *unicode_category_short(pg_unicode_category category);
+
+#endif							/* UNICODE_CATEGORY_H */
diff --git a/src/include/common/unicode_category_table.h b/src/include/common/unicode_category_table.h
new file mode 100644
index 0000000000..3125cbdbf5
--- /dev/null
+++ b/src/include/common/unicode_category_table.h
@@ -0,0 +1,4039 @@
+/*-------------------------------------------------------------------------
+ *
+ * unicode_category_table.h
+ *	  Category table for Unicode character classification.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/common/unicode_category_table.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+/*
+ * File auto-generated by src/common/unicode/generate-unicode_category_table.pl,
+ * do not edit. There is deliberately not an #ifndef PG_UNICODE_CATEGORY_TABLE_H
+ * here.
+ */
+typedef struct
+{
+	uint32		first;			/* first codepoint in range (inclusive) */
+	uint32		last;			/* last codepoint in range (inclusive) */
+	uint8		category;		/* General Category (pg_unicode_category) */
+} pg_category_range;
+
+/* table of Unicode codepoint ranges and their categories */
+static const pg_category_range unicode_categories[4009] =
+{
+	{0x000000, 0x00001f, PG_U_CONTROL_CHAR},
+	{0x000020, 0x000020, PG_U_SPACE_SEPARATOR},
+	{0x000021, 0x000023, PG_U_OTHER_PUNCTUATION},
+	{0x000024, 0x000024, PG_U_CURRENCY_SYMBOL},
+	{0x000025, 0x000027, PG_U_OTHER_PUNCTUATION},
+	{0x000028, 0x000028, PG_U_START_PUNCTUATION},
+	{0x000029, 0x000029, PG_U_END_PUNCTUATION},
+	{0x00002a, 0x00002a, PG_U_OTHER_PUNCTUATION},
+	{0x00002b, 0x00002b, PG_U_MATH_SYMBOL},
+	{0x00002c, 0x00002c, PG_U_OTHER_PUNCTUATION},
+	{0x00002d, 0x00002d, PG_U_DASH_PUNCTUATION},
+	{0x00002e, 0x00002f, PG_U_OTHER_PUNCTUATION},
+	{0x000030, 0x000039, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00003a, 0x00003b, PG_U_OTHER_PUNCTUATION},
+	{0x00003c, 0x00003e, PG_U_MATH_SYMBOL},
+	{0x00003f, 0x000040, PG_U_OTHER_PUNCTUATION},
+	{0x000041, 0x00005a, PG_U_UPPERCASE_LETTER},
+	{0x00005b, 0x00005b, PG_U_START_PUNCTUATION},
+	{0x00005c, 0x00005c, PG_U_OTHER_PUNCTUATION},
+	{0x00005d, 0x00005d, PG_U_END_PUNCTUATION},
+	{0x00005e, 0x00005e, PG_U_MODIFIER_SYMBOL},
+	{0x00005f, 0x00005f, PG_U_CONNECTOR_PUNCTUATION},
+	{0x000060, 0x000060, PG_U_MODIFIER_SYMBOL},
+	{0x000061, 0x00007a, PG_U_LOWERCASE_LETTER},
+	{0x00007b, 0x00007b, PG_U_START_PUNCTUATION},
+	{0x00007c, 0x00007c, PG_U_MATH_SYMBOL},
+	{0x00007d, 0x00007d, PG_U_END_PUNCTUATION},
+	{0x00007e, 0x00007e, PG_U_MATH_SYMBOL},
+	{0x00007f, 0x00009f, PG_U_CONTROL_CHAR},
+	{0x0000a0, 0x0000a0, PG_U_SPACE_SEPARATOR},
+	{0x0000a1, 0x0000a1, PG_U_OTHER_PUNCTUATION},
+	{0x0000a2, 0x0000a5, PG_U_CURRENCY_SYMBOL},
+	{0x0000a6, 0x0000a6, PG_U_OTHER_SYMBOL},
+	{0x0000a7, 0x0000a7, PG_U_OTHER_PUNCTUATION},
+	{0x0000a8, 0x0000a8, PG_U_MODIFIER_SYMBOL},
+	{0x0000a9, 0x0000a9, PG_U_OTHER_SYMBOL},
+	{0x0000aa, 0x0000aa, PG_U_OTHER_LETTER},
+	{0x0000ab, 0x0000ab, PG_U_INITIAL_PUNCTUATION},
+	{0x0000ac, 0x0000ac, PG_U_MATH_SYMBOL},
+	{0x0000ad, 0x0000ad, PG_U_FORMAT_CHAR},
+	{0x0000ae, 0x0000ae, PG_U_OTHER_SYMBOL},
+	{0x0000af, 0x0000af, PG_U_MODIFIER_SYMBOL},
+	{0x0000b0, 0x0000b0, PG_U_OTHER_SYMBOL},
+	{0x0000b1, 0x0000b1, PG_U_MATH_SYMBOL},
+	{0x0000b2, 0x0000b3, PG_U_OTHER_NUMBER},
+	{0x0000b4, 0x0000b4, PG_U_MODIFIER_SYMBOL},
+	{0x0000b5, 0x0000b5, PG_U_LOWERCASE_LETTER},
+	{0x0000b6, 0x0000b7, PG_U_OTHER_PUNCTUATION},
+	{0x0000b8, 0x0000b8, PG_U_MODIFIER_SYMBOL},
+	{0x0000b9, 0x0000b9, PG_U_OTHER_NUMBER},
+	{0x0000ba, 0x0000ba, PG_U_OTHER_LETTER},
+	{0x0000bb, 0x0000bb, PG_U_FINAL_PUNCTUATION},
+	{0x0000bc, 0x0000be, PG_U_OTHER_NUMBER},
+	{0x0000bf, 0x0000bf, PG_U_OTHER_PUNCTUATION},
+	{0x0000c0, 0x0000d6, PG_U_UPPERCASE_LETTER},
+	{0x0000d7, 0x0000d7, PG_U_MATH_SYMBOL},
+	{0x0000d8, 0x0000de, PG_U_UPPERCASE_LETTER},
+	{0x0000df, 0x0000f6, PG_U_LOWERCASE_LETTER},
+	{0x0000f7, 0x0000f7, PG_U_MATH_SYMBOL},
+	{0x0000f8, 0x0000ff, PG_U_LOWERCASE_LETTER},
+	{0x000100, 0x000100, PG_U_UPPERCASE_LETTER},
+	{0x000101, 0x000101, PG_U_LOWERCASE_LETTER},
+	{0x000102, 0x000102, PG_U_UPPERCASE_LETTER},
+	{0x000103, 0x000103, PG_U_LOWERCASE_LETTER},
+	{0x000104, 0x000104, PG_U_UPPERCASE_LETTER},
+	{0x000105, 0x000105, PG_U_LOWERCASE_LETTER},
+	{0x000106, 0x000106, PG_U_UPPERCASE_LETTER},
+	{0x000107, 0x000107, PG_U_LOWERCASE_LETTER},
+	{0x000108, 0x000108, PG_U_UPPERCASE_LETTER},
+	{0x000109, 0x000109, PG_U_LOWERCASE_LETTER},
+	{0x00010a, 0x00010a, PG_U_UPPERCASE_LETTER},
+	{0x00010b, 0x00010b, PG_U_LOWERCASE_LETTER},
+	{0x00010c, 0x00010c, PG_U_UPPERCASE_LETTER},
+	{0x00010d, 0x00010d, PG_U_LOWERCASE_LETTER},
+	{0x00010e, 0x00010e, PG_U_UPPERCASE_LETTER},
+	{0x00010f, 0x00010f, PG_U_LOWERCASE_LETTER},
+	{0x000110, 0x000110, PG_U_UPPERCASE_LETTER},
+	{0x000111, 0x000111, PG_U_LOWERCASE_LETTER},
+	{0x000112, 0x000112, PG_U_UPPERCASE_LETTER},
+	{0x000113, 0x000113, PG_U_LOWERCASE_LETTER},
+	{0x000114, 0x000114, PG_U_UPPERCASE_LETTER},
+	{0x000115, 0x000115, PG_U_LOWERCASE_LETTER},
+	{0x000116, 0x000116, PG_U_UPPERCASE_LETTER},
+	{0x000117, 0x000117, PG_U_LOWERCASE_LETTER},
+	{0x000118, 0x000118, PG_U_UPPERCASE_LETTER},
+	{0x000119, 0x000119, PG_U_LOWERCASE_LETTER},
+	{0x00011a, 0x00011a, PG_U_UPPERCASE_LETTER},
+	{0x00011b, 0x00011b, PG_U_LOWERCASE_LETTER},
+	{0x00011c, 0x00011c, PG_U_UPPERCASE_LETTER},
+	{0x00011d, 0x00011d, PG_U_LOWERCASE_LETTER},
+	{0x00011e, 0x00011e, PG_U_UPPERCASE_LETTER},
+	{0x00011f, 0x00011f, PG_U_LOWERCASE_LETTER},
+	{0x000120, 0x000120, PG_U_UPPERCASE_LETTER},
+	{0x000121, 0x000121, PG_U_LOWERCASE_LETTER},
+	{0x000122, 0x000122, PG_U_UPPERCASE_LETTER},
+	{0x000123, 0x000123, PG_U_LOWERCASE_LETTER},
+	{0x000124, 0x000124, PG_U_UPPERCASE_LETTER},
+	{0x000125, 0x000125, PG_U_LOWERCASE_LETTER},
+	{0x000126, 0x000126, PG_U_UPPERCASE_LETTER},
+	{0x000127, 0x000127, PG_U_LOWERCASE_LETTER},
+	{0x000128, 0x000128, PG_U_UPPERCASE_LETTER},
+	{0x000129, 0x000129, PG_U_LOWERCASE_LETTER},
+	{0x00012a, 0x00012a, PG_U_UPPERCASE_LETTER},
+	{0x00012b, 0x00012b, PG_U_LOWERCASE_LETTER},
+	{0x00012c, 0x00012c, PG_U_UPPERCASE_LETTER},
+	{0x00012d, 0x00012d, PG_U_LOWERCASE_LETTER},
+	{0x00012e, 0x00012e, PG_U_UPPERCASE_LETTER},
+	{0x00012f, 0x00012f, PG_U_LOWERCASE_LETTER},
+	{0x000130, 0x000130, PG_U_UPPERCASE_LETTER},
+	{0x000131, 0x000131, PG_U_LOWERCASE_LETTER},
+	{0x000132, 0x000132, PG_U_UPPERCASE_LETTER},
+	{0x000133, 0x000133, PG_U_LOWERCASE_LETTER},
+	{0x000134, 0x000134, PG_U_UPPERCASE_LETTER},
+	{0x000135, 0x000135, PG_U_LOWERCASE_LETTER},
+	{0x000136, 0x000136, PG_U_UPPERCASE_LETTER},
+	{0x000137, 0x000138, PG_U_LOWERCASE_LETTER},
+	{0x000139, 0x000139, PG_U_UPPERCASE_LETTER},
+	{0x00013a, 0x00013a, PG_U_LOWERCASE_LETTER},
+	{0x00013b, 0x00013b, PG_U_UPPERCASE_LETTER},
+	{0x00013c, 0x00013c, PG_U_LOWERCASE_LETTER},
+	{0x00013d, 0x00013d, PG_U_UPPERCASE_LETTER},
+	{0x00013e, 0x00013e, PG_U_LOWERCASE_LETTER},
+	{0x00013f, 0x00013f, PG_U_UPPERCASE_LETTER},
+	{0x000140, 0x000140, PG_U_LOWERCASE_LETTER},
+	{0x000141, 0x000141, PG_U_UPPERCASE_LETTER},
+	{0x000142, 0x000142, PG_U_LOWERCASE_LETTER},
+	{0x000143, 0x000143, PG_U_UPPERCASE_LETTER},
+	{0x000144, 0x000144, PG_U_LOWERCASE_LETTER},
+	{0x000145, 0x000145, PG_U_UPPERCASE_LETTER},
+	{0x000146, 0x000146, PG_U_LOWERCASE_LETTER},
+	{0x000147, 0x000147, PG_U_UPPERCASE_LETTER},
+	{0x000148, 0x000149, PG_U_LOWERCASE_LETTER},
+	{0x00014a, 0x00014a, PG_U_UPPERCASE_LETTER},
+	{0x00014b, 0x00014b, PG_U_LOWERCASE_LETTER},
+	{0x00014c, 0x00014c, PG_U_UPPERCASE_LETTER},
+	{0x00014d, 0x00014d, PG_U_LOWERCASE_LETTER},
+	{0x00014e, 0x00014e, PG_U_UPPERCASE_LETTER},
+	{0x00014f, 0x00014f, PG_U_LOWERCASE_LETTER},
+	{0x000150, 0x000150, PG_U_UPPERCASE_LETTER},
+	{0x000151, 0x000151, PG_U_LOWERCASE_LETTER},
+	{0x000152, 0x000152, PG_U_UPPERCASE_LETTER},
+	{0x000153, 0x000153, PG_U_LOWERCASE_LETTER},
+	{0x000154, 0x000154, PG_U_UPPERCASE_LETTER},
+	{0x000155, 0x000155, PG_U_LOWERCASE_LETTER},
+	{0x000156, 0x000156, PG_U_UPPERCASE_LETTER},
+	{0x000157, 0x000157, PG_U_LOWERCASE_LETTER},
+	{0x000158, 0x000158, PG_U_UPPERCASE_LETTER},
+	{0x000159, 0x000159, PG_U_LOWERCASE_LETTER},
+	{0x00015a, 0x00015a, PG_U_UPPERCASE_LETTER},
+	{0x00015b, 0x00015b, PG_U_LOWERCASE_LETTER},
+	{0x00015c, 0x00015c, PG_U_UPPERCASE_LETTER},
+	{0x00015d, 0x00015d, PG_U_LOWERCASE_LETTER},
+	{0x00015e, 0x00015e, PG_U_UPPERCASE_LETTER},
+	{0x00015f, 0x00015f, PG_U_LOWERCASE_LETTER},
+	{0x000160, 0x000160, PG_U_UPPERCASE_LETTER},
+	{0x000161, 0x000161, PG_U_LOWERCASE_LETTER},
+	{0x000162, 0x000162, PG_U_UPPERCASE_LETTER},
+	{0x000163, 0x000163, PG_U_LOWERCASE_LETTER},
+	{0x000164, 0x000164, PG_U_UPPERCASE_LETTER},
+	{0x000165, 0x000165, PG_U_LOWERCASE_LETTER},
+	{0x000166, 0x000166, PG_U_UPPERCASE_LETTER},
+	{0x000167, 0x000167, PG_U_LOWERCASE_LETTER},
+	{0x000168, 0x000168, PG_U_UPPERCASE_LETTER},
+	{0x000169, 0x000169, PG_U_LOWERCASE_LETTER},
+	{0x00016a, 0x00016a, PG_U_UPPERCASE_LETTER},
+	{0x00016b, 0x00016b, PG_U_LOWERCASE_LETTER},
+	{0x00016c, 0x00016c, PG_U_UPPERCASE_LETTER},
+	{0x00016d, 0x00016d, PG_U_LOWERCASE_LETTER},
+	{0x00016e, 0x00016e, PG_U_UPPERCASE_LETTER},
+	{0x00016f, 0x00016f, PG_U_LOWERCASE_LETTER},
+	{0x000170, 0x000170, PG_U_UPPERCASE_LETTER},
+	{0x000171, 0x000171, PG_U_LOWERCASE_LETTER},
+	{0x000172, 0x000172, PG_U_UPPERCASE_LETTER},
+	{0x000173, 0x000173, PG_U_LOWERCASE_LETTER},
+	{0x000174, 0x000174, PG_U_UPPERCASE_LETTER},
+	{0x000175, 0x000175, PG_U_LOWERCASE_LETTER},
+	{0x000176, 0x000176, PG_U_UPPERCASE_LETTER},
+	{0x000177, 0x000177, PG_U_LOWERCASE_LETTER},
+	{0x000178, 0x000179, PG_U_UPPERCASE_LETTER},
+	{0x00017a, 0x00017a, PG_U_LOWERCASE_LETTER},
+	{0x00017b, 0x00017b, PG_U_UPPERCASE_LETTER},
+	{0x00017c, 0x00017c, PG_U_LOWERCASE_LETTER},
+	{0x00017d, 0x00017d, PG_U_UPPERCASE_LETTER},
+	{0x00017e, 0x000180, PG_U_LOWERCASE_LETTER},
+	{0x000181, 0x000182, PG_U_UPPERCASE_LETTER},
+	{0x000183, 0x000183, PG_U_LOWERCASE_LETTER},
+	{0x000184, 0x000184, PG_U_UPPERCASE_LETTER},
+	{0x000185, 0x000185, PG_U_LOWERCASE_LETTER},
+	{0x000186, 0x000187, PG_U_UPPERCASE_LETTER},
+	{0x000188, 0x000188, PG_U_LOWERCASE_LETTER},
+	{0x000189, 0x00018b, PG_U_UPPERCASE_LETTER},
+	{0x00018c, 0x00018d, PG_U_LOWERCASE_LETTER},
+	{0x00018e, 0x000191, PG_U_UPPERCASE_LETTER},
+	{0x000192, 0x000192, PG_U_LOWERCASE_LETTER},
+	{0x000193, 0x000194, PG_U_UPPERCASE_LETTER},
+	{0x000195, 0x000195, PG_U_LOWERCASE_LETTER},
+	{0x000196, 0x000198, PG_U_UPPERCASE_LETTER},
+	{0x000199, 0x00019b, PG_U_LOWERCASE_LETTER},
+	{0x00019c, 0x00019d, PG_U_UPPERCASE_LETTER},
+	{0x00019e, 0x00019e, PG_U_LOWERCASE_LETTER},
+	{0x00019f, 0x0001a0, PG_U_UPPERCASE_LETTER},
+	{0x0001a1, 0x0001a1, PG_U_LOWERCASE_LETTER},
+	{0x0001a2, 0x0001a2, PG_U_UPPERCASE_LETTER},
+	{0x0001a3, 0x0001a3, PG_U_LOWERCASE_LETTER},
+	{0x0001a4, 0x0001a4, PG_U_UPPERCASE_LETTER},
+	{0x0001a5, 0x0001a5, PG_U_LOWERCASE_LETTER},
+	{0x0001a6, 0x0001a7, PG_U_UPPERCASE_LETTER},
+	{0x0001a8, 0x0001a8, PG_U_LOWERCASE_LETTER},
+	{0x0001a9, 0x0001a9, PG_U_UPPERCASE_LETTER},
+	{0x0001aa, 0x0001ab, PG_U_LOWERCASE_LETTER},
+	{0x0001ac, 0x0001ac, PG_U_UPPERCASE_LETTER},
+	{0x0001ad, 0x0001ad, PG_U_LOWERCASE_LETTER},
+	{0x0001ae, 0x0001af, PG_U_UPPERCASE_LETTER},
+	{0x0001b0, 0x0001b0, PG_U_LOWERCASE_LETTER},
+	{0x0001b1, 0x0001b3, PG_U_UPPERCASE_LETTER},
+	{0x0001b4, 0x0001b4, PG_U_LOWERCASE_LETTER},
+	{0x0001b5, 0x0001b5, PG_U_UPPERCASE_LETTER},
+	{0x0001b6, 0x0001b6, PG_U_LOWERCASE_LETTER},
+	{0x0001b7, 0x0001b8, PG_U_UPPERCASE_LETTER},
+	{0x0001b9, 0x0001ba, PG_U_LOWERCASE_LETTER},
+	{0x0001bb, 0x0001bb, PG_U_OTHER_LETTER},
+	{0x0001bc, 0x0001bc, PG_U_UPPERCASE_LETTER},
+	{0x0001bd, 0x0001bf, PG_U_LOWERCASE_LETTER},
+	{0x0001c0, 0x0001c3, PG_U_OTHER_LETTER},
+	{0x0001c4, 0x0001c4, PG_U_UPPERCASE_LETTER},
+	{0x0001c5, 0x0001c5, PG_U_TITLECASE_LETTER},
+	{0x0001c6, 0x0001c6, PG_U_LOWERCASE_LETTER},
+	{0x0001c7, 0x0001c7, PG_U_UPPERCASE_LETTER},
+	{0x0001c8, 0x0001c8, PG_U_TITLECASE_LETTER},
+	{0x0001c9, 0x0001c9, PG_U_LOWERCASE_LETTER},
+	{0x0001ca, 0x0001ca, PG_U_UPPERCASE_LETTER},
+	{0x0001cb, 0x0001cb, PG_U_TITLECASE_LETTER},
+	{0x0001cc, 0x0001cc, PG_U_LOWERCASE_LETTER},
+	{0x0001cd, 0x0001cd, PG_U_UPPERCASE_LETTER},
+	{0x0001ce, 0x0001ce, PG_U_LOWERCASE_LETTER},
+	{0x0001cf, 0x0001cf, PG_U_UPPERCASE_LETTER},
+	{0x0001d0, 0x0001d0, PG_U_LOWERCASE_LETTER},
+	{0x0001d1, 0x0001d1, PG_U_UPPERCASE_LETTER},
+	{0x0001d2, 0x0001d2, PG_U_LOWERCASE_LETTER},
+	{0x0001d3, 0x0001d3, PG_U_UPPERCASE_LETTER},
+	{0x0001d4, 0x0001d4, PG_U_LOWERCASE_LETTER},
+	{0x0001d5, 0x0001d5, PG_U_UPPERCASE_LETTER},
+	{0x0001d6, 0x0001d6, PG_U_LOWERCASE_LETTER},
+	{0x0001d7, 0x0001d7, PG_U_UPPERCASE_LETTER},
+	{0x0001d8, 0x0001d8, PG_U_LOWERCASE_LETTER},
+	{0x0001d9, 0x0001d9, PG_U_UPPERCASE_LETTER},
+	{0x0001da, 0x0001da, PG_U_LOWERCASE_LETTER},
+	{0x0001db, 0x0001db, PG_U_UPPERCASE_LETTER},
+	{0x0001dc, 0x0001dd, PG_U_LOWERCASE_LETTER},
+	{0x0001de, 0x0001de, PG_U_UPPERCASE_LETTER},
+	{0x0001df, 0x0001df, PG_U_LOWERCASE_LETTER},
+	{0x0001e0, 0x0001e0, PG_U_UPPERCASE_LETTER},
+	{0x0001e1, 0x0001e1, PG_U_LOWERCASE_LETTER},
+	{0x0001e2, 0x0001e2, PG_U_UPPERCASE_LETTER},
+	{0x0001e3, 0x0001e3, PG_U_LOWERCASE_LETTER},
+	{0x0001e4, 0x0001e4, PG_U_UPPERCASE_LETTER},
+	{0x0001e5, 0x0001e5, PG_U_LOWERCASE_LETTER},
+	{0x0001e6, 0x0001e6, PG_U_UPPERCASE_LETTER},
+	{0x0001e7, 0x0001e7, PG_U_LOWERCASE_LETTER},
+	{0x0001e8, 0x0001e8, PG_U_UPPERCASE_LETTER},
+	{0x0001e9, 0x0001e9, PG_U_LOWERCASE_LETTER},
+	{0x0001ea, 0x0001ea, PG_U_UPPERCASE_LETTER},
+	{0x0001eb, 0x0001eb, PG_U_LOWERCASE_LETTER},
+	{0x0001ec, 0x0001ec, PG_U_UPPERCASE_LETTER},
+	{0x0001ed, 0x0001ed, PG_U_LOWERCASE_LETTER},
+	{0x0001ee, 0x0001ee, PG_U_UPPERCASE_LETTER},
+	{0x0001ef, 0x0001f0, PG_U_LOWERCASE_LETTER},
+	{0x0001f1, 0x0001f1, PG_U_UPPERCASE_LETTER},
+	{0x0001f2, 0x0001f2, PG_U_TITLECASE_LETTER},
+	{0x0001f3, 0x0001f3, PG_U_LOWERCASE_LETTER},
+	{0x0001f4, 0x0001f4, PG_U_UPPERCASE_LETTER},
+	{0x0001f5, 0x0001f5, PG_U_LOWERCASE_LETTER},
+	{0x0001f6, 0x0001f8, PG_U_UPPERCASE_LETTER},
+	{0x0001f9, 0x0001f9, PG_U_LOWERCASE_LETTER},
+	{0x0001fa, 0x0001fa, PG_U_UPPERCASE_LETTER},
+	{0x0001fb, 0x0001fb, PG_U_LOWERCASE_LETTER},
+	{0x0001fc, 0x0001fc, PG_U_UPPERCASE_LETTER},
+	{0x0001fd, 0x0001fd, PG_U_LOWERCASE_LETTER},
+	{0x0001fe, 0x0001fe, PG_U_UPPERCASE_LETTER},
+	{0x0001ff, 0x0001ff, PG_U_LOWERCASE_LETTER},
+	{0x000200, 0x000200, PG_U_UPPERCASE_LETTER},
+	{0x000201, 0x000201, PG_U_LOWERCASE_LETTER},
+	{0x000202, 0x000202, PG_U_UPPERCASE_LETTER},
+	{0x000203, 0x000203, PG_U_LOWERCASE_LETTER},
+	{0x000204, 0x000204, PG_U_UPPERCASE_LETTER},
+	{0x000205, 0x000205, PG_U_LOWERCASE_LETTER},
+	{0x000206, 0x000206, PG_U_UPPERCASE_LETTER},
+	{0x000207, 0x000207, PG_U_LOWERCASE_LETTER},
+	{0x000208, 0x000208, PG_U_UPPERCASE_LETTER},
+	{0x000209, 0x000209, PG_U_LOWERCASE_LETTER},
+	{0x00020a, 0x00020a, PG_U_UPPERCASE_LETTER},
+	{0x00020b, 0x00020b, PG_U_LOWERCASE_LETTER},
+	{0x00020c, 0x00020c, PG_U_UPPERCASE_LETTER},
+	{0x00020d, 0x00020d, PG_U_LOWERCASE_LETTER},
+	{0x00020e, 0x00020e, PG_U_UPPERCASE_LETTER},
+	{0x00020f, 0x00020f, PG_U_LOWERCASE_LETTER},
+	{0x000210, 0x000210, PG_U_UPPERCASE_LETTER},
+	{0x000211, 0x000211, PG_U_LOWERCASE_LETTER},
+	{0x000212, 0x000212, PG_U_UPPERCASE_LETTER},
+	{0x000213, 0x000213, PG_U_LOWERCASE_LETTER},
+	{0x000214, 0x000214, PG_U_UPPERCASE_LETTER},
+	{0x000215, 0x000215, PG_U_LOWERCASE_LETTER},
+	{0x000216, 0x000216, PG_U_UPPERCASE_LETTER},
+	{0x000217, 0x000217, PG_U_LOWERCASE_LETTER},
+	{0x000218, 0x000218, PG_U_UPPERCASE_LETTER},
+	{0x000219, 0x000219, PG_U_LOWERCASE_LETTER},
+	{0x00021a, 0x00021a, PG_U_UPPERCASE_LETTER},
+	{0x00021b, 0x00021b, PG_U_LOWERCASE_LETTER},
+	{0x00021c, 0x00021c, PG_U_UPPERCASE_LETTER},
+	{0x00021d, 0x00021d, PG_U_LOWERCASE_LETTER},
+	{0x00021e, 0x00021e, PG_U_UPPERCASE_LETTER},
+	{0x00021f, 0x00021f, PG_U_LOWERCASE_LETTER},
+	{0x000220, 0x000220, PG_U_UPPERCASE_LETTER},
+	{0x000221, 0x000221, PG_U_LOWERCASE_LETTER},
+	{0x000222, 0x000222, PG_U_UPPERCASE_LETTER},
+	{0x000223, 0x000223, PG_U_LOWERCASE_LETTER},
+	{0x000224, 0x000224, PG_U_UPPERCASE_LETTER},
+	{0x000225, 0x000225, PG_U_LOWERCASE_LETTER},
+	{0x000226, 0x000226, PG_U_UPPERCASE_LETTER},
+	{0x000227, 0x000227, PG_U_LOWERCASE_LETTER},
+	{0x000228, 0x000228, PG_U_UPPERCASE_LETTER},
+	{0x000229, 0x000229, PG_U_LOWERCASE_LETTER},
+	{0x00022a, 0x00022a, PG_U_UPPERCASE_LETTER},
+	{0x00022b, 0x00022b, PG_U_LOWERCASE_LETTER},
+	{0x00022c, 0x00022c, PG_U_UPPERCASE_LETTER},
+	{0x00022d, 0x00022d, PG_U_LOWERCASE_LETTER},
+	{0x00022e, 0x00022e, PG_U_UPPERCASE_LETTER},
+	{0x00022f, 0x00022f, PG_U_LOWERCASE_LETTER},
+	{0x000230, 0x000230, PG_U_UPPERCASE_LETTER},
+	{0x000231, 0x000231, PG_U_LOWERCASE_LETTER},
+	{0x000232, 0x000232, PG_U_UPPERCASE_LETTER},
+	{0x000233, 0x000239, PG_U_LOWERCASE_LETTER},
+	{0x00023a, 0x00023b, PG_U_UPPERCASE_LETTER},
+	{0x00023c, 0x00023c, PG_U_LOWERCASE_LETTER},
+	{0x00023d, 0x00023e, PG_U_UPPERCASE_LETTER},
+	{0x00023f, 0x000240, PG_U_LOWERCASE_LETTER},
+	{0x000241, 0x000241, PG_U_UPPERCASE_LETTER},
+	{0x000242, 0x000242, PG_U_LOWERCASE_LETTER},
+	{0x000243, 0x000246, PG_U_UPPERCASE_LETTER},
+	{0x000247, 0x000247, PG_U_LOWERCASE_LETTER},
+	{0x000248, 0x000248, PG_U_UPPERCASE_LETTER},
+	{0x000249, 0x000249, PG_U_LOWERCASE_LETTER},
+	{0x00024a, 0x00024a, PG_U_UPPERCASE_LETTER},
+	{0x00024b, 0x00024b, PG_U_LOWERCASE_LETTER},
+	{0x00024c, 0x00024c, PG_U_UPPERCASE_LETTER},
+	{0x00024d, 0x00024d, PG_U_LOWERCASE_LETTER},
+	{0x00024e, 0x00024e, PG_U_UPPERCASE_LETTER},
+	{0x00024f, 0x000293, PG_U_LOWERCASE_LETTER},
+	{0x000294, 0x000294, PG_U_OTHER_LETTER},
+	{0x000295, 0x0002af, PG_U_LOWERCASE_LETTER},
+	{0x0002b0, 0x0002c1, PG_U_MODIFIER_LETTER},
+	{0x0002c2, 0x0002c5, PG_U_MODIFIER_SYMBOL},
+	{0x0002c6, 0x0002d1, PG_U_MODIFIER_LETTER},
+	{0x0002d2, 0x0002df, PG_U_MODIFIER_SYMBOL},
+	{0x0002e0, 0x0002e4, PG_U_MODIFIER_LETTER},
+	{0x0002e5, 0x0002eb, PG_U_MODIFIER_SYMBOL},
+	{0x0002ec, 0x0002ec, PG_U_MODIFIER_LETTER},
+	{0x0002ed, 0x0002ed, PG_U_MODIFIER_SYMBOL},
+	{0x0002ee, 0x0002ee, PG_U_MODIFIER_LETTER},
+	{0x0002ef, 0x0002ff, PG_U_MODIFIER_SYMBOL},
+	{0x000300, 0x00036f, PG_U_NON_SPACING_MARK},
+	{0x000370, 0x000370, PG_U_UPPERCASE_LETTER},
+	{0x000371, 0x000371, PG_U_LOWERCASE_LETTER},
+	{0x000372, 0x000372, PG_U_UPPERCASE_LETTER},
+	{0x000373, 0x000373, PG_U_LOWERCASE_LETTER},
+	{0x000374, 0x000374, PG_U_MODIFIER_LETTER},
+	{0x000375, 0x000375, PG_U_MODIFIER_SYMBOL},
+	{0x000376, 0x000376, PG_U_UPPERCASE_LETTER},
+	{0x000377, 0x000377, PG_U_LOWERCASE_LETTER},
+	{0x000378, 0x000379, PG_U_UNASSIGNED},
+	{0x00037a, 0x00037a, PG_U_MODIFIER_LETTER},
+	{0x00037b, 0x00037d, PG_U_LOWERCASE_LETTER},
+	{0x00037e, 0x00037e, PG_U_OTHER_PUNCTUATION},
+	{0x00037f, 0x00037f, PG_U_UPPERCASE_LETTER},
+	{0x000380, 0x000383, PG_U_UNASSIGNED},
+	{0x000384, 0x000385, PG_U_MODIFIER_SYMBOL},
+	{0x000386, 0x000386, PG_U_UPPERCASE_LETTER},
+	{0x000387, 0x000387, PG_U_OTHER_PUNCTUATION},
+	{0x000388, 0x00038a, PG_U_UPPERCASE_LETTER},
+	{0x00038b, 0x00038b, PG_U_UNASSIGNED},
+	{0x00038c, 0x00038c, PG_U_UPPERCASE_LETTER},
+	{0x00038d, 0x00038d, PG_U_UNASSIGNED},
+	{0x00038e, 0x00038f, PG_U_UPPERCASE_LETTER},
+	{0x000390, 0x000390, PG_U_LOWERCASE_LETTER},
+	{0x000391, 0x0003a1, PG_U_UPPERCASE_LETTER},
+	{0x0003a2, 0x0003a2, PG_U_UNASSIGNED},
+	{0x0003a3, 0x0003ab, PG_U_UPPERCASE_LETTER},
+	{0x0003ac, 0x0003ce, PG_U_LOWERCASE_LETTER},
+	{0x0003cf, 0x0003cf, PG_U_UPPERCASE_LETTER},
+	{0x0003d0, 0x0003d1, PG_U_LOWERCASE_LETTER},
+	{0x0003d2, 0x0003d4, PG_U_UPPERCASE_LETTER},
+	{0x0003d5, 0x0003d7, PG_U_LOWERCASE_LETTER},
+	{0x0003d8, 0x0003d8, PG_U_UPPERCASE_LETTER},
+	{0x0003d9, 0x0003d9, PG_U_LOWERCASE_LETTER},
+	{0x0003da, 0x0003da, PG_U_UPPERCASE_LETTER},
+	{0x0003db, 0x0003db, PG_U_LOWERCASE_LETTER},
+	{0x0003dc, 0x0003dc, PG_U_UPPERCASE_LETTER},
+	{0x0003dd, 0x0003dd, PG_U_LOWERCASE_LETTER},
+	{0x0003de, 0x0003de, PG_U_UPPERCASE_LETTER},
+	{0x0003df, 0x0003df, PG_U_LOWERCASE_LETTER},
+	{0x0003e0, 0x0003e0, PG_U_UPPERCASE_LETTER},
+	{0x0003e1, 0x0003e1, PG_U_LOWERCASE_LETTER},
+	{0x0003e2, 0x0003e2, PG_U_UPPERCASE_LETTER},
+	{0x0003e3, 0x0003e3, PG_U_LOWERCASE_LETTER},
+	{0x0003e4, 0x0003e4, PG_U_UPPERCASE_LETTER},
+	{0x0003e5, 0x0003e5, PG_U_LOWERCASE_LETTER},
+	{0x0003e6, 0x0003e6, PG_U_UPPERCASE_LETTER},
+	{0x0003e7, 0x0003e7, PG_U_LOWERCASE_LETTER},
+	{0x0003e8, 0x0003e8, PG_U_UPPERCASE_LETTER},
+	{0x0003e9, 0x0003e9, PG_U_LOWERCASE_LETTER},
+	{0x0003ea, 0x0003ea, PG_U_UPPERCASE_LETTER},
+	{0x0003eb, 0x0003eb, PG_U_LOWERCASE_LETTER},
+	{0x0003ec, 0x0003ec, PG_U_UPPERCASE_LETTER},
+	{0x0003ed, 0x0003ed, PG_U_LOWERCASE_LETTER},
+	{0x0003ee, 0x0003ee, PG_U_UPPERCASE_LETTER},
+	{0x0003ef, 0x0003f3, PG_U_LOWERCASE_LETTER},
+	{0x0003f4, 0x0003f4, PG_U_UPPERCASE_LETTER},
+	{0x0003f5, 0x0003f5, PG_U_LOWERCASE_LETTER},
+	{0x0003f6, 0x0003f6, PG_U_MATH_SYMBOL},
+	{0x0003f7, 0x0003f7, PG_U_UPPERCASE_LETTER},
+	{0x0003f8, 0x0003f8, PG_U_LOWERCASE_LETTER},
+	{0x0003f9, 0x0003fa, PG_U_UPPERCASE_LETTER},
+	{0x0003fb, 0x0003fc, PG_U_LOWERCASE_LETTER},
+	{0x0003fd, 0x00042f, PG_U_UPPERCASE_LETTER},
+	{0x000430, 0x00045f, PG_U_LOWERCASE_LETTER},
+	{0x000460, 0x000460, PG_U_UPPERCASE_LETTER},
+	{0x000461, 0x000461, PG_U_LOWERCASE_LETTER},
+	{0x000462, 0x000462, PG_U_UPPERCASE_LETTER},
+	{0x000463, 0x000463, PG_U_LOWERCASE_LETTER},
+	{0x000464, 0x000464, PG_U_UPPERCASE_LETTER},
+	{0x000465, 0x000465, PG_U_LOWERCASE_LETTER},
+	{0x000466, 0x000466, PG_U_UPPERCASE_LETTER},
+	{0x000467, 0x000467, PG_U_LOWERCASE_LETTER},
+	{0x000468, 0x000468, PG_U_UPPERCASE_LETTER},
+	{0x000469, 0x000469, PG_U_LOWERCASE_LETTER},
+	{0x00046a, 0x00046a, PG_U_UPPERCASE_LETTER},
+	{0x00046b, 0x00046b, PG_U_LOWERCASE_LETTER},
+	{0x00046c, 0x00046c, PG_U_UPPERCASE_LETTER},
+	{0x00046d, 0x00046d, PG_U_LOWERCASE_LETTER},
+	{0x00046e, 0x00046e, PG_U_UPPERCASE_LETTER},
+	{0x00046f, 0x00046f, PG_U_LOWERCASE_LETTER},
+	{0x000470, 0x000470, PG_U_UPPERCASE_LETTER},
+	{0x000471, 0x000471, PG_U_LOWERCASE_LETTER},
+	{0x000472, 0x000472, PG_U_UPPERCASE_LETTER},
+	{0x000473, 0x000473, PG_U_LOWERCASE_LETTER},
+	{0x000474, 0x000474, PG_U_UPPERCASE_LETTER},
+	{0x000475, 0x000475, PG_U_LOWERCASE_LETTER},
+	{0x000476, 0x000476, PG_U_UPPERCASE_LETTER},
+	{0x000477, 0x000477, PG_U_LOWERCASE_LETTER},
+	{0x000478, 0x000478, PG_U_UPPERCASE_LETTER},
+	{0x000479, 0x000479, PG_U_LOWERCASE_LETTER},
+	{0x00047a, 0x00047a, PG_U_UPPERCASE_LETTER},
+	{0x00047b, 0x00047b, PG_U_LOWERCASE_LETTER},
+	{0x00047c, 0x00047c, PG_U_UPPERCASE_LETTER},
+	{0x00047d, 0x00047d, PG_U_LOWERCASE_LETTER},
+	{0x00047e, 0x00047e, PG_U_UPPERCASE_LETTER},
+	{0x00047f, 0x00047f, PG_U_LOWERCASE_LETTER},
+	{0x000480, 0x000480, PG_U_UPPERCASE_LETTER},
+	{0x000481, 0x000481, PG_U_LOWERCASE_LETTER},
+	{0x000482, 0x000482, PG_U_OTHER_SYMBOL},
+	{0x000483, 0x000487, PG_U_NON_SPACING_MARK},
+	{0x000488, 0x000489, PG_U_ENCLOSING_MARK},
+	{0x00048a, 0x00048a, PG_U_UPPERCASE_LETTER},
+	{0x00048b, 0x00048b, PG_U_LOWERCASE_LETTER},
+	{0x00048c, 0x00048c, PG_U_UPPERCASE_LETTER},
+	{0x00048d, 0x00048d, PG_U_LOWERCASE_LETTER},
+	{0x00048e, 0x00048e, PG_U_UPPERCASE_LETTER},
+	{0x00048f, 0x00048f, PG_U_LOWERCASE_LETTER},
+	{0x000490, 0x000490, PG_U_UPPERCASE_LETTER},
+	{0x000491, 0x000491, PG_U_LOWERCASE_LETTER},
+	{0x000492, 0x000492, PG_U_UPPERCASE_LETTER},
+	{0x000493, 0x000493, PG_U_LOWERCASE_LETTER},
+	{0x000494, 0x000494, PG_U_UPPERCASE_LETTER},
+	{0x000495, 0x000495, PG_U_LOWERCASE_LETTER},
+	{0x000496, 0x000496, PG_U_UPPERCASE_LETTER},
+	{0x000497, 0x000497, PG_U_LOWERCASE_LETTER},
+	{0x000498, 0x000498, PG_U_UPPERCASE_LETTER},
+	{0x000499, 0x000499, PG_U_LOWERCASE_LETTER},
+	{0x00049a, 0x00049a, PG_U_UPPERCASE_LETTER},
+	{0x00049b, 0x00049b, PG_U_LOWERCASE_LETTER},
+	{0x00049c, 0x00049c, PG_U_UPPERCASE_LETTER},
+	{0x00049d, 0x00049d, PG_U_LOWERCASE_LETTER},
+	{0x00049e, 0x00049e, PG_U_UPPERCASE_LETTER},
+	{0x00049f, 0x00049f, PG_U_LOWERCASE_LETTER},
+	{0x0004a0, 0x0004a0, PG_U_UPPERCASE_LETTER},
+	{0x0004a1, 0x0004a1, PG_U_LOWERCASE_LETTER},
+	{0x0004a2, 0x0004a2, PG_U_UPPERCASE_LETTER},
+	{0x0004a3, 0x0004a3, PG_U_LOWERCASE_LETTER},
+	{0x0004a4, 0x0004a4, PG_U_UPPERCASE_LETTER},
+	{0x0004a5, 0x0004a5, PG_U_LOWERCASE_LETTER},
+	{0x0004a6, 0x0004a6, PG_U_UPPERCASE_LETTER},
+	{0x0004a7, 0x0004a7, PG_U_LOWERCASE_LETTER},
+	{0x0004a8, 0x0004a8, PG_U_UPPERCASE_LETTER},
+	{0x0004a9, 0x0004a9, PG_U_LOWERCASE_LETTER},
+	{0x0004aa, 0x0004aa, PG_U_UPPERCASE_LETTER},
+	{0x0004ab, 0x0004ab, PG_U_LOWERCASE_LETTER},
+	{0x0004ac, 0x0004ac, PG_U_UPPERCASE_LETTER},
+	{0x0004ad, 0x0004ad, PG_U_LOWERCASE_LETTER},
+	{0x0004ae, 0x0004ae, PG_U_UPPERCASE_LETTER},
+	{0x0004af, 0x0004af, PG_U_LOWERCASE_LETTER},
+	{0x0004b0, 0x0004b0, PG_U_UPPERCASE_LETTER},
+	{0x0004b1, 0x0004b1, PG_U_LOWERCASE_LETTER},
+	{0x0004b2, 0x0004b2, PG_U_UPPERCASE_LETTER},
+	{0x0004b3, 0x0004b3, PG_U_LOWERCASE_LETTER},
+	{0x0004b4, 0x0004b4, PG_U_UPPERCASE_LETTER},
+	{0x0004b5, 0x0004b5, PG_U_LOWERCASE_LETTER},
+	{0x0004b6, 0x0004b6, PG_U_UPPERCASE_LETTER},
+	{0x0004b7, 0x0004b7, PG_U_LOWERCASE_LETTER},
+	{0x0004b8, 0x0004b8, PG_U_UPPERCASE_LETTER},
+	{0x0004b9, 0x0004b9, PG_U_LOWERCASE_LETTER},
+	{0x0004ba, 0x0004ba, PG_U_UPPERCASE_LETTER},
+	{0x0004bb, 0x0004bb, PG_U_LOWERCASE_LETTER},
+	{0x0004bc, 0x0004bc, PG_U_UPPERCASE_LETTER},
+	{0x0004bd, 0x0004bd, PG_U_LOWERCASE_LETTER},
+	{0x0004be, 0x0004be, PG_U_UPPERCASE_LETTER},
+	{0x0004bf, 0x0004bf, PG_U_LOWERCASE_LETTER},
+	{0x0004c0, 0x0004c1, PG_U_UPPERCASE_LETTER},
+	{0x0004c2, 0x0004c2, PG_U_LOWERCASE_LETTER},
+	{0x0004c3, 0x0004c3, PG_U_UPPERCASE_LETTER},
+	{0x0004c4, 0x0004c4, PG_U_LOWERCASE_LETTER},
+	{0x0004c5, 0x0004c5, PG_U_UPPERCASE_LETTER},
+	{0x0004c6, 0x0004c6, PG_U_LOWERCASE_LETTER},
+	{0x0004c7, 0x0004c7, PG_U_UPPERCASE_LETTER},
+	{0x0004c8, 0x0004c8, PG_U_LOWERCASE_LETTER},
+	{0x0004c9, 0x0004c9, PG_U_UPPERCASE_LETTER},
+	{0x0004ca, 0x0004ca, PG_U_LOWERCASE_LETTER},
+	{0x0004cb, 0x0004cb, PG_U_UPPERCASE_LETTER},
+	{0x0004cc, 0x0004cc, PG_U_LOWERCASE_LETTER},
+	{0x0004cd, 0x0004cd, PG_U_UPPERCASE_LETTER},
+	{0x0004ce, 0x0004cf, PG_U_LOWERCASE_LETTER},
+	{0x0004d0, 0x0004d0, PG_U_UPPERCASE_LETTER},
+	{0x0004d1, 0x0004d1, PG_U_LOWERCASE_LETTER},
+	{0x0004d2, 0x0004d2, PG_U_UPPERCASE_LETTER},
+	{0x0004d3, 0x0004d3, PG_U_LOWERCASE_LETTER},
+	{0x0004d4, 0x0004d4, PG_U_UPPERCASE_LETTER},
+	{0x0004d5, 0x0004d5, PG_U_LOWERCASE_LETTER},
+	{0x0004d6, 0x0004d6, PG_U_UPPERCASE_LETTER},
+	{0x0004d7, 0x0004d7, PG_U_LOWERCASE_LETTER},
+	{0x0004d8, 0x0004d8, PG_U_UPPERCASE_LETTER},
+	{0x0004d9, 0x0004d9, PG_U_LOWERCASE_LETTER},
+	{0x0004da, 0x0004da, PG_U_UPPERCASE_LETTER},
+	{0x0004db, 0x0004db, PG_U_LOWERCASE_LETTER},
+	{0x0004dc, 0x0004dc, PG_U_UPPERCASE_LETTER},
+	{0x0004dd, 0x0004dd, PG_U_LOWERCASE_LETTER},
+	{0x0004de, 0x0004de, PG_U_UPPERCASE_LETTER},
+	{0x0004df, 0x0004df, PG_U_LOWERCASE_LETTER},
+	{0x0004e0, 0x0004e0, PG_U_UPPERCASE_LETTER},
+	{0x0004e1, 0x0004e1, PG_U_LOWERCASE_LETTER},
+	{0x0004e2, 0x0004e2, PG_U_UPPERCASE_LETTER},
+	{0x0004e3, 0x0004e3, PG_U_LOWERCASE_LETTER},
+	{0x0004e4, 0x0004e4, PG_U_UPPERCASE_LETTER},
+	{0x0004e5, 0x0004e5, PG_U_LOWERCASE_LETTER},
+	{0x0004e6, 0x0004e6, PG_U_UPPERCASE_LETTER},
+	{0x0004e7, 0x0004e7, PG_U_LOWERCASE_LETTER},
+	{0x0004e8, 0x0004e8, PG_U_UPPERCASE_LETTER},
+	{0x0004e9, 0x0004e9, PG_U_LOWERCASE_LETTER},
+	{0x0004ea, 0x0004ea, PG_U_UPPERCASE_LETTER},
+	{0x0004eb, 0x0004eb, PG_U_LOWERCASE_LETTER},
+	{0x0004ec, 0x0004ec, PG_U_UPPERCASE_LETTER},
+	{0x0004ed, 0x0004ed, PG_U_LOWERCASE_LETTER},
+	{0x0004ee, 0x0004ee, PG_U_UPPERCASE_LETTER},
+	{0x0004ef, 0x0004ef, PG_U_LOWERCASE_LETTER},
+	{0x0004f0, 0x0004f0, PG_U_UPPERCASE_LETTER},
+	{0x0004f1, 0x0004f1, PG_U_LOWERCASE_LETTER},
+	{0x0004f2, 0x0004f2, PG_U_UPPERCASE_LETTER},
+	{0x0004f3, 0x0004f3, PG_U_LOWERCASE_LETTER},
+	{0x0004f4, 0x0004f4, PG_U_UPPERCASE_LETTER},
+	{0x0004f5, 0x0004f5, PG_U_LOWERCASE_LETTER},
+	{0x0004f6, 0x0004f6, PG_U_UPPERCASE_LETTER},
+	{0x0004f7, 0x0004f7, PG_U_LOWERCASE_LETTER},
+	{0x0004f8, 0x0004f8, PG_U_UPPERCASE_LETTER},
+	{0x0004f9, 0x0004f9, PG_U_LOWERCASE_LETTER},
+	{0x0004fa, 0x0004fa, PG_U_UPPERCASE_LETTER},
+	{0x0004fb, 0x0004fb, PG_U_LOWERCASE_LETTER},
+	{0x0004fc, 0x0004fc, PG_U_UPPERCASE_LETTER},
+	{0x0004fd, 0x0004fd, PG_U_LOWERCASE_LETTER},
+	{0x0004fe, 0x0004fe, PG_U_UPPERCASE_LETTER},
+	{0x0004ff, 0x0004ff, PG_U_LOWERCASE_LETTER},
+	{0x000500, 0x000500, PG_U_UPPERCASE_LETTER},
+	{0x000501, 0x000501, PG_U_LOWERCASE_LETTER},
+	{0x000502, 0x000502, PG_U_UPPERCASE_LETTER},
+	{0x000503, 0x000503, PG_U_LOWERCASE_LETTER},
+	{0x000504, 0x000504, PG_U_UPPERCASE_LETTER},
+	{0x000505, 0x000505, PG_U_LOWERCASE_LETTER},
+	{0x000506, 0x000506, PG_U_UPPERCASE_LETTER},
+	{0x000507, 0x000507, PG_U_LOWERCASE_LETTER},
+	{0x000508, 0x000508, PG_U_UPPERCASE_LETTER},
+	{0x000509, 0x000509, PG_U_LOWERCASE_LETTER},
+	{0x00050a, 0x00050a, PG_U_UPPERCASE_LETTER},
+	{0x00050b, 0x00050b, PG_U_LOWERCASE_LETTER},
+	{0x00050c, 0x00050c, PG_U_UPPERCASE_LETTER},
+	{0x00050d, 0x00050d, PG_U_LOWERCASE_LETTER},
+	{0x00050e, 0x00050e, PG_U_UPPERCASE_LETTER},
+	{0x00050f, 0x00050f, PG_U_LOWERCASE_LETTER},
+	{0x000510, 0x000510, PG_U_UPPERCASE_LETTER},
+	{0x000511, 0x000511, PG_U_LOWERCASE_LETTER},
+	{0x000512, 0x000512, PG_U_UPPERCASE_LETTER},
+	{0x000513, 0x000513, PG_U_LOWERCASE_LETTER},
+	{0x000514, 0x000514, PG_U_UPPERCASE_LETTER},
+	{0x000515, 0x000515, PG_U_LOWERCASE_LETTER},
+	{0x000516, 0x000516, PG_U_UPPERCASE_LETTER},
+	{0x000517, 0x000517, PG_U_LOWERCASE_LETTER},
+	{0x000518, 0x000518, PG_U_UPPERCASE_LETTER},
+	{0x000519, 0x000519, PG_U_LOWERCASE_LETTER},
+	{0x00051a, 0x00051a, PG_U_UPPERCASE_LETTER},
+	{0x00051b, 0x00051b, PG_U_LOWERCASE_LETTER},
+	{0x00051c, 0x00051c, PG_U_UPPERCASE_LETTER},
+	{0x00051d, 0x00051d, PG_U_LOWERCASE_LETTER},
+	{0x00051e, 0x00051e, PG_U_UPPERCASE_LETTER},
+	{0x00051f, 0x00051f, PG_U_LOWERCASE_LETTER},
+	{0x000520, 0x000520, PG_U_UPPERCASE_LETTER},
+	{0x000521, 0x000521, PG_U_LOWERCASE_LETTER},
+	{0x000522, 0x000522, PG_U_UPPERCASE_LETTER},
+	{0x000523, 0x000523, PG_U_LOWERCASE_LETTER},
+	{0x000524, 0x000524, PG_U_UPPERCASE_LETTER},
+	{0x000525, 0x000525, PG_U_LOWERCASE_LETTER},
+	{0x000526, 0x000526, PG_U_UPPERCASE_LETTER},
+	{0x000527, 0x000527, PG_U_LOWERCASE_LETTER},
+	{0x000528, 0x000528, PG_U_UPPERCASE_LETTER},
+	{0x000529, 0x000529, PG_U_LOWERCASE_LETTER},
+	{0x00052a, 0x00052a, PG_U_UPPERCASE_LETTER},
+	{0x00052b, 0x00052b, PG_U_LOWERCASE_LETTER},
+	{0x00052c, 0x00052c, PG_U_UPPERCASE_LETTER},
+	{0x00052d, 0x00052d, PG_U_LOWERCASE_LETTER},
+	{0x00052e, 0x00052e, PG_U_UPPERCASE_LETTER},
+	{0x00052f, 0x00052f, PG_U_LOWERCASE_LETTER},
+	{0x000530, 0x000530, PG_U_UNASSIGNED},
+	{0x000531, 0x000556, PG_U_UPPERCASE_LETTER},
+	{0x000557, 0x000558, PG_U_UNASSIGNED},
+	{0x000559, 0x000559, PG_U_MODIFIER_LETTER},
+	{0x00055a, 0x00055f, PG_U_OTHER_PUNCTUATION},
+	{0x000560, 0x000588, PG_U_LOWERCASE_LETTER},
+	{0x000589, 0x000589, PG_U_OTHER_PUNCTUATION},
+	{0x00058a, 0x00058a, PG_U_DASH_PUNCTUATION},
+	{0x00058b, 0x00058c, PG_U_UNASSIGNED},
+	{0x00058d, 0x00058e, PG_U_OTHER_SYMBOL},
+	{0x00058f, 0x00058f, PG_U_CURRENCY_SYMBOL},
+	{0x000590, 0x000590, PG_U_UNASSIGNED},
+	{0x000591, 0x0005bd, PG_U_NON_SPACING_MARK},
+	{0x0005be, 0x0005be, PG_U_DASH_PUNCTUATION},
+	{0x0005bf, 0x0005bf, PG_U_NON_SPACING_MARK},
+	{0x0005c0, 0x0005c0, PG_U_OTHER_PUNCTUATION},
+	{0x0005c1, 0x0005c2, PG_U_NON_SPACING_MARK},
+	{0x0005c3, 0x0005c3, PG_U_OTHER_PUNCTUATION},
+	{0x0005c4, 0x0005c5, PG_U_NON_SPACING_MARK},
+	{0x0005c6, 0x0005c6, PG_U_OTHER_PUNCTUATION},
+	{0x0005c7, 0x0005c7, PG_U_NON_SPACING_MARK},
+	{0x0005c8, 0x0005cf, PG_U_UNASSIGNED},
+	{0x0005d0, 0x0005ea, PG_U_OTHER_LETTER},
+	{0x0005eb, 0x0005ee, PG_U_UNASSIGNED},
+	{0x0005ef, 0x0005f2, PG_U_OTHER_LETTER},
+	{0x0005f3, 0x0005f4, PG_U_OTHER_PUNCTUATION},
+	{0x0005f5, 0x0005ff, PG_U_UNASSIGNED},
+	{0x000600, 0x000605, PG_U_FORMAT_CHAR},
+	{0x000606, 0x000608, PG_U_MATH_SYMBOL},
+	{0x000609, 0x00060a, PG_U_OTHER_PUNCTUATION},
+	{0x00060b, 0x00060b, PG_U_CURRENCY_SYMBOL},
+	{0x00060c, 0x00060d, PG_U_OTHER_PUNCTUATION},
+	{0x00060e, 0x00060f, PG_U_OTHER_SYMBOL},
+	{0x000610, 0x00061a, PG_U_NON_SPACING_MARK},
+	{0x00061b, 0x00061b, PG_U_OTHER_PUNCTUATION},
+	{0x00061c, 0x00061c, PG_U_FORMAT_CHAR},
+	{0x00061d, 0x00061f, PG_U_OTHER_PUNCTUATION},
+	{0x000620, 0x00063f, PG_U_OTHER_LETTER},
+	{0x000640, 0x000640, PG_U_MODIFIER_LETTER},
+	{0x000641, 0x00064a, PG_U_OTHER_LETTER},
+	{0x00064b, 0x00065f, PG_U_NON_SPACING_MARK},
+	{0x000660, 0x000669, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00066a, 0x00066d, PG_U_OTHER_PUNCTUATION},
+	{0x00066e, 0x00066f, PG_U_OTHER_LETTER},
+	{0x000670, 0x000670, PG_U_NON_SPACING_MARK},
+	{0x000671, 0x0006d3, PG_U_OTHER_LETTER},
+	{0x0006d4, 0x0006d4, PG_U_OTHER_PUNCTUATION},
+	{0x0006d5, 0x0006d5, PG_U_OTHER_LETTER},
+	{0x0006d6, 0x0006dc, PG_U_NON_SPACING_MARK},
+	{0x0006dd, 0x0006dd, PG_U_FORMAT_CHAR},
+	{0x0006de, 0x0006de, PG_U_OTHER_SYMBOL},
+	{0x0006df, 0x0006e4, PG_U_NON_SPACING_MARK},
+	{0x0006e5, 0x0006e6, PG_U_MODIFIER_LETTER},
+	{0x0006e7, 0x0006e8, PG_U_NON_SPACING_MARK},
+	{0x0006e9, 0x0006e9, PG_U_OTHER_SYMBOL},
+	{0x0006ea, 0x0006ed, PG_U_NON_SPACING_MARK},
+	{0x0006ee, 0x0006ef, PG_U_OTHER_LETTER},
+	{0x0006f0, 0x0006f9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0006fa, 0x0006fc, PG_U_OTHER_LETTER},
+	{0x0006fd, 0x0006fe, PG_U_OTHER_SYMBOL},
+	{0x0006ff, 0x0006ff, PG_U_OTHER_LETTER},
+	{0x000700, 0x00070d, PG_U_OTHER_PUNCTUATION},
+	{0x00070e, 0x00070e, PG_U_UNASSIGNED},
+	{0x00070f, 0x00070f, PG_U_FORMAT_CHAR},
+	{0x000710, 0x000710, PG_U_OTHER_LETTER},
+	{0x000711, 0x000711, PG_U_NON_SPACING_MARK},
+	{0x000712, 0x00072f, PG_U_OTHER_LETTER},
+	{0x000730, 0x00074a, PG_U_NON_SPACING_MARK},
+	{0x00074b, 0x00074c, PG_U_UNASSIGNED},
+	{0x00074d, 0x0007a5, PG_U_OTHER_LETTER},
+	{0x0007a6, 0x0007b0, PG_U_NON_SPACING_MARK},
+	{0x0007b1, 0x0007b1, PG_U_OTHER_LETTER},
+	{0x0007b2, 0x0007bf, PG_U_UNASSIGNED},
+	{0x0007c0, 0x0007c9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0007ca, 0x0007ea, PG_U_OTHER_LETTER},
+	{0x0007eb, 0x0007f3, PG_U_NON_SPACING_MARK},
+	{0x0007f4, 0x0007f5, PG_U_MODIFIER_LETTER},
+	{0x0007f6, 0x0007f6, PG_U_OTHER_SYMBOL},
+	{0x0007f7, 0x0007f9, PG_U_OTHER_PUNCTUATION},
+	{0x0007fa, 0x0007fa, PG_U_MODIFIER_LETTER},
+	{0x0007fb, 0x0007fc, PG_U_UNASSIGNED},
+	{0x0007fd, 0x0007fd, PG_U_NON_SPACING_MARK},
+	{0x0007fe, 0x0007ff, PG_U_CURRENCY_SYMBOL},
+	{0x000800, 0x000815, PG_U_OTHER_LETTER},
+	{0x000816, 0x000819, PG_U_NON_SPACING_MARK},
+	{0x00081a, 0x00081a, PG_U_MODIFIER_LETTER},
+	{0x00081b, 0x000823, PG_U_NON_SPACING_MARK},
+	{0x000824, 0x000824, PG_U_MODIFIER_LETTER},
+	{0x000825, 0x000827, PG_U_NON_SPACING_MARK},
+	{0x000828, 0x000828, PG_U_MODIFIER_LETTER},
+	{0x000829, 0x00082d, PG_U_NON_SPACING_MARK},
+	{0x00082e, 0x00082f, PG_U_UNASSIGNED},
+	{0x000830, 0x00083e, PG_U_OTHER_PUNCTUATION},
+	{0x00083f, 0x00083f, PG_U_UNASSIGNED},
+	{0x000840, 0x000858, PG_U_OTHER_LETTER},
+	{0x000859, 0x00085b, PG_U_NON_SPACING_MARK},
+	{0x00085c, 0x00085d, PG_U_UNASSIGNED},
+	{0x00085e, 0x00085e, PG_U_OTHER_PUNCTUATION},
+	{0x00085f, 0x00085f, PG_U_UNASSIGNED},
+	{0x000860, 0x00086a, PG_U_OTHER_LETTER},
+	{0x00086b, 0x00086f, PG_U_UNASSIGNED},
+	{0x000870, 0x000887, PG_U_OTHER_LETTER},
+	{0x000888, 0x000888, PG_U_MODIFIER_SYMBOL},
+	{0x000889, 0x00088e, PG_U_OTHER_LETTER},
+	{0x00088f, 0x00088f, PG_U_UNASSIGNED},
+	{0x000890, 0x000891, PG_U_FORMAT_CHAR},
+	{0x000892, 0x000897, PG_U_UNASSIGNED},
+	{0x000898, 0x00089f, PG_U_NON_SPACING_MARK},
+	{0x0008a0, 0x0008c8, PG_U_OTHER_LETTER},
+	{0x0008c9, 0x0008c9, PG_U_MODIFIER_LETTER},
+	{0x0008ca, 0x0008e1, PG_U_NON_SPACING_MARK},
+	{0x0008e2, 0x0008e2, PG_U_FORMAT_CHAR},
+	{0x0008e3, 0x000902, PG_U_NON_SPACING_MARK},
+	{0x000903, 0x000903, PG_U_COMBINING_SPACING_MARK},
+	{0x000904, 0x000939, PG_U_OTHER_LETTER},
+	{0x00093a, 0x00093a, PG_U_NON_SPACING_MARK},
+	{0x00093b, 0x00093b, PG_U_COMBINING_SPACING_MARK},
+	{0x00093c, 0x00093c, PG_U_NON_SPACING_MARK},
+	{0x00093d, 0x00093d, PG_U_OTHER_LETTER},
+	{0x00093e, 0x000940, PG_U_COMBINING_SPACING_MARK},
+	{0x000941, 0x000948, PG_U_NON_SPACING_MARK},
+	{0x000949, 0x00094c, PG_U_COMBINING_SPACING_MARK},
+	{0x00094d, 0x00094d, PG_U_NON_SPACING_MARK},
+	{0x00094e, 0x00094f, PG_U_COMBINING_SPACING_MARK},
+	{0x000950, 0x000950, PG_U_OTHER_LETTER},
+	{0x000951, 0x000957, PG_U_NON_SPACING_MARK},
+	{0x000958, 0x000961, PG_U_OTHER_LETTER},
+	{0x000962, 0x000963, PG_U_NON_SPACING_MARK},
+	{0x000964, 0x000965, PG_U_OTHER_PUNCTUATION},
+	{0x000966, 0x00096f, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000970, 0x000970, PG_U_OTHER_PUNCTUATION},
+	{0x000971, 0x000971, PG_U_MODIFIER_LETTER},
+	{0x000972, 0x000980, PG_U_OTHER_LETTER},
+	{0x000981, 0x000981, PG_U_NON_SPACING_MARK},
+	{0x000982, 0x000983, PG_U_COMBINING_SPACING_MARK},
+	{0x000984, 0x000984, PG_U_UNASSIGNED},
+	{0x000985, 0x00098c, PG_U_OTHER_LETTER},
+	{0x00098d, 0x00098e, PG_U_UNASSIGNED},
+	{0x00098f, 0x000990, PG_U_OTHER_LETTER},
+	{0x000991, 0x000992, PG_U_UNASSIGNED},
+	{0x000993, 0x0009a8, PG_U_OTHER_LETTER},
+	{0x0009a9, 0x0009a9, PG_U_UNASSIGNED},
+	{0x0009aa, 0x0009b0, PG_U_OTHER_LETTER},
+	{0x0009b1, 0x0009b1, PG_U_UNASSIGNED},
+	{0x0009b2, 0x0009b2, PG_U_OTHER_LETTER},
+	{0x0009b3, 0x0009b5, PG_U_UNASSIGNED},
+	{0x0009b6, 0x0009b9, PG_U_OTHER_LETTER},
+	{0x0009ba, 0x0009bb, PG_U_UNASSIGNED},
+	{0x0009bc, 0x0009bc, PG_U_NON_SPACING_MARK},
+	{0x0009bd, 0x0009bd, PG_U_OTHER_LETTER},
+	{0x0009be, 0x0009c0, PG_U_COMBINING_SPACING_MARK},
+	{0x0009c1, 0x0009c4, PG_U_NON_SPACING_MARK},
+	{0x0009c5, 0x0009c6, PG_U_UNASSIGNED},
+	{0x0009c7, 0x0009c8, PG_U_COMBINING_SPACING_MARK},
+	{0x0009c9, 0x0009ca, PG_U_UNASSIGNED},
+	{0x0009cb, 0x0009cc, PG_U_COMBINING_SPACING_MARK},
+	{0x0009cd, 0x0009cd, PG_U_NON_SPACING_MARK},
+	{0x0009ce, 0x0009ce, PG_U_OTHER_LETTER},
+	{0x0009cf, 0x0009d6, PG_U_UNASSIGNED},
+	{0x0009d7, 0x0009d7, PG_U_COMBINING_SPACING_MARK},
+	{0x0009d8, 0x0009db, PG_U_UNASSIGNED},
+	{0x0009dc, 0x0009dd, PG_U_OTHER_LETTER},
+	{0x0009de, 0x0009de, PG_U_UNASSIGNED},
+	{0x0009df, 0x0009e1, PG_U_OTHER_LETTER},
+	{0x0009e2, 0x0009e3, PG_U_NON_SPACING_MARK},
+	{0x0009e4, 0x0009e5, PG_U_UNASSIGNED},
+	{0x0009e6, 0x0009ef, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0009f0, 0x0009f1, PG_U_OTHER_LETTER},
+	{0x0009f2, 0x0009f3, PG_U_CURRENCY_SYMBOL},
+	{0x0009f4, 0x0009f9, PG_U_OTHER_NUMBER},
+	{0x0009fa, 0x0009fa, PG_U_OTHER_SYMBOL},
+	{0x0009fb, 0x0009fb, PG_U_CURRENCY_SYMBOL},
+	{0x0009fc, 0x0009fc, PG_U_OTHER_LETTER},
+	{0x0009fd, 0x0009fd, PG_U_OTHER_PUNCTUATION},
+	{0x0009fe, 0x0009fe, PG_U_NON_SPACING_MARK},
+	{0x0009ff, 0x000a00, PG_U_UNASSIGNED},
+	{0x000a01, 0x000a02, PG_U_NON_SPACING_MARK},
+	{0x000a03, 0x000a03, PG_U_COMBINING_SPACING_MARK},
+	{0x000a04, 0x000a04, PG_U_UNASSIGNED},
+	{0x000a05, 0x000a0a, PG_U_OTHER_LETTER},
+	{0x000a0b, 0x000a0e, PG_U_UNASSIGNED},
+	{0x000a0f, 0x000a10, PG_U_OTHER_LETTER},
+	{0x000a11, 0x000a12, PG_U_UNASSIGNED},
+	{0x000a13, 0x000a28, PG_U_OTHER_LETTER},
+	{0x000a29, 0x000a29, PG_U_UNASSIGNED},
+	{0x000a2a, 0x000a30, PG_U_OTHER_LETTER},
+	{0x000a31, 0x000a31, PG_U_UNASSIGNED},
+	{0x000a32, 0x000a33, PG_U_OTHER_LETTER},
+	{0x000a34, 0x000a34, PG_U_UNASSIGNED},
+	{0x000a35, 0x000a36, PG_U_OTHER_LETTER},
+	{0x000a37, 0x000a37, PG_U_UNASSIGNED},
+	{0x000a38, 0x000a39, PG_U_OTHER_LETTER},
+	{0x000a3a, 0x000a3b, PG_U_UNASSIGNED},
+	{0x000a3c, 0x000a3c, PG_U_NON_SPACING_MARK},
+	{0x000a3d, 0x000a3d, PG_U_UNASSIGNED},
+	{0x000a3e, 0x000a40, PG_U_COMBINING_SPACING_MARK},
+	{0x000a41, 0x000a42, PG_U_NON_SPACING_MARK},
+	{0x000a43, 0x000a46, PG_U_UNASSIGNED},
+	{0x000a47, 0x000a48, PG_U_NON_SPACING_MARK},
+	{0x000a49, 0x000a4a, PG_U_UNASSIGNED},
+	{0x000a4b, 0x000a4d, PG_U_NON_SPACING_MARK},
+	{0x000a4e, 0x000a50, PG_U_UNASSIGNED},
+	{0x000a51, 0x000a51, PG_U_NON_SPACING_MARK},
+	{0x000a52, 0x000a58, PG_U_UNASSIGNED},
+	{0x000a59, 0x000a5c, PG_U_OTHER_LETTER},
+	{0x000a5d, 0x000a5d, PG_U_UNASSIGNED},
+	{0x000a5e, 0x000a5e, PG_U_OTHER_LETTER},
+	{0x000a5f, 0x000a65, PG_U_UNASSIGNED},
+	{0x000a66, 0x000a6f, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000a70, 0x000a71, PG_U_NON_SPACING_MARK},
+	{0x000a72, 0x000a74, PG_U_OTHER_LETTER},
+	{0x000a75, 0x000a75, PG_U_NON_SPACING_MARK},
+	{0x000a76, 0x000a76, PG_U_OTHER_PUNCTUATION},
+	{0x000a77, 0x000a80, PG_U_UNASSIGNED},
+	{0x000a81, 0x000a82, PG_U_NON_SPACING_MARK},
+	{0x000a83, 0x000a83, PG_U_COMBINING_SPACING_MARK},
+	{0x000a84, 0x000a84, PG_U_UNASSIGNED},
+	{0x000a85, 0x000a8d, PG_U_OTHER_LETTER},
+	{0x000a8e, 0x000a8e, PG_U_UNASSIGNED},
+	{0x000a8f, 0x000a91, PG_U_OTHER_LETTER},
+	{0x000a92, 0x000a92, PG_U_UNASSIGNED},
+	{0x000a93, 0x000aa8, PG_U_OTHER_LETTER},
+	{0x000aa9, 0x000aa9, PG_U_UNASSIGNED},
+	{0x000aaa, 0x000ab0, PG_U_OTHER_LETTER},
+	{0x000ab1, 0x000ab1, PG_U_UNASSIGNED},
+	{0x000ab2, 0x000ab3, PG_U_OTHER_LETTER},
+	{0x000ab4, 0x000ab4, PG_U_UNASSIGNED},
+	{0x000ab5, 0x000ab9, PG_U_OTHER_LETTER},
+	{0x000aba, 0x000abb, PG_U_UNASSIGNED},
+	{0x000abc, 0x000abc, PG_U_NON_SPACING_MARK},
+	{0x000abd, 0x000abd, PG_U_OTHER_LETTER},
+	{0x000abe, 0x000ac0, PG_U_COMBINING_SPACING_MARK},
+	{0x000ac1, 0x000ac5, PG_U_NON_SPACING_MARK},
+	{0x000ac6, 0x000ac6, PG_U_UNASSIGNED},
+	{0x000ac7, 0x000ac8, PG_U_NON_SPACING_MARK},
+	{0x000ac9, 0x000ac9, PG_U_COMBINING_SPACING_MARK},
+	{0x000aca, 0x000aca, PG_U_UNASSIGNED},
+	{0x000acb, 0x000acc, PG_U_COMBINING_SPACING_MARK},
+	{0x000acd, 0x000acd, PG_U_NON_SPACING_MARK},
+	{0x000ace, 0x000acf, PG_U_UNASSIGNED},
+	{0x000ad0, 0x000ad0, PG_U_OTHER_LETTER},
+	{0x000ad1, 0x000adf, PG_U_UNASSIGNED},
+	{0x000ae0, 0x000ae1, PG_U_OTHER_LETTER},
+	{0x000ae2, 0x000ae3, PG_U_NON_SPACING_MARK},
+	{0x000ae4, 0x000ae5, PG_U_UNASSIGNED},
+	{0x000ae6, 0x000aef, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000af0, 0x000af0, PG_U_OTHER_PUNCTUATION},
+	{0x000af1, 0x000af1, PG_U_CURRENCY_SYMBOL},
+	{0x000af2, 0x000af8, PG_U_UNASSIGNED},
+	{0x000af9, 0x000af9, PG_U_OTHER_LETTER},
+	{0x000afa, 0x000aff, PG_U_NON_SPACING_MARK},
+	{0x000b00, 0x000b00, PG_U_UNASSIGNED},
+	{0x000b01, 0x000b01, PG_U_NON_SPACING_MARK},
+	{0x000b02, 0x000b03, PG_U_COMBINING_SPACING_MARK},
+	{0x000b04, 0x000b04, PG_U_UNASSIGNED},
+	{0x000b05, 0x000b0c, PG_U_OTHER_LETTER},
+	{0x000b0d, 0x000b0e, PG_U_UNASSIGNED},
+	{0x000b0f, 0x000b10, PG_U_OTHER_LETTER},
+	{0x000b11, 0x000b12, PG_U_UNASSIGNED},
+	{0x000b13, 0x000b28, PG_U_OTHER_LETTER},
+	{0x000b29, 0x000b29, PG_U_UNASSIGNED},
+	{0x000b2a, 0x000b30, PG_U_OTHER_LETTER},
+	{0x000b31, 0x000b31, PG_U_UNASSIGNED},
+	{0x000b32, 0x000b33, PG_U_OTHER_LETTER},
+	{0x000b34, 0x000b34, PG_U_UNASSIGNED},
+	{0x000b35, 0x000b39, PG_U_OTHER_LETTER},
+	{0x000b3a, 0x000b3b, PG_U_UNASSIGNED},
+	{0x000b3c, 0x000b3c, PG_U_NON_SPACING_MARK},
+	{0x000b3d, 0x000b3d, PG_U_OTHER_LETTER},
+	{0x000b3e, 0x000b3e, PG_U_COMBINING_SPACING_MARK},
+	{0x000b3f, 0x000b3f, PG_U_NON_SPACING_MARK},
+	{0x000b40, 0x000b40, PG_U_COMBINING_SPACING_MARK},
+	{0x000b41, 0x000b44, PG_U_NON_SPACING_MARK},
+	{0x000b45, 0x000b46, PG_U_UNASSIGNED},
+	{0x000b47, 0x000b48, PG_U_COMBINING_SPACING_MARK},
+	{0x000b49, 0x000b4a, PG_U_UNASSIGNED},
+	{0x000b4b, 0x000b4c, PG_U_COMBINING_SPACING_MARK},
+	{0x000b4d, 0x000b4d, PG_U_NON_SPACING_MARK},
+	{0x000b4e, 0x000b54, PG_U_UNASSIGNED},
+	{0x000b55, 0x000b56, PG_U_NON_SPACING_MARK},
+	{0x000b57, 0x000b57, PG_U_COMBINING_SPACING_MARK},
+	{0x000b58, 0x000b5b, PG_U_UNASSIGNED},
+	{0x000b5c, 0x000b5d, PG_U_OTHER_LETTER},
+	{0x000b5e, 0x000b5e, PG_U_UNASSIGNED},
+	{0x000b5f, 0x000b61, PG_U_OTHER_LETTER},
+	{0x000b62, 0x000b63, PG_U_NON_SPACING_MARK},
+	{0x000b64, 0x000b65, PG_U_UNASSIGNED},
+	{0x000b66, 0x000b6f, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000b70, 0x000b70, PG_U_OTHER_SYMBOL},
+	{0x000b71, 0x000b71, PG_U_OTHER_LETTER},
+	{0x000b72, 0x000b77, PG_U_OTHER_NUMBER},
+	{0x000b78, 0x000b81, PG_U_UNASSIGNED},
+	{0x000b82, 0x000b82, PG_U_NON_SPACING_MARK},
+	{0x000b83, 0x000b83, PG_U_OTHER_LETTER},
+	{0x000b84, 0x000b84, PG_U_UNASSIGNED},
+	{0x000b85, 0x000b8a, PG_U_OTHER_LETTER},
+	{0x000b8b, 0x000b8d, PG_U_UNASSIGNED},
+	{0x000b8e, 0x000b90, PG_U_OTHER_LETTER},
+	{0x000b91, 0x000b91, PG_U_UNASSIGNED},
+	{0x000b92, 0x000b95, PG_U_OTHER_LETTER},
+	{0x000b96, 0x000b98, PG_U_UNASSIGNED},
+	{0x000b99, 0x000b9a, PG_U_OTHER_LETTER},
+	{0x000b9b, 0x000b9b, PG_U_UNASSIGNED},
+	{0x000b9c, 0x000b9c, PG_U_OTHER_LETTER},
+	{0x000b9d, 0x000b9d, PG_U_UNASSIGNED},
+	{0x000b9e, 0x000b9f, PG_U_OTHER_LETTER},
+	{0x000ba0, 0x000ba2, PG_U_UNASSIGNED},
+	{0x000ba3, 0x000ba4, PG_U_OTHER_LETTER},
+	{0x000ba5, 0x000ba7, PG_U_UNASSIGNED},
+	{0x000ba8, 0x000baa, PG_U_OTHER_LETTER},
+	{0x000bab, 0x000bad, PG_U_UNASSIGNED},
+	{0x000bae, 0x000bb9, PG_U_OTHER_LETTER},
+	{0x000bba, 0x000bbd, PG_U_UNASSIGNED},
+	{0x000bbe, 0x000bbf, PG_U_COMBINING_SPACING_MARK},
+	{0x000bc0, 0x000bc0, PG_U_NON_SPACING_MARK},
+	{0x000bc1, 0x000bc2, PG_U_COMBINING_SPACING_MARK},
+	{0x000bc3, 0x000bc5, PG_U_UNASSIGNED},
+	{0x000bc6, 0x000bc8, PG_U_COMBINING_SPACING_MARK},
+	{0x000bc9, 0x000bc9, PG_U_UNASSIGNED},
+	{0x000bca, 0x000bcc, PG_U_COMBINING_SPACING_MARK},
+	{0x000bcd, 0x000bcd, PG_U_NON_SPACING_MARK},
+	{0x000bce, 0x000bcf, PG_U_UNASSIGNED},
+	{0x000bd0, 0x000bd0, PG_U_OTHER_LETTER},
+	{0x000bd1, 0x000bd6, PG_U_UNASSIGNED},
+	{0x000bd7, 0x000bd7, PG_U_COMBINING_SPACING_MARK},
+	{0x000bd8, 0x000be5, PG_U_UNASSIGNED},
+	{0x000be6, 0x000bef, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000bf0, 0x000bf2, PG_U_OTHER_NUMBER},
+	{0x000bf3, 0x000bf8, PG_U_OTHER_SYMBOL},
+	{0x000bf9, 0x000bf9, PG_U_CURRENCY_SYMBOL},
+	{0x000bfa, 0x000bfa, PG_U_OTHER_SYMBOL},
+	{0x000bfb, 0x000bff, PG_U_UNASSIGNED},
+	{0x000c00, 0x000c00, PG_U_NON_SPACING_MARK},
+	{0x000c01, 0x000c03, PG_U_COMBINING_SPACING_MARK},
+	{0x000c04, 0x000c04, PG_U_NON_SPACING_MARK},
+	{0x000c05, 0x000c0c, PG_U_OTHER_LETTER},
+	{0x000c0d, 0x000c0d, PG_U_UNASSIGNED},
+	{0x000c0e, 0x000c10, PG_U_OTHER_LETTER},
+	{0x000c11, 0x000c11, PG_U_UNASSIGNED},
+	{0x000c12, 0x000c28, PG_U_OTHER_LETTER},
+	{0x000c29, 0x000c29, PG_U_UNASSIGNED},
+	{0x000c2a, 0x000c39, PG_U_OTHER_LETTER},
+	{0x000c3a, 0x000c3b, PG_U_UNASSIGNED},
+	{0x000c3c, 0x000c3c, PG_U_NON_SPACING_MARK},
+	{0x000c3d, 0x000c3d, PG_U_OTHER_LETTER},
+	{0x000c3e, 0x000c40, PG_U_NON_SPACING_MARK},
+	{0x000c41, 0x000c44, PG_U_COMBINING_SPACING_MARK},
+	{0x000c45, 0x000c45, PG_U_UNASSIGNED},
+	{0x000c46, 0x000c48, PG_U_NON_SPACING_MARK},
+	{0x000c49, 0x000c49, PG_U_UNASSIGNED},
+	{0x000c4a, 0x000c4d, PG_U_NON_SPACING_MARK},
+	{0x000c4e, 0x000c54, PG_U_UNASSIGNED},
+	{0x000c55, 0x000c56, PG_U_NON_SPACING_MARK},
+	{0x000c57, 0x000c57, PG_U_UNASSIGNED},
+	{0x000c58, 0x000c5a, PG_U_OTHER_LETTER},
+	{0x000c5b, 0x000c5c, PG_U_UNASSIGNED},
+	{0x000c5d, 0x000c5d, PG_U_OTHER_LETTER},
+	{0x000c5e, 0x000c5f, PG_U_UNASSIGNED},
+	{0x000c60, 0x000c61, PG_U_OTHER_LETTER},
+	{0x000c62, 0x000c63, PG_U_NON_SPACING_MARK},
+	{0x000c64, 0x000c65, PG_U_UNASSIGNED},
+	{0x000c66, 0x000c6f, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000c70, 0x000c76, PG_U_UNASSIGNED},
+	{0x000c77, 0x000c77, PG_U_OTHER_PUNCTUATION},
+	{0x000c78, 0x000c7e, PG_U_OTHER_NUMBER},
+	{0x000c7f, 0x000c7f, PG_U_OTHER_SYMBOL},
+	{0x000c80, 0x000c80, PG_U_OTHER_LETTER},
+	{0x000c81, 0x000c81, PG_U_NON_SPACING_MARK},
+	{0x000c82, 0x000c83, PG_U_COMBINING_SPACING_MARK},
+	{0x000c84, 0x000c84, PG_U_OTHER_PUNCTUATION},
+	{0x000c85, 0x000c8c, PG_U_OTHER_LETTER},
+	{0x000c8d, 0x000c8d, PG_U_UNASSIGNED},
+	{0x000c8e, 0x000c90, PG_U_OTHER_LETTER},
+	{0x000c91, 0x000c91, PG_U_UNASSIGNED},
+	{0x000c92, 0x000ca8, PG_U_OTHER_LETTER},
+	{0x000ca9, 0x000ca9, PG_U_UNASSIGNED},
+	{0x000caa, 0x000cb3, PG_U_OTHER_LETTER},
+	{0x000cb4, 0x000cb4, PG_U_UNASSIGNED},
+	{0x000cb5, 0x000cb9, PG_U_OTHER_LETTER},
+	{0x000cba, 0x000cbb, PG_U_UNASSIGNED},
+	{0x000cbc, 0x000cbc, PG_U_NON_SPACING_MARK},
+	{0x000cbd, 0x000cbd, PG_U_OTHER_LETTER},
+	{0x000cbe, 0x000cbe, PG_U_COMBINING_SPACING_MARK},
+	{0x000cbf, 0x000cbf, PG_U_NON_SPACING_MARK},
+	{0x000cc0, 0x000cc4, PG_U_COMBINING_SPACING_MARK},
+	{0x000cc5, 0x000cc5, PG_U_UNASSIGNED},
+	{0x000cc6, 0x000cc6, PG_U_NON_SPACING_MARK},
+	{0x000cc7, 0x000cc8, PG_U_COMBINING_SPACING_MARK},
+	{0x000cc9, 0x000cc9, PG_U_UNASSIGNED},
+	{0x000cca, 0x000ccb, PG_U_COMBINING_SPACING_MARK},
+	{0x000ccc, 0x000ccd, PG_U_NON_SPACING_MARK},
+	{0x000cce, 0x000cd4, PG_U_UNASSIGNED},
+	{0x000cd5, 0x000cd6, PG_U_COMBINING_SPACING_MARK},
+	{0x000cd7, 0x000cdc, PG_U_UNASSIGNED},
+	{0x000cdd, 0x000cde, PG_U_OTHER_LETTER},
+	{0x000cdf, 0x000cdf, PG_U_UNASSIGNED},
+	{0x000ce0, 0x000ce1, PG_U_OTHER_LETTER},
+	{0x000ce2, 0x000ce3, PG_U_NON_SPACING_MARK},
+	{0x000ce4, 0x000ce5, PG_U_UNASSIGNED},
+	{0x000ce6, 0x000cef, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000cf0, 0x000cf0, PG_U_UNASSIGNED},
+	{0x000cf1, 0x000cf2, PG_U_OTHER_LETTER},
+	{0x000cf3, 0x000cf3, PG_U_COMBINING_SPACING_MARK},
+	{0x000cf4, 0x000cff, PG_U_UNASSIGNED},
+	{0x000d00, 0x000d01, PG_U_NON_SPACING_MARK},
+	{0x000d02, 0x000d03, PG_U_COMBINING_SPACING_MARK},
+	{0x000d04, 0x000d0c, PG_U_OTHER_LETTER},
+	{0x000d0d, 0x000d0d, PG_U_UNASSIGNED},
+	{0x000d0e, 0x000d10, PG_U_OTHER_LETTER},
+	{0x000d11, 0x000d11, PG_U_UNASSIGNED},
+	{0x000d12, 0x000d3a, PG_U_OTHER_LETTER},
+	{0x000d3b, 0x000d3c, PG_U_NON_SPACING_MARK},
+	{0x000d3d, 0x000d3d, PG_U_OTHER_LETTER},
+	{0x000d3e, 0x000d40, PG_U_COMBINING_SPACING_MARK},
+	{0x000d41, 0x000d44, PG_U_NON_SPACING_MARK},
+	{0x000d45, 0x000d45, PG_U_UNASSIGNED},
+	{0x000d46, 0x000d48, PG_U_COMBINING_SPACING_MARK},
+	{0x000d49, 0x000d49, PG_U_UNASSIGNED},
+	{0x000d4a, 0x000d4c, PG_U_COMBINING_SPACING_MARK},
+	{0x000d4d, 0x000d4d, PG_U_NON_SPACING_MARK},
+	{0x000d4e, 0x000d4e, PG_U_OTHER_LETTER},
+	{0x000d4f, 0x000d4f, PG_U_OTHER_SYMBOL},
+	{0x000d50, 0x000d53, PG_U_UNASSIGNED},
+	{0x000d54, 0x000d56, PG_U_OTHER_LETTER},
+	{0x000d57, 0x000d57, PG_U_COMBINING_SPACING_MARK},
+	{0x000d58, 0x000d5e, PG_U_OTHER_NUMBER},
+	{0x000d5f, 0x000d61, PG_U_OTHER_LETTER},
+	{0x000d62, 0x000d63, PG_U_NON_SPACING_MARK},
+	{0x000d64, 0x000d65, PG_U_UNASSIGNED},
+	{0x000d66, 0x000d6f, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000d70, 0x000d78, PG_U_OTHER_NUMBER},
+	{0x000d79, 0x000d79, PG_U_OTHER_SYMBOL},
+	{0x000d7a, 0x000d7f, PG_U_OTHER_LETTER},
+	{0x000d80, 0x000d80, PG_U_UNASSIGNED},
+	{0x000d81, 0x000d81, PG_U_NON_SPACING_MARK},
+	{0x000d82, 0x000d83, PG_U_COMBINING_SPACING_MARK},
+	{0x000d84, 0x000d84, PG_U_UNASSIGNED},
+	{0x000d85, 0x000d96, PG_U_OTHER_LETTER},
+	{0x000d97, 0x000d99, PG_U_UNASSIGNED},
+	{0x000d9a, 0x000db1, PG_U_OTHER_LETTER},
+	{0x000db2, 0x000db2, PG_U_UNASSIGNED},
+	{0x000db3, 0x000dbb, PG_U_OTHER_LETTER},
+	{0x000dbc, 0x000dbc, PG_U_UNASSIGNED},
+	{0x000dbd, 0x000dbd, PG_U_OTHER_LETTER},
+	{0x000dbe, 0x000dbf, PG_U_UNASSIGNED},
+	{0x000dc0, 0x000dc6, PG_U_OTHER_LETTER},
+	{0x000dc7, 0x000dc9, PG_U_UNASSIGNED},
+	{0x000dca, 0x000dca, PG_U_NON_SPACING_MARK},
+	{0x000dcb, 0x000dce, PG_U_UNASSIGNED},
+	{0x000dcf, 0x000dd1, PG_U_COMBINING_SPACING_MARK},
+	{0x000dd2, 0x000dd4, PG_U_NON_SPACING_MARK},
+	{0x000dd5, 0x000dd5, PG_U_UNASSIGNED},
+	{0x000dd6, 0x000dd6, PG_U_NON_SPACING_MARK},
+	{0x000dd7, 0x000dd7, PG_U_UNASSIGNED},
+	{0x000dd8, 0x000ddf, PG_U_COMBINING_SPACING_MARK},
+	{0x000de0, 0x000de5, PG_U_UNASSIGNED},
+	{0x000de6, 0x000def, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000df0, 0x000df1, PG_U_UNASSIGNED},
+	{0x000df2, 0x000df3, PG_U_COMBINING_SPACING_MARK},
+	{0x000df4, 0x000df4, PG_U_OTHER_PUNCTUATION},
+	{0x000df5, 0x000e00, PG_U_UNASSIGNED},
+	{0x000e01, 0x000e30, PG_U_OTHER_LETTER},
+	{0x000e31, 0x000e31, PG_U_NON_SPACING_MARK},
+	{0x000e32, 0x000e33, PG_U_OTHER_LETTER},
+	{0x000e34, 0x000e3a, PG_U_NON_SPACING_MARK},
+	{0x000e3b, 0x000e3e, PG_U_UNASSIGNED},
+	{0x000e3f, 0x000e3f, PG_U_CURRENCY_SYMBOL},
+	{0x000e40, 0x000e45, PG_U_OTHER_LETTER},
+	{0x000e46, 0x000e46, PG_U_MODIFIER_LETTER},
+	{0x000e47, 0x000e4e, PG_U_NON_SPACING_MARK},
+	{0x000e4f, 0x000e4f, PG_U_OTHER_PUNCTUATION},
+	{0x000e50, 0x000e59, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000e5a, 0x000e5b, PG_U_OTHER_PUNCTUATION},
+	{0x000e5c, 0x000e80, PG_U_UNASSIGNED},
+	{0x000e81, 0x000e82, PG_U_OTHER_LETTER},
+	{0x000e83, 0x000e83, PG_U_UNASSIGNED},
+	{0x000e84, 0x000e84, PG_U_OTHER_LETTER},
+	{0x000e85, 0x000e85, PG_U_UNASSIGNED},
+	{0x000e86, 0x000e8a, PG_U_OTHER_LETTER},
+	{0x000e8b, 0x000e8b, PG_U_UNASSIGNED},
+	{0x000e8c, 0x000ea3, PG_U_OTHER_LETTER},
+	{0x000ea4, 0x000ea4, PG_U_UNASSIGNED},
+	{0x000ea5, 0x000ea5, PG_U_OTHER_LETTER},
+	{0x000ea6, 0x000ea6, PG_U_UNASSIGNED},
+	{0x000ea7, 0x000eb0, PG_U_OTHER_LETTER},
+	{0x000eb1, 0x000eb1, PG_U_NON_SPACING_MARK},
+	{0x000eb2, 0x000eb3, PG_U_OTHER_LETTER},
+	{0x000eb4, 0x000ebc, PG_U_NON_SPACING_MARK},
+	{0x000ebd, 0x000ebd, PG_U_OTHER_LETTER},
+	{0x000ebe, 0x000ebf, PG_U_UNASSIGNED},
+	{0x000ec0, 0x000ec4, PG_U_OTHER_LETTER},
+	{0x000ec5, 0x000ec5, PG_U_UNASSIGNED},
+	{0x000ec6, 0x000ec6, PG_U_MODIFIER_LETTER},
+	{0x000ec7, 0x000ec7, PG_U_UNASSIGNED},
+	{0x000ec8, 0x000ece, PG_U_NON_SPACING_MARK},
+	{0x000ecf, 0x000ecf, PG_U_UNASSIGNED},
+	{0x000ed0, 0x000ed9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000eda, 0x000edb, PG_U_UNASSIGNED},
+	{0x000edc, 0x000edf, PG_U_OTHER_LETTER},
+	{0x000ee0, 0x000eff, PG_U_UNASSIGNED},
+	{0x000f00, 0x000f00, PG_U_OTHER_LETTER},
+	{0x000f01, 0x000f03, PG_U_OTHER_SYMBOL},
+	{0x000f04, 0x000f12, PG_U_OTHER_PUNCTUATION},
+	{0x000f13, 0x000f13, PG_U_OTHER_SYMBOL},
+	{0x000f14, 0x000f14, PG_U_OTHER_PUNCTUATION},
+	{0x000f15, 0x000f17, PG_U_OTHER_SYMBOL},
+	{0x000f18, 0x000f19, PG_U_NON_SPACING_MARK},
+	{0x000f1a, 0x000f1f, PG_U_OTHER_SYMBOL},
+	{0x000f20, 0x000f29, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000f2a, 0x000f33, PG_U_OTHER_NUMBER},
+	{0x000f34, 0x000f34, PG_U_OTHER_SYMBOL},
+	{0x000f35, 0x000f35, PG_U_NON_SPACING_MARK},
+	{0x000f36, 0x000f36, PG_U_OTHER_SYMBOL},
+	{0x000f37, 0x000f37, PG_U_NON_SPACING_MARK},
+	{0x000f38, 0x000f38, PG_U_OTHER_SYMBOL},
+	{0x000f39, 0x000f39, PG_U_NON_SPACING_MARK},
+	{0x000f3a, 0x000f3a, PG_U_START_PUNCTUATION},
+	{0x000f3b, 0x000f3b, PG_U_END_PUNCTUATION},
+	{0x000f3c, 0x000f3c, PG_U_START_PUNCTUATION},
+	{0x000f3d, 0x000f3d, PG_U_END_PUNCTUATION},
+	{0x000f3e, 0x000f3f, PG_U_COMBINING_SPACING_MARK},
+	{0x000f40, 0x000f47, PG_U_OTHER_LETTER},
+	{0x000f48, 0x000f48, PG_U_UNASSIGNED},
+	{0x000f49, 0x000f6c, PG_U_OTHER_LETTER},
+	{0x000f6d, 0x000f70, PG_U_UNASSIGNED},
+	{0x000f71, 0x000f7e, PG_U_NON_SPACING_MARK},
+	{0x000f7f, 0x000f7f, PG_U_COMBINING_SPACING_MARK},
+	{0x000f80, 0x000f84, PG_U_NON_SPACING_MARK},
+	{0x000f85, 0x000f85, PG_U_OTHER_PUNCTUATION},
+	{0x000f86, 0x000f87, PG_U_NON_SPACING_MARK},
+	{0x000f88, 0x000f8c, PG_U_OTHER_LETTER},
+	{0x000f8d, 0x000f97, PG_U_NON_SPACING_MARK},
+	{0x000f98, 0x000f98, PG_U_UNASSIGNED},
+	{0x000f99, 0x000fbc, PG_U_NON_SPACING_MARK},
+	{0x000fbd, 0x000fbd, PG_U_UNASSIGNED},
+	{0x000fbe, 0x000fc5, PG_U_OTHER_SYMBOL},
+	{0x000fc6, 0x000fc6, PG_U_NON_SPACING_MARK},
+	{0x000fc7, 0x000fcc, PG_U_OTHER_SYMBOL},
+	{0x000fcd, 0x000fcd, PG_U_UNASSIGNED},
+	{0x000fce, 0x000fcf, PG_U_OTHER_SYMBOL},
+	{0x000fd0, 0x000fd4, PG_U_OTHER_PUNCTUATION},
+	{0x000fd5, 0x000fd8, PG_U_OTHER_SYMBOL},
+	{0x000fd9, 0x000fda, PG_U_OTHER_PUNCTUATION},
+	{0x000fdb, 0x000fff, PG_U_UNASSIGNED},
+	{0x001000, 0x00102a, PG_U_OTHER_LETTER},
+	{0x00102b, 0x00102c, PG_U_COMBINING_SPACING_MARK},
+	{0x00102d, 0x001030, PG_U_NON_SPACING_MARK},
+	{0x001031, 0x001031, PG_U_COMBINING_SPACING_MARK},
+	{0x001032, 0x001037, PG_U_NON_SPACING_MARK},
+	{0x001038, 0x001038, PG_U_COMBINING_SPACING_MARK},
+	{0x001039, 0x00103a, PG_U_NON_SPACING_MARK},
+	{0x00103b, 0x00103c, PG_U_COMBINING_SPACING_MARK},
+	{0x00103d, 0x00103e, PG_U_NON_SPACING_MARK},
+	{0x00103f, 0x00103f, PG_U_OTHER_LETTER},
+	{0x001040, 0x001049, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00104a, 0x00104f, PG_U_OTHER_PUNCTUATION},
+	{0x001050, 0x001055, PG_U_OTHER_LETTER},
+	{0x001056, 0x001057, PG_U_COMBINING_SPACING_MARK},
+	{0x001058, 0x001059, PG_U_NON_SPACING_MARK},
+	{0x00105a, 0x00105d, PG_U_OTHER_LETTER},
+	{0x00105e, 0x001060, PG_U_NON_SPACING_MARK},
+	{0x001061, 0x001061, PG_U_OTHER_LETTER},
+	{0x001062, 0x001064, PG_U_COMBINING_SPACING_MARK},
+	{0x001065, 0x001066, PG_U_OTHER_LETTER},
+	{0x001067, 0x00106d, PG_U_COMBINING_SPACING_MARK},
+	{0x00106e, 0x001070, PG_U_OTHER_LETTER},
+	{0x001071, 0x001074, PG_U_NON_SPACING_MARK},
+	{0x001075, 0x001081, PG_U_OTHER_LETTER},
+	{0x001082, 0x001082, PG_U_NON_SPACING_MARK},
+	{0x001083, 0x001084, PG_U_COMBINING_SPACING_MARK},
+	{0x001085, 0x001086, PG_U_NON_SPACING_MARK},
+	{0x001087, 0x00108c, PG_U_COMBINING_SPACING_MARK},
+	{0x00108d, 0x00108d, PG_U_NON_SPACING_MARK},
+	{0x00108e, 0x00108e, PG_U_OTHER_LETTER},
+	{0x00108f, 0x00108f, PG_U_COMBINING_SPACING_MARK},
+	{0x001090, 0x001099, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00109a, 0x00109c, PG_U_COMBINING_SPACING_MARK},
+	{0x00109d, 0x00109d, PG_U_NON_SPACING_MARK},
+	{0x00109e, 0x00109f, PG_U_OTHER_SYMBOL},
+	{0x0010a0, 0x0010c5, PG_U_UPPERCASE_LETTER},
+	{0x0010c6, 0x0010c6, PG_U_UNASSIGNED},
+	{0x0010c7, 0x0010c7, PG_U_UPPERCASE_LETTER},
+	{0x0010c8, 0x0010cc, PG_U_UNASSIGNED},
+	{0x0010cd, 0x0010cd, PG_U_UPPERCASE_LETTER},
+	{0x0010ce, 0x0010cf, PG_U_UNASSIGNED},
+	{0x0010d0, 0x0010fa, PG_U_LOWERCASE_LETTER},
+	{0x0010fb, 0x0010fb, PG_U_OTHER_PUNCTUATION},
+	{0x0010fc, 0x0010fc, PG_U_MODIFIER_LETTER},
+	{0x0010fd, 0x0010ff, PG_U_LOWERCASE_LETTER},
+	{0x001100, 0x001248, PG_U_OTHER_LETTER},
+	{0x001249, 0x001249, PG_U_UNASSIGNED},
+	{0x00124a, 0x00124d, PG_U_OTHER_LETTER},
+	{0x00124e, 0x00124f, PG_U_UNASSIGNED},
+	{0x001250, 0x001256, PG_U_OTHER_LETTER},
+	{0x001257, 0x001257, PG_U_UNASSIGNED},
+	{0x001258, 0x001258, PG_U_OTHER_LETTER},
+	{0x001259, 0x001259, PG_U_UNASSIGNED},
+	{0x00125a, 0x00125d, PG_U_OTHER_LETTER},
+	{0x00125e, 0x00125f, PG_U_UNASSIGNED},
+	{0x001260, 0x001288, PG_U_OTHER_LETTER},
+	{0x001289, 0x001289, PG_U_UNASSIGNED},
+	{0x00128a, 0x00128d, PG_U_OTHER_LETTER},
+	{0x00128e, 0x00128f, PG_U_UNASSIGNED},
+	{0x001290, 0x0012b0, PG_U_OTHER_LETTER},
+	{0x0012b1, 0x0012b1, PG_U_UNASSIGNED},
+	{0x0012b2, 0x0012b5, PG_U_OTHER_LETTER},
+	{0x0012b6, 0x0012b7, PG_U_UNASSIGNED},
+	{0x0012b8, 0x0012be, PG_U_OTHER_LETTER},
+	{0x0012bf, 0x0012bf, PG_U_UNASSIGNED},
+	{0x0012c0, 0x0012c0, PG_U_OTHER_LETTER},
+	{0x0012c1, 0x0012c1, PG_U_UNASSIGNED},
+	{0x0012c2, 0x0012c5, PG_U_OTHER_LETTER},
+	{0x0012c6, 0x0012c7, PG_U_UNASSIGNED},
+	{0x0012c8, 0x0012d6, PG_U_OTHER_LETTER},
+	{0x0012d7, 0x0012d7, PG_U_UNASSIGNED},
+	{0x0012d8, 0x001310, PG_U_OTHER_LETTER},
+	{0x001311, 0x001311, PG_U_UNASSIGNED},
+	{0x001312, 0x001315, PG_U_OTHER_LETTER},
+	{0x001316, 0x001317, PG_U_UNASSIGNED},
+	{0x001318, 0x00135a, PG_U_OTHER_LETTER},
+	{0x00135b, 0x00135c, PG_U_UNASSIGNED},
+	{0x00135d, 0x00135f, PG_U_NON_SPACING_MARK},
+	{0x001360, 0x001368, PG_U_OTHER_PUNCTUATION},
+	{0x001369, 0x00137c, PG_U_OTHER_NUMBER},
+	{0x00137d, 0x00137f, PG_U_UNASSIGNED},
+	{0x001380, 0x00138f, PG_U_OTHER_LETTER},
+	{0x001390, 0x001399, PG_U_OTHER_SYMBOL},
+	{0x00139a, 0x00139f, PG_U_UNASSIGNED},
+	{0x0013a0, 0x0013f5, PG_U_UPPERCASE_LETTER},
+	{0x0013f6, 0x0013f7, PG_U_UNASSIGNED},
+	{0x0013f8, 0x0013fd, PG_U_LOWERCASE_LETTER},
+	{0x0013fe, 0x0013ff, PG_U_UNASSIGNED},
+	{0x001400, 0x001400, PG_U_DASH_PUNCTUATION},
+	{0x001401, 0x00166c, PG_U_OTHER_LETTER},
+	{0x00166d, 0x00166d, PG_U_OTHER_SYMBOL},
+	{0x00166e, 0x00166e, PG_U_OTHER_PUNCTUATION},
+	{0x00166f, 0x00167f, PG_U_OTHER_LETTER},
+	{0x001680, 0x001680, PG_U_SPACE_SEPARATOR},
+	{0x001681, 0x00169a, PG_U_OTHER_LETTER},
+	{0x00169b, 0x00169b, PG_U_START_PUNCTUATION},
+	{0x00169c, 0x00169c, PG_U_END_PUNCTUATION},
+	{0x00169d, 0x00169f, PG_U_UNASSIGNED},
+	{0x0016a0, 0x0016ea, PG_U_OTHER_LETTER},
+	{0x0016eb, 0x0016ed, PG_U_OTHER_PUNCTUATION},
+	{0x0016ee, 0x0016f0, PG_U_LETTER_NUMBER},
+	{0x0016f1, 0x0016f8, PG_U_OTHER_LETTER},
+	{0x0016f9, 0x0016ff, PG_U_UNASSIGNED},
+	{0x001700, 0x001711, PG_U_OTHER_LETTER},
+	{0x001712, 0x001714, PG_U_NON_SPACING_MARK},
+	{0x001715, 0x001715, PG_U_COMBINING_SPACING_MARK},
+	{0x001716, 0x00171e, PG_U_UNASSIGNED},
+	{0x00171f, 0x001731, PG_U_OTHER_LETTER},
+	{0x001732, 0x001733, PG_U_NON_SPACING_MARK},
+	{0x001734, 0x001734, PG_U_COMBINING_SPACING_MARK},
+	{0x001735, 0x001736, PG_U_OTHER_PUNCTUATION},
+	{0x001737, 0x00173f, PG_U_UNASSIGNED},
+	{0x001740, 0x001751, PG_U_OTHER_LETTER},
+	{0x001752, 0x001753, PG_U_NON_SPACING_MARK},
+	{0x001754, 0x00175f, PG_U_UNASSIGNED},
+	{0x001760, 0x00176c, PG_U_OTHER_LETTER},
+	{0x00176d, 0x00176d, PG_U_UNASSIGNED},
+	{0x00176e, 0x001770, PG_U_OTHER_LETTER},
+	{0x001771, 0x001771, PG_U_UNASSIGNED},
+	{0x001772, 0x001773, PG_U_NON_SPACING_MARK},
+	{0x001774, 0x00177f, PG_U_UNASSIGNED},
+	{0x001780, 0x0017b3, PG_U_OTHER_LETTER},
+	{0x0017b4, 0x0017b5, PG_U_NON_SPACING_MARK},
+	{0x0017b6, 0x0017b6, PG_U_COMBINING_SPACING_MARK},
+	{0x0017b7, 0x0017bd, PG_U_NON_SPACING_MARK},
+	{0x0017be, 0x0017c5, PG_U_COMBINING_SPACING_MARK},
+	{0x0017c6, 0x0017c6, PG_U_NON_SPACING_MARK},
+	{0x0017c7, 0x0017c8, PG_U_COMBINING_SPACING_MARK},
+	{0x0017c9, 0x0017d3, PG_U_NON_SPACING_MARK},
+	{0x0017d4, 0x0017d6, PG_U_OTHER_PUNCTUATION},
+	{0x0017d7, 0x0017d7, PG_U_MODIFIER_LETTER},
+	{0x0017d8, 0x0017da, PG_U_OTHER_PUNCTUATION},
+	{0x0017db, 0x0017db, PG_U_CURRENCY_SYMBOL},
+	{0x0017dc, 0x0017dc, PG_U_OTHER_LETTER},
+	{0x0017dd, 0x0017dd, PG_U_NON_SPACING_MARK},
+	{0x0017de, 0x0017df, PG_U_UNASSIGNED},
+	{0x0017e0, 0x0017e9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0017ea, 0x0017ef, PG_U_UNASSIGNED},
+	{0x0017f0, 0x0017f9, PG_U_OTHER_NUMBER},
+	{0x0017fa, 0x0017ff, PG_U_UNASSIGNED},
+	{0x001800, 0x001805, PG_U_OTHER_PUNCTUATION},
+	{0x001806, 0x001806, PG_U_DASH_PUNCTUATION},
+	{0x001807, 0x00180a, PG_U_OTHER_PUNCTUATION},
+	{0x00180b, 0x00180d, PG_U_NON_SPACING_MARK},
+	{0x00180e, 0x00180e, PG_U_FORMAT_CHAR},
+	{0x00180f, 0x00180f, PG_U_NON_SPACING_MARK},
+	{0x001810, 0x001819, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00181a, 0x00181f, PG_U_UNASSIGNED},
+	{0x001820, 0x001842, PG_U_OTHER_LETTER},
+	{0x001843, 0x001843, PG_U_MODIFIER_LETTER},
+	{0x001844, 0x001878, PG_U_OTHER_LETTER},
+	{0x001879, 0x00187f, PG_U_UNASSIGNED},
+	{0x001880, 0x001884, PG_U_OTHER_LETTER},
+	{0x001885, 0x001886, PG_U_NON_SPACING_MARK},
+	{0x001887, 0x0018a8, PG_U_OTHER_LETTER},
+	{0x0018a9, 0x0018a9, PG_U_NON_SPACING_MARK},
+	{0x0018aa, 0x0018aa, PG_U_OTHER_LETTER},
+	{0x0018ab, 0x0018af, PG_U_UNASSIGNED},
+	{0x0018b0, 0x0018f5, PG_U_OTHER_LETTER},
+	{0x0018f6, 0x0018ff, PG_U_UNASSIGNED},
+	{0x001900, 0x00191e, PG_U_OTHER_LETTER},
+	{0x00191f, 0x00191f, PG_U_UNASSIGNED},
+	{0x001920, 0x001922, PG_U_NON_SPACING_MARK},
+	{0x001923, 0x001926, PG_U_COMBINING_SPACING_MARK},
+	{0x001927, 0x001928, PG_U_NON_SPACING_MARK},
+	{0x001929, 0x00192b, PG_U_COMBINING_SPACING_MARK},
+	{0x00192c, 0x00192f, PG_U_UNASSIGNED},
+	{0x001930, 0x001931, PG_U_COMBINING_SPACING_MARK},
+	{0x001932, 0x001932, PG_U_NON_SPACING_MARK},
+	{0x001933, 0x001938, PG_U_COMBINING_SPACING_MARK},
+	{0x001939, 0x00193b, PG_U_NON_SPACING_MARK},
+	{0x00193c, 0x00193f, PG_U_UNASSIGNED},
+	{0x001940, 0x001940, PG_U_OTHER_SYMBOL},
+	{0x001941, 0x001943, PG_U_UNASSIGNED},
+	{0x001944, 0x001945, PG_U_OTHER_PUNCTUATION},
+	{0x001946, 0x00194f, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x001950, 0x00196d, PG_U_OTHER_LETTER},
+	{0x00196e, 0x00196f, PG_U_UNASSIGNED},
+	{0x001970, 0x001974, PG_U_OTHER_LETTER},
+	{0x001975, 0x00197f, PG_U_UNASSIGNED},
+	{0x001980, 0x0019ab, PG_U_OTHER_LETTER},
+	{0x0019ac, 0x0019af, PG_U_UNASSIGNED},
+	{0x0019b0, 0x0019c9, PG_U_OTHER_LETTER},
+	{0x0019ca, 0x0019cf, PG_U_UNASSIGNED},
+	{0x0019d0, 0x0019d9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0019da, 0x0019da, PG_U_OTHER_NUMBER},
+	{0x0019db, 0x0019dd, PG_U_UNASSIGNED},
+	{0x0019de, 0x0019ff, PG_U_OTHER_SYMBOL},
+	{0x001a00, 0x001a16, PG_U_OTHER_LETTER},
+	{0x001a17, 0x001a18, PG_U_NON_SPACING_MARK},
+	{0x001a19, 0x001a1a, PG_U_COMBINING_SPACING_MARK},
+	{0x001a1b, 0x001a1b, PG_U_NON_SPACING_MARK},
+	{0x001a1c, 0x001a1d, PG_U_UNASSIGNED},
+	{0x001a1e, 0x001a1f, PG_U_OTHER_PUNCTUATION},
+	{0x001a20, 0x001a54, PG_U_OTHER_LETTER},
+	{0x001a55, 0x001a55, PG_U_COMBINING_SPACING_MARK},
+	{0x001a56, 0x001a56, PG_U_NON_SPACING_MARK},
+	{0x001a57, 0x001a57, PG_U_COMBINING_SPACING_MARK},
+	{0x001a58, 0x001a5e, PG_U_NON_SPACING_MARK},
+	{0x001a5f, 0x001a5f, PG_U_UNASSIGNED},
+	{0x001a60, 0x001a60, PG_U_NON_SPACING_MARK},
+	{0x001a61, 0x001a61, PG_U_COMBINING_SPACING_MARK},
+	{0x001a62, 0x001a62, PG_U_NON_SPACING_MARK},
+	{0x001a63, 0x001a64, PG_U_COMBINING_SPACING_MARK},
+	{0x001a65, 0x001a6c, PG_U_NON_SPACING_MARK},
+	{0x001a6d, 0x001a72, PG_U_COMBINING_SPACING_MARK},
+	{0x001a73, 0x001a7c, PG_U_NON_SPACING_MARK},
+	{0x001a7d, 0x001a7e, PG_U_UNASSIGNED},
+	{0x001a7f, 0x001a7f, PG_U_NON_SPACING_MARK},
+	{0x001a80, 0x001a89, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x001a8a, 0x001a8f, PG_U_UNASSIGNED},
+	{0x001a90, 0x001a99, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x001a9a, 0x001a9f, PG_U_UNASSIGNED},
+	{0x001aa0, 0x001aa6, PG_U_OTHER_PUNCTUATION},
+	{0x001aa7, 0x001aa7, PG_U_MODIFIER_LETTER},
+	{0x001aa8, 0x001aad, PG_U_OTHER_PUNCTUATION},
+	{0x001aae, 0x001aaf, PG_U_UNASSIGNED},
+	{0x001ab0, 0x001abd, PG_U_NON_SPACING_MARK},
+	{0x001abe, 0x001abe, PG_U_ENCLOSING_MARK},
+	{0x001abf, 0x001ace, PG_U_NON_SPACING_MARK},
+	{0x001acf, 0x001aff, PG_U_UNASSIGNED},
+	{0x001b00, 0x001b03, PG_U_NON_SPACING_MARK},
+	{0x001b04, 0x001b04, PG_U_COMBINING_SPACING_MARK},
+	{0x001b05, 0x001b33, PG_U_OTHER_LETTER},
+	{0x001b34, 0x001b34, PG_U_NON_SPACING_MARK},
+	{0x001b35, 0x001b35, PG_U_COMBINING_SPACING_MARK},
+	{0x001b36, 0x001b3a, PG_U_NON_SPACING_MARK},
+	{0x001b3b, 0x001b3b, PG_U_COMBINING_SPACING_MARK},
+	{0x001b3c, 0x001b3c, PG_U_NON_SPACING_MARK},
+	{0x001b3d, 0x001b41, PG_U_COMBINING_SPACING_MARK},
+	{0x001b42, 0x001b42, PG_U_NON_SPACING_MARK},
+	{0x001b43, 0x001b44, PG_U_COMBINING_SPACING_MARK},
+	{0x001b45, 0x001b4c, PG_U_OTHER_LETTER},
+	{0x001b4d, 0x001b4f, PG_U_UNASSIGNED},
+	{0x001b50, 0x001b59, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x001b5a, 0x001b60, PG_U_OTHER_PUNCTUATION},
+	{0x001b61, 0x001b6a, PG_U_OTHER_SYMBOL},
+	{0x001b6b, 0x001b73, PG_U_NON_SPACING_MARK},
+	{0x001b74, 0x001b7c, PG_U_OTHER_SYMBOL},
+	{0x001b7d, 0x001b7e, PG_U_OTHER_PUNCTUATION},
+	{0x001b7f, 0x001b7f, PG_U_UNASSIGNED},
+	{0x001b80, 0x001b81, PG_U_NON_SPACING_MARK},
+	{0x001b82, 0x001b82, PG_U_COMBINING_SPACING_MARK},
+	{0x001b83, 0x001ba0, PG_U_OTHER_LETTER},
+	{0x001ba1, 0x001ba1, PG_U_COMBINING_SPACING_MARK},
+	{0x001ba2, 0x001ba5, PG_U_NON_SPACING_MARK},
+	{0x001ba6, 0x001ba7, PG_U_COMBINING_SPACING_MARK},
+	{0x001ba8, 0x001ba9, PG_U_NON_SPACING_MARK},
+	{0x001baa, 0x001baa, PG_U_COMBINING_SPACING_MARK},
+	{0x001bab, 0x001bad, PG_U_NON_SPACING_MARK},
+	{0x001bae, 0x001baf, PG_U_OTHER_LETTER},
+	{0x001bb0, 0x001bb9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x001bba, 0x001be5, PG_U_OTHER_LETTER},
+	{0x001be6, 0x001be6, PG_U_NON_SPACING_MARK},
+	{0x001be7, 0x001be7, PG_U_COMBINING_SPACING_MARK},
+	{0x001be8, 0x001be9, PG_U_NON_SPACING_MARK},
+	{0x001bea, 0x001bec, PG_U_COMBINING_SPACING_MARK},
+	{0x001bed, 0x001bed, PG_U_NON_SPACING_MARK},
+	{0x001bee, 0x001bee, PG_U_COMBINING_SPACING_MARK},
+	{0x001bef, 0x001bf1, PG_U_NON_SPACING_MARK},
+	{0x001bf2, 0x001bf3, PG_U_COMBINING_SPACING_MARK},
+	{0x001bf4, 0x001bfb, PG_U_UNASSIGNED},
+	{0x001bfc, 0x001bff, PG_U_OTHER_PUNCTUATION},
+	{0x001c00, 0x001c23, PG_U_OTHER_LETTER},
+	{0x001c24, 0x001c2b, PG_U_COMBINING_SPACING_MARK},
+	{0x001c2c, 0x001c33, PG_U_NON_SPACING_MARK},
+	{0x001c34, 0x001c35, PG_U_COMBINING_SPACING_MARK},
+	{0x001c36, 0x001c37, PG_U_NON_SPACING_MARK},
+	{0x001c38, 0x001c3a, PG_U_UNASSIGNED},
+	{0x001c3b, 0x001c3f, PG_U_OTHER_PUNCTUATION},
+	{0x001c40, 0x001c49, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x001c4a, 0x001c4c, PG_U_UNASSIGNED},
+	{0x001c4d, 0x001c4f, PG_U_OTHER_LETTER},
+	{0x001c50, 0x001c59, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x001c5a, 0x001c77, PG_U_OTHER_LETTER},
+	{0x001c78, 0x001c7d, PG_U_MODIFIER_LETTER},
+	{0x001c7e, 0x001c7f, PG_U_OTHER_PUNCTUATION},
+	{0x001c80, 0x001c88, PG_U_LOWERCASE_LETTER},
+	{0x001c89, 0x001c8f, PG_U_UNASSIGNED},
+	{0x001c90, 0x001cba, PG_U_UPPERCASE_LETTER},
+	{0x001cbb, 0x001cbc, PG_U_UNASSIGNED},
+	{0x001cbd, 0x001cbf, PG_U_UPPERCASE_LETTER},
+	{0x001cc0, 0x001cc7, PG_U_OTHER_PUNCTUATION},
+	{0x001cc8, 0x001ccf, PG_U_UNASSIGNED},
+	{0x001cd0, 0x001cd2, PG_U_NON_SPACING_MARK},
+	{0x001cd3, 0x001cd3, PG_U_OTHER_PUNCTUATION},
+	{0x001cd4, 0x001ce0, PG_U_NON_SPACING_MARK},
+	{0x001ce1, 0x001ce1, PG_U_COMBINING_SPACING_MARK},
+	{0x001ce2, 0x001ce8, PG_U_NON_SPACING_MARK},
+	{0x001ce9, 0x001cec, PG_U_OTHER_LETTER},
+	{0x001ced, 0x001ced, PG_U_NON_SPACING_MARK},
+	{0x001cee, 0x001cf3, PG_U_OTHER_LETTER},
+	{0x001cf4, 0x001cf4, PG_U_NON_SPACING_MARK},
+	{0x001cf5, 0x001cf6, PG_U_OTHER_LETTER},
+	{0x001cf7, 0x001cf7, PG_U_COMBINING_SPACING_MARK},
+	{0x001cf8, 0x001cf9, PG_U_NON_SPACING_MARK},
+	{0x001cfa, 0x001cfa, PG_U_OTHER_LETTER},
+	{0x001cfb, 0x001cff, PG_U_UNASSIGNED},
+	{0x001d00, 0x001d2b, PG_U_LOWERCASE_LETTER},
+	{0x001d2c, 0x001d6a, PG_U_MODIFIER_LETTER},
+	{0x001d6b, 0x001d77, PG_U_LOWERCASE_LETTER},
+	{0x001d78, 0x001d78, PG_U_MODIFIER_LETTER},
+	{0x001d79, 0x001d9a, PG_U_LOWERCASE_LETTER},
+	{0x001d9b, 0x001dbf, PG_U_MODIFIER_LETTER},
+	{0x001dc0, 0x001dff, PG_U_NON_SPACING_MARK},
+	{0x001e00, 0x001e00, PG_U_UPPERCASE_LETTER},
+	{0x001e01, 0x001e01, PG_U_LOWERCASE_LETTER},
+	{0x001e02, 0x001e02, PG_U_UPPERCASE_LETTER},
+	{0x001e03, 0x001e03, PG_U_LOWERCASE_LETTER},
+	{0x001e04, 0x001e04, PG_U_UPPERCASE_LETTER},
+	{0x001e05, 0x001e05, PG_U_LOWERCASE_LETTER},
+	{0x001e06, 0x001e06, PG_U_UPPERCASE_LETTER},
+	{0x001e07, 0x001e07, PG_U_LOWERCASE_LETTER},
+	{0x001e08, 0x001e08, PG_U_UPPERCASE_LETTER},
+	{0x001e09, 0x001e09, PG_U_LOWERCASE_LETTER},
+	{0x001e0a, 0x001e0a, PG_U_UPPERCASE_LETTER},
+	{0x001e0b, 0x001e0b, PG_U_LOWERCASE_LETTER},
+	{0x001e0c, 0x001e0c, PG_U_UPPERCASE_LETTER},
+	{0x001e0d, 0x001e0d, PG_U_LOWERCASE_LETTER},
+	{0x001e0e, 0x001e0e, PG_U_UPPERCASE_LETTER},
+	{0x001e0f, 0x001e0f, PG_U_LOWERCASE_LETTER},
+	{0x001e10, 0x001e10, PG_U_UPPERCASE_LETTER},
+	{0x001e11, 0x001e11, PG_U_LOWERCASE_LETTER},
+	{0x001e12, 0x001e12, PG_U_UPPERCASE_LETTER},
+	{0x001e13, 0x001e13, PG_U_LOWERCASE_LETTER},
+	{0x001e14, 0x001e14, PG_U_UPPERCASE_LETTER},
+	{0x001e15, 0x001e15, PG_U_LOWERCASE_LETTER},
+	{0x001e16, 0x001e16, PG_U_UPPERCASE_LETTER},
+	{0x001e17, 0x001e17, PG_U_LOWERCASE_LETTER},
+	{0x001e18, 0x001e18, PG_U_UPPERCASE_LETTER},
+	{0x001e19, 0x001e19, PG_U_LOWERCASE_LETTER},
+	{0x001e1a, 0x001e1a, PG_U_UPPERCASE_LETTER},
+	{0x001e1b, 0x001e1b, PG_U_LOWERCASE_LETTER},
+	{0x001e1c, 0x001e1c, PG_U_UPPERCASE_LETTER},
+	{0x001e1d, 0x001e1d, PG_U_LOWERCASE_LETTER},
+	{0x001e1e, 0x001e1e, PG_U_UPPERCASE_LETTER},
+	{0x001e1f, 0x001e1f, PG_U_LOWERCASE_LETTER},
+	{0x001e20, 0x001e20, PG_U_UPPERCASE_LETTER},
+	{0x001e21, 0x001e21, PG_U_LOWERCASE_LETTER},
+	{0x001e22, 0x001e22, PG_U_UPPERCASE_LETTER},
+	{0x001e23, 0x001e23, PG_U_LOWERCASE_LETTER},
+	{0x001e24, 0x001e24, PG_U_UPPERCASE_LETTER},
+	{0x001e25, 0x001e25, PG_U_LOWERCASE_LETTER},
+	{0x001e26, 0x001e26, PG_U_UPPERCASE_LETTER},
+	{0x001e27, 0x001e27, PG_U_LOWERCASE_LETTER},
+	{0x001e28, 0x001e28, PG_U_UPPERCASE_LETTER},
+	{0x001e29, 0x001e29, PG_U_LOWERCASE_LETTER},
+	{0x001e2a, 0x001e2a, PG_U_UPPERCASE_LETTER},
+	{0x001e2b, 0x001e2b, PG_U_LOWERCASE_LETTER},
+	{0x001e2c, 0x001e2c, PG_U_UPPERCASE_LETTER},
+	{0x001e2d, 0x001e2d, PG_U_LOWERCASE_LETTER},
+	{0x001e2e, 0x001e2e, PG_U_UPPERCASE_LETTER},
+	{0x001e2f, 0x001e2f, PG_U_LOWERCASE_LETTER},
+	{0x001e30, 0x001e30, PG_U_UPPERCASE_LETTER},
+	{0x001e31, 0x001e31, PG_U_LOWERCASE_LETTER},
+	{0x001e32, 0x001e32, PG_U_UPPERCASE_LETTER},
+	{0x001e33, 0x001e33, PG_U_LOWERCASE_LETTER},
+	{0x001e34, 0x001e34, PG_U_UPPERCASE_LETTER},
+	{0x001e35, 0x001e35, PG_U_LOWERCASE_LETTER},
+	{0x001e36, 0x001e36, PG_U_UPPERCASE_LETTER},
+	{0x001e37, 0x001e37, PG_U_LOWERCASE_LETTER},
+	{0x001e38, 0x001e38, PG_U_UPPERCASE_LETTER},
+	{0x001e39, 0x001e39, PG_U_LOWERCASE_LETTER},
+	{0x001e3a, 0x001e3a, PG_U_UPPERCASE_LETTER},
+	{0x001e3b, 0x001e3b, PG_U_LOWERCASE_LETTER},
+	{0x001e3c, 0x001e3c, PG_U_UPPERCASE_LETTER},
+	{0x001e3d, 0x001e3d, PG_U_LOWERCASE_LETTER},
+	{0x001e3e, 0x001e3e, PG_U_UPPERCASE_LETTER},
+	{0x001e3f, 0x001e3f, PG_U_LOWERCASE_LETTER},
+	{0x001e40, 0x001e40, PG_U_UPPERCASE_LETTER},
+	{0x001e41, 0x001e41, PG_U_LOWERCASE_LETTER},
+	{0x001e42, 0x001e42, PG_U_UPPERCASE_LETTER},
+	{0x001e43, 0x001e43, PG_U_LOWERCASE_LETTER},
+	{0x001e44, 0x001e44, PG_U_UPPERCASE_LETTER},
+	{0x001e45, 0x001e45, PG_U_LOWERCASE_LETTER},
+	{0x001e46, 0x001e46, PG_U_UPPERCASE_LETTER},
+	{0x001e47, 0x001e47, PG_U_LOWERCASE_LETTER},
+	{0x001e48, 0x001e48, PG_U_UPPERCASE_LETTER},
+	{0x001e49, 0x001e49, PG_U_LOWERCASE_LETTER},
+	{0x001e4a, 0x001e4a, PG_U_UPPERCASE_LETTER},
+	{0x001e4b, 0x001e4b, PG_U_LOWERCASE_LETTER},
+	{0x001e4c, 0x001e4c, PG_U_UPPERCASE_LETTER},
+	{0x001e4d, 0x001e4d, PG_U_LOWERCASE_LETTER},
+	{0x001e4e, 0x001e4e, PG_U_UPPERCASE_LETTER},
+	{0x001e4f, 0x001e4f, PG_U_LOWERCASE_LETTER},
+	{0x001e50, 0x001e50, PG_U_UPPERCASE_LETTER},
+	{0x001e51, 0x001e51, PG_U_LOWERCASE_LETTER},
+	{0x001e52, 0x001e52, PG_U_UPPERCASE_LETTER},
+	{0x001e53, 0x001e53, PG_U_LOWERCASE_LETTER},
+	{0x001e54, 0x001e54, PG_U_UPPERCASE_LETTER},
+	{0x001e55, 0x001e55, PG_U_LOWERCASE_LETTER},
+	{0x001e56, 0x001e56, PG_U_UPPERCASE_LETTER},
+	{0x001e57, 0x001e57, PG_U_LOWERCASE_LETTER},
+	{0x001e58, 0x001e58, PG_U_UPPERCASE_LETTER},
+	{0x001e59, 0x001e59, PG_U_LOWERCASE_LETTER},
+	{0x001e5a, 0x001e5a, PG_U_UPPERCASE_LETTER},
+	{0x001e5b, 0x001e5b, PG_U_LOWERCASE_LETTER},
+	{0x001e5c, 0x001e5c, PG_U_UPPERCASE_LETTER},
+	{0x001e5d, 0x001e5d, PG_U_LOWERCASE_LETTER},
+	{0x001e5e, 0x001e5e, PG_U_UPPERCASE_LETTER},
+	{0x001e5f, 0x001e5f, PG_U_LOWERCASE_LETTER},
+	{0x001e60, 0x001e60, PG_U_UPPERCASE_LETTER},
+	{0x001e61, 0x001e61, PG_U_LOWERCASE_LETTER},
+	{0x001e62, 0x001e62, PG_U_UPPERCASE_LETTER},
+	{0x001e63, 0x001e63, PG_U_LOWERCASE_LETTER},
+	{0x001e64, 0x001e64, PG_U_UPPERCASE_LETTER},
+	{0x001e65, 0x001e65, PG_U_LOWERCASE_LETTER},
+	{0x001e66, 0x001e66, PG_U_UPPERCASE_LETTER},
+	{0x001e67, 0x001e67, PG_U_LOWERCASE_LETTER},
+	{0x001e68, 0x001e68, PG_U_UPPERCASE_LETTER},
+	{0x001e69, 0x001e69, PG_U_LOWERCASE_LETTER},
+	{0x001e6a, 0x001e6a, PG_U_UPPERCASE_LETTER},
+	{0x001e6b, 0x001e6b, PG_U_LOWERCASE_LETTER},
+	{0x001e6c, 0x001e6c, PG_U_UPPERCASE_LETTER},
+	{0x001e6d, 0x001e6d, PG_U_LOWERCASE_LETTER},
+	{0x001e6e, 0x001e6e, PG_U_UPPERCASE_LETTER},
+	{0x001e6f, 0x001e6f, PG_U_LOWERCASE_LETTER},
+	{0x001e70, 0x001e70, PG_U_UPPERCASE_LETTER},
+	{0x001e71, 0x001e71, PG_U_LOWERCASE_LETTER},
+	{0x001e72, 0x001e72, PG_U_UPPERCASE_LETTER},
+	{0x001e73, 0x001e73, PG_U_LOWERCASE_LETTER},
+	{0x001e74, 0x001e74, PG_U_UPPERCASE_LETTER},
+	{0x001e75, 0x001e75, PG_U_LOWERCASE_LETTER},
+	{0x001e76, 0x001e76, PG_U_UPPERCASE_LETTER},
+	{0x001e77, 0x001e77, PG_U_LOWERCASE_LETTER},
+	{0x001e78, 0x001e78, PG_U_UPPERCASE_LETTER},
+	{0x001e79, 0x001e79, PG_U_LOWERCASE_LETTER},
+	{0x001e7a, 0x001e7a, PG_U_UPPERCASE_LETTER},
+	{0x001e7b, 0x001e7b, PG_U_LOWERCASE_LETTER},
+	{0x001e7c, 0x001e7c, PG_U_UPPERCASE_LETTER},
+	{0x001e7d, 0x001e7d, PG_U_LOWERCASE_LETTER},
+	{0x001e7e, 0x001e7e, PG_U_UPPERCASE_LETTER},
+	{0x001e7f, 0x001e7f, PG_U_LOWERCASE_LETTER},
+	{0x001e80, 0x001e80, PG_U_UPPERCASE_LETTER},
+	{0x001e81, 0x001e81, PG_U_LOWERCASE_LETTER},
+	{0x001e82, 0x001e82, PG_U_UPPERCASE_LETTER},
+	{0x001e83, 0x001e83, PG_U_LOWERCASE_LETTER},
+	{0x001e84, 0x001e84, PG_U_UPPERCASE_LETTER},
+	{0x001e85, 0x001e85, PG_U_LOWERCASE_LETTER},
+	{0x001e86, 0x001e86, PG_U_UPPERCASE_LETTER},
+	{0x001e87, 0x001e87, PG_U_LOWERCASE_LETTER},
+	{0x001e88, 0x001e88, PG_U_UPPERCASE_LETTER},
+	{0x001e89, 0x001e89, PG_U_LOWERCASE_LETTER},
+	{0x001e8a, 0x001e8a, PG_U_UPPERCASE_LETTER},
+	{0x001e8b, 0x001e8b, PG_U_LOWERCASE_LETTER},
+	{0x001e8c, 0x001e8c, PG_U_UPPERCASE_LETTER},
+	{0x001e8d, 0x001e8d, PG_U_LOWERCASE_LETTER},
+	{0x001e8e, 0x001e8e, PG_U_UPPERCASE_LETTER},
+	{0x001e8f, 0x001e8f, PG_U_LOWERCASE_LETTER},
+	{0x001e90, 0x001e90, PG_U_UPPERCASE_LETTER},
+	{0x001e91, 0x001e91, PG_U_LOWERCASE_LETTER},
+	{0x001e92, 0x001e92, PG_U_UPPERCASE_LETTER},
+	{0x001e93, 0x001e93, PG_U_LOWERCASE_LETTER},
+	{0x001e94, 0x001e94, PG_U_UPPERCASE_LETTER},
+	{0x001e95, 0x001e9d, PG_U_LOWERCASE_LETTER},
+	{0x001e9e, 0x001e9e, PG_U_UPPERCASE_LETTER},
+	{0x001e9f, 0x001e9f, PG_U_LOWERCASE_LETTER},
+	{0x001ea0, 0x001ea0, PG_U_UPPERCASE_LETTER},
+	{0x001ea1, 0x001ea1, PG_U_LOWERCASE_LETTER},
+	{0x001ea2, 0x001ea2, PG_U_UPPERCASE_LETTER},
+	{0x001ea3, 0x001ea3, PG_U_LOWERCASE_LETTER},
+	{0x001ea4, 0x001ea4, PG_U_UPPERCASE_LETTER},
+	{0x001ea5, 0x001ea5, PG_U_LOWERCASE_LETTER},
+	{0x001ea6, 0x001ea6, PG_U_UPPERCASE_LETTER},
+	{0x001ea7, 0x001ea7, PG_U_LOWERCASE_LETTER},
+	{0x001ea8, 0x001ea8, PG_U_UPPERCASE_LETTER},
+	{0x001ea9, 0x001ea9, PG_U_LOWERCASE_LETTER},
+	{0x001eaa, 0x001eaa, PG_U_UPPERCASE_LETTER},
+	{0x001eab, 0x001eab, PG_U_LOWERCASE_LETTER},
+	{0x001eac, 0x001eac, PG_U_UPPERCASE_LETTER},
+	{0x001ead, 0x001ead, PG_U_LOWERCASE_LETTER},
+	{0x001eae, 0x001eae, PG_U_UPPERCASE_LETTER},
+	{0x001eaf, 0x001eaf, PG_U_LOWERCASE_LETTER},
+	{0x001eb0, 0x001eb0, PG_U_UPPERCASE_LETTER},
+	{0x001eb1, 0x001eb1, PG_U_LOWERCASE_LETTER},
+	{0x001eb2, 0x001eb2, PG_U_UPPERCASE_LETTER},
+	{0x001eb3, 0x001eb3, PG_U_LOWERCASE_LETTER},
+	{0x001eb4, 0x001eb4, PG_U_UPPERCASE_LETTER},
+	{0x001eb5, 0x001eb5, PG_U_LOWERCASE_LETTER},
+	{0x001eb6, 0x001eb6, PG_U_UPPERCASE_LETTER},
+	{0x001eb7, 0x001eb7, PG_U_LOWERCASE_LETTER},
+	{0x001eb8, 0x001eb8, PG_U_UPPERCASE_LETTER},
+	{0x001eb9, 0x001eb9, PG_U_LOWERCASE_LETTER},
+	{0x001eba, 0x001eba, PG_U_UPPERCASE_LETTER},
+	{0x001ebb, 0x001ebb, PG_U_LOWERCASE_LETTER},
+	{0x001ebc, 0x001ebc, PG_U_UPPERCASE_LETTER},
+	{0x001ebd, 0x001ebd, PG_U_LOWERCASE_LETTER},
+	{0x001ebe, 0x001ebe, PG_U_UPPERCASE_LETTER},
+	{0x001ebf, 0x001ebf, PG_U_LOWERCASE_LETTER},
+	{0x001ec0, 0x001ec0, PG_U_UPPERCASE_LETTER},
+	{0x001ec1, 0x001ec1, PG_U_LOWERCASE_LETTER},
+	{0x001ec2, 0x001ec2, PG_U_UPPERCASE_LETTER},
+	{0x001ec3, 0x001ec3, PG_U_LOWERCASE_LETTER},
+	{0x001ec4, 0x001ec4, PG_U_UPPERCASE_LETTER},
+	{0x001ec5, 0x001ec5, PG_U_LOWERCASE_LETTER},
+	{0x001ec6, 0x001ec6, PG_U_UPPERCASE_LETTER},
+	{0x001ec7, 0x001ec7, PG_U_LOWERCASE_LETTER},
+	{0x001ec8, 0x001ec8, PG_U_UPPERCASE_LETTER},
+	{0x001ec9, 0x001ec9, PG_U_LOWERCASE_LETTER},
+	{0x001eca, 0x001eca, PG_U_UPPERCASE_LETTER},
+	{0x001ecb, 0x001ecb, PG_U_LOWERCASE_LETTER},
+	{0x001ecc, 0x001ecc, PG_U_UPPERCASE_LETTER},
+	{0x001ecd, 0x001ecd, PG_U_LOWERCASE_LETTER},
+	{0x001ece, 0x001ece, PG_U_UPPERCASE_LETTER},
+	{0x001ecf, 0x001ecf, PG_U_LOWERCASE_LETTER},
+	{0x001ed0, 0x001ed0, PG_U_UPPERCASE_LETTER},
+	{0x001ed1, 0x001ed1, PG_U_LOWERCASE_LETTER},
+	{0x001ed2, 0x001ed2, PG_U_UPPERCASE_LETTER},
+	{0x001ed3, 0x001ed3, PG_U_LOWERCASE_LETTER},
+	{0x001ed4, 0x001ed4, PG_U_UPPERCASE_LETTER},
+	{0x001ed5, 0x001ed5, PG_U_LOWERCASE_LETTER},
+	{0x001ed6, 0x001ed6, PG_U_UPPERCASE_LETTER},
+	{0x001ed7, 0x001ed7, PG_U_LOWERCASE_LETTER},
+	{0x001ed8, 0x001ed8, PG_U_UPPERCASE_LETTER},
+	{0x001ed9, 0x001ed9, PG_U_LOWERCASE_LETTER},
+	{0x001eda, 0x001eda, PG_U_UPPERCASE_LETTER},
+	{0x001edb, 0x001edb, PG_U_LOWERCASE_LETTER},
+	{0x001edc, 0x001edc, PG_U_UPPERCASE_LETTER},
+	{0x001edd, 0x001edd, PG_U_LOWERCASE_LETTER},
+	{0x001ede, 0x001ede, PG_U_UPPERCASE_LETTER},
+	{0x001edf, 0x001edf, PG_U_LOWERCASE_LETTER},
+	{0x001ee0, 0x001ee0, PG_U_UPPERCASE_LETTER},
+	{0x001ee1, 0x001ee1, PG_U_LOWERCASE_LETTER},
+	{0x001ee2, 0x001ee2, PG_U_UPPERCASE_LETTER},
+	{0x001ee3, 0x001ee3, PG_U_LOWERCASE_LETTER},
+	{0x001ee4, 0x001ee4, PG_U_UPPERCASE_LETTER},
+	{0x001ee5, 0x001ee5, PG_U_LOWERCASE_LETTER},
+	{0x001ee6, 0x001ee6, PG_U_UPPERCASE_LETTER},
+	{0x001ee7, 0x001ee7, PG_U_LOWERCASE_LETTER},
+	{0x001ee8, 0x001ee8, PG_U_UPPERCASE_LETTER},
+	{0x001ee9, 0x001ee9, PG_U_LOWERCASE_LETTER},
+	{0x001eea, 0x001eea, PG_U_UPPERCASE_LETTER},
+	{0x001eeb, 0x001eeb, PG_U_LOWERCASE_LETTER},
+	{0x001eec, 0x001eec, PG_U_UPPERCASE_LETTER},
+	{0x001eed, 0x001eed, PG_U_LOWERCASE_LETTER},
+	{0x001eee, 0x001eee, PG_U_UPPERCASE_LETTER},
+	{0x001eef, 0x001eef, PG_U_LOWERCASE_LETTER},
+	{0x001ef0, 0x001ef0, PG_U_UPPERCASE_LETTER},
+	{0x001ef1, 0x001ef1, PG_U_LOWERCASE_LETTER},
+	{0x001ef2, 0x001ef2, PG_U_UPPERCASE_LETTER},
+	{0x001ef3, 0x001ef3, PG_U_LOWERCASE_LETTER},
+	{0x001ef4, 0x001ef4, PG_U_UPPERCASE_LETTER},
+	{0x001ef5, 0x001ef5, PG_U_LOWERCASE_LETTER},
+	{0x001ef6, 0x001ef6, PG_U_UPPERCASE_LETTER},
+	{0x001ef7, 0x001ef7, PG_U_LOWERCASE_LETTER},
+	{0x001ef8, 0x001ef8, PG_U_UPPERCASE_LETTER},
+	{0x001ef9, 0x001ef9, PG_U_LOWERCASE_LETTER},
+	{0x001efa, 0x001efa, PG_U_UPPERCASE_LETTER},
+	{0x001efb, 0x001efb, PG_U_LOWERCASE_LETTER},
+	{0x001efc, 0x001efc, PG_U_UPPERCASE_LETTER},
+	{0x001efd, 0x001efd, PG_U_LOWERCASE_LETTER},
+	{0x001efe, 0x001efe, PG_U_UPPERCASE_LETTER},
+	{0x001eff, 0x001f07, PG_U_LOWERCASE_LETTER},
+	{0x001f08, 0x001f0f, PG_U_UPPERCASE_LETTER},
+	{0x001f10, 0x001f15, PG_U_LOWERCASE_LETTER},
+	{0x001f16, 0x001f17, PG_U_UNASSIGNED},
+	{0x001f18, 0x001f1d, PG_U_UPPERCASE_LETTER},
+	{0x001f1e, 0x001f1f, PG_U_UNASSIGNED},
+	{0x001f20, 0x001f27, PG_U_LOWERCASE_LETTER},
+	{0x001f28, 0x001f2f, PG_U_UPPERCASE_LETTER},
+	{0x001f30, 0x001f37, PG_U_LOWERCASE_LETTER},
+	{0x001f38, 0x001f3f, PG_U_UPPERCASE_LETTER},
+	{0x001f40, 0x001f45, PG_U_LOWERCASE_LETTER},
+	{0x001f46, 0x001f47, PG_U_UNASSIGNED},
+	{0x001f48, 0x001f4d, PG_U_UPPERCASE_LETTER},
+	{0x001f4e, 0x001f4f, PG_U_UNASSIGNED},
+	{0x001f50, 0x001f57, PG_U_LOWERCASE_LETTER},
+	{0x001f58, 0x001f58, PG_U_UNASSIGNED},
+	{0x001f59, 0x001f59, PG_U_UPPERCASE_LETTER},
+	{0x001f5a, 0x001f5a, PG_U_UNASSIGNED},
+	{0x001f5b, 0x001f5b, PG_U_UPPERCASE_LETTER},
+	{0x001f5c, 0x001f5c, PG_U_UNASSIGNED},
+	{0x001f5d, 0x001f5d, PG_U_UPPERCASE_LETTER},
+	{0x001f5e, 0x001f5e, PG_U_UNASSIGNED},
+	{0x001f5f, 0x001f5f, PG_U_UPPERCASE_LETTER},
+	{0x001f60, 0x001f67, PG_U_LOWERCASE_LETTER},
+	{0x001f68, 0x001f6f, PG_U_UPPERCASE_LETTER},
+	{0x001f70, 0x001f7d, PG_U_LOWERCASE_LETTER},
+	{0x001f7e, 0x001f7f, PG_U_UNASSIGNED},
+	{0x001f80, 0x001f87, PG_U_LOWERCASE_LETTER},
+	{0x001f88, 0x001f8f, PG_U_TITLECASE_LETTER},
+	{0x001f90, 0x001f97, PG_U_LOWERCASE_LETTER},
+	{0x001f98, 0x001f9f, PG_U_TITLECASE_LETTER},
+	{0x001fa0, 0x001fa7, PG_U_LOWERCASE_LETTER},
+	{0x001fa8, 0x001faf, PG_U_TITLECASE_LETTER},
+	{0x001fb0, 0x001fb4, PG_U_LOWERCASE_LETTER},
+	{0x001fb5, 0x001fb5, PG_U_UNASSIGNED},
+	{0x001fb6, 0x001fb7, PG_U_LOWERCASE_LETTER},
+	{0x001fb8, 0x001fbb, PG_U_UPPERCASE_LETTER},
+	{0x001fbc, 0x001fbc, PG_U_TITLECASE_LETTER},
+	{0x001fbd, 0x001fbd, PG_U_MODIFIER_SYMBOL},
+	{0x001fbe, 0x001fbe, PG_U_LOWERCASE_LETTER},
+	{0x001fbf, 0x001fc1, PG_U_MODIFIER_SYMBOL},
+	{0x001fc2, 0x001fc4, PG_U_LOWERCASE_LETTER},
+	{0x001fc5, 0x001fc5, PG_U_UNASSIGNED},
+	{0x001fc6, 0x001fc7, PG_U_LOWERCASE_LETTER},
+	{0x001fc8, 0x001fcb, PG_U_UPPERCASE_LETTER},
+	{0x001fcc, 0x001fcc, PG_U_TITLECASE_LETTER},
+	{0x001fcd, 0x001fcf, PG_U_MODIFIER_SYMBOL},
+	{0x001fd0, 0x001fd3, PG_U_LOWERCASE_LETTER},
+	{0x001fd4, 0x001fd5, PG_U_UNASSIGNED},
+	{0x001fd6, 0x001fd7, PG_U_LOWERCASE_LETTER},
+	{0x001fd8, 0x001fdb, PG_U_UPPERCASE_LETTER},
+	{0x001fdc, 0x001fdc, PG_U_UNASSIGNED},
+	{0x001fdd, 0x001fdf, PG_U_MODIFIER_SYMBOL},
+	{0x001fe0, 0x001fe7, PG_U_LOWERCASE_LETTER},
+	{0x001fe8, 0x001fec, PG_U_UPPERCASE_LETTER},
+	{0x001fed, 0x001fef, PG_U_MODIFIER_SYMBOL},
+	{0x001ff0, 0x001ff1, PG_U_UNASSIGNED},
+	{0x001ff2, 0x001ff4, PG_U_LOWERCASE_LETTER},
+	{0x001ff5, 0x001ff5, PG_U_UNASSIGNED},
+	{0x001ff6, 0x001ff7, PG_U_LOWERCASE_LETTER},
+	{0x001ff8, 0x001ffb, PG_U_UPPERCASE_LETTER},
+	{0x001ffc, 0x001ffc, PG_U_TITLECASE_LETTER},
+	{0x001ffd, 0x001ffe, PG_U_MODIFIER_SYMBOL},
+	{0x001fff, 0x001fff, PG_U_UNASSIGNED},
+	{0x002000, 0x00200a, PG_U_SPACE_SEPARATOR},
+	{0x00200b, 0x00200f, PG_U_FORMAT_CHAR},
+	{0x002010, 0x002015, PG_U_DASH_PUNCTUATION},
+	{0x002016, 0x002017, PG_U_OTHER_PUNCTUATION},
+	{0x002018, 0x002018, PG_U_INITIAL_PUNCTUATION},
+	{0x002019, 0x002019, PG_U_FINAL_PUNCTUATION},
+	{0x00201a, 0x00201a, PG_U_START_PUNCTUATION},
+	{0x00201b, 0x00201c, PG_U_INITIAL_PUNCTUATION},
+	{0x00201d, 0x00201d, PG_U_FINAL_PUNCTUATION},
+	{0x00201e, 0x00201e, PG_U_START_PUNCTUATION},
+	{0x00201f, 0x00201f, PG_U_INITIAL_PUNCTUATION},
+	{0x002020, 0x002027, PG_U_OTHER_PUNCTUATION},
+	{0x002028, 0x002028, PG_U_LINE_SEPARATOR},
+	{0x002029, 0x002029, PG_U_PARAGRAPH_SEPARATOR},
+	{0x00202a, 0x00202e, PG_U_FORMAT_CHAR},
+	{0x00202f, 0x00202f, PG_U_SPACE_SEPARATOR},
+	{0x002030, 0x002038, PG_U_OTHER_PUNCTUATION},
+	{0x002039, 0x002039, PG_U_INITIAL_PUNCTUATION},
+	{0x00203a, 0x00203a, PG_U_FINAL_PUNCTUATION},
+	{0x00203b, 0x00203e, PG_U_OTHER_PUNCTUATION},
+	{0x00203f, 0x002040, PG_U_CONNECTOR_PUNCTUATION},
+	{0x002041, 0x002043, PG_U_OTHER_PUNCTUATION},
+	{0x002044, 0x002044, PG_U_MATH_SYMBOL},
+	{0x002045, 0x002045, PG_U_START_PUNCTUATION},
+	{0x002046, 0x002046, PG_U_END_PUNCTUATION},
+	{0x002047, 0x002051, PG_U_OTHER_PUNCTUATION},
+	{0x002052, 0x002052, PG_U_MATH_SYMBOL},
+	{0x002053, 0x002053, PG_U_OTHER_PUNCTUATION},
+	{0x002054, 0x002054, PG_U_CONNECTOR_PUNCTUATION},
+	{0x002055, 0x00205e, PG_U_OTHER_PUNCTUATION},
+	{0x00205f, 0x00205f, PG_U_SPACE_SEPARATOR},
+	{0x002060, 0x002064, PG_U_FORMAT_CHAR},
+	{0x002065, 0x002065, PG_U_UNASSIGNED},
+	{0x002066, 0x00206f, PG_U_FORMAT_CHAR},
+	{0x002070, 0x002070, PG_U_OTHER_NUMBER},
+	{0x002071, 0x002071, PG_U_MODIFIER_LETTER},
+	{0x002072, 0x002073, PG_U_UNASSIGNED},
+	{0x002074, 0x002079, PG_U_OTHER_NUMBER},
+	{0x00207a, 0x00207c, PG_U_MATH_SYMBOL},
+	{0x00207d, 0x00207d, PG_U_START_PUNCTUATION},
+	{0x00207e, 0x00207e, PG_U_END_PUNCTUATION},
+	{0x00207f, 0x00207f, PG_U_MODIFIER_LETTER},
+	{0x002080, 0x002089, PG_U_OTHER_NUMBER},
+	{0x00208a, 0x00208c, PG_U_MATH_SYMBOL},
+	{0x00208d, 0x00208d, PG_U_START_PUNCTUATION},
+	{0x00208e, 0x00208e, PG_U_END_PUNCTUATION},
+	{0x00208f, 0x00208f, PG_U_UNASSIGNED},
+	{0x002090, 0x00209c, PG_U_MODIFIER_LETTER},
+	{0x00209d, 0x00209f, PG_U_UNASSIGNED},
+	{0x0020a0, 0x0020c0, PG_U_CURRENCY_SYMBOL},
+	{0x0020c1, 0x0020cf, PG_U_UNASSIGNED},
+	{0x0020d0, 0x0020dc, PG_U_NON_SPACING_MARK},
+	{0x0020dd, 0x0020e0, PG_U_ENCLOSING_MARK},
+	{0x0020e1, 0x0020e1, PG_U_NON_SPACING_MARK},
+	{0x0020e2, 0x0020e4, PG_U_ENCLOSING_MARK},
+	{0x0020e5, 0x0020f0, PG_U_NON_SPACING_MARK},
+	{0x0020f1, 0x0020ff, PG_U_UNASSIGNED},
+	{0x002100, 0x002101, PG_U_OTHER_SYMBOL},
+	{0x002102, 0x002102, PG_U_UPPERCASE_LETTER},
+	{0x002103, 0x002106, PG_U_OTHER_SYMBOL},
+	{0x002107, 0x002107, PG_U_UPPERCASE_LETTER},
+	{0x002108, 0x002109, PG_U_OTHER_SYMBOL},
+	{0x00210a, 0x00210a, PG_U_LOWERCASE_LETTER},
+	{0x00210b, 0x00210d, PG_U_UPPERCASE_LETTER},
+	{0x00210e, 0x00210f, PG_U_LOWERCASE_LETTER},
+	{0x002110, 0x002112, PG_U_UPPERCASE_LETTER},
+	{0x002113, 0x002113, PG_U_LOWERCASE_LETTER},
+	{0x002114, 0x002114, PG_U_OTHER_SYMBOL},
+	{0x002115, 0x002115, PG_U_UPPERCASE_LETTER},
+	{0x002116, 0x002117, PG_U_OTHER_SYMBOL},
+	{0x002118, 0x002118, PG_U_MATH_SYMBOL},
+	{0x002119, 0x00211d, PG_U_UPPERCASE_LETTER},
+	{0x00211e, 0x002123, PG_U_OTHER_SYMBOL},
+	{0x002124, 0x002124, PG_U_UPPERCASE_LETTER},
+	{0x002125, 0x002125, PG_U_OTHER_SYMBOL},
+	{0x002126, 0x002126, PG_U_UPPERCASE_LETTER},
+	{0x002127, 0x002127, PG_U_OTHER_SYMBOL},
+	{0x002128, 0x002128, PG_U_UPPERCASE_LETTER},
+	{0x002129, 0x002129, PG_U_OTHER_SYMBOL},
+	{0x00212a, 0x00212d, PG_U_UPPERCASE_LETTER},
+	{0x00212e, 0x00212e, PG_U_OTHER_SYMBOL},
+	{0x00212f, 0x00212f, PG_U_LOWERCASE_LETTER},
+	{0x002130, 0x002133, PG_U_UPPERCASE_LETTER},
+	{0x002134, 0x002134, PG_U_LOWERCASE_LETTER},
+	{0x002135, 0x002138, PG_U_OTHER_LETTER},
+	{0x002139, 0x002139, PG_U_LOWERCASE_LETTER},
+	{0x00213a, 0x00213b, PG_U_OTHER_SYMBOL},
+	{0x00213c, 0x00213d, PG_U_LOWERCASE_LETTER},
+	{0x00213e, 0x00213f, PG_U_UPPERCASE_LETTER},
+	{0x002140, 0x002144, PG_U_MATH_SYMBOL},
+	{0x002145, 0x002145, PG_U_UPPERCASE_LETTER},
+	{0x002146, 0x002149, PG_U_LOWERCASE_LETTER},
+	{0x00214a, 0x00214a, PG_U_OTHER_SYMBOL},
+	{0x00214b, 0x00214b, PG_U_MATH_SYMBOL},
+	{0x00214c, 0x00214d, PG_U_OTHER_SYMBOL},
+	{0x00214e, 0x00214e, PG_U_LOWERCASE_LETTER},
+	{0x00214f, 0x00214f, PG_U_OTHER_SYMBOL},
+	{0x002150, 0x00215f, PG_U_OTHER_NUMBER},
+	{0x002160, 0x002182, PG_U_LETTER_NUMBER},
+	{0x002183, 0x002183, PG_U_UPPERCASE_LETTER},
+	{0x002184, 0x002184, PG_U_LOWERCASE_LETTER},
+	{0x002185, 0x002188, PG_U_LETTER_NUMBER},
+	{0x002189, 0x002189, PG_U_OTHER_NUMBER},
+	{0x00218a, 0x00218b, PG_U_OTHER_SYMBOL},
+	{0x00218c, 0x00218f, PG_U_UNASSIGNED},
+	{0x002190, 0x002194, PG_U_MATH_SYMBOL},
+	{0x002195, 0x002199, PG_U_OTHER_SYMBOL},
+	{0x00219a, 0x00219b, PG_U_MATH_SYMBOL},
+	{0x00219c, 0x00219f, PG_U_OTHER_SYMBOL},
+	{0x0021a0, 0x0021a0, PG_U_MATH_SYMBOL},
+	{0x0021a1, 0x0021a2, PG_U_OTHER_SYMBOL},
+	{0x0021a3, 0x0021a3, PG_U_MATH_SYMBOL},
+	{0x0021a4, 0x0021a5, PG_U_OTHER_SYMBOL},
+	{0x0021a6, 0x0021a6, PG_U_MATH_SYMBOL},
+	{0x0021a7, 0x0021ad, PG_U_OTHER_SYMBOL},
+	{0x0021ae, 0x0021ae, PG_U_MATH_SYMBOL},
+	{0x0021af, 0x0021cd, PG_U_OTHER_SYMBOL},
+	{0x0021ce, 0x0021cf, PG_U_MATH_SYMBOL},
+	{0x0021d0, 0x0021d1, PG_U_OTHER_SYMBOL},
+	{0x0021d2, 0x0021d2, PG_U_MATH_SYMBOL},
+	{0x0021d3, 0x0021d3, PG_U_OTHER_SYMBOL},
+	{0x0021d4, 0x0021d4, PG_U_MATH_SYMBOL},
+	{0x0021d5, 0x0021f3, PG_U_OTHER_SYMBOL},
+	{0x0021f4, 0x0022ff, PG_U_MATH_SYMBOL},
+	{0x002300, 0x002307, PG_U_OTHER_SYMBOL},
+	{0x002308, 0x002308, PG_U_START_PUNCTUATION},
+	{0x002309, 0x002309, PG_U_END_PUNCTUATION},
+	{0x00230a, 0x00230a, PG_U_START_PUNCTUATION},
+	{0x00230b, 0x00230b, PG_U_END_PUNCTUATION},
+	{0x00230c, 0x00231f, PG_U_OTHER_SYMBOL},
+	{0x002320, 0x002321, PG_U_MATH_SYMBOL},
+	{0x002322, 0x002328, PG_U_OTHER_SYMBOL},
+	{0x002329, 0x002329, PG_U_START_PUNCTUATION},
+	{0x00232a, 0x00232a, PG_U_END_PUNCTUATION},
+	{0x00232b, 0x00237b, PG_U_OTHER_SYMBOL},
+	{0x00237c, 0x00237c, PG_U_MATH_SYMBOL},
+	{0x00237d, 0x00239a, PG_U_OTHER_SYMBOL},
+	{0x00239b, 0x0023b3, PG_U_MATH_SYMBOL},
+	{0x0023b4, 0x0023db, PG_U_OTHER_SYMBOL},
+	{0x0023dc, 0x0023e1, PG_U_MATH_SYMBOL},
+	{0x0023e2, 0x002426, PG_U_OTHER_SYMBOL},
+	{0x002427, 0x00243f, PG_U_UNASSIGNED},
+	{0x002440, 0x00244a, PG_U_OTHER_SYMBOL},
+	{0x00244b, 0x00245f, PG_U_UNASSIGNED},
+	{0x002460, 0x00249b, PG_U_OTHER_NUMBER},
+	{0x00249c, 0x0024e9, PG_U_OTHER_SYMBOL},
+	{0x0024ea, 0x0024ff, PG_U_OTHER_NUMBER},
+	{0x002500, 0x0025b6, PG_U_OTHER_SYMBOL},
+	{0x0025b7, 0x0025b7, PG_U_MATH_SYMBOL},
+	{0x0025b8, 0x0025c0, PG_U_OTHER_SYMBOL},
+	{0x0025c1, 0x0025c1, PG_U_MATH_SYMBOL},
+	{0x0025c2, 0x0025f7, PG_U_OTHER_SYMBOL},
+	{0x0025f8, 0x0025ff, PG_U_MATH_SYMBOL},
+	{0x002600, 0x00266e, PG_U_OTHER_SYMBOL},
+	{0x00266f, 0x00266f, PG_U_MATH_SYMBOL},
+	{0x002670, 0x002767, PG_U_OTHER_SYMBOL},
+	{0x002768, 0x002768, PG_U_START_PUNCTUATION},
+	{0x002769, 0x002769, PG_U_END_PUNCTUATION},
+	{0x00276a, 0x00276a, PG_U_START_PUNCTUATION},
+	{0x00276b, 0x00276b, PG_U_END_PUNCTUATION},
+	{0x00276c, 0x00276c, PG_U_START_PUNCTUATION},
+	{0x00276d, 0x00276d, PG_U_END_PUNCTUATION},
+	{0x00276e, 0x00276e, PG_U_START_PUNCTUATION},
+	{0x00276f, 0x00276f, PG_U_END_PUNCTUATION},
+	{0x002770, 0x002770, PG_U_START_PUNCTUATION},
+	{0x002771, 0x002771, PG_U_END_PUNCTUATION},
+	{0x002772, 0x002772, PG_U_START_PUNCTUATION},
+	{0x002773, 0x002773, PG_U_END_PUNCTUATION},
+	{0x002774, 0x002774, PG_U_START_PUNCTUATION},
+	{0x002775, 0x002775, PG_U_END_PUNCTUATION},
+	{0x002776, 0x002793, PG_U_OTHER_NUMBER},
+	{0x002794, 0x0027bf, PG_U_OTHER_SYMBOL},
+	{0x0027c0, 0x0027c4, PG_U_MATH_SYMBOL},
+	{0x0027c5, 0x0027c5, PG_U_START_PUNCTUATION},
+	{0x0027c6, 0x0027c6, PG_U_END_PUNCTUATION},
+	{0x0027c7, 0x0027e5, PG_U_MATH_SYMBOL},
+	{0x0027e6, 0x0027e6, PG_U_START_PUNCTUATION},
+	{0x0027e7, 0x0027e7, PG_U_END_PUNCTUATION},
+	{0x0027e8, 0x0027e8, PG_U_START_PUNCTUATION},
+	{0x0027e9, 0x0027e9, PG_U_END_PUNCTUATION},
+	{0x0027ea, 0x0027ea, PG_U_START_PUNCTUATION},
+	{0x0027eb, 0x0027eb, PG_U_END_PUNCTUATION},
+	{0x0027ec, 0x0027ec, PG_U_START_PUNCTUATION},
+	{0x0027ed, 0x0027ed, PG_U_END_PUNCTUATION},
+	{0x0027ee, 0x0027ee, PG_U_START_PUNCTUATION},
+	{0x0027ef, 0x0027ef, PG_U_END_PUNCTUATION},
+	{0x0027f0, 0x0027ff, PG_U_MATH_SYMBOL},
+	{0x002800, 0x0028ff, PG_U_OTHER_SYMBOL},
+	{0x002900, 0x002982, PG_U_MATH_SYMBOL},
+	{0x002983, 0x002983, PG_U_START_PUNCTUATION},
+	{0x002984, 0x002984, PG_U_END_PUNCTUATION},
+	{0x002985, 0x002985, PG_U_START_PUNCTUATION},
+	{0x002986, 0x002986, PG_U_END_PUNCTUATION},
+	{0x002987, 0x002987, PG_U_START_PUNCTUATION},
+	{0x002988, 0x002988, PG_U_END_PUNCTUATION},
+	{0x002989, 0x002989, PG_U_START_PUNCTUATION},
+	{0x00298a, 0x00298a, PG_U_END_PUNCTUATION},
+	{0x00298b, 0x00298b, PG_U_START_PUNCTUATION},
+	{0x00298c, 0x00298c, PG_U_END_PUNCTUATION},
+	{0x00298d, 0x00298d, PG_U_START_PUNCTUATION},
+	{0x00298e, 0x00298e, PG_U_END_PUNCTUATION},
+	{0x00298f, 0x00298f, PG_U_START_PUNCTUATION},
+	{0x002990, 0x002990, PG_U_END_PUNCTUATION},
+	{0x002991, 0x002991, PG_U_START_PUNCTUATION},
+	{0x002992, 0x002992, PG_U_END_PUNCTUATION},
+	{0x002993, 0x002993, PG_U_START_PUNCTUATION},
+	{0x002994, 0x002994, PG_U_END_PUNCTUATION},
+	{0x002995, 0x002995, PG_U_START_PUNCTUATION},
+	{0x002996, 0x002996, PG_U_END_PUNCTUATION},
+	{0x002997, 0x002997, PG_U_START_PUNCTUATION},
+	{0x002998, 0x002998, PG_U_END_PUNCTUATION},
+	{0x002999, 0x0029d7, PG_U_MATH_SYMBOL},
+	{0x0029d8, 0x0029d8, PG_U_START_PUNCTUATION},
+	{0x0029d9, 0x0029d9, PG_U_END_PUNCTUATION},
+	{0x0029da, 0x0029da, PG_U_START_PUNCTUATION},
+	{0x0029db, 0x0029db, PG_U_END_PUNCTUATION},
+	{0x0029dc, 0x0029fb, PG_U_MATH_SYMBOL},
+	{0x0029fc, 0x0029fc, PG_U_START_PUNCTUATION},
+	{0x0029fd, 0x0029fd, PG_U_END_PUNCTUATION},
+	{0x0029fe, 0x002aff, PG_U_MATH_SYMBOL},
+	{0x002b00, 0x002b2f, PG_U_OTHER_SYMBOL},
+	{0x002b30, 0x002b44, PG_U_MATH_SYMBOL},
+	{0x002b45, 0x002b46, PG_U_OTHER_SYMBOL},
+	{0x002b47, 0x002b4c, PG_U_MATH_SYMBOL},
+	{0x002b4d, 0x002b73, PG_U_OTHER_SYMBOL},
+	{0x002b74, 0x002b75, PG_U_UNASSIGNED},
+	{0x002b76, 0x002b95, PG_U_OTHER_SYMBOL},
+	{0x002b96, 0x002b96, PG_U_UNASSIGNED},
+	{0x002b97, 0x002bff, PG_U_OTHER_SYMBOL},
+	{0x002c00, 0x002c2f, PG_U_UPPERCASE_LETTER},
+	{0x002c30, 0x002c5f, PG_U_LOWERCASE_LETTER},
+	{0x002c60, 0x002c60, PG_U_UPPERCASE_LETTER},
+	{0x002c61, 0x002c61, PG_U_LOWERCASE_LETTER},
+	{0x002c62, 0x002c64, PG_U_UPPERCASE_LETTER},
+	{0x002c65, 0x002c66, PG_U_LOWERCASE_LETTER},
+	{0x002c67, 0x002c67, PG_U_UPPERCASE_LETTER},
+	{0x002c68, 0x002c68, PG_U_LOWERCASE_LETTER},
+	{0x002c69, 0x002c69, PG_U_UPPERCASE_LETTER},
+	{0x002c6a, 0x002c6a, PG_U_LOWERCASE_LETTER},
+	{0x002c6b, 0x002c6b, PG_U_UPPERCASE_LETTER},
+	{0x002c6c, 0x002c6c, PG_U_LOWERCASE_LETTER},
+	{0x002c6d, 0x002c70, PG_U_UPPERCASE_LETTER},
+	{0x002c71, 0x002c71, PG_U_LOWERCASE_LETTER},
+	{0x002c72, 0x002c72, PG_U_UPPERCASE_LETTER},
+	{0x002c73, 0x002c74, PG_U_LOWERCASE_LETTER},
+	{0x002c75, 0x002c75, PG_U_UPPERCASE_LETTER},
+	{0x002c76, 0x002c7b, PG_U_LOWERCASE_LETTER},
+	{0x002c7c, 0x002c7d, PG_U_MODIFIER_LETTER},
+	{0x002c7e, 0x002c80, PG_U_UPPERCASE_LETTER},
+	{0x002c81, 0x002c81, PG_U_LOWERCASE_LETTER},
+	{0x002c82, 0x002c82, PG_U_UPPERCASE_LETTER},
+	{0x002c83, 0x002c83, PG_U_LOWERCASE_LETTER},
+	{0x002c84, 0x002c84, PG_U_UPPERCASE_LETTER},
+	{0x002c85, 0x002c85, PG_U_LOWERCASE_LETTER},
+	{0x002c86, 0x002c86, PG_U_UPPERCASE_LETTER},
+	{0x002c87, 0x002c87, PG_U_LOWERCASE_LETTER},
+	{0x002c88, 0x002c88, PG_U_UPPERCASE_LETTER},
+	{0x002c89, 0x002c89, PG_U_LOWERCASE_LETTER},
+	{0x002c8a, 0x002c8a, PG_U_UPPERCASE_LETTER},
+	{0x002c8b, 0x002c8b, PG_U_LOWERCASE_LETTER},
+	{0x002c8c, 0x002c8c, PG_U_UPPERCASE_LETTER},
+	{0x002c8d, 0x002c8d, PG_U_LOWERCASE_LETTER},
+	{0x002c8e, 0x002c8e, PG_U_UPPERCASE_LETTER},
+	{0x002c8f, 0x002c8f, PG_U_LOWERCASE_LETTER},
+	{0x002c90, 0x002c90, PG_U_UPPERCASE_LETTER},
+	{0x002c91, 0x002c91, PG_U_LOWERCASE_LETTER},
+	{0x002c92, 0x002c92, PG_U_UPPERCASE_LETTER},
+	{0x002c93, 0x002c93, PG_U_LOWERCASE_LETTER},
+	{0x002c94, 0x002c94, PG_U_UPPERCASE_LETTER},
+	{0x002c95, 0x002c95, PG_U_LOWERCASE_LETTER},
+	{0x002c96, 0x002c96, PG_U_UPPERCASE_LETTER},
+	{0x002c97, 0x002c97, PG_U_LOWERCASE_LETTER},
+	{0x002c98, 0x002c98, PG_U_UPPERCASE_LETTER},
+	{0x002c99, 0x002c99, PG_U_LOWERCASE_LETTER},
+	{0x002c9a, 0x002c9a, PG_U_UPPERCASE_LETTER},
+	{0x002c9b, 0x002c9b, PG_U_LOWERCASE_LETTER},
+	{0x002c9c, 0x002c9c, PG_U_UPPERCASE_LETTER},
+	{0x002c9d, 0x002c9d, PG_U_LOWERCASE_LETTER},
+	{0x002c9e, 0x002c9e, PG_U_UPPERCASE_LETTER},
+	{0x002c9f, 0x002c9f, PG_U_LOWERCASE_LETTER},
+	{0x002ca0, 0x002ca0, PG_U_UPPERCASE_LETTER},
+	{0x002ca1, 0x002ca1, PG_U_LOWERCASE_LETTER},
+	{0x002ca2, 0x002ca2, PG_U_UPPERCASE_LETTER},
+	{0x002ca3, 0x002ca3, PG_U_LOWERCASE_LETTER},
+	{0x002ca4, 0x002ca4, PG_U_UPPERCASE_LETTER},
+	{0x002ca5, 0x002ca5, PG_U_LOWERCASE_LETTER},
+	{0x002ca6, 0x002ca6, PG_U_UPPERCASE_LETTER},
+	{0x002ca7, 0x002ca7, PG_U_LOWERCASE_LETTER},
+	{0x002ca8, 0x002ca8, PG_U_UPPERCASE_LETTER},
+	{0x002ca9, 0x002ca9, PG_U_LOWERCASE_LETTER},
+	{0x002caa, 0x002caa, PG_U_UPPERCASE_LETTER},
+	{0x002cab, 0x002cab, PG_U_LOWERCASE_LETTER},
+	{0x002cac, 0x002cac, PG_U_UPPERCASE_LETTER},
+	{0x002cad, 0x002cad, PG_U_LOWERCASE_LETTER},
+	{0x002cae, 0x002cae, PG_U_UPPERCASE_LETTER},
+	{0x002caf, 0x002caf, PG_U_LOWERCASE_LETTER},
+	{0x002cb0, 0x002cb0, PG_U_UPPERCASE_LETTER},
+	{0x002cb1, 0x002cb1, PG_U_LOWERCASE_LETTER},
+	{0x002cb2, 0x002cb2, PG_U_UPPERCASE_LETTER},
+	{0x002cb3, 0x002cb3, PG_U_LOWERCASE_LETTER},
+	{0x002cb4, 0x002cb4, PG_U_UPPERCASE_LETTER},
+	{0x002cb5, 0x002cb5, PG_U_LOWERCASE_LETTER},
+	{0x002cb6, 0x002cb6, PG_U_UPPERCASE_LETTER},
+	{0x002cb7, 0x002cb7, PG_U_LOWERCASE_LETTER},
+	{0x002cb8, 0x002cb8, PG_U_UPPERCASE_LETTER},
+	{0x002cb9, 0x002cb9, PG_U_LOWERCASE_LETTER},
+	{0x002cba, 0x002cba, PG_U_UPPERCASE_LETTER},
+	{0x002cbb, 0x002cbb, PG_U_LOWERCASE_LETTER},
+	{0x002cbc, 0x002cbc, PG_U_UPPERCASE_LETTER},
+	{0x002cbd, 0x002cbd, PG_U_LOWERCASE_LETTER},
+	{0x002cbe, 0x002cbe, PG_U_UPPERCASE_LETTER},
+	{0x002cbf, 0x002cbf, PG_U_LOWERCASE_LETTER},
+	{0x002cc0, 0x002cc0, PG_U_UPPERCASE_LETTER},
+	{0x002cc1, 0x002cc1, PG_U_LOWERCASE_LETTER},
+	{0x002cc2, 0x002cc2, PG_U_UPPERCASE_LETTER},
+	{0x002cc3, 0x002cc3, PG_U_LOWERCASE_LETTER},
+	{0x002cc4, 0x002cc4, PG_U_UPPERCASE_LETTER},
+	{0x002cc5, 0x002cc5, PG_U_LOWERCASE_LETTER},
+	{0x002cc6, 0x002cc6, PG_U_UPPERCASE_LETTER},
+	{0x002cc7, 0x002cc7, PG_U_LOWERCASE_LETTER},
+	{0x002cc8, 0x002cc8, PG_U_UPPERCASE_LETTER},
+	{0x002cc9, 0x002cc9, PG_U_LOWERCASE_LETTER},
+	{0x002cca, 0x002cca, PG_U_UPPERCASE_LETTER},
+	{0x002ccb, 0x002ccb, PG_U_LOWERCASE_LETTER},
+	{0x002ccc, 0x002ccc, PG_U_UPPERCASE_LETTER},
+	{0x002ccd, 0x002ccd, PG_U_LOWERCASE_LETTER},
+	{0x002cce, 0x002cce, PG_U_UPPERCASE_LETTER},
+	{0x002ccf, 0x002ccf, PG_U_LOWERCASE_LETTER},
+	{0x002cd0, 0x002cd0, PG_U_UPPERCASE_LETTER},
+	{0x002cd1, 0x002cd1, PG_U_LOWERCASE_LETTER},
+	{0x002cd2, 0x002cd2, PG_U_UPPERCASE_LETTER},
+	{0x002cd3, 0x002cd3, PG_U_LOWERCASE_LETTER},
+	{0x002cd4, 0x002cd4, PG_U_UPPERCASE_LETTER},
+	{0x002cd5, 0x002cd5, PG_U_LOWERCASE_LETTER},
+	{0x002cd6, 0x002cd6, PG_U_UPPERCASE_LETTER},
+	{0x002cd7, 0x002cd7, PG_U_LOWERCASE_LETTER},
+	{0x002cd8, 0x002cd8, PG_U_UPPERCASE_LETTER},
+	{0x002cd9, 0x002cd9, PG_U_LOWERCASE_LETTER},
+	{0x002cda, 0x002cda, PG_U_UPPERCASE_LETTER},
+	{0x002cdb, 0x002cdb, PG_U_LOWERCASE_LETTER},
+	{0x002cdc, 0x002cdc, PG_U_UPPERCASE_LETTER},
+	{0x002cdd, 0x002cdd, PG_U_LOWERCASE_LETTER},
+	{0x002cde, 0x002cde, PG_U_UPPERCASE_LETTER},
+	{0x002cdf, 0x002cdf, PG_U_LOWERCASE_LETTER},
+	{0x002ce0, 0x002ce0, PG_U_UPPERCASE_LETTER},
+	{0x002ce1, 0x002ce1, PG_U_LOWERCASE_LETTER},
+	{0x002ce2, 0x002ce2, PG_U_UPPERCASE_LETTER},
+	{0x002ce3, 0x002ce4, PG_U_LOWERCASE_LETTER},
+	{0x002ce5, 0x002cea, PG_U_OTHER_SYMBOL},
+	{0x002ceb, 0x002ceb, PG_U_UPPERCASE_LETTER},
+	{0x002cec, 0x002cec, PG_U_LOWERCASE_LETTER},
+	{0x002ced, 0x002ced, PG_U_UPPERCASE_LETTER},
+	{0x002cee, 0x002cee, PG_U_LOWERCASE_LETTER},
+	{0x002cef, 0x002cf1, PG_U_NON_SPACING_MARK},
+	{0x002cf2, 0x002cf2, PG_U_UPPERCASE_LETTER},
+	{0x002cf3, 0x002cf3, PG_U_LOWERCASE_LETTER},
+	{0x002cf4, 0x002cf8, PG_U_UNASSIGNED},
+	{0x002cf9, 0x002cfc, PG_U_OTHER_PUNCTUATION},
+	{0x002cfd, 0x002cfd, PG_U_OTHER_NUMBER},
+	{0x002cfe, 0x002cff, PG_U_OTHER_PUNCTUATION},
+	{0x002d00, 0x002d25, PG_U_LOWERCASE_LETTER},
+	{0x002d26, 0x002d26, PG_U_UNASSIGNED},
+	{0x002d27, 0x002d27, PG_U_LOWERCASE_LETTER},
+	{0x002d28, 0x002d2c, PG_U_UNASSIGNED},
+	{0x002d2d, 0x002d2d, PG_U_LOWERCASE_LETTER},
+	{0x002d2e, 0x002d2f, PG_U_UNASSIGNED},
+	{0x002d30, 0x002d67, PG_U_OTHER_LETTER},
+	{0x002d68, 0x002d6e, PG_U_UNASSIGNED},
+	{0x002d6f, 0x002d6f, PG_U_MODIFIER_LETTER},
+	{0x002d70, 0x002d70, PG_U_OTHER_PUNCTUATION},
+	{0x002d71, 0x002d7e, PG_U_UNASSIGNED},
+	{0x002d7f, 0x002d7f, PG_U_NON_SPACING_MARK},
+	{0x002d80, 0x002d96, PG_U_OTHER_LETTER},
+	{0x002d97, 0x002d9f, PG_U_UNASSIGNED},
+	{0x002da0, 0x002da6, PG_U_OTHER_LETTER},
+	{0x002da7, 0x002da7, PG_U_UNASSIGNED},
+	{0x002da8, 0x002dae, PG_U_OTHER_LETTER},
+	{0x002daf, 0x002daf, PG_U_UNASSIGNED},
+	{0x002db0, 0x002db6, PG_U_OTHER_LETTER},
+	{0x002db7, 0x002db7, PG_U_UNASSIGNED},
+	{0x002db8, 0x002dbe, PG_U_OTHER_LETTER},
+	{0x002dbf, 0x002dbf, PG_U_UNASSIGNED},
+	{0x002dc0, 0x002dc6, PG_U_OTHER_LETTER},
+	{0x002dc7, 0x002dc7, PG_U_UNASSIGNED},
+	{0x002dc8, 0x002dce, PG_U_OTHER_LETTER},
+	{0x002dcf, 0x002dcf, PG_U_UNASSIGNED},
+	{0x002dd0, 0x002dd6, PG_U_OTHER_LETTER},
+	{0x002dd7, 0x002dd7, PG_U_UNASSIGNED},
+	{0x002dd8, 0x002dde, PG_U_OTHER_LETTER},
+	{0x002ddf, 0x002ddf, PG_U_UNASSIGNED},
+	{0x002de0, 0x002dff, PG_U_NON_SPACING_MARK},
+	{0x002e00, 0x002e01, PG_U_OTHER_PUNCTUATION},
+	{0x002e02, 0x002e02, PG_U_INITIAL_PUNCTUATION},
+	{0x002e03, 0x002e03, PG_U_FINAL_PUNCTUATION},
+	{0x002e04, 0x002e04, PG_U_INITIAL_PUNCTUATION},
+	{0x002e05, 0x002e05, PG_U_FINAL_PUNCTUATION},
+	{0x002e06, 0x002e08, PG_U_OTHER_PUNCTUATION},
+	{0x002e09, 0x002e09, PG_U_INITIAL_PUNCTUATION},
+	{0x002e0a, 0x002e0a, PG_U_FINAL_PUNCTUATION},
+	{0x002e0b, 0x002e0b, PG_U_OTHER_PUNCTUATION},
+	{0x002e0c, 0x002e0c, PG_U_INITIAL_PUNCTUATION},
+	{0x002e0d, 0x002e0d, PG_U_FINAL_PUNCTUATION},
+	{0x002e0e, 0x002e16, PG_U_OTHER_PUNCTUATION},
+	{0x002e17, 0x002e17, PG_U_DASH_PUNCTUATION},
+	{0x002e18, 0x002e19, PG_U_OTHER_PUNCTUATION},
+	{0x002e1a, 0x002e1a, PG_U_DASH_PUNCTUATION},
+	{0x002e1b, 0x002e1b, PG_U_OTHER_PUNCTUATION},
+	{0x002e1c, 0x002e1c, PG_U_INITIAL_PUNCTUATION},
+	{0x002e1d, 0x002e1d, PG_U_FINAL_PUNCTUATION},
+	{0x002e1e, 0x002e1f, PG_U_OTHER_PUNCTUATION},
+	{0x002e20, 0x002e20, PG_U_INITIAL_PUNCTUATION},
+	{0x002e21, 0x002e21, PG_U_FINAL_PUNCTUATION},
+	{0x002e22, 0x002e22, PG_U_START_PUNCTUATION},
+	{0x002e23, 0x002e23, PG_U_END_PUNCTUATION},
+	{0x002e24, 0x002e24, PG_U_START_PUNCTUATION},
+	{0x002e25, 0x002e25, PG_U_END_PUNCTUATION},
+	{0x002e26, 0x002e26, PG_U_START_PUNCTUATION},
+	{0x002e27, 0x002e27, PG_U_END_PUNCTUATION},
+	{0x002e28, 0x002e28, PG_U_START_PUNCTUATION},
+	{0x002e29, 0x002e29, PG_U_END_PUNCTUATION},
+	{0x002e2a, 0x002e2e, PG_U_OTHER_PUNCTUATION},
+	{0x002e2f, 0x002e2f, PG_U_MODIFIER_LETTER},
+	{0x002e30, 0x002e39, PG_U_OTHER_PUNCTUATION},
+	{0x002e3a, 0x002e3b, PG_U_DASH_PUNCTUATION},
+	{0x002e3c, 0x002e3f, PG_U_OTHER_PUNCTUATION},
+	{0x002e40, 0x002e40, PG_U_DASH_PUNCTUATION},
+	{0x002e41, 0x002e41, PG_U_OTHER_PUNCTUATION},
+	{0x002e42, 0x002e42, PG_U_START_PUNCTUATION},
+	{0x002e43, 0x002e4f, PG_U_OTHER_PUNCTUATION},
+	{0x002e50, 0x002e51, PG_U_OTHER_SYMBOL},
+	{0x002e52, 0x002e54, PG_U_OTHER_PUNCTUATION},
+	{0x002e55, 0x002e55, PG_U_START_PUNCTUATION},
+	{0x002e56, 0x002e56, PG_U_END_PUNCTUATION},
+	{0x002e57, 0x002e57, PG_U_START_PUNCTUATION},
+	{0x002e58, 0x002e58, PG_U_END_PUNCTUATION},
+	{0x002e59, 0x002e59, PG_U_START_PUNCTUATION},
+	{0x002e5a, 0x002e5a, PG_U_END_PUNCTUATION},
+	{0x002e5b, 0x002e5b, PG_U_START_PUNCTUATION},
+	{0x002e5c, 0x002e5c, PG_U_END_PUNCTUATION},
+	{0x002e5d, 0x002e5d, PG_U_DASH_PUNCTUATION},
+	{0x002e5e, 0x002e7f, PG_U_UNASSIGNED},
+	{0x002e80, 0x002e99, PG_U_OTHER_SYMBOL},
+	{0x002e9a, 0x002e9a, PG_U_UNASSIGNED},
+	{0x002e9b, 0x002ef3, PG_U_OTHER_SYMBOL},
+	{0x002ef4, 0x002eff, PG_U_UNASSIGNED},
+	{0x002f00, 0x002fd5, PG_U_OTHER_SYMBOL},
+	{0x002fd6, 0x002fef, PG_U_UNASSIGNED},
+	{0x002ff0, 0x002fff, PG_U_OTHER_SYMBOL},
+	{0x003000, 0x003000, PG_U_SPACE_SEPARATOR},
+	{0x003001, 0x003003, PG_U_OTHER_PUNCTUATION},
+	{0x003004, 0x003004, PG_U_OTHER_SYMBOL},
+	{0x003005, 0x003005, PG_U_MODIFIER_LETTER},
+	{0x003006, 0x003006, PG_U_OTHER_LETTER},
+	{0x003007, 0x003007, PG_U_LETTER_NUMBER},
+	{0x003008, 0x003008, PG_U_START_PUNCTUATION},
+	{0x003009, 0x003009, PG_U_END_PUNCTUATION},
+	{0x00300a, 0x00300a, PG_U_START_PUNCTUATION},
+	{0x00300b, 0x00300b, PG_U_END_PUNCTUATION},
+	{0x00300c, 0x00300c, PG_U_START_PUNCTUATION},
+	{0x00300d, 0x00300d, PG_U_END_PUNCTUATION},
+	{0x00300e, 0x00300e, PG_U_START_PUNCTUATION},
+	{0x00300f, 0x00300f, PG_U_END_PUNCTUATION},
+	{0x003010, 0x003010, PG_U_START_PUNCTUATION},
+	{0x003011, 0x003011, PG_U_END_PUNCTUATION},
+	{0x003012, 0x003013, PG_U_OTHER_SYMBOL},
+	{0x003014, 0x003014, PG_U_START_PUNCTUATION},
+	{0x003015, 0x003015, PG_U_END_PUNCTUATION},
+	{0x003016, 0x003016, PG_U_START_PUNCTUATION},
+	{0x003017, 0x003017, PG_U_END_PUNCTUATION},
+	{0x003018, 0x003018, PG_U_START_PUNCTUATION},
+	{0x003019, 0x003019, PG_U_END_PUNCTUATION},
+	{0x00301a, 0x00301a, PG_U_START_PUNCTUATION},
+	{0x00301b, 0x00301b, PG_U_END_PUNCTUATION},
+	{0x00301c, 0x00301c, PG_U_DASH_PUNCTUATION},
+	{0x00301d, 0x00301d, PG_U_START_PUNCTUATION},
+	{0x00301e, 0x00301f, PG_U_END_PUNCTUATION},
+	{0x003020, 0x003020, PG_U_OTHER_SYMBOL},
+	{0x003021, 0x003029, PG_U_LETTER_NUMBER},
+	{0x00302a, 0x00302d, PG_U_NON_SPACING_MARK},
+	{0x00302e, 0x00302f, PG_U_COMBINING_SPACING_MARK},
+	{0x003030, 0x003030, PG_U_DASH_PUNCTUATION},
+	{0x003031, 0x003035, PG_U_MODIFIER_LETTER},
+	{0x003036, 0x003037, PG_U_OTHER_SYMBOL},
+	{0x003038, 0x00303a, PG_U_LETTER_NUMBER},
+	{0x00303b, 0x00303b, PG_U_MODIFIER_LETTER},
+	{0x00303c, 0x00303c, PG_U_OTHER_LETTER},
+	{0x00303d, 0x00303d, PG_U_OTHER_PUNCTUATION},
+	{0x00303e, 0x00303f, PG_U_OTHER_SYMBOL},
+	{0x003040, 0x003040, PG_U_UNASSIGNED},
+	{0x003041, 0x003096, PG_U_OTHER_LETTER},
+	{0x003097, 0x003098, PG_U_UNASSIGNED},
+	{0x003099, 0x00309a, PG_U_NON_SPACING_MARK},
+	{0x00309b, 0x00309c, PG_U_MODIFIER_SYMBOL},
+	{0x00309d, 0x00309e, PG_U_MODIFIER_LETTER},
+	{0x00309f, 0x00309f, PG_U_OTHER_LETTER},
+	{0x0030a0, 0x0030a0, PG_U_DASH_PUNCTUATION},
+	{0x0030a1, 0x0030fa, PG_U_OTHER_LETTER},
+	{0x0030fb, 0x0030fb, PG_U_OTHER_PUNCTUATION},
+	{0x0030fc, 0x0030fe, PG_U_MODIFIER_LETTER},
+	{0x0030ff, 0x0030ff, PG_U_OTHER_LETTER},
+	{0x003100, 0x003104, PG_U_UNASSIGNED},
+	{0x003105, 0x00312f, PG_U_OTHER_LETTER},
+	{0x003130, 0x003130, PG_U_UNASSIGNED},
+	{0x003131, 0x00318e, PG_U_OTHER_LETTER},
+	{0x00318f, 0x00318f, PG_U_UNASSIGNED},
+	{0x003190, 0x003191, PG_U_OTHER_SYMBOL},
+	{0x003192, 0x003195, PG_U_OTHER_NUMBER},
+	{0x003196, 0x00319f, PG_U_OTHER_SYMBOL},
+	{0x0031a0, 0x0031bf, PG_U_OTHER_LETTER},
+	{0x0031c0, 0x0031e3, PG_U_OTHER_SYMBOL},
+	{0x0031e4, 0x0031ee, PG_U_UNASSIGNED},
+	{0x0031ef, 0x0031ef, PG_U_OTHER_SYMBOL},
+	{0x0031f0, 0x0031ff, PG_U_OTHER_LETTER},
+	{0x003200, 0x00321e, PG_U_OTHER_SYMBOL},
+	{0x00321f, 0x00321f, PG_U_UNASSIGNED},
+	{0x003220, 0x003229, PG_U_OTHER_NUMBER},
+	{0x00322a, 0x003247, PG_U_OTHER_SYMBOL},
+	{0x003248, 0x00324f, PG_U_OTHER_NUMBER},
+	{0x003250, 0x003250, PG_U_OTHER_SYMBOL},
+	{0x003251, 0x00325f, PG_U_OTHER_NUMBER},
+	{0x003260, 0x00327f, PG_U_OTHER_SYMBOL},
+	{0x003280, 0x003289, PG_U_OTHER_NUMBER},
+	{0x00328a, 0x0032b0, PG_U_OTHER_SYMBOL},
+	{0x0032b1, 0x0032bf, PG_U_OTHER_NUMBER},
+	{0x0032c0, 0x0033ff, PG_U_OTHER_SYMBOL},
+	{0x003400, 0x004dbf, PG_U_OTHER_LETTER},
+	{0x004dc0, 0x004dff, PG_U_OTHER_SYMBOL},
+	{0x004e00, 0x00a014, PG_U_OTHER_LETTER},
+	{0x00a015, 0x00a015, PG_U_MODIFIER_LETTER},
+	{0x00a016, 0x00a48c, PG_U_OTHER_LETTER},
+	{0x00a48d, 0x00a48f, PG_U_UNASSIGNED},
+	{0x00a490, 0x00a4c6, PG_U_OTHER_SYMBOL},
+	{0x00a4c7, 0x00a4cf, PG_U_UNASSIGNED},
+	{0x00a4d0, 0x00a4f7, PG_U_OTHER_LETTER},
+	{0x00a4f8, 0x00a4fd, PG_U_MODIFIER_LETTER},
+	{0x00a4fe, 0x00a4ff, PG_U_OTHER_PUNCTUATION},
+	{0x00a500, 0x00a60b, PG_U_OTHER_LETTER},
+	{0x00a60c, 0x00a60c, PG_U_MODIFIER_LETTER},
+	{0x00a60d, 0x00a60f, PG_U_OTHER_PUNCTUATION},
+	{0x00a610, 0x00a61f, PG_U_OTHER_LETTER},
+	{0x00a620, 0x00a629, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00a62a, 0x00a62b, PG_U_OTHER_LETTER},
+	{0x00a62c, 0x00a63f, PG_U_UNASSIGNED},
+	{0x00a640, 0x00a640, PG_U_UPPERCASE_LETTER},
+	{0x00a641, 0x00a641, PG_U_LOWERCASE_LETTER},
+	{0x00a642, 0x00a642, PG_U_UPPERCASE_LETTER},
+	{0x00a643, 0x00a643, PG_U_LOWERCASE_LETTER},
+	{0x00a644, 0x00a644, PG_U_UPPERCASE_LETTER},
+	{0x00a645, 0x00a645, PG_U_LOWERCASE_LETTER},
+	{0x00a646, 0x00a646, PG_U_UPPERCASE_LETTER},
+	{0x00a647, 0x00a647, PG_U_LOWERCASE_LETTER},
+	{0x00a648, 0x00a648, PG_U_UPPERCASE_LETTER},
+	{0x00a649, 0x00a649, PG_U_LOWERCASE_LETTER},
+	{0x00a64a, 0x00a64a, PG_U_UPPERCASE_LETTER},
+	{0x00a64b, 0x00a64b, PG_U_LOWERCASE_LETTER},
+	{0x00a64c, 0x00a64c, PG_U_UPPERCASE_LETTER},
+	{0x00a64d, 0x00a64d, PG_U_LOWERCASE_LETTER},
+	{0x00a64e, 0x00a64e, PG_U_UPPERCASE_LETTER},
+	{0x00a64f, 0x00a64f, PG_U_LOWERCASE_LETTER},
+	{0x00a650, 0x00a650, PG_U_UPPERCASE_LETTER},
+	{0x00a651, 0x00a651, PG_U_LOWERCASE_LETTER},
+	{0x00a652, 0x00a652, PG_U_UPPERCASE_LETTER},
+	{0x00a653, 0x00a653, PG_U_LOWERCASE_LETTER},
+	{0x00a654, 0x00a654, PG_U_UPPERCASE_LETTER},
+	{0x00a655, 0x00a655, PG_U_LOWERCASE_LETTER},
+	{0x00a656, 0x00a656, PG_U_UPPERCASE_LETTER},
+	{0x00a657, 0x00a657, PG_U_LOWERCASE_LETTER},
+	{0x00a658, 0x00a658, PG_U_UPPERCASE_LETTER},
+	{0x00a659, 0x00a659, PG_U_LOWERCASE_LETTER},
+	{0x00a65a, 0x00a65a, PG_U_UPPERCASE_LETTER},
+	{0x00a65b, 0x00a65b, PG_U_LOWERCASE_LETTER},
+	{0x00a65c, 0x00a65c, PG_U_UPPERCASE_LETTER},
+	{0x00a65d, 0x00a65d, PG_U_LOWERCASE_LETTER},
+	{0x00a65e, 0x00a65e, PG_U_UPPERCASE_LETTER},
+	{0x00a65f, 0x00a65f, PG_U_LOWERCASE_LETTER},
+	{0x00a660, 0x00a660, PG_U_UPPERCASE_LETTER},
+	{0x00a661, 0x00a661, PG_U_LOWERCASE_LETTER},
+	{0x00a662, 0x00a662, PG_U_UPPERCASE_LETTER},
+	{0x00a663, 0x00a663, PG_U_LOWERCASE_LETTER},
+	{0x00a664, 0x00a664, PG_U_UPPERCASE_LETTER},
+	{0x00a665, 0x00a665, PG_U_LOWERCASE_LETTER},
+	{0x00a666, 0x00a666, PG_U_UPPERCASE_LETTER},
+	{0x00a667, 0x00a667, PG_U_LOWERCASE_LETTER},
+	{0x00a668, 0x00a668, PG_U_UPPERCASE_LETTER},
+	{0x00a669, 0x00a669, PG_U_LOWERCASE_LETTER},
+	{0x00a66a, 0x00a66a, PG_U_UPPERCASE_LETTER},
+	{0x00a66b, 0x00a66b, PG_U_LOWERCASE_LETTER},
+	{0x00a66c, 0x00a66c, PG_U_UPPERCASE_LETTER},
+	{0x00a66d, 0x00a66d, PG_U_LOWERCASE_LETTER},
+	{0x00a66e, 0x00a66e, PG_U_OTHER_LETTER},
+	{0x00a66f, 0x00a66f, PG_U_NON_SPACING_MARK},
+	{0x00a670, 0x00a672, PG_U_ENCLOSING_MARK},
+	{0x00a673, 0x00a673, PG_U_OTHER_PUNCTUATION},
+	{0x00a674, 0x00a67d, PG_U_NON_SPACING_MARK},
+	{0x00a67e, 0x00a67e, PG_U_OTHER_PUNCTUATION},
+	{0x00a67f, 0x00a67f, PG_U_MODIFIER_LETTER},
+	{0x00a680, 0x00a680, PG_U_UPPERCASE_LETTER},
+	{0x00a681, 0x00a681, PG_U_LOWERCASE_LETTER},
+	{0x00a682, 0x00a682, PG_U_UPPERCASE_LETTER},
+	{0x00a683, 0x00a683, PG_U_LOWERCASE_LETTER},
+	{0x00a684, 0x00a684, PG_U_UPPERCASE_LETTER},
+	{0x00a685, 0x00a685, PG_U_LOWERCASE_LETTER},
+	{0x00a686, 0x00a686, PG_U_UPPERCASE_LETTER},
+	{0x00a687, 0x00a687, PG_U_LOWERCASE_LETTER},
+	{0x00a688, 0x00a688, PG_U_UPPERCASE_LETTER},
+	{0x00a689, 0x00a689, PG_U_LOWERCASE_LETTER},
+	{0x00a68a, 0x00a68a, PG_U_UPPERCASE_LETTER},
+	{0x00a68b, 0x00a68b, PG_U_LOWERCASE_LETTER},
+	{0x00a68c, 0x00a68c, PG_U_UPPERCASE_LETTER},
+	{0x00a68d, 0x00a68d, PG_U_LOWERCASE_LETTER},
+	{0x00a68e, 0x00a68e, PG_U_UPPERCASE_LETTER},
+	{0x00a68f, 0x00a68f, PG_U_LOWERCASE_LETTER},
+	{0x00a690, 0x00a690, PG_U_UPPERCASE_LETTER},
+	{0x00a691, 0x00a691, PG_U_LOWERCASE_LETTER},
+	{0x00a692, 0x00a692, PG_U_UPPERCASE_LETTER},
+	{0x00a693, 0x00a693, PG_U_LOWERCASE_LETTER},
+	{0x00a694, 0x00a694, PG_U_UPPERCASE_LETTER},
+	{0x00a695, 0x00a695, PG_U_LOWERCASE_LETTER},
+	{0x00a696, 0x00a696, PG_U_UPPERCASE_LETTER},
+	{0x00a697, 0x00a697, PG_U_LOWERCASE_LETTER},
+	{0x00a698, 0x00a698, PG_U_UPPERCASE_LETTER},
+	{0x00a699, 0x00a699, PG_U_LOWERCASE_LETTER},
+	{0x00a69a, 0x00a69a, PG_U_UPPERCASE_LETTER},
+	{0x00a69b, 0x00a69b, PG_U_LOWERCASE_LETTER},
+	{0x00a69c, 0x00a69d, PG_U_MODIFIER_LETTER},
+	{0x00a69e, 0x00a69f, PG_U_NON_SPACING_MARK},
+	{0x00a6a0, 0x00a6e5, PG_U_OTHER_LETTER},
+	{0x00a6e6, 0x00a6ef, PG_U_LETTER_NUMBER},
+	{0x00a6f0, 0x00a6f1, PG_U_NON_SPACING_MARK},
+	{0x00a6f2, 0x00a6f7, PG_U_OTHER_PUNCTUATION},
+	{0x00a6f8, 0x00a6ff, PG_U_UNASSIGNED},
+	{0x00a700, 0x00a716, PG_U_MODIFIER_SYMBOL},
+	{0x00a717, 0x00a71f, PG_U_MODIFIER_LETTER},
+	{0x00a720, 0x00a721, PG_U_MODIFIER_SYMBOL},
+	{0x00a722, 0x00a722, PG_U_UPPERCASE_LETTER},
+	{0x00a723, 0x00a723, PG_U_LOWERCASE_LETTER},
+	{0x00a724, 0x00a724, PG_U_UPPERCASE_LETTER},
+	{0x00a725, 0x00a725, PG_U_LOWERCASE_LETTER},
+	{0x00a726, 0x00a726, PG_U_UPPERCASE_LETTER},
+	{0x00a727, 0x00a727, PG_U_LOWERCASE_LETTER},
+	{0x00a728, 0x00a728, PG_U_UPPERCASE_LETTER},
+	{0x00a729, 0x00a729, PG_U_LOWERCASE_LETTER},
+	{0x00a72a, 0x00a72a, PG_U_UPPERCASE_LETTER},
+	{0x00a72b, 0x00a72b, PG_U_LOWERCASE_LETTER},
+	{0x00a72c, 0x00a72c, PG_U_UPPERCASE_LETTER},
+	{0x00a72d, 0x00a72d, PG_U_LOWERCASE_LETTER},
+	{0x00a72e, 0x00a72e, PG_U_UPPERCASE_LETTER},
+	{0x00a72f, 0x00a731, PG_U_LOWERCASE_LETTER},
+	{0x00a732, 0x00a732, PG_U_UPPERCASE_LETTER},
+	{0x00a733, 0x00a733, PG_U_LOWERCASE_LETTER},
+	{0x00a734, 0x00a734, PG_U_UPPERCASE_LETTER},
+	{0x00a735, 0x00a735, PG_U_LOWERCASE_LETTER},
+	{0x00a736, 0x00a736, PG_U_UPPERCASE_LETTER},
+	{0x00a737, 0x00a737, PG_U_LOWERCASE_LETTER},
+	{0x00a738, 0x00a738, PG_U_UPPERCASE_LETTER},
+	{0x00a739, 0x00a739, PG_U_LOWERCASE_LETTER},
+	{0x00a73a, 0x00a73a, PG_U_UPPERCASE_LETTER},
+	{0x00a73b, 0x00a73b, PG_U_LOWERCASE_LETTER},
+	{0x00a73c, 0x00a73c, PG_U_UPPERCASE_LETTER},
+	{0x00a73d, 0x00a73d, PG_U_LOWERCASE_LETTER},
+	{0x00a73e, 0x00a73e, PG_U_UPPERCASE_LETTER},
+	{0x00a73f, 0x00a73f, PG_U_LOWERCASE_LETTER},
+	{0x00a740, 0x00a740, PG_U_UPPERCASE_LETTER},
+	{0x00a741, 0x00a741, PG_U_LOWERCASE_LETTER},
+	{0x00a742, 0x00a742, PG_U_UPPERCASE_LETTER},
+	{0x00a743, 0x00a743, PG_U_LOWERCASE_LETTER},
+	{0x00a744, 0x00a744, PG_U_UPPERCASE_LETTER},
+	{0x00a745, 0x00a745, PG_U_LOWERCASE_LETTER},
+	{0x00a746, 0x00a746, PG_U_UPPERCASE_LETTER},
+	{0x00a747, 0x00a747, PG_U_LOWERCASE_LETTER},
+	{0x00a748, 0x00a748, PG_U_UPPERCASE_LETTER},
+	{0x00a749, 0x00a749, PG_U_LOWERCASE_LETTER},
+	{0x00a74a, 0x00a74a, PG_U_UPPERCASE_LETTER},
+	{0x00a74b, 0x00a74b, PG_U_LOWERCASE_LETTER},
+	{0x00a74c, 0x00a74c, PG_U_UPPERCASE_LETTER},
+	{0x00a74d, 0x00a74d, PG_U_LOWERCASE_LETTER},
+	{0x00a74e, 0x00a74e, PG_U_UPPERCASE_LETTER},
+	{0x00a74f, 0x00a74f, PG_U_LOWERCASE_LETTER},
+	{0x00a750, 0x00a750, PG_U_UPPERCASE_LETTER},
+	{0x00a751, 0x00a751, PG_U_LOWERCASE_LETTER},
+	{0x00a752, 0x00a752, PG_U_UPPERCASE_LETTER},
+	{0x00a753, 0x00a753, PG_U_LOWERCASE_LETTER},
+	{0x00a754, 0x00a754, PG_U_UPPERCASE_LETTER},
+	{0x00a755, 0x00a755, PG_U_LOWERCASE_LETTER},
+	{0x00a756, 0x00a756, PG_U_UPPERCASE_LETTER},
+	{0x00a757, 0x00a757, PG_U_LOWERCASE_LETTER},
+	{0x00a758, 0x00a758, PG_U_UPPERCASE_LETTER},
+	{0x00a759, 0x00a759, PG_U_LOWERCASE_LETTER},
+	{0x00a75a, 0x00a75a, PG_U_UPPERCASE_LETTER},
+	{0x00a75b, 0x00a75b, PG_U_LOWERCASE_LETTER},
+	{0x00a75c, 0x00a75c, PG_U_UPPERCASE_LETTER},
+	{0x00a75d, 0x00a75d, PG_U_LOWERCASE_LETTER},
+	{0x00a75e, 0x00a75e, PG_U_UPPERCASE_LETTER},
+	{0x00a75f, 0x00a75f, PG_U_LOWERCASE_LETTER},
+	{0x00a760, 0x00a760, PG_U_UPPERCASE_LETTER},
+	{0x00a761, 0x00a761, PG_U_LOWERCASE_LETTER},
+	{0x00a762, 0x00a762, PG_U_UPPERCASE_LETTER},
+	{0x00a763, 0x00a763, PG_U_LOWERCASE_LETTER},
+	{0x00a764, 0x00a764, PG_U_UPPERCASE_LETTER},
+	{0x00a765, 0x00a765, PG_U_LOWERCASE_LETTER},
+	{0x00a766, 0x00a766, PG_U_UPPERCASE_LETTER},
+	{0x00a767, 0x00a767, PG_U_LOWERCASE_LETTER},
+	{0x00a768, 0x00a768, PG_U_UPPERCASE_LETTER},
+	{0x00a769, 0x00a769, PG_U_LOWERCASE_LETTER},
+	{0x00a76a, 0x00a76a, PG_U_UPPERCASE_LETTER},
+	{0x00a76b, 0x00a76b, PG_U_LOWERCASE_LETTER},
+	{0x00a76c, 0x00a76c, PG_U_UPPERCASE_LETTER},
+	{0x00a76d, 0x00a76d, PG_U_LOWERCASE_LETTER},
+	{0x00a76e, 0x00a76e, PG_U_UPPERCASE_LETTER},
+	{0x00a76f, 0x00a76f, PG_U_LOWERCASE_LETTER},
+	{0x00a770, 0x00a770, PG_U_MODIFIER_LETTER},
+	{0x00a771, 0x00a778, PG_U_LOWERCASE_LETTER},
+	{0x00a779, 0x00a779, PG_U_UPPERCASE_LETTER},
+	{0x00a77a, 0x00a77a, PG_U_LOWERCASE_LETTER},
+	{0x00a77b, 0x00a77b, PG_U_UPPERCASE_LETTER},
+	{0x00a77c, 0x00a77c, PG_U_LOWERCASE_LETTER},
+	{0x00a77d, 0x00a77e, PG_U_UPPERCASE_LETTER},
+	{0x00a77f, 0x00a77f, PG_U_LOWERCASE_LETTER},
+	{0x00a780, 0x00a780, PG_U_UPPERCASE_LETTER},
+	{0x00a781, 0x00a781, PG_U_LOWERCASE_LETTER},
+	{0x00a782, 0x00a782, PG_U_UPPERCASE_LETTER},
+	{0x00a783, 0x00a783, PG_U_LOWERCASE_LETTER},
+	{0x00a784, 0x00a784, PG_U_UPPERCASE_LETTER},
+	{0x00a785, 0x00a785, PG_U_LOWERCASE_LETTER},
+	{0x00a786, 0x00a786, PG_U_UPPERCASE_LETTER},
+	{0x00a787, 0x00a787, PG_U_LOWERCASE_LETTER},
+	{0x00a788, 0x00a788, PG_U_MODIFIER_LETTER},
+	{0x00a789, 0x00a78a, PG_U_MODIFIER_SYMBOL},
+	{0x00a78b, 0x00a78b, PG_U_UPPERCASE_LETTER},
+	{0x00a78c, 0x00a78c, PG_U_LOWERCASE_LETTER},
+	{0x00a78d, 0x00a78d, PG_U_UPPERCASE_LETTER},
+	{0x00a78e, 0x00a78e, PG_U_LOWERCASE_LETTER},
+	{0x00a78f, 0x00a78f, PG_U_OTHER_LETTER},
+	{0x00a790, 0x00a790, PG_U_UPPERCASE_LETTER},
+	{0x00a791, 0x00a791, PG_U_LOWERCASE_LETTER},
+	{0x00a792, 0x00a792, PG_U_UPPERCASE_LETTER},
+	{0x00a793, 0x00a795, PG_U_LOWERCASE_LETTER},
+	{0x00a796, 0x00a796, PG_U_UPPERCASE_LETTER},
+	{0x00a797, 0x00a797, PG_U_LOWERCASE_LETTER},
+	{0x00a798, 0x00a798, PG_U_UPPERCASE_LETTER},
+	{0x00a799, 0x00a799, PG_U_LOWERCASE_LETTER},
+	{0x00a79a, 0x00a79a, PG_U_UPPERCASE_LETTER},
+	{0x00a79b, 0x00a79b, PG_U_LOWERCASE_LETTER},
+	{0x00a79c, 0x00a79c, PG_U_UPPERCASE_LETTER},
+	{0x00a79d, 0x00a79d, PG_U_LOWERCASE_LETTER},
+	{0x00a79e, 0x00a79e, PG_U_UPPERCASE_LETTER},
+	{0x00a79f, 0x00a79f, PG_U_LOWERCASE_LETTER},
+	{0x00a7a0, 0x00a7a0, PG_U_UPPERCASE_LETTER},
+	{0x00a7a1, 0x00a7a1, PG_U_LOWERCASE_LETTER},
+	{0x00a7a2, 0x00a7a2, PG_U_UPPERCASE_LETTER},
+	{0x00a7a3, 0x00a7a3, PG_U_LOWERCASE_LETTER},
+	{0x00a7a4, 0x00a7a4, PG_U_UPPERCASE_LETTER},
+	{0x00a7a5, 0x00a7a5, PG_U_LOWERCASE_LETTER},
+	{0x00a7a6, 0x00a7a6, PG_U_UPPERCASE_LETTER},
+	{0x00a7a7, 0x00a7a7, PG_U_LOWERCASE_LETTER},
+	{0x00a7a8, 0x00a7a8, PG_U_UPPERCASE_LETTER},
+	{0x00a7a9, 0x00a7a9, PG_U_LOWERCASE_LETTER},
+	{0x00a7aa, 0x00a7ae, PG_U_UPPERCASE_LETTER},
+	{0x00a7af, 0x00a7af, PG_U_LOWERCASE_LETTER},
+	{0x00a7b0, 0x00a7b4, PG_U_UPPERCASE_LETTER},
+	{0x00a7b5, 0x00a7b5, PG_U_LOWERCASE_LETTER},
+	{0x00a7b6, 0x00a7b6, PG_U_UPPERCASE_LETTER},
+	{0x00a7b7, 0x00a7b7, PG_U_LOWERCASE_LETTER},
+	{0x00a7b8, 0x00a7b8, PG_U_UPPERCASE_LETTER},
+	{0x00a7b9, 0x00a7b9, PG_U_LOWERCASE_LETTER},
+	{0x00a7ba, 0x00a7ba, PG_U_UPPERCASE_LETTER},
+	{0x00a7bb, 0x00a7bb, PG_U_LOWERCASE_LETTER},
+	{0x00a7bc, 0x00a7bc, PG_U_UPPERCASE_LETTER},
+	{0x00a7bd, 0x00a7bd, PG_U_LOWERCASE_LETTER},
+	{0x00a7be, 0x00a7be, PG_U_UPPERCASE_LETTER},
+	{0x00a7bf, 0x00a7bf, PG_U_LOWERCASE_LETTER},
+	{0x00a7c0, 0x00a7c0, PG_U_UPPERCASE_LETTER},
+	{0x00a7c1, 0x00a7c1, PG_U_LOWERCASE_LETTER},
+	{0x00a7c2, 0x00a7c2, PG_U_UPPERCASE_LETTER},
+	{0x00a7c3, 0x00a7c3, PG_U_LOWERCASE_LETTER},
+	{0x00a7c4, 0x00a7c7, PG_U_UPPERCASE_LETTER},
+	{0x00a7c8, 0x00a7c8, PG_U_LOWERCASE_LETTER},
+	{0x00a7c9, 0x00a7c9, PG_U_UPPERCASE_LETTER},
+	{0x00a7ca, 0x00a7ca, PG_U_LOWERCASE_LETTER},
+	{0x00a7cb, 0x00a7cf, PG_U_UNASSIGNED},
+	{0x00a7d0, 0x00a7d0, PG_U_UPPERCASE_LETTER},
+	{0x00a7d1, 0x00a7d1, PG_U_LOWERCASE_LETTER},
+	{0x00a7d2, 0x00a7d2, PG_U_UNASSIGNED},
+	{0x00a7d3, 0x00a7d3, PG_U_LOWERCASE_LETTER},
+	{0x00a7d4, 0x00a7d4, PG_U_UNASSIGNED},
+	{0x00a7d5, 0x00a7d5, PG_U_LOWERCASE_LETTER},
+	{0x00a7d6, 0x00a7d6, PG_U_UPPERCASE_LETTER},
+	{0x00a7d7, 0x00a7d7, PG_U_LOWERCASE_LETTER},
+	{0x00a7d8, 0x00a7d8, PG_U_UPPERCASE_LETTER},
+	{0x00a7d9, 0x00a7d9, PG_U_LOWERCASE_LETTER},
+	{0x00a7da, 0x00a7f1, PG_U_UNASSIGNED},
+	{0x00a7f2, 0x00a7f4, PG_U_MODIFIER_LETTER},
+	{0x00a7f5, 0x00a7f5, PG_U_UPPERCASE_LETTER},
+	{0x00a7f6, 0x00a7f6, PG_U_LOWERCASE_LETTER},
+	{0x00a7f7, 0x00a7f7, PG_U_OTHER_LETTER},
+	{0x00a7f8, 0x00a7f9, PG_U_MODIFIER_LETTER},
+	{0x00a7fa, 0x00a7fa, PG_U_LOWERCASE_LETTER},
+	{0x00a7fb, 0x00a801, PG_U_OTHER_LETTER},
+	{0x00a802, 0x00a802, PG_U_NON_SPACING_MARK},
+	{0x00a803, 0x00a805, PG_U_OTHER_LETTER},
+	{0x00a806, 0x00a806, PG_U_NON_SPACING_MARK},
+	{0x00a807, 0x00a80a, PG_U_OTHER_LETTER},
+	{0x00a80b, 0x00a80b, PG_U_NON_SPACING_MARK},
+	{0x00a80c, 0x00a822, PG_U_OTHER_LETTER},
+	{0x00a823, 0x00a824, PG_U_COMBINING_SPACING_MARK},
+	{0x00a825, 0x00a826, PG_U_NON_SPACING_MARK},
+	{0x00a827, 0x00a827, PG_U_COMBINING_SPACING_MARK},
+	{0x00a828, 0x00a82b, PG_U_OTHER_SYMBOL},
+	{0x00a82c, 0x00a82c, PG_U_NON_SPACING_MARK},
+	{0x00a82d, 0x00a82f, PG_U_UNASSIGNED},
+	{0x00a830, 0x00a835, PG_U_OTHER_NUMBER},
+	{0x00a836, 0x00a837, PG_U_OTHER_SYMBOL},
+	{0x00a838, 0x00a838, PG_U_CURRENCY_SYMBOL},
+	{0x00a839, 0x00a839, PG_U_OTHER_SYMBOL},
+	{0x00a83a, 0x00a83f, PG_U_UNASSIGNED},
+	{0x00a840, 0x00a873, PG_U_OTHER_LETTER},
+	{0x00a874, 0x00a877, PG_U_OTHER_PUNCTUATION},
+	{0x00a878, 0x00a87f, PG_U_UNASSIGNED},
+	{0x00a880, 0x00a881, PG_U_COMBINING_SPACING_MARK},
+	{0x00a882, 0x00a8b3, PG_U_OTHER_LETTER},
+	{0x00a8b4, 0x00a8c3, PG_U_COMBINING_SPACING_MARK},
+	{0x00a8c4, 0x00a8c5, PG_U_NON_SPACING_MARK},
+	{0x00a8c6, 0x00a8cd, PG_U_UNASSIGNED},
+	{0x00a8ce, 0x00a8cf, PG_U_OTHER_PUNCTUATION},
+	{0x00a8d0, 0x00a8d9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00a8da, 0x00a8df, PG_U_UNASSIGNED},
+	{0x00a8e0, 0x00a8f1, PG_U_NON_SPACING_MARK},
+	{0x00a8f2, 0x00a8f7, PG_U_OTHER_LETTER},
+	{0x00a8f8, 0x00a8fa, PG_U_OTHER_PUNCTUATION},
+	{0x00a8fb, 0x00a8fb, PG_U_OTHER_LETTER},
+	{0x00a8fc, 0x00a8fc, PG_U_OTHER_PUNCTUATION},
+	{0x00a8fd, 0x00a8fe, PG_U_OTHER_LETTER},
+	{0x00a8ff, 0x00a8ff, PG_U_NON_SPACING_MARK},
+	{0x00a900, 0x00a909, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00a90a, 0x00a925, PG_U_OTHER_LETTER},
+	{0x00a926, 0x00a92d, PG_U_NON_SPACING_MARK},
+	{0x00a92e, 0x00a92f, PG_U_OTHER_PUNCTUATION},
+	{0x00a930, 0x00a946, PG_U_OTHER_LETTER},
+	{0x00a947, 0x00a951, PG_U_NON_SPACING_MARK},
+	{0x00a952, 0x00a953, PG_U_COMBINING_SPACING_MARK},
+	{0x00a954, 0x00a95e, PG_U_UNASSIGNED},
+	{0x00a95f, 0x00a95f, PG_U_OTHER_PUNCTUATION},
+	{0x00a960, 0x00a97c, PG_U_OTHER_LETTER},
+	{0x00a97d, 0x00a97f, PG_U_UNASSIGNED},
+	{0x00a980, 0x00a982, PG_U_NON_SPACING_MARK},
+	{0x00a983, 0x00a983, PG_U_COMBINING_SPACING_MARK},
+	{0x00a984, 0x00a9b2, PG_U_OTHER_LETTER},
+	{0x00a9b3, 0x00a9b3, PG_U_NON_SPACING_MARK},
+	{0x00a9b4, 0x00a9b5, PG_U_COMBINING_SPACING_MARK},
+	{0x00a9b6, 0x00a9b9, PG_U_NON_SPACING_MARK},
+	{0x00a9ba, 0x00a9bb, PG_U_COMBINING_SPACING_MARK},
+	{0x00a9bc, 0x00a9bd, PG_U_NON_SPACING_MARK},
+	{0x00a9be, 0x00a9c0, PG_U_COMBINING_SPACING_MARK},
+	{0x00a9c1, 0x00a9cd, PG_U_OTHER_PUNCTUATION},
+	{0x00a9ce, 0x00a9ce, PG_U_UNASSIGNED},
+	{0x00a9cf, 0x00a9cf, PG_U_MODIFIER_LETTER},
+	{0x00a9d0, 0x00a9d9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00a9da, 0x00a9dd, PG_U_UNASSIGNED},
+	{0x00a9de, 0x00a9df, PG_U_OTHER_PUNCTUATION},
+	{0x00a9e0, 0x00a9e4, PG_U_OTHER_LETTER},
+	{0x00a9e5, 0x00a9e5, PG_U_NON_SPACING_MARK},
+	{0x00a9e6, 0x00a9e6, PG_U_MODIFIER_LETTER},
+	{0x00a9e7, 0x00a9ef, PG_U_OTHER_LETTER},
+	{0x00a9f0, 0x00a9f9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00a9fa, 0x00a9fe, PG_U_OTHER_LETTER},
+	{0x00a9ff, 0x00a9ff, PG_U_UNASSIGNED},
+	{0x00aa00, 0x00aa28, PG_U_OTHER_LETTER},
+	{0x00aa29, 0x00aa2e, PG_U_NON_SPACING_MARK},
+	{0x00aa2f, 0x00aa30, PG_U_COMBINING_SPACING_MARK},
+	{0x00aa31, 0x00aa32, PG_U_NON_SPACING_MARK},
+	{0x00aa33, 0x00aa34, PG_U_COMBINING_SPACING_MARK},
+	{0x00aa35, 0x00aa36, PG_U_NON_SPACING_MARK},
+	{0x00aa37, 0x00aa3f, PG_U_UNASSIGNED},
+	{0x00aa40, 0x00aa42, PG_U_OTHER_LETTER},
+	{0x00aa43, 0x00aa43, PG_U_NON_SPACING_MARK},
+	{0x00aa44, 0x00aa4b, PG_U_OTHER_LETTER},
+	{0x00aa4c, 0x00aa4c, PG_U_NON_SPACING_MARK},
+	{0x00aa4d, 0x00aa4d, PG_U_COMBINING_SPACING_MARK},
+	{0x00aa4e, 0x00aa4f, PG_U_UNASSIGNED},
+	{0x00aa50, 0x00aa59, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00aa5a, 0x00aa5b, PG_U_UNASSIGNED},
+	{0x00aa5c, 0x00aa5f, PG_U_OTHER_PUNCTUATION},
+	{0x00aa60, 0x00aa6f, PG_U_OTHER_LETTER},
+	{0x00aa70, 0x00aa70, PG_U_MODIFIER_LETTER},
+	{0x00aa71, 0x00aa76, PG_U_OTHER_LETTER},
+	{0x00aa77, 0x00aa79, PG_U_OTHER_SYMBOL},
+	{0x00aa7a, 0x00aa7a, PG_U_OTHER_LETTER},
+	{0x00aa7b, 0x00aa7b, PG_U_COMBINING_SPACING_MARK},
+	{0x00aa7c, 0x00aa7c, PG_U_NON_SPACING_MARK},
+	{0x00aa7d, 0x00aa7d, PG_U_COMBINING_SPACING_MARK},
+	{0x00aa7e, 0x00aaaf, PG_U_OTHER_LETTER},
+	{0x00aab0, 0x00aab0, PG_U_NON_SPACING_MARK},
+	{0x00aab1, 0x00aab1, PG_U_OTHER_LETTER},
+	{0x00aab2, 0x00aab4, PG_U_NON_SPACING_MARK},
+	{0x00aab5, 0x00aab6, PG_U_OTHER_LETTER},
+	{0x00aab7, 0x00aab8, PG_U_NON_SPACING_MARK},
+	{0x00aab9, 0x00aabd, PG_U_OTHER_LETTER},
+	{0x00aabe, 0x00aabf, PG_U_NON_SPACING_MARK},
+	{0x00aac0, 0x00aac0, PG_U_OTHER_LETTER},
+	{0x00aac1, 0x00aac1, PG_U_NON_SPACING_MARK},
+	{0x00aac2, 0x00aac2, PG_U_OTHER_LETTER},
+	{0x00aac3, 0x00aada, PG_U_UNASSIGNED},
+	{0x00aadb, 0x00aadc, PG_U_OTHER_LETTER},
+	{0x00aadd, 0x00aadd, PG_U_MODIFIER_LETTER},
+	{0x00aade, 0x00aadf, PG_U_OTHER_PUNCTUATION},
+	{0x00aae0, 0x00aaea, PG_U_OTHER_LETTER},
+	{0x00aaeb, 0x00aaeb, PG_U_COMBINING_SPACING_MARK},
+	{0x00aaec, 0x00aaed, PG_U_NON_SPACING_MARK},
+	{0x00aaee, 0x00aaef, PG_U_COMBINING_SPACING_MARK},
+	{0x00aaf0, 0x00aaf1, PG_U_OTHER_PUNCTUATION},
+	{0x00aaf2, 0x00aaf2, PG_U_OTHER_LETTER},
+	{0x00aaf3, 0x00aaf4, PG_U_MODIFIER_LETTER},
+	{0x00aaf5, 0x00aaf5, PG_U_COMBINING_SPACING_MARK},
+	{0x00aaf6, 0x00aaf6, PG_U_NON_SPACING_MARK},
+	{0x00aaf7, 0x00ab00, PG_U_UNASSIGNED},
+	{0x00ab01, 0x00ab06, PG_U_OTHER_LETTER},
+	{0x00ab07, 0x00ab08, PG_U_UNASSIGNED},
+	{0x00ab09, 0x00ab0e, PG_U_OTHER_LETTER},
+	{0x00ab0f, 0x00ab10, PG_U_UNASSIGNED},
+	{0x00ab11, 0x00ab16, PG_U_OTHER_LETTER},
+	{0x00ab17, 0x00ab1f, PG_U_UNASSIGNED},
+	{0x00ab20, 0x00ab26, PG_U_OTHER_LETTER},
+	{0x00ab27, 0x00ab27, PG_U_UNASSIGNED},
+	{0x00ab28, 0x00ab2e, PG_U_OTHER_LETTER},
+	{0x00ab2f, 0x00ab2f, PG_U_UNASSIGNED},
+	{0x00ab30, 0x00ab5a, PG_U_LOWERCASE_LETTER},
+	{0x00ab5b, 0x00ab5b, PG_U_MODIFIER_SYMBOL},
+	{0x00ab5c, 0x00ab5f, PG_U_MODIFIER_LETTER},
+	{0x00ab60, 0x00ab68, PG_U_LOWERCASE_LETTER},
+	{0x00ab69, 0x00ab69, PG_U_MODIFIER_LETTER},
+	{0x00ab6a, 0x00ab6b, PG_U_MODIFIER_SYMBOL},
+	{0x00ab6c, 0x00ab6f, PG_U_UNASSIGNED},
+	{0x00ab70, 0x00abbf, PG_U_LOWERCASE_LETTER},
+	{0x00abc0, 0x00abe2, PG_U_OTHER_LETTER},
+	{0x00abe3, 0x00abe4, PG_U_COMBINING_SPACING_MARK},
+	{0x00abe5, 0x00abe5, PG_U_NON_SPACING_MARK},
+	{0x00abe6, 0x00abe7, PG_U_COMBINING_SPACING_MARK},
+	{0x00abe8, 0x00abe8, PG_U_NON_SPACING_MARK},
+	{0x00abe9, 0x00abea, PG_U_COMBINING_SPACING_MARK},
+	{0x00abeb, 0x00abeb, PG_U_OTHER_PUNCTUATION},
+	{0x00abec, 0x00abec, PG_U_COMBINING_SPACING_MARK},
+	{0x00abed, 0x00abed, PG_U_NON_SPACING_MARK},
+	{0x00abee, 0x00abef, PG_U_UNASSIGNED},
+	{0x00abf0, 0x00abf9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00abfa, 0x00abff, PG_U_UNASSIGNED},
+	{0x00ac00, 0x00d7a3, PG_U_OTHER_LETTER},
+	{0x00d7a4, 0x00d7af, PG_U_UNASSIGNED},
+	{0x00d7b0, 0x00d7c6, PG_U_OTHER_LETTER},
+	{0x00d7c7, 0x00d7ca, PG_U_UNASSIGNED},
+	{0x00d7cb, 0x00d7fb, PG_U_OTHER_LETTER},
+	{0x00d7fc, 0x00d7ff, PG_U_UNASSIGNED},
+	{0x00d800, 0x00dfff, PG_U_SURROGATE},
+	{0x00e000, 0x00f8ff, PG_U_PRIVATE_USE_CHAR},
+	{0x00f900, 0x00fa6d, PG_U_OTHER_LETTER},
+	{0x00fa6e, 0x00fa6f, PG_U_UNASSIGNED},
+	{0x00fa70, 0x00fad9, PG_U_OTHER_LETTER},
+	{0x00fada, 0x00faff, PG_U_UNASSIGNED},
+	{0x00fb00, 0x00fb06, PG_U_LOWERCASE_LETTER},
+	{0x00fb07, 0x00fb12, PG_U_UNASSIGNED},
+	{0x00fb13, 0x00fb17, PG_U_LOWERCASE_LETTER},
+	{0x00fb18, 0x00fb1c, PG_U_UNASSIGNED},
+	{0x00fb1d, 0x00fb1d, PG_U_OTHER_LETTER},
+	{0x00fb1e, 0x00fb1e, PG_U_NON_SPACING_MARK},
+	{0x00fb1f, 0x00fb28, PG_U_OTHER_LETTER},
+	{0x00fb29, 0x00fb29, PG_U_MATH_SYMBOL},
+	{0x00fb2a, 0x00fb36, PG_U_OTHER_LETTER},
+	{0x00fb37, 0x00fb37, PG_U_UNASSIGNED},
+	{0x00fb38, 0x00fb3c, PG_U_OTHER_LETTER},
+	{0x00fb3d, 0x00fb3d, PG_U_UNASSIGNED},
+	{0x00fb3e, 0x00fb3e, PG_U_OTHER_LETTER},
+	{0x00fb3f, 0x00fb3f, PG_U_UNASSIGNED},
+	{0x00fb40, 0x00fb41, PG_U_OTHER_LETTER},
+	{0x00fb42, 0x00fb42, PG_U_UNASSIGNED},
+	{0x00fb43, 0x00fb44, PG_U_OTHER_LETTER},
+	{0x00fb45, 0x00fb45, PG_U_UNASSIGNED},
+	{0x00fb46, 0x00fbb1, PG_U_OTHER_LETTER},
+	{0x00fbb2, 0x00fbc2, PG_U_MODIFIER_SYMBOL},
+	{0x00fbc3, 0x00fbd2, PG_U_UNASSIGNED},
+	{0x00fbd3, 0x00fd3d, PG_U_OTHER_LETTER},
+	{0x00fd3e, 0x00fd3e, PG_U_END_PUNCTUATION},
+	{0x00fd3f, 0x00fd3f, PG_U_START_PUNCTUATION},
+	{0x00fd40, 0x00fd4f, PG_U_OTHER_SYMBOL},
+	{0x00fd50, 0x00fd8f, PG_U_OTHER_LETTER},
+	{0x00fd90, 0x00fd91, PG_U_UNASSIGNED},
+	{0x00fd92, 0x00fdc7, PG_U_OTHER_LETTER},
+	{0x00fdc8, 0x00fdce, PG_U_UNASSIGNED},
+	{0x00fdcf, 0x00fdcf, PG_U_OTHER_SYMBOL},
+	{0x00fdd0, 0x00fdef, PG_U_UNASSIGNED},
+	{0x00fdf0, 0x00fdfb, PG_U_OTHER_LETTER},
+	{0x00fdfc, 0x00fdfc, PG_U_CURRENCY_SYMBOL},
+	{0x00fdfd, 0x00fdff, PG_U_OTHER_SYMBOL},
+	{0x00fe00, 0x00fe0f, PG_U_NON_SPACING_MARK},
+	{0x00fe10, 0x00fe16, PG_U_OTHER_PUNCTUATION},
+	{0x00fe17, 0x00fe17, PG_U_START_PUNCTUATION},
+	{0x00fe18, 0x00fe18, PG_U_END_PUNCTUATION},
+	{0x00fe19, 0x00fe19, PG_U_OTHER_PUNCTUATION},
+	{0x00fe1a, 0x00fe1f, PG_U_UNASSIGNED},
+	{0x00fe20, 0x00fe2f, PG_U_NON_SPACING_MARK},
+	{0x00fe30, 0x00fe30, PG_U_OTHER_PUNCTUATION},
+	{0x00fe31, 0x00fe32, PG_U_DASH_PUNCTUATION},
+	{0x00fe33, 0x00fe34, PG_U_CONNECTOR_PUNCTUATION},
+	{0x00fe35, 0x00fe35, PG_U_START_PUNCTUATION},
+	{0x00fe36, 0x00fe36, PG_U_END_PUNCTUATION},
+	{0x00fe37, 0x00fe37, PG_U_START_PUNCTUATION},
+	{0x00fe38, 0x00fe38, PG_U_END_PUNCTUATION},
+	{0x00fe39, 0x00fe39, PG_U_START_PUNCTUATION},
+	{0x00fe3a, 0x00fe3a, PG_U_END_PUNCTUATION},
+	{0x00fe3b, 0x00fe3b, PG_U_START_PUNCTUATION},
+	{0x00fe3c, 0x00fe3c, PG_U_END_PUNCTUATION},
+	{0x00fe3d, 0x00fe3d, PG_U_START_PUNCTUATION},
+	{0x00fe3e, 0x00fe3e, PG_U_END_PUNCTUATION},
+	{0x00fe3f, 0x00fe3f, PG_U_START_PUNCTUATION},
+	{0x00fe40, 0x00fe40, PG_U_END_PUNCTUATION},
+	{0x00fe41, 0x00fe41, PG_U_START_PUNCTUATION},
+	{0x00fe42, 0x00fe42, PG_U_END_PUNCTUATION},
+	{0x00fe43, 0x00fe43, PG_U_START_PUNCTUATION},
+	{0x00fe44, 0x00fe44, PG_U_END_PUNCTUATION},
+	{0x00fe45, 0x00fe46, PG_U_OTHER_PUNCTUATION},
+	{0x00fe47, 0x00fe47, PG_U_START_PUNCTUATION},
+	{0x00fe48, 0x00fe48, PG_U_END_PUNCTUATION},
+	{0x00fe49, 0x00fe4c, PG_U_OTHER_PUNCTUATION},
+	{0x00fe4d, 0x00fe4f, PG_U_CONNECTOR_PUNCTUATION},
+	{0x00fe50, 0x00fe52, PG_U_OTHER_PUNCTUATION},
+	{0x00fe53, 0x00fe53, PG_U_UNASSIGNED},
+	{0x00fe54, 0x00fe57, PG_U_OTHER_PUNCTUATION},
+	{0x00fe58, 0x00fe58, PG_U_DASH_PUNCTUATION},
+	{0x00fe59, 0x00fe59, PG_U_START_PUNCTUATION},
+	{0x00fe5a, 0x00fe5a, PG_U_END_PUNCTUATION},
+	{0x00fe5b, 0x00fe5b, PG_U_START_PUNCTUATION},
+	{0x00fe5c, 0x00fe5c, PG_U_END_PUNCTUATION},
+	{0x00fe5d, 0x00fe5d, PG_U_START_PUNCTUATION},
+	{0x00fe5e, 0x00fe5e, PG_U_END_PUNCTUATION},
+	{0x00fe5f, 0x00fe61, PG_U_OTHER_PUNCTUATION},
+	{0x00fe62, 0x00fe62, PG_U_MATH_SYMBOL},
+	{0x00fe63, 0x00fe63, PG_U_DASH_PUNCTUATION},
+	{0x00fe64, 0x00fe66, PG_U_MATH_SYMBOL},
+	{0x00fe67, 0x00fe67, PG_U_UNASSIGNED},
+	{0x00fe68, 0x00fe68, PG_U_OTHER_PUNCTUATION},
+	{0x00fe69, 0x00fe69, PG_U_CURRENCY_SYMBOL},
+	{0x00fe6a, 0x00fe6b, PG_U_OTHER_PUNCTUATION},
+	{0x00fe6c, 0x00fe6f, PG_U_UNASSIGNED},
+	{0x00fe70, 0x00fe74, PG_U_OTHER_LETTER},
+	{0x00fe75, 0x00fe75, PG_U_UNASSIGNED},
+	{0x00fe76, 0x00fefc, PG_U_OTHER_LETTER},
+	{0x00fefd, 0x00fefe, PG_U_UNASSIGNED},
+	{0x00feff, 0x00feff, PG_U_FORMAT_CHAR},
+	{0x00ff00, 0x00ff00, PG_U_UNASSIGNED},
+	{0x00ff01, 0x00ff03, PG_U_OTHER_PUNCTUATION},
+	{0x00ff04, 0x00ff04, PG_U_CURRENCY_SYMBOL},
+	{0x00ff05, 0x00ff07, PG_U_OTHER_PUNCTUATION},
+	{0x00ff08, 0x00ff08, PG_U_START_PUNCTUATION},
+	{0x00ff09, 0x00ff09, PG_U_END_PUNCTUATION},
+	{0x00ff0a, 0x00ff0a, PG_U_OTHER_PUNCTUATION},
+	{0x00ff0b, 0x00ff0b, PG_U_MATH_SYMBOL},
+	{0x00ff0c, 0x00ff0c, PG_U_OTHER_PUNCTUATION},
+	{0x00ff0d, 0x00ff0d, PG_U_DASH_PUNCTUATION},
+	{0x00ff0e, 0x00ff0f, PG_U_OTHER_PUNCTUATION},
+	{0x00ff10, 0x00ff19, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00ff1a, 0x00ff1b, PG_U_OTHER_PUNCTUATION},
+	{0x00ff1c, 0x00ff1e, PG_U_MATH_SYMBOL},
+	{0x00ff1f, 0x00ff20, PG_U_OTHER_PUNCTUATION},
+	{0x00ff21, 0x00ff3a, PG_U_UPPERCASE_LETTER},
+	{0x00ff3b, 0x00ff3b, PG_U_START_PUNCTUATION},
+	{0x00ff3c, 0x00ff3c, PG_U_OTHER_PUNCTUATION},
+	{0x00ff3d, 0x00ff3d, PG_U_END_PUNCTUATION},
+	{0x00ff3e, 0x00ff3e, PG_U_MODIFIER_SYMBOL},
+	{0x00ff3f, 0x00ff3f, PG_U_CONNECTOR_PUNCTUATION},
+	{0x00ff40, 0x00ff40, PG_U_MODIFIER_SYMBOL},
+	{0x00ff41, 0x00ff5a, PG_U_LOWERCASE_LETTER},
+	{0x00ff5b, 0x00ff5b, PG_U_START_PUNCTUATION},
+	{0x00ff5c, 0x00ff5c, PG_U_MATH_SYMBOL},
+	{0x00ff5d, 0x00ff5d, PG_U_END_PUNCTUATION},
+	{0x00ff5e, 0x00ff5e, PG_U_MATH_SYMBOL},
+	{0x00ff5f, 0x00ff5f, PG_U_START_PUNCTUATION},
+	{0x00ff60, 0x00ff60, PG_U_END_PUNCTUATION},
+	{0x00ff61, 0x00ff61, PG_U_OTHER_PUNCTUATION},
+	{0x00ff62, 0x00ff62, PG_U_START_PUNCTUATION},
+	{0x00ff63, 0x00ff63, PG_U_END_PUNCTUATION},
+	{0x00ff64, 0x00ff65, PG_U_OTHER_PUNCTUATION},
+	{0x00ff66, 0x00ff6f, PG_U_OTHER_LETTER},
+	{0x00ff70, 0x00ff70, PG_U_MODIFIER_LETTER},
+	{0x00ff71, 0x00ff9d, PG_U_OTHER_LETTER},
+	{0x00ff9e, 0x00ff9f, PG_U_MODIFIER_LETTER},
+	{0x00ffa0, 0x00ffbe, PG_U_OTHER_LETTER},
+	{0x00ffbf, 0x00ffc1, PG_U_UNASSIGNED},
+	{0x00ffc2, 0x00ffc7, PG_U_OTHER_LETTER},
+	{0x00ffc8, 0x00ffc9, PG_U_UNASSIGNED},
+	{0x00ffca, 0x00ffcf, PG_U_OTHER_LETTER},
+	{0x00ffd0, 0x00ffd1, PG_U_UNASSIGNED},
+	{0x00ffd2, 0x00ffd7, PG_U_OTHER_LETTER},
+	{0x00ffd8, 0x00ffd9, PG_U_UNASSIGNED},
+	{0x00ffda, 0x00ffdc, PG_U_OTHER_LETTER},
+	{0x00ffdd, 0x00ffdf, PG_U_UNASSIGNED},
+	{0x00ffe0, 0x00ffe1, PG_U_CURRENCY_SYMBOL},
+	{0x00ffe2, 0x00ffe2, PG_U_MATH_SYMBOL},
+	{0x00ffe3, 0x00ffe3, PG_U_MODIFIER_SYMBOL},
+	{0x00ffe4, 0x00ffe4, PG_U_OTHER_SYMBOL},
+	{0x00ffe5, 0x00ffe6, PG_U_CURRENCY_SYMBOL},
+	{0x00ffe7, 0x00ffe7, PG_U_UNASSIGNED},
+	{0x00ffe8, 0x00ffe8, PG_U_OTHER_SYMBOL},
+	{0x00ffe9, 0x00ffec, PG_U_MATH_SYMBOL},
+	{0x00ffed, 0x00ffee, PG_U_OTHER_SYMBOL},
+	{0x00ffef, 0x00fff8, PG_U_UNASSIGNED},
+	{0x00fff9, 0x00fffb, PG_U_FORMAT_CHAR},
+	{0x00fffc, 0x00fffd, PG_U_OTHER_SYMBOL},
+	{0x00fffe, 0x00ffff, PG_U_UNASSIGNED},
+	{0x010000, 0x01000b, PG_U_OTHER_LETTER},
+	{0x01000c, 0x01000c, PG_U_UNASSIGNED},
+	{0x01000d, 0x010026, PG_U_OTHER_LETTER},
+	{0x010027, 0x010027, PG_U_UNASSIGNED},
+	{0x010028, 0x01003a, PG_U_OTHER_LETTER},
+	{0x01003b, 0x01003b, PG_U_UNASSIGNED},
+	{0x01003c, 0x01003d, PG_U_OTHER_LETTER},
+	{0x01003e, 0x01003e, PG_U_UNASSIGNED},
+	{0x01003f, 0x01004d, PG_U_OTHER_LETTER},
+	{0x01004e, 0x01004f, PG_U_UNASSIGNED},
+	{0x010050, 0x01005d, PG_U_OTHER_LETTER},
+	{0x01005e, 0x01007f, PG_U_UNASSIGNED},
+	{0x010080, 0x0100fa, PG_U_OTHER_LETTER},
+	{0x0100fb, 0x0100ff, PG_U_UNASSIGNED},
+	{0x010100, 0x010102, PG_U_OTHER_PUNCTUATION},
+	{0x010103, 0x010106, PG_U_UNASSIGNED},
+	{0x010107, 0x010133, PG_U_OTHER_NUMBER},
+	{0x010134, 0x010136, PG_U_UNASSIGNED},
+	{0x010137, 0x01013f, PG_U_OTHER_SYMBOL},
+	{0x010140, 0x010174, PG_U_LETTER_NUMBER},
+	{0x010175, 0x010178, PG_U_OTHER_NUMBER},
+	{0x010179, 0x010189, PG_U_OTHER_SYMBOL},
+	{0x01018a, 0x01018b, PG_U_OTHER_NUMBER},
+	{0x01018c, 0x01018e, PG_U_OTHER_SYMBOL},
+	{0x01018f, 0x01018f, PG_U_UNASSIGNED},
+	{0x010190, 0x01019c, PG_U_OTHER_SYMBOL},
+	{0x01019d, 0x01019f, PG_U_UNASSIGNED},
+	{0x0101a0, 0x0101a0, PG_U_OTHER_SYMBOL},
+	{0x0101a1, 0x0101cf, PG_U_UNASSIGNED},
+	{0x0101d0, 0x0101fc, PG_U_OTHER_SYMBOL},
+	{0x0101fd, 0x0101fd, PG_U_NON_SPACING_MARK},
+	{0x0101fe, 0x01027f, PG_U_UNASSIGNED},
+	{0x010280, 0x01029c, PG_U_OTHER_LETTER},
+	{0x01029d, 0x01029f, PG_U_UNASSIGNED},
+	{0x0102a0, 0x0102d0, PG_U_OTHER_LETTER},
+	{0x0102d1, 0x0102df, PG_U_UNASSIGNED},
+	{0x0102e0, 0x0102e0, PG_U_NON_SPACING_MARK},
+	{0x0102e1, 0x0102fb, PG_U_OTHER_NUMBER},
+	{0x0102fc, 0x0102ff, PG_U_UNASSIGNED},
+	{0x010300, 0x01031f, PG_U_OTHER_LETTER},
+	{0x010320, 0x010323, PG_U_OTHER_NUMBER},
+	{0x010324, 0x01032c, PG_U_UNASSIGNED},
+	{0x01032d, 0x010340, PG_U_OTHER_LETTER},
+	{0x010341, 0x010341, PG_U_LETTER_NUMBER},
+	{0x010342, 0x010349, PG_U_OTHER_LETTER},
+	{0x01034a, 0x01034a, PG_U_LETTER_NUMBER},
+	{0x01034b, 0x01034f, PG_U_UNASSIGNED},
+	{0x010350, 0x010375, PG_U_OTHER_LETTER},
+	{0x010376, 0x01037a, PG_U_NON_SPACING_MARK},
+	{0x01037b, 0x01037f, PG_U_UNASSIGNED},
+	{0x010380, 0x01039d, PG_U_OTHER_LETTER},
+	{0x01039e, 0x01039e, PG_U_UNASSIGNED},
+	{0x01039f, 0x01039f, PG_U_OTHER_PUNCTUATION},
+	{0x0103a0, 0x0103c3, PG_U_OTHER_LETTER},
+	{0x0103c4, 0x0103c7, PG_U_UNASSIGNED},
+	{0x0103c8, 0x0103cf, PG_U_OTHER_LETTER},
+	{0x0103d0, 0x0103d0, PG_U_OTHER_PUNCTUATION},
+	{0x0103d1, 0x0103d5, PG_U_LETTER_NUMBER},
+	{0x0103d6, 0x0103ff, PG_U_UNASSIGNED},
+	{0x010400, 0x010427, PG_U_UPPERCASE_LETTER},
+	{0x010428, 0x01044f, PG_U_LOWERCASE_LETTER},
+	{0x010450, 0x01049d, PG_U_OTHER_LETTER},
+	{0x01049e, 0x01049f, PG_U_UNASSIGNED},
+	{0x0104a0, 0x0104a9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0104aa, 0x0104af, PG_U_UNASSIGNED},
+	{0x0104b0, 0x0104d3, PG_U_UPPERCASE_LETTER},
+	{0x0104d4, 0x0104d7, PG_U_UNASSIGNED},
+	{0x0104d8, 0x0104fb, PG_U_LOWERCASE_LETTER},
+	{0x0104fc, 0x0104ff, PG_U_UNASSIGNED},
+	{0x010500, 0x010527, PG_U_OTHER_LETTER},
+	{0x010528, 0x01052f, PG_U_UNASSIGNED},
+	{0x010530, 0x010563, PG_U_OTHER_LETTER},
+	{0x010564, 0x01056e, PG_U_UNASSIGNED},
+	{0x01056f, 0x01056f, PG_U_OTHER_PUNCTUATION},
+	{0x010570, 0x01057a, PG_U_UPPERCASE_LETTER},
+	{0x01057b, 0x01057b, PG_U_UNASSIGNED},
+	{0x01057c, 0x01058a, PG_U_UPPERCASE_LETTER},
+	{0x01058b, 0x01058b, PG_U_UNASSIGNED},
+	{0x01058c, 0x010592, PG_U_UPPERCASE_LETTER},
+	{0x010593, 0x010593, PG_U_UNASSIGNED},
+	{0x010594, 0x010595, PG_U_UPPERCASE_LETTER},
+	{0x010596, 0x010596, PG_U_UNASSIGNED},
+	{0x010597, 0x0105a1, PG_U_LOWERCASE_LETTER},
+	{0x0105a2, 0x0105a2, PG_U_UNASSIGNED},
+	{0x0105a3, 0x0105b1, PG_U_LOWERCASE_LETTER},
+	{0x0105b2, 0x0105b2, PG_U_UNASSIGNED},
+	{0x0105b3, 0x0105b9, PG_U_LOWERCASE_LETTER},
+	{0x0105ba, 0x0105ba, PG_U_UNASSIGNED},
+	{0x0105bb, 0x0105bc, PG_U_LOWERCASE_LETTER},
+	{0x0105bd, 0x0105ff, PG_U_UNASSIGNED},
+	{0x010600, 0x010736, PG_U_OTHER_LETTER},
+	{0x010737, 0x01073f, PG_U_UNASSIGNED},
+	{0x010740, 0x010755, PG_U_OTHER_LETTER},
+	{0x010756, 0x01075f, PG_U_UNASSIGNED},
+	{0x010760, 0x010767, PG_U_OTHER_LETTER},
+	{0x010768, 0x01077f, PG_U_UNASSIGNED},
+	{0x010780, 0x010785, PG_U_MODIFIER_LETTER},
+	{0x010786, 0x010786, PG_U_UNASSIGNED},
+	{0x010787, 0x0107b0, PG_U_MODIFIER_LETTER},
+	{0x0107b1, 0x0107b1, PG_U_UNASSIGNED},
+	{0x0107b2, 0x0107ba, PG_U_MODIFIER_LETTER},
+	{0x0107bb, 0x0107ff, PG_U_UNASSIGNED},
+	{0x010800, 0x010805, PG_U_OTHER_LETTER},
+	{0x010806, 0x010807, PG_U_UNASSIGNED},
+	{0x010808, 0x010808, PG_U_OTHER_LETTER},
+	{0x010809, 0x010809, PG_U_UNASSIGNED},
+	{0x01080a, 0x010835, PG_U_OTHER_LETTER},
+	{0x010836, 0x010836, PG_U_UNASSIGNED},
+	{0x010837, 0x010838, PG_U_OTHER_LETTER},
+	{0x010839, 0x01083b, PG_U_UNASSIGNED},
+	{0x01083c, 0x01083c, PG_U_OTHER_LETTER},
+	{0x01083d, 0x01083e, PG_U_UNASSIGNED},
+	{0x01083f, 0x010855, PG_U_OTHER_LETTER},
+	{0x010856, 0x010856, PG_U_UNASSIGNED},
+	{0x010857, 0x010857, PG_U_OTHER_PUNCTUATION},
+	{0x010858, 0x01085f, PG_U_OTHER_NUMBER},
+	{0x010860, 0x010876, PG_U_OTHER_LETTER},
+	{0x010877, 0x010878, PG_U_OTHER_SYMBOL},
+	{0x010879, 0x01087f, PG_U_OTHER_NUMBER},
+	{0x010880, 0x01089e, PG_U_OTHER_LETTER},
+	{0x01089f, 0x0108a6, PG_U_UNASSIGNED},
+	{0x0108a7, 0x0108af, PG_U_OTHER_NUMBER},
+	{0x0108b0, 0x0108df, PG_U_UNASSIGNED},
+	{0x0108e0, 0x0108f2, PG_U_OTHER_LETTER},
+	{0x0108f3, 0x0108f3, PG_U_UNASSIGNED},
+	{0x0108f4, 0x0108f5, PG_U_OTHER_LETTER},
+	{0x0108f6, 0x0108fa, PG_U_UNASSIGNED},
+	{0x0108fb, 0x0108ff, PG_U_OTHER_NUMBER},
+	{0x010900, 0x010915, PG_U_OTHER_LETTER},
+	{0x010916, 0x01091b, PG_U_OTHER_NUMBER},
+	{0x01091c, 0x01091e, PG_U_UNASSIGNED},
+	{0x01091f, 0x01091f, PG_U_OTHER_PUNCTUATION},
+	{0x010920, 0x010939, PG_U_OTHER_LETTER},
+	{0x01093a, 0x01093e, PG_U_UNASSIGNED},
+	{0x01093f, 0x01093f, PG_U_OTHER_PUNCTUATION},
+	{0x010940, 0x01097f, PG_U_UNASSIGNED},
+	{0x010980, 0x0109b7, PG_U_OTHER_LETTER},
+	{0x0109b8, 0x0109bb, PG_U_UNASSIGNED},
+	{0x0109bc, 0x0109bd, PG_U_OTHER_NUMBER},
+	{0x0109be, 0x0109bf, PG_U_OTHER_LETTER},
+	{0x0109c0, 0x0109cf, PG_U_OTHER_NUMBER},
+	{0x0109d0, 0x0109d1, PG_U_UNASSIGNED},
+	{0x0109d2, 0x0109ff, PG_U_OTHER_NUMBER},
+	{0x010a00, 0x010a00, PG_U_OTHER_LETTER},
+	{0x010a01, 0x010a03, PG_U_NON_SPACING_MARK},
+	{0x010a04, 0x010a04, PG_U_UNASSIGNED},
+	{0x010a05, 0x010a06, PG_U_NON_SPACING_MARK},
+	{0x010a07, 0x010a0b, PG_U_UNASSIGNED},
+	{0x010a0c, 0x010a0f, PG_U_NON_SPACING_MARK},
+	{0x010a10, 0x010a13, PG_U_OTHER_LETTER},
+	{0x010a14, 0x010a14, PG_U_UNASSIGNED},
+	{0x010a15, 0x010a17, PG_U_OTHER_LETTER},
+	{0x010a18, 0x010a18, PG_U_UNASSIGNED},
+	{0x010a19, 0x010a35, PG_U_OTHER_LETTER},
+	{0x010a36, 0x010a37, PG_U_UNASSIGNED},
+	{0x010a38, 0x010a3a, PG_U_NON_SPACING_MARK},
+	{0x010a3b, 0x010a3e, PG_U_UNASSIGNED},
+	{0x010a3f, 0x010a3f, PG_U_NON_SPACING_MARK},
+	{0x010a40, 0x010a48, PG_U_OTHER_NUMBER},
+	{0x010a49, 0x010a4f, PG_U_UNASSIGNED},
+	{0x010a50, 0x010a58, PG_U_OTHER_PUNCTUATION},
+	{0x010a59, 0x010a5f, PG_U_UNASSIGNED},
+	{0x010a60, 0x010a7c, PG_U_OTHER_LETTER},
+	{0x010a7d, 0x010a7e, PG_U_OTHER_NUMBER},
+	{0x010a7f, 0x010a7f, PG_U_OTHER_PUNCTUATION},
+	{0x010a80, 0x010a9c, PG_U_OTHER_LETTER},
+	{0x010a9d, 0x010a9f, PG_U_OTHER_NUMBER},
+	{0x010aa0, 0x010abf, PG_U_UNASSIGNED},
+	{0x010ac0, 0x010ac7, PG_U_OTHER_LETTER},
+	{0x010ac8, 0x010ac8, PG_U_OTHER_SYMBOL},
+	{0x010ac9, 0x010ae4, PG_U_OTHER_LETTER},
+	{0x010ae5, 0x010ae6, PG_U_NON_SPACING_MARK},
+	{0x010ae7, 0x010aea, PG_U_UNASSIGNED},
+	{0x010aeb, 0x010aef, PG_U_OTHER_NUMBER},
+	{0x010af0, 0x010af6, PG_U_OTHER_PUNCTUATION},
+	{0x010af7, 0x010aff, PG_U_UNASSIGNED},
+	{0x010b00, 0x010b35, PG_U_OTHER_LETTER},
+	{0x010b36, 0x010b38, PG_U_UNASSIGNED},
+	{0x010b39, 0x010b3f, PG_U_OTHER_PUNCTUATION},
+	{0x010b40, 0x010b55, PG_U_OTHER_LETTER},
+	{0x010b56, 0x010b57, PG_U_UNASSIGNED},
+	{0x010b58, 0x010b5f, PG_U_OTHER_NUMBER},
+	{0x010b60, 0x010b72, PG_U_OTHER_LETTER},
+	{0x010b73, 0x010b77, PG_U_UNASSIGNED},
+	{0x010b78, 0x010b7f, PG_U_OTHER_NUMBER},
+	{0x010b80, 0x010b91, PG_U_OTHER_LETTER},
+	{0x010b92, 0x010b98, PG_U_UNASSIGNED},
+	{0x010b99, 0x010b9c, PG_U_OTHER_PUNCTUATION},
+	{0x010b9d, 0x010ba8, PG_U_UNASSIGNED},
+	{0x010ba9, 0x010baf, PG_U_OTHER_NUMBER},
+	{0x010bb0, 0x010bff, PG_U_UNASSIGNED},
+	{0x010c00, 0x010c48, PG_U_OTHER_LETTER},
+	{0x010c49, 0x010c7f, PG_U_UNASSIGNED},
+	{0x010c80, 0x010cb2, PG_U_UPPERCASE_LETTER},
+	{0x010cb3, 0x010cbf, PG_U_UNASSIGNED},
+	{0x010cc0, 0x010cf2, PG_U_LOWERCASE_LETTER},
+	{0x010cf3, 0x010cf9, PG_U_UNASSIGNED},
+	{0x010cfa, 0x010cff, PG_U_OTHER_NUMBER},
+	{0x010d00, 0x010d23, PG_U_OTHER_LETTER},
+	{0x010d24, 0x010d27, PG_U_NON_SPACING_MARK},
+	{0x010d28, 0x010d2f, PG_U_UNASSIGNED},
+	{0x010d30, 0x010d39, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x010d3a, 0x010e5f, PG_U_UNASSIGNED},
+	{0x010e60, 0x010e7e, PG_U_OTHER_NUMBER},
+	{0x010e7f, 0x010e7f, PG_U_UNASSIGNED},
+	{0x010e80, 0x010ea9, PG_U_OTHER_LETTER},
+	{0x010eaa, 0x010eaa, PG_U_UNASSIGNED},
+	{0x010eab, 0x010eac, PG_U_NON_SPACING_MARK},
+	{0x010ead, 0x010ead, PG_U_DASH_PUNCTUATION},
+	{0x010eae, 0x010eaf, PG_U_UNASSIGNED},
+	{0x010eb0, 0x010eb1, PG_U_OTHER_LETTER},
+	{0x010eb2, 0x010efc, PG_U_UNASSIGNED},
+	{0x010efd, 0x010eff, PG_U_NON_SPACING_MARK},
+	{0x010f00, 0x010f1c, PG_U_OTHER_LETTER},
+	{0x010f1d, 0x010f26, PG_U_OTHER_NUMBER},
+	{0x010f27, 0x010f27, PG_U_OTHER_LETTER},
+	{0x010f28, 0x010f2f, PG_U_UNASSIGNED},
+	{0x010f30, 0x010f45, PG_U_OTHER_LETTER},
+	{0x010f46, 0x010f50, PG_U_NON_SPACING_MARK},
+	{0x010f51, 0x010f54, PG_U_OTHER_NUMBER},
+	{0x010f55, 0x010f59, PG_U_OTHER_PUNCTUATION},
+	{0x010f5a, 0x010f6f, PG_U_UNASSIGNED},
+	{0x010f70, 0x010f81, PG_U_OTHER_LETTER},
+	{0x010f82, 0x010f85, PG_U_NON_SPACING_MARK},
+	{0x010f86, 0x010f89, PG_U_OTHER_PUNCTUATION},
+	{0x010f8a, 0x010faf, PG_U_UNASSIGNED},
+	{0x010fb0, 0x010fc4, PG_U_OTHER_LETTER},
+	{0x010fc5, 0x010fcb, PG_U_OTHER_NUMBER},
+	{0x010fcc, 0x010fdf, PG_U_UNASSIGNED},
+	{0x010fe0, 0x010ff6, PG_U_OTHER_LETTER},
+	{0x010ff7, 0x010fff, PG_U_UNASSIGNED},
+	{0x011000, 0x011000, PG_U_COMBINING_SPACING_MARK},
+	{0x011001, 0x011001, PG_U_NON_SPACING_MARK},
+	{0x011002, 0x011002, PG_U_COMBINING_SPACING_MARK},
+	{0x011003, 0x011037, PG_U_OTHER_LETTER},
+	{0x011038, 0x011046, PG_U_NON_SPACING_MARK},
+	{0x011047, 0x01104d, PG_U_OTHER_PUNCTUATION},
+	{0x01104e, 0x011051, PG_U_UNASSIGNED},
+	{0x011052, 0x011065, PG_U_OTHER_NUMBER},
+	{0x011066, 0x01106f, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x011070, 0x011070, PG_U_NON_SPACING_MARK},
+	{0x011071, 0x011072, PG_U_OTHER_LETTER},
+	{0x011073, 0x011074, PG_U_NON_SPACING_MARK},
+	{0x011075, 0x011075, PG_U_OTHER_LETTER},
+	{0x011076, 0x01107e, PG_U_UNASSIGNED},
+	{0x01107f, 0x011081, PG_U_NON_SPACING_MARK},
+	{0x011082, 0x011082, PG_U_COMBINING_SPACING_MARK},
+	{0x011083, 0x0110af, PG_U_OTHER_LETTER},
+	{0x0110b0, 0x0110b2, PG_U_COMBINING_SPACING_MARK},
+	{0x0110b3, 0x0110b6, PG_U_NON_SPACING_MARK},
+	{0x0110b7, 0x0110b8, PG_U_COMBINING_SPACING_MARK},
+	{0x0110b9, 0x0110ba, PG_U_NON_SPACING_MARK},
+	{0x0110bb, 0x0110bc, PG_U_OTHER_PUNCTUATION},
+	{0x0110bd, 0x0110bd, PG_U_FORMAT_CHAR},
+	{0x0110be, 0x0110c1, PG_U_OTHER_PUNCTUATION},
+	{0x0110c2, 0x0110c2, PG_U_NON_SPACING_MARK},
+	{0x0110c3, 0x0110cc, PG_U_UNASSIGNED},
+	{0x0110cd, 0x0110cd, PG_U_FORMAT_CHAR},
+	{0x0110ce, 0x0110cf, PG_U_UNASSIGNED},
+	{0x0110d0, 0x0110e8, PG_U_OTHER_LETTER},
+	{0x0110e9, 0x0110ef, PG_U_UNASSIGNED},
+	{0x0110f0, 0x0110f9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0110fa, 0x0110ff, PG_U_UNASSIGNED},
+	{0x011100, 0x011102, PG_U_NON_SPACING_MARK},
+	{0x011103, 0x011126, PG_U_OTHER_LETTER},
+	{0x011127, 0x01112b, PG_U_NON_SPACING_MARK},
+	{0x01112c, 0x01112c, PG_U_COMBINING_SPACING_MARK},
+	{0x01112d, 0x011134, PG_U_NON_SPACING_MARK},
+	{0x011135, 0x011135, PG_U_UNASSIGNED},
+	{0x011136, 0x01113f, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x011140, 0x011143, PG_U_OTHER_PUNCTUATION},
+	{0x011144, 0x011144, PG_U_OTHER_LETTER},
+	{0x011145, 0x011146, PG_U_COMBINING_SPACING_MARK},
+	{0x011147, 0x011147, PG_U_OTHER_LETTER},
+	{0x011148, 0x01114f, PG_U_UNASSIGNED},
+	{0x011150, 0x011172, PG_U_OTHER_LETTER},
+	{0x011173, 0x011173, PG_U_NON_SPACING_MARK},
+	{0x011174, 0x011175, PG_U_OTHER_PUNCTUATION},
+	{0x011176, 0x011176, PG_U_OTHER_LETTER},
+	{0x011177, 0x01117f, PG_U_UNASSIGNED},
+	{0x011180, 0x011181, PG_U_NON_SPACING_MARK},
+	{0x011182, 0x011182, PG_U_COMBINING_SPACING_MARK},
+	{0x011183, 0x0111b2, PG_U_OTHER_LETTER},
+	{0x0111b3, 0x0111b5, PG_U_COMBINING_SPACING_MARK},
+	{0x0111b6, 0x0111be, PG_U_NON_SPACING_MARK},
+	{0x0111bf, 0x0111c0, PG_U_COMBINING_SPACING_MARK},
+	{0x0111c1, 0x0111c4, PG_U_OTHER_LETTER},
+	{0x0111c5, 0x0111c8, PG_U_OTHER_PUNCTUATION},
+	{0x0111c9, 0x0111cc, PG_U_NON_SPACING_MARK},
+	{0x0111cd, 0x0111cd, PG_U_OTHER_PUNCTUATION},
+	{0x0111ce, 0x0111ce, PG_U_COMBINING_SPACING_MARK},
+	{0x0111cf, 0x0111cf, PG_U_NON_SPACING_MARK},
+	{0x0111d0, 0x0111d9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0111da, 0x0111da, PG_U_OTHER_LETTER},
+	{0x0111db, 0x0111db, PG_U_OTHER_PUNCTUATION},
+	{0x0111dc, 0x0111dc, PG_U_OTHER_LETTER},
+	{0x0111dd, 0x0111df, PG_U_OTHER_PUNCTUATION},
+	{0x0111e0, 0x0111e0, PG_U_UNASSIGNED},
+	{0x0111e1, 0x0111f4, PG_U_OTHER_NUMBER},
+	{0x0111f5, 0x0111ff, PG_U_UNASSIGNED},
+	{0x011200, 0x011211, PG_U_OTHER_LETTER},
+	{0x011212, 0x011212, PG_U_UNASSIGNED},
+	{0x011213, 0x01122b, PG_U_OTHER_LETTER},
+	{0x01122c, 0x01122e, PG_U_COMBINING_SPACING_MARK},
+	{0x01122f, 0x011231, PG_U_NON_SPACING_MARK},
+	{0x011232, 0x011233, PG_U_COMBINING_SPACING_MARK},
+	{0x011234, 0x011234, PG_U_NON_SPACING_MARK},
+	{0x011235, 0x011235, PG_U_COMBINING_SPACING_MARK},
+	{0x011236, 0x011237, PG_U_NON_SPACING_MARK},
+	{0x011238, 0x01123d, PG_U_OTHER_PUNCTUATION},
+	{0x01123e, 0x01123e, PG_U_NON_SPACING_MARK},
+	{0x01123f, 0x011240, PG_U_OTHER_LETTER},
+	{0x011241, 0x011241, PG_U_NON_SPACING_MARK},
+	{0x011242, 0x01127f, PG_U_UNASSIGNED},
+	{0x011280, 0x011286, PG_U_OTHER_LETTER},
+	{0x011287, 0x011287, PG_U_UNASSIGNED},
+	{0x011288, 0x011288, PG_U_OTHER_LETTER},
+	{0x011289, 0x011289, PG_U_UNASSIGNED},
+	{0x01128a, 0x01128d, PG_U_OTHER_LETTER},
+	{0x01128e, 0x01128e, PG_U_UNASSIGNED},
+	{0x01128f, 0x01129d, PG_U_OTHER_LETTER},
+	{0x01129e, 0x01129e, PG_U_UNASSIGNED},
+	{0x01129f, 0x0112a8, PG_U_OTHER_LETTER},
+	{0x0112a9, 0x0112a9, PG_U_OTHER_PUNCTUATION},
+	{0x0112aa, 0x0112af, PG_U_UNASSIGNED},
+	{0x0112b0, 0x0112de, PG_U_OTHER_LETTER},
+	{0x0112df, 0x0112df, PG_U_NON_SPACING_MARK},
+	{0x0112e0, 0x0112e2, PG_U_COMBINING_SPACING_MARK},
+	{0x0112e3, 0x0112ea, PG_U_NON_SPACING_MARK},
+	{0x0112eb, 0x0112ef, PG_U_UNASSIGNED},
+	{0x0112f0, 0x0112f9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0112fa, 0x0112ff, PG_U_UNASSIGNED},
+	{0x011300, 0x011301, PG_U_NON_SPACING_MARK},
+	{0x011302, 0x011303, PG_U_COMBINING_SPACING_MARK},
+	{0x011304, 0x011304, PG_U_UNASSIGNED},
+	{0x011305, 0x01130c, PG_U_OTHER_LETTER},
+	{0x01130d, 0x01130e, PG_U_UNASSIGNED},
+	{0x01130f, 0x011310, PG_U_OTHER_LETTER},
+	{0x011311, 0x011312, PG_U_UNASSIGNED},
+	{0x011313, 0x011328, PG_U_OTHER_LETTER},
+	{0x011329, 0x011329, PG_U_UNASSIGNED},
+	{0x01132a, 0x011330, PG_U_OTHER_LETTER},
+	{0x011331, 0x011331, PG_U_UNASSIGNED},
+	{0x011332, 0x011333, PG_U_OTHER_LETTER},
+	{0x011334, 0x011334, PG_U_UNASSIGNED},
+	{0x011335, 0x011339, PG_U_OTHER_LETTER},
+	{0x01133a, 0x01133a, PG_U_UNASSIGNED},
+	{0x01133b, 0x01133c, PG_U_NON_SPACING_MARK},
+	{0x01133d, 0x01133d, PG_U_OTHER_LETTER},
+	{0x01133e, 0x01133f, PG_U_COMBINING_SPACING_MARK},
+	{0x011340, 0x011340, PG_U_NON_SPACING_MARK},
+	{0x011341, 0x011344, PG_U_COMBINING_SPACING_MARK},
+	{0x011345, 0x011346, PG_U_UNASSIGNED},
+	{0x011347, 0x011348, PG_U_COMBINING_SPACING_MARK},
+	{0x011349, 0x01134a, PG_U_UNASSIGNED},
+	{0x01134b, 0x01134d, PG_U_COMBINING_SPACING_MARK},
+	{0x01134e, 0x01134f, PG_U_UNASSIGNED},
+	{0x011350, 0x011350, PG_U_OTHER_LETTER},
+	{0x011351, 0x011356, PG_U_UNASSIGNED},
+	{0x011357, 0x011357, PG_U_COMBINING_SPACING_MARK},
+	{0x011358, 0x01135c, PG_U_UNASSIGNED},
+	{0x01135d, 0x011361, PG_U_OTHER_LETTER},
+	{0x011362, 0x011363, PG_U_COMBINING_SPACING_MARK},
+	{0x011364, 0x011365, PG_U_UNASSIGNED},
+	{0x011366, 0x01136c, PG_U_NON_SPACING_MARK},
+	{0x01136d, 0x01136f, PG_U_UNASSIGNED},
+	{0x011370, 0x011374, PG_U_NON_SPACING_MARK},
+	{0x011375, 0x0113ff, PG_U_UNASSIGNED},
+	{0x011400, 0x011434, PG_U_OTHER_LETTER},
+	{0x011435, 0x011437, PG_U_COMBINING_SPACING_MARK},
+	{0x011438, 0x01143f, PG_U_NON_SPACING_MARK},
+	{0x011440, 0x011441, PG_U_COMBINING_SPACING_MARK},
+	{0x011442, 0x011444, PG_U_NON_SPACING_MARK},
+	{0x011445, 0x011445, PG_U_COMBINING_SPACING_MARK},
+	{0x011446, 0x011446, PG_U_NON_SPACING_MARK},
+	{0x011447, 0x01144a, PG_U_OTHER_LETTER},
+	{0x01144b, 0x01144f, PG_U_OTHER_PUNCTUATION},
+	{0x011450, 0x011459, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x01145a, 0x01145b, PG_U_OTHER_PUNCTUATION},
+	{0x01145c, 0x01145c, PG_U_UNASSIGNED},
+	{0x01145d, 0x01145d, PG_U_OTHER_PUNCTUATION},
+	{0x01145e, 0x01145e, PG_U_NON_SPACING_MARK},
+	{0x01145f, 0x011461, PG_U_OTHER_LETTER},
+	{0x011462, 0x01147f, PG_U_UNASSIGNED},
+	{0x011480, 0x0114af, PG_U_OTHER_LETTER},
+	{0x0114b0, 0x0114b2, PG_U_COMBINING_SPACING_MARK},
+	{0x0114b3, 0x0114b8, PG_U_NON_SPACING_MARK},
+	{0x0114b9, 0x0114b9, PG_U_COMBINING_SPACING_MARK},
+	{0x0114ba, 0x0114ba, PG_U_NON_SPACING_MARK},
+	{0x0114bb, 0x0114be, PG_U_COMBINING_SPACING_MARK},
+	{0x0114bf, 0x0114c0, PG_U_NON_SPACING_MARK},
+	{0x0114c1, 0x0114c1, PG_U_COMBINING_SPACING_MARK},
+	{0x0114c2, 0x0114c3, PG_U_NON_SPACING_MARK},
+	{0x0114c4, 0x0114c5, PG_U_OTHER_LETTER},
+	{0x0114c6, 0x0114c6, PG_U_OTHER_PUNCTUATION},
+	{0x0114c7, 0x0114c7, PG_U_OTHER_LETTER},
+	{0x0114c8, 0x0114cf, PG_U_UNASSIGNED},
+	{0x0114d0, 0x0114d9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0114da, 0x01157f, PG_U_UNASSIGNED},
+	{0x011580, 0x0115ae, PG_U_OTHER_LETTER},
+	{0x0115af, 0x0115b1, PG_U_COMBINING_SPACING_MARK},
+	{0x0115b2, 0x0115b5, PG_U_NON_SPACING_MARK},
+	{0x0115b6, 0x0115b7, PG_U_UNASSIGNED},
+	{0x0115b8, 0x0115bb, PG_U_COMBINING_SPACING_MARK},
+	{0x0115bc, 0x0115bd, PG_U_NON_SPACING_MARK},
+	{0x0115be, 0x0115be, PG_U_COMBINING_SPACING_MARK},
+	{0x0115bf, 0x0115c0, PG_U_NON_SPACING_MARK},
+	{0x0115c1, 0x0115d7, PG_U_OTHER_PUNCTUATION},
+	{0x0115d8, 0x0115db, PG_U_OTHER_LETTER},
+	{0x0115dc, 0x0115dd, PG_U_NON_SPACING_MARK},
+	{0x0115de, 0x0115ff, PG_U_UNASSIGNED},
+	{0x011600, 0x01162f, PG_U_OTHER_LETTER},
+	{0x011630, 0x011632, PG_U_COMBINING_SPACING_MARK},
+	{0x011633, 0x01163a, PG_U_NON_SPACING_MARK},
+	{0x01163b, 0x01163c, PG_U_COMBINING_SPACING_MARK},
+	{0x01163d, 0x01163d, PG_U_NON_SPACING_MARK},
+	{0x01163e, 0x01163e, PG_U_COMBINING_SPACING_MARK},
+	{0x01163f, 0x011640, PG_U_NON_SPACING_MARK},
+	{0x011641, 0x011643, PG_U_OTHER_PUNCTUATION},
+	{0x011644, 0x011644, PG_U_OTHER_LETTER},
+	{0x011645, 0x01164f, PG_U_UNASSIGNED},
+	{0x011650, 0x011659, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x01165a, 0x01165f, PG_U_UNASSIGNED},
+	{0x011660, 0x01166c, PG_U_OTHER_PUNCTUATION},
+	{0x01166d, 0x01167f, PG_U_UNASSIGNED},
+	{0x011680, 0x0116aa, PG_U_OTHER_LETTER},
+	{0x0116ab, 0x0116ab, PG_U_NON_SPACING_MARK},
+	{0x0116ac, 0x0116ac, PG_U_COMBINING_SPACING_MARK},
+	{0x0116ad, 0x0116ad, PG_U_NON_SPACING_MARK},
+	{0x0116ae, 0x0116af, PG_U_COMBINING_SPACING_MARK},
+	{0x0116b0, 0x0116b5, PG_U_NON_SPACING_MARK},
+	{0x0116b6, 0x0116b6, PG_U_COMBINING_SPACING_MARK},
+	{0x0116b7, 0x0116b7, PG_U_NON_SPACING_MARK},
+	{0x0116b8, 0x0116b8, PG_U_OTHER_LETTER},
+	{0x0116b9, 0x0116b9, PG_U_OTHER_PUNCTUATION},
+	{0x0116ba, 0x0116bf, PG_U_UNASSIGNED},
+	{0x0116c0, 0x0116c9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0116ca, 0x0116ff, PG_U_UNASSIGNED},
+	{0x011700, 0x01171a, PG_U_OTHER_LETTER},
+	{0x01171b, 0x01171c, PG_U_UNASSIGNED},
+	{0x01171d, 0x01171f, PG_U_NON_SPACING_MARK},
+	{0x011720, 0x011721, PG_U_COMBINING_SPACING_MARK},
+	{0x011722, 0x011725, PG_U_NON_SPACING_MARK},
+	{0x011726, 0x011726, PG_U_COMBINING_SPACING_MARK},
+	{0x011727, 0x01172b, PG_U_NON_SPACING_MARK},
+	{0x01172c, 0x01172f, PG_U_UNASSIGNED},
+	{0x011730, 0x011739, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x01173a, 0x01173b, PG_U_OTHER_NUMBER},
+	{0x01173c, 0x01173e, PG_U_OTHER_PUNCTUATION},
+	{0x01173f, 0x01173f, PG_U_OTHER_SYMBOL},
+	{0x011740, 0x011746, PG_U_OTHER_LETTER},
+	{0x011747, 0x0117ff, PG_U_UNASSIGNED},
+	{0x011800, 0x01182b, PG_U_OTHER_LETTER},
+	{0x01182c, 0x01182e, PG_U_COMBINING_SPACING_MARK},
+	{0x01182f, 0x011837, PG_U_NON_SPACING_MARK},
+	{0x011838, 0x011838, PG_U_COMBINING_SPACING_MARK},
+	{0x011839, 0x01183a, PG_U_NON_SPACING_MARK},
+	{0x01183b, 0x01183b, PG_U_OTHER_PUNCTUATION},
+	{0x01183c, 0x01189f, PG_U_UNASSIGNED},
+	{0x0118a0, 0x0118bf, PG_U_UPPERCASE_LETTER},
+	{0x0118c0, 0x0118df, PG_U_LOWERCASE_LETTER},
+	{0x0118e0, 0x0118e9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0118ea, 0x0118f2, PG_U_OTHER_NUMBER},
+	{0x0118f3, 0x0118fe, PG_U_UNASSIGNED},
+	{0x0118ff, 0x011906, PG_U_OTHER_LETTER},
+	{0x011907, 0x011908, PG_U_UNASSIGNED},
+	{0x011909, 0x011909, PG_U_OTHER_LETTER},
+	{0x01190a, 0x01190b, PG_U_UNASSIGNED},
+	{0x01190c, 0x011913, PG_U_OTHER_LETTER},
+	{0x011914, 0x011914, PG_U_UNASSIGNED},
+	{0x011915, 0x011916, PG_U_OTHER_LETTER},
+	{0x011917, 0x011917, PG_U_UNASSIGNED},
+	{0x011918, 0x01192f, PG_U_OTHER_LETTER},
+	{0x011930, 0x011935, PG_U_COMBINING_SPACING_MARK},
+	{0x011936, 0x011936, PG_U_UNASSIGNED},
+	{0x011937, 0x011938, PG_U_COMBINING_SPACING_MARK},
+	{0x011939, 0x01193a, PG_U_UNASSIGNED},
+	{0x01193b, 0x01193c, PG_U_NON_SPACING_MARK},
+	{0x01193d, 0x01193d, PG_U_COMBINING_SPACING_MARK},
+	{0x01193e, 0x01193e, PG_U_NON_SPACING_MARK},
+	{0x01193f, 0x01193f, PG_U_OTHER_LETTER},
+	{0x011940, 0x011940, PG_U_COMBINING_SPACING_MARK},
+	{0x011941, 0x011941, PG_U_OTHER_LETTER},
+	{0x011942, 0x011942, PG_U_COMBINING_SPACING_MARK},
+	{0x011943, 0x011943, PG_U_NON_SPACING_MARK},
+	{0x011944, 0x011946, PG_U_OTHER_PUNCTUATION},
+	{0x011947, 0x01194f, PG_U_UNASSIGNED},
+	{0x011950, 0x011959, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x01195a, 0x01199f, PG_U_UNASSIGNED},
+	{0x0119a0, 0x0119a7, PG_U_OTHER_LETTER},
+	{0x0119a8, 0x0119a9, PG_U_UNASSIGNED},
+	{0x0119aa, 0x0119d0, PG_U_OTHER_LETTER},
+	{0x0119d1, 0x0119d3, PG_U_COMBINING_SPACING_MARK},
+	{0x0119d4, 0x0119d7, PG_U_NON_SPACING_MARK},
+	{0x0119d8, 0x0119d9, PG_U_UNASSIGNED},
+	{0x0119da, 0x0119db, PG_U_NON_SPACING_MARK},
+	{0x0119dc, 0x0119df, PG_U_COMBINING_SPACING_MARK},
+	{0x0119e0, 0x0119e0, PG_U_NON_SPACING_MARK},
+	{0x0119e1, 0x0119e1, PG_U_OTHER_LETTER},
+	{0x0119e2, 0x0119e2, PG_U_OTHER_PUNCTUATION},
+	{0x0119e3, 0x0119e3, PG_U_OTHER_LETTER},
+	{0x0119e4, 0x0119e4, PG_U_COMBINING_SPACING_MARK},
+	{0x0119e5, 0x0119ff, PG_U_UNASSIGNED},
+	{0x011a00, 0x011a00, PG_U_OTHER_LETTER},
+	{0x011a01, 0x011a0a, PG_U_NON_SPACING_MARK},
+	{0x011a0b, 0x011a32, PG_U_OTHER_LETTER},
+	{0x011a33, 0x011a38, PG_U_NON_SPACING_MARK},
+	{0x011a39, 0x011a39, PG_U_COMBINING_SPACING_MARK},
+	{0x011a3a, 0x011a3a, PG_U_OTHER_LETTER},
+	{0x011a3b, 0x011a3e, PG_U_NON_SPACING_MARK},
+	{0x011a3f, 0x011a46, PG_U_OTHER_PUNCTUATION},
+	{0x011a47, 0x011a47, PG_U_NON_SPACING_MARK},
+	{0x011a48, 0x011a4f, PG_U_UNASSIGNED},
+	{0x011a50, 0x011a50, PG_U_OTHER_LETTER},
+	{0x011a51, 0x011a56, PG_U_NON_SPACING_MARK},
+	{0x011a57, 0x011a58, PG_U_COMBINING_SPACING_MARK},
+	{0x011a59, 0x011a5b, PG_U_NON_SPACING_MARK},
+	{0x011a5c, 0x011a89, PG_U_OTHER_LETTER},
+	{0x011a8a, 0x011a96, PG_U_NON_SPACING_MARK},
+	{0x011a97, 0x011a97, PG_U_COMBINING_SPACING_MARK},
+	{0x011a98, 0x011a99, PG_U_NON_SPACING_MARK},
+	{0x011a9a, 0x011a9c, PG_U_OTHER_PUNCTUATION},
+	{0x011a9d, 0x011a9d, PG_U_OTHER_LETTER},
+	{0x011a9e, 0x011aa2, PG_U_OTHER_PUNCTUATION},
+	{0x011aa3, 0x011aaf, PG_U_UNASSIGNED},
+	{0x011ab0, 0x011af8, PG_U_OTHER_LETTER},
+	{0x011af9, 0x011aff, PG_U_UNASSIGNED},
+	{0x011b00, 0x011b09, PG_U_OTHER_PUNCTUATION},
+	{0x011b0a, 0x011bff, PG_U_UNASSIGNED},
+	{0x011c00, 0x011c08, PG_U_OTHER_LETTER},
+	{0x011c09, 0x011c09, PG_U_UNASSIGNED},
+	{0x011c0a, 0x011c2e, PG_U_OTHER_LETTER},
+	{0x011c2f, 0x011c2f, PG_U_COMBINING_SPACING_MARK},
+	{0x011c30, 0x011c36, PG_U_NON_SPACING_MARK},
+	{0x011c37, 0x011c37, PG_U_UNASSIGNED},
+	{0x011c38, 0x011c3d, PG_U_NON_SPACING_MARK},
+	{0x011c3e, 0x011c3e, PG_U_COMBINING_SPACING_MARK},
+	{0x011c3f, 0x011c3f, PG_U_NON_SPACING_MARK},
+	{0x011c40, 0x011c40, PG_U_OTHER_LETTER},
+	{0x011c41, 0x011c45, PG_U_OTHER_PUNCTUATION},
+	{0x011c46, 0x011c4f, PG_U_UNASSIGNED},
+	{0x011c50, 0x011c59, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x011c5a, 0x011c6c, PG_U_OTHER_NUMBER},
+	{0x011c6d, 0x011c6f, PG_U_UNASSIGNED},
+	{0x011c70, 0x011c71, PG_U_OTHER_PUNCTUATION},
+	{0x011c72, 0x011c8f, PG_U_OTHER_LETTER},
+	{0x011c90, 0x011c91, PG_U_UNASSIGNED},
+	{0x011c92, 0x011ca7, PG_U_NON_SPACING_MARK},
+	{0x011ca8, 0x011ca8, PG_U_UNASSIGNED},
+	{0x011ca9, 0x011ca9, PG_U_COMBINING_SPACING_MARK},
+	{0x011caa, 0x011cb0, PG_U_NON_SPACING_MARK},
+	{0x011cb1, 0x011cb1, PG_U_COMBINING_SPACING_MARK},
+	{0x011cb2, 0x011cb3, PG_U_NON_SPACING_MARK},
+	{0x011cb4, 0x011cb4, PG_U_COMBINING_SPACING_MARK},
+	{0x011cb5, 0x011cb6, PG_U_NON_SPACING_MARK},
+	{0x011cb7, 0x011cff, PG_U_UNASSIGNED},
+	{0x011d00, 0x011d06, PG_U_OTHER_LETTER},
+	{0x011d07, 0x011d07, PG_U_UNASSIGNED},
+	{0x011d08, 0x011d09, PG_U_OTHER_LETTER},
+	{0x011d0a, 0x011d0a, PG_U_UNASSIGNED},
+	{0x011d0b, 0x011d30, PG_U_OTHER_LETTER},
+	{0x011d31, 0x011d36, PG_U_NON_SPACING_MARK},
+	{0x011d37, 0x011d39, PG_U_UNASSIGNED},
+	{0x011d3a, 0x011d3a, PG_U_NON_SPACING_MARK},
+	{0x011d3b, 0x011d3b, PG_U_UNASSIGNED},
+	{0x011d3c, 0x011d3d, PG_U_NON_SPACING_MARK},
+	{0x011d3e, 0x011d3e, PG_U_UNASSIGNED},
+	{0x011d3f, 0x011d45, PG_U_NON_SPACING_MARK},
+	{0x011d46, 0x011d46, PG_U_OTHER_LETTER},
+	{0x011d47, 0x011d47, PG_U_NON_SPACING_MARK},
+	{0x011d48, 0x011d4f, PG_U_UNASSIGNED},
+	{0x011d50, 0x011d59, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x011d5a, 0x011d5f, PG_U_UNASSIGNED},
+	{0x011d60, 0x011d65, PG_U_OTHER_LETTER},
+	{0x011d66, 0x011d66, PG_U_UNASSIGNED},
+	{0x011d67, 0x011d68, PG_U_OTHER_LETTER},
+	{0x011d69, 0x011d69, PG_U_UNASSIGNED},
+	{0x011d6a, 0x011d89, PG_U_OTHER_LETTER},
+	{0x011d8a, 0x011d8e, PG_U_COMBINING_SPACING_MARK},
+	{0x011d8f, 0x011d8f, PG_U_UNASSIGNED},
+	{0x011d90, 0x011d91, PG_U_NON_SPACING_MARK},
+	{0x011d92, 0x011d92, PG_U_UNASSIGNED},
+	{0x011d93, 0x011d94, PG_U_COMBINING_SPACING_MARK},
+	{0x011d95, 0x011d95, PG_U_NON_SPACING_MARK},
+	{0x011d96, 0x011d96, PG_U_COMBINING_SPACING_MARK},
+	{0x011d97, 0x011d97, PG_U_NON_SPACING_MARK},
+	{0x011d98, 0x011d98, PG_U_OTHER_LETTER},
+	{0x011d99, 0x011d9f, PG_U_UNASSIGNED},
+	{0x011da0, 0x011da9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x011daa, 0x011edf, PG_U_UNASSIGNED},
+	{0x011ee0, 0x011ef2, PG_U_OTHER_LETTER},
+	{0x011ef3, 0x011ef4, PG_U_NON_SPACING_MARK},
+	{0x011ef5, 0x011ef6, PG_U_COMBINING_SPACING_MARK},
+	{0x011ef7, 0x011ef8, PG_U_OTHER_PUNCTUATION},
+	{0x011ef9, 0x011eff, PG_U_UNASSIGNED},
+	{0x011f00, 0x011f01, PG_U_NON_SPACING_MARK},
+	{0x011f02, 0x011f02, PG_U_OTHER_LETTER},
+	{0x011f03, 0x011f03, PG_U_COMBINING_SPACING_MARK},
+	{0x011f04, 0x011f10, PG_U_OTHER_LETTER},
+	{0x011f11, 0x011f11, PG_U_UNASSIGNED},
+	{0x011f12, 0x011f33, PG_U_OTHER_LETTER},
+	{0x011f34, 0x011f35, PG_U_COMBINING_SPACING_MARK},
+	{0x011f36, 0x011f3a, PG_U_NON_SPACING_MARK},
+	{0x011f3b, 0x011f3d, PG_U_UNASSIGNED},
+	{0x011f3e, 0x011f3f, PG_U_COMBINING_SPACING_MARK},
+	{0x011f40, 0x011f40, PG_U_NON_SPACING_MARK},
+	{0x011f41, 0x011f41, PG_U_COMBINING_SPACING_MARK},
+	{0x011f42, 0x011f42, PG_U_NON_SPACING_MARK},
+	{0x011f43, 0x011f4f, PG_U_OTHER_PUNCTUATION},
+	{0x011f50, 0x011f59, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x011f5a, 0x011faf, PG_U_UNASSIGNED},
+	{0x011fb0, 0x011fb0, PG_U_OTHER_LETTER},
+	{0x011fb1, 0x011fbf, PG_U_UNASSIGNED},
+	{0x011fc0, 0x011fd4, PG_U_OTHER_NUMBER},
+	{0x011fd5, 0x011fdc, PG_U_OTHER_SYMBOL},
+	{0x011fdd, 0x011fe0, PG_U_CURRENCY_SYMBOL},
+	{0x011fe1, 0x011ff1, PG_U_OTHER_SYMBOL},
+	{0x011ff2, 0x011ffe, PG_U_UNASSIGNED},
+	{0x011fff, 0x011fff, PG_U_OTHER_PUNCTUATION},
+	{0x012000, 0x012399, PG_U_OTHER_LETTER},
+	{0x01239a, 0x0123ff, PG_U_UNASSIGNED},
+	{0x012400, 0x01246e, PG_U_LETTER_NUMBER},
+	{0x01246f, 0x01246f, PG_U_UNASSIGNED},
+	{0x012470, 0x012474, PG_U_OTHER_PUNCTUATION},
+	{0x012475, 0x01247f, PG_U_UNASSIGNED},
+	{0x012480, 0x012543, PG_U_OTHER_LETTER},
+	{0x012544, 0x012f8f, PG_U_UNASSIGNED},
+	{0x012f90, 0x012ff0, PG_U_OTHER_LETTER},
+	{0x012ff1, 0x012ff2, PG_U_OTHER_PUNCTUATION},
+	{0x012ff3, 0x012fff, PG_U_UNASSIGNED},
+	{0x013000, 0x01342f, PG_U_OTHER_LETTER},
+	{0x013430, 0x01343f, PG_U_FORMAT_CHAR},
+	{0x013440, 0x013440, PG_U_NON_SPACING_MARK},
+	{0x013441, 0x013446, PG_U_OTHER_LETTER},
+	{0x013447, 0x013455, PG_U_NON_SPACING_MARK},
+	{0x013456, 0x0143ff, PG_U_UNASSIGNED},
+	{0x014400, 0x014646, PG_U_OTHER_LETTER},
+	{0x014647, 0x0167ff, PG_U_UNASSIGNED},
+	{0x016800, 0x016a38, PG_U_OTHER_LETTER},
+	{0x016a39, 0x016a3f, PG_U_UNASSIGNED},
+	{0x016a40, 0x016a5e, PG_U_OTHER_LETTER},
+	{0x016a5f, 0x016a5f, PG_U_UNASSIGNED},
+	{0x016a60, 0x016a69, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x016a6a, 0x016a6d, PG_U_UNASSIGNED},
+	{0x016a6e, 0x016a6f, PG_U_OTHER_PUNCTUATION},
+	{0x016a70, 0x016abe, PG_U_OTHER_LETTER},
+	{0x016abf, 0x016abf, PG_U_UNASSIGNED},
+	{0x016ac0, 0x016ac9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x016aca, 0x016acf, PG_U_UNASSIGNED},
+	{0x016ad0, 0x016aed, PG_U_OTHER_LETTER},
+	{0x016aee, 0x016aef, PG_U_UNASSIGNED},
+	{0x016af0, 0x016af4, PG_U_NON_SPACING_MARK},
+	{0x016af5, 0x016af5, PG_U_OTHER_PUNCTUATION},
+	{0x016af6, 0x016aff, PG_U_UNASSIGNED},
+	{0x016b00, 0x016b2f, PG_U_OTHER_LETTER},
+	{0x016b30, 0x016b36, PG_U_NON_SPACING_MARK},
+	{0x016b37, 0x016b3b, PG_U_OTHER_PUNCTUATION},
+	{0x016b3c, 0x016b3f, PG_U_OTHER_SYMBOL},
+	{0x016b40, 0x016b43, PG_U_MODIFIER_LETTER},
+	{0x016b44, 0x016b44, PG_U_OTHER_PUNCTUATION},
+	{0x016b45, 0x016b45, PG_U_OTHER_SYMBOL},
+	{0x016b46, 0x016b4f, PG_U_UNASSIGNED},
+	{0x016b50, 0x016b59, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x016b5a, 0x016b5a, PG_U_UNASSIGNED},
+	{0x016b5b, 0x016b61, PG_U_OTHER_NUMBER},
+	{0x016b62, 0x016b62, PG_U_UNASSIGNED},
+	{0x016b63, 0x016b77, PG_U_OTHER_LETTER},
+	{0x016b78, 0x016b7c, PG_U_UNASSIGNED},
+	{0x016b7d, 0x016b8f, PG_U_OTHER_LETTER},
+	{0x016b90, 0x016e3f, PG_U_UNASSIGNED},
+	{0x016e40, 0x016e5f, PG_U_UPPERCASE_LETTER},
+	{0x016e60, 0x016e7f, PG_U_LOWERCASE_LETTER},
+	{0x016e80, 0x016e96, PG_U_OTHER_NUMBER},
+	{0x016e97, 0x016e9a, PG_U_OTHER_PUNCTUATION},
+	{0x016e9b, 0x016eff, PG_U_UNASSIGNED},
+	{0x016f00, 0x016f4a, PG_U_OTHER_LETTER},
+	{0x016f4b, 0x016f4e, PG_U_UNASSIGNED},
+	{0x016f4f, 0x016f4f, PG_U_NON_SPACING_MARK},
+	{0x016f50, 0x016f50, PG_U_OTHER_LETTER},
+	{0x016f51, 0x016f87, PG_U_COMBINING_SPACING_MARK},
+	{0x016f88, 0x016f8e, PG_U_UNASSIGNED},
+	{0x016f8f, 0x016f92, PG_U_NON_SPACING_MARK},
+	{0x016f93, 0x016f9f, PG_U_MODIFIER_LETTER},
+	{0x016fa0, 0x016fdf, PG_U_UNASSIGNED},
+	{0x016fe0, 0x016fe1, PG_U_MODIFIER_LETTER},
+	{0x016fe2, 0x016fe2, PG_U_OTHER_PUNCTUATION},
+	{0x016fe3, 0x016fe3, PG_U_MODIFIER_LETTER},
+	{0x016fe4, 0x016fe4, PG_U_NON_SPACING_MARK},
+	{0x016fe5, 0x016fef, PG_U_UNASSIGNED},
+	{0x016ff0, 0x016ff1, PG_U_COMBINING_SPACING_MARK},
+	{0x016ff2, 0x016fff, PG_U_UNASSIGNED},
+	{0x017000, 0x0187f7, PG_U_OTHER_LETTER},
+	{0x0187f8, 0x0187ff, PG_U_UNASSIGNED},
+	{0x018800, 0x018cd5, PG_U_OTHER_LETTER},
+	{0x018cd6, 0x018cff, PG_U_UNASSIGNED},
+	{0x018d00, 0x018d08, PG_U_OTHER_LETTER},
+	{0x018d09, 0x01afef, PG_U_UNASSIGNED},
+	{0x01aff0, 0x01aff3, PG_U_MODIFIER_LETTER},
+	{0x01aff4, 0x01aff4, PG_U_UNASSIGNED},
+	{0x01aff5, 0x01affb, PG_U_MODIFIER_LETTER},
+	{0x01affc, 0x01affc, PG_U_UNASSIGNED},
+	{0x01affd, 0x01affe, PG_U_MODIFIER_LETTER},
+	{0x01afff, 0x01afff, PG_U_UNASSIGNED},
+	{0x01b000, 0x01b122, PG_U_OTHER_LETTER},
+	{0x01b123, 0x01b131, PG_U_UNASSIGNED},
+	{0x01b132, 0x01b132, PG_U_OTHER_LETTER},
+	{0x01b133, 0x01b14f, PG_U_UNASSIGNED},
+	{0x01b150, 0x01b152, PG_U_OTHER_LETTER},
+	{0x01b153, 0x01b154, PG_U_UNASSIGNED},
+	{0x01b155, 0x01b155, PG_U_OTHER_LETTER},
+	{0x01b156, 0x01b163, PG_U_UNASSIGNED},
+	{0x01b164, 0x01b167, PG_U_OTHER_LETTER},
+	{0x01b168, 0x01b16f, PG_U_UNASSIGNED},
+	{0x01b170, 0x01b2fb, PG_U_OTHER_LETTER},
+	{0x01b2fc, 0x01bbff, PG_U_UNASSIGNED},
+	{0x01bc00, 0x01bc6a, PG_U_OTHER_LETTER},
+	{0x01bc6b, 0x01bc6f, PG_U_UNASSIGNED},
+	{0x01bc70, 0x01bc7c, PG_U_OTHER_LETTER},
+	{0x01bc7d, 0x01bc7f, PG_U_UNASSIGNED},
+	{0x01bc80, 0x01bc88, PG_U_OTHER_LETTER},
+	{0x01bc89, 0x01bc8f, PG_U_UNASSIGNED},
+	{0x01bc90, 0x01bc99, PG_U_OTHER_LETTER},
+	{0x01bc9a, 0x01bc9b, PG_U_UNASSIGNED},
+	{0x01bc9c, 0x01bc9c, PG_U_OTHER_SYMBOL},
+	{0x01bc9d, 0x01bc9e, PG_U_NON_SPACING_MARK},
+	{0x01bc9f, 0x01bc9f, PG_U_OTHER_PUNCTUATION},
+	{0x01bca0, 0x01bca3, PG_U_FORMAT_CHAR},
+	{0x01bca4, 0x01ceff, PG_U_UNASSIGNED},
+	{0x01cf00, 0x01cf2d, PG_U_NON_SPACING_MARK},
+	{0x01cf2e, 0x01cf2f, PG_U_UNASSIGNED},
+	{0x01cf30, 0x01cf46, PG_U_NON_SPACING_MARK},
+	{0x01cf47, 0x01cf4f, PG_U_UNASSIGNED},
+	{0x01cf50, 0x01cfc3, PG_U_OTHER_SYMBOL},
+	{0x01cfc4, 0x01cfff, PG_U_UNASSIGNED},
+	{0x01d000, 0x01d0f5, PG_U_OTHER_SYMBOL},
+	{0x01d0f6, 0x01d0ff, PG_U_UNASSIGNED},
+	{0x01d100, 0x01d126, PG_U_OTHER_SYMBOL},
+	{0x01d127, 0x01d128, PG_U_UNASSIGNED},
+	{0x01d129, 0x01d164, PG_U_OTHER_SYMBOL},
+	{0x01d165, 0x01d166, PG_U_COMBINING_SPACING_MARK},
+	{0x01d167, 0x01d169, PG_U_NON_SPACING_MARK},
+	{0x01d16a, 0x01d16c, PG_U_OTHER_SYMBOL},
+	{0x01d16d, 0x01d172, PG_U_COMBINING_SPACING_MARK},
+	{0x01d173, 0x01d17a, PG_U_FORMAT_CHAR},
+	{0x01d17b, 0x01d182, PG_U_NON_SPACING_MARK},
+	{0x01d183, 0x01d184, PG_U_OTHER_SYMBOL},
+	{0x01d185, 0x01d18b, PG_U_NON_SPACING_MARK},
+	{0x01d18c, 0x01d1a9, PG_U_OTHER_SYMBOL},
+	{0x01d1aa, 0x01d1ad, PG_U_NON_SPACING_MARK},
+	{0x01d1ae, 0x01d1ea, PG_U_OTHER_SYMBOL},
+	{0x01d1eb, 0x01d1ff, PG_U_UNASSIGNED},
+	{0x01d200, 0x01d241, PG_U_OTHER_SYMBOL},
+	{0x01d242, 0x01d244, PG_U_NON_SPACING_MARK},
+	{0x01d245, 0x01d245, PG_U_OTHER_SYMBOL},
+	{0x01d246, 0x01d2bf, PG_U_UNASSIGNED},
+	{0x01d2c0, 0x01d2d3, PG_U_OTHER_NUMBER},
+	{0x01d2d4, 0x01d2df, PG_U_UNASSIGNED},
+	{0x01d2e0, 0x01d2f3, PG_U_OTHER_NUMBER},
+	{0x01d2f4, 0x01d2ff, PG_U_UNASSIGNED},
+	{0x01d300, 0x01d356, PG_U_OTHER_SYMBOL},
+	{0x01d357, 0x01d35f, PG_U_UNASSIGNED},
+	{0x01d360, 0x01d378, PG_U_OTHER_NUMBER},
+	{0x01d379, 0x01d3ff, PG_U_UNASSIGNED},
+	{0x01d400, 0x01d419, PG_U_UPPERCASE_LETTER},
+	{0x01d41a, 0x01d433, PG_U_LOWERCASE_LETTER},
+	{0x01d434, 0x01d44d, PG_U_UPPERCASE_LETTER},
+	{0x01d44e, 0x01d454, PG_U_LOWERCASE_LETTER},
+	{0x01d455, 0x01d455, PG_U_UNASSIGNED},
+	{0x01d456, 0x01d467, PG_U_LOWERCASE_LETTER},
+	{0x01d468, 0x01d481, PG_U_UPPERCASE_LETTER},
+	{0x01d482, 0x01d49b, PG_U_LOWERCASE_LETTER},
+	{0x01d49c, 0x01d49c, PG_U_UPPERCASE_LETTER},
+	{0x01d49d, 0x01d49d, PG_U_UNASSIGNED},
+	{0x01d49e, 0x01d49f, PG_U_UPPERCASE_LETTER},
+	{0x01d4a0, 0x01d4a1, PG_U_UNASSIGNED},
+	{0x01d4a2, 0x01d4a2, PG_U_UPPERCASE_LETTER},
+	{0x01d4a3, 0x01d4a4, PG_U_UNASSIGNED},
+	{0x01d4a5, 0x01d4a6, PG_U_UPPERCASE_LETTER},
+	{0x01d4a7, 0x01d4a8, PG_U_UNASSIGNED},
+	{0x01d4a9, 0x01d4ac, PG_U_UPPERCASE_LETTER},
+	{0x01d4ad, 0x01d4ad, PG_U_UNASSIGNED},
+	{0x01d4ae, 0x01d4b5, PG_U_UPPERCASE_LETTER},
+	{0x01d4b6, 0x01d4b9, PG_U_LOWERCASE_LETTER},
+	{0x01d4ba, 0x01d4ba, PG_U_UNASSIGNED},
+	{0x01d4bb, 0x01d4bb, PG_U_LOWERCASE_LETTER},
+	{0x01d4bc, 0x01d4bc, PG_U_UNASSIGNED},
+	{0x01d4bd, 0x01d4c3, PG_U_LOWERCASE_LETTER},
+	{0x01d4c4, 0x01d4c4, PG_U_UNASSIGNED},
+	{0x01d4c5, 0x01d4cf, PG_U_LOWERCASE_LETTER},
+	{0x01d4d0, 0x01d4e9, PG_U_UPPERCASE_LETTER},
+	{0x01d4ea, 0x01d503, PG_U_LOWERCASE_LETTER},
+	{0x01d504, 0x01d505, PG_U_UPPERCASE_LETTER},
+	{0x01d506, 0x01d506, PG_U_UNASSIGNED},
+	{0x01d507, 0x01d50a, PG_U_UPPERCASE_LETTER},
+	{0x01d50b, 0x01d50c, PG_U_UNASSIGNED},
+	{0x01d50d, 0x01d514, PG_U_UPPERCASE_LETTER},
+	{0x01d515, 0x01d515, PG_U_UNASSIGNED},
+	{0x01d516, 0x01d51c, PG_U_UPPERCASE_LETTER},
+	{0x01d51d, 0x01d51d, PG_U_UNASSIGNED},
+	{0x01d51e, 0x01d537, PG_U_LOWERCASE_LETTER},
+	{0x01d538, 0x01d539, PG_U_UPPERCASE_LETTER},
+	{0x01d53a, 0x01d53a, PG_U_UNASSIGNED},
+	{0x01d53b, 0x01d53e, PG_U_UPPERCASE_LETTER},
+	{0x01d53f, 0x01d53f, PG_U_UNASSIGNED},
+	{0x01d540, 0x01d544, PG_U_UPPERCASE_LETTER},
+	{0x01d545, 0x01d545, PG_U_UNASSIGNED},
+	{0x01d546, 0x01d546, PG_U_UPPERCASE_LETTER},
+	{0x01d547, 0x01d549, PG_U_UNASSIGNED},
+	{0x01d54a, 0x01d550, PG_U_UPPERCASE_LETTER},
+	{0x01d551, 0x01d551, PG_U_UNASSIGNED},
+	{0x01d552, 0x01d56b, PG_U_LOWERCASE_LETTER},
+	{0x01d56c, 0x01d585, PG_U_UPPERCASE_LETTER},
+	{0x01d586, 0x01d59f, PG_U_LOWERCASE_LETTER},
+	{0x01d5a0, 0x01d5b9, PG_U_UPPERCASE_LETTER},
+	{0x01d5ba, 0x01d5d3, PG_U_LOWERCASE_LETTER},
+	{0x01d5d4, 0x01d5ed, PG_U_UPPERCASE_LETTER},
+	{0x01d5ee, 0x01d607, PG_U_LOWERCASE_LETTER},
+	{0x01d608, 0x01d621, PG_U_UPPERCASE_LETTER},
+	{0x01d622, 0x01d63b, PG_U_LOWERCASE_LETTER},
+	{0x01d63c, 0x01d655, PG_U_UPPERCASE_LETTER},
+	{0x01d656, 0x01d66f, PG_U_LOWERCASE_LETTER},
+	{0x01d670, 0x01d689, PG_U_UPPERCASE_LETTER},
+	{0x01d68a, 0x01d6a5, PG_U_LOWERCASE_LETTER},
+	{0x01d6a6, 0x01d6a7, PG_U_UNASSIGNED},
+	{0x01d6a8, 0x01d6c0, PG_U_UPPERCASE_LETTER},
+	{0x01d6c1, 0x01d6c1, PG_U_MATH_SYMBOL},
+	{0x01d6c2, 0x01d6da, PG_U_LOWERCASE_LETTER},
+	{0x01d6db, 0x01d6db, PG_U_MATH_SYMBOL},
+	{0x01d6dc, 0x01d6e1, PG_U_LOWERCASE_LETTER},
+	{0x01d6e2, 0x01d6fa, PG_U_UPPERCASE_LETTER},
+	{0x01d6fb, 0x01d6fb, PG_U_MATH_SYMBOL},
+	{0x01d6fc, 0x01d714, PG_U_LOWERCASE_LETTER},
+	{0x01d715, 0x01d715, PG_U_MATH_SYMBOL},
+	{0x01d716, 0x01d71b, PG_U_LOWERCASE_LETTER},
+	{0x01d71c, 0x01d734, PG_U_UPPERCASE_LETTER},
+	{0x01d735, 0x01d735, PG_U_MATH_SYMBOL},
+	{0x01d736, 0x01d74e, PG_U_LOWERCASE_LETTER},
+	{0x01d74f, 0x01d74f, PG_U_MATH_SYMBOL},
+	{0x01d750, 0x01d755, PG_U_LOWERCASE_LETTER},
+	{0x01d756, 0x01d76e, PG_U_UPPERCASE_LETTER},
+	{0x01d76f, 0x01d76f, PG_U_MATH_SYMBOL},
+	{0x01d770, 0x01d788, PG_U_LOWERCASE_LETTER},
+	{0x01d789, 0x01d789, PG_U_MATH_SYMBOL},
+	{0x01d78a, 0x01d78f, PG_U_LOWERCASE_LETTER},
+	{0x01d790, 0x01d7a8, PG_U_UPPERCASE_LETTER},
+	{0x01d7a9, 0x01d7a9, PG_U_MATH_SYMBOL},
+	{0x01d7aa, 0x01d7c2, PG_U_LOWERCASE_LETTER},
+	{0x01d7c3, 0x01d7c3, PG_U_MATH_SYMBOL},
+	{0x01d7c4, 0x01d7c9, PG_U_LOWERCASE_LETTER},
+	{0x01d7ca, 0x01d7ca, PG_U_UPPERCASE_LETTER},
+	{0x01d7cb, 0x01d7cb, PG_U_LOWERCASE_LETTER},
+	{0x01d7cc, 0x01d7cd, PG_U_UNASSIGNED},
+	{0x01d7ce, 0x01d7ff, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x01d800, 0x01d9ff, PG_U_OTHER_SYMBOL},
+	{0x01da00, 0x01da36, PG_U_NON_SPACING_MARK},
+	{0x01da37, 0x01da3a, PG_U_OTHER_SYMBOL},
+	{0x01da3b, 0x01da6c, PG_U_NON_SPACING_MARK},
+	{0x01da6d, 0x01da74, PG_U_OTHER_SYMBOL},
+	{0x01da75, 0x01da75, PG_U_NON_SPACING_MARK},
+	{0x01da76, 0x01da83, PG_U_OTHER_SYMBOL},
+	{0x01da84, 0x01da84, PG_U_NON_SPACING_MARK},
+	{0x01da85, 0x01da86, PG_U_OTHER_SYMBOL},
+	{0x01da87, 0x01da8b, PG_U_OTHER_PUNCTUATION},
+	{0x01da8c, 0x01da9a, PG_U_UNASSIGNED},
+	{0x01da9b, 0x01da9f, PG_U_NON_SPACING_MARK},
+	{0x01daa0, 0x01daa0, PG_U_UNASSIGNED},
+	{0x01daa1, 0x01daaf, PG_U_NON_SPACING_MARK},
+	{0x01dab0, 0x01deff, PG_U_UNASSIGNED},
+	{0x01df00, 0x01df09, PG_U_LOWERCASE_LETTER},
+	{0x01df0a, 0x01df0a, PG_U_OTHER_LETTER},
+	{0x01df0b, 0x01df1e, PG_U_LOWERCASE_LETTER},
+	{0x01df1f, 0x01df24, PG_U_UNASSIGNED},
+	{0x01df25, 0x01df2a, PG_U_LOWERCASE_LETTER},
+	{0x01df2b, 0x01dfff, PG_U_UNASSIGNED},
+	{0x01e000, 0x01e006, PG_U_NON_SPACING_MARK},
+	{0x01e007, 0x01e007, PG_U_UNASSIGNED},
+	{0x01e008, 0x01e018, PG_U_NON_SPACING_MARK},
+	{0x01e019, 0x01e01a, PG_U_UNASSIGNED},
+	{0x01e01b, 0x01e021, PG_U_NON_SPACING_MARK},
+	{0x01e022, 0x01e022, PG_U_UNASSIGNED},
+	{0x01e023, 0x01e024, PG_U_NON_SPACING_MARK},
+	{0x01e025, 0x01e025, PG_U_UNASSIGNED},
+	{0x01e026, 0x01e02a, PG_U_NON_SPACING_MARK},
+	{0x01e02b, 0x01e02f, PG_U_UNASSIGNED},
+	{0x01e030, 0x01e06d, PG_U_MODIFIER_LETTER},
+	{0x01e06e, 0x01e08e, PG_U_UNASSIGNED},
+	{0x01e08f, 0x01e08f, PG_U_NON_SPACING_MARK},
+	{0x01e090, 0x01e0ff, PG_U_UNASSIGNED},
+	{0x01e100, 0x01e12c, PG_U_OTHER_LETTER},
+	{0x01e12d, 0x01e12f, PG_U_UNASSIGNED},
+	{0x01e130, 0x01e136, PG_U_NON_SPACING_MARK},
+	{0x01e137, 0x01e13d, PG_U_MODIFIER_LETTER},
+	{0x01e13e, 0x01e13f, PG_U_UNASSIGNED},
+	{0x01e140, 0x01e149, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x01e14a, 0x01e14d, PG_U_UNASSIGNED},
+	{0x01e14e, 0x01e14e, PG_U_OTHER_LETTER},
+	{0x01e14f, 0x01e14f, PG_U_OTHER_SYMBOL},
+	{0x01e150, 0x01e28f, PG_U_UNASSIGNED},
+	{0x01e290, 0x01e2ad, PG_U_OTHER_LETTER},
+	{0x01e2ae, 0x01e2ae, PG_U_NON_SPACING_MARK},
+	{0x01e2af, 0x01e2bf, PG_U_UNASSIGNED},
+	{0x01e2c0, 0x01e2eb, PG_U_OTHER_LETTER},
+	{0x01e2ec, 0x01e2ef, PG_U_NON_SPACING_MARK},
+	{0x01e2f0, 0x01e2f9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x01e2fa, 0x01e2fe, PG_U_UNASSIGNED},
+	{0x01e2ff, 0x01e2ff, PG_U_CURRENCY_SYMBOL},
+	{0x01e300, 0x01e4cf, PG_U_UNASSIGNED},
+	{0x01e4d0, 0x01e4ea, PG_U_OTHER_LETTER},
+	{0x01e4eb, 0x01e4eb, PG_U_MODIFIER_LETTER},
+	{0x01e4ec, 0x01e4ef, PG_U_NON_SPACING_MARK},
+	{0x01e4f0, 0x01e4f9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x01e4fa, 0x01e7df, PG_U_UNASSIGNED},
+	{0x01e7e0, 0x01e7e6, PG_U_OTHER_LETTER},
+	{0x01e7e7, 0x01e7e7, PG_U_UNASSIGNED},
+	{0x01e7e8, 0x01e7eb, PG_U_OTHER_LETTER},
+	{0x01e7ec, 0x01e7ec, PG_U_UNASSIGNED},
+	{0x01e7ed, 0x01e7ee, PG_U_OTHER_LETTER},
+	{0x01e7ef, 0x01e7ef, PG_U_UNASSIGNED},
+	{0x01e7f0, 0x01e7fe, PG_U_OTHER_LETTER},
+	{0x01e7ff, 0x01e7ff, PG_U_UNASSIGNED},
+	{0x01e800, 0x01e8c4, PG_U_OTHER_LETTER},
+	{0x01e8c5, 0x01e8c6, PG_U_UNASSIGNED},
+	{0x01e8c7, 0x01e8cf, PG_U_OTHER_NUMBER},
+	{0x01e8d0, 0x01e8d6, PG_U_NON_SPACING_MARK},
+	{0x01e8d7, 0x01e8ff, PG_U_UNASSIGNED},
+	{0x01e900, 0x01e921, PG_U_UPPERCASE_LETTER},
+	{0x01e922, 0x01e943, PG_U_LOWERCASE_LETTER},
+	{0x01e944, 0x01e94a, PG_U_NON_SPACING_MARK},
+	{0x01e94b, 0x01e94b, PG_U_MODIFIER_LETTER},
+	{0x01e94c, 0x01e94f, PG_U_UNASSIGNED},
+	{0x01e950, 0x01e959, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x01e95a, 0x01e95d, PG_U_UNASSIGNED},
+	{0x01e95e, 0x01e95f, PG_U_OTHER_PUNCTUATION},
+	{0x01e960, 0x01ec70, PG_U_UNASSIGNED},
+	{0x01ec71, 0x01ecab, PG_U_OTHER_NUMBER},
+	{0x01ecac, 0x01ecac, PG_U_OTHER_SYMBOL},
+	{0x01ecad, 0x01ecaf, PG_U_OTHER_NUMBER},
+	{0x01ecb0, 0x01ecb0, PG_U_CURRENCY_SYMBOL},
+	{0x01ecb1, 0x01ecb4, PG_U_OTHER_NUMBER},
+	{0x01ecb5, 0x01ed00, PG_U_UNASSIGNED},
+	{0x01ed01, 0x01ed2d, PG_U_OTHER_NUMBER},
+	{0x01ed2e, 0x01ed2e, PG_U_OTHER_SYMBOL},
+	{0x01ed2f, 0x01ed3d, PG_U_OTHER_NUMBER},
+	{0x01ed3e, 0x01edff, PG_U_UNASSIGNED},
+	{0x01ee00, 0x01ee03, PG_U_OTHER_LETTER},
+	{0x01ee04, 0x01ee04, PG_U_UNASSIGNED},
+	{0x01ee05, 0x01ee1f, PG_U_OTHER_LETTER},
+	{0x01ee20, 0x01ee20, PG_U_UNASSIGNED},
+	{0x01ee21, 0x01ee22, PG_U_OTHER_LETTER},
+	{0x01ee23, 0x01ee23, PG_U_UNASSIGNED},
+	{0x01ee24, 0x01ee24, PG_U_OTHER_LETTER},
+	{0x01ee25, 0x01ee26, PG_U_UNASSIGNED},
+	{0x01ee27, 0x01ee27, PG_U_OTHER_LETTER},
+	{0x01ee28, 0x01ee28, PG_U_UNASSIGNED},
+	{0x01ee29, 0x01ee32, PG_U_OTHER_LETTER},
+	{0x01ee33, 0x01ee33, PG_U_UNASSIGNED},
+	{0x01ee34, 0x01ee37, PG_U_OTHER_LETTER},
+	{0x01ee38, 0x01ee38, PG_U_UNASSIGNED},
+	{0x01ee39, 0x01ee39, PG_U_OTHER_LETTER},
+	{0x01ee3a, 0x01ee3a, PG_U_UNASSIGNED},
+	{0x01ee3b, 0x01ee3b, PG_U_OTHER_LETTER},
+	{0x01ee3c, 0x01ee41, PG_U_UNASSIGNED},
+	{0x01ee42, 0x01ee42, PG_U_OTHER_LETTER},
+	{0x01ee43, 0x01ee46, PG_U_UNASSIGNED},
+	{0x01ee47, 0x01ee47, PG_U_OTHER_LETTER},
+	{0x01ee48, 0x01ee48, PG_U_UNASSIGNED},
+	{0x01ee49, 0x01ee49, PG_U_OTHER_LETTER},
+	{0x01ee4a, 0x01ee4a, PG_U_UNASSIGNED},
+	{0x01ee4b, 0x01ee4b, PG_U_OTHER_LETTER},
+	{0x01ee4c, 0x01ee4c, PG_U_UNASSIGNED},
+	{0x01ee4d, 0x01ee4f, PG_U_OTHER_LETTER},
+	{0x01ee50, 0x01ee50, PG_U_UNASSIGNED},
+	{0x01ee51, 0x01ee52, PG_U_OTHER_LETTER},
+	{0x01ee53, 0x01ee53, PG_U_UNASSIGNED},
+	{0x01ee54, 0x01ee54, PG_U_OTHER_LETTER},
+	{0x01ee55, 0x01ee56, PG_U_UNASSIGNED},
+	{0x01ee57, 0x01ee57, PG_U_OTHER_LETTER},
+	{0x01ee58, 0x01ee58, PG_U_UNASSIGNED},
+	{0x01ee59, 0x01ee59, PG_U_OTHER_LETTER},
+	{0x01ee5a, 0x01ee5a, PG_U_UNASSIGNED},
+	{0x01ee5b, 0x01ee5b, PG_U_OTHER_LETTER},
+	{0x01ee5c, 0x01ee5c, PG_U_UNASSIGNED},
+	{0x01ee5d, 0x01ee5d, PG_U_OTHER_LETTER},
+	{0x01ee5e, 0x01ee5e, PG_U_UNASSIGNED},
+	{0x01ee5f, 0x01ee5f, PG_U_OTHER_LETTER},
+	{0x01ee60, 0x01ee60, PG_U_UNASSIGNED},
+	{0x01ee61, 0x01ee62, PG_U_OTHER_LETTER},
+	{0x01ee63, 0x01ee63, PG_U_UNASSIGNED},
+	{0x01ee64, 0x01ee64, PG_U_OTHER_LETTER},
+	{0x01ee65, 0x01ee66, PG_U_UNASSIGNED},
+	{0x01ee67, 0x01ee6a, PG_U_OTHER_LETTER},
+	{0x01ee6b, 0x01ee6b, PG_U_UNASSIGNED},
+	{0x01ee6c, 0x01ee72, PG_U_OTHER_LETTER},
+	{0x01ee73, 0x01ee73, PG_U_UNASSIGNED},
+	{0x01ee74, 0x01ee77, PG_U_OTHER_LETTER},
+	{0x01ee78, 0x01ee78, PG_U_UNASSIGNED},
+	{0x01ee79, 0x01ee7c, PG_U_OTHER_LETTER},
+	{0x01ee7d, 0x01ee7d, PG_U_UNASSIGNED},
+	{0x01ee7e, 0x01ee7e, PG_U_OTHER_LETTER},
+	{0x01ee7f, 0x01ee7f, PG_U_UNASSIGNED},
+	{0x01ee80, 0x01ee89, PG_U_OTHER_LETTER},
+	{0x01ee8a, 0x01ee8a, PG_U_UNASSIGNED},
+	{0x01ee8b, 0x01ee9b, PG_U_OTHER_LETTER},
+	{0x01ee9c, 0x01eea0, PG_U_UNASSIGNED},
+	{0x01eea1, 0x01eea3, PG_U_OTHER_LETTER},
+	{0x01eea4, 0x01eea4, PG_U_UNASSIGNED},
+	{0x01eea5, 0x01eea9, PG_U_OTHER_LETTER},
+	{0x01eeaa, 0x01eeaa, PG_U_UNASSIGNED},
+	{0x01eeab, 0x01eebb, PG_U_OTHER_LETTER},
+	{0x01eebc, 0x01eeef, PG_U_UNASSIGNED},
+	{0x01eef0, 0x01eef1, PG_U_MATH_SYMBOL},
+	{0x01eef2, 0x01efff, PG_U_UNASSIGNED},
+	{0x01f000, 0x01f02b, PG_U_OTHER_SYMBOL},
+	{0x01f02c, 0x01f02f, PG_U_UNASSIGNED},
+	{0x01f030, 0x01f093, PG_U_OTHER_SYMBOL},
+	{0x01f094, 0x01f09f, PG_U_UNASSIGNED},
+	{0x01f0a0, 0x01f0ae, PG_U_OTHER_SYMBOL},
+	{0x01f0af, 0x01f0b0, PG_U_UNASSIGNED},
+	{0x01f0b1, 0x01f0bf, PG_U_OTHER_SYMBOL},
+	{0x01f0c0, 0x01f0c0, PG_U_UNASSIGNED},
+	{0x01f0c1, 0x01f0cf, PG_U_OTHER_SYMBOL},
+	{0x01f0d0, 0x01f0d0, PG_U_UNASSIGNED},
+	{0x01f0d1, 0x01f0f5, PG_U_OTHER_SYMBOL},
+	{0x01f0f6, 0x01f0ff, PG_U_UNASSIGNED},
+	{0x01f100, 0x01f10c, PG_U_OTHER_NUMBER},
+	{0x01f10d, 0x01f1ad, PG_U_OTHER_SYMBOL},
+	{0x01f1ae, 0x01f1e5, PG_U_UNASSIGNED},
+	{0x01f1e6, 0x01f202, PG_U_OTHER_SYMBOL},
+	{0x01f203, 0x01f20f, PG_U_UNASSIGNED},
+	{0x01f210, 0x01f23b, PG_U_OTHER_SYMBOL},
+	{0x01f23c, 0x01f23f, PG_U_UNASSIGNED},
+	{0x01f240, 0x01f248, PG_U_OTHER_SYMBOL},
+	{0x01f249, 0x01f24f, PG_U_UNASSIGNED},
+	{0x01f250, 0x01f251, PG_U_OTHER_SYMBOL},
+	{0x01f252, 0x01f25f, PG_U_UNASSIGNED},
+	{0x01f260, 0x01f265, PG_U_OTHER_SYMBOL},
+	{0x01f266, 0x01f2ff, PG_U_UNASSIGNED},
+	{0x01f300, 0x01f3fa, PG_U_OTHER_SYMBOL},
+	{0x01f3fb, 0x01f3ff, PG_U_MODIFIER_SYMBOL},
+	{0x01f400, 0x01f6d7, PG_U_OTHER_SYMBOL},
+	{0x01f6d8, 0x01f6db, PG_U_UNASSIGNED},
+	{0x01f6dc, 0x01f6ec, PG_U_OTHER_SYMBOL},
+	{0x01f6ed, 0x01f6ef, PG_U_UNASSIGNED},
+	{0x01f6f0, 0x01f6fc, PG_U_OTHER_SYMBOL},
+	{0x01f6fd, 0x01f6ff, PG_U_UNASSIGNED},
+	{0x01f700, 0x01f776, PG_U_OTHER_SYMBOL},
+	{0x01f777, 0x01f77a, PG_U_UNASSIGNED},
+	{0x01f77b, 0x01f7d9, PG_U_OTHER_SYMBOL},
+	{0x01f7da, 0x01f7df, PG_U_UNASSIGNED},
+	{0x01f7e0, 0x01f7eb, PG_U_OTHER_SYMBOL},
+	{0x01f7ec, 0x01f7ef, PG_U_UNASSIGNED},
+	{0x01f7f0, 0x01f7f0, PG_U_OTHER_SYMBOL},
+	{0x01f7f1, 0x01f7ff, PG_U_UNASSIGNED},
+	{0x01f800, 0x01f80b, PG_U_OTHER_SYMBOL},
+	{0x01f80c, 0x01f80f, PG_U_UNASSIGNED},
+	{0x01f810, 0x01f847, PG_U_OTHER_SYMBOL},
+	{0x01f848, 0x01f84f, PG_U_UNASSIGNED},
+	{0x01f850, 0x01f859, PG_U_OTHER_SYMBOL},
+	{0x01f85a, 0x01f85f, PG_U_UNASSIGNED},
+	{0x01f860, 0x01f887, PG_U_OTHER_SYMBOL},
+	{0x01f888, 0x01f88f, PG_U_UNASSIGNED},
+	{0x01f890, 0x01f8ad, PG_U_OTHER_SYMBOL},
+	{0x01f8ae, 0x01f8af, PG_U_UNASSIGNED},
+	{0x01f8b0, 0x01f8b1, PG_U_OTHER_SYMBOL},
+	{0x01f8b2, 0x01f8ff, PG_U_UNASSIGNED},
+	{0x01f900, 0x01fa53, PG_U_OTHER_SYMBOL},
+	{0x01fa54, 0x01fa5f, PG_U_UNASSIGNED},
+	{0x01fa60, 0x01fa6d, PG_U_OTHER_SYMBOL},
+	{0x01fa6e, 0x01fa6f, PG_U_UNASSIGNED},
+	{0x01fa70, 0x01fa7c, PG_U_OTHER_SYMBOL},
+	{0x01fa7d, 0x01fa7f, PG_U_UNASSIGNED},
+	{0x01fa80, 0x01fa88, PG_U_OTHER_SYMBOL},
+	{0x01fa89, 0x01fa8f, PG_U_UNASSIGNED},
+	{0x01fa90, 0x01fabd, PG_U_OTHER_SYMBOL},
+	{0x01fabe, 0x01fabe, PG_U_UNASSIGNED},
+	{0x01fabf, 0x01fac5, PG_U_OTHER_SYMBOL},
+	{0x01fac6, 0x01facd, PG_U_UNASSIGNED},
+	{0x01face, 0x01fadb, PG_U_OTHER_SYMBOL},
+	{0x01fadc, 0x01fadf, PG_U_UNASSIGNED},
+	{0x01fae0, 0x01fae8, PG_U_OTHER_SYMBOL},
+	{0x01fae9, 0x01faef, PG_U_UNASSIGNED},
+	{0x01faf0, 0x01faf8, PG_U_OTHER_SYMBOL},
+	{0x01faf9, 0x01faff, PG_U_UNASSIGNED},
+	{0x01fb00, 0x01fb92, PG_U_OTHER_SYMBOL},
+	{0x01fb93, 0x01fb93, PG_U_UNASSIGNED},
+	{0x01fb94, 0x01fbca, PG_U_OTHER_SYMBOL},
+	{0x01fbcb, 0x01fbef, PG_U_UNASSIGNED},
+	{0x01fbf0, 0x01fbf9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x01fbfa, 0x01ffff, PG_U_UNASSIGNED},
+	{0x020000, 0x02a6df, PG_U_OTHER_LETTER},
+	{0x02a6e0, 0x02a6ff, PG_U_UNASSIGNED},
+	{0x02a700, 0x02b739, PG_U_OTHER_LETTER},
+	{0x02b73a, 0x02b73f, PG_U_UNASSIGNED},
+	{0x02b740, 0x02b81d, PG_U_OTHER_LETTER},
+	{0x02b81e, 0x02b81f, PG_U_UNASSIGNED},
+	{0x02b820, 0x02cea1, PG_U_OTHER_LETTER},
+	{0x02cea2, 0x02ceaf, PG_U_UNASSIGNED},
+	{0x02ceb0, 0x02ebe0, PG_U_OTHER_LETTER},
+	{0x02ebe1, 0x02ebef, PG_U_UNASSIGNED},
+	{0x02ebf0, 0x02ee5d, PG_U_OTHER_LETTER},
+	{0x02ee5e, 0x02f7ff, PG_U_UNASSIGNED},
+	{0x02f800, 0x02fa1d, PG_U_OTHER_LETTER},
+	{0x02fa1e, 0x02ffff, PG_U_UNASSIGNED},
+	{0x030000, 0x03134a, PG_U_OTHER_LETTER},
+	{0x03134b, 0x03134f, PG_U_UNASSIGNED},
+	{0x031350, 0x0323af, PG_U_OTHER_LETTER},
+	{0x0323b0, 0x0e0000, PG_U_UNASSIGNED},
+	{0x0e0001, 0x0e0001, PG_U_FORMAT_CHAR},
+	{0x0e0002, 0x0e001f, PG_U_UNASSIGNED},
+	{0x0e0020, 0x0e007f, PG_U_FORMAT_CHAR},
+	{0x0e0080, 0x0e00ff, PG_U_UNASSIGNED},
+	{0x0e0100, 0x0e01ef, PG_U_NON_SPACING_MARK},
+	{0x0e01f0, 0x0effff, PG_U_UNASSIGNED},
+	{0x0f0000, 0x0ffffd, PG_U_PRIVATE_USE_CHAR},
+	{0x0ffffe, 0x0fffff, PG_U_UNASSIGNED},
+	{0x100000, 0x10fffd, PG_U_PRIVATE_USE_CHAR},
+	{0x10fffe, 0x10ffff, PG_U_UNASSIGNED}
+};
+
-- 
2.34.1
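
A sorted, non-overlapping range table like the one above is usually searched with a binary search. The following is a minimal sketch of that idea, not the patch's actual code: `demo_range`, `demo_lookup`, and the `CAT_*` enum are hypothetical names, and the table holds only six rows copied from the generated table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the patch defines its own types and names. */
typedef enum
{
	CAT_UNASSIGNED,
	CAT_OTHER_LETTER,
	CAT_NON_SPACING_MARK
} demo_category;

typedef struct
{
	uint32_t	first;		/* first code point of the range, inclusive */
	uint32_t	last;		/* last code point of the range, inclusive */
	demo_category category;
} demo_range;

/* Six rows copied from the generated table; sorted and non-overlapping. */
const demo_range demo_table[] = {
	{0x011d00, 0x011d06, CAT_OTHER_LETTER},
	{0x011d07, 0x011d07, CAT_UNASSIGNED},
	{0x011d08, 0x011d09, CAT_OTHER_LETTER},
	{0x011d0a, 0x011d0a, CAT_UNASSIGNED},
	{0x011d0b, 0x011d30, CAT_OTHER_LETTER},
	{0x011d31, 0x011d36, CAT_NON_SPACING_MARK},
};

const size_t demo_table_len = sizeof(demo_table) / sizeof(demo_table[0]);

/*
 * Binary search for the range containing cp.  In this sketch, a code point
 * that falls outside every listed range is reported as CAT_UNASSIGNED.
 */
demo_category
demo_lookup(const demo_range *table, size_t n, uint32_t cp)
{
	size_t		lo = 0;
	size_t		hi = n;

	assert(table != NULL);

	while (lo < hi)
	{
		size_t		mid = lo + (hi - lo) / 2;

		if (cp < table[mid].first)
			hi = mid;
		else if (cp > table[mid].last)
			lo = mid + 1;
		else
			return table[mid].category;
	}
	return CAT_UNASSIGNED;
}
```

Because adjacent code points with the same category are merged into one row, the table stays small and each lookup takes a logarithmic number of comparisons.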

#46Robert Haas
robertmhaas@gmail.com
In reply to: Jeff Davis (#42)
Re: Pre-proposal: unicode normalized text

On Fri, Oct 6, 2023 at 3:07 PM Jeff Davis <pgsql@j-davis.com> wrote:

On Fri, 2023-10-06 at 13:33 -0400, Robert Haas wrote:

What I think people really want is a whole column in
some encoding that isn't the normal one for that database.

Do people really want that? I'd be curious to know why.

Because it's a feature that exists in other products and so having it
eases migrations and/or replication of data between systems.

I'm not saying that there are a lot of people who want this, any more.
I think there used to be more interest in it. But the point of the
comment was that people who want multiple character set support want
it as a per-column property, not a per-value property. I've never
heard of anyone wanting to store text blobs in multiple distinct
character sets in the same column. But I have heard of people wanting
text blobs in multiple distinct character sets in the same database,
each one in its own column.

--
Robert Haas
EDB: http://www.enterprisedb.com

#47Peter Eisentraut
peter@eisentraut.org
In reply to: Jeff Davis (#45)
Re: Pre-proposal: unicode normalized text

On 07.10.23 03:18, Jeff Davis wrote:

On Wed, 2023-10-04 at 13:16 -0400, Robert Haas wrote:

At minimum I think we need to have some internal functions to check for
unassigned code points. That belongs in core, because we generate the
unicode tables from a specific version.

That's a good idea.

Patch attached.

Can you restate what this is supposed to be for? This thread appears to
have morphed from "let's normalize everything" to "let's check for
unassigned code points", but I'm not sure what we are aiming for now.

#48Peter Eisentraut
peter@eisentraut.org
In reply to: Jeff Davis (#35)
Re: Pre-proposal: unicode normalized text

On 06.10.23 19:22, Jeff Davis wrote:

On Fri, 2023-10-06 at 09:58 +0200, Peter Eisentraut wrote:

If you want to be rigid about it, you also need to consider whether the
Unicode version used by the ICU library in use matches the one used by
the in-core tables.

What problem are you concerned about here? I thought about it and I
didn't see an obvious issue.

If the ICU unicode version is ahead of the Postgres unicode version,
and no unassigned code points are used according to the Postgres
version, then there's no problem.

And in the other direction, there might be some code points that are
assigned according to the postgres unicode version but unassigned
according to the ICU version. But that would be tracked by the
collation version as you pointed out earlier, so upgrading ICU would be
like any other ICU upgrade (with the same risks). Right?

It might be alright in this particular combination of circumstances.
But in general if we rely on these tables for correctness (e.g., check
that a string is normalized before passing it to a function that
requires it to be normalized), we would need to consider this. The
correct fix would then probably be to not use our own tables but use
some ICU function to achieve the desired task.
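
To make the version-mismatch concern concrete, here is a minimal sketch of comparing two Unicode version strings. It assumes both versions are available as "major.minor" strings such as "15.1", which this thread does not confirm, and `demo_compare_unicode_versions` is a hypothetical helper, not part of any patch.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Compare two "major.minor" Unicode version strings.
 *
 * Returns a negative value if a is older than b, zero if they are equal,
 * and a positive value if a is newer.  *ok is set to 0 (and 0 is returned)
 * if either string cannot be parsed as "major.minor".
 */
int
demo_compare_unicode_versions(const char *a, const char *b, int *ok)
{
	int			amaj, amin, bmaj, bmin;

	assert(a != NULL && b != NULL && ok != NULL);

	*ok = (sscanf(a, "%d.%d", &amaj, &amin) == 2 &&
		   sscanf(b, "%d.%d", &bmaj, &bmin) == 2);
	if (!*ok)
		return 0;

	if (amaj != bmaj)
		return (amaj < bmaj) ? -1 : 1;
	if (amin != bmin)
		return (amin < bmin) ? -1 : 1;
	return 0;
}
```

A caller that relies on the in-core tables for correctness could warn whenever the result is non-zero, since either side may then consider a code point assigned that the other does not.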

#49Robert Haas
robertmhaas@gmail.com
In reply to: Peter Eisentraut (#47)
Re: Pre-proposal: unicode normalized text

On Tue, Oct 10, 2023 at 2:44 AM Peter Eisentraut <peter@eisentraut.org> wrote:

Can you restate what this is supposed to be for? This thread appears to
have morphed from "let's normalize everything" to "let's check for
unassigned code points", but I'm not sure what we are aiming for now.

Jeff can say what he wants it for, but one obvious application would
be to have the ability to add a CHECK constraint that forbids
inserting unassigned code points into your database, which would be
useful if you're worried about forward-compatibility with collation
definitions that might be extended to cover those code points in the
future. Another application would be to find data already in your
database that has this potential problem.

--
Robert Haas
EDB: http://www.enterprisedb.com

#50Jeff Davis
pgsql@j-davis.com
In reply to: Robert Haas (#49)
1 attachment(s)
Re: Pre-proposal: unicode normalized text

On Tue, 2023-10-10 at 10:02 -0400, Robert Haas wrote:

On Tue, Oct 10, 2023 at 2:44 AM Peter Eisentraut <peter@eisentraut.org> wrote:

Can you restate what this is supposed to be for? This thread appears to
have morphed from "let's normalize everything" to "let's check for
unassigned code points", but I'm not sure what we are aiming for now.

It was a "pre-proposal", so yes, the goalposts have moved a bit. Right
now I'm aiming to get some primitives in place that will be useful by
themselves, but also that we can potentially build on.

Attached is a new version of the patch which introduces some SQL
functions as well:

* unicode_is_valid(text): returns true if all code points are
assigned, false otherwise
* unicode_version(): version of Unicode Postgres is built with
* icu_unicode_version(): version of Unicode ICU is built with
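
The check described for unicode_is_valid() can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch's implementation: all `demo_*` names are hypothetical, malformed UTF-8 is treated as "not valid" (the thread does not say how the patch handles it), and the toy predicate knows only two unassigned code points taken from the table earlier in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef bool (*demo_assigned_fn) (uint32_t cp);

/* Decode one UTF-8 sequence into *cp; return its length, or 0 if malformed. */
size_t
demo_utf8_decode(const unsigned char *s, size_t len, uint32_t *cp)
{
	size_t		need;
	uint32_t	min;

	if (len == 0)
		return 0;
	if (s[0] < 0x80)
	{
		*cp = s[0];
		return 1;
	}
	else if ((s[0] & 0xE0) == 0xC0)
	{
		need = 2; *cp = s[0] & 0x1F; min = 0x80;
	}
	else if ((s[0] & 0xF0) == 0xE0)
	{
		need = 3; *cp = s[0] & 0x0F; min = 0x800;
	}
	else if ((s[0] & 0xF8) == 0xF0)
	{
		need = 4; *cp = s[0] & 0x07; min = 0x10000;
	}
	else
		return 0;

	if (len < need)
		return 0;
	for (size_t i = 1; i < need; i++)
	{
		if ((s[i] & 0xC0) != 0x80)
			return 0;			/* not a continuation byte */
		*cp = (*cp << 6) | (s[i] & 0x3F);
	}
	/* Reject overlong forms, surrogates, and values past U+10FFFF. */
	if (*cp < min || *cp > 0x10FFFF || (*cp >= 0xD800 && *cp <= 0xDFFF))
		return 0;
	return need;
}

/* True if every code point in the UTF-8 string passes is_assigned(). */
bool
demo_all_assigned(const char *str, size_t len, demo_assigned_fn is_assigned)
{
	const unsigned char *s = (const unsigned char *) str;

	assert(is_assigned != NULL);

	while (len > 0)
	{
		uint32_t	cp;
		size_t		n = demo_utf8_decode(s, len, &cp);

		if (n == 0 || !is_assigned(cp))
			return false;
		s += n;
		len -= n;
	}
	return true;
}

/*
 * Toy predicate: only U+11D07 and U+11D0A (unassigned in the generated
 * table) are rejected.  A real check would consult the full table.
 */
bool
demo_toy_assigned(uint32_t cp)
{
	return cp != 0x11D07 && cp != 0x11D0A;
}
```

Passing the predicate as a function pointer keeps the decoding loop independent of which Unicode version's table is consulted.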

I'm not 100% clear on the consequences of differences between the PG
unicode version and the ICU unicode version, but because normalization
uses the Postgres version of Unicode, I believe the Postgres version of
Unicode should also be available to determine whether a code point is
assigned or not.

We may also find it interesting to use the PG Unicode tables for regex
character classification. This is just an idea and we can discuss
whether that makes sense or not, but having the primitives in place
seems like a good idea regardless.
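
As a rough illustration of what category-based character classification could look like, here is a minimal sketch. The enum and function names are hypothetical, and mapping "alpha" to the letter categories alone is a simplifying assumption; a real implementation might follow the broader Unicode Alphabetic property instead.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of general categories; not the patch's enum. */
typedef enum
{
	DEMO_UPPERCASE_LETTER,
	DEMO_LOWERCASE_LETTER,
	DEMO_TITLECASE_LETTER,
	DEMO_MODIFIER_LETTER,
	DEMO_OTHER_LETTER,
	DEMO_DECIMAL_DIGIT_NUMBER,
	DEMO_OTHER_SYMBOL,
	DEMO_UNASSIGNED
} demo_cat;

/* "alpha" here means any of the five letter categories (Lu, Ll, Lt, Lm, Lo). */
bool
demo_cat_is_alpha(demo_cat c)
{
	assert(c <= DEMO_UNASSIGNED);

	switch (c)
	{
		case DEMO_UPPERCASE_LETTER:
		case DEMO_LOWERCASE_LETTER:
		case DEMO_TITLECASE_LETTER:
		case DEMO_MODIFIER_LETTER:
		case DEMO_OTHER_LETTER:
			return true;
		default:
			return false;
	}
}

/* "digit" here means the decimal digit category (Nd) only. */
bool
demo_cat_is_digit(demo_cat c)
{
	assert(c <= DEMO_UNASSIGNED);

	return c == DEMO_DECIMAL_DIGIT_NUMBER;
}
```

The appeal of this approach is that classification results would depend only on the Unicode version Postgres was built with, not on the libc or ICU version installed on the host.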

Jeff can say what he wants it for, but one obvious application would
be to have the ability to add a CHECK constraint that forbids
inserting unassigned code points into your database, which would be
useful if you're worried about forward-compatibility with collation
definitions that might be extended to cover those code points in the
future. Another application would be to find data already in your
database that has this potential problem.

Exactly. Avoiding unassigned code points also allows you to be forward-
compatible with normalization.

Regards,
Jeff Davis

Attachments:

v2-0001-Additional-unicode-primitive-functions.patch (text/x-patch; charset=UTF-8)
From 9752d817ee77cf021dd733a4055929cba021b506 Mon Sep 17 00:00:00 2001
From: Jeff Davis <jeff@j-davis.com>
Date: Thu, 5 Oct 2023 17:01:03 -0700
Subject: [PATCH v2] Additional unicode primitive functions.

Introduce unicode_version(), icu_unicode_version(), and
unicode_is_valid().

The latter requires introducing a new lookup table, which is generated
along with the other lookup tables.

Discussion: https://postgr.es/m/CA+TgmoYzYR-yhU6k1XFCADeyj=Oyz2PkVsa3iKv+keM8wp-F_A@mail.gmail.com
---
 doc/src/sgml/func.sgml                        |  113 +-
 src/backend/utils/adt/varlena.c               |   61 +
 src/common/Makefile                           |    1 +
 src/common/meson.build                        |    1 +
 src/common/unicode/Makefile                   |   19 +-
 src/common/unicode/category_test.c            |  103 +
 .../generate-unicode_category_table.pl        |  202 +
 .../unicode/generate-unicode_version.pl       |   48 +
 src/common/unicode/meson.build                |   40 +
 src/common/unicode/norm_test.c                |    2 +-
 src/common/unicode_category.c                 |  197 +
 src/include/catalog/pg_proc.dat               |   12 +
 src/include/common/unicode_category.h         |   57 +
 src/include/common/unicode_category_table.h   | 4039 +++++++++++++++++
 src/include/common/unicode_version.h          |   16 +
 src/test/icu/t/010_database.pl                |    4 +
 src/test/regress/expected/unicode.out         |   18 +
 src/test/regress/sql/unicode.sql              |    4 +
 18 files changed, 4915 insertions(+), 22 deletions(-)
 create mode 100644 src/common/unicode/category_test.c
 create mode 100644 src/common/unicode/generate-unicode_category_table.pl
 create mode 100644 src/common/unicode/generate-unicode_version.pl
 create mode 100644 src/common/unicode_category.c
 create mode 100644 src/include/common/unicode_category.h
 create mode 100644 src/include/common/unicode_category_table.h
 create mode 100644 src/include/common/unicode_version.h

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index f1ad64c3d6..ad3d60d625 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -2859,6 +2859,26 @@ repeat('Pg', 4) <returnvalue>PgPgPgPg</returnvalue>
        </para></entry>
       </row>
 
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>unicode_is_valid</primary>
+        </indexterm>
+        <function>unicode_is_valid</function> ( <type>text</type> )
+        <returnvalue>boolean</returnvalue>
+       </para>
+       <para>
+        Returns <literal>true</literal> if all characters in the string are
+        valid and assigned Unicode codepoints; <literal>false</literal>
+        otherwise. This function can only be used when the server encoding is
+        <literal>UTF8</literal>.
+       </para>
+       <para>
+        <literal>unicode_is_valid('abc')</literal>
+        <returnvalue>t</returnvalue>
+       </para></entry>
+      </row>
+
       <row>
        <entry role="func_table_entry"><para role="func_signature">
         <indexterm>
@@ -23330,25 +23350,6 @@ SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS t(ls,n);
         This is equivalent to <function>current_user</function>.
        </para></entry>
       </row>
-
-      <row>
-       <entry role="func_table_entry"><para role="func_signature">
-        <indexterm>
-         <primary>version</primary>
-        </indexterm>
-        <function>version</function> ()
-        <returnvalue>text</returnvalue>
-       </para>
-       <para>
-        Returns a string describing the <productname>PostgreSQL</productname>
-        server's version.  You can also get this information from
-        <xref linkend="guc-server-version"/>, or for a machine-readable
-        version use <xref linkend="guc-server-version-num"/>.  Software
-        developers should use <varname>server_version_num</varname> (available
-        since 8.2) or <xref linkend="libpq-PQserverVersion"/> instead of
-        parsing the text version.
-       </para></entry>
-      </row>
      </tbody>
     </tgroup>
    </table>
@@ -26235,6 +26236,80 @@ SELECT collation for ('foo' COLLATE "de_DE");
 
   </sect2>
 
+  <sect2 id="functions-info-version">
+   <title>Version Information Functions</title>
+
+   <para>
+    The functions shown in <xref linkend="functions-version"/>
+    print version information.
+   </para>
+
+   <table id="functions-version">
+    <title>Version Information Functions</title>
+    <tgroup cols="1">
+     <thead>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        Function
+       </para>
+       <para>
+        Description
+       </para></entry>
+      </row>
+     </thead>
+
+     <tbody>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>version</primary>
+        </indexterm>
+        <function>version</function> ()
+        <returnvalue>text</returnvalue>
+       </para>
+       <para>
+        Returns a string describing the <productname>PostgreSQL</productname>
+        server's version.  You can also get this information from
+        <xref linkend="guc-server-version"/>, or for a machine-readable
+        version use <xref linkend="guc-server-version-num"/>.  Software
+        developers should use <varname>server_version_num</varname> (available
+        since 8.2) or <xref linkend="libpq-PQserverVersion"/> instead of
+        parsing the text version.
+       </para></entry>
+      </row>
+
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>unicode_version</primary>
+        </indexterm>
+        <function>unicode_version</function> ()
+        <returnvalue>text</returnvalue>
+       </para>
+       <para>
+        Returns a string representing the version of Unicode used by
+        <productname>PostgreSQL</productname>.
+       </para></entry>
+      </row>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>icu_unicode_version</primary>
+        </indexterm>
+        <function>icu_unicode_version</function> ()
+        <returnvalue>text</returnvalue>
+       </para>
+       <para>
+        Returns a string representing the version of Unicode used by ICU, if
+        the server was built with ICU support; otherwise returns
+        <literal>NULL</literal>.</para></entry>
+      </row>
+     </tbody>
+    </tgroup>
+   </table>
+
+  </sect2>
+
   </sect1>
 
   <sect1 id="functions-admin">
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index 72e1e24fe0..3d8856f6fb 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -23,7 +23,9 @@
 #include "catalog/pg_type.h"
 #include "common/hashfn.h"
 #include "common/int.h"
+#include "common/unicode_category.h"
 #include "common/unicode_norm.h"
+#include "common/unicode_version.h"
 #include "funcapi.h"
 #include "lib/hyperloglog.h"
 #include "libpq/pqformat.h"
@@ -6239,6 +6241,65 @@ unicode_norm_form_from_string(const char *formstr)
 	return form;
 }
 
+/*
+ * Returns version of Unicode used by Postgres in "major.minor" format. The
+ * third component ("update version") never involves additions to the
+ * character repertoire and is unimportant for most purposes.
+ *
+ * See: https://unicode.org/versions/
+ */
+Datum
+unicode_version(PG_FUNCTION_ARGS)
+{
+	PG_RETURN_TEXT_P(cstring_to_text(PG_UNICODE_VERSION));
+}
+
+/*
+ * Returns version of Unicode used by ICU, if enabled; otherwise NULL.
+ */
+Datum
+icu_unicode_version(PG_FUNCTION_ARGS)
+{
+#ifdef USE_ICU
+	PG_RETURN_TEXT_P(cstring_to_text(U_UNICODE_VERSION));
+#else
+	PG_RETURN_NULL();
+#endif
+}
+
+/*
+ * Check whether the string contains only assigned Unicode code
+ * points. Requires that the database encoding is UTF-8.
+ */
+Datum
+unicode_is_valid(PG_FUNCTION_ARGS)
+{
+	text			*input = PG_GETARG_TEXT_PP(0);
+	unsigned char	*p;
+	int				 size;
+
+	if (GetDatabaseEncoding() != PG_UTF8)
+		ereport(ERROR,
+				(errcode(ERRCODE_SYNTAX_ERROR),
+				 errmsg("Unicode categorization can only be performed if server encoding is UTF8")));
+
+	/* decode and check each code point in turn */
+	size = pg_mbstrlen_with_len(VARDATA_ANY(input), VARSIZE_ANY_EXHDR(input));
+	p = (unsigned char *) VARDATA_ANY(input);
+	for (int i = 0; i < size; i++)
+	{
+		pg_wchar	uchar	 = utf8_to_unicode(p);
+		int			category = unicode_category(uchar);
+
+		if (category == PG_U_UNASSIGNED)
+			PG_RETURN_BOOL(false);
+
+		p += pg_utf_mblen(p);
+	}
+
+	PG_RETURN_BOOL(true);
+}
+
 Datum
 unicode_normalize_func(PG_FUNCTION_ARGS)
 {
diff --git a/src/common/Makefile b/src/common/Makefile
index 70884be00c..8de31d4763 100644
--- a/src/common/Makefile
+++ b/src/common/Makefile
@@ -78,6 +78,7 @@ OBJS_COMMON = \
 	scram-common.o \
 	string.o \
 	stringinfo.o \
+	unicode_category.o \
 	unicode_norm.o \
 	username.o \
 	wait_error.o \
diff --git a/src/common/meson.build b/src/common/meson.build
index ae05ac63cf..8be145c0fb 100644
--- a/src/common/meson.build
+++ b/src/common/meson.build
@@ -30,6 +30,7 @@ common_sources = files(
   'scram-common.c',
   'string.c',
   'stringinfo.c',
+  'unicode_category.c',
   'unicode_norm.c',
   'username.c',
   'wait_error.c',
diff --git a/src/common/unicode/Makefile b/src/common/unicode/Makefile
index 382da476cf..27a7d5a807 100644
--- a/src/common/unicode/Makefile
+++ b/src/common/unicode/Makefile
@@ -15,11 +15,15 @@ include $(top_builddir)/src/Makefile.global
 override CPPFLAGS := -DFRONTEND -I. $(CPPFLAGS)
 LIBS += $(PTHREAD_LIBS)
 
+LDFLAGS_INTERNAL += $(ICU_LIBS)
+CPPFLAGS += $(ICU_CFLAGS)
+
 # By default, do nothing.
 all:
 
-update-unicode: unicode_norm_table.h unicode_nonspacing_table.h unicode_east_asian_fw_table.h unicode_normprops_table.h unicode_norm_hashfunc.h
+update-unicode: unicode_category_table.h unicode_norm_table.h unicode_nonspacing_table.h unicode_east_asian_fw_table.h unicode_normprops_table.h unicode_norm_hashfunc.h unicode_version.h
 	mv $^ $(top_srcdir)/src/include/common/
+	$(MAKE) category-check
 	$(MAKE) normalization-check
 
 # These files are part of the Unicode Character Database. Download
@@ -28,6 +32,12 @@ update-unicode: unicode_norm_table.h unicode_nonspacing_table.h unicode_east_asi
 UnicodeData.txt EastAsianWidth.txt DerivedNormalizationProps.txt CompositionExclusions.txt NormalizationTest.txt: $(top_builddir)/src/Makefile.global
 	$(DOWNLOAD) https://www.unicode.org/Public/$(UNICODE_VERSION)/ucd/$(@F)
 
+unicode_version.h: generate-unicode_version.pl
+	$(PERL) $< --version $(UNICODE_VERSION)
+
+unicode_category_table.h: generate-unicode_category_table.pl UnicodeData.txt
+	$(PERL) $<
+
 # Generation of conversion tables used for string normalization with
 # UTF-8 strings.
 unicode_norm_hashfunc.h: unicode_norm_table.h
@@ -45,9 +55,14 @@ unicode_normprops_table.h: generate-unicode_normprops_table.pl DerivedNormalizat
 	$(PERL) $^ >$@
 
 # Test suite
+category-check: category_test
+	./category_test
+
 normalization-check: norm_test
 	./norm_test
 
+category_test: category_test.o ../unicode_category.o | submake-common
+
 norm_test: norm_test.o ../unicode_norm.o | submake-common
 
 norm_test.o: norm_test_table.h
@@ -64,7 +79,7 @@ norm_test_table.h: generate-norm_test_table.pl NormalizationTest.txt
 
 
 clean:
-	rm -f $(OBJS) norm_test norm_test.o
+	rm -f $(OBJS) category_test category_test.o norm_test norm_test.o
 
 distclean: clean
 	rm -f UnicodeData.txt EastAsianWidth.txt CompositionExclusions.txt NormalizationTest.txt norm_test_table.h unicode_norm_table.h
diff --git a/src/common/unicode/category_test.c b/src/common/unicode/category_test.c
new file mode 100644
index 0000000000..2cbd4250f9
--- /dev/null
+++ b/src/common/unicode/category_test.c
@@ -0,0 +1,103 @@
+/*-------------------------------------------------------------------------
+ * category_test.c
+ *		Program to test Unicode general category functions.
+ *
+ * Portions Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  src/common/unicode/category_test.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+#ifdef USE_ICU
+#include <unicode/uchar.h>
+#endif
+#include "common/unicode_category.h"
+#include "common/unicode_version.h"
+
+/*
+ * Parse X.Y[.Z] into integer composed from X and Y.
+ */
+static int
+parse_unicode_version(const char *version)
+{
+	int n, major, minor;
+
+	n = sscanf(version, "%d.%d", &major, &minor);
+
+	Assert(n == 2);
+	Assert(major < 0xff && minor < 0xff);
+
+	return major * 100 + minor;
+}
+
+/*
+ * Exhaustively test that the Unicode category for each codepoint matches that
+ * returned by ICU.
+ */
+int
+main(int argc, char **argv)
+{
+#ifdef USE_ICU
+	int		pg_unicode_version = parse_unicode_version(PG_UNICODE_VERSION);
+	int		icu_unicode_version = parse_unicode_version(U_UNICODE_VERSION);
+	int		pg_skipped_codepoints  = 0;
+	int		icu_skipped_codepoints = 0;
+
+	printf("Postgres Unicode Version:\t%s\n", PG_UNICODE_VERSION);
+	printf("ICU Unicode Version:\t\t%s\n", U_UNICODE_VERSION);
+
+	for (UChar32 code = 0; code <= 0x10ffff; code++)
+	{
+		uint8_t pg_category = unicode_category(code);
+		uint8_t icu_category = u_charType(code);
+		if (pg_category != icu_category)
+		{
+			/*
+			 * A version mismatch means that some assigned codepoints in the
+			 * newer version may be unassigned in the older version. That's
+			 * OK, though the test will not cover those codepoints marked
+			 * unassigned in the older version (that is, it will no longer be
+			 * an exhaustive test).
+			 */
+			if (pg_category == PG_U_UNASSIGNED &&
+				pg_unicode_version < icu_unicode_version)
+				pg_skipped_codepoints++;
+			else if (icu_category == PG_U_UNASSIGNED &&
+					 icu_unicode_version < pg_unicode_version)
+				icu_skipped_codepoints++;
+			else
+			{
+				printf("FAILURE for codepoint %06x\n", code);
+				printf("Postgres category:	%02d %s %s\n", pg_category,
+					   unicode_category_short(pg_category),
+					   unicode_category_string(pg_category));
+				printf("ICU category:		%02d %s %s\n", icu_category,
+					   unicode_category_short(icu_category),
+					   unicode_category_string(icu_category));
+				printf("\n");
+				exit(1);
+			}
+		}
+	}
+
+	if (pg_skipped_codepoints > 0)
+		printf("Skipped %d codepoints unassigned in Postgres due to Unicode version mismatch.\n",
+			   pg_skipped_codepoints);
+	if (icu_skipped_codepoints > 0)
+		printf("Skipped %d codepoints unassigned in ICU due to Unicode version mismatch.\n",
+			   icu_skipped_codepoints);
+
+	printf("category_test: All tests successful!\n");
+	exit(0);
+#else
+	printf("ICU support required for test; skipping.\n");
+	exit(0);
+#endif
+}
diff --git a/src/common/unicode/generate-unicode_category_table.pl b/src/common/unicode/generate-unicode_category_table.pl
new file mode 100644
index 0000000000..bec34d591d
--- /dev/null
+++ b/src/common/unicode/generate-unicode_category_table.pl
@@ -0,0 +1,202 @@
+#!/usr/bin/perl
+#
+# Generate a code point category table and its lookup utilities, using
+# Unicode data files as input.
+#
+# Input: UnicodeData.txt
+# Output: unicode_category_table.h
+#
+# Copyright (c) 2000-2023, PostgreSQL Global Development Group
+
+use strict;
+use warnings;
+use Getopt::Long;
+
+use FindBin;
+use lib "$FindBin::RealBin/../../tools/";
+
+my $CATEGORY_UNASSIGNED = 'Cn';
+
+my $output_path = '.';
+
+GetOptions('outdir:s' => \$output_path);
+
+my $output_table_file = "$output_path/unicode_category_table.h";
+
+my $FH;
+
+# Read entries from UnicodeData.txt into a list of codepoint ranges
+# and their general category.
+my @category_ranges = ();
+my $range_start = undef;
+my $range_end = undef;
+my $range_category = undef;
+
+# If between a "<..., First>" entry and a "<..., Last>" entry, the gap in
+# codepoints represents a range, and $gap_category is equal to the
+# category for both (which must match). Otherwise, the gap represents
+# unassigned code points.
+my $gap_category = undef;
+
+open($FH, '<', "$output_path/UnicodeData.txt")
+  or die "Could not open $output_path/UnicodeData.txt: $!.";
+while (my $line = <$FH>)
+{
+	my @elts = split(';', $line);
+	my $code = hex($elts[0]);
+	my $name = $elts[1];
+	my $category = $elts[2];
+
+	die "codepoint out of range" if $code > 0x10FFFF;
+	die "unassigned codepoint in UnicodeData.txt" if $category eq $CATEGORY_UNASSIGNED;
+
+	if (!defined($range_start)) {
+		my $code_str = sprintf "0x%06x", $code;
+		die if defined($range_end) || defined($range_category) || defined($gap_category);
+		die "unexpected first entry <..., Last>" if ($name =~ /Last>/);
+		die "expected 0x000000 for first entry, got $code_str" if $code != 0x000000;
+
+		# initialize
+		$range_start = $code;
+		$range_end = $code;
+		$range_category = $category;
+		if ($name =~ /<.*, First>$/) {
+			$gap_category = $category;
+		} else {
+			$gap_category = $CATEGORY_UNASSIGNED;
+		}
+		next;
+	}
+
+	# Gap in codepoints detected. If it's a different category than
+	# the current range, emit the current range and initialize a new
+	# range representing the gap.
+	if ($range_end + 1 != $code && $range_category ne $gap_category) {
+		push(@category_ranges, {start => $range_start, end => $range_end, category => $range_category});
+		$range_start = $range_end + 1;
+		$range_end = $code - 1;
+		$range_category = $gap_category;
+	}
+
+	# different category; new range
+	if ($range_category ne $category) {
+		push(@category_ranges, {start => $range_start, end => $range_end, category => $range_category});
+		$range_start = $code;
+		$range_end = $code;
+		$range_category = $category;
+	}
+
+	if ($name =~ /<.*, First>$/) {
+		die "<..., First> entry unexpectedly follows <..., Last> entry"
+		  if $gap_category ne $CATEGORY_UNASSIGNED;
+		$gap_category = $category;
+	}
+	elsif ($name =~ /<.*, Last>$/) {
+		die "<..., First> and <..., Last> entries have mismatching general category"
+		  if $gap_category ne $category;
+		$gap_category = $CATEGORY_UNASSIGNED;
+	}
+	else {
+		die "unexpected entry found between <..., First> and <..., Last>"
+		  if $gap_category ne $CATEGORY_UNASSIGNED;
+	}
+
+	$range_end = $code;
+}
+close $FH;
+
+die "<..., First> entry with no corresponding <..., Last> entry"
+  if $gap_category ne $CATEGORY_UNASSIGNED;
+
+# emit final range
+push(@category_ranges, {start => $range_start, end => $range_end, category => $range_category});
+
+# emit range for any unassigned code points after last entry
+if ($range_end < 0x10FFFF) {
+	$range_start = $range_end + 1;
+	$range_end = 0x10FFFF;
+	$range_category = $CATEGORY_UNASSIGNED;
+	push(@category_ranges, {start => $range_start, end => $range_end, category => $range_category});
+}
+
+my $num_ranges = scalar @category_ranges;
+
+# See: https://www.unicode.org/reports/tr44/#General_Category_Values
+my $categories = {
+	Cn => 'PG_U_UNASSIGNED',
+	Lu => 'PG_U_UPPERCASE_LETTER',
+	Ll => 'PG_U_LOWERCASE_LETTER',
+	Lt => 'PG_U_TITLECASE_LETTER',
+	Lm => 'PG_U_MODIFIER_LETTER',
+	Lo => 'PG_U_OTHER_LETTER',
+	Mn => 'PG_U_NON_SPACING_MARK',
+	Me => 'PG_U_ENCLOSING_MARK',
+	Mc => 'PG_U_COMBINING_SPACING_MARK',
+	Nd => 'PG_U_DECIMAL_DIGIT_NUMBER',
+	Nl => 'PG_U_LETTER_NUMBER',
+	No => 'PG_U_OTHER_NUMBER',
+	Zs => 'PG_U_SPACE_SEPARATOR',
+	Zl => 'PG_U_LINE_SEPARATOR',
+	Zp => 'PG_U_PARAGRAPH_SEPARATOR',
+	Cc => 'PG_U_CONTROL_CHAR',
+	Cf => 'PG_U_FORMAT_CHAR',
+	Co => 'PG_U_PRIVATE_USE_CHAR',
+	Cs => 'PG_U_SURROGATE',
+	Pd => 'PG_U_DASH_PUNCTUATION',
+	Ps => 'PG_U_START_PUNCTUATION',
+	Pe => 'PG_U_END_PUNCTUATION',
+	Pc => 'PG_U_CONNECTOR_PUNCTUATION',
+	Po => 'PG_U_OTHER_PUNCTUATION',
+	Sm => 'PG_U_MATH_SYMBOL',
+	Sc => 'PG_U_CURRENCY_SYMBOL',
+	Sk => 'PG_U_MODIFIER_SYMBOL',
+	So => 'PG_U_OTHER_SYMBOL',
+	Pi => 'PG_U_INITIAL_PUNCTUATION',
+	Pf => 'PG_U_FINAL_PUNCTUATION'
+};
+
+# Start writing out the output files
+open my $OT, '>', $output_table_file
+  or die "Could not open output file $output_table_file: $!\n";
+
+print $OT <<HEADER;
+/*-------------------------------------------------------------------------
+ *
+ * unicode_category_table.h
+ *	  Category table for Unicode character classification.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/common/unicode_category_table.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+/*
+ * File auto-generated by src/common/unicode/generate-unicode_category_table.pl,
+ * do not edit. There is deliberately not an #ifndef PG_UNICODE_CATEGORY_TABLE_H
+ * here.
+ */
+typedef struct
+{
+	uint32		first;	/* Unicode codepoint */
+	uint32		last;		/* Unicode codepoint */
+	uint8		category;		/* General Category of the range */
+} pg_category_range;
+
+/* table of Unicode codepoint ranges and their categories */
+static const pg_category_range unicode_categories[$num_ranges] =
+{
+HEADER
+
+my $firsttime = 1;
+foreach my $range (@category_ranges) {
+	printf $OT ",\n" unless $firsttime;
+	$firsttime = 0;
+
+	my $category = $categories->{$range->{category}};
+	die "category missing: $range->{category}" unless $category;
+	printf $OT "\t{0x%06x, 0x%06x, %s}", $range->{start}, $range->{end}, $category;
+}
+print $OT "\n};\n\n";
diff --git a/src/common/unicode/generate-unicode_version.pl b/src/common/unicode/generate-unicode_version.pl
new file mode 100644
index 0000000000..4dd400e32d
--- /dev/null
+++ b/src/common/unicode/generate-unicode_version.pl
@@ -0,0 +1,48 @@
+#!/usr/bin/perl
+#
+# Generate header file with Unicode version used by Postgres.
+#
+# Output: unicode_version.h
+#
+# Copyright (c) 2000-2023, PostgreSQL Global Development Group
+
+use strict;
+use warnings;
+use Getopt::Long;
+
+use FindBin;
+use lib "$FindBin::RealBin/../../tools/";
+
+my $output_path = '.';
+my $version_str = undef;
+
+GetOptions('outdir:s' => \$output_path, 'version:s' => \$version_str);
+
+my @version_parts = split /\./, $version_str;
+
+my $unicode_version_str = sprintf "%d.%d", $version_parts[0], $version_parts[1];
+
+my $output_file = "$output_path/unicode_version.h";
+
+# Start writing out the output files
+open my $OT, '>', $output_file
+  or die "Could not open output file $output_file: $!\n";
+
+print $OT <<HEADER;
+/*-------------------------------------------------------------------------
+ *
+ * unicode_version.h
+ *	  Unicode version used by Postgres.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/common/unicode_version.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#define PG_UNICODE_VERSION		"$unicode_version_str"
+
+
+HEADER
diff --git a/src/common/unicode/meson.build b/src/common/unicode/meson.build
index 357ca2f9fb..6af46122c4 100644
--- a/src/common/unicode/meson.build
+++ b/src/common/unicode/meson.build
@@ -24,6 +24,25 @@ endforeach
 
 update_unicode_targets = []
 
+update_unicode_targets += \
+  custom_target('unicode_version.h',
+    output: ['unicode_version.h'],
+    command: [
+      perl, files('generate-unicode_version.pl'),
+      '--outdir', '@OUTDIR@', '--version', UNICODE_VERSION],
+    build_by_default: false,
+  )
+
+update_unicode_targets += \
+  custom_target('unicode_category_table.h',
+    input: [unicode_data['UnicodeData.txt']],
+    output: ['unicode_category_table.h'],
+    command: [
+      perl, files('generate-unicode_category_table.pl'),
+      '--outdir', '@OUTDIR@', '@INPUT@'],
+    build_by_default: false,
+  )
+
 update_unicode_targets += \
   custom_target('unicode_norm_table.h',
     input: [unicode_data['UnicodeData.txt'], unicode_data['CompositionExclusions.txt']],
@@ -73,6 +92,17 @@ norm_test_table = custom_target('norm_test_table.h',
 
 inc = include_directories('.')
 
+category_test = executable('category_test',
+  ['category_test.c'],
+  dependencies: [frontend_port_code, icu],
+  include_directories: inc,
+  link_with: [common_static, pgport_static],
+  build_by_default: false,
+  kwargs: default_bin_args + {
+    'install': false,
+  }
+)
+
 norm_test = executable('norm_test',
   ['norm_test.c', norm_test_table],
   dependencies: [frontend_port_code],
@@ -86,6 +116,16 @@ norm_test = executable('norm_test',
 
 update_unicode_dep = []
 
+if not meson.is_cross_build()
+  update_unicode_dep += custom_target('category_test.run',
+    output: 'category_test.run',
+    input: update_unicode_targets,
+    command: [category_test, UNICODE_VERSION],
+    build_by_default: false,
+    build_always_stale: true,
+  )
+endif
+
 if not meson.is_cross_build()
   update_unicode_dep += custom_target('norm_test.run',
     output: 'norm_test.run',
diff --git a/src/common/unicode/norm_test.c b/src/common/unicode/norm_test.c
index 809a6dee54..b6097b912a 100644
--- a/src/common/unicode/norm_test.c
+++ b/src/common/unicode/norm_test.c
@@ -81,6 +81,6 @@ main(int argc, char **argv)
 		}
 	}
 
-	printf("All tests successful!\n");
+	printf("norm_test: All tests successful!\n");
 	exit(0);
 }
diff --git a/src/common/unicode_category.c b/src/common/unicode_category.c
new file mode 100644
index 0000000000..b8b8ee8a4e
--- /dev/null
+++ b/src/common/unicode_category.c
@@ -0,0 +1,197 @@
+/*-------------------------------------------------------------------------
+ * unicode_category.c
+ *		Determine general category of Unicode characters.
+ *
+ * Portions Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  src/common/unicode_category.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef FRONTEND
+#include "postgres.h"
+#else
+#include "postgres_fe.h"
+#endif
+
+#include "common/unicode_category.h"
+#include "common/unicode_category_table.h"
+
+/*
+ * Unicode general category for the given codepoint.
+ */
+pg_unicode_category
+unicode_category(pg_wchar ucs)
+{
+	int	min = 0;
+	int	mid;
+	int max = (sizeof(unicode_categories) / sizeof(pg_category_range)) - 1;
+
+	Assert (ucs >= unicode_categories[0].first &&
+			ucs <= unicode_categories[max].last);
+
+	while (max >= min)
+	{
+		mid = (min + max) / 2;
+		if (ucs > unicode_categories[mid].last)
+			min = mid + 1;
+		else if (ucs < unicode_categories[mid].first)
+			max = mid - 1;
+		else
+			return unicode_categories[mid].category;
+	}
+
+	Assert(false);
+	return (pg_unicode_category) -1;
+}
+
+/*
+ * Description of Unicode general category.
+ *
+ * See: https://www.unicode.org/reports/tr44/#General_Category_Values
+ */
+const char *
+unicode_category_string(pg_unicode_category category)
+{
+	switch (category)
+	{
+		case PG_U_UNASSIGNED:
+			return "Unassigned";
+		case PG_U_UPPERCASE_LETTER:
+			return "Uppercase_Letter";
+		case PG_U_LOWERCASE_LETTER:
+			return "Lowercase_Letter";
+		case PG_U_TITLECASE_LETTER:
+			return "Titlecase_Letter";
+		case PG_U_MODIFIER_LETTER:
+			return "Modifier_Letter";
+		case PG_U_OTHER_LETTER:
+			return "Other_Letter";
+		case PG_U_NON_SPACING_MARK:
+			return "Nonspacing_Mark";
+		case PG_U_ENCLOSING_MARK:
+			return "Enclosing_Mark";
+		case PG_U_COMBINING_SPACING_MARK:
+			return "Spacing_Mark";
+		case PG_U_DECIMAL_DIGIT_NUMBER:
+			return "Decimal_Number";
+		case PG_U_LETTER_NUMBER:
+			return "Letter_Number";
+		case PG_U_OTHER_NUMBER:
+			return "Other_Number";
+		case PG_U_SPACE_SEPARATOR:
+			return "Space_Separator";
+		case PG_U_LINE_SEPARATOR:
+			return "Line_Separator";
+		case PG_U_PARAGRAPH_SEPARATOR:
+			return "Paragraph_Separator";
+		case PG_U_CONTROL_CHAR:
+			return "Control";
+		case PG_U_FORMAT_CHAR:
+			return "Format";
+		case PG_U_PRIVATE_USE_CHAR:
+			return "Private_Use";
+		case PG_U_SURROGATE:
+			return "Surrogate";
+		case PG_U_DASH_PUNCTUATION:
+			return "Dash_Punctuation";
+		case PG_U_START_PUNCTUATION:
+			return "Open_Punctuation";
+		case PG_U_END_PUNCTUATION:
+			return "Close_Punctuation";
+		case PG_U_CONNECTOR_PUNCTUATION:
+			return "Connector_Punctuation";
+		case PG_U_OTHER_PUNCTUATION:
+			return "Other_Punctuation";
+		case PG_U_MATH_SYMBOL:
+			return "Math_Symbol";
+		case PG_U_CURRENCY_SYMBOL:
+			return "Currency_Symbol";
+		case PG_U_MODIFIER_SYMBOL:
+			return "Modifier_Symbol";
+		case PG_U_OTHER_SYMBOL:
+			return "Other_Symbol";
+		case PG_U_INITIAL_PUNCTUATION:
+			return "Initial_Punctuation";
+		case PG_U_FINAL_PUNCTUATION:
+			return "Final_Punctuation";
+		default:
+			return "Unrecognized";
+	}
+}
+
+/*
+ * Short code for Unicode general category.
+ *
+ * See: https://www.unicode.org/reports/tr44/#General_Category_Values
+ */
+const char *
+unicode_category_short(pg_unicode_category category)
+{
+	switch (category)
+	{
+		case PG_U_UNASSIGNED:
+			return "Cn";
+		case PG_U_UPPERCASE_LETTER:
+			return "Lu";
+		case PG_U_LOWERCASE_LETTER:
+			return "Ll";
+		case PG_U_TITLECASE_LETTER:
+			return "Lt";
+		case PG_U_MODIFIER_LETTER:
+			return "Lm";
+		case PG_U_OTHER_LETTER:
+			return "Lo";
+		case PG_U_NON_SPACING_MARK:
+			return "Mn";
+		case PG_U_ENCLOSING_MARK:
+			return "Me";
+		case PG_U_COMBINING_SPACING_MARK:
+			return "Mc";
+		case PG_U_DECIMAL_DIGIT_NUMBER:
+			return "Nd";
+		case PG_U_LETTER_NUMBER:
+			return "Nl";
+		case PG_U_OTHER_NUMBER:
+			return "No";
+		case PG_U_SPACE_SEPARATOR:
+			return "Zs";
+		case PG_U_LINE_SEPARATOR:
+			return "Zl";
+		case PG_U_PARAGRAPH_SEPARATOR:
+			return "Zp";
+		case PG_U_CONTROL_CHAR:
+			return "Cc";
+		case PG_U_FORMAT_CHAR:
+			return "Cf";
+		case PG_U_PRIVATE_USE_CHAR:
+			return "Co";
+		case PG_U_SURROGATE:
+			return "Cs";
+		case PG_U_DASH_PUNCTUATION:
+			return "Pd";
+		case PG_U_START_PUNCTUATION:
+			return "Ps";
+		case PG_U_END_PUNCTUATION:
+			return "Pe";
+		case PG_U_CONNECTOR_PUNCTUATION:
+			return "Pc";
+		case PG_U_OTHER_PUNCTUATION:
+			return "Po";
+		case PG_U_MATH_SYMBOL:
+			return "Sm";
+		case PG_U_CURRENCY_SYMBOL:
+			return "Sc";
+		case PG_U_MODIFIER_SYMBOL:
+			return "Sk";
+		case PG_U_OTHER_SYMBOL:
+			return "So";
+		case PG_U_INITIAL_PUNCTUATION:
+			return "Pi";
+		case PG_U_FINAL_PUNCTUATION:
+			return "Pf";
+		default:
+			return "??";
+	}
+}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index f0b7b9cbd8..2118b1e258 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12010,6 +12010,18 @@
   proname => 'pg_partition_root', prorettype => 'regclass',
   proargtypes => 'regclass', prosrc => 'pg_partition_root' },
 
+{ oid => '4549', descr => 'Unicode version used by Postgres',
+  proname => 'unicode_version', prorettype => 'text', proargtypes => '',
+  prosrc => 'unicode_version' },
+
+{ oid => '6099', descr => 'Unicode version used by ICU, if enabled',
+  proname => 'icu_unicode_version', prorettype => 'text', proargtypes => '',
+  prosrc => 'icu_unicode_version' },
+
+{ oid => '6105', descr => 'check valid Unicode',
+  proname => 'unicode_is_valid', prorettype => 'bool', proargtypes => 'text',
+  prosrc => 'unicode_is_valid' },
+
 { oid => '4350', descr => 'Unicode normalization',
   proname => 'normalize', prorettype => 'text', proargtypes => 'text text',
   prosrc => 'unicode_normalize_func' },
diff --git a/src/include/common/unicode_category.h b/src/include/common/unicode_category.h
new file mode 100644
index 0000000000..e4301be726
--- /dev/null
+++ b/src/include/common/unicode_category.h
@@ -0,0 +1,57 @@
+/*-------------------------------------------------------------------------
+ *
+ * unicode_category.h
+ *	  Routines for determining the category of Unicode characters.
+ *
+ * These definitions can be used by both frontend and backend code.
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * src/include/common/unicode_category.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef UNICODE_CATEGORY_H
+#define UNICODE_CATEGORY_H
+
+#include "mb/pg_wchar.h"
+
+/* matches corresponding numeric values of UCharCategory, defined by ICU */
+typedef enum pg_unicode_category {
+	PG_U_UNASSIGNED = 0,
+	PG_U_UPPERCASE_LETTER = 1,
+	PG_U_LOWERCASE_LETTER = 2,
+	PG_U_TITLECASE_LETTER = 3,
+	PG_U_MODIFIER_LETTER = 4,
+	PG_U_OTHER_LETTER = 5,
+	PG_U_NON_SPACING_MARK = 6,
+	PG_U_ENCLOSING_MARK = 7,
+	PG_U_COMBINING_SPACING_MARK = 8,
+	PG_U_DECIMAL_DIGIT_NUMBER = 9,
+	PG_U_LETTER_NUMBER = 10,
+	PG_U_OTHER_NUMBER = 11,
+	PG_U_SPACE_SEPARATOR = 12,
+	PG_U_LINE_SEPARATOR = 13,
+	PG_U_PARAGRAPH_SEPARATOR = 14,
+	PG_U_CONTROL_CHAR = 15,
+	PG_U_FORMAT_CHAR = 16,
+	PG_U_PRIVATE_USE_CHAR = 17,
+	PG_U_SURROGATE = 18,
+	PG_U_DASH_PUNCTUATION = 19,
+	PG_U_START_PUNCTUATION = 20,
+	PG_U_END_PUNCTUATION = 21,
+	PG_U_CONNECTOR_PUNCTUATION = 22,
+	PG_U_OTHER_PUNCTUATION = 23,
+	PG_U_MATH_SYMBOL = 24,
+	PG_U_CURRENCY_SYMBOL = 25,
+	PG_U_MODIFIER_SYMBOL = 26,
+	PG_U_OTHER_SYMBOL = 27,
+	PG_U_INITIAL_PUNCTUATION = 28,
+	PG_U_FINAL_PUNCTUATION = 29
+} pg_unicode_category;
+
+extern pg_unicode_category unicode_category(pg_wchar ucs);
+extern const char *unicode_category_string(pg_unicode_category category);
+extern const char *unicode_category_short(pg_unicode_category category);
+
+#endif							/* UNICODE_CATEGORY_H */
diff --git a/src/include/common/unicode_category_table.h b/src/include/common/unicode_category_table.h
new file mode 100644
index 0000000000..3125cbdbf5
--- /dev/null
+++ b/src/include/common/unicode_category_table.h
@@ -0,0 +1,4039 @@
+/*-------------------------------------------------------------------------
+ *
+ * unicode_category_table.h
+ *	  Category table for Unicode character classification.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/common/unicode_category_table.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+/*
+ * File auto-generated by src/common/unicode/generate-unicode_category_table.pl,
+ * do not edit. There is deliberately not an #ifndef PG_UNICODE_CATEGORY_TABLE_H
+ * here.
+ */
+typedef struct
+{
+	uint32		first;	/* Unicode codepoint */
+	uint32		last;		/* Unicode codepoint */
+	uint8		category;		/* General Category of the range */
+} pg_category_range;
+
+/* table of Unicode codepoint ranges and their categories */
+static const pg_category_range unicode_categories[4009] =
+{
+	{0x000000, 0x00001f, PG_U_CONTROL_CHAR},
+	{0x000020, 0x000020, PG_U_SPACE_SEPARATOR},
+	{0x000021, 0x000023, PG_U_OTHER_PUNCTUATION},
+	{0x000024, 0x000024, PG_U_CURRENCY_SYMBOL},
+	{0x000025, 0x000027, PG_U_OTHER_PUNCTUATION},
+	{0x000028, 0x000028, PG_U_START_PUNCTUATION},
+	{0x000029, 0x000029, PG_U_END_PUNCTUATION},
+	{0x00002a, 0x00002a, PG_U_OTHER_PUNCTUATION},
+	{0x00002b, 0x00002b, PG_U_MATH_SYMBOL},
+	{0x00002c, 0x00002c, PG_U_OTHER_PUNCTUATION},
+	{0x00002d, 0x00002d, PG_U_DASH_PUNCTUATION},
+	{0x00002e, 0x00002f, PG_U_OTHER_PUNCTUATION},
+	{0x000030, 0x000039, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00003a, 0x00003b, PG_U_OTHER_PUNCTUATION},
+	{0x00003c, 0x00003e, PG_U_MATH_SYMBOL},
+	{0x00003f, 0x000040, PG_U_OTHER_PUNCTUATION},
+	{0x000041, 0x00005a, PG_U_UPPERCASE_LETTER},
+	{0x00005b, 0x00005b, PG_U_START_PUNCTUATION},
+	{0x00005c, 0x00005c, PG_U_OTHER_PUNCTUATION},
+	{0x00005d, 0x00005d, PG_U_END_PUNCTUATION},
+	{0x00005e, 0x00005e, PG_U_MODIFIER_SYMBOL},
+	{0x00005f, 0x00005f, PG_U_CONNECTOR_PUNCTUATION},
+	{0x000060, 0x000060, PG_U_MODIFIER_SYMBOL},
+	{0x000061, 0x00007a, PG_U_LOWERCASE_LETTER},
+	{0x00007b, 0x00007b, PG_U_START_PUNCTUATION},
+	{0x00007c, 0x00007c, PG_U_MATH_SYMBOL},
+	{0x00007d, 0x00007d, PG_U_END_PUNCTUATION},
+	{0x00007e, 0x00007e, PG_U_MATH_SYMBOL},
+	{0x00007f, 0x00009f, PG_U_CONTROL_CHAR},
+	{0x0000a0, 0x0000a0, PG_U_SPACE_SEPARATOR},
+	{0x0000a1, 0x0000a1, PG_U_OTHER_PUNCTUATION},
+	{0x0000a2, 0x0000a5, PG_U_CURRENCY_SYMBOL},
+	{0x0000a6, 0x0000a6, PG_U_OTHER_SYMBOL},
+	{0x0000a7, 0x0000a7, PG_U_OTHER_PUNCTUATION},
+	{0x0000a8, 0x0000a8, PG_U_MODIFIER_SYMBOL},
+	{0x0000a9, 0x0000a9, PG_U_OTHER_SYMBOL},
+	{0x0000aa, 0x0000aa, PG_U_OTHER_LETTER},
+	{0x0000ab, 0x0000ab, PG_U_INITIAL_PUNCTUATION},
+	{0x0000ac, 0x0000ac, PG_U_MATH_SYMBOL},
+	{0x0000ad, 0x0000ad, PG_U_FORMAT_CHAR},
+	{0x0000ae, 0x0000ae, PG_U_OTHER_SYMBOL},
+	{0x0000af, 0x0000af, PG_U_MODIFIER_SYMBOL},
+	{0x0000b0, 0x0000b0, PG_U_OTHER_SYMBOL},
+	{0x0000b1, 0x0000b1, PG_U_MATH_SYMBOL},
+	{0x0000b2, 0x0000b3, PG_U_OTHER_NUMBER},
+	{0x0000b4, 0x0000b4, PG_U_MODIFIER_SYMBOL},
+	{0x0000b5, 0x0000b5, PG_U_LOWERCASE_LETTER},
+	{0x0000b6, 0x0000b7, PG_U_OTHER_PUNCTUATION},
+	{0x0000b8, 0x0000b8, PG_U_MODIFIER_SYMBOL},
+	{0x0000b9, 0x0000b9, PG_U_OTHER_NUMBER},
+	{0x0000ba, 0x0000ba, PG_U_OTHER_LETTER},
+	{0x0000bb, 0x0000bb, PG_U_FINAL_PUNCTUATION},
+	{0x0000bc, 0x0000be, PG_U_OTHER_NUMBER},
+	{0x0000bf, 0x0000bf, PG_U_OTHER_PUNCTUATION},
+	{0x0000c0, 0x0000d6, PG_U_UPPERCASE_LETTER},
+	{0x0000d7, 0x0000d7, PG_U_MATH_SYMBOL},
+	{0x0000d8, 0x0000de, PG_U_UPPERCASE_LETTER},
+	{0x0000df, 0x0000f6, PG_U_LOWERCASE_LETTER},
+	{0x0000f7, 0x0000f7, PG_U_MATH_SYMBOL},
+	{0x0000f8, 0x0000ff, PG_U_LOWERCASE_LETTER},
+	{0x000100, 0x000100, PG_U_UPPERCASE_LETTER},
+	{0x000101, 0x000101, PG_U_LOWERCASE_LETTER},
+	{0x000102, 0x000102, PG_U_UPPERCASE_LETTER},
+	{0x000103, 0x000103, PG_U_LOWERCASE_LETTER},
+	{0x000104, 0x000104, PG_U_UPPERCASE_LETTER},
+	{0x000105, 0x000105, PG_U_LOWERCASE_LETTER},
+	{0x000106, 0x000106, PG_U_UPPERCASE_LETTER},
+	{0x000107, 0x000107, PG_U_LOWERCASE_LETTER},
+	{0x000108, 0x000108, PG_U_UPPERCASE_LETTER},
+	{0x000109, 0x000109, PG_U_LOWERCASE_LETTER},
+	{0x00010a, 0x00010a, PG_U_UPPERCASE_LETTER},
+	{0x00010b, 0x00010b, PG_U_LOWERCASE_LETTER},
+	{0x00010c, 0x00010c, PG_U_UPPERCASE_LETTER},
+	{0x00010d, 0x00010d, PG_U_LOWERCASE_LETTER},
+	{0x00010e, 0x00010e, PG_U_UPPERCASE_LETTER},
+	{0x00010f, 0x00010f, PG_U_LOWERCASE_LETTER},
+	{0x000110, 0x000110, PG_U_UPPERCASE_LETTER},
+	{0x000111, 0x000111, PG_U_LOWERCASE_LETTER},
+	{0x000112, 0x000112, PG_U_UPPERCASE_LETTER},
+	{0x000113, 0x000113, PG_U_LOWERCASE_LETTER},
+	{0x000114, 0x000114, PG_U_UPPERCASE_LETTER},
+	{0x000115, 0x000115, PG_U_LOWERCASE_LETTER},
+	{0x000116, 0x000116, PG_U_UPPERCASE_LETTER},
+	{0x000117, 0x000117, PG_U_LOWERCASE_LETTER},
+	{0x000118, 0x000118, PG_U_UPPERCASE_LETTER},
+	{0x000119, 0x000119, PG_U_LOWERCASE_LETTER},
+	{0x00011a, 0x00011a, PG_U_UPPERCASE_LETTER},
+	{0x00011b, 0x00011b, PG_U_LOWERCASE_LETTER},
+	{0x00011c, 0x00011c, PG_U_UPPERCASE_LETTER},
+	{0x00011d, 0x00011d, PG_U_LOWERCASE_LETTER},
+	{0x00011e, 0x00011e, PG_U_UPPERCASE_LETTER},
+	{0x00011f, 0x00011f, PG_U_LOWERCASE_LETTER},
+	{0x000120, 0x000120, PG_U_UPPERCASE_LETTER},
+	{0x000121, 0x000121, PG_U_LOWERCASE_LETTER},
+	{0x000122, 0x000122, PG_U_UPPERCASE_LETTER},
+	{0x000123, 0x000123, PG_U_LOWERCASE_LETTER},
+	{0x000124, 0x000124, PG_U_UPPERCASE_LETTER},
+	{0x000125, 0x000125, PG_U_LOWERCASE_LETTER},
+	{0x000126, 0x000126, PG_U_UPPERCASE_LETTER},
+	{0x000127, 0x000127, PG_U_LOWERCASE_LETTER},
+	{0x000128, 0x000128, PG_U_UPPERCASE_LETTER},
+	{0x000129, 0x000129, PG_U_LOWERCASE_LETTER},
+	{0x00012a, 0x00012a, PG_U_UPPERCASE_LETTER},
+	{0x00012b, 0x00012b, PG_U_LOWERCASE_LETTER},
+	{0x00012c, 0x00012c, PG_U_UPPERCASE_LETTER},
+	{0x00012d, 0x00012d, PG_U_LOWERCASE_LETTER},
+	{0x00012e, 0x00012e, PG_U_UPPERCASE_LETTER},
+	{0x00012f, 0x00012f, PG_U_LOWERCASE_LETTER},
+	{0x000130, 0x000130, PG_U_UPPERCASE_LETTER},
+	{0x000131, 0x000131, PG_U_LOWERCASE_LETTER},
+	{0x000132, 0x000132, PG_U_UPPERCASE_LETTER},
+	{0x000133, 0x000133, PG_U_LOWERCASE_LETTER},
+	{0x000134, 0x000134, PG_U_UPPERCASE_LETTER},
+	{0x000135, 0x000135, PG_U_LOWERCASE_LETTER},
+	{0x000136, 0x000136, PG_U_UPPERCASE_LETTER},
+	{0x000137, 0x000138, PG_U_LOWERCASE_LETTER},
+	{0x000139, 0x000139, PG_U_UPPERCASE_LETTER},
+	{0x00013a, 0x00013a, PG_U_LOWERCASE_LETTER},
+	{0x00013b, 0x00013b, PG_U_UPPERCASE_LETTER},
+	{0x00013c, 0x00013c, PG_U_LOWERCASE_LETTER},
+	{0x00013d, 0x00013d, PG_U_UPPERCASE_LETTER},
+	{0x00013e, 0x00013e, PG_U_LOWERCASE_LETTER},
+	{0x00013f, 0x00013f, PG_U_UPPERCASE_LETTER},
+	{0x000140, 0x000140, PG_U_LOWERCASE_LETTER},
+	{0x000141, 0x000141, PG_U_UPPERCASE_LETTER},
+	{0x000142, 0x000142, PG_U_LOWERCASE_LETTER},
+	{0x000143, 0x000143, PG_U_UPPERCASE_LETTER},
+	{0x000144, 0x000144, PG_U_LOWERCASE_LETTER},
+	{0x000145, 0x000145, PG_U_UPPERCASE_LETTER},
+	{0x000146, 0x000146, PG_U_LOWERCASE_LETTER},
+	{0x000147, 0x000147, PG_U_UPPERCASE_LETTER},
+	{0x000148, 0x000149, PG_U_LOWERCASE_LETTER},
+	{0x00014a, 0x00014a, PG_U_UPPERCASE_LETTER},
+	{0x00014b, 0x00014b, PG_U_LOWERCASE_LETTER},
+	{0x00014c, 0x00014c, PG_U_UPPERCASE_LETTER},
+	{0x00014d, 0x00014d, PG_U_LOWERCASE_LETTER},
+	{0x00014e, 0x00014e, PG_U_UPPERCASE_LETTER},
+	{0x00014f, 0x00014f, PG_U_LOWERCASE_LETTER},
+	{0x000150, 0x000150, PG_U_UPPERCASE_LETTER},
+	{0x000151, 0x000151, PG_U_LOWERCASE_LETTER},
+	{0x000152, 0x000152, PG_U_UPPERCASE_LETTER},
+	{0x000153, 0x000153, PG_U_LOWERCASE_LETTER},
+	{0x000154, 0x000154, PG_U_UPPERCASE_LETTER},
+	{0x000155, 0x000155, PG_U_LOWERCASE_LETTER},
+	{0x000156, 0x000156, PG_U_UPPERCASE_LETTER},
+	{0x000157, 0x000157, PG_U_LOWERCASE_LETTER},
+	{0x000158, 0x000158, PG_U_UPPERCASE_LETTER},
+	{0x000159, 0x000159, PG_U_LOWERCASE_LETTER},
+	{0x00015a, 0x00015a, PG_U_UPPERCASE_LETTER},
+	{0x00015b, 0x00015b, PG_U_LOWERCASE_LETTER},
+	{0x00015c, 0x00015c, PG_U_UPPERCASE_LETTER},
+	{0x00015d, 0x00015d, PG_U_LOWERCASE_LETTER},
+	{0x00015e, 0x00015e, PG_U_UPPERCASE_LETTER},
+	{0x00015f, 0x00015f, PG_U_LOWERCASE_LETTER},
+	{0x000160, 0x000160, PG_U_UPPERCASE_LETTER},
+	{0x000161, 0x000161, PG_U_LOWERCASE_LETTER},
+	{0x000162, 0x000162, PG_U_UPPERCASE_LETTER},
+	{0x000163, 0x000163, PG_U_LOWERCASE_LETTER},
+	{0x000164, 0x000164, PG_U_UPPERCASE_LETTER},
+	{0x000165, 0x000165, PG_U_LOWERCASE_LETTER},
+	{0x000166, 0x000166, PG_U_UPPERCASE_LETTER},
+	{0x000167, 0x000167, PG_U_LOWERCASE_LETTER},
+	{0x000168, 0x000168, PG_U_UPPERCASE_LETTER},
+	{0x000169, 0x000169, PG_U_LOWERCASE_LETTER},
+	{0x00016a, 0x00016a, PG_U_UPPERCASE_LETTER},
+	{0x00016b, 0x00016b, PG_U_LOWERCASE_LETTER},
+	{0x00016c, 0x00016c, PG_U_UPPERCASE_LETTER},
+	{0x00016d, 0x00016d, PG_U_LOWERCASE_LETTER},
+	{0x00016e, 0x00016e, PG_U_UPPERCASE_LETTER},
+	{0x00016f, 0x00016f, PG_U_LOWERCASE_LETTER},
+	{0x000170, 0x000170, PG_U_UPPERCASE_LETTER},
+	{0x000171, 0x000171, PG_U_LOWERCASE_LETTER},
+	{0x000172, 0x000172, PG_U_UPPERCASE_LETTER},
+	{0x000173, 0x000173, PG_U_LOWERCASE_LETTER},
+	{0x000174, 0x000174, PG_U_UPPERCASE_LETTER},
+	{0x000175, 0x000175, PG_U_LOWERCASE_LETTER},
+	{0x000176, 0x000176, PG_U_UPPERCASE_LETTER},
+	{0x000177, 0x000177, PG_U_LOWERCASE_LETTER},
+	{0x000178, 0x000179, PG_U_UPPERCASE_LETTER},
+	{0x00017a, 0x00017a, PG_U_LOWERCASE_LETTER},
+	{0x00017b, 0x00017b, PG_U_UPPERCASE_LETTER},
+	{0x00017c, 0x00017c, PG_U_LOWERCASE_LETTER},
+	{0x00017d, 0x00017d, PG_U_UPPERCASE_LETTER},
+	{0x00017e, 0x000180, PG_U_LOWERCASE_LETTER},
+	{0x000181, 0x000182, PG_U_UPPERCASE_LETTER},
+	{0x000183, 0x000183, PG_U_LOWERCASE_LETTER},
+	{0x000184, 0x000184, PG_U_UPPERCASE_LETTER},
+	{0x000185, 0x000185, PG_U_LOWERCASE_LETTER},
+	{0x000186, 0x000187, PG_U_UPPERCASE_LETTER},
+	{0x000188, 0x000188, PG_U_LOWERCASE_LETTER},
+	{0x000189, 0x00018b, PG_U_UPPERCASE_LETTER},
+	{0x00018c, 0x00018d, PG_U_LOWERCASE_LETTER},
+	{0x00018e, 0x000191, PG_U_UPPERCASE_LETTER},
+	{0x000192, 0x000192, PG_U_LOWERCASE_LETTER},
+	{0x000193, 0x000194, PG_U_UPPERCASE_LETTER},
+	{0x000195, 0x000195, PG_U_LOWERCASE_LETTER},
+	{0x000196, 0x000198, PG_U_UPPERCASE_LETTER},
+	{0x000199, 0x00019b, PG_U_LOWERCASE_LETTER},
+	{0x00019c, 0x00019d, PG_U_UPPERCASE_LETTER},
+	{0x00019e, 0x00019e, PG_U_LOWERCASE_LETTER},
+	{0x00019f, 0x0001a0, PG_U_UPPERCASE_LETTER},
+	{0x0001a1, 0x0001a1, PG_U_LOWERCASE_LETTER},
+	{0x0001a2, 0x0001a2, PG_U_UPPERCASE_LETTER},
+	{0x0001a3, 0x0001a3, PG_U_LOWERCASE_LETTER},
+	{0x0001a4, 0x0001a4, PG_U_UPPERCASE_LETTER},
+	{0x0001a5, 0x0001a5, PG_U_LOWERCASE_LETTER},
+	{0x0001a6, 0x0001a7, PG_U_UPPERCASE_LETTER},
+	{0x0001a8, 0x0001a8, PG_U_LOWERCASE_LETTER},
+	{0x0001a9, 0x0001a9, PG_U_UPPERCASE_LETTER},
+	{0x0001aa, 0x0001ab, PG_U_LOWERCASE_LETTER},
+	{0x0001ac, 0x0001ac, PG_U_UPPERCASE_LETTER},
+	{0x0001ad, 0x0001ad, PG_U_LOWERCASE_LETTER},
+	{0x0001ae, 0x0001af, PG_U_UPPERCASE_LETTER},
+	{0x0001b0, 0x0001b0, PG_U_LOWERCASE_LETTER},
+	{0x0001b1, 0x0001b3, PG_U_UPPERCASE_LETTER},
+	{0x0001b4, 0x0001b4, PG_U_LOWERCASE_LETTER},
+	{0x0001b5, 0x0001b5, PG_U_UPPERCASE_LETTER},
+	{0x0001b6, 0x0001b6, PG_U_LOWERCASE_LETTER},
+	{0x0001b7, 0x0001b8, PG_U_UPPERCASE_LETTER},
+	{0x0001b9, 0x0001ba, PG_U_LOWERCASE_LETTER},
+	{0x0001bb, 0x0001bb, PG_U_OTHER_LETTER},
+	{0x0001bc, 0x0001bc, PG_U_UPPERCASE_LETTER},
+	{0x0001bd, 0x0001bf, PG_U_LOWERCASE_LETTER},
+	{0x0001c0, 0x0001c3, PG_U_OTHER_LETTER},
+	{0x0001c4, 0x0001c4, PG_U_UPPERCASE_LETTER},
+	{0x0001c5, 0x0001c5, PG_U_TITLECASE_LETTER},
+	{0x0001c6, 0x0001c6, PG_U_LOWERCASE_LETTER},
+	{0x0001c7, 0x0001c7, PG_U_UPPERCASE_LETTER},
+	{0x0001c8, 0x0001c8, PG_U_TITLECASE_LETTER},
+	{0x0001c9, 0x0001c9, PG_U_LOWERCASE_LETTER},
+	{0x0001ca, 0x0001ca, PG_U_UPPERCASE_LETTER},
+	{0x0001cb, 0x0001cb, PG_U_TITLECASE_LETTER},
+	{0x0001cc, 0x0001cc, PG_U_LOWERCASE_LETTER},
+	{0x0001cd, 0x0001cd, PG_U_UPPERCASE_LETTER},
+	{0x0001ce, 0x0001ce, PG_U_LOWERCASE_LETTER},
+	{0x0001cf, 0x0001cf, PG_U_UPPERCASE_LETTER},
+	{0x0001d0, 0x0001d0, PG_U_LOWERCASE_LETTER},
+	{0x0001d1, 0x0001d1, PG_U_UPPERCASE_LETTER},
+	{0x0001d2, 0x0001d2, PG_U_LOWERCASE_LETTER},
+	{0x0001d3, 0x0001d3, PG_U_UPPERCASE_LETTER},
+	{0x0001d4, 0x0001d4, PG_U_LOWERCASE_LETTER},
+	{0x0001d5, 0x0001d5, PG_U_UPPERCASE_LETTER},
+	{0x0001d6, 0x0001d6, PG_U_LOWERCASE_LETTER},
+	{0x0001d7, 0x0001d7, PG_U_UPPERCASE_LETTER},
+	{0x0001d8, 0x0001d8, PG_U_LOWERCASE_LETTER},
+	{0x0001d9, 0x0001d9, PG_U_UPPERCASE_LETTER},
+	{0x0001da, 0x0001da, PG_U_LOWERCASE_LETTER},
+	{0x0001db, 0x0001db, PG_U_UPPERCASE_LETTER},
+	{0x0001dc, 0x0001dd, PG_U_LOWERCASE_LETTER},
+	{0x0001de, 0x0001de, PG_U_UPPERCASE_LETTER},
+	{0x0001df, 0x0001df, PG_U_LOWERCASE_LETTER},
+	{0x0001e0, 0x0001e0, PG_U_UPPERCASE_LETTER},
+	{0x0001e1, 0x0001e1, PG_U_LOWERCASE_LETTER},
+	{0x0001e2, 0x0001e2, PG_U_UPPERCASE_LETTER},
+	{0x0001e3, 0x0001e3, PG_U_LOWERCASE_LETTER},
+	{0x0001e4, 0x0001e4, PG_U_UPPERCASE_LETTER},
+	{0x0001e5, 0x0001e5, PG_U_LOWERCASE_LETTER},
+	{0x0001e6, 0x0001e6, PG_U_UPPERCASE_LETTER},
+	{0x0001e7, 0x0001e7, PG_U_LOWERCASE_LETTER},
+	{0x0001e8, 0x0001e8, PG_U_UPPERCASE_LETTER},
+	{0x0001e9, 0x0001e9, PG_U_LOWERCASE_LETTER},
+	{0x0001ea, 0x0001ea, PG_U_UPPERCASE_LETTER},
+	{0x0001eb, 0x0001eb, PG_U_LOWERCASE_LETTER},
+	{0x0001ec, 0x0001ec, PG_U_UPPERCASE_LETTER},
+	{0x0001ed, 0x0001ed, PG_U_LOWERCASE_LETTER},
+	{0x0001ee, 0x0001ee, PG_U_UPPERCASE_LETTER},
+	{0x0001ef, 0x0001f0, PG_U_LOWERCASE_LETTER},
+	{0x0001f1, 0x0001f1, PG_U_UPPERCASE_LETTER},
+	{0x0001f2, 0x0001f2, PG_U_TITLECASE_LETTER},
+	{0x0001f3, 0x0001f3, PG_U_LOWERCASE_LETTER},
+	{0x0001f4, 0x0001f4, PG_U_UPPERCASE_LETTER},
+	{0x0001f5, 0x0001f5, PG_U_LOWERCASE_LETTER},
+	{0x0001f6, 0x0001f8, PG_U_UPPERCASE_LETTER},
+	{0x0001f9, 0x0001f9, PG_U_LOWERCASE_LETTER},
+	{0x0001fa, 0x0001fa, PG_U_UPPERCASE_LETTER},
+	{0x0001fb, 0x0001fb, PG_U_LOWERCASE_LETTER},
+	{0x0001fc, 0x0001fc, PG_U_UPPERCASE_LETTER},
+	{0x0001fd, 0x0001fd, PG_U_LOWERCASE_LETTER},
+	{0x0001fe, 0x0001fe, PG_U_UPPERCASE_LETTER},
+	{0x0001ff, 0x0001ff, PG_U_LOWERCASE_LETTER},
+	{0x000200, 0x000200, PG_U_UPPERCASE_LETTER},
+	{0x000201, 0x000201, PG_U_LOWERCASE_LETTER},
+	{0x000202, 0x000202, PG_U_UPPERCASE_LETTER},
+	{0x000203, 0x000203, PG_U_LOWERCASE_LETTER},
+	{0x000204, 0x000204, PG_U_UPPERCASE_LETTER},
+	{0x000205, 0x000205, PG_U_LOWERCASE_LETTER},
+	{0x000206, 0x000206, PG_U_UPPERCASE_LETTER},
+	{0x000207, 0x000207, PG_U_LOWERCASE_LETTER},
+	{0x000208, 0x000208, PG_U_UPPERCASE_LETTER},
+	{0x000209, 0x000209, PG_U_LOWERCASE_LETTER},
+	{0x00020a, 0x00020a, PG_U_UPPERCASE_LETTER},
+	{0x00020b, 0x00020b, PG_U_LOWERCASE_LETTER},
+	{0x00020c, 0x00020c, PG_U_UPPERCASE_LETTER},
+	{0x00020d, 0x00020d, PG_U_LOWERCASE_LETTER},
+	{0x00020e, 0x00020e, PG_U_UPPERCASE_LETTER},
+	{0x00020f, 0x00020f, PG_U_LOWERCASE_LETTER},
+	{0x000210, 0x000210, PG_U_UPPERCASE_LETTER},
+	{0x000211, 0x000211, PG_U_LOWERCASE_LETTER},
+	{0x000212, 0x000212, PG_U_UPPERCASE_LETTER},
+	{0x000213, 0x000213, PG_U_LOWERCASE_LETTER},
+	{0x000214, 0x000214, PG_U_UPPERCASE_LETTER},
+	{0x000215, 0x000215, PG_U_LOWERCASE_LETTER},
+	{0x000216, 0x000216, PG_U_UPPERCASE_LETTER},
+	{0x000217, 0x000217, PG_U_LOWERCASE_LETTER},
+	{0x000218, 0x000218, PG_U_UPPERCASE_LETTER},
+	{0x000219, 0x000219, PG_U_LOWERCASE_LETTER},
+	{0x00021a, 0x00021a, PG_U_UPPERCASE_LETTER},
+	{0x00021b, 0x00021b, PG_U_LOWERCASE_LETTER},
+	{0x00021c, 0x00021c, PG_U_UPPERCASE_LETTER},
+	{0x00021d, 0x00021d, PG_U_LOWERCASE_LETTER},
+	{0x00021e, 0x00021e, PG_U_UPPERCASE_LETTER},
+	{0x00021f, 0x00021f, PG_U_LOWERCASE_LETTER},
+	{0x000220, 0x000220, PG_U_UPPERCASE_LETTER},
+	{0x000221, 0x000221, PG_U_LOWERCASE_LETTER},
+	{0x000222, 0x000222, PG_U_UPPERCASE_LETTER},
+	{0x000223, 0x000223, PG_U_LOWERCASE_LETTER},
+	{0x000224, 0x000224, PG_U_UPPERCASE_LETTER},
+	{0x000225, 0x000225, PG_U_LOWERCASE_LETTER},
+	{0x000226, 0x000226, PG_U_UPPERCASE_LETTER},
+	{0x000227, 0x000227, PG_U_LOWERCASE_LETTER},
+	{0x000228, 0x000228, PG_U_UPPERCASE_LETTER},
+	{0x000229, 0x000229, PG_U_LOWERCASE_LETTER},
+	{0x00022a, 0x00022a, PG_U_UPPERCASE_LETTER},
+	{0x00022b, 0x00022b, PG_U_LOWERCASE_LETTER},
+	{0x00022c, 0x00022c, PG_U_UPPERCASE_LETTER},
+	{0x00022d, 0x00022d, PG_U_LOWERCASE_LETTER},
+	{0x00022e, 0x00022e, PG_U_UPPERCASE_LETTER},
+	{0x00022f, 0x00022f, PG_U_LOWERCASE_LETTER},
+	{0x000230, 0x000230, PG_U_UPPERCASE_LETTER},
+	{0x000231, 0x000231, PG_U_LOWERCASE_LETTER},
+	{0x000232, 0x000232, PG_U_UPPERCASE_LETTER},
+	{0x000233, 0x000239, PG_U_LOWERCASE_LETTER},
+	{0x00023a, 0x00023b, PG_U_UPPERCASE_LETTER},
+	{0x00023c, 0x00023c, PG_U_LOWERCASE_LETTER},
+	{0x00023d, 0x00023e, PG_U_UPPERCASE_LETTER},
+	{0x00023f, 0x000240, PG_U_LOWERCASE_LETTER},
+	{0x000241, 0x000241, PG_U_UPPERCASE_LETTER},
+	{0x000242, 0x000242, PG_U_LOWERCASE_LETTER},
+	{0x000243, 0x000246, PG_U_UPPERCASE_LETTER},
+	{0x000247, 0x000247, PG_U_LOWERCASE_LETTER},
+	{0x000248, 0x000248, PG_U_UPPERCASE_LETTER},
+	{0x000249, 0x000249, PG_U_LOWERCASE_LETTER},
+	{0x00024a, 0x00024a, PG_U_UPPERCASE_LETTER},
+	{0x00024b, 0x00024b, PG_U_LOWERCASE_LETTER},
+	{0x00024c, 0x00024c, PG_U_UPPERCASE_LETTER},
+	{0x00024d, 0x00024d, PG_U_LOWERCASE_LETTER},
+	{0x00024e, 0x00024e, PG_U_UPPERCASE_LETTER},
+	{0x00024f, 0x000293, PG_U_LOWERCASE_LETTER},
+	{0x000294, 0x000294, PG_U_OTHER_LETTER},
+	{0x000295, 0x0002af, PG_U_LOWERCASE_LETTER},
+	{0x0002b0, 0x0002c1, PG_U_MODIFIER_LETTER},
+	{0x0002c2, 0x0002c5, PG_U_MODIFIER_SYMBOL},
+	{0x0002c6, 0x0002d1, PG_U_MODIFIER_LETTER},
+	{0x0002d2, 0x0002df, PG_U_MODIFIER_SYMBOL},
+	{0x0002e0, 0x0002e4, PG_U_MODIFIER_LETTER},
+	{0x0002e5, 0x0002eb, PG_U_MODIFIER_SYMBOL},
+	{0x0002ec, 0x0002ec, PG_U_MODIFIER_LETTER},
+	{0x0002ed, 0x0002ed, PG_U_MODIFIER_SYMBOL},
+	{0x0002ee, 0x0002ee, PG_U_MODIFIER_LETTER},
+	{0x0002ef, 0x0002ff, PG_U_MODIFIER_SYMBOL},
+	{0x000300, 0x00036f, PG_U_NON_SPACING_MARK},
+	{0x000370, 0x000370, PG_U_UPPERCASE_LETTER},
+	{0x000371, 0x000371, PG_U_LOWERCASE_LETTER},
+	{0x000372, 0x000372, PG_U_UPPERCASE_LETTER},
+	{0x000373, 0x000373, PG_U_LOWERCASE_LETTER},
+	{0x000374, 0x000374, PG_U_MODIFIER_LETTER},
+	{0x000375, 0x000375, PG_U_MODIFIER_SYMBOL},
+	{0x000376, 0x000376, PG_U_UPPERCASE_LETTER},
+	{0x000377, 0x000377, PG_U_LOWERCASE_LETTER},
+	{0x000378, 0x000379, PG_U_UNASSIGNED},
+	{0x00037a, 0x00037a, PG_U_MODIFIER_LETTER},
+	{0x00037b, 0x00037d, PG_U_LOWERCASE_LETTER},
+	{0x00037e, 0x00037e, PG_U_OTHER_PUNCTUATION},
+	{0x00037f, 0x00037f, PG_U_UPPERCASE_LETTER},
+	{0x000380, 0x000383, PG_U_UNASSIGNED},
+	{0x000384, 0x000385, PG_U_MODIFIER_SYMBOL},
+	{0x000386, 0x000386, PG_U_UPPERCASE_LETTER},
+	{0x000387, 0x000387, PG_U_OTHER_PUNCTUATION},
+	{0x000388, 0x00038a, PG_U_UPPERCASE_LETTER},
+	{0x00038b, 0x00038b, PG_U_UNASSIGNED},
+	{0x00038c, 0x00038c, PG_U_UPPERCASE_LETTER},
+	{0x00038d, 0x00038d, PG_U_UNASSIGNED},
+	{0x00038e, 0x00038f, PG_U_UPPERCASE_LETTER},
+	{0x000390, 0x000390, PG_U_LOWERCASE_LETTER},
+	{0x000391, 0x0003a1, PG_U_UPPERCASE_LETTER},
+	{0x0003a2, 0x0003a2, PG_U_UNASSIGNED},
+	{0x0003a3, 0x0003ab, PG_U_UPPERCASE_LETTER},
+	{0x0003ac, 0x0003ce, PG_U_LOWERCASE_LETTER},
+	{0x0003cf, 0x0003cf, PG_U_UPPERCASE_LETTER},
+	{0x0003d0, 0x0003d1, PG_U_LOWERCASE_LETTER},
+	{0x0003d2, 0x0003d4, PG_U_UPPERCASE_LETTER},
+	{0x0003d5, 0x0003d7, PG_U_LOWERCASE_LETTER},
+	{0x0003d8, 0x0003d8, PG_U_UPPERCASE_LETTER},
+	{0x0003d9, 0x0003d9, PG_U_LOWERCASE_LETTER},
+	{0x0003da, 0x0003da, PG_U_UPPERCASE_LETTER},
+	{0x0003db, 0x0003db, PG_U_LOWERCASE_LETTER},
+	{0x0003dc, 0x0003dc, PG_U_UPPERCASE_LETTER},
+	{0x0003dd, 0x0003dd, PG_U_LOWERCASE_LETTER},
+	{0x0003de, 0x0003de, PG_U_UPPERCASE_LETTER},
+	{0x0003df, 0x0003df, PG_U_LOWERCASE_LETTER},
+	{0x0003e0, 0x0003e0, PG_U_UPPERCASE_LETTER},
+	{0x0003e1, 0x0003e1, PG_U_LOWERCASE_LETTER},
+	{0x0003e2, 0x0003e2, PG_U_UPPERCASE_LETTER},
+	{0x0003e3, 0x0003e3, PG_U_LOWERCASE_LETTER},
+	{0x0003e4, 0x0003e4, PG_U_UPPERCASE_LETTER},
+	{0x0003e5, 0x0003e5, PG_U_LOWERCASE_LETTER},
+	{0x0003e6, 0x0003e6, PG_U_UPPERCASE_LETTER},
+	{0x0003e7, 0x0003e7, PG_U_LOWERCASE_LETTER},
+	{0x0003e8, 0x0003e8, PG_U_UPPERCASE_LETTER},
+	{0x0003e9, 0x0003e9, PG_U_LOWERCASE_LETTER},
+	{0x0003ea, 0x0003ea, PG_U_UPPERCASE_LETTER},
+	{0x0003eb, 0x0003eb, PG_U_LOWERCASE_LETTER},
+	{0x0003ec, 0x0003ec, PG_U_UPPERCASE_LETTER},
+	{0x0003ed, 0x0003ed, PG_U_LOWERCASE_LETTER},
+	{0x0003ee, 0x0003ee, PG_U_UPPERCASE_LETTER},
+	{0x0003ef, 0x0003f3, PG_U_LOWERCASE_LETTER},
+	{0x0003f4, 0x0003f4, PG_U_UPPERCASE_LETTER},
+	{0x0003f5, 0x0003f5, PG_U_LOWERCASE_LETTER},
+	{0x0003f6, 0x0003f6, PG_U_MATH_SYMBOL},
+	{0x0003f7, 0x0003f7, PG_U_UPPERCASE_LETTER},
+	{0x0003f8, 0x0003f8, PG_U_LOWERCASE_LETTER},
+	{0x0003f9, 0x0003fa, PG_U_UPPERCASE_LETTER},
+	{0x0003fb, 0x0003fc, PG_U_LOWERCASE_LETTER},
+	{0x0003fd, 0x00042f, PG_U_UPPERCASE_LETTER},
+	{0x000430, 0x00045f, PG_U_LOWERCASE_LETTER},
+	{0x000460, 0x000460, PG_U_UPPERCASE_LETTER},
+	{0x000461, 0x000461, PG_U_LOWERCASE_LETTER},
+	{0x000462, 0x000462, PG_U_UPPERCASE_LETTER},
+	{0x000463, 0x000463, PG_U_LOWERCASE_LETTER},
+	{0x000464, 0x000464, PG_U_UPPERCASE_LETTER},
+	{0x000465, 0x000465, PG_U_LOWERCASE_LETTER},
+	{0x000466, 0x000466, PG_U_UPPERCASE_LETTER},
+	{0x000467, 0x000467, PG_U_LOWERCASE_LETTER},
+	{0x000468, 0x000468, PG_U_UPPERCASE_LETTER},
+	{0x000469, 0x000469, PG_U_LOWERCASE_LETTER},
+	{0x00046a, 0x00046a, PG_U_UPPERCASE_LETTER},
+	{0x00046b, 0x00046b, PG_U_LOWERCASE_LETTER},
+	{0x00046c, 0x00046c, PG_U_UPPERCASE_LETTER},
+	{0x00046d, 0x00046d, PG_U_LOWERCASE_LETTER},
+	{0x00046e, 0x00046e, PG_U_UPPERCASE_LETTER},
+	{0x00046f, 0x00046f, PG_U_LOWERCASE_LETTER},
+	{0x000470, 0x000470, PG_U_UPPERCASE_LETTER},
+	{0x000471, 0x000471, PG_U_LOWERCASE_LETTER},
+	{0x000472, 0x000472, PG_U_UPPERCASE_LETTER},
+	{0x000473, 0x000473, PG_U_LOWERCASE_LETTER},
+	{0x000474, 0x000474, PG_U_UPPERCASE_LETTER},
+	{0x000475, 0x000475, PG_U_LOWERCASE_LETTER},
+	{0x000476, 0x000476, PG_U_UPPERCASE_LETTER},
+	{0x000477, 0x000477, PG_U_LOWERCASE_LETTER},
+	{0x000478, 0x000478, PG_U_UPPERCASE_LETTER},
+	{0x000479, 0x000479, PG_U_LOWERCASE_LETTER},
+	{0x00047a, 0x00047a, PG_U_UPPERCASE_LETTER},
+	{0x00047b, 0x00047b, PG_U_LOWERCASE_LETTER},
+	{0x00047c, 0x00047c, PG_U_UPPERCASE_LETTER},
+	{0x00047d, 0x00047d, PG_U_LOWERCASE_LETTER},
+	{0x00047e, 0x00047e, PG_U_UPPERCASE_LETTER},
+	{0x00047f, 0x00047f, PG_U_LOWERCASE_LETTER},
+	{0x000480, 0x000480, PG_U_UPPERCASE_LETTER},
+	{0x000481, 0x000481, PG_U_LOWERCASE_LETTER},
+	{0x000482, 0x000482, PG_U_OTHER_SYMBOL},
+	{0x000483, 0x000487, PG_U_NON_SPACING_MARK},
+	{0x000488, 0x000489, PG_U_ENCLOSING_MARK},
+	{0x00048a, 0x00048a, PG_U_UPPERCASE_LETTER},
+	{0x00048b, 0x00048b, PG_U_LOWERCASE_LETTER},
+	{0x00048c, 0x00048c, PG_U_UPPERCASE_LETTER},
+	{0x00048d, 0x00048d, PG_U_LOWERCASE_LETTER},
+	{0x00048e, 0x00048e, PG_U_UPPERCASE_LETTER},
+	{0x00048f, 0x00048f, PG_U_LOWERCASE_LETTER},
+	{0x000490, 0x000490, PG_U_UPPERCASE_LETTER},
+	{0x000491, 0x000491, PG_U_LOWERCASE_LETTER},
+	{0x000492, 0x000492, PG_U_UPPERCASE_LETTER},
+	{0x000493, 0x000493, PG_U_LOWERCASE_LETTER},
+	{0x000494, 0x000494, PG_U_UPPERCASE_LETTER},
+	{0x000495, 0x000495, PG_U_LOWERCASE_LETTER},
+	{0x000496, 0x000496, PG_U_UPPERCASE_LETTER},
+	{0x000497, 0x000497, PG_U_LOWERCASE_LETTER},
+	{0x000498, 0x000498, PG_U_UPPERCASE_LETTER},
+	{0x000499, 0x000499, PG_U_LOWERCASE_LETTER},
+	{0x00049a, 0x00049a, PG_U_UPPERCASE_LETTER},
+	{0x00049b, 0x00049b, PG_U_LOWERCASE_LETTER},
+	{0x00049c, 0x00049c, PG_U_UPPERCASE_LETTER},
+	{0x00049d, 0x00049d, PG_U_LOWERCASE_LETTER},
+	{0x00049e, 0x00049e, PG_U_UPPERCASE_LETTER},
+	{0x00049f, 0x00049f, PG_U_LOWERCASE_LETTER},
+	{0x0004a0, 0x0004a0, PG_U_UPPERCASE_LETTER},
+	{0x0004a1, 0x0004a1, PG_U_LOWERCASE_LETTER},
+	{0x0004a2, 0x0004a2, PG_U_UPPERCASE_LETTER},
+	{0x0004a3, 0x0004a3, PG_U_LOWERCASE_LETTER},
+	{0x0004a4, 0x0004a4, PG_U_UPPERCASE_LETTER},
+	{0x0004a5, 0x0004a5, PG_U_LOWERCASE_LETTER},
+	{0x0004a6, 0x0004a6, PG_U_UPPERCASE_LETTER},
+	{0x0004a7, 0x0004a7, PG_U_LOWERCASE_LETTER},
+	{0x0004a8, 0x0004a8, PG_U_UPPERCASE_LETTER},
+	{0x0004a9, 0x0004a9, PG_U_LOWERCASE_LETTER},
+	{0x0004aa, 0x0004aa, PG_U_UPPERCASE_LETTER},
+	{0x0004ab, 0x0004ab, PG_U_LOWERCASE_LETTER},
+	{0x0004ac, 0x0004ac, PG_U_UPPERCASE_LETTER},
+	{0x0004ad, 0x0004ad, PG_U_LOWERCASE_LETTER},
+	{0x0004ae, 0x0004ae, PG_U_UPPERCASE_LETTER},
+	{0x0004af, 0x0004af, PG_U_LOWERCASE_LETTER},
+	{0x0004b0, 0x0004b0, PG_U_UPPERCASE_LETTER},
+	{0x0004b1, 0x0004b1, PG_U_LOWERCASE_LETTER},
+	{0x0004b2, 0x0004b2, PG_U_UPPERCASE_LETTER},
+	{0x0004b3, 0x0004b3, PG_U_LOWERCASE_LETTER},
+	{0x0004b4, 0x0004b4, PG_U_UPPERCASE_LETTER},
+	{0x0004b5, 0x0004b5, PG_U_LOWERCASE_LETTER},
+	{0x0004b6, 0x0004b6, PG_U_UPPERCASE_LETTER},
+	{0x0004b7, 0x0004b7, PG_U_LOWERCASE_LETTER},
+	{0x0004b8, 0x0004b8, PG_U_UPPERCASE_LETTER},
+	{0x0004b9, 0x0004b9, PG_U_LOWERCASE_LETTER},
+	{0x0004ba, 0x0004ba, PG_U_UPPERCASE_LETTER},
+	{0x0004bb, 0x0004bb, PG_U_LOWERCASE_LETTER},
+	{0x0004bc, 0x0004bc, PG_U_UPPERCASE_LETTER},
+	{0x0004bd, 0x0004bd, PG_U_LOWERCASE_LETTER},
+	{0x0004be, 0x0004be, PG_U_UPPERCASE_LETTER},
+	{0x0004bf, 0x0004bf, PG_U_LOWERCASE_LETTER},
+	{0x0004c0, 0x0004c1, PG_U_UPPERCASE_LETTER},
+	{0x0004c2, 0x0004c2, PG_U_LOWERCASE_LETTER},
+	{0x0004c3, 0x0004c3, PG_U_UPPERCASE_LETTER},
+	{0x0004c4, 0x0004c4, PG_U_LOWERCASE_LETTER},
+	{0x0004c5, 0x0004c5, PG_U_UPPERCASE_LETTER},
+	{0x0004c6, 0x0004c6, PG_U_LOWERCASE_LETTER},
+	{0x0004c7, 0x0004c7, PG_U_UPPERCASE_LETTER},
+	{0x0004c8, 0x0004c8, PG_U_LOWERCASE_LETTER},
+	{0x0004c9, 0x0004c9, PG_U_UPPERCASE_LETTER},
+	{0x0004ca, 0x0004ca, PG_U_LOWERCASE_LETTER},
+	{0x0004cb, 0x0004cb, PG_U_UPPERCASE_LETTER},
+	{0x0004cc, 0x0004cc, PG_U_LOWERCASE_LETTER},
+	{0x0004cd, 0x0004cd, PG_U_UPPERCASE_LETTER},
+	{0x0004ce, 0x0004cf, PG_U_LOWERCASE_LETTER},
+	{0x0004d0, 0x0004d0, PG_U_UPPERCASE_LETTER},
+	{0x0004d1, 0x0004d1, PG_U_LOWERCASE_LETTER},
+	{0x0004d2, 0x0004d2, PG_U_UPPERCASE_LETTER},
+	{0x0004d3, 0x0004d3, PG_U_LOWERCASE_LETTER},
+	{0x0004d4, 0x0004d4, PG_U_UPPERCASE_LETTER},
+	{0x0004d5, 0x0004d5, PG_U_LOWERCASE_LETTER},
+	{0x0004d6, 0x0004d6, PG_U_UPPERCASE_LETTER},
+	{0x0004d7, 0x0004d7, PG_U_LOWERCASE_LETTER},
+	{0x0004d8, 0x0004d8, PG_U_UPPERCASE_LETTER},
+	{0x0004d9, 0x0004d9, PG_U_LOWERCASE_LETTER},
+	{0x0004da, 0x0004da, PG_U_UPPERCASE_LETTER},
+	{0x0004db, 0x0004db, PG_U_LOWERCASE_LETTER},
+	{0x0004dc, 0x0004dc, PG_U_UPPERCASE_LETTER},
+	{0x0004dd, 0x0004dd, PG_U_LOWERCASE_LETTER},
+	{0x0004de, 0x0004de, PG_U_UPPERCASE_LETTER},
+	{0x0004df, 0x0004df, PG_U_LOWERCASE_LETTER},
+	{0x0004e0, 0x0004e0, PG_U_UPPERCASE_LETTER},
+	{0x0004e1, 0x0004e1, PG_U_LOWERCASE_LETTER},
+	{0x0004e2, 0x0004e2, PG_U_UPPERCASE_LETTER},
+	{0x0004e3, 0x0004e3, PG_U_LOWERCASE_LETTER},
+	{0x0004e4, 0x0004e4, PG_U_UPPERCASE_LETTER},
+	{0x0004e5, 0x0004e5, PG_U_LOWERCASE_LETTER},
+	{0x0004e6, 0x0004e6, PG_U_UPPERCASE_LETTER},
+	{0x0004e7, 0x0004e7, PG_U_LOWERCASE_LETTER},
+	{0x0004e8, 0x0004e8, PG_U_UPPERCASE_LETTER},
+	{0x0004e9, 0x0004e9, PG_U_LOWERCASE_LETTER},
+	{0x0004ea, 0x0004ea, PG_U_UPPERCASE_LETTER},
+	{0x0004eb, 0x0004eb, PG_U_LOWERCASE_LETTER},
+	{0x0004ec, 0x0004ec, PG_U_UPPERCASE_LETTER},
+	{0x0004ed, 0x0004ed, PG_U_LOWERCASE_LETTER},
+	{0x0004ee, 0x0004ee, PG_U_UPPERCASE_LETTER},
+	{0x0004ef, 0x0004ef, PG_U_LOWERCASE_LETTER},
+	{0x0004f0, 0x0004f0, PG_U_UPPERCASE_LETTER},
+	{0x0004f1, 0x0004f1, PG_U_LOWERCASE_LETTER},
+	{0x0004f2, 0x0004f2, PG_U_UPPERCASE_LETTER},
+	{0x0004f3, 0x0004f3, PG_U_LOWERCASE_LETTER},
+	{0x0004f4, 0x0004f4, PG_U_UPPERCASE_LETTER},
+	{0x0004f5, 0x0004f5, PG_U_LOWERCASE_LETTER},
+	{0x0004f6, 0x0004f6, PG_U_UPPERCASE_LETTER},
+	{0x0004f7, 0x0004f7, PG_U_LOWERCASE_LETTER},
+	{0x0004f8, 0x0004f8, PG_U_UPPERCASE_LETTER},
+	{0x0004f9, 0x0004f9, PG_U_LOWERCASE_LETTER},
+	{0x0004fa, 0x0004fa, PG_U_UPPERCASE_LETTER},
+	{0x0004fb, 0x0004fb, PG_U_LOWERCASE_LETTER},
+	{0x0004fc, 0x0004fc, PG_U_UPPERCASE_LETTER},
+	{0x0004fd, 0x0004fd, PG_U_LOWERCASE_LETTER},
+	{0x0004fe, 0x0004fe, PG_U_UPPERCASE_LETTER},
+	{0x0004ff, 0x0004ff, PG_U_LOWERCASE_LETTER},
+	{0x000500, 0x000500, PG_U_UPPERCASE_LETTER},
+	{0x000501, 0x000501, PG_U_LOWERCASE_LETTER},
+	{0x000502, 0x000502, PG_U_UPPERCASE_LETTER},
+	{0x000503, 0x000503, PG_U_LOWERCASE_LETTER},
+	{0x000504, 0x000504, PG_U_UPPERCASE_LETTER},
+	{0x000505, 0x000505, PG_U_LOWERCASE_LETTER},
+	{0x000506, 0x000506, PG_U_UPPERCASE_LETTER},
+	{0x000507, 0x000507, PG_U_LOWERCASE_LETTER},
+	{0x000508, 0x000508, PG_U_UPPERCASE_LETTER},
+	{0x000509, 0x000509, PG_U_LOWERCASE_LETTER},
+	{0x00050a, 0x00050a, PG_U_UPPERCASE_LETTER},
+	{0x00050b, 0x00050b, PG_U_LOWERCASE_LETTER},
+	{0x00050c, 0x00050c, PG_U_UPPERCASE_LETTER},
+	{0x00050d, 0x00050d, PG_U_LOWERCASE_LETTER},
+	{0x00050e, 0x00050e, PG_U_UPPERCASE_LETTER},
+	{0x00050f, 0x00050f, PG_U_LOWERCASE_LETTER},
+	{0x000510, 0x000510, PG_U_UPPERCASE_LETTER},
+	{0x000511, 0x000511, PG_U_LOWERCASE_LETTER},
+	{0x000512, 0x000512, PG_U_UPPERCASE_LETTER},
+	{0x000513, 0x000513, PG_U_LOWERCASE_LETTER},
+	{0x000514, 0x000514, PG_U_UPPERCASE_LETTER},
+	{0x000515, 0x000515, PG_U_LOWERCASE_LETTER},
+	{0x000516, 0x000516, PG_U_UPPERCASE_LETTER},
+	{0x000517, 0x000517, PG_U_LOWERCASE_LETTER},
+	{0x000518, 0x000518, PG_U_UPPERCASE_LETTER},
+	{0x000519, 0x000519, PG_U_LOWERCASE_LETTER},
+	{0x00051a, 0x00051a, PG_U_UPPERCASE_LETTER},
+	{0x00051b, 0x00051b, PG_U_LOWERCASE_LETTER},
+	{0x00051c, 0x00051c, PG_U_UPPERCASE_LETTER},
+	{0x00051d, 0x00051d, PG_U_LOWERCASE_LETTER},
+	{0x00051e, 0x00051e, PG_U_UPPERCASE_LETTER},
+	{0x00051f, 0x00051f, PG_U_LOWERCASE_LETTER},
+	{0x000520, 0x000520, PG_U_UPPERCASE_LETTER},
+	{0x000521, 0x000521, PG_U_LOWERCASE_LETTER},
+	{0x000522, 0x000522, PG_U_UPPERCASE_LETTER},
+	{0x000523, 0x000523, PG_U_LOWERCASE_LETTER},
+	{0x000524, 0x000524, PG_U_UPPERCASE_LETTER},
+	{0x000525, 0x000525, PG_U_LOWERCASE_LETTER},
+	{0x000526, 0x000526, PG_U_UPPERCASE_LETTER},
+	{0x000527, 0x000527, PG_U_LOWERCASE_LETTER},
+	{0x000528, 0x000528, PG_U_UPPERCASE_LETTER},
+	{0x000529, 0x000529, PG_U_LOWERCASE_LETTER},
+	{0x00052a, 0x00052a, PG_U_UPPERCASE_LETTER},
+	{0x00052b, 0x00052b, PG_U_LOWERCASE_LETTER},
+	{0x00052c, 0x00052c, PG_U_UPPERCASE_LETTER},
+	{0x00052d, 0x00052d, PG_U_LOWERCASE_LETTER},
+	{0x00052e, 0x00052e, PG_U_UPPERCASE_LETTER},
+	{0x00052f, 0x00052f, PG_U_LOWERCASE_LETTER},
+	{0x000530, 0x000530, PG_U_UNASSIGNED},
+	{0x000531, 0x000556, PG_U_UPPERCASE_LETTER},
+	{0x000557, 0x000558, PG_U_UNASSIGNED},
+	{0x000559, 0x000559, PG_U_MODIFIER_LETTER},
+	{0x00055a, 0x00055f, PG_U_OTHER_PUNCTUATION},
+	{0x000560, 0x000588, PG_U_LOWERCASE_LETTER},
+	{0x000589, 0x000589, PG_U_OTHER_PUNCTUATION},
+	{0x00058a, 0x00058a, PG_U_DASH_PUNCTUATION},
+	{0x00058b, 0x00058c, PG_U_UNASSIGNED},
+	{0x00058d, 0x00058e, PG_U_OTHER_SYMBOL},
+	{0x00058f, 0x00058f, PG_U_CURRENCY_SYMBOL},
+	{0x000590, 0x000590, PG_U_UNASSIGNED},
+	{0x000591, 0x0005bd, PG_U_NON_SPACING_MARK},
+	{0x0005be, 0x0005be, PG_U_DASH_PUNCTUATION},
+	{0x0005bf, 0x0005bf, PG_U_NON_SPACING_MARK},
+	{0x0005c0, 0x0005c0, PG_U_OTHER_PUNCTUATION},
+	{0x0005c1, 0x0005c2, PG_U_NON_SPACING_MARK},
+	{0x0005c3, 0x0005c3, PG_U_OTHER_PUNCTUATION},
+	{0x0005c4, 0x0005c5, PG_U_NON_SPACING_MARK},
+	{0x0005c6, 0x0005c6, PG_U_OTHER_PUNCTUATION},
+	{0x0005c7, 0x0005c7, PG_U_NON_SPACING_MARK},
+	{0x0005c8, 0x0005cf, PG_U_UNASSIGNED},
+	{0x0005d0, 0x0005ea, PG_U_OTHER_LETTER},
+	{0x0005eb, 0x0005ee, PG_U_UNASSIGNED},
+	{0x0005ef, 0x0005f2, PG_U_OTHER_LETTER},
+	{0x0005f3, 0x0005f4, PG_U_OTHER_PUNCTUATION},
+	{0x0005f5, 0x0005ff, PG_U_UNASSIGNED},
+	{0x000600, 0x000605, PG_U_FORMAT_CHAR},
+	{0x000606, 0x000608, PG_U_MATH_SYMBOL},
+	{0x000609, 0x00060a, PG_U_OTHER_PUNCTUATION},
+	{0x00060b, 0x00060b, PG_U_CURRENCY_SYMBOL},
+	{0x00060c, 0x00060d, PG_U_OTHER_PUNCTUATION},
+	{0x00060e, 0x00060f, PG_U_OTHER_SYMBOL},
+	{0x000610, 0x00061a, PG_U_NON_SPACING_MARK},
+	{0x00061b, 0x00061b, PG_U_OTHER_PUNCTUATION},
+	{0x00061c, 0x00061c, PG_U_FORMAT_CHAR},
+	{0x00061d, 0x00061f, PG_U_OTHER_PUNCTUATION},
+	{0x000620, 0x00063f, PG_U_OTHER_LETTER},
+	{0x000640, 0x000640, PG_U_MODIFIER_LETTER},
+	{0x000641, 0x00064a, PG_U_OTHER_LETTER},
+	{0x00064b, 0x00065f, PG_U_NON_SPACING_MARK},
+	{0x000660, 0x000669, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00066a, 0x00066d, PG_U_OTHER_PUNCTUATION},
+	{0x00066e, 0x00066f, PG_U_OTHER_LETTER},
+	{0x000670, 0x000670, PG_U_NON_SPACING_MARK},
+	{0x000671, 0x0006d3, PG_U_OTHER_LETTER},
+	{0x0006d4, 0x0006d4, PG_U_OTHER_PUNCTUATION},
+	{0x0006d5, 0x0006d5, PG_U_OTHER_LETTER},
+	{0x0006d6, 0x0006dc, PG_U_NON_SPACING_MARK},
+	{0x0006dd, 0x0006dd, PG_U_FORMAT_CHAR},
+	{0x0006de, 0x0006de, PG_U_OTHER_SYMBOL},
+	{0x0006df, 0x0006e4, PG_U_NON_SPACING_MARK},
+	{0x0006e5, 0x0006e6, PG_U_MODIFIER_LETTER},
+	{0x0006e7, 0x0006e8, PG_U_NON_SPACING_MARK},
+	{0x0006e9, 0x0006e9, PG_U_OTHER_SYMBOL},
+	{0x0006ea, 0x0006ed, PG_U_NON_SPACING_MARK},
+	{0x0006ee, 0x0006ef, PG_U_OTHER_LETTER},
+	{0x0006f0, 0x0006f9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0006fa, 0x0006fc, PG_U_OTHER_LETTER},
+	{0x0006fd, 0x0006fe, PG_U_OTHER_SYMBOL},
+	{0x0006ff, 0x0006ff, PG_U_OTHER_LETTER},
+	{0x000700, 0x00070d, PG_U_OTHER_PUNCTUATION},
+	{0x00070e, 0x00070e, PG_U_UNASSIGNED},
+	{0x00070f, 0x00070f, PG_U_FORMAT_CHAR},
+	{0x000710, 0x000710, PG_U_OTHER_LETTER},
+	{0x000711, 0x000711, PG_U_NON_SPACING_MARK},
+	{0x000712, 0x00072f, PG_U_OTHER_LETTER},
+	{0x000730, 0x00074a, PG_U_NON_SPACING_MARK},
+	{0x00074b, 0x00074c, PG_U_UNASSIGNED},
+	{0x00074d, 0x0007a5, PG_U_OTHER_LETTER},
+	{0x0007a6, 0x0007b0, PG_U_NON_SPACING_MARK},
+	{0x0007b1, 0x0007b1, PG_U_OTHER_LETTER},
+	{0x0007b2, 0x0007bf, PG_U_UNASSIGNED},
+	{0x0007c0, 0x0007c9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0007ca, 0x0007ea, PG_U_OTHER_LETTER},
+	{0x0007eb, 0x0007f3, PG_U_NON_SPACING_MARK},
+	{0x0007f4, 0x0007f5, PG_U_MODIFIER_LETTER},
+	{0x0007f6, 0x0007f6, PG_U_OTHER_SYMBOL},
+	{0x0007f7, 0x0007f9, PG_U_OTHER_PUNCTUATION},
+	{0x0007fa, 0x0007fa, PG_U_MODIFIER_LETTER},
+	{0x0007fb, 0x0007fc, PG_U_UNASSIGNED},
+	{0x0007fd, 0x0007fd, PG_U_NON_SPACING_MARK},
+	{0x0007fe, 0x0007ff, PG_U_CURRENCY_SYMBOL},
+	{0x000800, 0x000815, PG_U_OTHER_LETTER},
+	{0x000816, 0x000819, PG_U_NON_SPACING_MARK},
+	{0x00081a, 0x00081a, PG_U_MODIFIER_LETTER},
+	{0x00081b, 0x000823, PG_U_NON_SPACING_MARK},
+	{0x000824, 0x000824, PG_U_MODIFIER_LETTER},
+	{0x000825, 0x000827, PG_U_NON_SPACING_MARK},
+	{0x000828, 0x000828, PG_U_MODIFIER_LETTER},
+	{0x000829, 0x00082d, PG_U_NON_SPACING_MARK},
+	{0x00082e, 0x00082f, PG_U_UNASSIGNED},
+	{0x000830, 0x00083e, PG_U_OTHER_PUNCTUATION},
+	{0x00083f, 0x00083f, PG_U_UNASSIGNED},
+	{0x000840, 0x000858, PG_U_OTHER_LETTER},
+	{0x000859, 0x00085b, PG_U_NON_SPACING_MARK},
+	{0x00085c, 0x00085d, PG_U_UNASSIGNED},
+	{0x00085e, 0x00085e, PG_U_OTHER_PUNCTUATION},
+	{0x00085f, 0x00085f, PG_U_UNASSIGNED},
+	{0x000860, 0x00086a, PG_U_OTHER_LETTER},
+	{0x00086b, 0x00086f, PG_U_UNASSIGNED},
+	{0x000870, 0x000887, PG_U_OTHER_LETTER},
+	{0x000888, 0x000888, PG_U_MODIFIER_SYMBOL},
+	{0x000889, 0x00088e, PG_U_OTHER_LETTER},
+	{0x00088f, 0x00088f, PG_U_UNASSIGNED},
+	{0x000890, 0x000891, PG_U_FORMAT_CHAR},
+	{0x000892, 0x000897, PG_U_UNASSIGNED},
+	{0x000898, 0x00089f, PG_U_NON_SPACING_MARK},
+	{0x0008a0, 0x0008c8, PG_U_OTHER_LETTER},
+	{0x0008c9, 0x0008c9, PG_U_MODIFIER_LETTER},
+	{0x0008ca, 0x0008e1, PG_U_NON_SPACING_MARK},
+	{0x0008e2, 0x0008e2, PG_U_FORMAT_CHAR},
+	{0x0008e3, 0x000902, PG_U_NON_SPACING_MARK},
+	{0x000903, 0x000903, PG_U_COMBINING_SPACING_MARK},
+	{0x000904, 0x000939, PG_U_OTHER_LETTER},
+	{0x00093a, 0x00093a, PG_U_NON_SPACING_MARK},
+	{0x00093b, 0x00093b, PG_U_COMBINING_SPACING_MARK},
+	{0x00093c, 0x00093c, PG_U_NON_SPACING_MARK},
+	{0x00093d, 0x00093d, PG_U_OTHER_LETTER},
+	{0x00093e, 0x000940, PG_U_COMBINING_SPACING_MARK},
+	{0x000941, 0x000948, PG_U_NON_SPACING_MARK},
+	{0x000949, 0x00094c, PG_U_COMBINING_SPACING_MARK},
+	{0x00094d, 0x00094d, PG_U_NON_SPACING_MARK},
+	{0x00094e, 0x00094f, PG_U_COMBINING_SPACING_MARK},
+	{0x000950, 0x000950, PG_U_OTHER_LETTER},
+	{0x000951, 0x000957, PG_U_NON_SPACING_MARK},
+	{0x000958, 0x000961, PG_U_OTHER_LETTER},
+	{0x000962, 0x000963, PG_U_NON_SPACING_MARK},
+	{0x000964, 0x000965, PG_U_OTHER_PUNCTUATION},
+	{0x000966, 0x00096f, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000970, 0x000970, PG_U_OTHER_PUNCTUATION},
+	{0x000971, 0x000971, PG_U_MODIFIER_LETTER},
+	{0x000972, 0x000980, PG_U_OTHER_LETTER},
+	{0x000981, 0x000981, PG_U_NON_SPACING_MARK},
+	{0x000982, 0x000983, PG_U_COMBINING_SPACING_MARK},
+	{0x000984, 0x000984, PG_U_UNASSIGNED},
+	{0x000985, 0x00098c, PG_U_OTHER_LETTER},
+	{0x00098d, 0x00098e, PG_U_UNASSIGNED},
+	{0x00098f, 0x000990, PG_U_OTHER_LETTER},
+	{0x000991, 0x000992, PG_U_UNASSIGNED},
+	{0x000993, 0x0009a8, PG_U_OTHER_LETTER},
+	{0x0009a9, 0x0009a9, PG_U_UNASSIGNED},
+	{0x0009aa, 0x0009b0, PG_U_OTHER_LETTER},
+	{0x0009b1, 0x0009b1, PG_U_UNASSIGNED},
+	{0x0009b2, 0x0009b2, PG_U_OTHER_LETTER},
+	{0x0009b3, 0x0009b5, PG_U_UNASSIGNED},
+	{0x0009b6, 0x0009b9, PG_U_OTHER_LETTER},
+	{0x0009ba, 0x0009bb, PG_U_UNASSIGNED},
+	{0x0009bc, 0x0009bc, PG_U_NON_SPACING_MARK},
+	{0x0009bd, 0x0009bd, PG_U_OTHER_LETTER},
+	{0x0009be, 0x0009c0, PG_U_COMBINING_SPACING_MARK},
+	{0x0009c1, 0x0009c4, PG_U_NON_SPACING_MARK},
+	{0x0009c5, 0x0009c6, PG_U_UNASSIGNED},
+	{0x0009c7, 0x0009c8, PG_U_COMBINING_SPACING_MARK},
+	{0x0009c9, 0x0009ca, PG_U_UNASSIGNED},
+	{0x0009cb, 0x0009cc, PG_U_COMBINING_SPACING_MARK},
+	{0x0009cd, 0x0009cd, PG_U_NON_SPACING_MARK},
+	{0x0009ce, 0x0009ce, PG_U_OTHER_LETTER},
+	{0x0009cf, 0x0009d6, PG_U_UNASSIGNED},
+	{0x0009d7, 0x0009d7, PG_U_COMBINING_SPACING_MARK},
+	{0x0009d8, 0x0009db, PG_U_UNASSIGNED},
+	{0x0009dc, 0x0009dd, PG_U_OTHER_LETTER},
+	{0x0009de, 0x0009de, PG_U_UNASSIGNED},
+	{0x0009df, 0x0009e1, PG_U_OTHER_LETTER},
+	{0x0009e2, 0x0009e3, PG_U_NON_SPACING_MARK},
+	{0x0009e4, 0x0009e5, PG_U_UNASSIGNED},
+	{0x0009e6, 0x0009ef, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0009f0, 0x0009f1, PG_U_OTHER_LETTER},
+	{0x0009f2, 0x0009f3, PG_U_CURRENCY_SYMBOL},
+	{0x0009f4, 0x0009f9, PG_U_OTHER_NUMBER},
+	{0x0009fa, 0x0009fa, PG_U_OTHER_SYMBOL},
+	{0x0009fb, 0x0009fb, PG_U_CURRENCY_SYMBOL},
+	{0x0009fc, 0x0009fc, PG_U_OTHER_LETTER},
+	{0x0009fd, 0x0009fd, PG_U_OTHER_PUNCTUATION},
+	{0x0009fe, 0x0009fe, PG_U_NON_SPACING_MARK},
+	{0x0009ff, 0x000a00, PG_U_UNASSIGNED},
+	{0x000a01, 0x000a02, PG_U_NON_SPACING_MARK},
+	{0x000a03, 0x000a03, PG_U_COMBINING_SPACING_MARK},
+	{0x000a04, 0x000a04, PG_U_UNASSIGNED},
+	{0x000a05, 0x000a0a, PG_U_OTHER_LETTER},
+	{0x000a0b, 0x000a0e, PG_U_UNASSIGNED},
+	{0x000a0f, 0x000a10, PG_U_OTHER_LETTER},
+	{0x000a11, 0x000a12, PG_U_UNASSIGNED},
+	{0x000a13, 0x000a28, PG_U_OTHER_LETTER},
+	{0x000a29, 0x000a29, PG_U_UNASSIGNED},
+	{0x000a2a, 0x000a30, PG_U_OTHER_LETTER},
+	{0x000a31, 0x000a31, PG_U_UNASSIGNED},
+	{0x000a32, 0x000a33, PG_U_OTHER_LETTER},
+	{0x000a34, 0x000a34, PG_U_UNASSIGNED},
+	{0x000a35, 0x000a36, PG_U_OTHER_LETTER},
+	{0x000a37, 0x000a37, PG_U_UNASSIGNED},
+	{0x000a38, 0x000a39, PG_U_OTHER_LETTER},
+	{0x000a3a, 0x000a3b, PG_U_UNASSIGNED},
+	{0x000a3c, 0x000a3c, PG_U_NON_SPACING_MARK},
+	{0x000a3d, 0x000a3d, PG_U_UNASSIGNED},
+	{0x000a3e, 0x000a40, PG_U_COMBINING_SPACING_MARK},
+	{0x000a41, 0x000a42, PG_U_NON_SPACING_MARK},
+	{0x000a43, 0x000a46, PG_U_UNASSIGNED},
+	{0x000a47, 0x000a48, PG_U_NON_SPACING_MARK},
+	{0x000a49, 0x000a4a, PG_U_UNASSIGNED},
+	{0x000a4b, 0x000a4d, PG_U_NON_SPACING_MARK},
+	{0x000a4e, 0x000a50, PG_U_UNASSIGNED},
+	{0x000a51, 0x000a51, PG_U_NON_SPACING_MARK},
+	{0x000a52, 0x000a58, PG_U_UNASSIGNED},
+	{0x000a59, 0x000a5c, PG_U_OTHER_LETTER},
+	{0x000a5d, 0x000a5d, PG_U_UNASSIGNED},
+	{0x000a5e, 0x000a5e, PG_U_OTHER_LETTER},
+	{0x000a5f, 0x000a65, PG_U_UNASSIGNED},
+	{0x000a66, 0x000a6f, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000a70, 0x000a71, PG_U_NON_SPACING_MARK},
+	{0x000a72, 0x000a74, PG_U_OTHER_LETTER},
+	{0x000a75, 0x000a75, PG_U_NON_SPACING_MARK},
+	{0x000a76, 0x000a76, PG_U_OTHER_PUNCTUATION},
+	{0x000a77, 0x000a80, PG_U_UNASSIGNED},
+	{0x000a81, 0x000a82, PG_U_NON_SPACING_MARK},
+	{0x000a83, 0x000a83, PG_U_COMBINING_SPACING_MARK},
+	{0x000a84, 0x000a84, PG_U_UNASSIGNED},
+	{0x000a85, 0x000a8d, PG_U_OTHER_LETTER},
+	{0x000a8e, 0x000a8e, PG_U_UNASSIGNED},
+	{0x000a8f, 0x000a91, PG_U_OTHER_LETTER},
+	{0x000a92, 0x000a92, PG_U_UNASSIGNED},
+	{0x000a93, 0x000aa8, PG_U_OTHER_LETTER},
+	{0x000aa9, 0x000aa9, PG_U_UNASSIGNED},
+	{0x000aaa, 0x000ab0, PG_U_OTHER_LETTER},
+	{0x000ab1, 0x000ab1, PG_U_UNASSIGNED},
+	{0x000ab2, 0x000ab3, PG_U_OTHER_LETTER},
+	{0x000ab4, 0x000ab4, PG_U_UNASSIGNED},
+	{0x000ab5, 0x000ab9, PG_U_OTHER_LETTER},
+	{0x000aba, 0x000abb, PG_U_UNASSIGNED},
+	{0x000abc, 0x000abc, PG_U_NON_SPACING_MARK},
+	{0x000abd, 0x000abd, PG_U_OTHER_LETTER},
+	{0x000abe, 0x000ac0, PG_U_COMBINING_SPACING_MARK},
+	{0x000ac1, 0x000ac5, PG_U_NON_SPACING_MARK},
+	{0x000ac6, 0x000ac6, PG_U_UNASSIGNED},
+	{0x000ac7, 0x000ac8, PG_U_NON_SPACING_MARK},
+	{0x000ac9, 0x000ac9, PG_U_COMBINING_SPACING_MARK},
+	{0x000aca, 0x000aca, PG_U_UNASSIGNED},
+	{0x000acb, 0x000acc, PG_U_COMBINING_SPACING_MARK},
+	{0x000acd, 0x000acd, PG_U_NON_SPACING_MARK},
+	{0x000ace, 0x000acf, PG_U_UNASSIGNED},
+	{0x000ad0, 0x000ad0, PG_U_OTHER_LETTER},
+	{0x000ad1, 0x000adf, PG_U_UNASSIGNED},
+	{0x000ae0, 0x000ae1, PG_U_OTHER_LETTER},
+	{0x000ae2, 0x000ae3, PG_U_NON_SPACING_MARK},
+	{0x000ae4, 0x000ae5, PG_U_UNASSIGNED},
+	{0x000ae6, 0x000aef, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000af0, 0x000af0, PG_U_OTHER_PUNCTUATION},
+	{0x000af1, 0x000af1, PG_U_CURRENCY_SYMBOL},
+	{0x000af2, 0x000af8, PG_U_UNASSIGNED},
+	{0x000af9, 0x000af9, PG_U_OTHER_LETTER},
+	{0x000afa, 0x000aff, PG_U_NON_SPACING_MARK},
+	{0x000b00, 0x000b00, PG_U_UNASSIGNED},
+	{0x000b01, 0x000b01, PG_U_NON_SPACING_MARK},
+	{0x000b02, 0x000b03, PG_U_COMBINING_SPACING_MARK},
+	{0x000b04, 0x000b04, PG_U_UNASSIGNED},
+	{0x000b05, 0x000b0c, PG_U_OTHER_LETTER},
+	{0x000b0d, 0x000b0e, PG_U_UNASSIGNED},
+	{0x000b0f, 0x000b10, PG_U_OTHER_LETTER},
+	{0x000b11, 0x000b12, PG_U_UNASSIGNED},
+	{0x000b13, 0x000b28, PG_U_OTHER_LETTER},
+	{0x000b29, 0x000b29, PG_U_UNASSIGNED},
+	{0x000b2a, 0x000b30, PG_U_OTHER_LETTER},
+	{0x000b31, 0x000b31, PG_U_UNASSIGNED},
+	{0x000b32, 0x000b33, PG_U_OTHER_LETTER},
+	{0x000b34, 0x000b34, PG_U_UNASSIGNED},
+	{0x000b35, 0x000b39, PG_U_OTHER_LETTER},
+	{0x000b3a, 0x000b3b, PG_U_UNASSIGNED},
+	{0x000b3c, 0x000b3c, PG_U_NON_SPACING_MARK},
+	{0x000b3d, 0x000b3d, PG_U_OTHER_LETTER},
+	{0x000b3e, 0x000b3e, PG_U_COMBINING_SPACING_MARK},
+	{0x000b3f, 0x000b3f, PG_U_NON_SPACING_MARK},
+	{0x000b40, 0x000b40, PG_U_COMBINING_SPACING_MARK},
+	{0x000b41, 0x000b44, PG_U_NON_SPACING_MARK},
+	{0x000b45, 0x000b46, PG_U_UNASSIGNED},
+	{0x000b47, 0x000b48, PG_U_COMBINING_SPACING_MARK},
+	{0x000b49, 0x000b4a, PG_U_UNASSIGNED},
+	{0x000b4b, 0x000b4c, PG_U_COMBINING_SPACING_MARK},
+	{0x000b4d, 0x000b4d, PG_U_NON_SPACING_MARK},
+	{0x000b4e, 0x000b54, PG_U_UNASSIGNED},
+	{0x000b55, 0x000b56, PG_U_NON_SPACING_MARK},
+	{0x000b57, 0x000b57, PG_U_COMBINING_SPACING_MARK},
+	{0x000b58, 0x000b5b, PG_U_UNASSIGNED},
+	{0x000b5c, 0x000b5d, PG_U_OTHER_LETTER},
+	{0x000b5e, 0x000b5e, PG_U_UNASSIGNED},
+	{0x000b5f, 0x000b61, PG_U_OTHER_LETTER},
+	{0x000b62, 0x000b63, PG_U_NON_SPACING_MARK},
+	{0x000b64, 0x000b65, PG_U_UNASSIGNED},
+	{0x000b66, 0x000b6f, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000b70, 0x000b70, PG_U_OTHER_SYMBOL},
+	{0x000b71, 0x000b71, PG_U_OTHER_LETTER},
+	{0x000b72, 0x000b77, PG_U_OTHER_NUMBER},
+	{0x000b78, 0x000b81, PG_U_UNASSIGNED},
+	{0x000b82, 0x000b82, PG_U_NON_SPACING_MARK},
+	{0x000b83, 0x000b83, PG_U_OTHER_LETTER},
+	{0x000b84, 0x000b84, PG_U_UNASSIGNED},
+	{0x000b85, 0x000b8a, PG_U_OTHER_LETTER},
+	{0x000b8b, 0x000b8d, PG_U_UNASSIGNED},
+	{0x000b8e, 0x000b90, PG_U_OTHER_LETTER},
+	{0x000b91, 0x000b91, PG_U_UNASSIGNED},
+	{0x000b92, 0x000b95, PG_U_OTHER_LETTER},
+	{0x000b96, 0x000b98, PG_U_UNASSIGNED},
+	{0x000b99, 0x000b9a, PG_U_OTHER_LETTER},
+	{0x000b9b, 0x000b9b, PG_U_UNASSIGNED},
+	{0x000b9c, 0x000b9c, PG_U_OTHER_LETTER},
+	{0x000b9d, 0x000b9d, PG_U_UNASSIGNED},
+	{0x000b9e, 0x000b9f, PG_U_OTHER_LETTER},
+	{0x000ba0, 0x000ba2, PG_U_UNASSIGNED},
+	{0x000ba3, 0x000ba4, PG_U_OTHER_LETTER},
+	{0x000ba5, 0x000ba7, PG_U_UNASSIGNED},
+	{0x000ba8, 0x000baa, PG_U_OTHER_LETTER},
+	{0x000bab, 0x000bad, PG_U_UNASSIGNED},
+	{0x000bae, 0x000bb9, PG_U_OTHER_LETTER},
+	{0x000bba, 0x000bbd, PG_U_UNASSIGNED},
+	{0x000bbe, 0x000bbf, PG_U_COMBINING_SPACING_MARK},
+	{0x000bc0, 0x000bc0, PG_U_NON_SPACING_MARK},
+	{0x000bc1, 0x000bc2, PG_U_COMBINING_SPACING_MARK},
+	{0x000bc3, 0x000bc5, PG_U_UNASSIGNED},
+	{0x000bc6, 0x000bc8, PG_U_COMBINING_SPACING_MARK},
+	{0x000bc9, 0x000bc9, PG_U_UNASSIGNED},
+	{0x000bca, 0x000bcc, PG_U_COMBINING_SPACING_MARK},
+	{0x000bcd, 0x000bcd, PG_U_NON_SPACING_MARK},
+	{0x000bce, 0x000bcf, PG_U_UNASSIGNED},
+	{0x000bd0, 0x000bd0, PG_U_OTHER_LETTER},
+	{0x000bd1, 0x000bd6, PG_U_UNASSIGNED},
+	{0x000bd7, 0x000bd7, PG_U_COMBINING_SPACING_MARK},
+	{0x000bd8, 0x000be5, PG_U_UNASSIGNED},
+	{0x000be6, 0x000bef, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000bf0, 0x000bf2, PG_U_OTHER_NUMBER},
+	{0x000bf3, 0x000bf8, PG_U_OTHER_SYMBOL},
+	{0x000bf9, 0x000bf9, PG_U_CURRENCY_SYMBOL},
+	{0x000bfa, 0x000bfa, PG_U_OTHER_SYMBOL},
+	{0x000bfb, 0x000bff, PG_U_UNASSIGNED},
+	{0x000c00, 0x000c00, PG_U_NON_SPACING_MARK},
+	{0x000c01, 0x000c03, PG_U_COMBINING_SPACING_MARK},
+	{0x000c04, 0x000c04, PG_U_NON_SPACING_MARK},
+	{0x000c05, 0x000c0c, PG_U_OTHER_LETTER},
+	{0x000c0d, 0x000c0d, PG_U_UNASSIGNED},
+	{0x000c0e, 0x000c10, PG_U_OTHER_LETTER},
+	{0x000c11, 0x000c11, PG_U_UNASSIGNED},
+	{0x000c12, 0x000c28, PG_U_OTHER_LETTER},
+	{0x000c29, 0x000c29, PG_U_UNASSIGNED},
+	{0x000c2a, 0x000c39, PG_U_OTHER_LETTER},
+	{0x000c3a, 0x000c3b, PG_U_UNASSIGNED},
+	{0x000c3c, 0x000c3c, PG_U_NON_SPACING_MARK},
+	{0x000c3d, 0x000c3d, PG_U_OTHER_LETTER},
+	{0x000c3e, 0x000c40, PG_U_NON_SPACING_MARK},
+	{0x000c41, 0x000c44, PG_U_COMBINING_SPACING_MARK},
+	{0x000c45, 0x000c45, PG_U_UNASSIGNED},
+	{0x000c46, 0x000c48, PG_U_NON_SPACING_MARK},
+	{0x000c49, 0x000c49, PG_U_UNASSIGNED},
+	{0x000c4a, 0x000c4d, PG_U_NON_SPACING_MARK},
+	{0x000c4e, 0x000c54, PG_U_UNASSIGNED},
+	{0x000c55, 0x000c56, PG_U_NON_SPACING_MARK},
+	{0x000c57, 0x000c57, PG_U_UNASSIGNED},
+	{0x000c58, 0x000c5a, PG_U_OTHER_LETTER},
+	{0x000c5b, 0x000c5c, PG_U_UNASSIGNED},
+	{0x000c5d, 0x000c5d, PG_U_OTHER_LETTER},
+	{0x000c5e, 0x000c5f, PG_U_UNASSIGNED},
+	{0x000c60, 0x000c61, PG_U_OTHER_LETTER},
+	{0x000c62, 0x000c63, PG_U_NON_SPACING_MARK},
+	{0x000c64, 0x000c65, PG_U_UNASSIGNED},
+	{0x000c66, 0x000c6f, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000c70, 0x000c76, PG_U_UNASSIGNED},
+	{0x000c77, 0x000c77, PG_U_OTHER_PUNCTUATION},
+	{0x000c78, 0x000c7e, PG_U_OTHER_NUMBER},
+	{0x000c7f, 0x000c7f, PG_U_OTHER_SYMBOL},
+	{0x000c80, 0x000c80, PG_U_OTHER_LETTER},
+	{0x000c81, 0x000c81, PG_U_NON_SPACING_MARK},
+	{0x000c82, 0x000c83, PG_U_COMBINING_SPACING_MARK},
+	{0x000c84, 0x000c84, PG_U_OTHER_PUNCTUATION},
+	{0x000c85, 0x000c8c, PG_U_OTHER_LETTER},
+	{0x000c8d, 0x000c8d, PG_U_UNASSIGNED},
+	{0x000c8e, 0x000c90, PG_U_OTHER_LETTER},
+	{0x000c91, 0x000c91, PG_U_UNASSIGNED},
+	{0x000c92, 0x000ca8, PG_U_OTHER_LETTER},
+	{0x000ca9, 0x000ca9, PG_U_UNASSIGNED},
+	{0x000caa, 0x000cb3, PG_U_OTHER_LETTER},
+	{0x000cb4, 0x000cb4, PG_U_UNASSIGNED},
+	{0x000cb5, 0x000cb9, PG_U_OTHER_LETTER},
+	{0x000cba, 0x000cbb, PG_U_UNASSIGNED},
+	{0x000cbc, 0x000cbc, PG_U_NON_SPACING_MARK},
+	{0x000cbd, 0x000cbd, PG_U_OTHER_LETTER},
+	{0x000cbe, 0x000cbe, PG_U_COMBINING_SPACING_MARK},
+	{0x000cbf, 0x000cbf, PG_U_NON_SPACING_MARK},
+	{0x000cc0, 0x000cc4, PG_U_COMBINING_SPACING_MARK},
+	{0x000cc5, 0x000cc5, PG_U_UNASSIGNED},
+	{0x000cc6, 0x000cc6, PG_U_NON_SPACING_MARK},
+	{0x000cc7, 0x000cc8, PG_U_COMBINING_SPACING_MARK},
+	{0x000cc9, 0x000cc9, PG_U_UNASSIGNED},
+	{0x000cca, 0x000ccb, PG_U_COMBINING_SPACING_MARK},
+	{0x000ccc, 0x000ccd, PG_U_NON_SPACING_MARK},
+	{0x000cce, 0x000cd4, PG_U_UNASSIGNED},
+	{0x000cd5, 0x000cd6, PG_U_COMBINING_SPACING_MARK},
+	{0x000cd7, 0x000cdc, PG_U_UNASSIGNED},
+	{0x000cdd, 0x000cde, PG_U_OTHER_LETTER},
+	{0x000cdf, 0x000cdf, PG_U_UNASSIGNED},
+	{0x000ce0, 0x000ce1, PG_U_OTHER_LETTER},
+	{0x000ce2, 0x000ce3, PG_U_NON_SPACING_MARK},
+	{0x000ce4, 0x000ce5, PG_U_UNASSIGNED},
+	{0x000ce6, 0x000cef, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000cf0, 0x000cf0, PG_U_UNASSIGNED},
+	{0x000cf1, 0x000cf2, PG_U_OTHER_LETTER},
+	{0x000cf3, 0x000cf3, PG_U_COMBINING_SPACING_MARK},
+	{0x000cf4, 0x000cff, PG_U_UNASSIGNED},
+	{0x000d00, 0x000d01, PG_U_NON_SPACING_MARK},
+	{0x000d02, 0x000d03, PG_U_COMBINING_SPACING_MARK},
+	{0x000d04, 0x000d0c, PG_U_OTHER_LETTER},
+	{0x000d0d, 0x000d0d, PG_U_UNASSIGNED},
+	{0x000d0e, 0x000d10, PG_U_OTHER_LETTER},
+	{0x000d11, 0x000d11, PG_U_UNASSIGNED},
+	{0x000d12, 0x000d3a, PG_U_OTHER_LETTER},
+	{0x000d3b, 0x000d3c, PG_U_NON_SPACING_MARK},
+	{0x000d3d, 0x000d3d, PG_U_OTHER_LETTER},
+	{0x000d3e, 0x000d40, PG_U_COMBINING_SPACING_MARK},
+	{0x000d41, 0x000d44, PG_U_NON_SPACING_MARK},
+	{0x000d45, 0x000d45, PG_U_UNASSIGNED},
+	{0x000d46, 0x000d48, PG_U_COMBINING_SPACING_MARK},
+	{0x000d49, 0x000d49, PG_U_UNASSIGNED},
+	{0x000d4a, 0x000d4c, PG_U_COMBINING_SPACING_MARK},
+	{0x000d4d, 0x000d4d, PG_U_NON_SPACING_MARK},
+	{0x000d4e, 0x000d4e, PG_U_OTHER_LETTER},
+	{0x000d4f, 0x000d4f, PG_U_OTHER_SYMBOL},
+	{0x000d50, 0x000d53, PG_U_UNASSIGNED},
+	{0x000d54, 0x000d56, PG_U_OTHER_LETTER},
+	{0x000d57, 0x000d57, PG_U_COMBINING_SPACING_MARK},
+	{0x000d58, 0x000d5e, PG_U_OTHER_NUMBER},
+	{0x000d5f, 0x000d61, PG_U_OTHER_LETTER},
+	{0x000d62, 0x000d63, PG_U_NON_SPACING_MARK},
+	{0x000d64, 0x000d65, PG_U_UNASSIGNED},
+	{0x000d66, 0x000d6f, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000d70, 0x000d78, PG_U_OTHER_NUMBER},
+	{0x000d79, 0x000d79, PG_U_OTHER_SYMBOL},
+	{0x000d7a, 0x000d7f, PG_U_OTHER_LETTER},
+	{0x000d80, 0x000d80, PG_U_UNASSIGNED},
+	{0x000d81, 0x000d81, PG_U_NON_SPACING_MARK},
+	{0x000d82, 0x000d83, PG_U_COMBINING_SPACING_MARK},
+	{0x000d84, 0x000d84, PG_U_UNASSIGNED},
+	{0x000d85, 0x000d96, PG_U_OTHER_LETTER},
+	{0x000d97, 0x000d99, PG_U_UNASSIGNED},
+	{0x000d9a, 0x000db1, PG_U_OTHER_LETTER},
+	{0x000db2, 0x000db2, PG_U_UNASSIGNED},
+	{0x000db3, 0x000dbb, PG_U_OTHER_LETTER},
+	{0x000dbc, 0x000dbc, PG_U_UNASSIGNED},
+	{0x000dbd, 0x000dbd, PG_U_OTHER_LETTER},
+	{0x000dbe, 0x000dbf, PG_U_UNASSIGNED},
+	{0x000dc0, 0x000dc6, PG_U_OTHER_LETTER},
+	{0x000dc7, 0x000dc9, PG_U_UNASSIGNED},
+	{0x000dca, 0x000dca, PG_U_NON_SPACING_MARK},
+	{0x000dcb, 0x000dce, PG_U_UNASSIGNED},
+	{0x000dcf, 0x000dd1, PG_U_COMBINING_SPACING_MARK},
+	{0x000dd2, 0x000dd4, PG_U_NON_SPACING_MARK},
+	{0x000dd5, 0x000dd5, PG_U_UNASSIGNED},
+	{0x000dd6, 0x000dd6, PG_U_NON_SPACING_MARK},
+	{0x000dd7, 0x000dd7, PG_U_UNASSIGNED},
+	{0x000dd8, 0x000ddf, PG_U_COMBINING_SPACING_MARK},
+	{0x000de0, 0x000de5, PG_U_UNASSIGNED},
+	{0x000de6, 0x000def, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000df0, 0x000df1, PG_U_UNASSIGNED},
+	{0x000df2, 0x000df3, PG_U_COMBINING_SPACING_MARK},
+	{0x000df4, 0x000df4, PG_U_OTHER_PUNCTUATION},
+	{0x000df5, 0x000e00, PG_U_UNASSIGNED},
+	{0x000e01, 0x000e30, PG_U_OTHER_LETTER},
+	{0x000e31, 0x000e31, PG_U_NON_SPACING_MARK},
+	{0x000e32, 0x000e33, PG_U_OTHER_LETTER},
+	{0x000e34, 0x000e3a, PG_U_NON_SPACING_MARK},
+	{0x000e3b, 0x000e3e, PG_U_UNASSIGNED},
+	{0x000e3f, 0x000e3f, PG_U_CURRENCY_SYMBOL},
+	{0x000e40, 0x000e45, PG_U_OTHER_LETTER},
+	{0x000e46, 0x000e46, PG_U_MODIFIER_LETTER},
+	{0x000e47, 0x000e4e, PG_U_NON_SPACING_MARK},
+	{0x000e4f, 0x000e4f, PG_U_OTHER_PUNCTUATION},
+	{0x000e50, 0x000e59, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000e5a, 0x000e5b, PG_U_OTHER_PUNCTUATION},
+	{0x000e5c, 0x000e80, PG_U_UNASSIGNED},
+	{0x000e81, 0x000e82, PG_U_OTHER_LETTER},
+	{0x000e83, 0x000e83, PG_U_UNASSIGNED},
+	{0x000e84, 0x000e84, PG_U_OTHER_LETTER},
+	{0x000e85, 0x000e85, PG_U_UNASSIGNED},
+	{0x000e86, 0x000e8a, PG_U_OTHER_LETTER},
+	{0x000e8b, 0x000e8b, PG_U_UNASSIGNED},
+	{0x000e8c, 0x000ea3, PG_U_OTHER_LETTER},
+	{0x000ea4, 0x000ea4, PG_U_UNASSIGNED},
+	{0x000ea5, 0x000ea5, PG_U_OTHER_LETTER},
+	{0x000ea6, 0x000ea6, PG_U_UNASSIGNED},
+	{0x000ea7, 0x000eb0, PG_U_OTHER_LETTER},
+	{0x000eb1, 0x000eb1, PG_U_NON_SPACING_MARK},
+	{0x000eb2, 0x000eb3, PG_U_OTHER_LETTER},
+	{0x000eb4, 0x000ebc, PG_U_NON_SPACING_MARK},
+	{0x000ebd, 0x000ebd, PG_U_OTHER_LETTER},
+	{0x000ebe, 0x000ebf, PG_U_UNASSIGNED},
+	{0x000ec0, 0x000ec4, PG_U_OTHER_LETTER},
+	{0x000ec5, 0x000ec5, PG_U_UNASSIGNED},
+	{0x000ec6, 0x000ec6, PG_U_MODIFIER_LETTER},
+	{0x000ec7, 0x000ec7, PG_U_UNASSIGNED},
+	{0x000ec8, 0x000ece, PG_U_NON_SPACING_MARK},
+	{0x000ecf, 0x000ecf, PG_U_UNASSIGNED},
+	{0x000ed0, 0x000ed9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000eda, 0x000edb, PG_U_UNASSIGNED},
+	{0x000edc, 0x000edf, PG_U_OTHER_LETTER},
+	{0x000ee0, 0x000eff, PG_U_UNASSIGNED},
+	{0x000f00, 0x000f00, PG_U_OTHER_LETTER},
+	{0x000f01, 0x000f03, PG_U_OTHER_SYMBOL},
+	{0x000f04, 0x000f12, PG_U_OTHER_PUNCTUATION},
+	{0x000f13, 0x000f13, PG_U_OTHER_SYMBOL},
+	{0x000f14, 0x000f14, PG_U_OTHER_PUNCTUATION},
+	{0x000f15, 0x000f17, PG_U_OTHER_SYMBOL},
+	{0x000f18, 0x000f19, PG_U_NON_SPACING_MARK},
+	{0x000f1a, 0x000f1f, PG_U_OTHER_SYMBOL},
+	{0x000f20, 0x000f29, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000f2a, 0x000f33, PG_U_OTHER_NUMBER},
+	{0x000f34, 0x000f34, PG_U_OTHER_SYMBOL},
+	{0x000f35, 0x000f35, PG_U_NON_SPACING_MARK},
+	{0x000f36, 0x000f36, PG_U_OTHER_SYMBOL},
+	{0x000f37, 0x000f37, PG_U_NON_SPACING_MARK},
+	{0x000f38, 0x000f38, PG_U_OTHER_SYMBOL},
+	{0x000f39, 0x000f39, PG_U_NON_SPACING_MARK},
+	{0x000f3a, 0x000f3a, PG_U_START_PUNCTUATION},
+	{0x000f3b, 0x000f3b, PG_U_END_PUNCTUATION},
+	{0x000f3c, 0x000f3c, PG_U_START_PUNCTUATION},
+	{0x000f3d, 0x000f3d, PG_U_END_PUNCTUATION},
+	{0x000f3e, 0x000f3f, PG_U_COMBINING_SPACING_MARK},
+	{0x000f40, 0x000f47, PG_U_OTHER_LETTER},
+	{0x000f48, 0x000f48, PG_U_UNASSIGNED},
+	{0x000f49, 0x000f6c, PG_U_OTHER_LETTER},
+	{0x000f6d, 0x000f70, PG_U_UNASSIGNED},
+	{0x000f71, 0x000f7e, PG_U_NON_SPACING_MARK},
+	{0x000f7f, 0x000f7f, PG_U_COMBINING_SPACING_MARK},
+	{0x000f80, 0x000f84, PG_U_NON_SPACING_MARK},
+	{0x000f85, 0x000f85, PG_U_OTHER_PUNCTUATION},
+	{0x000f86, 0x000f87, PG_U_NON_SPACING_MARK},
+	{0x000f88, 0x000f8c, PG_U_OTHER_LETTER},
+	{0x000f8d, 0x000f97, PG_U_NON_SPACING_MARK},
+	{0x000f98, 0x000f98, PG_U_UNASSIGNED},
+	{0x000f99, 0x000fbc, PG_U_NON_SPACING_MARK},
+	{0x000fbd, 0x000fbd, PG_U_UNASSIGNED},
+	{0x000fbe, 0x000fc5, PG_U_OTHER_SYMBOL},
+	{0x000fc6, 0x000fc6, PG_U_NON_SPACING_MARK},
+	{0x000fc7, 0x000fcc, PG_U_OTHER_SYMBOL},
+	{0x000fcd, 0x000fcd, PG_U_UNASSIGNED},
+	{0x000fce, 0x000fcf, PG_U_OTHER_SYMBOL},
+	{0x000fd0, 0x000fd4, PG_U_OTHER_PUNCTUATION},
+	{0x000fd5, 0x000fd8, PG_U_OTHER_SYMBOL},
+	{0x000fd9, 0x000fda, PG_U_OTHER_PUNCTUATION},
+	{0x000fdb, 0x000fff, PG_U_UNASSIGNED},
+	{0x001000, 0x00102a, PG_U_OTHER_LETTER},
+	{0x00102b, 0x00102c, PG_U_COMBINING_SPACING_MARK},
+	{0x00102d, 0x001030, PG_U_NON_SPACING_MARK},
+	{0x001031, 0x001031, PG_U_COMBINING_SPACING_MARK},
+	{0x001032, 0x001037, PG_U_NON_SPACING_MARK},
+	{0x001038, 0x001038, PG_U_COMBINING_SPACING_MARK},
+	{0x001039, 0x00103a, PG_U_NON_SPACING_MARK},
+	{0x00103b, 0x00103c, PG_U_COMBINING_SPACING_MARK},
+	{0x00103d, 0x00103e, PG_U_NON_SPACING_MARK},
+	{0x00103f, 0x00103f, PG_U_OTHER_LETTER},
+	{0x001040, 0x001049, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00104a, 0x00104f, PG_U_OTHER_PUNCTUATION},
+	{0x001050, 0x001055, PG_U_OTHER_LETTER},
+	{0x001056, 0x001057, PG_U_COMBINING_SPACING_MARK},
+	{0x001058, 0x001059, PG_U_NON_SPACING_MARK},
+	{0x00105a, 0x00105d, PG_U_OTHER_LETTER},
+	{0x00105e, 0x001060, PG_U_NON_SPACING_MARK},
+	{0x001061, 0x001061, PG_U_OTHER_LETTER},
+	{0x001062, 0x001064, PG_U_COMBINING_SPACING_MARK},
+	{0x001065, 0x001066, PG_U_OTHER_LETTER},
+	{0x001067, 0x00106d, PG_U_COMBINING_SPACING_MARK},
+	{0x00106e, 0x001070, PG_U_OTHER_LETTER},
+	{0x001071, 0x001074, PG_U_NON_SPACING_MARK},
+	{0x001075, 0x001081, PG_U_OTHER_LETTER},
+	{0x001082, 0x001082, PG_U_NON_SPACING_MARK},
+	{0x001083, 0x001084, PG_U_COMBINING_SPACING_MARK},
+	{0x001085, 0x001086, PG_U_NON_SPACING_MARK},
+	{0x001087, 0x00108c, PG_U_COMBINING_SPACING_MARK},
+	{0x00108d, 0x00108d, PG_U_NON_SPACING_MARK},
+	{0x00108e, 0x00108e, PG_U_OTHER_LETTER},
+	{0x00108f, 0x00108f, PG_U_COMBINING_SPACING_MARK},
+	{0x001090, 0x001099, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00109a, 0x00109c, PG_U_COMBINING_SPACING_MARK},
+	{0x00109d, 0x00109d, PG_U_NON_SPACING_MARK},
+	{0x00109e, 0x00109f, PG_U_OTHER_SYMBOL},
+	{0x0010a0, 0x0010c5, PG_U_UPPERCASE_LETTER},
+	{0x0010c6, 0x0010c6, PG_U_UNASSIGNED},
+	{0x0010c7, 0x0010c7, PG_U_UPPERCASE_LETTER},
+	{0x0010c8, 0x0010cc, PG_U_UNASSIGNED},
+	{0x0010cd, 0x0010cd, PG_U_UPPERCASE_LETTER},
+	{0x0010ce, 0x0010cf, PG_U_UNASSIGNED},
+	{0x0010d0, 0x0010fa, PG_U_LOWERCASE_LETTER},
+	{0x0010fb, 0x0010fb, PG_U_OTHER_PUNCTUATION},
+	{0x0010fc, 0x0010fc, PG_U_MODIFIER_LETTER},
+	{0x0010fd, 0x0010ff, PG_U_LOWERCASE_LETTER},
+	{0x001100, 0x001248, PG_U_OTHER_LETTER},
+	{0x001249, 0x001249, PG_U_UNASSIGNED},
+	{0x00124a, 0x00124d, PG_U_OTHER_LETTER},
+	{0x00124e, 0x00124f, PG_U_UNASSIGNED},
+	{0x001250, 0x001256, PG_U_OTHER_LETTER},
+	{0x001257, 0x001257, PG_U_UNASSIGNED},
+	{0x001258, 0x001258, PG_U_OTHER_LETTER},
+	{0x001259, 0x001259, PG_U_UNASSIGNED},
+	{0x00125a, 0x00125d, PG_U_OTHER_LETTER},
+	{0x00125e, 0x00125f, PG_U_UNASSIGNED},
+	{0x001260, 0x001288, PG_U_OTHER_LETTER},
+	{0x001289, 0x001289, PG_U_UNASSIGNED},
+	{0x00128a, 0x00128d, PG_U_OTHER_LETTER},
+	{0x00128e, 0x00128f, PG_U_UNASSIGNED},
+	{0x001290, 0x0012b0, PG_U_OTHER_LETTER},
+	{0x0012b1, 0x0012b1, PG_U_UNASSIGNED},
+	{0x0012b2, 0x0012b5, PG_U_OTHER_LETTER},
+	{0x0012b6, 0x0012b7, PG_U_UNASSIGNED},
+	{0x0012b8, 0x0012be, PG_U_OTHER_LETTER},
+	{0x0012bf, 0x0012bf, PG_U_UNASSIGNED},
+	{0x0012c0, 0x0012c0, PG_U_OTHER_LETTER},
+	{0x0012c1, 0x0012c1, PG_U_UNASSIGNED},
+	{0x0012c2, 0x0012c5, PG_U_OTHER_LETTER},
+	{0x0012c6, 0x0012c7, PG_U_UNASSIGNED},
+	{0x0012c8, 0x0012d6, PG_U_OTHER_LETTER},
+	{0x0012d7, 0x0012d7, PG_U_UNASSIGNED},
+	{0x0012d8, 0x001310, PG_U_OTHER_LETTER},
+	{0x001311, 0x001311, PG_U_UNASSIGNED},
+	{0x001312, 0x001315, PG_U_OTHER_LETTER},
+	{0x001316, 0x001317, PG_U_UNASSIGNED},
+	{0x001318, 0x00135a, PG_U_OTHER_LETTER},
+	{0x00135b, 0x00135c, PG_U_UNASSIGNED},
+	{0x00135d, 0x00135f, PG_U_NON_SPACING_MARK},
+	{0x001360, 0x001368, PG_U_OTHER_PUNCTUATION},
+	{0x001369, 0x00137c, PG_U_OTHER_NUMBER},
+	{0x00137d, 0x00137f, PG_U_UNASSIGNED},
+	{0x001380, 0x00138f, PG_U_OTHER_LETTER},
+	{0x001390, 0x001399, PG_U_OTHER_SYMBOL},
+	{0x00139a, 0x00139f, PG_U_UNASSIGNED},
+	{0x0013a0, 0x0013f5, PG_U_UPPERCASE_LETTER},
+	{0x0013f6, 0x0013f7, PG_U_UNASSIGNED},
+	{0x0013f8, 0x0013fd, PG_U_LOWERCASE_LETTER},
+	{0x0013fe, 0x0013ff, PG_U_UNASSIGNED},
+	{0x001400, 0x001400, PG_U_DASH_PUNCTUATION},
+	{0x001401, 0x00166c, PG_U_OTHER_LETTER},
+	{0x00166d, 0x00166d, PG_U_OTHER_SYMBOL},
+	{0x00166e, 0x00166e, PG_U_OTHER_PUNCTUATION},
+	{0x00166f, 0x00167f, PG_U_OTHER_LETTER},
+	{0x001680, 0x001680, PG_U_SPACE_SEPARATOR},
+	{0x001681, 0x00169a, PG_U_OTHER_LETTER},
+	{0x00169b, 0x00169b, PG_U_START_PUNCTUATION},
+	{0x00169c, 0x00169c, PG_U_END_PUNCTUATION},
+	{0x00169d, 0x00169f, PG_U_UNASSIGNED},
+	{0x0016a0, 0x0016ea, PG_U_OTHER_LETTER},
+	{0x0016eb, 0x0016ed, PG_U_OTHER_PUNCTUATION},
+	{0x0016ee, 0x0016f0, PG_U_LETTER_NUMBER},
+	{0x0016f1, 0x0016f8, PG_U_OTHER_LETTER},
+	{0x0016f9, 0x0016ff, PG_U_UNASSIGNED},
+	{0x001700, 0x001711, PG_U_OTHER_LETTER},
+	{0x001712, 0x001714, PG_U_NON_SPACING_MARK},
+	{0x001715, 0x001715, PG_U_COMBINING_SPACING_MARK},
+	{0x001716, 0x00171e, PG_U_UNASSIGNED},
+	{0x00171f, 0x001731, PG_U_OTHER_LETTER},
+	{0x001732, 0x001733, PG_U_NON_SPACING_MARK},
+	{0x001734, 0x001734, PG_U_COMBINING_SPACING_MARK},
+	{0x001735, 0x001736, PG_U_OTHER_PUNCTUATION},
+	{0x001737, 0x00173f, PG_U_UNASSIGNED},
+	{0x001740, 0x001751, PG_U_OTHER_LETTER},
+	{0x001752, 0x001753, PG_U_NON_SPACING_MARK},
+	{0x001754, 0x00175f, PG_U_UNASSIGNED},
+	{0x001760, 0x00176c, PG_U_OTHER_LETTER},
+	{0x00176d, 0x00176d, PG_U_UNASSIGNED},
+	{0x00176e, 0x001770, PG_U_OTHER_LETTER},
+	{0x001771, 0x001771, PG_U_UNASSIGNED},
+	{0x001772, 0x001773, PG_U_NON_SPACING_MARK},
+	{0x001774, 0x00177f, PG_U_UNASSIGNED},
+	{0x001780, 0x0017b3, PG_U_OTHER_LETTER},
+	{0x0017b4, 0x0017b5, PG_U_NON_SPACING_MARK},
+	{0x0017b6, 0x0017b6, PG_U_COMBINING_SPACING_MARK},
+	{0x0017b7, 0x0017bd, PG_U_NON_SPACING_MARK},
+	{0x0017be, 0x0017c5, PG_U_COMBINING_SPACING_MARK},
+	{0x0017c6, 0x0017c6, PG_U_NON_SPACING_MARK},
+	{0x0017c7, 0x0017c8, PG_U_COMBINING_SPACING_MARK},
+	{0x0017c9, 0x0017d3, PG_U_NON_SPACING_MARK},
+	{0x0017d4, 0x0017d6, PG_U_OTHER_PUNCTUATION},
+	{0x0017d7, 0x0017d7, PG_U_MODIFIER_LETTER},
+	{0x0017d8, 0x0017da, PG_U_OTHER_PUNCTUATION},
+	{0x0017db, 0x0017db, PG_U_CURRENCY_SYMBOL},
+	{0x0017dc, 0x0017dc, PG_U_OTHER_LETTER},
+	{0x0017dd, 0x0017dd, PG_U_NON_SPACING_MARK},
+	{0x0017de, 0x0017df, PG_U_UNASSIGNED},
+	{0x0017e0, 0x0017e9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0017ea, 0x0017ef, PG_U_UNASSIGNED},
+	{0x0017f0, 0x0017f9, PG_U_OTHER_NUMBER},
+	{0x0017fa, 0x0017ff, PG_U_UNASSIGNED},
+	{0x001800, 0x001805, PG_U_OTHER_PUNCTUATION},
+	{0x001806, 0x001806, PG_U_DASH_PUNCTUATION},
+	{0x001807, 0x00180a, PG_U_OTHER_PUNCTUATION},
+	{0x00180b, 0x00180d, PG_U_NON_SPACING_MARK},
+	{0x00180e, 0x00180e, PG_U_FORMAT_CHAR},
+	{0x00180f, 0x00180f, PG_U_NON_SPACING_MARK},
+	{0x001810, 0x001819, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00181a, 0x00181f, PG_U_UNASSIGNED},
+	{0x001820, 0x001842, PG_U_OTHER_LETTER},
+	{0x001843, 0x001843, PG_U_MODIFIER_LETTER},
+	{0x001844, 0x001878, PG_U_OTHER_LETTER},
+	{0x001879, 0x00187f, PG_U_UNASSIGNED},
+	{0x001880, 0x001884, PG_U_OTHER_LETTER},
+	{0x001885, 0x001886, PG_U_NON_SPACING_MARK},
+	{0x001887, 0x0018a8, PG_U_OTHER_LETTER},
+	{0x0018a9, 0x0018a9, PG_U_NON_SPACING_MARK},
+	{0x0018aa, 0x0018aa, PG_U_OTHER_LETTER},
+	{0x0018ab, 0x0018af, PG_U_UNASSIGNED},
+	{0x0018b0, 0x0018f5, PG_U_OTHER_LETTER},
+	{0x0018f6, 0x0018ff, PG_U_UNASSIGNED},
+	{0x001900, 0x00191e, PG_U_OTHER_LETTER},
+	{0x00191f, 0x00191f, PG_U_UNASSIGNED},
+	{0x001920, 0x001922, PG_U_NON_SPACING_MARK},
+	{0x001923, 0x001926, PG_U_COMBINING_SPACING_MARK},
+	{0x001927, 0x001928, PG_U_NON_SPACING_MARK},
+	{0x001929, 0x00192b, PG_U_COMBINING_SPACING_MARK},
+	{0x00192c, 0x00192f, PG_U_UNASSIGNED},
+	{0x001930, 0x001931, PG_U_COMBINING_SPACING_MARK},
+	{0x001932, 0x001932, PG_U_NON_SPACING_MARK},
+	{0x001933, 0x001938, PG_U_COMBINING_SPACING_MARK},
+	{0x001939, 0x00193b, PG_U_NON_SPACING_MARK},
+	{0x00193c, 0x00193f, PG_U_UNASSIGNED},
+	{0x001940, 0x001940, PG_U_OTHER_SYMBOL},
+	{0x001941, 0x001943, PG_U_UNASSIGNED},
+	{0x001944, 0x001945, PG_U_OTHER_PUNCTUATION},
+	{0x001946, 0x00194f, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x001950, 0x00196d, PG_U_OTHER_LETTER},
+	{0x00196e, 0x00196f, PG_U_UNASSIGNED},
+	{0x001970, 0x001974, PG_U_OTHER_LETTER},
+	{0x001975, 0x00197f, PG_U_UNASSIGNED},
+	{0x001980, 0x0019ab, PG_U_OTHER_LETTER},
+	{0x0019ac, 0x0019af, PG_U_UNASSIGNED},
+	{0x0019b0, 0x0019c9, PG_U_OTHER_LETTER},
+	{0x0019ca, 0x0019cf, PG_U_UNASSIGNED},
+	{0x0019d0, 0x0019d9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0019da, 0x0019da, PG_U_OTHER_NUMBER},
+	{0x0019db, 0x0019dd, PG_U_UNASSIGNED},
+	{0x0019de, 0x0019ff, PG_U_OTHER_SYMBOL},
+	{0x001a00, 0x001a16, PG_U_OTHER_LETTER},
+	{0x001a17, 0x001a18, PG_U_NON_SPACING_MARK},
+	{0x001a19, 0x001a1a, PG_U_COMBINING_SPACING_MARK},
+	{0x001a1b, 0x001a1b, PG_U_NON_SPACING_MARK},
+	{0x001a1c, 0x001a1d, PG_U_UNASSIGNED},
+	{0x001a1e, 0x001a1f, PG_U_OTHER_PUNCTUATION},
+	{0x001a20, 0x001a54, PG_U_OTHER_LETTER},
+	{0x001a55, 0x001a55, PG_U_COMBINING_SPACING_MARK},
+	{0x001a56, 0x001a56, PG_U_NON_SPACING_MARK},
+	{0x001a57, 0x001a57, PG_U_COMBINING_SPACING_MARK},
+	{0x001a58, 0x001a5e, PG_U_NON_SPACING_MARK},
+	{0x001a5f, 0x001a5f, PG_U_UNASSIGNED},
+	{0x001a60, 0x001a60, PG_U_NON_SPACING_MARK},
+	{0x001a61, 0x001a61, PG_U_COMBINING_SPACING_MARK},
+	{0x001a62, 0x001a62, PG_U_NON_SPACING_MARK},
+	{0x001a63, 0x001a64, PG_U_COMBINING_SPACING_MARK},
+	{0x001a65, 0x001a6c, PG_U_NON_SPACING_MARK},
+	{0x001a6d, 0x001a72, PG_U_COMBINING_SPACING_MARK},
+	{0x001a73, 0x001a7c, PG_U_NON_SPACING_MARK},
+	{0x001a7d, 0x001a7e, PG_U_UNASSIGNED},
+	{0x001a7f, 0x001a7f, PG_U_NON_SPACING_MARK},
+	{0x001a80, 0x001a89, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x001a8a, 0x001a8f, PG_U_UNASSIGNED},
+	{0x001a90, 0x001a99, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x001a9a, 0x001a9f, PG_U_UNASSIGNED},
+	{0x001aa0, 0x001aa6, PG_U_OTHER_PUNCTUATION},
+	{0x001aa7, 0x001aa7, PG_U_MODIFIER_LETTER},
+	{0x001aa8, 0x001aad, PG_U_OTHER_PUNCTUATION},
+	{0x001aae, 0x001aaf, PG_U_UNASSIGNED},
+	{0x001ab0, 0x001abd, PG_U_NON_SPACING_MARK},
+	{0x001abe, 0x001abe, PG_U_ENCLOSING_MARK},
+	{0x001abf, 0x001ace, PG_U_NON_SPACING_MARK},
+	{0x001acf, 0x001aff, PG_U_UNASSIGNED},
+	{0x001b00, 0x001b03, PG_U_NON_SPACING_MARK},
+	{0x001b04, 0x001b04, PG_U_COMBINING_SPACING_MARK},
+	{0x001b05, 0x001b33, PG_U_OTHER_LETTER},
+	{0x001b34, 0x001b34, PG_U_NON_SPACING_MARK},
+	{0x001b35, 0x001b35, PG_U_COMBINING_SPACING_MARK},
+	{0x001b36, 0x001b3a, PG_U_NON_SPACING_MARK},
+	{0x001b3b, 0x001b3b, PG_U_COMBINING_SPACING_MARK},
+	{0x001b3c, 0x001b3c, PG_U_NON_SPACING_MARK},
+	{0x001b3d, 0x001b41, PG_U_COMBINING_SPACING_MARK},
+	{0x001b42, 0x001b42, PG_U_NON_SPACING_MARK},
+	{0x001b43, 0x001b44, PG_U_COMBINING_SPACING_MARK},
+	{0x001b45, 0x001b4c, PG_U_OTHER_LETTER},
+	{0x001b4d, 0x001b4f, PG_U_UNASSIGNED},
+	{0x001b50, 0x001b59, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x001b5a, 0x001b60, PG_U_OTHER_PUNCTUATION},
+	{0x001b61, 0x001b6a, PG_U_OTHER_SYMBOL},
+	{0x001b6b, 0x001b73, PG_U_NON_SPACING_MARK},
+	{0x001b74, 0x001b7c, PG_U_OTHER_SYMBOL},
+	{0x001b7d, 0x001b7e, PG_U_OTHER_PUNCTUATION},
+	{0x001b7f, 0x001b7f, PG_U_UNASSIGNED},
+	{0x001b80, 0x001b81, PG_U_NON_SPACING_MARK},
+	{0x001b82, 0x001b82, PG_U_COMBINING_SPACING_MARK},
+	{0x001b83, 0x001ba0, PG_U_OTHER_LETTER},
+	{0x001ba1, 0x001ba1, PG_U_COMBINING_SPACING_MARK},
+	{0x001ba2, 0x001ba5, PG_U_NON_SPACING_MARK},
+	{0x001ba6, 0x001ba7, PG_U_COMBINING_SPACING_MARK},
+	{0x001ba8, 0x001ba9, PG_U_NON_SPACING_MARK},
+	{0x001baa, 0x001baa, PG_U_COMBINING_SPACING_MARK},
+	{0x001bab, 0x001bad, PG_U_NON_SPACING_MARK},
+	{0x001bae, 0x001baf, PG_U_OTHER_LETTER},
+	{0x001bb0, 0x001bb9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x001bba, 0x001be5, PG_U_OTHER_LETTER},
+	{0x001be6, 0x001be6, PG_U_NON_SPACING_MARK},
+	{0x001be7, 0x001be7, PG_U_COMBINING_SPACING_MARK},
+	{0x001be8, 0x001be9, PG_U_NON_SPACING_MARK},
+	{0x001bea, 0x001bec, PG_U_COMBINING_SPACING_MARK},
+	{0x001bed, 0x001bed, PG_U_NON_SPACING_MARK},
+	{0x001bee, 0x001bee, PG_U_COMBINING_SPACING_MARK},
+	{0x001bef, 0x001bf1, PG_U_NON_SPACING_MARK},
+	{0x001bf2, 0x001bf3, PG_U_COMBINING_SPACING_MARK},
+	{0x001bf4, 0x001bfb, PG_U_UNASSIGNED},
+	{0x001bfc, 0x001bff, PG_U_OTHER_PUNCTUATION},
+	{0x001c00, 0x001c23, PG_U_OTHER_LETTER},
+	{0x001c24, 0x001c2b, PG_U_COMBINING_SPACING_MARK},
+	{0x001c2c, 0x001c33, PG_U_NON_SPACING_MARK},
+	{0x001c34, 0x001c35, PG_U_COMBINING_SPACING_MARK},
+	{0x001c36, 0x001c37, PG_U_NON_SPACING_MARK},
+	{0x001c38, 0x001c3a, PG_U_UNASSIGNED},
+	{0x001c3b, 0x001c3f, PG_U_OTHER_PUNCTUATION},
+	{0x001c40, 0x001c49, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x001c4a, 0x001c4c, PG_U_UNASSIGNED},
+	{0x001c4d, 0x001c4f, PG_U_OTHER_LETTER},
+	{0x001c50, 0x001c59, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x001c5a, 0x001c77, PG_U_OTHER_LETTER},
+	{0x001c78, 0x001c7d, PG_U_MODIFIER_LETTER},
+	{0x001c7e, 0x001c7f, PG_U_OTHER_PUNCTUATION},
+	{0x001c80, 0x001c88, PG_U_LOWERCASE_LETTER},
+	{0x001c89, 0x001c8f, PG_U_UNASSIGNED},
+	{0x001c90, 0x001cba, PG_U_UPPERCASE_LETTER},
+	{0x001cbb, 0x001cbc, PG_U_UNASSIGNED},
+	{0x001cbd, 0x001cbf, PG_U_UPPERCASE_LETTER},
+	{0x001cc0, 0x001cc7, PG_U_OTHER_PUNCTUATION},
+	{0x001cc8, 0x001ccf, PG_U_UNASSIGNED},
+	{0x001cd0, 0x001cd2, PG_U_NON_SPACING_MARK},
+	{0x001cd3, 0x001cd3, PG_U_OTHER_PUNCTUATION},
+	{0x001cd4, 0x001ce0, PG_U_NON_SPACING_MARK},
+	{0x001ce1, 0x001ce1, PG_U_COMBINING_SPACING_MARK},
+	{0x001ce2, 0x001ce8, PG_U_NON_SPACING_MARK},
+	{0x001ce9, 0x001cec, PG_U_OTHER_LETTER},
+	{0x001ced, 0x001ced, PG_U_NON_SPACING_MARK},
+	{0x001cee, 0x001cf3, PG_U_OTHER_LETTER},
+	{0x001cf4, 0x001cf4, PG_U_NON_SPACING_MARK},
+	{0x001cf5, 0x001cf6, PG_U_OTHER_LETTER},
+	{0x001cf7, 0x001cf7, PG_U_COMBINING_SPACING_MARK},
+	{0x001cf8, 0x001cf9, PG_U_NON_SPACING_MARK},
+	{0x001cfa, 0x001cfa, PG_U_OTHER_LETTER},
+	{0x001cfb, 0x001cff, PG_U_UNASSIGNED},
+	{0x001d00, 0x001d2b, PG_U_LOWERCASE_LETTER},
+	{0x001d2c, 0x001d6a, PG_U_MODIFIER_LETTER},
+	{0x001d6b, 0x001d77, PG_U_LOWERCASE_LETTER},
+	{0x001d78, 0x001d78, PG_U_MODIFIER_LETTER},
+	{0x001d79, 0x001d9a, PG_U_LOWERCASE_LETTER},
+	{0x001d9b, 0x001dbf, PG_U_MODIFIER_LETTER},
+	{0x001dc0, 0x001dff, PG_U_NON_SPACING_MARK},
+	{0x001e00, 0x001e00, PG_U_UPPERCASE_LETTER},
+	{0x001e01, 0x001e01, PG_U_LOWERCASE_LETTER},
+	{0x001e02, 0x001e02, PG_U_UPPERCASE_LETTER},
+	{0x001e03, 0x001e03, PG_U_LOWERCASE_LETTER},
+	{0x001e04, 0x001e04, PG_U_UPPERCASE_LETTER},
+	{0x001e05, 0x001e05, PG_U_LOWERCASE_LETTER},
+	{0x001e06, 0x001e06, PG_U_UPPERCASE_LETTER},
+	{0x001e07, 0x001e07, PG_U_LOWERCASE_LETTER},
+	{0x001e08, 0x001e08, PG_U_UPPERCASE_LETTER},
+	{0x001e09, 0x001e09, PG_U_LOWERCASE_LETTER},
+	{0x001e0a, 0x001e0a, PG_U_UPPERCASE_LETTER},
+	{0x001e0b, 0x001e0b, PG_U_LOWERCASE_LETTER},
+	{0x001e0c, 0x001e0c, PG_U_UPPERCASE_LETTER},
+	{0x001e0d, 0x001e0d, PG_U_LOWERCASE_LETTER},
+	{0x001e0e, 0x001e0e, PG_U_UPPERCASE_LETTER},
+	{0x001e0f, 0x001e0f, PG_U_LOWERCASE_LETTER},
+	{0x001e10, 0x001e10, PG_U_UPPERCASE_LETTER},
+	{0x001e11, 0x001e11, PG_U_LOWERCASE_LETTER},
+	{0x001e12, 0x001e12, PG_U_UPPERCASE_LETTER},
+	{0x001e13, 0x001e13, PG_U_LOWERCASE_LETTER},
+	{0x001e14, 0x001e14, PG_U_UPPERCASE_LETTER},
+	{0x001e15, 0x001e15, PG_U_LOWERCASE_LETTER},
+	{0x001e16, 0x001e16, PG_U_UPPERCASE_LETTER},
+	{0x001e17, 0x001e17, PG_U_LOWERCASE_LETTER},
+	{0x001e18, 0x001e18, PG_U_UPPERCASE_LETTER},
+	{0x001e19, 0x001e19, PG_U_LOWERCASE_LETTER},
+	{0x001e1a, 0x001e1a, PG_U_UPPERCASE_LETTER},
+	{0x001e1b, 0x001e1b, PG_U_LOWERCASE_LETTER},
+	{0x001e1c, 0x001e1c, PG_U_UPPERCASE_LETTER},
+	{0x001e1d, 0x001e1d, PG_U_LOWERCASE_LETTER},
+	{0x001e1e, 0x001e1e, PG_U_UPPERCASE_LETTER},
+	{0x001e1f, 0x001e1f, PG_U_LOWERCASE_LETTER},
+	{0x001e20, 0x001e20, PG_U_UPPERCASE_LETTER},
+	{0x001e21, 0x001e21, PG_U_LOWERCASE_LETTER},
+	{0x001e22, 0x001e22, PG_U_UPPERCASE_LETTER},
+	{0x001e23, 0x001e23, PG_U_LOWERCASE_LETTER},
+	{0x001e24, 0x001e24, PG_U_UPPERCASE_LETTER},
+	{0x001e25, 0x001e25, PG_U_LOWERCASE_LETTER},
+	{0x001e26, 0x001e26, PG_U_UPPERCASE_LETTER},
+	{0x001e27, 0x001e27, PG_U_LOWERCASE_LETTER},
+	{0x001e28, 0x001e28, PG_U_UPPERCASE_LETTER},
+	{0x001e29, 0x001e29, PG_U_LOWERCASE_LETTER},
+	{0x001e2a, 0x001e2a, PG_U_UPPERCASE_LETTER},
+	{0x001e2b, 0x001e2b, PG_U_LOWERCASE_LETTER},
+	{0x001e2c, 0x001e2c, PG_U_UPPERCASE_LETTER},
+	{0x001e2d, 0x001e2d, PG_U_LOWERCASE_LETTER},
+	{0x001e2e, 0x001e2e, PG_U_UPPERCASE_LETTER},
+	{0x001e2f, 0x001e2f, PG_U_LOWERCASE_LETTER},
+	{0x001e30, 0x001e30, PG_U_UPPERCASE_LETTER},
+	{0x001e31, 0x001e31, PG_U_LOWERCASE_LETTER},
+	{0x001e32, 0x001e32, PG_U_UPPERCASE_LETTER},
+	{0x001e33, 0x001e33, PG_U_LOWERCASE_LETTER},
+	{0x001e34, 0x001e34, PG_U_UPPERCASE_LETTER},
+	{0x001e35, 0x001e35, PG_U_LOWERCASE_LETTER},
+	{0x001e36, 0x001e36, PG_U_UPPERCASE_LETTER},
+	{0x001e37, 0x001e37, PG_U_LOWERCASE_LETTER},
+	{0x001e38, 0x001e38, PG_U_UPPERCASE_LETTER},
+	{0x001e39, 0x001e39, PG_U_LOWERCASE_LETTER},
+	{0x001e3a, 0x001e3a, PG_U_UPPERCASE_LETTER},
+	{0x001e3b, 0x001e3b, PG_U_LOWERCASE_LETTER},
+	{0x001e3c, 0x001e3c, PG_U_UPPERCASE_LETTER},
+	{0x001e3d, 0x001e3d, PG_U_LOWERCASE_LETTER},
+	{0x001e3e, 0x001e3e, PG_U_UPPERCASE_LETTER},
+	{0x001e3f, 0x001e3f, PG_U_LOWERCASE_LETTER},
+	{0x001e40, 0x001e40, PG_U_UPPERCASE_LETTER},
+	{0x001e41, 0x001e41, PG_U_LOWERCASE_LETTER},
+	{0x001e42, 0x001e42, PG_U_UPPERCASE_LETTER},
+	{0x001e43, 0x001e43, PG_U_LOWERCASE_LETTER},
+	{0x001e44, 0x001e44, PG_U_UPPERCASE_LETTER},
+	{0x001e45, 0x001e45, PG_U_LOWERCASE_LETTER},
+	{0x001e46, 0x001e46, PG_U_UPPERCASE_LETTER},
+	{0x001e47, 0x001e47, PG_U_LOWERCASE_LETTER},
+	{0x001e48, 0x001e48, PG_U_UPPERCASE_LETTER},
+	{0x001e49, 0x001e49, PG_U_LOWERCASE_LETTER},
+	{0x001e4a, 0x001e4a, PG_U_UPPERCASE_LETTER},
+	{0x001e4b, 0x001e4b, PG_U_LOWERCASE_LETTER},
+	{0x001e4c, 0x001e4c, PG_U_UPPERCASE_LETTER},
+	{0x001e4d, 0x001e4d, PG_U_LOWERCASE_LETTER},
+	{0x001e4e, 0x001e4e, PG_U_UPPERCASE_LETTER},
+	{0x001e4f, 0x001e4f, PG_U_LOWERCASE_LETTER},
+	{0x001e50, 0x001e50, PG_U_UPPERCASE_LETTER},
+	{0x001e51, 0x001e51, PG_U_LOWERCASE_LETTER},
+	{0x001e52, 0x001e52, PG_U_UPPERCASE_LETTER},
+	{0x001e53, 0x001e53, PG_U_LOWERCASE_LETTER},
+	{0x001e54, 0x001e54, PG_U_UPPERCASE_LETTER},
+	{0x001e55, 0x001e55, PG_U_LOWERCASE_LETTER},
+	{0x001e56, 0x001e56, PG_U_UPPERCASE_LETTER},
+	{0x001e57, 0x001e57, PG_U_LOWERCASE_LETTER},
+	{0x001e58, 0x001e58, PG_U_UPPERCASE_LETTER},
+	{0x001e59, 0x001e59, PG_U_LOWERCASE_LETTER},
+	{0x001e5a, 0x001e5a, PG_U_UPPERCASE_LETTER},
+	{0x001e5b, 0x001e5b, PG_U_LOWERCASE_LETTER},
+	{0x001e5c, 0x001e5c, PG_U_UPPERCASE_LETTER},
+	{0x001e5d, 0x001e5d, PG_U_LOWERCASE_LETTER},
+	{0x001e5e, 0x001e5e, PG_U_UPPERCASE_LETTER},
+	{0x001e5f, 0x001e5f, PG_U_LOWERCASE_LETTER},
+	{0x001e60, 0x001e60, PG_U_UPPERCASE_LETTER},
+	{0x001e61, 0x001e61, PG_U_LOWERCASE_LETTER},
+	{0x001e62, 0x001e62, PG_U_UPPERCASE_LETTER},
+	{0x001e63, 0x001e63, PG_U_LOWERCASE_LETTER},
+	{0x001e64, 0x001e64, PG_U_UPPERCASE_LETTER},
+	{0x001e65, 0x001e65, PG_U_LOWERCASE_LETTER},
+	{0x001e66, 0x001e66, PG_U_UPPERCASE_LETTER},
+	{0x001e67, 0x001e67, PG_U_LOWERCASE_LETTER},
+	{0x001e68, 0x001e68, PG_U_UPPERCASE_LETTER},
+	{0x001e69, 0x001e69, PG_U_LOWERCASE_LETTER},
+	{0x001e6a, 0x001e6a, PG_U_UPPERCASE_LETTER},
+	{0x001e6b, 0x001e6b, PG_U_LOWERCASE_LETTER},
+	{0x001e6c, 0x001e6c, PG_U_UPPERCASE_LETTER},
+	{0x001e6d, 0x001e6d, PG_U_LOWERCASE_LETTER},
+	{0x001e6e, 0x001e6e, PG_U_UPPERCASE_LETTER},
+	{0x001e6f, 0x001e6f, PG_U_LOWERCASE_LETTER},
+	{0x001e70, 0x001e70, PG_U_UPPERCASE_LETTER},
+	{0x001e71, 0x001e71, PG_U_LOWERCASE_LETTER},
+	{0x001e72, 0x001e72, PG_U_UPPERCASE_LETTER},
+	{0x001e73, 0x001e73, PG_U_LOWERCASE_LETTER},
+	{0x001e74, 0x001e74, PG_U_UPPERCASE_LETTER},
+	{0x001e75, 0x001e75, PG_U_LOWERCASE_LETTER},
+	{0x001e76, 0x001e76, PG_U_UPPERCASE_LETTER},
+	{0x001e77, 0x001e77, PG_U_LOWERCASE_LETTER},
+	{0x001e78, 0x001e78, PG_U_UPPERCASE_LETTER},
+	{0x001e79, 0x001e79, PG_U_LOWERCASE_LETTER},
+	{0x001e7a, 0x001e7a, PG_U_UPPERCASE_LETTER},
+	{0x001e7b, 0x001e7b, PG_U_LOWERCASE_LETTER},
+	{0x001e7c, 0x001e7c, PG_U_UPPERCASE_LETTER},
+	{0x001e7d, 0x001e7d, PG_U_LOWERCASE_LETTER},
+	{0x001e7e, 0x001e7e, PG_U_UPPERCASE_LETTER},
+	{0x001e7f, 0x001e7f, PG_U_LOWERCASE_LETTER},
+	{0x001e80, 0x001e80, PG_U_UPPERCASE_LETTER},
+	{0x001e81, 0x001e81, PG_U_LOWERCASE_LETTER},
+	{0x001e82, 0x001e82, PG_U_UPPERCASE_LETTER},
+	{0x001e83, 0x001e83, PG_U_LOWERCASE_LETTER},
+	{0x001e84, 0x001e84, PG_U_UPPERCASE_LETTER},
+	{0x001e85, 0x001e85, PG_U_LOWERCASE_LETTER},
+	{0x001e86, 0x001e86, PG_U_UPPERCASE_LETTER},
+	{0x001e87, 0x001e87, PG_U_LOWERCASE_LETTER},
+	{0x001e88, 0x001e88, PG_U_UPPERCASE_LETTER},
+	{0x001e89, 0x001e89, PG_U_LOWERCASE_LETTER},
+	{0x001e8a, 0x001e8a, PG_U_UPPERCASE_LETTER},
+	{0x001e8b, 0x001e8b, PG_U_LOWERCASE_LETTER},
+	{0x001e8c, 0x001e8c, PG_U_UPPERCASE_LETTER},
+	{0x001e8d, 0x001e8d, PG_U_LOWERCASE_LETTER},
+	{0x001e8e, 0x001e8e, PG_U_UPPERCASE_LETTER},
+	{0x001e8f, 0x001e8f, PG_U_LOWERCASE_LETTER},
+	{0x001e90, 0x001e90, PG_U_UPPERCASE_LETTER},
+	{0x001e91, 0x001e91, PG_U_LOWERCASE_LETTER},
+	{0x001e92, 0x001e92, PG_U_UPPERCASE_LETTER},
+	{0x001e93, 0x001e93, PG_U_LOWERCASE_LETTER},
+	{0x001e94, 0x001e94, PG_U_UPPERCASE_LETTER},
+	{0x001e95, 0x001e9d, PG_U_LOWERCASE_LETTER},
+	{0x001e9e, 0x001e9e, PG_U_UPPERCASE_LETTER},
+	{0x001e9f, 0x001e9f, PG_U_LOWERCASE_LETTER},
+	{0x001ea0, 0x001ea0, PG_U_UPPERCASE_LETTER},
+	{0x001ea1, 0x001ea1, PG_U_LOWERCASE_LETTER},
+	{0x001ea2, 0x001ea2, PG_U_UPPERCASE_LETTER},
+	{0x001ea3, 0x001ea3, PG_U_LOWERCASE_LETTER},
+	{0x001ea4, 0x001ea4, PG_U_UPPERCASE_LETTER},
+	{0x001ea5, 0x001ea5, PG_U_LOWERCASE_LETTER},
+	{0x001ea6, 0x001ea6, PG_U_UPPERCASE_LETTER},
+	{0x001ea7, 0x001ea7, PG_U_LOWERCASE_LETTER},
+	{0x001ea8, 0x001ea8, PG_U_UPPERCASE_LETTER},
+	{0x001ea9, 0x001ea9, PG_U_LOWERCASE_LETTER},
+	{0x001eaa, 0x001eaa, PG_U_UPPERCASE_LETTER},
+	{0x001eab, 0x001eab, PG_U_LOWERCASE_LETTER},
+	{0x001eac, 0x001eac, PG_U_UPPERCASE_LETTER},
+	{0x001ead, 0x001ead, PG_U_LOWERCASE_LETTER},
+	{0x001eae, 0x001eae, PG_U_UPPERCASE_LETTER},
+	{0x001eaf, 0x001eaf, PG_U_LOWERCASE_LETTER},
+	{0x001eb0, 0x001eb0, PG_U_UPPERCASE_LETTER},
+	{0x001eb1, 0x001eb1, PG_U_LOWERCASE_LETTER},
+	{0x001eb2, 0x001eb2, PG_U_UPPERCASE_LETTER},
+	{0x001eb3, 0x001eb3, PG_U_LOWERCASE_LETTER},
+	{0x001eb4, 0x001eb4, PG_U_UPPERCASE_LETTER},
+	{0x001eb5, 0x001eb5, PG_U_LOWERCASE_LETTER},
+	{0x001eb6, 0x001eb6, PG_U_UPPERCASE_LETTER},
+	{0x001eb7, 0x001eb7, PG_U_LOWERCASE_LETTER},
+	{0x001eb8, 0x001eb8, PG_U_UPPERCASE_LETTER},
+	{0x001eb9, 0x001eb9, PG_U_LOWERCASE_LETTER},
+	{0x001eba, 0x001eba, PG_U_UPPERCASE_LETTER},
+	{0x001ebb, 0x001ebb, PG_U_LOWERCASE_LETTER},
+	{0x001ebc, 0x001ebc, PG_U_UPPERCASE_LETTER},
+	{0x001ebd, 0x001ebd, PG_U_LOWERCASE_LETTER},
+	{0x001ebe, 0x001ebe, PG_U_UPPERCASE_LETTER},
+	{0x001ebf, 0x001ebf, PG_U_LOWERCASE_LETTER},
+	{0x001ec0, 0x001ec0, PG_U_UPPERCASE_LETTER},
+	{0x001ec1, 0x001ec1, PG_U_LOWERCASE_LETTER},
+	{0x001ec2, 0x001ec2, PG_U_UPPERCASE_LETTER},
+	{0x001ec3, 0x001ec3, PG_U_LOWERCASE_LETTER},
+	{0x001ec4, 0x001ec4, PG_U_UPPERCASE_LETTER},
+	{0x001ec5, 0x001ec5, PG_U_LOWERCASE_LETTER},
+	{0x001ec6, 0x001ec6, PG_U_UPPERCASE_LETTER},
+	{0x001ec7, 0x001ec7, PG_U_LOWERCASE_LETTER},
+	{0x001ec8, 0x001ec8, PG_U_UPPERCASE_LETTER},
+	{0x001ec9, 0x001ec9, PG_U_LOWERCASE_LETTER},
+	{0x001eca, 0x001eca, PG_U_UPPERCASE_LETTER},
+	{0x001ecb, 0x001ecb, PG_U_LOWERCASE_LETTER},
+	{0x001ecc, 0x001ecc, PG_U_UPPERCASE_LETTER},
+	{0x001ecd, 0x001ecd, PG_U_LOWERCASE_LETTER},
+	{0x001ece, 0x001ece, PG_U_UPPERCASE_LETTER},
+	{0x001ecf, 0x001ecf, PG_U_LOWERCASE_LETTER},
+	{0x001ed0, 0x001ed0, PG_U_UPPERCASE_LETTER},
+	{0x001ed1, 0x001ed1, PG_U_LOWERCASE_LETTER},
+	{0x001ed2, 0x001ed2, PG_U_UPPERCASE_LETTER},
+	{0x001ed3, 0x001ed3, PG_U_LOWERCASE_LETTER},
+	{0x001ed4, 0x001ed4, PG_U_UPPERCASE_LETTER},
+	{0x001ed5, 0x001ed5, PG_U_LOWERCASE_LETTER},
+	{0x001ed6, 0x001ed6, PG_U_UPPERCASE_LETTER},
+	{0x001ed7, 0x001ed7, PG_U_LOWERCASE_LETTER},
+	{0x001ed8, 0x001ed8, PG_U_UPPERCASE_LETTER},
+	{0x001ed9, 0x001ed9, PG_U_LOWERCASE_LETTER},
+	{0x001eda, 0x001eda, PG_U_UPPERCASE_LETTER},
+	{0x001edb, 0x001edb, PG_U_LOWERCASE_LETTER},
+	{0x001edc, 0x001edc, PG_U_UPPERCASE_LETTER},
+	{0x001edd, 0x001edd, PG_U_LOWERCASE_LETTER},
+	{0x001ede, 0x001ede, PG_U_UPPERCASE_LETTER},
+	{0x001edf, 0x001edf, PG_U_LOWERCASE_LETTER},
+	{0x001ee0, 0x001ee0, PG_U_UPPERCASE_LETTER},
+	{0x001ee1, 0x001ee1, PG_U_LOWERCASE_LETTER},
+	{0x001ee2, 0x001ee2, PG_U_UPPERCASE_LETTER},
+	{0x001ee3, 0x001ee3, PG_U_LOWERCASE_LETTER},
+	{0x001ee4, 0x001ee4, PG_U_UPPERCASE_LETTER},
+	{0x001ee5, 0x001ee5, PG_U_LOWERCASE_LETTER},
+	{0x001ee6, 0x001ee6, PG_U_UPPERCASE_LETTER},
+	{0x001ee7, 0x001ee7, PG_U_LOWERCASE_LETTER},
+	{0x001ee8, 0x001ee8, PG_U_UPPERCASE_LETTER},
+	{0x001ee9, 0x001ee9, PG_U_LOWERCASE_LETTER},
+	{0x001eea, 0x001eea, PG_U_UPPERCASE_LETTER},
+	{0x001eeb, 0x001eeb, PG_U_LOWERCASE_LETTER},
+	{0x001eec, 0x001eec, PG_U_UPPERCASE_LETTER},
+	{0x001eed, 0x001eed, PG_U_LOWERCASE_LETTER},
+	{0x001eee, 0x001eee, PG_U_UPPERCASE_LETTER},
+	{0x001eef, 0x001eef, PG_U_LOWERCASE_LETTER},
+	{0x001ef0, 0x001ef0, PG_U_UPPERCASE_LETTER},
+	{0x001ef1, 0x001ef1, PG_U_LOWERCASE_LETTER},
+	{0x001ef2, 0x001ef2, PG_U_UPPERCASE_LETTER},
+	{0x001ef3, 0x001ef3, PG_U_LOWERCASE_LETTER},
+	{0x001ef4, 0x001ef4, PG_U_UPPERCASE_LETTER},
+	{0x001ef5, 0x001ef5, PG_U_LOWERCASE_LETTER},
+	{0x001ef6, 0x001ef6, PG_U_UPPERCASE_LETTER},
+	{0x001ef7, 0x001ef7, PG_U_LOWERCASE_LETTER},
+	{0x001ef8, 0x001ef8, PG_U_UPPERCASE_LETTER},
+	{0x001ef9, 0x001ef9, PG_U_LOWERCASE_LETTER},
+	{0x001efa, 0x001efa, PG_U_UPPERCASE_LETTER},
+	{0x001efb, 0x001efb, PG_U_LOWERCASE_LETTER},
+	{0x001efc, 0x001efc, PG_U_UPPERCASE_LETTER},
+	{0x001efd, 0x001efd, PG_U_LOWERCASE_LETTER},
+	{0x001efe, 0x001efe, PG_U_UPPERCASE_LETTER},
+	{0x001eff, 0x001f07, PG_U_LOWERCASE_LETTER},
+	{0x001f08, 0x001f0f, PG_U_UPPERCASE_LETTER},
+	{0x001f10, 0x001f15, PG_U_LOWERCASE_LETTER},
+	{0x001f16, 0x001f17, PG_U_UNASSIGNED},
+	{0x001f18, 0x001f1d, PG_U_UPPERCASE_LETTER},
+	{0x001f1e, 0x001f1f, PG_U_UNASSIGNED},
+	{0x001f20, 0x001f27, PG_U_LOWERCASE_LETTER},
+	{0x001f28, 0x001f2f, PG_U_UPPERCASE_LETTER},
+	{0x001f30, 0x001f37, PG_U_LOWERCASE_LETTER},
+	{0x001f38, 0x001f3f, PG_U_UPPERCASE_LETTER},
+	{0x001f40, 0x001f45, PG_U_LOWERCASE_LETTER},
+	{0x001f46, 0x001f47, PG_U_UNASSIGNED},
+	{0x001f48, 0x001f4d, PG_U_UPPERCASE_LETTER},
+	{0x001f4e, 0x001f4f, PG_U_UNASSIGNED},
+	{0x001f50, 0x001f57, PG_U_LOWERCASE_LETTER},
+	{0x001f58, 0x001f58, PG_U_UNASSIGNED},
+	{0x001f59, 0x001f59, PG_U_UPPERCASE_LETTER},
+	{0x001f5a, 0x001f5a, PG_U_UNASSIGNED},
+	{0x001f5b, 0x001f5b, PG_U_UPPERCASE_LETTER},
+	{0x001f5c, 0x001f5c, PG_U_UNASSIGNED},
+	{0x001f5d, 0x001f5d, PG_U_UPPERCASE_LETTER},
+	{0x001f5e, 0x001f5e, PG_U_UNASSIGNED},
+	{0x001f5f, 0x001f5f, PG_U_UPPERCASE_LETTER},
+	{0x001f60, 0x001f67, PG_U_LOWERCASE_LETTER},
+	{0x001f68, 0x001f6f, PG_U_UPPERCASE_LETTER},
+	{0x001f70, 0x001f7d, PG_U_LOWERCASE_LETTER},
+	{0x001f7e, 0x001f7f, PG_U_UNASSIGNED},
+	{0x001f80, 0x001f87, PG_U_LOWERCASE_LETTER},
+	{0x001f88, 0x001f8f, PG_U_TITLECASE_LETTER},
+	{0x001f90, 0x001f97, PG_U_LOWERCASE_LETTER},
+	{0x001f98, 0x001f9f, PG_U_TITLECASE_LETTER},
+	{0x001fa0, 0x001fa7, PG_U_LOWERCASE_LETTER},
+	{0x001fa8, 0x001faf, PG_U_TITLECASE_LETTER},
+	{0x001fb0, 0x001fb4, PG_U_LOWERCASE_LETTER},
+	{0x001fb5, 0x001fb5, PG_U_UNASSIGNED},
+	{0x001fb6, 0x001fb7, PG_U_LOWERCASE_LETTER},
+	{0x001fb8, 0x001fbb, PG_U_UPPERCASE_LETTER},
+	{0x001fbc, 0x001fbc, PG_U_TITLECASE_LETTER},
+	{0x001fbd, 0x001fbd, PG_U_MODIFIER_SYMBOL},
+	{0x001fbe, 0x001fbe, PG_U_LOWERCASE_LETTER},
+	{0x001fbf, 0x001fc1, PG_U_MODIFIER_SYMBOL},
+	{0x001fc2, 0x001fc4, PG_U_LOWERCASE_LETTER},
+	{0x001fc5, 0x001fc5, PG_U_UNASSIGNED},
+	{0x001fc6, 0x001fc7, PG_U_LOWERCASE_LETTER},
+	{0x001fc8, 0x001fcb, PG_U_UPPERCASE_LETTER},
+	{0x001fcc, 0x001fcc, PG_U_TITLECASE_LETTER},
+	{0x001fcd, 0x001fcf, PG_U_MODIFIER_SYMBOL},
+	{0x001fd0, 0x001fd3, PG_U_LOWERCASE_LETTER},
+	{0x001fd4, 0x001fd5, PG_U_UNASSIGNED},
+	{0x001fd6, 0x001fd7, PG_U_LOWERCASE_LETTER},
+	{0x001fd8, 0x001fdb, PG_U_UPPERCASE_LETTER},
+	{0x001fdc, 0x001fdc, PG_U_UNASSIGNED},
+	{0x001fdd, 0x001fdf, PG_U_MODIFIER_SYMBOL},
+	{0x001fe0, 0x001fe7, PG_U_LOWERCASE_LETTER},
+	{0x001fe8, 0x001fec, PG_U_UPPERCASE_LETTER},
+	{0x001fed, 0x001fef, PG_U_MODIFIER_SYMBOL},
+	{0x001ff0, 0x001ff1, PG_U_UNASSIGNED},
+	{0x001ff2, 0x001ff4, PG_U_LOWERCASE_LETTER},
+	{0x001ff5, 0x001ff5, PG_U_UNASSIGNED},
+	{0x001ff6, 0x001ff7, PG_U_LOWERCASE_LETTER},
+	{0x001ff8, 0x001ffb, PG_U_UPPERCASE_LETTER},
+	{0x001ffc, 0x001ffc, PG_U_TITLECASE_LETTER},
+	{0x001ffd, 0x001ffe, PG_U_MODIFIER_SYMBOL},
+	{0x001fff, 0x001fff, PG_U_UNASSIGNED},
+	{0x002000, 0x00200a, PG_U_SPACE_SEPARATOR},
+	{0x00200b, 0x00200f, PG_U_FORMAT_CHAR},
+	{0x002010, 0x002015, PG_U_DASH_PUNCTUATION},
+	{0x002016, 0x002017, PG_U_OTHER_PUNCTUATION},
+	{0x002018, 0x002018, PG_U_INITIAL_PUNCTUATION},
+	{0x002019, 0x002019, PG_U_FINAL_PUNCTUATION},
+	{0x00201a, 0x00201a, PG_U_START_PUNCTUATION},
+	{0x00201b, 0x00201c, PG_U_INITIAL_PUNCTUATION},
+	{0x00201d, 0x00201d, PG_U_FINAL_PUNCTUATION},
+	{0x00201e, 0x00201e, PG_U_START_PUNCTUATION},
+	{0x00201f, 0x00201f, PG_U_INITIAL_PUNCTUATION},
+	{0x002020, 0x002027, PG_U_OTHER_PUNCTUATION},
+	{0x002028, 0x002028, PG_U_LINE_SEPARATOR},
+	{0x002029, 0x002029, PG_U_PARAGRAPH_SEPARATOR},
+	{0x00202a, 0x00202e, PG_U_FORMAT_CHAR},
+	{0x00202f, 0x00202f, PG_U_SPACE_SEPARATOR},
+	{0x002030, 0x002038, PG_U_OTHER_PUNCTUATION},
+	{0x002039, 0x002039, PG_U_INITIAL_PUNCTUATION},
+	{0x00203a, 0x00203a, PG_U_FINAL_PUNCTUATION},
+	{0x00203b, 0x00203e, PG_U_OTHER_PUNCTUATION},
+	{0x00203f, 0x002040, PG_U_CONNECTOR_PUNCTUATION},
+	{0x002041, 0x002043, PG_U_OTHER_PUNCTUATION},
+	{0x002044, 0x002044, PG_U_MATH_SYMBOL},
+	{0x002045, 0x002045, PG_U_START_PUNCTUATION},
+	{0x002046, 0x002046, PG_U_END_PUNCTUATION},
+	{0x002047, 0x002051, PG_U_OTHER_PUNCTUATION},
+	{0x002052, 0x002052, PG_U_MATH_SYMBOL},
+	{0x002053, 0x002053, PG_U_OTHER_PUNCTUATION},
+	{0x002054, 0x002054, PG_U_CONNECTOR_PUNCTUATION},
+	{0x002055, 0x00205e, PG_U_OTHER_PUNCTUATION},
+	{0x00205f, 0x00205f, PG_U_SPACE_SEPARATOR},
+	{0x002060, 0x002064, PG_U_FORMAT_CHAR},
+	{0x002065, 0x002065, PG_U_UNASSIGNED},
+	{0x002066, 0x00206f, PG_U_FORMAT_CHAR},
+	{0x002070, 0x002070, PG_U_OTHER_NUMBER},
+	{0x002071, 0x002071, PG_U_MODIFIER_LETTER},
+	{0x002072, 0x002073, PG_U_UNASSIGNED},
+	{0x002074, 0x002079, PG_U_OTHER_NUMBER},
+	{0x00207a, 0x00207c, PG_U_MATH_SYMBOL},
+	{0x00207d, 0x00207d, PG_U_START_PUNCTUATION},
+	{0x00207e, 0x00207e, PG_U_END_PUNCTUATION},
+	{0x00207f, 0x00207f, PG_U_MODIFIER_LETTER},
+	{0x002080, 0x002089, PG_U_OTHER_NUMBER},
+	{0x00208a, 0x00208c, PG_U_MATH_SYMBOL},
+	{0x00208d, 0x00208d, PG_U_START_PUNCTUATION},
+	{0x00208e, 0x00208e, PG_U_END_PUNCTUATION},
+	{0x00208f, 0x00208f, PG_U_UNASSIGNED},
+	{0x002090, 0x00209c, PG_U_MODIFIER_LETTER},
+	{0x00209d, 0x00209f, PG_U_UNASSIGNED},
+	{0x0020a0, 0x0020c0, PG_U_CURRENCY_SYMBOL},
+	{0x0020c1, 0x0020cf, PG_U_UNASSIGNED},
+	{0x0020d0, 0x0020dc, PG_U_NON_SPACING_MARK},
+	{0x0020dd, 0x0020e0, PG_U_ENCLOSING_MARK},
+	{0x0020e1, 0x0020e1, PG_U_NON_SPACING_MARK},
+	{0x0020e2, 0x0020e4, PG_U_ENCLOSING_MARK},
+	{0x0020e5, 0x0020f0, PG_U_NON_SPACING_MARK},
+	{0x0020f1, 0x0020ff, PG_U_UNASSIGNED},
+	{0x002100, 0x002101, PG_U_OTHER_SYMBOL},
+	{0x002102, 0x002102, PG_U_UPPERCASE_LETTER},
+	{0x002103, 0x002106, PG_U_OTHER_SYMBOL},
+	{0x002107, 0x002107, PG_U_UPPERCASE_LETTER},
+	{0x002108, 0x002109, PG_U_OTHER_SYMBOL},
+	{0x00210a, 0x00210a, PG_U_LOWERCASE_LETTER},
+	{0x00210b, 0x00210d, PG_U_UPPERCASE_LETTER},
+	{0x00210e, 0x00210f, PG_U_LOWERCASE_LETTER},
+	{0x002110, 0x002112, PG_U_UPPERCASE_LETTER},
+	{0x002113, 0x002113, PG_U_LOWERCASE_LETTER},
+	{0x002114, 0x002114, PG_U_OTHER_SYMBOL},
+	{0x002115, 0x002115, PG_U_UPPERCASE_LETTER},
+	{0x002116, 0x002117, PG_U_OTHER_SYMBOL},
+	{0x002118, 0x002118, PG_U_MATH_SYMBOL},
+	{0x002119, 0x00211d, PG_U_UPPERCASE_LETTER},
+	{0x00211e, 0x002123, PG_U_OTHER_SYMBOL},
+	{0x002124, 0x002124, PG_U_UPPERCASE_LETTER},
+	{0x002125, 0x002125, PG_U_OTHER_SYMBOL},
+	{0x002126, 0x002126, PG_U_UPPERCASE_LETTER},
+	{0x002127, 0x002127, PG_U_OTHER_SYMBOL},
+	{0x002128, 0x002128, PG_U_UPPERCASE_LETTER},
+	{0x002129, 0x002129, PG_U_OTHER_SYMBOL},
+	{0x00212a, 0x00212d, PG_U_UPPERCASE_LETTER},
+	{0x00212e, 0x00212e, PG_U_OTHER_SYMBOL},
+	{0x00212f, 0x00212f, PG_U_LOWERCASE_LETTER},
+	{0x002130, 0x002133, PG_U_UPPERCASE_LETTER},
+	{0x002134, 0x002134, PG_U_LOWERCASE_LETTER},
+	{0x002135, 0x002138, PG_U_OTHER_LETTER},
+	{0x002139, 0x002139, PG_U_LOWERCASE_LETTER},
+	{0x00213a, 0x00213b, PG_U_OTHER_SYMBOL},
+	{0x00213c, 0x00213d, PG_U_LOWERCASE_LETTER},
+	{0x00213e, 0x00213f, PG_U_UPPERCASE_LETTER},
+	{0x002140, 0x002144, PG_U_MATH_SYMBOL},
+	{0x002145, 0x002145, PG_U_UPPERCASE_LETTER},
+	{0x002146, 0x002149, PG_U_LOWERCASE_LETTER},
+	{0x00214a, 0x00214a, PG_U_OTHER_SYMBOL},
+	{0x00214b, 0x00214b, PG_U_MATH_SYMBOL},
+	{0x00214c, 0x00214d, PG_U_OTHER_SYMBOL},
+	{0x00214e, 0x00214e, PG_U_LOWERCASE_LETTER},
+	{0x00214f, 0x00214f, PG_U_OTHER_SYMBOL},
+	{0x002150, 0x00215f, PG_U_OTHER_NUMBER},
+	{0x002160, 0x002182, PG_U_LETTER_NUMBER},
+	{0x002183, 0x002183, PG_U_UPPERCASE_LETTER},
+	{0x002184, 0x002184, PG_U_LOWERCASE_LETTER},
+	{0x002185, 0x002188, PG_U_LETTER_NUMBER},
+	{0x002189, 0x002189, PG_U_OTHER_NUMBER},
+	{0x00218a, 0x00218b, PG_U_OTHER_SYMBOL},
+	{0x00218c, 0x00218f, PG_U_UNASSIGNED},
+	{0x002190, 0x002194, PG_U_MATH_SYMBOL},
+	{0x002195, 0x002199, PG_U_OTHER_SYMBOL},
+	{0x00219a, 0x00219b, PG_U_MATH_SYMBOL},
+	{0x00219c, 0x00219f, PG_U_OTHER_SYMBOL},
+	{0x0021a0, 0x0021a0, PG_U_MATH_SYMBOL},
+	{0x0021a1, 0x0021a2, PG_U_OTHER_SYMBOL},
+	{0x0021a3, 0x0021a3, PG_U_MATH_SYMBOL},
+	{0x0021a4, 0x0021a5, PG_U_OTHER_SYMBOL},
+	{0x0021a6, 0x0021a6, PG_U_MATH_SYMBOL},
+	{0x0021a7, 0x0021ad, PG_U_OTHER_SYMBOL},
+	{0x0021ae, 0x0021ae, PG_U_MATH_SYMBOL},
+	{0x0021af, 0x0021cd, PG_U_OTHER_SYMBOL},
+	{0x0021ce, 0x0021cf, PG_U_MATH_SYMBOL},
+	{0x0021d0, 0x0021d1, PG_U_OTHER_SYMBOL},
+	{0x0021d2, 0x0021d2, PG_U_MATH_SYMBOL},
+	{0x0021d3, 0x0021d3, PG_U_OTHER_SYMBOL},
+	{0x0021d4, 0x0021d4, PG_U_MATH_SYMBOL},
+	{0x0021d5, 0x0021f3, PG_U_OTHER_SYMBOL},
+	{0x0021f4, 0x0022ff, PG_U_MATH_SYMBOL},
+	{0x002300, 0x002307, PG_U_OTHER_SYMBOL},
+	{0x002308, 0x002308, PG_U_START_PUNCTUATION},
+	{0x002309, 0x002309, PG_U_END_PUNCTUATION},
+	{0x00230a, 0x00230a, PG_U_START_PUNCTUATION},
+	{0x00230b, 0x00230b, PG_U_END_PUNCTUATION},
+	{0x00230c, 0x00231f, PG_U_OTHER_SYMBOL},
+	{0x002320, 0x002321, PG_U_MATH_SYMBOL},
+	{0x002322, 0x002328, PG_U_OTHER_SYMBOL},
+	{0x002329, 0x002329, PG_U_START_PUNCTUATION},
+	{0x00232a, 0x00232a, PG_U_END_PUNCTUATION},
+	{0x00232b, 0x00237b, PG_U_OTHER_SYMBOL},
+	{0x00237c, 0x00237c, PG_U_MATH_SYMBOL},
+	{0x00237d, 0x00239a, PG_U_OTHER_SYMBOL},
+	{0x00239b, 0x0023b3, PG_U_MATH_SYMBOL},
+	{0x0023b4, 0x0023db, PG_U_OTHER_SYMBOL},
+	{0x0023dc, 0x0023e1, PG_U_MATH_SYMBOL},
+	{0x0023e2, 0x002426, PG_U_OTHER_SYMBOL},
+	{0x002427, 0x00243f, PG_U_UNASSIGNED},
+	{0x002440, 0x00244a, PG_U_OTHER_SYMBOL},
+	{0x00244b, 0x00245f, PG_U_UNASSIGNED},
+	{0x002460, 0x00249b, PG_U_OTHER_NUMBER},
+	{0x00249c, 0x0024e9, PG_U_OTHER_SYMBOL},
+	{0x0024ea, 0x0024ff, PG_U_OTHER_NUMBER},
+	{0x002500, 0x0025b6, PG_U_OTHER_SYMBOL},
+	{0x0025b7, 0x0025b7, PG_U_MATH_SYMBOL},
+	{0x0025b8, 0x0025c0, PG_U_OTHER_SYMBOL},
+	{0x0025c1, 0x0025c1, PG_U_MATH_SYMBOL},
+	{0x0025c2, 0x0025f7, PG_U_OTHER_SYMBOL},
+	{0x0025f8, 0x0025ff, PG_U_MATH_SYMBOL},
+	{0x002600, 0x00266e, PG_U_OTHER_SYMBOL},
+	{0x00266f, 0x00266f, PG_U_MATH_SYMBOL},
+	{0x002670, 0x002767, PG_U_OTHER_SYMBOL},
+	{0x002768, 0x002768, PG_U_START_PUNCTUATION},
+	{0x002769, 0x002769, PG_U_END_PUNCTUATION},
+	{0x00276a, 0x00276a, PG_U_START_PUNCTUATION},
+	{0x00276b, 0x00276b, PG_U_END_PUNCTUATION},
+	{0x00276c, 0x00276c, PG_U_START_PUNCTUATION},
+	{0x00276d, 0x00276d, PG_U_END_PUNCTUATION},
+	{0x00276e, 0x00276e, PG_U_START_PUNCTUATION},
+	{0x00276f, 0x00276f, PG_U_END_PUNCTUATION},
+	{0x002770, 0x002770, PG_U_START_PUNCTUATION},
+	{0x002771, 0x002771, PG_U_END_PUNCTUATION},
+	{0x002772, 0x002772, PG_U_START_PUNCTUATION},
+	{0x002773, 0x002773, PG_U_END_PUNCTUATION},
+	{0x002774, 0x002774, PG_U_START_PUNCTUATION},
+	{0x002775, 0x002775, PG_U_END_PUNCTUATION},
+	{0x002776, 0x002793, PG_U_OTHER_NUMBER},
+	{0x002794, 0x0027bf, PG_U_OTHER_SYMBOL},
+	{0x0027c0, 0x0027c4, PG_U_MATH_SYMBOL},
+	{0x0027c5, 0x0027c5, PG_U_START_PUNCTUATION},
+	{0x0027c6, 0x0027c6, PG_U_END_PUNCTUATION},
+	{0x0027c7, 0x0027e5, PG_U_MATH_SYMBOL},
+	{0x0027e6, 0x0027e6, PG_U_START_PUNCTUATION},
+	{0x0027e7, 0x0027e7, PG_U_END_PUNCTUATION},
+	{0x0027e8, 0x0027e8, PG_U_START_PUNCTUATION},
+	{0x0027e9, 0x0027e9, PG_U_END_PUNCTUATION},
+	{0x0027ea, 0x0027ea, PG_U_START_PUNCTUATION},
+	{0x0027eb, 0x0027eb, PG_U_END_PUNCTUATION},
+	{0x0027ec, 0x0027ec, PG_U_START_PUNCTUATION},
+	{0x0027ed, 0x0027ed, PG_U_END_PUNCTUATION},
+	{0x0027ee, 0x0027ee, PG_U_START_PUNCTUATION},
+	{0x0027ef, 0x0027ef, PG_U_END_PUNCTUATION},
+	{0x0027f0, 0x0027ff, PG_U_MATH_SYMBOL},
+	{0x002800, 0x0028ff, PG_U_OTHER_SYMBOL},
+	{0x002900, 0x002982, PG_U_MATH_SYMBOL},
+	{0x002983, 0x002983, PG_U_START_PUNCTUATION},
+	{0x002984, 0x002984, PG_U_END_PUNCTUATION},
+	{0x002985, 0x002985, PG_U_START_PUNCTUATION},
+	{0x002986, 0x002986, PG_U_END_PUNCTUATION},
+	{0x002987, 0x002987, PG_U_START_PUNCTUATION},
+	{0x002988, 0x002988, PG_U_END_PUNCTUATION},
+	{0x002989, 0x002989, PG_U_START_PUNCTUATION},
+	{0x00298a, 0x00298a, PG_U_END_PUNCTUATION},
+	{0x00298b, 0x00298b, PG_U_START_PUNCTUATION},
+	{0x00298c, 0x00298c, PG_U_END_PUNCTUATION},
+	{0x00298d, 0x00298d, PG_U_START_PUNCTUATION},
+	{0x00298e, 0x00298e, PG_U_END_PUNCTUATION},
+	{0x00298f, 0x00298f, PG_U_START_PUNCTUATION},
+	{0x002990, 0x002990, PG_U_END_PUNCTUATION},
+	{0x002991, 0x002991, PG_U_START_PUNCTUATION},
+	{0x002992, 0x002992, PG_U_END_PUNCTUATION},
+	{0x002993, 0x002993, PG_U_START_PUNCTUATION},
+	{0x002994, 0x002994, PG_U_END_PUNCTUATION},
+	{0x002995, 0x002995, PG_U_START_PUNCTUATION},
+	{0x002996, 0x002996, PG_U_END_PUNCTUATION},
+	{0x002997, 0x002997, PG_U_START_PUNCTUATION},
+	{0x002998, 0x002998, PG_U_END_PUNCTUATION},
+	{0x002999, 0x0029d7, PG_U_MATH_SYMBOL},
+	{0x0029d8, 0x0029d8, PG_U_START_PUNCTUATION},
+	{0x0029d9, 0x0029d9, PG_U_END_PUNCTUATION},
+	{0x0029da, 0x0029da, PG_U_START_PUNCTUATION},
+	{0x0029db, 0x0029db, PG_U_END_PUNCTUATION},
+	{0x0029dc, 0x0029fb, PG_U_MATH_SYMBOL},
+	{0x0029fc, 0x0029fc, PG_U_START_PUNCTUATION},
+	{0x0029fd, 0x0029fd, PG_U_END_PUNCTUATION},
+	{0x0029fe, 0x002aff, PG_U_MATH_SYMBOL},
+	{0x002b00, 0x002b2f, PG_U_OTHER_SYMBOL},
+	{0x002b30, 0x002b44, PG_U_MATH_SYMBOL},
+	{0x002b45, 0x002b46, PG_U_OTHER_SYMBOL},
+	{0x002b47, 0x002b4c, PG_U_MATH_SYMBOL},
+	{0x002b4d, 0x002b73, PG_U_OTHER_SYMBOL},
+	{0x002b74, 0x002b75, PG_U_UNASSIGNED},
+	{0x002b76, 0x002b95, PG_U_OTHER_SYMBOL},
+	{0x002b96, 0x002b96, PG_U_UNASSIGNED},
+	{0x002b97, 0x002bff, PG_U_OTHER_SYMBOL},
+	{0x002c00, 0x002c2f, PG_U_UPPERCASE_LETTER},
+	{0x002c30, 0x002c5f, PG_U_LOWERCASE_LETTER},
+	{0x002c60, 0x002c60, PG_U_UPPERCASE_LETTER},
+	{0x002c61, 0x002c61, PG_U_LOWERCASE_LETTER},
+	{0x002c62, 0x002c64, PG_U_UPPERCASE_LETTER},
+	{0x002c65, 0x002c66, PG_U_LOWERCASE_LETTER},
+	{0x002c67, 0x002c67, PG_U_UPPERCASE_LETTER},
+	{0x002c68, 0x002c68, PG_U_LOWERCASE_LETTER},
+	{0x002c69, 0x002c69, PG_U_UPPERCASE_LETTER},
+	{0x002c6a, 0x002c6a, PG_U_LOWERCASE_LETTER},
+	{0x002c6b, 0x002c6b, PG_U_UPPERCASE_LETTER},
+	{0x002c6c, 0x002c6c, PG_U_LOWERCASE_LETTER},
+	{0x002c6d, 0x002c70, PG_U_UPPERCASE_LETTER},
+	{0x002c71, 0x002c71, PG_U_LOWERCASE_LETTER},
+	{0x002c72, 0x002c72, PG_U_UPPERCASE_LETTER},
+	{0x002c73, 0x002c74, PG_U_LOWERCASE_LETTER},
+	{0x002c75, 0x002c75, PG_U_UPPERCASE_LETTER},
+	{0x002c76, 0x002c7b, PG_U_LOWERCASE_LETTER},
+	{0x002c7c, 0x002c7d, PG_U_MODIFIER_LETTER},
+	{0x002c7e, 0x002c80, PG_U_UPPERCASE_LETTER},
+	{0x002c81, 0x002c81, PG_U_LOWERCASE_LETTER},
+	{0x002c82, 0x002c82, PG_U_UPPERCASE_LETTER},
+	{0x002c83, 0x002c83, PG_U_LOWERCASE_LETTER},
+	{0x002c84, 0x002c84, PG_U_UPPERCASE_LETTER},
+	{0x002c85, 0x002c85, PG_U_LOWERCASE_LETTER},
+	{0x002c86, 0x002c86, PG_U_UPPERCASE_LETTER},
+	{0x002c87, 0x002c87, PG_U_LOWERCASE_LETTER},
+	{0x002c88, 0x002c88, PG_U_UPPERCASE_LETTER},
+	{0x002c89, 0x002c89, PG_U_LOWERCASE_LETTER},
+	{0x002c8a, 0x002c8a, PG_U_UPPERCASE_LETTER},
+	{0x002c8b, 0x002c8b, PG_U_LOWERCASE_LETTER},
+	{0x002c8c, 0x002c8c, PG_U_UPPERCASE_LETTER},
+	{0x002c8d, 0x002c8d, PG_U_LOWERCASE_LETTER},
+	{0x002c8e, 0x002c8e, PG_U_UPPERCASE_LETTER},
+	{0x002c8f, 0x002c8f, PG_U_LOWERCASE_LETTER},
+	{0x002c90, 0x002c90, PG_U_UPPERCASE_LETTER},
+	{0x002c91, 0x002c91, PG_U_LOWERCASE_LETTER},
+	{0x002c92, 0x002c92, PG_U_UPPERCASE_LETTER},
+	{0x002c93, 0x002c93, PG_U_LOWERCASE_LETTER},
+	{0x002c94, 0x002c94, PG_U_UPPERCASE_LETTER},
+	{0x002c95, 0x002c95, PG_U_LOWERCASE_LETTER},
+	{0x002c96, 0x002c96, PG_U_UPPERCASE_LETTER},
+	{0x002c97, 0x002c97, PG_U_LOWERCASE_LETTER},
+	{0x002c98, 0x002c98, PG_U_UPPERCASE_LETTER},
+	{0x002c99, 0x002c99, PG_U_LOWERCASE_LETTER},
+	{0x002c9a, 0x002c9a, PG_U_UPPERCASE_LETTER},
+	{0x002c9b, 0x002c9b, PG_U_LOWERCASE_LETTER},
+	{0x002c9c, 0x002c9c, PG_U_UPPERCASE_LETTER},
+	{0x002c9d, 0x002c9d, PG_U_LOWERCASE_LETTER},
+	{0x002c9e, 0x002c9e, PG_U_UPPERCASE_LETTER},
+	{0x002c9f, 0x002c9f, PG_U_LOWERCASE_LETTER},
+	{0x002ca0, 0x002ca0, PG_U_UPPERCASE_LETTER},
+	{0x002ca1, 0x002ca1, PG_U_LOWERCASE_LETTER},
+	{0x002ca2, 0x002ca2, PG_U_UPPERCASE_LETTER},
+	{0x002ca3, 0x002ca3, PG_U_LOWERCASE_LETTER},
+	{0x002ca4, 0x002ca4, PG_U_UPPERCASE_LETTER},
+	{0x002ca5, 0x002ca5, PG_U_LOWERCASE_LETTER},
+	{0x002ca6, 0x002ca6, PG_U_UPPERCASE_LETTER},
+	{0x002ca7, 0x002ca7, PG_U_LOWERCASE_LETTER},
+	{0x002ca8, 0x002ca8, PG_U_UPPERCASE_LETTER},
+	{0x002ca9, 0x002ca9, PG_U_LOWERCASE_LETTER},
+	{0x002caa, 0x002caa, PG_U_UPPERCASE_LETTER},
+	{0x002cab, 0x002cab, PG_U_LOWERCASE_LETTER},
+	{0x002cac, 0x002cac, PG_U_UPPERCASE_LETTER},
+	{0x002cad, 0x002cad, PG_U_LOWERCASE_LETTER},
+	{0x002cae, 0x002cae, PG_U_UPPERCASE_LETTER},
+	{0x002caf, 0x002caf, PG_U_LOWERCASE_LETTER},
+	{0x002cb0, 0x002cb0, PG_U_UPPERCASE_LETTER},
+	{0x002cb1, 0x002cb1, PG_U_LOWERCASE_LETTER},
+	{0x002cb2, 0x002cb2, PG_U_UPPERCASE_LETTER},
+	{0x002cb3, 0x002cb3, PG_U_LOWERCASE_LETTER},
+	{0x002cb4, 0x002cb4, PG_U_UPPERCASE_LETTER},
+	{0x002cb5, 0x002cb5, PG_U_LOWERCASE_LETTER},
+	{0x002cb6, 0x002cb6, PG_U_UPPERCASE_LETTER},
+	{0x002cb7, 0x002cb7, PG_U_LOWERCASE_LETTER},
+	{0x002cb8, 0x002cb8, PG_U_UPPERCASE_LETTER},
+	{0x002cb9, 0x002cb9, PG_U_LOWERCASE_LETTER},
+	{0x002cba, 0x002cba, PG_U_UPPERCASE_LETTER},
+	{0x002cbb, 0x002cbb, PG_U_LOWERCASE_LETTER},
+	{0x002cbc, 0x002cbc, PG_U_UPPERCASE_LETTER},
+	{0x002cbd, 0x002cbd, PG_U_LOWERCASE_LETTER},
+	{0x002cbe, 0x002cbe, PG_U_UPPERCASE_LETTER},
+	{0x002cbf, 0x002cbf, PG_U_LOWERCASE_LETTER},
+	{0x002cc0, 0x002cc0, PG_U_UPPERCASE_LETTER},
+	{0x002cc1, 0x002cc1, PG_U_LOWERCASE_LETTER},
+	{0x002cc2, 0x002cc2, PG_U_UPPERCASE_LETTER},
+	{0x002cc3, 0x002cc3, PG_U_LOWERCASE_LETTER},
+	{0x002cc4, 0x002cc4, PG_U_UPPERCASE_LETTER},
+	{0x002cc5, 0x002cc5, PG_U_LOWERCASE_LETTER},
+	{0x002cc6, 0x002cc6, PG_U_UPPERCASE_LETTER},
+	{0x002cc7, 0x002cc7, PG_U_LOWERCASE_LETTER},
+	{0x002cc8, 0x002cc8, PG_U_UPPERCASE_LETTER},
+	{0x002cc9, 0x002cc9, PG_U_LOWERCASE_LETTER},
+	{0x002cca, 0x002cca, PG_U_UPPERCASE_LETTER},
+	{0x002ccb, 0x002ccb, PG_U_LOWERCASE_LETTER},
+	{0x002ccc, 0x002ccc, PG_U_UPPERCASE_LETTER},
+	{0x002ccd, 0x002ccd, PG_U_LOWERCASE_LETTER},
+	{0x002cce, 0x002cce, PG_U_UPPERCASE_LETTER},
+	{0x002ccf, 0x002ccf, PG_U_LOWERCASE_LETTER},
+	{0x002cd0, 0x002cd0, PG_U_UPPERCASE_LETTER},
+	{0x002cd1, 0x002cd1, PG_U_LOWERCASE_LETTER},
+	{0x002cd2, 0x002cd2, PG_U_UPPERCASE_LETTER},
+	{0x002cd3, 0x002cd3, PG_U_LOWERCASE_LETTER},
+	{0x002cd4, 0x002cd4, PG_U_UPPERCASE_LETTER},
+	{0x002cd5, 0x002cd5, PG_U_LOWERCASE_LETTER},
+	{0x002cd6, 0x002cd6, PG_U_UPPERCASE_LETTER},
+	{0x002cd7, 0x002cd7, PG_U_LOWERCASE_LETTER},
+	{0x002cd8, 0x002cd8, PG_U_UPPERCASE_LETTER},
+	{0x002cd9, 0x002cd9, PG_U_LOWERCASE_LETTER},
+	{0x002cda, 0x002cda, PG_U_UPPERCASE_LETTER},
+	{0x002cdb, 0x002cdb, PG_U_LOWERCASE_LETTER},
+	{0x002cdc, 0x002cdc, PG_U_UPPERCASE_LETTER},
+	{0x002cdd, 0x002cdd, PG_U_LOWERCASE_LETTER},
+	{0x002cde, 0x002cde, PG_U_UPPERCASE_LETTER},
+	{0x002cdf, 0x002cdf, PG_U_LOWERCASE_LETTER},
+	{0x002ce0, 0x002ce0, PG_U_UPPERCASE_LETTER},
+	{0x002ce1, 0x002ce1, PG_U_LOWERCASE_LETTER},
+	{0x002ce2, 0x002ce2, PG_U_UPPERCASE_LETTER},
+	{0x002ce3, 0x002ce4, PG_U_LOWERCASE_LETTER},
+	{0x002ce5, 0x002cea, PG_U_OTHER_SYMBOL},
+	{0x002ceb, 0x002ceb, PG_U_UPPERCASE_LETTER},
+	{0x002cec, 0x002cec, PG_U_LOWERCASE_LETTER},
+	{0x002ced, 0x002ced, PG_U_UPPERCASE_LETTER},
+	{0x002cee, 0x002cee, PG_U_LOWERCASE_LETTER},
+	{0x002cef, 0x002cf1, PG_U_NON_SPACING_MARK},
+	{0x002cf2, 0x002cf2, PG_U_UPPERCASE_LETTER},
+	{0x002cf3, 0x002cf3, PG_U_LOWERCASE_LETTER},
+	{0x002cf4, 0x002cf8, PG_U_UNASSIGNED},
+	{0x002cf9, 0x002cfc, PG_U_OTHER_PUNCTUATION},
+	{0x002cfd, 0x002cfd, PG_U_OTHER_NUMBER},
+	{0x002cfe, 0x002cff, PG_U_OTHER_PUNCTUATION},
+	{0x002d00, 0x002d25, PG_U_LOWERCASE_LETTER},
+	{0x002d26, 0x002d26, PG_U_UNASSIGNED},
+	{0x002d27, 0x002d27, PG_U_LOWERCASE_LETTER},
+	{0x002d28, 0x002d2c, PG_U_UNASSIGNED},
+	{0x002d2d, 0x002d2d, PG_U_LOWERCASE_LETTER},
+	{0x002d2e, 0x002d2f, PG_U_UNASSIGNED},
+	{0x002d30, 0x002d67, PG_U_OTHER_LETTER},
+	{0x002d68, 0x002d6e, PG_U_UNASSIGNED},
+	{0x002d6f, 0x002d6f, PG_U_MODIFIER_LETTER},
+	{0x002d70, 0x002d70, PG_U_OTHER_PUNCTUATION},
+	{0x002d71, 0x002d7e, PG_U_UNASSIGNED},
+	{0x002d7f, 0x002d7f, PG_U_NON_SPACING_MARK},
+	{0x002d80, 0x002d96, PG_U_OTHER_LETTER},
+	{0x002d97, 0x002d9f, PG_U_UNASSIGNED},
+	{0x002da0, 0x002da6, PG_U_OTHER_LETTER},
+	{0x002da7, 0x002da7, PG_U_UNASSIGNED},
+	{0x002da8, 0x002dae, PG_U_OTHER_LETTER},
+	{0x002daf, 0x002daf, PG_U_UNASSIGNED},
+	{0x002db0, 0x002db6, PG_U_OTHER_LETTER},
+	{0x002db7, 0x002db7, PG_U_UNASSIGNED},
+	{0x002db8, 0x002dbe, PG_U_OTHER_LETTER},
+	{0x002dbf, 0x002dbf, PG_U_UNASSIGNED},
+	{0x002dc0, 0x002dc6, PG_U_OTHER_LETTER},
+	{0x002dc7, 0x002dc7, PG_U_UNASSIGNED},
+	{0x002dc8, 0x002dce, PG_U_OTHER_LETTER},
+	{0x002dcf, 0x002dcf, PG_U_UNASSIGNED},
+	{0x002dd0, 0x002dd6, PG_U_OTHER_LETTER},
+	{0x002dd7, 0x002dd7, PG_U_UNASSIGNED},
+	{0x002dd8, 0x002dde, PG_U_OTHER_LETTER},
+	{0x002ddf, 0x002ddf, PG_U_UNASSIGNED},
+	{0x002de0, 0x002dff, PG_U_NON_SPACING_MARK},
+	{0x002e00, 0x002e01, PG_U_OTHER_PUNCTUATION},
+	{0x002e02, 0x002e02, PG_U_INITIAL_PUNCTUATION},
+	{0x002e03, 0x002e03, PG_U_FINAL_PUNCTUATION},
+	{0x002e04, 0x002e04, PG_U_INITIAL_PUNCTUATION},
+	{0x002e05, 0x002e05, PG_U_FINAL_PUNCTUATION},
+	{0x002e06, 0x002e08, PG_U_OTHER_PUNCTUATION},
+	{0x002e09, 0x002e09, PG_U_INITIAL_PUNCTUATION},
+	{0x002e0a, 0x002e0a, PG_U_FINAL_PUNCTUATION},
+	{0x002e0b, 0x002e0b, PG_U_OTHER_PUNCTUATION},
+	{0x002e0c, 0x002e0c, PG_U_INITIAL_PUNCTUATION},
+	{0x002e0d, 0x002e0d, PG_U_FINAL_PUNCTUATION},
+	{0x002e0e, 0x002e16, PG_U_OTHER_PUNCTUATION},
+	{0x002e17, 0x002e17, PG_U_DASH_PUNCTUATION},
+	{0x002e18, 0x002e19, PG_U_OTHER_PUNCTUATION},
+	{0x002e1a, 0x002e1a, PG_U_DASH_PUNCTUATION},
+	{0x002e1b, 0x002e1b, PG_U_OTHER_PUNCTUATION},
+	{0x002e1c, 0x002e1c, PG_U_INITIAL_PUNCTUATION},
+	{0x002e1d, 0x002e1d, PG_U_FINAL_PUNCTUATION},
+	{0x002e1e, 0x002e1f, PG_U_OTHER_PUNCTUATION},
+	{0x002e20, 0x002e20, PG_U_INITIAL_PUNCTUATION},
+	{0x002e21, 0x002e21, PG_U_FINAL_PUNCTUATION},
+	{0x002e22, 0x002e22, PG_U_START_PUNCTUATION},
+	{0x002e23, 0x002e23, PG_U_END_PUNCTUATION},
+	{0x002e24, 0x002e24, PG_U_START_PUNCTUATION},
+	{0x002e25, 0x002e25, PG_U_END_PUNCTUATION},
+	{0x002e26, 0x002e26, PG_U_START_PUNCTUATION},
+	{0x002e27, 0x002e27, PG_U_END_PUNCTUATION},
+	{0x002e28, 0x002e28, PG_U_START_PUNCTUATION},
+	{0x002e29, 0x002e29, PG_U_END_PUNCTUATION},
+	{0x002e2a, 0x002e2e, PG_U_OTHER_PUNCTUATION},
+	{0x002e2f, 0x002e2f, PG_U_MODIFIER_LETTER},
+	{0x002e30, 0x002e39, PG_U_OTHER_PUNCTUATION},
+	{0x002e3a, 0x002e3b, PG_U_DASH_PUNCTUATION},
+	{0x002e3c, 0x002e3f, PG_U_OTHER_PUNCTUATION},
+	{0x002e40, 0x002e40, PG_U_DASH_PUNCTUATION},
+	{0x002e41, 0x002e41, PG_U_OTHER_PUNCTUATION},
+	{0x002e42, 0x002e42, PG_U_START_PUNCTUATION},
+	{0x002e43, 0x002e4f, PG_U_OTHER_PUNCTUATION},
+	{0x002e50, 0x002e51, PG_U_OTHER_SYMBOL},
+	{0x002e52, 0x002e54, PG_U_OTHER_PUNCTUATION},
+	{0x002e55, 0x002e55, PG_U_START_PUNCTUATION},
+	{0x002e56, 0x002e56, PG_U_END_PUNCTUATION},
+	{0x002e57, 0x002e57, PG_U_START_PUNCTUATION},
+	{0x002e58, 0x002e58, PG_U_END_PUNCTUATION},
+	{0x002e59, 0x002e59, PG_U_START_PUNCTUATION},
+	{0x002e5a, 0x002e5a, PG_U_END_PUNCTUATION},
+	{0x002e5b, 0x002e5b, PG_U_START_PUNCTUATION},
+	{0x002e5c, 0x002e5c, PG_U_END_PUNCTUATION},
+	{0x002e5d, 0x002e5d, PG_U_DASH_PUNCTUATION},
+	{0x002e5e, 0x002e7f, PG_U_UNASSIGNED},
+	{0x002e80, 0x002e99, PG_U_OTHER_SYMBOL},
+	{0x002e9a, 0x002e9a, PG_U_UNASSIGNED},
+	{0x002e9b, 0x002ef3, PG_U_OTHER_SYMBOL},
+	{0x002ef4, 0x002eff, PG_U_UNASSIGNED},
+	{0x002f00, 0x002fd5, PG_U_OTHER_SYMBOL},
+	{0x002fd6, 0x002fef, PG_U_UNASSIGNED},
+	{0x002ff0, 0x002fff, PG_U_OTHER_SYMBOL},
+	{0x003000, 0x003000, PG_U_SPACE_SEPARATOR},
+	{0x003001, 0x003003, PG_U_OTHER_PUNCTUATION},
+	{0x003004, 0x003004, PG_U_OTHER_SYMBOL},
+	{0x003005, 0x003005, PG_U_MODIFIER_LETTER},
+	{0x003006, 0x003006, PG_U_OTHER_LETTER},
+	{0x003007, 0x003007, PG_U_LETTER_NUMBER},
+	{0x003008, 0x003008, PG_U_START_PUNCTUATION},
+	{0x003009, 0x003009, PG_U_END_PUNCTUATION},
+	{0x00300a, 0x00300a, PG_U_START_PUNCTUATION},
+	{0x00300b, 0x00300b, PG_U_END_PUNCTUATION},
+	{0x00300c, 0x00300c, PG_U_START_PUNCTUATION},
+	{0x00300d, 0x00300d, PG_U_END_PUNCTUATION},
+	{0x00300e, 0x00300e, PG_U_START_PUNCTUATION},
+	{0x00300f, 0x00300f, PG_U_END_PUNCTUATION},
+	{0x003010, 0x003010, PG_U_START_PUNCTUATION},
+	{0x003011, 0x003011, PG_U_END_PUNCTUATION},
+	{0x003012, 0x003013, PG_U_OTHER_SYMBOL},
+	{0x003014, 0x003014, PG_U_START_PUNCTUATION},
+	{0x003015, 0x003015, PG_U_END_PUNCTUATION},
+	{0x003016, 0x003016, PG_U_START_PUNCTUATION},
+	{0x003017, 0x003017, PG_U_END_PUNCTUATION},
+	{0x003018, 0x003018, PG_U_START_PUNCTUATION},
+	{0x003019, 0x003019, PG_U_END_PUNCTUATION},
+	{0x00301a, 0x00301a, PG_U_START_PUNCTUATION},
+	{0x00301b, 0x00301b, PG_U_END_PUNCTUATION},
+	{0x00301c, 0x00301c, PG_U_DASH_PUNCTUATION},
+	{0x00301d, 0x00301d, PG_U_START_PUNCTUATION},
+	{0x00301e, 0x00301f, PG_U_END_PUNCTUATION},
+	{0x003020, 0x003020, PG_U_OTHER_SYMBOL},
+	{0x003021, 0x003029, PG_U_LETTER_NUMBER},
+	{0x00302a, 0x00302d, PG_U_NON_SPACING_MARK},
+	{0x00302e, 0x00302f, PG_U_COMBINING_SPACING_MARK},
+	{0x003030, 0x003030, PG_U_DASH_PUNCTUATION},
+	{0x003031, 0x003035, PG_U_MODIFIER_LETTER},
+	{0x003036, 0x003037, PG_U_OTHER_SYMBOL},
+	{0x003038, 0x00303a, PG_U_LETTER_NUMBER},
+	{0x00303b, 0x00303b, PG_U_MODIFIER_LETTER},
+	{0x00303c, 0x00303c, PG_U_OTHER_LETTER},
+	{0x00303d, 0x00303d, PG_U_OTHER_PUNCTUATION},
+	{0x00303e, 0x00303f, PG_U_OTHER_SYMBOL},
+	{0x003040, 0x003040, PG_U_UNASSIGNED},
+	{0x003041, 0x003096, PG_U_OTHER_LETTER},
+	{0x003097, 0x003098, PG_U_UNASSIGNED},
+	{0x003099, 0x00309a, PG_U_NON_SPACING_MARK},
+	{0x00309b, 0x00309c, PG_U_MODIFIER_SYMBOL},
+	{0x00309d, 0x00309e, PG_U_MODIFIER_LETTER},
+	{0x00309f, 0x00309f, PG_U_OTHER_LETTER},
+	{0x0030a0, 0x0030a0, PG_U_DASH_PUNCTUATION},
+	{0x0030a1, 0x0030fa, PG_U_OTHER_LETTER},
+	{0x0030fb, 0x0030fb, PG_U_OTHER_PUNCTUATION},
+	{0x0030fc, 0x0030fe, PG_U_MODIFIER_LETTER},
+	{0x0030ff, 0x0030ff, PG_U_OTHER_LETTER},
+	{0x003100, 0x003104, PG_U_UNASSIGNED},
+	{0x003105, 0x00312f, PG_U_OTHER_LETTER},
+	{0x003130, 0x003130, PG_U_UNASSIGNED},
+	{0x003131, 0x00318e, PG_U_OTHER_LETTER},
+	{0x00318f, 0x00318f, PG_U_UNASSIGNED},
+	{0x003190, 0x003191, PG_U_OTHER_SYMBOL},
+	{0x003192, 0x003195, PG_U_OTHER_NUMBER},
+	{0x003196, 0x00319f, PG_U_OTHER_SYMBOL},
+	{0x0031a0, 0x0031bf, PG_U_OTHER_LETTER},
+	{0x0031c0, 0x0031e3, PG_U_OTHER_SYMBOL},
+	{0x0031e4, 0x0031ee, PG_U_UNASSIGNED},
+	{0x0031ef, 0x0031ef, PG_U_OTHER_SYMBOL},
+	{0x0031f0, 0x0031ff, PG_U_OTHER_LETTER},
+	{0x003200, 0x00321e, PG_U_OTHER_SYMBOL},
+	{0x00321f, 0x00321f, PG_U_UNASSIGNED},
+	{0x003220, 0x003229, PG_U_OTHER_NUMBER},
+	{0x00322a, 0x003247, PG_U_OTHER_SYMBOL},
+	{0x003248, 0x00324f, PG_U_OTHER_NUMBER},
+	{0x003250, 0x003250, PG_U_OTHER_SYMBOL},
+	{0x003251, 0x00325f, PG_U_OTHER_NUMBER},
+	{0x003260, 0x00327f, PG_U_OTHER_SYMBOL},
+	{0x003280, 0x003289, PG_U_OTHER_NUMBER},
+	{0x00328a, 0x0032b0, PG_U_OTHER_SYMBOL},
+	{0x0032b1, 0x0032bf, PG_U_OTHER_NUMBER},
+	{0x0032c0, 0x0033ff, PG_U_OTHER_SYMBOL},
+	{0x003400, 0x004dbf, PG_U_OTHER_LETTER},
+	{0x004dc0, 0x004dff, PG_U_OTHER_SYMBOL},
+	{0x004e00, 0x00a014, PG_U_OTHER_LETTER},
+	{0x00a015, 0x00a015, PG_U_MODIFIER_LETTER},
+	{0x00a016, 0x00a48c, PG_U_OTHER_LETTER},
+	{0x00a48d, 0x00a48f, PG_U_UNASSIGNED},
+	{0x00a490, 0x00a4c6, PG_U_OTHER_SYMBOL},
+	{0x00a4c7, 0x00a4cf, PG_U_UNASSIGNED},
+	{0x00a4d0, 0x00a4f7, PG_U_OTHER_LETTER},
+	{0x00a4f8, 0x00a4fd, PG_U_MODIFIER_LETTER},
+	{0x00a4fe, 0x00a4ff, PG_U_OTHER_PUNCTUATION},
+	{0x00a500, 0x00a60b, PG_U_OTHER_LETTER},
+	{0x00a60c, 0x00a60c, PG_U_MODIFIER_LETTER},
+	{0x00a60d, 0x00a60f, PG_U_OTHER_PUNCTUATION},
+	{0x00a610, 0x00a61f, PG_U_OTHER_LETTER},
+	{0x00a620, 0x00a629, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00a62a, 0x00a62b, PG_U_OTHER_LETTER},
+	{0x00a62c, 0x00a63f, PG_U_UNASSIGNED},
+	{0x00a640, 0x00a640, PG_U_UPPERCASE_LETTER},
+	{0x00a641, 0x00a641, PG_U_LOWERCASE_LETTER},
+	{0x00a642, 0x00a642, PG_U_UPPERCASE_LETTER},
+	{0x00a643, 0x00a643, PG_U_LOWERCASE_LETTER},
+	{0x00a644, 0x00a644, PG_U_UPPERCASE_LETTER},
+	{0x00a645, 0x00a645, PG_U_LOWERCASE_LETTER},
+	{0x00a646, 0x00a646, PG_U_UPPERCASE_LETTER},
+	{0x00a647, 0x00a647, PG_U_LOWERCASE_LETTER},
+	{0x00a648, 0x00a648, PG_U_UPPERCASE_LETTER},
+	{0x00a649, 0x00a649, PG_U_LOWERCASE_LETTER},
+	{0x00a64a, 0x00a64a, PG_U_UPPERCASE_LETTER},
+	{0x00a64b, 0x00a64b, PG_U_LOWERCASE_LETTER},
+	{0x00a64c, 0x00a64c, PG_U_UPPERCASE_LETTER},
+	{0x00a64d, 0x00a64d, PG_U_LOWERCASE_LETTER},
+	{0x00a64e, 0x00a64e, PG_U_UPPERCASE_LETTER},
+	{0x00a64f, 0x00a64f, PG_U_LOWERCASE_LETTER},
+	{0x00a650, 0x00a650, PG_U_UPPERCASE_LETTER},
+	{0x00a651, 0x00a651, PG_U_LOWERCASE_LETTER},
+	{0x00a652, 0x00a652, PG_U_UPPERCASE_LETTER},
+	{0x00a653, 0x00a653, PG_U_LOWERCASE_LETTER},
+	{0x00a654, 0x00a654, PG_U_UPPERCASE_LETTER},
+	{0x00a655, 0x00a655, PG_U_LOWERCASE_LETTER},
+	{0x00a656, 0x00a656, PG_U_UPPERCASE_LETTER},
+	{0x00a657, 0x00a657, PG_U_LOWERCASE_LETTER},
+	{0x00a658, 0x00a658, PG_U_UPPERCASE_LETTER},
+	{0x00a659, 0x00a659, PG_U_LOWERCASE_LETTER},
+	{0x00a65a, 0x00a65a, PG_U_UPPERCASE_LETTER},
+	{0x00a65b, 0x00a65b, PG_U_LOWERCASE_LETTER},
+	{0x00a65c, 0x00a65c, PG_U_UPPERCASE_LETTER},
+	{0x00a65d, 0x00a65d, PG_U_LOWERCASE_LETTER},
+	{0x00a65e, 0x00a65e, PG_U_UPPERCASE_LETTER},
+	{0x00a65f, 0x00a65f, PG_U_LOWERCASE_LETTER},
+	{0x00a660, 0x00a660, PG_U_UPPERCASE_LETTER},
+	{0x00a661, 0x00a661, PG_U_LOWERCASE_LETTER},
+	{0x00a662, 0x00a662, PG_U_UPPERCASE_LETTER},
+	{0x00a663, 0x00a663, PG_U_LOWERCASE_LETTER},
+	{0x00a664, 0x00a664, PG_U_UPPERCASE_LETTER},
+	{0x00a665, 0x00a665, PG_U_LOWERCASE_LETTER},
+	{0x00a666, 0x00a666, PG_U_UPPERCASE_LETTER},
+	{0x00a667, 0x00a667, PG_U_LOWERCASE_LETTER},
+	{0x00a668, 0x00a668, PG_U_UPPERCASE_LETTER},
+	{0x00a669, 0x00a669, PG_U_LOWERCASE_LETTER},
+	{0x00a66a, 0x00a66a, PG_U_UPPERCASE_LETTER},
+	{0x00a66b, 0x00a66b, PG_U_LOWERCASE_LETTER},
+	{0x00a66c, 0x00a66c, PG_U_UPPERCASE_LETTER},
+	{0x00a66d, 0x00a66d, PG_U_LOWERCASE_LETTER},
+	{0x00a66e, 0x00a66e, PG_U_OTHER_LETTER},
+	{0x00a66f, 0x00a66f, PG_U_NON_SPACING_MARK},
+	{0x00a670, 0x00a672, PG_U_ENCLOSING_MARK},
+	{0x00a673, 0x00a673, PG_U_OTHER_PUNCTUATION},
+	{0x00a674, 0x00a67d, PG_U_NON_SPACING_MARK},
+	{0x00a67e, 0x00a67e, PG_U_OTHER_PUNCTUATION},
+	{0x00a67f, 0x00a67f, PG_U_MODIFIER_LETTER},
+	{0x00a680, 0x00a680, PG_U_UPPERCASE_LETTER},
+	{0x00a681, 0x00a681, PG_U_LOWERCASE_LETTER},
+	{0x00a682, 0x00a682, PG_U_UPPERCASE_LETTER},
+	{0x00a683, 0x00a683, PG_U_LOWERCASE_LETTER},
+	{0x00a684, 0x00a684, PG_U_UPPERCASE_LETTER},
+	{0x00a685, 0x00a685, PG_U_LOWERCASE_LETTER},
+	{0x00a686, 0x00a686, PG_U_UPPERCASE_LETTER},
+	{0x00a687, 0x00a687, PG_U_LOWERCASE_LETTER},
+	{0x00a688, 0x00a688, PG_U_UPPERCASE_LETTER},
+	{0x00a689, 0x00a689, PG_U_LOWERCASE_LETTER},
+	{0x00a68a, 0x00a68a, PG_U_UPPERCASE_LETTER},
+	{0x00a68b, 0x00a68b, PG_U_LOWERCASE_LETTER},
+	{0x00a68c, 0x00a68c, PG_U_UPPERCASE_LETTER},
+	{0x00a68d, 0x00a68d, PG_U_LOWERCASE_LETTER},
+	{0x00a68e, 0x00a68e, PG_U_UPPERCASE_LETTER},
+	{0x00a68f, 0x00a68f, PG_U_LOWERCASE_LETTER},
+	{0x00a690, 0x00a690, PG_U_UPPERCASE_LETTER},
+	{0x00a691, 0x00a691, PG_U_LOWERCASE_LETTER},
+	{0x00a692, 0x00a692, PG_U_UPPERCASE_LETTER},
+	{0x00a693, 0x00a693, PG_U_LOWERCASE_LETTER},
+	{0x00a694, 0x00a694, PG_U_UPPERCASE_LETTER},
+	{0x00a695, 0x00a695, PG_U_LOWERCASE_LETTER},
+	{0x00a696, 0x00a696, PG_U_UPPERCASE_LETTER},
+	{0x00a697, 0x00a697, PG_U_LOWERCASE_LETTER},
+	{0x00a698, 0x00a698, PG_U_UPPERCASE_LETTER},
+	{0x00a699, 0x00a699, PG_U_LOWERCASE_LETTER},
+	{0x00a69a, 0x00a69a, PG_U_UPPERCASE_LETTER},
+	{0x00a69b, 0x00a69b, PG_U_LOWERCASE_LETTER},
+	{0x00a69c, 0x00a69d, PG_U_MODIFIER_LETTER},
+	{0x00a69e, 0x00a69f, PG_U_NON_SPACING_MARK},
+	{0x00a6a0, 0x00a6e5, PG_U_OTHER_LETTER},
+	{0x00a6e6, 0x00a6ef, PG_U_LETTER_NUMBER},
+	{0x00a6f0, 0x00a6f1, PG_U_NON_SPACING_MARK},
+	{0x00a6f2, 0x00a6f7, PG_U_OTHER_PUNCTUATION},
+	{0x00a6f8, 0x00a6ff, PG_U_UNASSIGNED},
+	{0x00a700, 0x00a716, PG_U_MODIFIER_SYMBOL},
+	{0x00a717, 0x00a71f, PG_U_MODIFIER_LETTER},
+	{0x00a720, 0x00a721, PG_U_MODIFIER_SYMBOL},
+	{0x00a722, 0x00a722, PG_U_UPPERCASE_LETTER},
+	{0x00a723, 0x00a723, PG_U_LOWERCASE_LETTER},
+	{0x00a724, 0x00a724, PG_U_UPPERCASE_LETTER},
+	{0x00a725, 0x00a725, PG_U_LOWERCASE_LETTER},
+	{0x00a726, 0x00a726, PG_U_UPPERCASE_LETTER},
+	{0x00a727, 0x00a727, PG_U_LOWERCASE_LETTER},
+	{0x00a728, 0x00a728, PG_U_UPPERCASE_LETTER},
+	{0x00a729, 0x00a729, PG_U_LOWERCASE_LETTER},
+	{0x00a72a, 0x00a72a, PG_U_UPPERCASE_LETTER},
+	{0x00a72b, 0x00a72b, PG_U_LOWERCASE_LETTER},
+	{0x00a72c, 0x00a72c, PG_U_UPPERCASE_LETTER},
+	{0x00a72d, 0x00a72d, PG_U_LOWERCASE_LETTER},
+	{0x00a72e, 0x00a72e, PG_U_UPPERCASE_LETTER},
+	{0x00a72f, 0x00a731, PG_U_LOWERCASE_LETTER},
+	{0x00a732, 0x00a732, PG_U_UPPERCASE_LETTER},
+	{0x00a733, 0x00a733, PG_U_LOWERCASE_LETTER},
+	{0x00a734, 0x00a734, PG_U_UPPERCASE_LETTER},
+	{0x00a735, 0x00a735, PG_U_LOWERCASE_LETTER},
+	{0x00a736, 0x00a736, PG_U_UPPERCASE_LETTER},
+	{0x00a737, 0x00a737, PG_U_LOWERCASE_LETTER},
+	{0x00a738, 0x00a738, PG_U_UPPERCASE_LETTER},
+	{0x00a739, 0x00a739, PG_U_LOWERCASE_LETTER},
+	{0x00a73a, 0x00a73a, PG_U_UPPERCASE_LETTER},
+	{0x00a73b, 0x00a73b, PG_U_LOWERCASE_LETTER},
+	{0x00a73c, 0x00a73c, PG_U_UPPERCASE_LETTER},
+	{0x00a73d, 0x00a73d, PG_U_LOWERCASE_LETTER},
+	{0x00a73e, 0x00a73e, PG_U_UPPERCASE_LETTER},
+	{0x00a73f, 0x00a73f, PG_U_LOWERCASE_LETTER},
+	{0x00a740, 0x00a740, PG_U_UPPERCASE_LETTER},
+	{0x00a741, 0x00a741, PG_U_LOWERCASE_LETTER},
+	{0x00a742, 0x00a742, PG_U_UPPERCASE_LETTER},
+	{0x00a743, 0x00a743, PG_U_LOWERCASE_LETTER},
+	{0x00a744, 0x00a744, PG_U_UPPERCASE_LETTER},
+	{0x00a745, 0x00a745, PG_U_LOWERCASE_LETTER},
+	{0x00a746, 0x00a746, PG_U_UPPERCASE_LETTER},
+	{0x00a747, 0x00a747, PG_U_LOWERCASE_LETTER},
+	{0x00a748, 0x00a748, PG_U_UPPERCASE_LETTER},
+	{0x00a749, 0x00a749, PG_U_LOWERCASE_LETTER},
+	{0x00a74a, 0x00a74a, PG_U_UPPERCASE_LETTER},
+	{0x00a74b, 0x00a74b, PG_U_LOWERCASE_LETTER},
+	{0x00a74c, 0x00a74c, PG_U_UPPERCASE_LETTER},
+	{0x00a74d, 0x00a74d, PG_U_LOWERCASE_LETTER},
+	{0x00a74e, 0x00a74e, PG_U_UPPERCASE_LETTER},
+	{0x00a74f, 0x00a74f, PG_U_LOWERCASE_LETTER},
+	{0x00a750, 0x00a750, PG_U_UPPERCASE_LETTER},
+	{0x00a751, 0x00a751, PG_U_LOWERCASE_LETTER},
+	{0x00a752, 0x00a752, PG_U_UPPERCASE_LETTER},
+	{0x00a753, 0x00a753, PG_U_LOWERCASE_LETTER},
+	{0x00a754, 0x00a754, PG_U_UPPERCASE_LETTER},
+	{0x00a755, 0x00a755, PG_U_LOWERCASE_LETTER},
+	{0x00a756, 0x00a756, PG_U_UPPERCASE_LETTER},
+	{0x00a757, 0x00a757, PG_U_LOWERCASE_LETTER},
+	{0x00a758, 0x00a758, PG_U_UPPERCASE_LETTER},
+	{0x00a759, 0x00a759, PG_U_LOWERCASE_LETTER},
+	{0x00a75a, 0x00a75a, PG_U_UPPERCASE_LETTER},
+	{0x00a75b, 0x00a75b, PG_U_LOWERCASE_LETTER},
+	{0x00a75c, 0x00a75c, PG_U_UPPERCASE_LETTER},
+	{0x00a75d, 0x00a75d, PG_U_LOWERCASE_LETTER},
+	{0x00a75e, 0x00a75e, PG_U_UPPERCASE_LETTER},
+	{0x00a75f, 0x00a75f, PG_U_LOWERCASE_LETTER},
+	{0x00a760, 0x00a760, PG_U_UPPERCASE_LETTER},
+	{0x00a761, 0x00a761, PG_U_LOWERCASE_LETTER},
+	{0x00a762, 0x00a762, PG_U_UPPERCASE_LETTER},
+	{0x00a763, 0x00a763, PG_U_LOWERCASE_LETTER},
+	{0x00a764, 0x00a764, PG_U_UPPERCASE_LETTER},
+	{0x00a765, 0x00a765, PG_U_LOWERCASE_LETTER},
+	{0x00a766, 0x00a766, PG_U_UPPERCASE_LETTER},
+	{0x00a767, 0x00a767, PG_U_LOWERCASE_LETTER},
+	{0x00a768, 0x00a768, PG_U_UPPERCASE_LETTER},
+	{0x00a769, 0x00a769, PG_U_LOWERCASE_LETTER},
+	{0x00a76a, 0x00a76a, PG_U_UPPERCASE_LETTER},
+	{0x00a76b, 0x00a76b, PG_U_LOWERCASE_LETTER},
+	{0x00a76c, 0x00a76c, PG_U_UPPERCASE_LETTER},
+	{0x00a76d, 0x00a76d, PG_U_LOWERCASE_LETTER},
+	{0x00a76e, 0x00a76e, PG_U_UPPERCASE_LETTER},
+	{0x00a76f, 0x00a76f, PG_U_LOWERCASE_LETTER},
+	{0x00a770, 0x00a770, PG_U_MODIFIER_LETTER},
+	{0x00a771, 0x00a778, PG_U_LOWERCASE_LETTER},
+	{0x00a779, 0x00a779, PG_U_UPPERCASE_LETTER},
+	{0x00a77a, 0x00a77a, PG_U_LOWERCASE_LETTER},
+	{0x00a77b, 0x00a77b, PG_U_UPPERCASE_LETTER},
+	{0x00a77c, 0x00a77c, PG_U_LOWERCASE_LETTER},
+	{0x00a77d, 0x00a77e, PG_U_UPPERCASE_LETTER},
+	{0x00a77f, 0x00a77f, PG_U_LOWERCASE_LETTER},
+	{0x00a780, 0x00a780, PG_U_UPPERCASE_LETTER},
+	{0x00a781, 0x00a781, PG_U_LOWERCASE_LETTER},
+	{0x00a782, 0x00a782, PG_U_UPPERCASE_LETTER},
+	{0x00a783, 0x00a783, PG_U_LOWERCASE_LETTER},
+	{0x00a784, 0x00a784, PG_U_UPPERCASE_LETTER},
+	{0x00a785, 0x00a785, PG_U_LOWERCASE_LETTER},
+	{0x00a786, 0x00a786, PG_U_UPPERCASE_LETTER},
+	{0x00a787, 0x00a787, PG_U_LOWERCASE_LETTER},
+	{0x00a788, 0x00a788, PG_U_MODIFIER_LETTER},
+	{0x00a789, 0x00a78a, PG_U_MODIFIER_SYMBOL},
+	{0x00a78b, 0x00a78b, PG_U_UPPERCASE_LETTER},
+	{0x00a78c, 0x00a78c, PG_U_LOWERCASE_LETTER},
+	{0x00a78d, 0x00a78d, PG_U_UPPERCASE_LETTER},
+	{0x00a78e, 0x00a78e, PG_U_LOWERCASE_LETTER},
+	{0x00a78f, 0x00a78f, PG_U_OTHER_LETTER},
+	{0x00a790, 0x00a790, PG_U_UPPERCASE_LETTER},
+	{0x00a791, 0x00a791, PG_U_LOWERCASE_LETTER},
+	{0x00a792, 0x00a792, PG_U_UPPERCASE_LETTER},
+	{0x00a793, 0x00a795, PG_U_LOWERCASE_LETTER},
+	{0x00a796, 0x00a796, PG_U_UPPERCASE_LETTER},
+	{0x00a797, 0x00a797, PG_U_LOWERCASE_LETTER},
+	{0x00a798, 0x00a798, PG_U_UPPERCASE_LETTER},
+	{0x00a799, 0x00a799, PG_U_LOWERCASE_LETTER},
+	{0x00a79a, 0x00a79a, PG_U_UPPERCASE_LETTER},
+	{0x00a79b, 0x00a79b, PG_U_LOWERCASE_LETTER},
+	{0x00a79c, 0x00a79c, PG_U_UPPERCASE_LETTER},
+	{0x00a79d, 0x00a79d, PG_U_LOWERCASE_LETTER},
+	{0x00a79e, 0x00a79e, PG_U_UPPERCASE_LETTER},
+	{0x00a79f, 0x00a79f, PG_U_LOWERCASE_LETTER},
+	{0x00a7a0, 0x00a7a0, PG_U_UPPERCASE_LETTER},
+	{0x00a7a1, 0x00a7a1, PG_U_LOWERCASE_LETTER},
+	{0x00a7a2, 0x00a7a2, PG_U_UPPERCASE_LETTER},
+	{0x00a7a3, 0x00a7a3, PG_U_LOWERCASE_LETTER},
+	{0x00a7a4, 0x00a7a4, PG_U_UPPERCASE_LETTER},
+	{0x00a7a5, 0x00a7a5, PG_U_LOWERCASE_LETTER},
+	{0x00a7a6, 0x00a7a6, PG_U_UPPERCASE_LETTER},
+	{0x00a7a7, 0x00a7a7, PG_U_LOWERCASE_LETTER},
+	{0x00a7a8, 0x00a7a8, PG_U_UPPERCASE_LETTER},
+	{0x00a7a9, 0x00a7a9, PG_U_LOWERCASE_LETTER},
+	{0x00a7aa, 0x00a7ae, PG_U_UPPERCASE_LETTER},
+	{0x00a7af, 0x00a7af, PG_U_LOWERCASE_LETTER},
+	{0x00a7b0, 0x00a7b4, PG_U_UPPERCASE_LETTER},
+	{0x00a7b5, 0x00a7b5, PG_U_LOWERCASE_LETTER},
+	{0x00a7b6, 0x00a7b6, PG_U_UPPERCASE_LETTER},
+	{0x00a7b7, 0x00a7b7, PG_U_LOWERCASE_LETTER},
+	{0x00a7b8, 0x00a7b8, PG_U_UPPERCASE_LETTER},
+	{0x00a7b9, 0x00a7b9, PG_U_LOWERCASE_LETTER},
+	{0x00a7ba, 0x00a7ba, PG_U_UPPERCASE_LETTER},
+	{0x00a7bb, 0x00a7bb, PG_U_LOWERCASE_LETTER},
+	{0x00a7bc, 0x00a7bc, PG_U_UPPERCASE_LETTER},
+	{0x00a7bd, 0x00a7bd, PG_U_LOWERCASE_LETTER},
+	{0x00a7be, 0x00a7be, PG_U_UPPERCASE_LETTER},
+	{0x00a7bf, 0x00a7bf, PG_U_LOWERCASE_LETTER},
+	{0x00a7c0, 0x00a7c0, PG_U_UPPERCASE_LETTER},
+	{0x00a7c1, 0x00a7c1, PG_U_LOWERCASE_LETTER},
+	{0x00a7c2, 0x00a7c2, PG_U_UPPERCASE_LETTER},
+	{0x00a7c3, 0x00a7c3, PG_U_LOWERCASE_LETTER},
+	{0x00a7c4, 0x00a7c7, PG_U_UPPERCASE_LETTER},
+	{0x00a7c8, 0x00a7c8, PG_U_LOWERCASE_LETTER},
+	{0x00a7c9, 0x00a7c9, PG_U_UPPERCASE_LETTER},
+	{0x00a7ca, 0x00a7ca, PG_U_LOWERCASE_LETTER},
+	{0x00a7cb, 0x00a7cf, PG_U_UNASSIGNED},
+	{0x00a7d0, 0x00a7d0, PG_U_UPPERCASE_LETTER},
+	{0x00a7d1, 0x00a7d1, PG_U_LOWERCASE_LETTER},
+	{0x00a7d2, 0x00a7d2, PG_U_UNASSIGNED},
+	{0x00a7d3, 0x00a7d3, PG_U_LOWERCASE_LETTER},
+	{0x00a7d4, 0x00a7d4, PG_U_UNASSIGNED},
+	{0x00a7d5, 0x00a7d5, PG_U_LOWERCASE_LETTER},
+	{0x00a7d6, 0x00a7d6, PG_U_UPPERCASE_LETTER},
+	{0x00a7d7, 0x00a7d7, PG_U_LOWERCASE_LETTER},
+	{0x00a7d8, 0x00a7d8, PG_U_UPPERCASE_LETTER},
+	{0x00a7d9, 0x00a7d9, PG_U_LOWERCASE_LETTER},
+	{0x00a7da, 0x00a7f1, PG_U_UNASSIGNED},
+	{0x00a7f2, 0x00a7f4, PG_U_MODIFIER_LETTER},
+	{0x00a7f5, 0x00a7f5, PG_U_UPPERCASE_LETTER},
+	{0x00a7f6, 0x00a7f6, PG_U_LOWERCASE_LETTER},
+	{0x00a7f7, 0x00a7f7, PG_U_OTHER_LETTER},
+	{0x00a7f8, 0x00a7f9, PG_U_MODIFIER_LETTER},
+	{0x00a7fa, 0x00a7fa, PG_U_LOWERCASE_LETTER},
+	{0x00a7fb, 0x00a801, PG_U_OTHER_LETTER},
+	{0x00a802, 0x00a802, PG_U_NON_SPACING_MARK},
+	{0x00a803, 0x00a805, PG_U_OTHER_LETTER},
+	{0x00a806, 0x00a806, PG_U_NON_SPACING_MARK},
+	{0x00a807, 0x00a80a, PG_U_OTHER_LETTER},
+	{0x00a80b, 0x00a80b, PG_U_NON_SPACING_MARK},
+	{0x00a80c, 0x00a822, PG_U_OTHER_LETTER},
+	{0x00a823, 0x00a824, PG_U_COMBINING_SPACING_MARK},
+	{0x00a825, 0x00a826, PG_U_NON_SPACING_MARK},
+	{0x00a827, 0x00a827, PG_U_COMBINING_SPACING_MARK},
+	{0x00a828, 0x00a82b, PG_U_OTHER_SYMBOL},
+	{0x00a82c, 0x00a82c, PG_U_NON_SPACING_MARK},
+	{0x00a82d, 0x00a82f, PG_U_UNASSIGNED},
+	{0x00a830, 0x00a835, PG_U_OTHER_NUMBER},
+	{0x00a836, 0x00a837, PG_U_OTHER_SYMBOL},
+	{0x00a838, 0x00a838, PG_U_CURRENCY_SYMBOL},
+	{0x00a839, 0x00a839, PG_U_OTHER_SYMBOL},
+	{0x00a83a, 0x00a83f, PG_U_UNASSIGNED},
+	{0x00a840, 0x00a873, PG_U_OTHER_LETTER},
+	{0x00a874, 0x00a877, PG_U_OTHER_PUNCTUATION},
+	{0x00a878, 0x00a87f, PG_U_UNASSIGNED},
+	{0x00a880, 0x00a881, PG_U_COMBINING_SPACING_MARK},
+	{0x00a882, 0x00a8b3, PG_U_OTHER_LETTER},
+	{0x00a8b4, 0x00a8c3, PG_U_COMBINING_SPACING_MARK},
+	{0x00a8c4, 0x00a8c5, PG_U_NON_SPACING_MARK},
+	{0x00a8c6, 0x00a8cd, PG_U_UNASSIGNED},
+	{0x00a8ce, 0x00a8cf, PG_U_OTHER_PUNCTUATION},
+	{0x00a8d0, 0x00a8d9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00a8da, 0x00a8df, PG_U_UNASSIGNED},
+	{0x00a8e0, 0x00a8f1, PG_U_NON_SPACING_MARK},
+	{0x00a8f2, 0x00a8f7, PG_U_OTHER_LETTER},
+	{0x00a8f8, 0x00a8fa, PG_U_OTHER_PUNCTUATION},
+	{0x00a8fb, 0x00a8fb, PG_U_OTHER_LETTER},
+	{0x00a8fc, 0x00a8fc, PG_U_OTHER_PUNCTUATION},
+	{0x00a8fd, 0x00a8fe, PG_U_OTHER_LETTER},
+	{0x00a8ff, 0x00a8ff, PG_U_NON_SPACING_MARK},
+	{0x00a900, 0x00a909, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00a90a, 0x00a925, PG_U_OTHER_LETTER},
+	{0x00a926, 0x00a92d, PG_U_NON_SPACING_MARK},
+	{0x00a92e, 0x00a92f, PG_U_OTHER_PUNCTUATION},
+	{0x00a930, 0x00a946, PG_U_OTHER_LETTER},
+	{0x00a947, 0x00a951, PG_U_NON_SPACING_MARK},
+	{0x00a952, 0x00a953, PG_U_COMBINING_SPACING_MARK},
+	{0x00a954, 0x00a95e, PG_U_UNASSIGNED},
+	{0x00a95f, 0x00a95f, PG_U_OTHER_PUNCTUATION},
+	{0x00a960, 0x00a97c, PG_U_OTHER_LETTER},
+	{0x00a97d, 0x00a97f, PG_U_UNASSIGNED},
+	{0x00a980, 0x00a982, PG_U_NON_SPACING_MARK},
+	{0x00a983, 0x00a983, PG_U_COMBINING_SPACING_MARK},
+	{0x00a984, 0x00a9b2, PG_U_OTHER_LETTER},
+	{0x00a9b3, 0x00a9b3, PG_U_NON_SPACING_MARK},
+	{0x00a9b4, 0x00a9b5, PG_U_COMBINING_SPACING_MARK},
+	{0x00a9b6, 0x00a9b9, PG_U_NON_SPACING_MARK},
+	{0x00a9ba, 0x00a9bb, PG_U_COMBINING_SPACING_MARK},
+	{0x00a9bc, 0x00a9bd, PG_U_NON_SPACING_MARK},
+	{0x00a9be, 0x00a9c0, PG_U_COMBINING_SPACING_MARK},
+	{0x00a9c1, 0x00a9cd, PG_U_OTHER_PUNCTUATION},
+	{0x00a9ce, 0x00a9ce, PG_U_UNASSIGNED},
+	{0x00a9cf, 0x00a9cf, PG_U_MODIFIER_LETTER},
+	{0x00a9d0, 0x00a9d9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00a9da, 0x00a9dd, PG_U_UNASSIGNED},
+	{0x00a9de, 0x00a9df, PG_U_OTHER_PUNCTUATION},
+	{0x00a9e0, 0x00a9e4, PG_U_OTHER_LETTER},
+	{0x00a9e5, 0x00a9e5, PG_U_NON_SPACING_MARK},
+	{0x00a9e6, 0x00a9e6, PG_U_MODIFIER_LETTER},
+	{0x00a9e7, 0x00a9ef, PG_U_OTHER_LETTER},
+	{0x00a9f0, 0x00a9f9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00a9fa, 0x00a9fe, PG_U_OTHER_LETTER},
+	{0x00a9ff, 0x00a9ff, PG_U_UNASSIGNED},
+	{0x00aa00, 0x00aa28, PG_U_OTHER_LETTER},
+	{0x00aa29, 0x00aa2e, PG_U_NON_SPACING_MARK},
+	{0x00aa2f, 0x00aa30, PG_U_COMBINING_SPACING_MARK},
+	{0x00aa31, 0x00aa32, PG_U_NON_SPACING_MARK},
+	{0x00aa33, 0x00aa34, PG_U_COMBINING_SPACING_MARK},
+	{0x00aa35, 0x00aa36, PG_U_NON_SPACING_MARK},
+	{0x00aa37, 0x00aa3f, PG_U_UNASSIGNED},
+	{0x00aa40, 0x00aa42, PG_U_OTHER_LETTER},
+	{0x00aa43, 0x00aa43, PG_U_NON_SPACING_MARK},
+	{0x00aa44, 0x00aa4b, PG_U_OTHER_LETTER},
+	{0x00aa4c, 0x00aa4c, PG_U_NON_SPACING_MARK},
+	{0x00aa4d, 0x00aa4d, PG_U_COMBINING_SPACING_MARK},
+	{0x00aa4e, 0x00aa4f, PG_U_UNASSIGNED},
+	{0x00aa50, 0x00aa59, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00aa5a, 0x00aa5b, PG_U_UNASSIGNED},
+	{0x00aa5c, 0x00aa5f, PG_U_OTHER_PUNCTUATION},
+	{0x00aa60, 0x00aa6f, PG_U_OTHER_LETTER},
+	{0x00aa70, 0x00aa70, PG_U_MODIFIER_LETTER},
+	{0x00aa71, 0x00aa76, PG_U_OTHER_LETTER},
+	{0x00aa77, 0x00aa79, PG_U_OTHER_SYMBOL},
+	{0x00aa7a, 0x00aa7a, PG_U_OTHER_LETTER},
+	{0x00aa7b, 0x00aa7b, PG_U_COMBINING_SPACING_MARK},
+	{0x00aa7c, 0x00aa7c, PG_U_NON_SPACING_MARK},
+	{0x00aa7d, 0x00aa7d, PG_U_COMBINING_SPACING_MARK},
+	{0x00aa7e, 0x00aaaf, PG_U_OTHER_LETTER},
+	{0x00aab0, 0x00aab0, PG_U_NON_SPACING_MARK},
+	{0x00aab1, 0x00aab1, PG_U_OTHER_LETTER},
+	{0x00aab2, 0x00aab4, PG_U_NON_SPACING_MARK},
+	{0x00aab5, 0x00aab6, PG_U_OTHER_LETTER},
+	{0x00aab7, 0x00aab8, PG_U_NON_SPACING_MARK},
+	{0x00aab9, 0x00aabd, PG_U_OTHER_LETTER},
+	{0x00aabe, 0x00aabf, PG_U_NON_SPACING_MARK},
+	{0x00aac0, 0x00aac0, PG_U_OTHER_LETTER},
+	{0x00aac1, 0x00aac1, PG_U_NON_SPACING_MARK},
+	{0x00aac2, 0x00aac2, PG_U_OTHER_LETTER},
+	{0x00aac3, 0x00aada, PG_U_UNASSIGNED},
+	{0x00aadb, 0x00aadc, PG_U_OTHER_LETTER},
+	{0x00aadd, 0x00aadd, PG_U_MODIFIER_LETTER},
+	{0x00aade, 0x00aadf, PG_U_OTHER_PUNCTUATION},
+	{0x00aae0, 0x00aaea, PG_U_OTHER_LETTER},
+	{0x00aaeb, 0x00aaeb, PG_U_COMBINING_SPACING_MARK},
+	{0x00aaec, 0x00aaed, PG_U_NON_SPACING_MARK},
+	{0x00aaee, 0x00aaef, PG_U_COMBINING_SPACING_MARK},
+	{0x00aaf0, 0x00aaf1, PG_U_OTHER_PUNCTUATION},
+	{0x00aaf2, 0x00aaf2, PG_U_OTHER_LETTER},
+	{0x00aaf3, 0x00aaf4, PG_U_MODIFIER_LETTER},
+	{0x00aaf5, 0x00aaf5, PG_U_COMBINING_SPACING_MARK},
+	{0x00aaf6, 0x00aaf6, PG_U_NON_SPACING_MARK},
+	{0x00aaf7, 0x00ab00, PG_U_UNASSIGNED},
+	{0x00ab01, 0x00ab06, PG_U_OTHER_LETTER},
+	{0x00ab07, 0x00ab08, PG_U_UNASSIGNED},
+	{0x00ab09, 0x00ab0e, PG_U_OTHER_LETTER},
+	{0x00ab0f, 0x00ab10, PG_U_UNASSIGNED},
+	{0x00ab11, 0x00ab16, PG_U_OTHER_LETTER},
+	{0x00ab17, 0x00ab1f, PG_U_UNASSIGNED},
+	{0x00ab20, 0x00ab26, PG_U_OTHER_LETTER},
+	{0x00ab27, 0x00ab27, PG_U_UNASSIGNED},
+	{0x00ab28, 0x00ab2e, PG_U_OTHER_LETTER},
+	{0x00ab2f, 0x00ab2f, PG_U_UNASSIGNED},
+	{0x00ab30, 0x00ab5a, PG_U_LOWERCASE_LETTER},
+	{0x00ab5b, 0x00ab5b, PG_U_MODIFIER_SYMBOL},
+	{0x00ab5c, 0x00ab5f, PG_U_MODIFIER_LETTER},
+	{0x00ab60, 0x00ab68, PG_U_LOWERCASE_LETTER},
+	{0x00ab69, 0x00ab69, PG_U_MODIFIER_LETTER},
+	{0x00ab6a, 0x00ab6b, PG_U_MODIFIER_SYMBOL},
+	{0x00ab6c, 0x00ab6f, PG_U_UNASSIGNED},
+	{0x00ab70, 0x00abbf, PG_U_LOWERCASE_LETTER},
+	{0x00abc0, 0x00abe2, PG_U_OTHER_LETTER},
+	{0x00abe3, 0x00abe4, PG_U_COMBINING_SPACING_MARK},
+	{0x00abe5, 0x00abe5, PG_U_NON_SPACING_MARK},
+	{0x00abe6, 0x00abe7, PG_U_COMBINING_SPACING_MARK},
+	{0x00abe8, 0x00abe8, PG_U_NON_SPACING_MARK},
+	{0x00abe9, 0x00abea, PG_U_COMBINING_SPACING_MARK},
+	{0x00abeb, 0x00abeb, PG_U_OTHER_PUNCTUATION},
+	{0x00abec, 0x00abec, PG_U_COMBINING_SPACING_MARK},
+	{0x00abed, 0x00abed, PG_U_NON_SPACING_MARK},
+	{0x00abee, 0x00abef, PG_U_UNASSIGNED},
+	{0x00abf0, 0x00abf9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00abfa, 0x00abff, PG_U_UNASSIGNED},
+	{0x00ac00, 0x00d7a3, PG_U_OTHER_LETTER},
+	{0x00d7a4, 0x00d7af, PG_U_UNASSIGNED},
+	{0x00d7b0, 0x00d7c6, PG_U_OTHER_LETTER},
+	{0x00d7c7, 0x00d7ca, PG_U_UNASSIGNED},
+	{0x00d7cb, 0x00d7fb, PG_U_OTHER_LETTER},
+	{0x00d7fc, 0x00d7ff, PG_U_UNASSIGNED},
+	{0x00d800, 0x00dfff, PG_U_SURROGATE},
+	{0x00e000, 0x00f8ff, PG_U_PRIVATE_USE_CHAR},
+	{0x00f900, 0x00fa6d, PG_U_OTHER_LETTER},
+	{0x00fa6e, 0x00fa6f, PG_U_UNASSIGNED},
+	{0x00fa70, 0x00fad9, PG_U_OTHER_LETTER},
+	{0x00fada, 0x00faff, PG_U_UNASSIGNED},
+	{0x00fb00, 0x00fb06, PG_U_LOWERCASE_LETTER},
+	{0x00fb07, 0x00fb12, PG_U_UNASSIGNED},
+	{0x00fb13, 0x00fb17, PG_U_LOWERCASE_LETTER},
+	{0x00fb18, 0x00fb1c, PG_U_UNASSIGNED},
+	{0x00fb1d, 0x00fb1d, PG_U_OTHER_LETTER},
+	{0x00fb1e, 0x00fb1e, PG_U_NON_SPACING_MARK},
+	{0x00fb1f, 0x00fb28, PG_U_OTHER_LETTER},
+	{0x00fb29, 0x00fb29, PG_U_MATH_SYMBOL},
+	{0x00fb2a, 0x00fb36, PG_U_OTHER_LETTER},
+	{0x00fb37, 0x00fb37, PG_U_UNASSIGNED},
+	{0x00fb38, 0x00fb3c, PG_U_OTHER_LETTER},
+	{0x00fb3d, 0x00fb3d, PG_U_UNASSIGNED},
+	{0x00fb3e, 0x00fb3e, PG_U_OTHER_LETTER},
+	{0x00fb3f, 0x00fb3f, PG_U_UNASSIGNED},
+	{0x00fb40, 0x00fb41, PG_U_OTHER_LETTER},
+	{0x00fb42, 0x00fb42, PG_U_UNASSIGNED},
+	{0x00fb43, 0x00fb44, PG_U_OTHER_LETTER},
+	{0x00fb45, 0x00fb45, PG_U_UNASSIGNED},
+	{0x00fb46, 0x00fbb1, PG_U_OTHER_LETTER},
+	{0x00fbb2, 0x00fbc2, PG_U_MODIFIER_SYMBOL},
+	{0x00fbc3, 0x00fbd2, PG_U_UNASSIGNED},
+	{0x00fbd3, 0x00fd3d, PG_U_OTHER_LETTER},
+	{0x00fd3e, 0x00fd3e, PG_U_END_PUNCTUATION},
+	{0x00fd3f, 0x00fd3f, PG_U_START_PUNCTUATION},
+	{0x00fd40, 0x00fd4f, PG_U_OTHER_SYMBOL},
+	{0x00fd50, 0x00fd8f, PG_U_OTHER_LETTER},
+	{0x00fd90, 0x00fd91, PG_U_UNASSIGNED},
+	{0x00fd92, 0x00fdc7, PG_U_OTHER_LETTER},
+	{0x00fdc8, 0x00fdce, PG_U_UNASSIGNED},
+	{0x00fdcf, 0x00fdcf, PG_U_OTHER_SYMBOL},
+	{0x00fdd0, 0x00fdef, PG_U_UNASSIGNED},
+	{0x00fdf0, 0x00fdfb, PG_U_OTHER_LETTER},
+	{0x00fdfc, 0x00fdfc, PG_U_CURRENCY_SYMBOL},
+	{0x00fdfd, 0x00fdff, PG_U_OTHER_SYMBOL},
+	{0x00fe00, 0x00fe0f, PG_U_NON_SPACING_MARK},
+	{0x00fe10, 0x00fe16, PG_U_OTHER_PUNCTUATION},
+	{0x00fe17, 0x00fe17, PG_U_START_PUNCTUATION},
+	{0x00fe18, 0x00fe18, PG_U_END_PUNCTUATION},
+	{0x00fe19, 0x00fe19, PG_U_OTHER_PUNCTUATION},
+	{0x00fe1a, 0x00fe1f, PG_U_UNASSIGNED},
+	{0x00fe20, 0x00fe2f, PG_U_NON_SPACING_MARK},
+	{0x00fe30, 0x00fe30, PG_U_OTHER_PUNCTUATION},
+	{0x00fe31, 0x00fe32, PG_U_DASH_PUNCTUATION},
+	{0x00fe33, 0x00fe34, PG_U_CONNECTOR_PUNCTUATION},
+	{0x00fe35, 0x00fe35, PG_U_START_PUNCTUATION},
+	{0x00fe36, 0x00fe36, PG_U_END_PUNCTUATION},
+	{0x00fe37, 0x00fe37, PG_U_START_PUNCTUATION},
+	{0x00fe38, 0x00fe38, PG_U_END_PUNCTUATION},
+	{0x00fe39, 0x00fe39, PG_U_START_PUNCTUATION},
+	{0x00fe3a, 0x00fe3a, PG_U_END_PUNCTUATION},
+	{0x00fe3b, 0x00fe3b, PG_U_START_PUNCTUATION},
+	{0x00fe3c, 0x00fe3c, PG_U_END_PUNCTUATION},
+	{0x00fe3d, 0x00fe3d, PG_U_START_PUNCTUATION},
+	{0x00fe3e, 0x00fe3e, PG_U_END_PUNCTUATION},
+	{0x00fe3f, 0x00fe3f, PG_U_START_PUNCTUATION},
+	{0x00fe40, 0x00fe40, PG_U_END_PUNCTUATION},
+	{0x00fe41, 0x00fe41, PG_U_START_PUNCTUATION},
+	{0x00fe42, 0x00fe42, PG_U_END_PUNCTUATION},
+	{0x00fe43, 0x00fe43, PG_U_START_PUNCTUATION},
+	{0x00fe44, 0x00fe44, PG_U_END_PUNCTUATION},
+	{0x00fe45, 0x00fe46, PG_U_OTHER_PUNCTUATION},
+	{0x00fe47, 0x00fe47, PG_U_START_PUNCTUATION},
+	{0x00fe48, 0x00fe48, PG_U_END_PUNCTUATION},
+	{0x00fe49, 0x00fe4c, PG_U_OTHER_PUNCTUATION},
+	{0x00fe4d, 0x00fe4f, PG_U_CONNECTOR_PUNCTUATION},
+	{0x00fe50, 0x00fe52, PG_U_OTHER_PUNCTUATION},
+	{0x00fe53, 0x00fe53, PG_U_UNASSIGNED},
+	{0x00fe54, 0x00fe57, PG_U_OTHER_PUNCTUATION},
+	{0x00fe58, 0x00fe58, PG_U_DASH_PUNCTUATION},
+	{0x00fe59, 0x00fe59, PG_U_START_PUNCTUATION},
+	{0x00fe5a, 0x00fe5a, PG_U_END_PUNCTUATION},
+	{0x00fe5b, 0x00fe5b, PG_U_START_PUNCTUATION},
+	{0x00fe5c, 0x00fe5c, PG_U_END_PUNCTUATION},
+	{0x00fe5d, 0x00fe5d, PG_U_START_PUNCTUATION},
+	{0x00fe5e, 0x00fe5e, PG_U_END_PUNCTUATION},
+	{0x00fe5f, 0x00fe61, PG_U_OTHER_PUNCTUATION},
+	{0x00fe62, 0x00fe62, PG_U_MATH_SYMBOL},
+	{0x00fe63, 0x00fe63, PG_U_DASH_PUNCTUATION},
+	{0x00fe64, 0x00fe66, PG_U_MATH_SYMBOL},
+	{0x00fe67, 0x00fe67, PG_U_UNASSIGNED},
+	{0x00fe68, 0x00fe68, PG_U_OTHER_PUNCTUATION},
+	{0x00fe69, 0x00fe69, PG_U_CURRENCY_SYMBOL},
+	{0x00fe6a, 0x00fe6b, PG_U_OTHER_PUNCTUATION},
+	{0x00fe6c, 0x00fe6f, PG_U_UNASSIGNED},
+	{0x00fe70, 0x00fe74, PG_U_OTHER_LETTER},
+	{0x00fe75, 0x00fe75, PG_U_UNASSIGNED},
+	{0x00fe76, 0x00fefc, PG_U_OTHER_LETTER},
+	{0x00fefd, 0x00fefe, PG_U_UNASSIGNED},
+	{0x00feff, 0x00feff, PG_U_FORMAT_CHAR},
+	{0x00ff00, 0x00ff00, PG_U_UNASSIGNED},
+	{0x00ff01, 0x00ff03, PG_U_OTHER_PUNCTUATION},
+	{0x00ff04, 0x00ff04, PG_U_CURRENCY_SYMBOL},
+	{0x00ff05, 0x00ff07, PG_U_OTHER_PUNCTUATION},
+	{0x00ff08, 0x00ff08, PG_U_START_PUNCTUATION},
+	{0x00ff09, 0x00ff09, PG_U_END_PUNCTUATION},
+	{0x00ff0a, 0x00ff0a, PG_U_OTHER_PUNCTUATION},
+	{0x00ff0b, 0x00ff0b, PG_U_MATH_SYMBOL},
+	{0x00ff0c, 0x00ff0c, PG_U_OTHER_PUNCTUATION},
+	{0x00ff0d, 0x00ff0d, PG_U_DASH_PUNCTUATION},
+	{0x00ff0e, 0x00ff0f, PG_U_OTHER_PUNCTUATION},
+	{0x00ff10, 0x00ff19, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00ff1a, 0x00ff1b, PG_U_OTHER_PUNCTUATION},
+	{0x00ff1c, 0x00ff1e, PG_U_MATH_SYMBOL},
+	{0x00ff1f, 0x00ff20, PG_U_OTHER_PUNCTUATION},
+	{0x00ff21, 0x00ff3a, PG_U_UPPERCASE_LETTER},
+	{0x00ff3b, 0x00ff3b, PG_U_START_PUNCTUATION},
+	{0x00ff3c, 0x00ff3c, PG_U_OTHER_PUNCTUATION},
+	{0x00ff3d, 0x00ff3d, PG_U_END_PUNCTUATION},
+	{0x00ff3e, 0x00ff3e, PG_U_MODIFIER_SYMBOL},
+	{0x00ff3f, 0x00ff3f, PG_U_CONNECTOR_PUNCTUATION},
+	{0x00ff40, 0x00ff40, PG_U_MODIFIER_SYMBOL},
+	{0x00ff41, 0x00ff5a, PG_U_LOWERCASE_LETTER},
+	{0x00ff5b, 0x00ff5b, PG_U_START_PUNCTUATION},
+	{0x00ff5c, 0x00ff5c, PG_U_MATH_SYMBOL},
+	{0x00ff5d, 0x00ff5d, PG_U_END_PUNCTUATION},
+	{0x00ff5e, 0x00ff5e, PG_U_MATH_SYMBOL},
+	{0x00ff5f, 0x00ff5f, PG_U_START_PUNCTUATION},
+	{0x00ff60, 0x00ff60, PG_U_END_PUNCTUATION},
+	{0x00ff61, 0x00ff61, PG_U_OTHER_PUNCTUATION},
+	{0x00ff62, 0x00ff62, PG_U_START_PUNCTUATION},
+	{0x00ff63, 0x00ff63, PG_U_END_PUNCTUATION},
+	{0x00ff64, 0x00ff65, PG_U_OTHER_PUNCTUATION},
+	{0x00ff66, 0x00ff6f, PG_U_OTHER_LETTER},
+	{0x00ff70, 0x00ff70, PG_U_MODIFIER_LETTER},
+	{0x00ff71, 0x00ff9d, PG_U_OTHER_LETTER},
+	{0x00ff9e, 0x00ff9f, PG_U_MODIFIER_LETTER},
+	{0x00ffa0, 0x00ffbe, PG_U_OTHER_LETTER},
+	{0x00ffbf, 0x00ffc1, PG_U_UNASSIGNED},
+	{0x00ffc2, 0x00ffc7, PG_U_OTHER_LETTER},
+	{0x00ffc8, 0x00ffc9, PG_U_UNASSIGNED},
+	{0x00ffca, 0x00ffcf, PG_U_OTHER_LETTER},
+	{0x00ffd0, 0x00ffd1, PG_U_UNASSIGNED},
+	{0x00ffd2, 0x00ffd7, PG_U_OTHER_LETTER},
+	{0x00ffd8, 0x00ffd9, PG_U_UNASSIGNED},
+	{0x00ffda, 0x00ffdc, PG_U_OTHER_LETTER},
+	{0x00ffdd, 0x00ffdf, PG_U_UNASSIGNED},
+	{0x00ffe0, 0x00ffe1, PG_U_CURRENCY_SYMBOL},
+	{0x00ffe2, 0x00ffe2, PG_U_MATH_SYMBOL},
+	{0x00ffe3, 0x00ffe3, PG_U_MODIFIER_SYMBOL},
+	{0x00ffe4, 0x00ffe4, PG_U_OTHER_SYMBOL},
+	{0x00ffe5, 0x00ffe6, PG_U_CURRENCY_SYMBOL},
+	{0x00ffe7, 0x00ffe7, PG_U_UNASSIGNED},
+	{0x00ffe8, 0x00ffe8, PG_U_OTHER_SYMBOL},
+	{0x00ffe9, 0x00ffec, PG_U_MATH_SYMBOL},
+	{0x00ffed, 0x00ffee, PG_U_OTHER_SYMBOL},
+	{0x00ffef, 0x00fff8, PG_U_UNASSIGNED},
+	{0x00fff9, 0x00fffb, PG_U_FORMAT_CHAR},
+	{0x00fffc, 0x00fffd, PG_U_OTHER_SYMBOL},
+	{0x00fffe, 0x00ffff, PG_U_UNASSIGNED},
+	{0x010000, 0x01000b, PG_U_OTHER_LETTER},
+	{0x01000c, 0x01000c, PG_U_UNASSIGNED},
+	{0x01000d, 0x010026, PG_U_OTHER_LETTER},
+	{0x010027, 0x010027, PG_U_UNASSIGNED},
+	{0x010028, 0x01003a, PG_U_OTHER_LETTER},
+	{0x01003b, 0x01003b, PG_U_UNASSIGNED},
+	{0x01003c, 0x01003d, PG_U_OTHER_LETTER},
+	{0x01003e, 0x01003e, PG_U_UNASSIGNED},
+	{0x01003f, 0x01004d, PG_U_OTHER_LETTER},
+	{0x01004e, 0x01004f, PG_U_UNASSIGNED},
+	{0x010050, 0x01005d, PG_U_OTHER_LETTER},
+	{0x01005e, 0x01007f, PG_U_UNASSIGNED},
+	{0x010080, 0x0100fa, PG_U_OTHER_LETTER},
+	{0x0100fb, 0x0100ff, PG_U_UNASSIGNED},
+	{0x010100, 0x010102, PG_U_OTHER_PUNCTUATION},
+	{0x010103, 0x010106, PG_U_UNASSIGNED},
+	{0x010107, 0x010133, PG_U_OTHER_NUMBER},
+	{0x010134, 0x010136, PG_U_UNASSIGNED},
+	{0x010137, 0x01013f, PG_U_OTHER_SYMBOL},
+	{0x010140, 0x010174, PG_U_LETTER_NUMBER},
+	{0x010175, 0x010178, PG_U_OTHER_NUMBER},
+	{0x010179, 0x010189, PG_U_OTHER_SYMBOL},
+	{0x01018a, 0x01018b, PG_U_OTHER_NUMBER},
+	{0x01018c, 0x01018e, PG_U_OTHER_SYMBOL},
+	{0x01018f, 0x01018f, PG_U_UNASSIGNED},
+	{0x010190, 0x01019c, PG_U_OTHER_SYMBOL},
+	{0x01019d, 0x01019f, PG_U_UNASSIGNED},
+	{0x0101a0, 0x0101a0, PG_U_OTHER_SYMBOL},
+	{0x0101a1, 0x0101cf, PG_U_UNASSIGNED},
+	{0x0101d0, 0x0101fc, PG_U_OTHER_SYMBOL},
+	{0x0101fd, 0x0101fd, PG_U_NON_SPACING_MARK},
+	{0x0101fe, 0x01027f, PG_U_UNASSIGNED},
+	{0x010280, 0x01029c, PG_U_OTHER_LETTER},
+	{0x01029d, 0x01029f, PG_U_UNASSIGNED},
+	{0x0102a0, 0x0102d0, PG_U_OTHER_LETTER},
+	{0x0102d1, 0x0102df, PG_U_UNASSIGNED},
+	{0x0102e0, 0x0102e0, PG_U_NON_SPACING_MARK},
+	{0x0102e1, 0x0102fb, PG_U_OTHER_NUMBER},
+	{0x0102fc, 0x0102ff, PG_U_UNASSIGNED},
+	{0x010300, 0x01031f, PG_U_OTHER_LETTER},
+	{0x010320, 0x010323, PG_U_OTHER_NUMBER},
+	{0x010324, 0x01032c, PG_U_UNASSIGNED},
+	{0x01032d, 0x010340, PG_U_OTHER_LETTER},
+	{0x010341, 0x010341, PG_U_LETTER_NUMBER},
+	{0x010342, 0x010349, PG_U_OTHER_LETTER},
+	{0x01034a, 0x01034a, PG_U_LETTER_NUMBER},
+	{0x01034b, 0x01034f, PG_U_UNASSIGNED},
+	{0x010350, 0x010375, PG_U_OTHER_LETTER},
+	{0x010376, 0x01037a, PG_U_NON_SPACING_MARK},
+	{0x01037b, 0x01037f, PG_U_UNASSIGNED},
+	{0x010380, 0x01039d, PG_U_OTHER_LETTER},
+	{0x01039e, 0x01039e, PG_U_UNASSIGNED},
+	{0x01039f, 0x01039f, PG_U_OTHER_PUNCTUATION},
+	{0x0103a0, 0x0103c3, PG_U_OTHER_LETTER},
+	{0x0103c4, 0x0103c7, PG_U_UNASSIGNED},
+	{0x0103c8, 0x0103cf, PG_U_OTHER_LETTER},
+	{0x0103d0, 0x0103d0, PG_U_OTHER_PUNCTUATION},
+	{0x0103d1, 0x0103d5, PG_U_LETTER_NUMBER},
+	{0x0103d6, 0x0103ff, PG_U_UNASSIGNED},
+	{0x010400, 0x010427, PG_U_UPPERCASE_LETTER},
+	{0x010428, 0x01044f, PG_U_LOWERCASE_LETTER},
+	{0x010450, 0x01049d, PG_U_OTHER_LETTER},
+	{0x01049e, 0x01049f, PG_U_UNASSIGNED},
+	{0x0104a0, 0x0104a9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0104aa, 0x0104af, PG_U_UNASSIGNED},
+	{0x0104b0, 0x0104d3, PG_U_UPPERCASE_LETTER},
+	{0x0104d4, 0x0104d7, PG_U_UNASSIGNED},
+	{0x0104d8, 0x0104fb, PG_U_LOWERCASE_LETTER},
+	{0x0104fc, 0x0104ff, PG_U_UNASSIGNED},
+	{0x010500, 0x010527, PG_U_OTHER_LETTER},
+	{0x010528, 0x01052f, PG_U_UNASSIGNED},
+	{0x010530, 0x010563, PG_U_OTHER_LETTER},
+	{0x010564, 0x01056e, PG_U_UNASSIGNED},
+	{0x01056f, 0x01056f, PG_U_OTHER_PUNCTUATION},
+	{0x010570, 0x01057a, PG_U_UPPERCASE_LETTER},
+	{0x01057b, 0x01057b, PG_U_UNASSIGNED},
+	{0x01057c, 0x01058a, PG_U_UPPERCASE_LETTER},
+	{0x01058b, 0x01058b, PG_U_UNASSIGNED},
+	{0x01058c, 0x010592, PG_U_UPPERCASE_LETTER},
+	{0x010593, 0x010593, PG_U_UNASSIGNED},
+	{0x010594, 0x010595, PG_U_UPPERCASE_LETTER},
+	{0x010596, 0x010596, PG_U_UNASSIGNED},
+	{0x010597, 0x0105a1, PG_U_LOWERCASE_LETTER},
+	{0x0105a2, 0x0105a2, PG_U_UNASSIGNED},
+	{0x0105a3, 0x0105b1, PG_U_LOWERCASE_LETTER},
+	{0x0105b2, 0x0105b2, PG_U_UNASSIGNED},
+	{0x0105b3, 0x0105b9, PG_U_LOWERCASE_LETTER},
+	{0x0105ba, 0x0105ba, PG_U_UNASSIGNED},
+	{0x0105bb, 0x0105bc, PG_U_LOWERCASE_LETTER},
+	{0x0105bd, 0x0105ff, PG_U_UNASSIGNED},
+	{0x010600, 0x010736, PG_U_OTHER_LETTER},
+	{0x010737, 0x01073f, PG_U_UNASSIGNED},
+	{0x010740, 0x010755, PG_U_OTHER_LETTER},
+	{0x010756, 0x01075f, PG_U_UNASSIGNED},
+	{0x010760, 0x010767, PG_U_OTHER_LETTER},
+	{0x010768, 0x01077f, PG_U_UNASSIGNED},
+	{0x010780, 0x010785, PG_U_MODIFIER_LETTER},
+	{0x010786, 0x010786, PG_U_UNASSIGNED},
+	{0x010787, 0x0107b0, PG_U_MODIFIER_LETTER},
+	{0x0107b1, 0x0107b1, PG_U_UNASSIGNED},
+	{0x0107b2, 0x0107ba, PG_U_MODIFIER_LETTER},
+	{0x0107bb, 0x0107ff, PG_U_UNASSIGNED},
+	{0x010800, 0x010805, PG_U_OTHER_LETTER},
+	{0x010806, 0x010807, PG_U_UNASSIGNED},
+	{0x010808, 0x010808, PG_U_OTHER_LETTER},
+	{0x010809, 0x010809, PG_U_UNASSIGNED},
+	{0x01080a, 0x010835, PG_U_OTHER_LETTER},
+	{0x010836, 0x010836, PG_U_UNASSIGNED},
+	{0x010837, 0x010838, PG_U_OTHER_LETTER},
+	{0x010839, 0x01083b, PG_U_UNASSIGNED},
+	{0x01083c, 0x01083c, PG_U_OTHER_LETTER},
+	{0x01083d, 0x01083e, PG_U_UNASSIGNED},
+	{0x01083f, 0x010855, PG_U_OTHER_LETTER},
+	{0x010856, 0x010856, PG_U_UNASSIGNED},
+	{0x010857, 0x010857, PG_U_OTHER_PUNCTUATION},
+	{0x010858, 0x01085f, PG_U_OTHER_NUMBER},
+	{0x010860, 0x010876, PG_U_OTHER_LETTER},
+	{0x010877, 0x010878, PG_U_OTHER_SYMBOL},
+	{0x010879, 0x01087f, PG_U_OTHER_NUMBER},
+	{0x010880, 0x01089e, PG_U_OTHER_LETTER},
+	{0x01089f, 0x0108a6, PG_U_UNASSIGNED},
+	{0x0108a7, 0x0108af, PG_U_OTHER_NUMBER},
+	{0x0108b0, 0x0108df, PG_U_UNASSIGNED},
+	{0x0108e0, 0x0108f2, PG_U_OTHER_LETTER},
+	{0x0108f3, 0x0108f3, PG_U_UNASSIGNED},
+	{0x0108f4, 0x0108f5, PG_U_OTHER_LETTER},
+	{0x0108f6, 0x0108fa, PG_U_UNASSIGNED},
+	{0x0108fb, 0x0108ff, PG_U_OTHER_NUMBER},
+	{0x010900, 0x010915, PG_U_OTHER_LETTER},
+	{0x010916, 0x01091b, PG_U_OTHER_NUMBER},
+	{0x01091c, 0x01091e, PG_U_UNASSIGNED},
+	{0x01091f, 0x01091f, PG_U_OTHER_PUNCTUATION},
+	{0x010920, 0x010939, PG_U_OTHER_LETTER},
+	{0x01093a, 0x01093e, PG_U_UNASSIGNED},
+	{0x01093f, 0x01093f, PG_U_OTHER_PUNCTUATION},
+	{0x010940, 0x01097f, PG_U_UNASSIGNED},
+	{0x010980, 0x0109b7, PG_U_OTHER_LETTER},
+	{0x0109b8, 0x0109bb, PG_U_UNASSIGNED},
+	{0x0109bc, 0x0109bd, PG_U_OTHER_NUMBER},
+	{0x0109be, 0x0109bf, PG_U_OTHER_LETTER},
+	{0x0109c0, 0x0109cf, PG_U_OTHER_NUMBER},
+	{0x0109d0, 0x0109d1, PG_U_UNASSIGNED},
+	{0x0109d2, 0x0109ff, PG_U_OTHER_NUMBER},
+	{0x010a00, 0x010a00, PG_U_OTHER_LETTER},
+	{0x010a01, 0x010a03, PG_U_NON_SPACING_MARK},
+	{0x010a04, 0x010a04, PG_U_UNASSIGNED},
+	{0x010a05, 0x010a06, PG_U_NON_SPACING_MARK},
+	{0x010a07, 0x010a0b, PG_U_UNASSIGNED},
+	{0x010a0c, 0x010a0f, PG_U_NON_SPACING_MARK},
+	{0x010a10, 0x010a13, PG_U_OTHER_LETTER},
+	{0x010a14, 0x010a14, PG_U_UNASSIGNED},
+	{0x010a15, 0x010a17, PG_U_OTHER_LETTER},
+	{0x010a18, 0x010a18, PG_U_UNASSIGNED},
+	{0x010a19, 0x010a35, PG_U_OTHER_LETTER},
+	{0x010a36, 0x010a37, PG_U_UNASSIGNED},
+	{0x010a38, 0x010a3a, PG_U_NON_SPACING_MARK},
+	{0x010a3b, 0x010a3e, PG_U_UNASSIGNED},
+	{0x010a3f, 0x010a3f, PG_U_NON_SPACING_MARK},
+	{0x010a40, 0x010a48, PG_U_OTHER_NUMBER},
+	{0x010a49, 0x010a4f, PG_U_UNASSIGNED},
+	{0x010a50, 0x010a58, PG_U_OTHER_PUNCTUATION},
+	{0x010a59, 0x010a5f, PG_U_UNASSIGNED},
+	{0x010a60, 0x010a7c, PG_U_OTHER_LETTER},
+	{0x010a7d, 0x010a7e, PG_U_OTHER_NUMBER},
+	{0x010a7f, 0x010a7f, PG_U_OTHER_PUNCTUATION},
+	{0x010a80, 0x010a9c, PG_U_OTHER_LETTER},
+	{0x010a9d, 0x010a9f, PG_U_OTHER_NUMBER},
+	{0x010aa0, 0x010abf, PG_U_UNASSIGNED},
+	{0x010ac0, 0x010ac7, PG_U_OTHER_LETTER},
+	{0x010ac8, 0x010ac8, PG_U_OTHER_SYMBOL},
+	{0x010ac9, 0x010ae4, PG_U_OTHER_LETTER},
+	{0x010ae5, 0x010ae6, PG_U_NON_SPACING_MARK},
+	{0x010ae7, 0x010aea, PG_U_UNASSIGNED},
+	{0x010aeb, 0x010aef, PG_U_OTHER_NUMBER},
+	{0x010af0, 0x010af6, PG_U_OTHER_PUNCTUATION},
+	{0x010af7, 0x010aff, PG_U_UNASSIGNED},
+	{0x010b00, 0x010b35, PG_U_OTHER_LETTER},
+	{0x010b36, 0x010b38, PG_U_UNASSIGNED},
+	{0x010b39, 0x010b3f, PG_U_OTHER_PUNCTUATION},
+	{0x010b40, 0x010b55, PG_U_OTHER_LETTER},
+	{0x010b56, 0x010b57, PG_U_UNASSIGNED},
+	{0x010b58, 0x010b5f, PG_U_OTHER_NUMBER},
+	{0x010b60, 0x010b72, PG_U_OTHER_LETTER},
+	{0x010b73, 0x010b77, PG_U_UNASSIGNED},
+	{0x010b78, 0x010b7f, PG_U_OTHER_NUMBER},
+	{0x010b80, 0x010b91, PG_U_OTHER_LETTER},
+	{0x010b92, 0x010b98, PG_U_UNASSIGNED},
+	{0x010b99, 0x010b9c, PG_U_OTHER_PUNCTUATION},
+	{0x010b9d, 0x010ba8, PG_U_UNASSIGNED},
+	{0x010ba9, 0x010baf, PG_U_OTHER_NUMBER},
+	{0x010bb0, 0x010bff, PG_U_UNASSIGNED},
+	{0x010c00, 0x010c48, PG_U_OTHER_LETTER},
+	{0x010c49, 0x010c7f, PG_U_UNASSIGNED},
+	{0x010c80, 0x010cb2, PG_U_UPPERCASE_LETTER},
+	{0x010cb3, 0x010cbf, PG_U_UNASSIGNED},
+	{0x010cc0, 0x010cf2, PG_U_LOWERCASE_LETTER},
+	{0x010cf3, 0x010cf9, PG_U_UNASSIGNED},
+	{0x010cfa, 0x010cff, PG_U_OTHER_NUMBER},
+	{0x010d00, 0x010d23, PG_U_OTHER_LETTER},
+	{0x010d24, 0x010d27, PG_U_NON_SPACING_MARK},
+	{0x010d28, 0x010d2f, PG_U_UNASSIGNED},
+	{0x010d30, 0x010d39, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x010d3a, 0x010e5f, PG_U_UNASSIGNED},
+	{0x010e60, 0x010e7e, PG_U_OTHER_NUMBER},
+	{0x010e7f, 0x010e7f, PG_U_UNASSIGNED},
+	{0x010e80, 0x010ea9, PG_U_OTHER_LETTER},
+	{0x010eaa, 0x010eaa, PG_U_UNASSIGNED},
+	{0x010eab, 0x010eac, PG_U_NON_SPACING_MARK},
+	{0x010ead, 0x010ead, PG_U_DASH_PUNCTUATION},
+	{0x010eae, 0x010eaf, PG_U_UNASSIGNED},
+	{0x010eb0, 0x010eb1, PG_U_OTHER_LETTER},
+	{0x010eb2, 0x010efc, PG_U_UNASSIGNED},
+	{0x010efd, 0x010eff, PG_U_NON_SPACING_MARK},
+	{0x010f00, 0x010f1c, PG_U_OTHER_LETTER},
+	{0x010f1d, 0x010f26, PG_U_OTHER_NUMBER},
+	{0x010f27, 0x010f27, PG_U_OTHER_LETTER},
+	{0x010f28, 0x010f2f, PG_U_UNASSIGNED},
+	{0x010f30, 0x010f45, PG_U_OTHER_LETTER},
+	{0x010f46, 0x010f50, PG_U_NON_SPACING_MARK},
+	{0x010f51, 0x010f54, PG_U_OTHER_NUMBER},
+	{0x010f55, 0x010f59, PG_U_OTHER_PUNCTUATION},
+	{0x010f5a, 0x010f6f, PG_U_UNASSIGNED},
+	{0x010f70, 0x010f81, PG_U_OTHER_LETTER},
+	{0x010f82, 0x010f85, PG_U_NON_SPACING_MARK},
+	{0x010f86, 0x010f89, PG_U_OTHER_PUNCTUATION},
+	{0x010f8a, 0x010faf, PG_U_UNASSIGNED},
+	{0x010fb0, 0x010fc4, PG_U_OTHER_LETTER},
+	{0x010fc5, 0x010fcb, PG_U_OTHER_NUMBER},
+	{0x010fcc, 0x010fdf, PG_U_UNASSIGNED},
+	{0x010fe0, 0x010ff6, PG_U_OTHER_LETTER},
+	{0x010ff7, 0x010fff, PG_U_UNASSIGNED},
+	{0x011000, 0x011000, PG_U_COMBINING_SPACING_MARK},
+	{0x011001, 0x011001, PG_U_NON_SPACING_MARK},
+	{0x011002, 0x011002, PG_U_COMBINING_SPACING_MARK},
+	{0x011003, 0x011037, PG_U_OTHER_LETTER},
+	{0x011038, 0x011046, PG_U_NON_SPACING_MARK},
+	{0x011047, 0x01104d, PG_U_OTHER_PUNCTUATION},
+	{0x01104e, 0x011051, PG_U_UNASSIGNED},
+	{0x011052, 0x011065, PG_U_OTHER_NUMBER},
+	{0x011066, 0x01106f, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x011070, 0x011070, PG_U_NON_SPACING_MARK},
+	{0x011071, 0x011072, PG_U_OTHER_LETTER},
+	{0x011073, 0x011074, PG_U_NON_SPACING_MARK},
+	{0x011075, 0x011075, PG_U_OTHER_LETTER},
+	{0x011076, 0x01107e, PG_U_UNASSIGNED},
+	{0x01107f, 0x011081, PG_U_NON_SPACING_MARK},
+	{0x011082, 0x011082, PG_U_COMBINING_SPACING_MARK},
+	{0x011083, 0x0110af, PG_U_OTHER_LETTER},
+	{0x0110b0, 0x0110b2, PG_U_COMBINING_SPACING_MARK},
+	{0x0110b3, 0x0110b6, PG_U_NON_SPACING_MARK},
+	{0x0110b7, 0x0110b8, PG_U_COMBINING_SPACING_MARK},
+	{0x0110b9, 0x0110ba, PG_U_NON_SPACING_MARK},
+	{0x0110bb, 0x0110bc, PG_U_OTHER_PUNCTUATION},
+	{0x0110bd, 0x0110bd, PG_U_FORMAT_CHAR},
+	{0x0110be, 0x0110c1, PG_U_OTHER_PUNCTUATION},
+	{0x0110c2, 0x0110c2, PG_U_NON_SPACING_MARK},
+	{0x0110c3, 0x0110cc, PG_U_UNASSIGNED},
+	{0x0110cd, 0x0110cd, PG_U_FORMAT_CHAR},
+	{0x0110ce, 0x0110cf, PG_U_UNASSIGNED},
+	{0x0110d0, 0x0110e8, PG_U_OTHER_LETTER},
+	{0x0110e9, 0x0110ef, PG_U_UNASSIGNED},
+	{0x0110f0, 0x0110f9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0110fa, 0x0110ff, PG_U_UNASSIGNED},
+	{0x011100, 0x011102, PG_U_NON_SPACING_MARK},
+	{0x011103, 0x011126, PG_U_OTHER_LETTER},
+	{0x011127, 0x01112b, PG_U_NON_SPACING_MARK},
+	{0x01112c, 0x01112c, PG_U_COMBINING_SPACING_MARK},
+	{0x01112d, 0x011134, PG_U_NON_SPACING_MARK},
+	{0x011135, 0x011135, PG_U_UNASSIGNED},
+	{0x011136, 0x01113f, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x011140, 0x011143, PG_U_OTHER_PUNCTUATION},
+	{0x011144, 0x011144, PG_U_OTHER_LETTER},
+	{0x011145, 0x011146, PG_U_COMBINING_SPACING_MARK},
+	{0x011147, 0x011147, PG_U_OTHER_LETTER},
+	{0x011148, 0x01114f, PG_U_UNASSIGNED},
+	{0x011150, 0x011172, PG_U_OTHER_LETTER},
+	{0x011173, 0x011173, PG_U_NON_SPACING_MARK},
+	{0x011174, 0x011175, PG_U_OTHER_PUNCTUATION},
+	{0x011176, 0x011176, PG_U_OTHER_LETTER},
+	{0x011177, 0x01117f, PG_U_UNASSIGNED},
+	{0x011180, 0x011181, PG_U_NON_SPACING_MARK},
+	{0x011182, 0x011182, PG_U_COMBINING_SPACING_MARK},
+	{0x011183, 0x0111b2, PG_U_OTHER_LETTER},
+	{0x0111b3, 0x0111b5, PG_U_COMBINING_SPACING_MARK},
+	{0x0111b6, 0x0111be, PG_U_NON_SPACING_MARK},
+	{0x0111bf, 0x0111c0, PG_U_COMBINING_SPACING_MARK},
+	{0x0111c1, 0x0111c4, PG_U_OTHER_LETTER},
+	{0x0111c5, 0x0111c8, PG_U_OTHER_PUNCTUATION},
+	{0x0111c9, 0x0111cc, PG_U_NON_SPACING_MARK},
+	{0x0111cd, 0x0111cd, PG_U_OTHER_PUNCTUATION},
+	{0x0111ce, 0x0111ce, PG_U_COMBINING_SPACING_MARK},
+	{0x0111cf, 0x0111cf, PG_U_NON_SPACING_MARK},
+	{0x0111d0, 0x0111d9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0111da, 0x0111da, PG_U_OTHER_LETTER},
+	{0x0111db, 0x0111db, PG_U_OTHER_PUNCTUATION},
+	{0x0111dc, 0x0111dc, PG_U_OTHER_LETTER},
+	{0x0111dd, 0x0111df, PG_U_OTHER_PUNCTUATION},
+	{0x0111e0, 0x0111e0, PG_U_UNASSIGNED},
+	{0x0111e1, 0x0111f4, PG_U_OTHER_NUMBER},
+	{0x0111f5, 0x0111ff, PG_U_UNASSIGNED},
+	{0x011200, 0x011211, PG_U_OTHER_LETTER},
+	{0x011212, 0x011212, PG_U_UNASSIGNED},
+	{0x011213, 0x01122b, PG_U_OTHER_LETTER},
+	{0x01122c, 0x01122e, PG_U_COMBINING_SPACING_MARK},
+	{0x01122f, 0x011231, PG_U_NON_SPACING_MARK},
+	{0x011232, 0x011233, PG_U_COMBINING_SPACING_MARK},
+	{0x011234, 0x011234, PG_U_NON_SPACING_MARK},
+	{0x011235, 0x011235, PG_U_COMBINING_SPACING_MARK},
+	{0x011236, 0x011237, PG_U_NON_SPACING_MARK},
+	{0x011238, 0x01123d, PG_U_OTHER_PUNCTUATION},
+	{0x01123e, 0x01123e, PG_U_NON_SPACING_MARK},
+	{0x01123f, 0x011240, PG_U_OTHER_LETTER},
+	{0x011241, 0x011241, PG_U_NON_SPACING_MARK},
+	{0x011242, 0x01127f, PG_U_UNASSIGNED},
+	{0x011280, 0x011286, PG_U_OTHER_LETTER},
+	{0x011287, 0x011287, PG_U_UNASSIGNED},
+	{0x011288, 0x011288, PG_U_OTHER_LETTER},
+	{0x011289, 0x011289, PG_U_UNASSIGNED},
+	{0x01128a, 0x01128d, PG_U_OTHER_LETTER},
+	{0x01128e, 0x01128e, PG_U_UNASSIGNED},
+	{0x01128f, 0x01129d, PG_U_OTHER_LETTER},
+	{0x01129e, 0x01129e, PG_U_UNASSIGNED},
+	{0x01129f, 0x0112a8, PG_U_OTHER_LETTER},
+	{0x0112a9, 0x0112a9, PG_U_OTHER_PUNCTUATION},
+	{0x0112aa, 0x0112af, PG_U_UNASSIGNED},
+	{0x0112b0, 0x0112de, PG_U_OTHER_LETTER},
+	{0x0112df, 0x0112df, PG_U_NON_SPACING_MARK},
+	{0x0112e0, 0x0112e2, PG_U_COMBINING_SPACING_MARK},
+	{0x0112e3, 0x0112ea, PG_U_NON_SPACING_MARK},
+	{0x0112eb, 0x0112ef, PG_U_UNASSIGNED},
+	{0x0112f0, 0x0112f9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0112fa, 0x0112ff, PG_U_UNASSIGNED},
+	{0x011300, 0x011301, PG_U_NON_SPACING_MARK},
+	{0x011302, 0x011303, PG_U_COMBINING_SPACING_MARK},
+	{0x011304, 0x011304, PG_U_UNASSIGNED},
+	{0x011305, 0x01130c, PG_U_OTHER_LETTER},
+	{0x01130d, 0x01130e, PG_U_UNASSIGNED},
+	{0x01130f, 0x011310, PG_U_OTHER_LETTER},
+	{0x011311, 0x011312, PG_U_UNASSIGNED},
+	{0x011313, 0x011328, PG_U_OTHER_LETTER},
+	{0x011329, 0x011329, PG_U_UNASSIGNED},
+	{0x01132a, 0x011330, PG_U_OTHER_LETTER},
+	{0x011331, 0x011331, PG_U_UNASSIGNED},
+	{0x011332, 0x011333, PG_U_OTHER_LETTER},
+	{0x011334, 0x011334, PG_U_UNASSIGNED},
+	{0x011335, 0x011339, PG_U_OTHER_LETTER},
+	{0x01133a, 0x01133a, PG_U_UNASSIGNED},
+	{0x01133b, 0x01133c, PG_U_NON_SPACING_MARK},
+	{0x01133d, 0x01133d, PG_U_OTHER_LETTER},
+	{0x01133e, 0x01133f, PG_U_COMBINING_SPACING_MARK},
+	{0x011340, 0x011340, PG_U_NON_SPACING_MARK},
+	{0x011341, 0x011344, PG_U_COMBINING_SPACING_MARK},
+	{0x011345, 0x011346, PG_U_UNASSIGNED},
+	{0x011347, 0x011348, PG_U_COMBINING_SPACING_MARK},
+	{0x011349, 0x01134a, PG_U_UNASSIGNED},
+	{0x01134b, 0x01134d, PG_U_COMBINING_SPACING_MARK},
+	{0x01134e, 0x01134f, PG_U_UNASSIGNED},
+	{0x011350, 0x011350, PG_U_OTHER_LETTER},
+	{0x011351, 0x011356, PG_U_UNASSIGNED},
+	{0x011357, 0x011357, PG_U_COMBINING_SPACING_MARK},
+	{0x011358, 0x01135c, PG_U_UNASSIGNED},
+	{0x01135d, 0x011361, PG_U_OTHER_LETTER},
+	{0x011362, 0x011363, PG_U_COMBINING_SPACING_MARK},
+	{0x011364, 0x011365, PG_U_UNASSIGNED},
+	{0x011366, 0x01136c, PG_U_NON_SPACING_MARK},
+	{0x01136d, 0x01136f, PG_U_UNASSIGNED},
+	{0x011370, 0x011374, PG_U_NON_SPACING_MARK},
+	{0x011375, 0x0113ff, PG_U_UNASSIGNED},
+	{0x011400, 0x011434, PG_U_OTHER_LETTER},
+	{0x011435, 0x011437, PG_U_COMBINING_SPACING_MARK},
+	{0x011438, 0x01143f, PG_U_NON_SPACING_MARK},
+	{0x011440, 0x011441, PG_U_COMBINING_SPACING_MARK},
+	{0x011442, 0x011444, PG_U_NON_SPACING_MARK},
+	{0x011445, 0x011445, PG_U_COMBINING_SPACING_MARK},
+	{0x011446, 0x011446, PG_U_NON_SPACING_MARK},
+	{0x011447, 0x01144a, PG_U_OTHER_LETTER},
+	{0x01144b, 0x01144f, PG_U_OTHER_PUNCTUATION},
+	{0x011450, 0x011459, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x01145a, 0x01145b, PG_U_OTHER_PUNCTUATION},
+	{0x01145c, 0x01145c, PG_U_UNASSIGNED},
+	{0x01145d, 0x01145d, PG_U_OTHER_PUNCTUATION},
+	{0x01145e, 0x01145e, PG_U_NON_SPACING_MARK},
+	{0x01145f, 0x011461, PG_U_OTHER_LETTER},
+	{0x011462, 0x01147f, PG_U_UNASSIGNED},
+	{0x011480, 0x0114af, PG_U_OTHER_LETTER},
+	{0x0114b0, 0x0114b2, PG_U_COMBINING_SPACING_MARK},
+	{0x0114b3, 0x0114b8, PG_U_NON_SPACING_MARK},
+	{0x0114b9, 0x0114b9, PG_U_COMBINING_SPACING_MARK},
+	{0x0114ba, 0x0114ba, PG_U_NON_SPACING_MARK},
+	{0x0114bb, 0x0114be, PG_U_COMBINING_SPACING_MARK},
+	{0x0114bf, 0x0114c0, PG_U_NON_SPACING_MARK},
+	{0x0114c1, 0x0114c1, PG_U_COMBINING_SPACING_MARK},
+	{0x0114c2, 0x0114c3, PG_U_NON_SPACING_MARK},
+	{0x0114c4, 0x0114c5, PG_U_OTHER_LETTER},
+	{0x0114c6, 0x0114c6, PG_U_OTHER_PUNCTUATION},
+	{0x0114c7, 0x0114c7, PG_U_OTHER_LETTER},
+	{0x0114c8, 0x0114cf, PG_U_UNASSIGNED},
+	{0x0114d0, 0x0114d9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0114da, 0x01157f, PG_U_UNASSIGNED},
+	{0x011580, 0x0115ae, PG_U_OTHER_LETTER},
+	{0x0115af, 0x0115b1, PG_U_COMBINING_SPACING_MARK},
+	{0x0115b2, 0x0115b5, PG_U_NON_SPACING_MARK},
+	{0x0115b6, 0x0115b7, PG_U_UNASSIGNED},
+	{0x0115b8, 0x0115bb, PG_U_COMBINING_SPACING_MARK},
+	{0x0115bc, 0x0115bd, PG_U_NON_SPACING_MARK},
+	{0x0115be, 0x0115be, PG_U_COMBINING_SPACING_MARK},
+	{0x0115bf, 0x0115c0, PG_U_NON_SPACING_MARK},
+	{0x0115c1, 0x0115d7, PG_U_OTHER_PUNCTUATION},
+	{0x0115d8, 0x0115db, PG_U_OTHER_LETTER},
+	{0x0115dc, 0x0115dd, PG_U_NON_SPACING_MARK},
+	{0x0115de, 0x0115ff, PG_U_UNASSIGNED},
+	{0x011600, 0x01162f, PG_U_OTHER_LETTER},
+	{0x011630, 0x011632, PG_U_COMBINING_SPACING_MARK},
+	{0x011633, 0x01163a, PG_U_NON_SPACING_MARK},
+	{0x01163b, 0x01163c, PG_U_COMBINING_SPACING_MARK},
+	{0x01163d, 0x01163d, PG_U_NON_SPACING_MARK},
+	{0x01163e, 0x01163e, PG_U_COMBINING_SPACING_MARK},
+	{0x01163f, 0x011640, PG_U_NON_SPACING_MARK},
+	{0x011641, 0x011643, PG_U_OTHER_PUNCTUATION},
+	{0x011644, 0x011644, PG_U_OTHER_LETTER},
+	{0x011645, 0x01164f, PG_U_UNASSIGNED},
+	{0x011650, 0x011659, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x01165a, 0x01165f, PG_U_UNASSIGNED},
+	{0x011660, 0x01166c, PG_U_OTHER_PUNCTUATION},
+	{0x01166d, 0x01167f, PG_U_UNASSIGNED},
+	{0x011680, 0x0116aa, PG_U_OTHER_LETTER},
+	{0x0116ab, 0x0116ab, PG_U_NON_SPACING_MARK},
+	{0x0116ac, 0x0116ac, PG_U_COMBINING_SPACING_MARK},
+	{0x0116ad, 0x0116ad, PG_U_NON_SPACING_MARK},
+	{0x0116ae, 0x0116af, PG_U_COMBINING_SPACING_MARK},
+	{0x0116b0, 0x0116b5, PG_U_NON_SPACING_MARK},
+	{0x0116b6, 0x0116b6, PG_U_COMBINING_SPACING_MARK},
+	{0x0116b7, 0x0116b7, PG_U_NON_SPACING_MARK},
+	{0x0116b8, 0x0116b8, PG_U_OTHER_LETTER},
+	{0x0116b9, 0x0116b9, PG_U_OTHER_PUNCTUATION},
+	{0x0116ba, 0x0116bf, PG_U_UNASSIGNED},
+	{0x0116c0, 0x0116c9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0116ca, 0x0116ff, PG_U_UNASSIGNED},
+	{0x011700, 0x01171a, PG_U_OTHER_LETTER},
+	{0x01171b, 0x01171c, PG_U_UNASSIGNED},
+	{0x01171d, 0x01171f, PG_U_NON_SPACING_MARK},
+	{0x011720, 0x011721, PG_U_COMBINING_SPACING_MARK},
+	{0x011722, 0x011725, PG_U_NON_SPACING_MARK},
+	{0x011726, 0x011726, PG_U_COMBINING_SPACING_MARK},
+	{0x011727, 0x01172b, PG_U_NON_SPACING_MARK},
+	{0x01172c, 0x01172f, PG_U_UNASSIGNED},
+	{0x011730, 0x011739, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x01173a, 0x01173b, PG_U_OTHER_NUMBER},
+	{0x01173c, 0x01173e, PG_U_OTHER_PUNCTUATION},
+	{0x01173f, 0x01173f, PG_U_OTHER_SYMBOL},
+	{0x011740, 0x011746, PG_U_OTHER_LETTER},
+	{0x011747, 0x0117ff, PG_U_UNASSIGNED},
+	{0x011800, 0x01182b, PG_U_OTHER_LETTER},
+	{0x01182c, 0x01182e, PG_U_COMBINING_SPACING_MARK},
+	{0x01182f, 0x011837, PG_U_NON_SPACING_MARK},
+	{0x011838, 0x011838, PG_U_COMBINING_SPACING_MARK},
+	{0x011839, 0x01183a, PG_U_NON_SPACING_MARK},
+	{0x01183b, 0x01183b, PG_U_OTHER_PUNCTUATION},
+	{0x01183c, 0x01189f, PG_U_UNASSIGNED},
+	{0x0118a0, 0x0118bf, PG_U_UPPERCASE_LETTER},
+	{0x0118c0, 0x0118df, PG_U_LOWERCASE_LETTER},
+	{0x0118e0, 0x0118e9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0118ea, 0x0118f2, PG_U_OTHER_NUMBER},
+	{0x0118f3, 0x0118fe, PG_U_UNASSIGNED},
+	{0x0118ff, 0x011906, PG_U_OTHER_LETTER},
+	{0x011907, 0x011908, PG_U_UNASSIGNED},
+	{0x011909, 0x011909, PG_U_OTHER_LETTER},
+	{0x01190a, 0x01190b, PG_U_UNASSIGNED},
+	{0x01190c, 0x011913, PG_U_OTHER_LETTER},
+	{0x011914, 0x011914, PG_U_UNASSIGNED},
+	{0x011915, 0x011916, PG_U_OTHER_LETTER},
+	{0x011917, 0x011917, PG_U_UNASSIGNED},
+	{0x011918, 0x01192f, PG_U_OTHER_LETTER},
+	{0x011930, 0x011935, PG_U_COMBINING_SPACING_MARK},
+	{0x011936, 0x011936, PG_U_UNASSIGNED},
+	{0x011937, 0x011938, PG_U_COMBINING_SPACING_MARK},
+	{0x011939, 0x01193a, PG_U_UNASSIGNED},
+	{0x01193b, 0x01193c, PG_U_NON_SPACING_MARK},
+	{0x01193d, 0x01193d, PG_U_COMBINING_SPACING_MARK},
+	{0x01193e, 0x01193e, PG_U_NON_SPACING_MARK},
+	{0x01193f, 0x01193f, PG_U_OTHER_LETTER},
+	{0x011940, 0x011940, PG_U_COMBINING_SPACING_MARK},
+	{0x011941, 0x011941, PG_U_OTHER_LETTER},
+	{0x011942, 0x011942, PG_U_COMBINING_SPACING_MARK},
+	{0x011943, 0x011943, PG_U_NON_SPACING_MARK},
+	{0x011944, 0x011946, PG_U_OTHER_PUNCTUATION},
+	{0x011947, 0x01194f, PG_U_UNASSIGNED},
+	{0x011950, 0x011959, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x01195a, 0x01199f, PG_U_UNASSIGNED},
+	{0x0119a0, 0x0119a7, PG_U_OTHER_LETTER},
+	{0x0119a8, 0x0119a9, PG_U_UNASSIGNED},
+	{0x0119aa, 0x0119d0, PG_U_OTHER_LETTER},
+	{0x0119d1, 0x0119d3, PG_U_COMBINING_SPACING_MARK},
+	{0x0119d4, 0x0119d7, PG_U_NON_SPACING_MARK},
+	{0x0119d8, 0x0119d9, PG_U_UNASSIGNED},
+	{0x0119da, 0x0119db, PG_U_NON_SPACING_MARK},
+	{0x0119dc, 0x0119df, PG_U_COMBINING_SPACING_MARK},
+	{0x0119e0, 0x0119e0, PG_U_NON_SPACING_MARK},
+	{0x0119e1, 0x0119e1, PG_U_OTHER_LETTER},
+	{0x0119e2, 0x0119e2, PG_U_OTHER_PUNCTUATION},
+	{0x0119e3, 0x0119e3, PG_U_OTHER_LETTER},
+	{0x0119e4, 0x0119e4, PG_U_COMBINING_SPACING_MARK},
+	{0x0119e5, 0x0119ff, PG_U_UNASSIGNED},
+	{0x011a00, 0x011a00, PG_U_OTHER_LETTER},
+	{0x011a01, 0x011a0a, PG_U_NON_SPACING_MARK},
+	{0x011a0b, 0x011a32, PG_U_OTHER_LETTER},
+	{0x011a33, 0x011a38, PG_U_NON_SPACING_MARK},
+	{0x011a39, 0x011a39, PG_U_COMBINING_SPACING_MARK},
+	{0x011a3a, 0x011a3a, PG_U_OTHER_LETTER},
+	{0x011a3b, 0x011a3e, PG_U_NON_SPACING_MARK},
+	{0x011a3f, 0x011a46, PG_U_OTHER_PUNCTUATION},
+	{0x011a47, 0x011a47, PG_U_NON_SPACING_MARK},
+	{0x011a48, 0x011a4f, PG_U_UNASSIGNED},
+	{0x011a50, 0x011a50, PG_U_OTHER_LETTER},
+	{0x011a51, 0x011a56, PG_U_NON_SPACING_MARK},
+	{0x011a57, 0x011a58, PG_U_COMBINING_SPACING_MARK},
+	{0x011a59, 0x011a5b, PG_U_NON_SPACING_MARK},
+	{0x011a5c, 0x011a89, PG_U_OTHER_LETTER},
+	{0x011a8a, 0x011a96, PG_U_NON_SPACING_MARK},
+	{0x011a97, 0x011a97, PG_U_COMBINING_SPACING_MARK},
+	{0x011a98, 0x011a99, PG_U_NON_SPACING_MARK},
+	{0x011a9a, 0x011a9c, PG_U_OTHER_PUNCTUATION},
+	{0x011a9d, 0x011a9d, PG_U_OTHER_LETTER},
+	{0x011a9e, 0x011aa2, PG_U_OTHER_PUNCTUATION},
+	{0x011aa3, 0x011aaf, PG_U_UNASSIGNED},
+	{0x011ab0, 0x011af8, PG_U_OTHER_LETTER},
+	{0x011af9, 0x011aff, PG_U_UNASSIGNED},
+	{0x011b00, 0x011b09, PG_U_OTHER_PUNCTUATION},
+	{0x011b0a, 0x011bff, PG_U_UNASSIGNED},
+	{0x011c00, 0x011c08, PG_U_OTHER_LETTER},
+	{0x011c09, 0x011c09, PG_U_UNASSIGNED},
+	{0x011c0a, 0x011c2e, PG_U_OTHER_LETTER},
+	{0x011c2f, 0x011c2f, PG_U_COMBINING_SPACING_MARK},
+	{0x011c30, 0x011c36, PG_U_NON_SPACING_MARK},
+	{0x011c37, 0x011c37, PG_U_UNASSIGNED},
+	{0x011c38, 0x011c3d, PG_U_NON_SPACING_MARK},
+	{0x011c3e, 0x011c3e, PG_U_COMBINING_SPACING_MARK},
+	{0x011c3f, 0x011c3f, PG_U_NON_SPACING_MARK},
+	{0x011c40, 0x011c40, PG_U_OTHER_LETTER},
+	{0x011c41, 0x011c45, PG_U_OTHER_PUNCTUATION},
+	{0x011c46, 0x011c4f, PG_U_UNASSIGNED},
+	{0x011c50, 0x011c59, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x011c5a, 0x011c6c, PG_U_OTHER_NUMBER},
+	{0x011c6d, 0x011c6f, PG_U_UNASSIGNED},
+	{0x011c70, 0x011c71, PG_U_OTHER_PUNCTUATION},
+	{0x011c72, 0x011c8f, PG_U_OTHER_LETTER},
+	{0x011c90, 0x011c91, PG_U_UNASSIGNED},
+	{0x011c92, 0x011ca7, PG_U_NON_SPACING_MARK},
+	{0x011ca8, 0x011ca8, PG_U_UNASSIGNED},
+	{0x011ca9, 0x011ca9, PG_U_COMBINING_SPACING_MARK},
+	{0x011caa, 0x011cb0, PG_U_NON_SPACING_MARK},
+	{0x011cb1, 0x011cb1, PG_U_COMBINING_SPACING_MARK},
+	{0x011cb2, 0x011cb3, PG_U_NON_SPACING_MARK},
+	{0x011cb4, 0x011cb4, PG_U_COMBINING_SPACING_MARK},
+	{0x011cb5, 0x011cb6, PG_U_NON_SPACING_MARK},
+	{0x011cb7, 0x011cff, PG_U_UNASSIGNED},
+	{0x011d00, 0x011d06, PG_U_OTHER_LETTER},
+	{0x011d07, 0x011d07, PG_U_UNASSIGNED},
+	{0x011d08, 0x011d09, PG_U_OTHER_LETTER},
+	{0x011d0a, 0x011d0a, PG_U_UNASSIGNED},
+	{0x011d0b, 0x011d30, PG_U_OTHER_LETTER},
+	{0x011d31, 0x011d36, PG_U_NON_SPACING_MARK},
+	{0x011d37, 0x011d39, PG_U_UNASSIGNED},
+	{0x011d3a, 0x011d3a, PG_U_NON_SPACING_MARK},
+	{0x011d3b, 0x011d3b, PG_U_UNASSIGNED},
+	{0x011d3c, 0x011d3d, PG_U_NON_SPACING_MARK},
+	{0x011d3e, 0x011d3e, PG_U_UNASSIGNED},
+	{0x011d3f, 0x011d45, PG_U_NON_SPACING_MARK},
+	{0x011d46, 0x011d46, PG_U_OTHER_LETTER},
+	{0x011d47, 0x011d47, PG_U_NON_SPACING_MARK},
+	{0x011d48, 0x011d4f, PG_U_UNASSIGNED},
+	{0x011d50, 0x011d59, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x011d5a, 0x011d5f, PG_U_UNASSIGNED},
+	{0x011d60, 0x011d65, PG_U_OTHER_LETTER},
+	{0x011d66, 0x011d66, PG_U_UNASSIGNED},
+	{0x011d67, 0x011d68, PG_U_OTHER_LETTER},
+	{0x011d69, 0x011d69, PG_U_UNASSIGNED},
+	{0x011d6a, 0x011d89, PG_U_OTHER_LETTER},
+	{0x011d8a, 0x011d8e, PG_U_COMBINING_SPACING_MARK},
+	{0x011d8f, 0x011d8f, PG_U_UNASSIGNED},
+	{0x011d90, 0x011d91, PG_U_NON_SPACING_MARK},
+	{0x011d92, 0x011d92, PG_U_UNASSIGNED},
+	{0x011d93, 0x011d94, PG_U_COMBINING_SPACING_MARK},
+	{0x011d95, 0x011d95, PG_U_NON_SPACING_MARK},
+	{0x011d96, 0x011d96, PG_U_COMBINING_SPACING_MARK},
+	{0x011d97, 0x011d97, PG_U_NON_SPACING_MARK},
+	{0x011d98, 0x011d98, PG_U_OTHER_LETTER},
+	{0x011d99, 0x011d9f, PG_U_UNASSIGNED},
+	{0x011da0, 0x011da9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x011daa, 0x011edf, PG_U_UNASSIGNED},
+	{0x011ee0, 0x011ef2, PG_U_OTHER_LETTER},
+	{0x011ef3, 0x011ef4, PG_U_NON_SPACING_MARK},
+	{0x011ef5, 0x011ef6, PG_U_COMBINING_SPACING_MARK},
+	{0x011ef7, 0x011ef8, PG_U_OTHER_PUNCTUATION},
+	{0x011ef9, 0x011eff, PG_U_UNASSIGNED},
+	{0x011f00, 0x011f01, PG_U_NON_SPACING_MARK},
+	{0x011f02, 0x011f02, PG_U_OTHER_LETTER},
+	{0x011f03, 0x011f03, PG_U_COMBINING_SPACING_MARK},
+	{0x011f04, 0x011f10, PG_U_OTHER_LETTER},
+	{0x011f11, 0x011f11, PG_U_UNASSIGNED},
+	{0x011f12, 0x011f33, PG_U_OTHER_LETTER},
+	{0x011f34, 0x011f35, PG_U_COMBINING_SPACING_MARK},
+	{0x011f36, 0x011f3a, PG_U_NON_SPACING_MARK},
+	{0x011f3b, 0x011f3d, PG_U_UNASSIGNED},
+	{0x011f3e, 0x011f3f, PG_U_COMBINING_SPACING_MARK},
+	{0x011f40, 0x011f40, PG_U_NON_SPACING_MARK},
+	{0x011f41, 0x011f41, PG_U_COMBINING_SPACING_MARK},
+	{0x011f42, 0x011f42, PG_U_NON_SPACING_MARK},
+	{0x011f43, 0x011f4f, PG_U_OTHER_PUNCTUATION},
+	{0x011f50, 0x011f59, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x011f5a, 0x011faf, PG_U_UNASSIGNED},
+	{0x011fb0, 0x011fb0, PG_U_OTHER_LETTER},
+	{0x011fb1, 0x011fbf, PG_U_UNASSIGNED},
+	{0x011fc0, 0x011fd4, PG_U_OTHER_NUMBER},
+	{0x011fd5, 0x011fdc, PG_U_OTHER_SYMBOL},
+	{0x011fdd, 0x011fe0, PG_U_CURRENCY_SYMBOL},
+	{0x011fe1, 0x011ff1, PG_U_OTHER_SYMBOL},
+	{0x011ff2, 0x011ffe, PG_U_UNASSIGNED},
+	{0x011fff, 0x011fff, PG_U_OTHER_PUNCTUATION},
+	{0x012000, 0x012399, PG_U_OTHER_LETTER},
+	{0x01239a, 0x0123ff, PG_U_UNASSIGNED},
+	{0x012400, 0x01246e, PG_U_LETTER_NUMBER},
+	{0x01246f, 0x01246f, PG_U_UNASSIGNED},
+	{0x012470, 0x012474, PG_U_OTHER_PUNCTUATION},
+	{0x012475, 0x01247f, PG_U_UNASSIGNED},
+	{0x012480, 0x012543, PG_U_OTHER_LETTER},
+	{0x012544, 0x012f8f, PG_U_UNASSIGNED},
+	{0x012f90, 0x012ff0, PG_U_OTHER_LETTER},
+	{0x012ff1, 0x012ff2, PG_U_OTHER_PUNCTUATION},
+	{0x012ff3, 0x012fff, PG_U_UNASSIGNED},
+	{0x013000, 0x01342f, PG_U_OTHER_LETTER},
+	{0x013430, 0x01343f, PG_U_FORMAT_CHAR},
+	{0x013440, 0x013440, PG_U_NON_SPACING_MARK},
+	{0x013441, 0x013446, PG_U_OTHER_LETTER},
+	{0x013447, 0x013455, PG_U_NON_SPACING_MARK},
+	{0x013456, 0x0143ff, PG_U_UNASSIGNED},
+	{0x014400, 0x014646, PG_U_OTHER_LETTER},
+	{0x014647, 0x0167ff, PG_U_UNASSIGNED},
+	{0x016800, 0x016a38, PG_U_OTHER_LETTER},
+	{0x016a39, 0x016a3f, PG_U_UNASSIGNED},
+	{0x016a40, 0x016a5e, PG_U_OTHER_LETTER},
+	{0x016a5f, 0x016a5f, PG_U_UNASSIGNED},
+	{0x016a60, 0x016a69, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x016a6a, 0x016a6d, PG_U_UNASSIGNED},
+	{0x016a6e, 0x016a6f, PG_U_OTHER_PUNCTUATION},
+	{0x016a70, 0x016abe, PG_U_OTHER_LETTER},
+	{0x016abf, 0x016abf, PG_U_UNASSIGNED},
+	{0x016ac0, 0x016ac9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x016aca, 0x016acf, PG_U_UNASSIGNED},
+	{0x016ad0, 0x016aed, PG_U_OTHER_LETTER},
+	{0x016aee, 0x016aef, PG_U_UNASSIGNED},
+	{0x016af0, 0x016af4, PG_U_NON_SPACING_MARK},
+	{0x016af5, 0x016af5, PG_U_OTHER_PUNCTUATION},
+	{0x016af6, 0x016aff, PG_U_UNASSIGNED},
+	{0x016b00, 0x016b2f, PG_U_OTHER_LETTER},
+	{0x016b30, 0x016b36, PG_U_NON_SPACING_MARK},
+	{0x016b37, 0x016b3b, PG_U_OTHER_PUNCTUATION},
+	{0x016b3c, 0x016b3f, PG_U_OTHER_SYMBOL},
+	{0x016b40, 0x016b43, PG_U_MODIFIER_LETTER},
+	{0x016b44, 0x016b44, PG_U_OTHER_PUNCTUATION},
+	{0x016b45, 0x016b45, PG_U_OTHER_SYMBOL},
+	{0x016b46, 0x016b4f, PG_U_UNASSIGNED},
+	{0x016b50, 0x016b59, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x016b5a, 0x016b5a, PG_U_UNASSIGNED},
+	{0x016b5b, 0x016b61, PG_U_OTHER_NUMBER},
+	{0x016b62, 0x016b62, PG_U_UNASSIGNED},
+	{0x016b63, 0x016b77, PG_U_OTHER_LETTER},
+	{0x016b78, 0x016b7c, PG_U_UNASSIGNED},
+	{0x016b7d, 0x016b8f, PG_U_OTHER_LETTER},
+	{0x016b90, 0x016e3f, PG_U_UNASSIGNED},
+	{0x016e40, 0x016e5f, PG_U_UPPERCASE_LETTER},
+	{0x016e60, 0x016e7f, PG_U_LOWERCASE_LETTER},
+	{0x016e80, 0x016e96, PG_U_OTHER_NUMBER},
+	{0x016e97, 0x016e9a, PG_U_OTHER_PUNCTUATION},
+	{0x016e9b, 0x016eff, PG_U_UNASSIGNED},
+	{0x016f00, 0x016f4a, PG_U_OTHER_LETTER},
+	{0x016f4b, 0x016f4e, PG_U_UNASSIGNED},
+	{0x016f4f, 0x016f4f, PG_U_NON_SPACING_MARK},
+	{0x016f50, 0x016f50, PG_U_OTHER_LETTER},
+	{0x016f51, 0x016f87, PG_U_COMBINING_SPACING_MARK},
+	{0x016f88, 0x016f8e, PG_U_UNASSIGNED},
+	{0x016f8f, 0x016f92, PG_U_NON_SPACING_MARK},
+	{0x016f93, 0x016f9f, PG_U_MODIFIER_LETTER},
+	{0x016fa0, 0x016fdf, PG_U_UNASSIGNED},
+	{0x016fe0, 0x016fe1, PG_U_MODIFIER_LETTER},
+	{0x016fe2, 0x016fe2, PG_U_OTHER_PUNCTUATION},
+	{0x016fe3, 0x016fe3, PG_U_MODIFIER_LETTER},
+	{0x016fe4, 0x016fe4, PG_U_NON_SPACING_MARK},
+	{0x016fe5, 0x016fef, PG_U_UNASSIGNED},
+	{0x016ff0, 0x016ff1, PG_U_COMBINING_SPACING_MARK},
+	{0x016ff2, 0x016fff, PG_U_UNASSIGNED},
+	{0x017000, 0x0187f7, PG_U_OTHER_LETTER},
+	{0x0187f8, 0x0187ff, PG_U_UNASSIGNED},
+	{0x018800, 0x018cd5, PG_U_OTHER_LETTER},
+	{0x018cd6, 0x018cff, PG_U_UNASSIGNED},
+	{0x018d00, 0x018d08, PG_U_OTHER_LETTER},
+	{0x018d09, 0x01afef, PG_U_UNASSIGNED},
+	{0x01aff0, 0x01aff3, PG_U_MODIFIER_LETTER},
+	{0x01aff4, 0x01aff4, PG_U_UNASSIGNED},
+	{0x01aff5, 0x01affb, PG_U_MODIFIER_LETTER},
+	{0x01affc, 0x01affc, PG_U_UNASSIGNED},
+	{0x01affd, 0x01affe, PG_U_MODIFIER_LETTER},
+	{0x01afff, 0x01afff, PG_U_UNASSIGNED},
+	{0x01b000, 0x01b122, PG_U_OTHER_LETTER},
+	{0x01b123, 0x01b131, PG_U_UNASSIGNED},
+	{0x01b132, 0x01b132, PG_U_OTHER_LETTER},
+	{0x01b133, 0x01b14f, PG_U_UNASSIGNED},
+	{0x01b150, 0x01b152, PG_U_OTHER_LETTER},
+	{0x01b153, 0x01b154, PG_U_UNASSIGNED},
+	{0x01b155, 0x01b155, PG_U_OTHER_LETTER},
+	{0x01b156, 0x01b163, PG_U_UNASSIGNED},
+	{0x01b164, 0x01b167, PG_U_OTHER_LETTER},
+	{0x01b168, 0x01b16f, PG_U_UNASSIGNED},
+	{0x01b170, 0x01b2fb, PG_U_OTHER_LETTER},
+	{0x01b2fc, 0x01bbff, PG_U_UNASSIGNED},
+	{0x01bc00, 0x01bc6a, PG_U_OTHER_LETTER},
+	{0x01bc6b, 0x01bc6f, PG_U_UNASSIGNED},
+	{0x01bc70, 0x01bc7c, PG_U_OTHER_LETTER},
+	{0x01bc7d, 0x01bc7f, PG_U_UNASSIGNED},
+	{0x01bc80, 0x01bc88, PG_U_OTHER_LETTER},
+	{0x01bc89, 0x01bc8f, PG_U_UNASSIGNED},
+	{0x01bc90, 0x01bc99, PG_U_OTHER_LETTER},
+	{0x01bc9a, 0x01bc9b, PG_U_UNASSIGNED},
+	{0x01bc9c, 0x01bc9c, PG_U_OTHER_SYMBOL},
+	{0x01bc9d, 0x01bc9e, PG_U_NON_SPACING_MARK},
+	{0x01bc9f, 0x01bc9f, PG_U_OTHER_PUNCTUATION},
+	{0x01bca0, 0x01bca3, PG_U_FORMAT_CHAR},
+	{0x01bca4, 0x01ceff, PG_U_UNASSIGNED},
+	{0x01cf00, 0x01cf2d, PG_U_NON_SPACING_MARK},
+	{0x01cf2e, 0x01cf2f, PG_U_UNASSIGNED},
+	{0x01cf30, 0x01cf46, PG_U_NON_SPACING_MARK},
+	{0x01cf47, 0x01cf4f, PG_U_UNASSIGNED},
+	{0x01cf50, 0x01cfc3, PG_U_OTHER_SYMBOL},
+	{0x01cfc4, 0x01cfff, PG_U_UNASSIGNED},
+	{0x01d000, 0x01d0f5, PG_U_OTHER_SYMBOL},
+	{0x01d0f6, 0x01d0ff, PG_U_UNASSIGNED},
+	{0x01d100, 0x01d126, PG_U_OTHER_SYMBOL},
+	{0x01d127, 0x01d128, PG_U_UNASSIGNED},
+	{0x01d129, 0x01d164, PG_U_OTHER_SYMBOL},
+	{0x01d165, 0x01d166, PG_U_COMBINING_SPACING_MARK},
+	{0x01d167, 0x01d169, PG_U_NON_SPACING_MARK},
+	{0x01d16a, 0x01d16c, PG_U_OTHER_SYMBOL},
+	{0x01d16d, 0x01d172, PG_U_COMBINING_SPACING_MARK},
+	{0x01d173, 0x01d17a, PG_U_FORMAT_CHAR},
+	{0x01d17b, 0x01d182, PG_U_NON_SPACING_MARK},
+	{0x01d183, 0x01d184, PG_U_OTHER_SYMBOL},
+	{0x01d185, 0x01d18b, PG_U_NON_SPACING_MARK},
+	{0x01d18c, 0x01d1a9, PG_U_OTHER_SYMBOL},
+	{0x01d1aa, 0x01d1ad, PG_U_NON_SPACING_MARK},
+	{0x01d1ae, 0x01d1ea, PG_U_OTHER_SYMBOL},
+	{0x01d1eb, 0x01d1ff, PG_U_UNASSIGNED},
+	{0x01d200, 0x01d241, PG_U_OTHER_SYMBOL},
+	{0x01d242, 0x01d244, PG_U_NON_SPACING_MARK},
+	{0x01d245, 0x01d245, PG_U_OTHER_SYMBOL},
+	{0x01d246, 0x01d2bf, PG_U_UNASSIGNED},
+	{0x01d2c0, 0x01d2d3, PG_U_OTHER_NUMBER},
+	{0x01d2d4, 0x01d2df, PG_U_UNASSIGNED},
+	{0x01d2e0, 0x01d2f3, PG_U_OTHER_NUMBER},
+	{0x01d2f4, 0x01d2ff, PG_U_UNASSIGNED},
+	{0x01d300, 0x01d356, PG_U_OTHER_SYMBOL},
+	{0x01d357, 0x01d35f, PG_U_UNASSIGNED},
+	{0x01d360, 0x01d378, PG_U_OTHER_NUMBER},
+	{0x01d379, 0x01d3ff, PG_U_UNASSIGNED},
+	{0x01d400, 0x01d419, PG_U_UPPERCASE_LETTER},
+	{0x01d41a, 0x01d433, PG_U_LOWERCASE_LETTER},
+	{0x01d434, 0x01d44d, PG_U_UPPERCASE_LETTER},
+	{0x01d44e, 0x01d454, PG_U_LOWERCASE_LETTER},
+	{0x01d455, 0x01d455, PG_U_UNASSIGNED},
+	{0x01d456, 0x01d467, PG_U_LOWERCASE_LETTER},
+	{0x01d468, 0x01d481, PG_U_UPPERCASE_LETTER},
+	{0x01d482, 0x01d49b, PG_U_LOWERCASE_LETTER},
+	{0x01d49c, 0x01d49c, PG_U_UPPERCASE_LETTER},
+	{0x01d49d, 0x01d49d, PG_U_UNASSIGNED},
+	{0x01d49e, 0x01d49f, PG_U_UPPERCASE_LETTER},
+	{0x01d4a0, 0x01d4a1, PG_U_UNASSIGNED},
+	{0x01d4a2, 0x01d4a2, PG_U_UPPERCASE_LETTER},
+	{0x01d4a3, 0x01d4a4, PG_U_UNASSIGNED},
+	{0x01d4a5, 0x01d4a6, PG_U_UPPERCASE_LETTER},
+	{0x01d4a7, 0x01d4a8, PG_U_UNASSIGNED},
+	{0x01d4a9, 0x01d4ac, PG_U_UPPERCASE_LETTER},
+	{0x01d4ad, 0x01d4ad, PG_U_UNASSIGNED},
+	{0x01d4ae, 0x01d4b5, PG_U_UPPERCASE_LETTER},
+	{0x01d4b6, 0x01d4b9, PG_U_LOWERCASE_LETTER},
+	{0x01d4ba, 0x01d4ba, PG_U_UNASSIGNED},
+	{0x01d4bb, 0x01d4bb, PG_U_LOWERCASE_LETTER},
+	{0x01d4bc, 0x01d4bc, PG_U_UNASSIGNED},
+	{0x01d4bd, 0x01d4c3, PG_U_LOWERCASE_LETTER},
+	{0x01d4c4, 0x01d4c4, PG_U_UNASSIGNED},
+	{0x01d4c5, 0x01d4cf, PG_U_LOWERCASE_LETTER},
+	{0x01d4d0, 0x01d4e9, PG_U_UPPERCASE_LETTER},
+	{0x01d4ea, 0x01d503, PG_U_LOWERCASE_LETTER},
+	{0x01d504, 0x01d505, PG_U_UPPERCASE_LETTER},
+	{0x01d506, 0x01d506, PG_U_UNASSIGNED},
+	{0x01d507, 0x01d50a, PG_U_UPPERCASE_LETTER},
+	{0x01d50b, 0x01d50c, PG_U_UNASSIGNED},
+	{0x01d50d, 0x01d514, PG_U_UPPERCASE_LETTER},
+	{0x01d515, 0x01d515, PG_U_UNASSIGNED},
+	{0x01d516, 0x01d51c, PG_U_UPPERCASE_LETTER},
+	{0x01d51d, 0x01d51d, PG_U_UNASSIGNED},
+	{0x01d51e, 0x01d537, PG_U_LOWERCASE_LETTER},
+	{0x01d538, 0x01d539, PG_U_UPPERCASE_LETTER},
+	{0x01d53a, 0x01d53a, PG_U_UNASSIGNED},
+	{0x01d53b, 0x01d53e, PG_U_UPPERCASE_LETTER},
+	{0x01d53f, 0x01d53f, PG_U_UNASSIGNED},
+	{0x01d540, 0x01d544, PG_U_UPPERCASE_LETTER},
+	{0x01d545, 0x01d545, PG_U_UNASSIGNED},
+	{0x01d546, 0x01d546, PG_U_UPPERCASE_LETTER},
+	{0x01d547, 0x01d549, PG_U_UNASSIGNED},
+	{0x01d54a, 0x01d550, PG_U_UPPERCASE_LETTER},
+	{0x01d551, 0x01d551, PG_U_UNASSIGNED},
+	{0x01d552, 0x01d56b, PG_U_LOWERCASE_LETTER},
+	{0x01d56c, 0x01d585, PG_U_UPPERCASE_LETTER},
+	{0x01d586, 0x01d59f, PG_U_LOWERCASE_LETTER},
+	{0x01d5a0, 0x01d5b9, PG_U_UPPERCASE_LETTER},
+	{0x01d5ba, 0x01d5d3, PG_U_LOWERCASE_LETTER},
+	{0x01d5d4, 0x01d5ed, PG_U_UPPERCASE_LETTER},
+	{0x01d5ee, 0x01d607, PG_U_LOWERCASE_LETTER},
+	{0x01d608, 0x01d621, PG_U_UPPERCASE_LETTER},
+	{0x01d622, 0x01d63b, PG_U_LOWERCASE_LETTER},
+	{0x01d63c, 0x01d655, PG_U_UPPERCASE_LETTER},
+	{0x01d656, 0x01d66f, PG_U_LOWERCASE_LETTER},
+	{0x01d670, 0x01d689, PG_U_UPPERCASE_LETTER},
+	{0x01d68a, 0x01d6a5, PG_U_LOWERCASE_LETTER},
+	{0x01d6a6, 0x01d6a7, PG_U_UNASSIGNED},
+	{0x01d6a8, 0x01d6c0, PG_U_UPPERCASE_LETTER},
+	{0x01d6c1, 0x01d6c1, PG_U_MATH_SYMBOL},
+	{0x01d6c2, 0x01d6da, PG_U_LOWERCASE_LETTER},
+	{0x01d6db, 0x01d6db, PG_U_MATH_SYMBOL},
+	{0x01d6dc, 0x01d6e1, PG_U_LOWERCASE_LETTER},
+	{0x01d6e2, 0x01d6fa, PG_U_UPPERCASE_LETTER},
+	{0x01d6fb, 0x01d6fb, PG_U_MATH_SYMBOL},
+	{0x01d6fc, 0x01d714, PG_U_LOWERCASE_LETTER},
+	{0x01d715, 0x01d715, PG_U_MATH_SYMBOL},
+	{0x01d716, 0x01d71b, PG_U_LOWERCASE_LETTER},
+	{0x01d71c, 0x01d734, PG_U_UPPERCASE_LETTER},
+	{0x01d735, 0x01d735, PG_U_MATH_SYMBOL},
+	{0x01d736, 0x01d74e, PG_U_LOWERCASE_LETTER},
+	{0x01d74f, 0x01d74f, PG_U_MATH_SYMBOL},
+	{0x01d750, 0x01d755, PG_U_LOWERCASE_LETTER},
+	{0x01d756, 0x01d76e, PG_U_UPPERCASE_LETTER},
+	{0x01d76f, 0x01d76f, PG_U_MATH_SYMBOL},
+	{0x01d770, 0x01d788, PG_U_LOWERCASE_LETTER},
+	{0x01d789, 0x01d789, PG_U_MATH_SYMBOL},
+	{0x01d78a, 0x01d78f, PG_U_LOWERCASE_LETTER},
+	{0x01d790, 0x01d7a8, PG_U_UPPERCASE_LETTER},
+	{0x01d7a9, 0x01d7a9, PG_U_MATH_SYMBOL},
+	{0x01d7aa, 0x01d7c2, PG_U_LOWERCASE_LETTER},
+	{0x01d7c3, 0x01d7c3, PG_U_MATH_SYMBOL},
+	{0x01d7c4, 0x01d7c9, PG_U_LOWERCASE_LETTER},
+	{0x01d7ca, 0x01d7ca, PG_U_UPPERCASE_LETTER},
+	{0x01d7cb, 0x01d7cb, PG_U_LOWERCASE_LETTER},
+	{0x01d7cc, 0x01d7cd, PG_U_UNASSIGNED},
+	{0x01d7ce, 0x01d7ff, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x01d800, 0x01d9ff, PG_U_OTHER_SYMBOL},
+	{0x01da00, 0x01da36, PG_U_NON_SPACING_MARK},
+	{0x01da37, 0x01da3a, PG_U_OTHER_SYMBOL},
+	{0x01da3b, 0x01da6c, PG_U_NON_SPACING_MARK},
+	{0x01da6d, 0x01da74, PG_U_OTHER_SYMBOL},
+	{0x01da75, 0x01da75, PG_U_NON_SPACING_MARK},
+	{0x01da76, 0x01da83, PG_U_OTHER_SYMBOL},
+	{0x01da84, 0x01da84, PG_U_NON_SPACING_MARK},
+	{0x01da85, 0x01da86, PG_U_OTHER_SYMBOL},
+	{0x01da87, 0x01da8b, PG_U_OTHER_PUNCTUATION},
+	{0x01da8c, 0x01da9a, PG_U_UNASSIGNED},
+	{0x01da9b, 0x01da9f, PG_U_NON_SPACING_MARK},
+	{0x01daa0, 0x01daa0, PG_U_UNASSIGNED},
+	{0x01daa1, 0x01daaf, PG_U_NON_SPACING_MARK},
+	{0x01dab0, 0x01deff, PG_U_UNASSIGNED},
+	{0x01df00, 0x01df09, PG_U_LOWERCASE_LETTER},
+	{0x01df0a, 0x01df0a, PG_U_OTHER_LETTER},
+	{0x01df0b, 0x01df1e, PG_U_LOWERCASE_LETTER},
+	{0x01df1f, 0x01df24, PG_U_UNASSIGNED},
+	{0x01df25, 0x01df2a, PG_U_LOWERCASE_LETTER},
+	{0x01df2b, 0x01dfff, PG_U_UNASSIGNED},
+	{0x01e000, 0x01e006, PG_U_NON_SPACING_MARK},
+	{0x01e007, 0x01e007, PG_U_UNASSIGNED},
+	{0x01e008, 0x01e018, PG_U_NON_SPACING_MARK},
+	{0x01e019, 0x01e01a, PG_U_UNASSIGNED},
+	{0x01e01b, 0x01e021, PG_U_NON_SPACING_MARK},
+	{0x01e022, 0x01e022, PG_U_UNASSIGNED},
+	{0x01e023, 0x01e024, PG_U_NON_SPACING_MARK},
+	{0x01e025, 0x01e025, PG_U_UNASSIGNED},
+	{0x01e026, 0x01e02a, PG_U_NON_SPACING_MARK},
+	{0x01e02b, 0x01e02f, PG_U_UNASSIGNED},
+	{0x01e030, 0x01e06d, PG_U_MODIFIER_LETTER},
+	{0x01e06e, 0x01e08e, PG_U_UNASSIGNED},
+	{0x01e08f, 0x01e08f, PG_U_NON_SPACING_MARK},
+	{0x01e090, 0x01e0ff, PG_U_UNASSIGNED},
+	{0x01e100, 0x01e12c, PG_U_OTHER_LETTER},
+	{0x01e12d, 0x01e12f, PG_U_UNASSIGNED},
+	{0x01e130, 0x01e136, PG_U_NON_SPACING_MARK},
+	{0x01e137, 0x01e13d, PG_U_MODIFIER_LETTER},
+	{0x01e13e, 0x01e13f, PG_U_UNASSIGNED},
+	{0x01e140, 0x01e149, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x01e14a, 0x01e14d, PG_U_UNASSIGNED},
+	{0x01e14e, 0x01e14e, PG_U_OTHER_LETTER},
+	{0x01e14f, 0x01e14f, PG_U_OTHER_SYMBOL},
+	{0x01e150, 0x01e28f, PG_U_UNASSIGNED},
+	{0x01e290, 0x01e2ad, PG_U_OTHER_LETTER},
+	{0x01e2ae, 0x01e2ae, PG_U_NON_SPACING_MARK},
+	{0x01e2af, 0x01e2bf, PG_U_UNASSIGNED},
+	{0x01e2c0, 0x01e2eb, PG_U_OTHER_LETTER},
+	{0x01e2ec, 0x01e2ef, PG_U_NON_SPACING_MARK},
+	{0x01e2f0, 0x01e2f9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x01e2fa, 0x01e2fe, PG_U_UNASSIGNED},
+	{0x01e2ff, 0x01e2ff, PG_U_CURRENCY_SYMBOL},
+	{0x01e300, 0x01e4cf, PG_U_UNASSIGNED},
+	{0x01e4d0, 0x01e4ea, PG_U_OTHER_LETTER},
+	{0x01e4eb, 0x01e4eb, PG_U_MODIFIER_LETTER},
+	{0x01e4ec, 0x01e4ef, PG_U_NON_SPACING_MARK},
+	{0x01e4f0, 0x01e4f9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x01e4fa, 0x01e7df, PG_U_UNASSIGNED},
+	{0x01e7e0, 0x01e7e6, PG_U_OTHER_LETTER},
+	{0x01e7e7, 0x01e7e7, PG_U_UNASSIGNED},
+	{0x01e7e8, 0x01e7eb, PG_U_OTHER_LETTER},
+	{0x01e7ec, 0x01e7ec, PG_U_UNASSIGNED},
+	{0x01e7ed, 0x01e7ee, PG_U_OTHER_LETTER},
+	{0x01e7ef, 0x01e7ef, PG_U_UNASSIGNED},
+	{0x01e7f0, 0x01e7fe, PG_U_OTHER_LETTER},
+	{0x01e7ff, 0x01e7ff, PG_U_UNASSIGNED},
+	{0x01e800, 0x01e8c4, PG_U_OTHER_LETTER},
+	{0x01e8c5, 0x01e8c6, PG_U_UNASSIGNED},
+	{0x01e8c7, 0x01e8cf, PG_U_OTHER_NUMBER},
+	{0x01e8d0, 0x01e8d6, PG_U_NON_SPACING_MARK},
+	{0x01e8d7, 0x01e8ff, PG_U_UNASSIGNED},
+	{0x01e900, 0x01e921, PG_U_UPPERCASE_LETTER},
+	{0x01e922, 0x01e943, PG_U_LOWERCASE_LETTER},
+	{0x01e944, 0x01e94a, PG_U_NON_SPACING_MARK},
+	{0x01e94b, 0x01e94b, PG_U_MODIFIER_LETTER},
+	{0x01e94c, 0x01e94f, PG_U_UNASSIGNED},
+	{0x01e950, 0x01e959, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x01e95a, 0x01e95d, PG_U_UNASSIGNED},
+	{0x01e95e, 0x01e95f, PG_U_OTHER_PUNCTUATION},
+	{0x01e960, 0x01ec70, PG_U_UNASSIGNED},
+	{0x01ec71, 0x01ecab, PG_U_OTHER_NUMBER},
+	{0x01ecac, 0x01ecac, PG_U_OTHER_SYMBOL},
+	{0x01ecad, 0x01ecaf, PG_U_OTHER_NUMBER},
+	{0x01ecb0, 0x01ecb0, PG_U_CURRENCY_SYMBOL},
+	{0x01ecb1, 0x01ecb4, PG_U_OTHER_NUMBER},
+	{0x01ecb5, 0x01ed00, PG_U_UNASSIGNED},
+	{0x01ed01, 0x01ed2d, PG_U_OTHER_NUMBER},
+	{0x01ed2e, 0x01ed2e, PG_U_OTHER_SYMBOL},
+	{0x01ed2f, 0x01ed3d, PG_U_OTHER_NUMBER},
+	{0x01ed3e, 0x01edff, PG_U_UNASSIGNED},
+	{0x01ee00, 0x01ee03, PG_U_OTHER_LETTER},
+	{0x01ee04, 0x01ee04, PG_U_UNASSIGNED},
+	{0x01ee05, 0x01ee1f, PG_U_OTHER_LETTER},
+	{0x01ee20, 0x01ee20, PG_U_UNASSIGNED},
+	{0x01ee21, 0x01ee22, PG_U_OTHER_LETTER},
+	{0x01ee23, 0x01ee23, PG_U_UNASSIGNED},
+	{0x01ee24, 0x01ee24, PG_U_OTHER_LETTER},
+	{0x01ee25, 0x01ee26, PG_U_UNASSIGNED},
+	{0x01ee27, 0x01ee27, PG_U_OTHER_LETTER},
+	{0x01ee28, 0x01ee28, PG_U_UNASSIGNED},
+	{0x01ee29, 0x01ee32, PG_U_OTHER_LETTER},
+	{0x01ee33, 0x01ee33, PG_U_UNASSIGNED},
+	{0x01ee34, 0x01ee37, PG_U_OTHER_LETTER},
+	{0x01ee38, 0x01ee38, PG_U_UNASSIGNED},
+	{0x01ee39, 0x01ee39, PG_U_OTHER_LETTER},
+	{0x01ee3a, 0x01ee3a, PG_U_UNASSIGNED},
+	{0x01ee3b, 0x01ee3b, PG_U_OTHER_LETTER},
+	{0x01ee3c, 0x01ee41, PG_U_UNASSIGNED},
+	{0x01ee42, 0x01ee42, PG_U_OTHER_LETTER},
+	{0x01ee43, 0x01ee46, PG_U_UNASSIGNED},
+	{0x01ee47, 0x01ee47, PG_U_OTHER_LETTER},
+	{0x01ee48, 0x01ee48, PG_U_UNASSIGNED},
+	{0x01ee49, 0x01ee49, PG_U_OTHER_LETTER},
+	{0x01ee4a, 0x01ee4a, PG_U_UNASSIGNED},
+	{0x01ee4b, 0x01ee4b, PG_U_OTHER_LETTER},
+	{0x01ee4c, 0x01ee4c, PG_U_UNASSIGNED},
+	{0x01ee4d, 0x01ee4f, PG_U_OTHER_LETTER},
+	{0x01ee50, 0x01ee50, PG_U_UNASSIGNED},
+	{0x01ee51, 0x01ee52, PG_U_OTHER_LETTER},
+	{0x01ee53, 0x01ee53, PG_U_UNASSIGNED},
+	{0x01ee54, 0x01ee54, PG_U_OTHER_LETTER},
+	{0x01ee55, 0x01ee56, PG_U_UNASSIGNED},
+	{0x01ee57, 0x01ee57, PG_U_OTHER_LETTER},
+	{0x01ee58, 0x01ee58, PG_U_UNASSIGNED},
+	{0x01ee59, 0x01ee59, PG_U_OTHER_LETTER},
+	{0x01ee5a, 0x01ee5a, PG_U_UNASSIGNED},
+	{0x01ee5b, 0x01ee5b, PG_U_OTHER_LETTER},
+	{0x01ee5c, 0x01ee5c, PG_U_UNASSIGNED},
+	{0x01ee5d, 0x01ee5d, PG_U_OTHER_LETTER},
+	{0x01ee5e, 0x01ee5e, PG_U_UNASSIGNED},
+	{0x01ee5f, 0x01ee5f, PG_U_OTHER_LETTER},
+	{0x01ee60, 0x01ee60, PG_U_UNASSIGNED},
+	{0x01ee61, 0x01ee62, PG_U_OTHER_LETTER},
+	{0x01ee63, 0x01ee63, PG_U_UNASSIGNED},
+	{0x01ee64, 0x01ee64, PG_U_OTHER_LETTER},
+	{0x01ee65, 0x01ee66, PG_U_UNASSIGNED},
+	{0x01ee67, 0x01ee6a, PG_U_OTHER_LETTER},
+	{0x01ee6b, 0x01ee6b, PG_U_UNASSIGNED},
+	{0x01ee6c, 0x01ee72, PG_U_OTHER_LETTER},
+	{0x01ee73, 0x01ee73, PG_U_UNASSIGNED},
+	{0x01ee74, 0x01ee77, PG_U_OTHER_LETTER},
+	{0x01ee78, 0x01ee78, PG_U_UNASSIGNED},
+	{0x01ee79, 0x01ee7c, PG_U_OTHER_LETTER},
+	{0x01ee7d, 0x01ee7d, PG_U_UNASSIGNED},
+	{0x01ee7e, 0x01ee7e, PG_U_OTHER_LETTER},
+	{0x01ee7f, 0x01ee7f, PG_U_UNASSIGNED},
+	{0x01ee80, 0x01ee89, PG_U_OTHER_LETTER},
+	{0x01ee8a, 0x01ee8a, PG_U_UNASSIGNED},
+	{0x01ee8b, 0x01ee9b, PG_U_OTHER_LETTER},
+	{0x01ee9c, 0x01eea0, PG_U_UNASSIGNED},
+	{0x01eea1, 0x01eea3, PG_U_OTHER_LETTER},
+	{0x01eea4, 0x01eea4, PG_U_UNASSIGNED},
+	{0x01eea5, 0x01eea9, PG_U_OTHER_LETTER},
+	{0x01eeaa, 0x01eeaa, PG_U_UNASSIGNED},
+	{0x01eeab, 0x01eebb, PG_U_OTHER_LETTER},
+	{0x01eebc, 0x01eeef, PG_U_UNASSIGNED},
+	{0x01eef0, 0x01eef1, PG_U_MATH_SYMBOL},
+	{0x01eef2, 0x01efff, PG_U_UNASSIGNED},
+	{0x01f000, 0x01f02b, PG_U_OTHER_SYMBOL},
+	{0x01f02c, 0x01f02f, PG_U_UNASSIGNED},
+	{0x01f030, 0x01f093, PG_U_OTHER_SYMBOL},
+	{0x01f094, 0x01f09f, PG_U_UNASSIGNED},
+	{0x01f0a0, 0x01f0ae, PG_U_OTHER_SYMBOL},
+	{0x01f0af, 0x01f0b0, PG_U_UNASSIGNED},
+	{0x01f0b1, 0x01f0bf, PG_U_OTHER_SYMBOL},
+	{0x01f0c0, 0x01f0c0, PG_U_UNASSIGNED},
+	{0x01f0c1, 0x01f0cf, PG_U_OTHER_SYMBOL},
+	{0x01f0d0, 0x01f0d0, PG_U_UNASSIGNED},
+	{0x01f0d1, 0x01f0f5, PG_U_OTHER_SYMBOL},
+	{0x01f0f6, 0x01f0ff, PG_U_UNASSIGNED},
+	{0x01f100, 0x01f10c, PG_U_OTHER_NUMBER},
+	{0x01f10d, 0x01f1ad, PG_U_OTHER_SYMBOL},
+	{0x01f1ae, 0x01f1e5, PG_U_UNASSIGNED},
+	{0x01f1e6, 0x01f202, PG_U_OTHER_SYMBOL},
+	{0x01f203, 0x01f20f, PG_U_UNASSIGNED},
+	{0x01f210, 0x01f23b, PG_U_OTHER_SYMBOL},
+	{0x01f23c, 0x01f23f, PG_U_UNASSIGNED},
+	{0x01f240, 0x01f248, PG_U_OTHER_SYMBOL},
+	{0x01f249, 0x01f24f, PG_U_UNASSIGNED},
+	{0x01f250, 0x01f251, PG_U_OTHER_SYMBOL},
+	{0x01f252, 0x01f25f, PG_U_UNASSIGNED},
+	{0x01f260, 0x01f265, PG_U_OTHER_SYMBOL},
+	{0x01f266, 0x01f2ff, PG_U_UNASSIGNED},
+	{0x01f300, 0x01f3fa, PG_U_OTHER_SYMBOL},
+	{0x01f3fb, 0x01f3ff, PG_U_MODIFIER_SYMBOL},
+	{0x01f400, 0x01f6d7, PG_U_OTHER_SYMBOL},
+	{0x01f6d8, 0x01f6db, PG_U_UNASSIGNED},
+	{0x01f6dc, 0x01f6ec, PG_U_OTHER_SYMBOL},
+	{0x01f6ed, 0x01f6ef, PG_U_UNASSIGNED},
+	{0x01f6f0, 0x01f6fc, PG_U_OTHER_SYMBOL},
+	{0x01f6fd, 0x01f6ff, PG_U_UNASSIGNED},
+	{0x01f700, 0x01f776, PG_U_OTHER_SYMBOL},
+	{0x01f777, 0x01f77a, PG_U_UNASSIGNED},
+	{0x01f77b, 0x01f7d9, PG_U_OTHER_SYMBOL},
+	{0x01f7da, 0x01f7df, PG_U_UNASSIGNED},
+	{0x01f7e0, 0x01f7eb, PG_U_OTHER_SYMBOL},
+	{0x01f7ec, 0x01f7ef, PG_U_UNASSIGNED},
+	{0x01f7f0, 0x01f7f0, PG_U_OTHER_SYMBOL},
+	{0x01f7f1, 0x01f7ff, PG_U_UNASSIGNED},
+	{0x01f800, 0x01f80b, PG_U_OTHER_SYMBOL},
+	{0x01f80c, 0x01f80f, PG_U_UNASSIGNED},
+	{0x01f810, 0x01f847, PG_U_OTHER_SYMBOL},
+	{0x01f848, 0x01f84f, PG_U_UNASSIGNED},
+	{0x01f850, 0x01f859, PG_U_OTHER_SYMBOL},
+	{0x01f85a, 0x01f85f, PG_U_UNASSIGNED},
+	{0x01f860, 0x01f887, PG_U_OTHER_SYMBOL},
+	{0x01f888, 0x01f88f, PG_U_UNASSIGNED},
+	{0x01f890, 0x01f8ad, PG_U_OTHER_SYMBOL},
+	{0x01f8ae, 0x01f8af, PG_U_UNASSIGNED},
+	{0x01f8b0, 0x01f8b1, PG_U_OTHER_SYMBOL},
+	{0x01f8b2, 0x01f8ff, PG_U_UNASSIGNED},
+	{0x01f900, 0x01fa53, PG_U_OTHER_SYMBOL},
+	{0x01fa54, 0x01fa5f, PG_U_UNASSIGNED},
+	{0x01fa60, 0x01fa6d, PG_U_OTHER_SYMBOL},
+	{0x01fa6e, 0x01fa6f, PG_U_UNASSIGNED},
+	{0x01fa70, 0x01fa7c, PG_U_OTHER_SYMBOL},
+	{0x01fa7d, 0x01fa7f, PG_U_UNASSIGNED},
+	{0x01fa80, 0x01fa88, PG_U_OTHER_SYMBOL},
+	{0x01fa89, 0x01fa8f, PG_U_UNASSIGNED},
+	{0x01fa90, 0x01fabd, PG_U_OTHER_SYMBOL},
+	{0x01fabe, 0x01fabe, PG_U_UNASSIGNED},
+	{0x01fabf, 0x01fac5, PG_U_OTHER_SYMBOL},
+	{0x01fac6, 0x01facd, PG_U_UNASSIGNED},
+	{0x01face, 0x01fadb, PG_U_OTHER_SYMBOL},
+	{0x01fadc, 0x01fadf, PG_U_UNASSIGNED},
+	{0x01fae0, 0x01fae8, PG_U_OTHER_SYMBOL},
+	{0x01fae9, 0x01faef, PG_U_UNASSIGNED},
+	{0x01faf0, 0x01faf8, PG_U_OTHER_SYMBOL},
+	{0x01faf9, 0x01faff, PG_U_UNASSIGNED},
+	{0x01fb00, 0x01fb92, PG_U_OTHER_SYMBOL},
+	{0x01fb93, 0x01fb93, PG_U_UNASSIGNED},
+	{0x01fb94, 0x01fbca, PG_U_OTHER_SYMBOL},
+	{0x01fbcb, 0x01fbef, PG_U_UNASSIGNED},
+	{0x01fbf0, 0x01fbf9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x01fbfa, 0x01ffff, PG_U_UNASSIGNED},
+	{0x020000, 0x02a6df, PG_U_OTHER_LETTER},
+	{0x02a6e0, 0x02a6ff, PG_U_UNASSIGNED},
+	{0x02a700, 0x02b739, PG_U_OTHER_LETTER},
+	{0x02b73a, 0x02b73f, PG_U_UNASSIGNED},
+	{0x02b740, 0x02b81d, PG_U_OTHER_LETTER},
+	{0x02b81e, 0x02b81f, PG_U_UNASSIGNED},
+	{0x02b820, 0x02cea1, PG_U_OTHER_LETTER},
+	{0x02cea2, 0x02ceaf, PG_U_UNASSIGNED},
+	{0x02ceb0, 0x02ebe0, PG_U_OTHER_LETTER},
+	{0x02ebe1, 0x02ebef, PG_U_UNASSIGNED},
+	{0x02ebf0, 0x02ee5d, PG_U_OTHER_LETTER},
+	{0x02ee5e, 0x02f7ff, PG_U_UNASSIGNED},
+	{0x02f800, 0x02fa1d, PG_U_OTHER_LETTER},
+	{0x02fa1e, 0x02ffff, PG_U_UNASSIGNED},
+	{0x030000, 0x03134a, PG_U_OTHER_LETTER},
+	{0x03134b, 0x03134f, PG_U_UNASSIGNED},
+	{0x031350, 0x0323af, PG_U_OTHER_LETTER},
+	{0x0323b0, 0x0e0000, PG_U_UNASSIGNED},
+	{0x0e0001, 0x0e0001, PG_U_FORMAT_CHAR},
+	{0x0e0002, 0x0e001f, PG_U_UNASSIGNED},
+	{0x0e0020, 0x0e007f, PG_U_FORMAT_CHAR},
+	{0x0e0080, 0x0e00ff, PG_U_UNASSIGNED},
+	{0x0e0100, 0x0e01ef, PG_U_NON_SPACING_MARK},
+	{0x0e01f0, 0x0effff, PG_U_UNASSIGNED},
+	{0x0f0000, 0x0ffffd, PG_U_PRIVATE_USE_CHAR},
+	{0x0ffffe, 0x0fffff, PG_U_UNASSIGNED},
+	{0x100000, 0x10fffd, PG_U_PRIVATE_USE_CHAR},
+	{0x10fffe, 0x10ffff, PG_U_UNASSIGNED}
+};
+
diff --git a/src/include/common/unicode_version.h b/src/include/common/unicode_version.h
new file mode 100644
index 0000000000..8e3d484ae6
--- /dev/null
+++ b/src/include/common/unicode_version.h
@@ -0,0 +1,16 @@
+/*-------------------------------------------------------------------------
+ *
+ * unicode_version.h
+ *	  Unicode version used by Postgres.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/common/unicode_version.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#define PG_UNICODE_VERSION		"15.1"
+
+
diff --git a/src/test/icu/t/010_database.pl b/src/test/icu/t/010_database.pl
index 0e9446cebe..67fc3bbf19 100644
--- a/src/test/icu/t/010_database.pl
+++ b/src/test/icu/t/010_database.pl
@@ -27,6 +27,10 @@ CREATE TABLE icu (def text, en text COLLATE "en-x-icu", upfirst text COLLATE upp
 INSERT INTO icu VALUES ('a', 'a', 'a'), ('b', 'b', 'b'), ('A', 'A', 'A'), ('B', 'B', 'B');
 });
 
+is( $node1->safe_psql('dbicu', q{SELECT icu_unicode_version() IS NOT NULL}),
+	qq(t),
+	'ICU unicode version defined');
+
 is( $node1->safe_psql('dbicu', q{SELECT def FROM icu ORDER BY def}),
 	qq(A
 a
diff --git a/src/test/regress/expected/unicode.out b/src/test/regress/expected/unicode.out
index f2713a2326..a846acb8c4 100644
--- a/src/test/regress/expected/unicode.out
+++ b/src/test/regress/expected/unicode.out
@@ -8,6 +8,24 @@ SELECT U&'\0061\0308bc' <> U&'\00E4bc' COLLATE "C" AS sanity_check;
  t
 (1 row)
 
+SELECT unicode_version() IS NOT NULL;
+ ?column? 
+----------
+ t
+(1 row)
+
+SELECT unicode_is_valid(U&'abc');
+ unicode_is_valid 
+------------------
+ t
+(1 row)
+
+SELECT unicode_is_valid(U&'abc\+10FFFF');
+ unicode_is_valid 
+------------------
+ f
+(1 row)
+
 SELECT normalize('');
  normalize 
 -----------
diff --git a/src/test/regress/sql/unicode.sql b/src/test/regress/sql/unicode.sql
index 63cd523f85..9f775266e9 100644
--- a/src/test/regress/sql/unicode.sql
+++ b/src/test/regress/sql/unicode.sql
@@ -5,6 +5,10 @@ SELECT getdatabaseencoding() <> 'UTF8' AS skip_test \gset
 
 SELECT U&'\0061\0308bc' <> U&'\00E4bc' COLLATE "C" AS sanity_check;
 
+SELECT unicode_version() IS NOT NULL;
+SELECT unicode_is_valid(U&'abc');
+SELECT unicode_is_valid(U&'abc\+10FFFF');
+
 SELECT normalize('');
 SELECT normalize(U&'\0061\0308\24D1c') = U&'\00E4\24D1c' COLLATE "C" AS test_default;
 SELECT normalize(U&'\0061\0308\24D1c', NFC) = U&'\00E4\24D1c' COLLATE "C" AS test_nfc;
-- 
2.34.1

#51Peter Eisentraut
peter@eisentraut.org
In reply to: Robert Haas (#49)
Re: Pre-proposal: unicode normalized text

On 10.10.23 16:02, Robert Haas wrote:

On Tue, Oct 10, 2023 at 2:44 AM Peter Eisentraut <peter@eisentraut.org> wrote:

Can you restate what this is supposed to be for? This thread appears to
have morphed from "let's normalize everything" to "let's check for
unassigned code points", but I'm not sure what we are aiming for now.

Jeff can say what he wants it for, but one obvious application would
be to have the ability to add a CHECK constraint that forbids
inserting unassigned code points into your database, which would be
useful if you're worried about forward-compatibility with collation
definitions that might be extended to cover those code points in the
future.

I don't see how this would really work in practice. Whether your data
has unassigned code points or not, when the collations are updated to
the next Unicode version, the collations will have a new version number,
and so you need to run the refresh procedure in any case.

#52Peter Eisentraut
peter@eisentraut.org
In reply to: Jeff Davis (#50)
Re: Pre-proposal: unicode normalized text

On 11.10.23 03:08, Jeff Davis wrote:

* unicode_is_valid(text): returns true if all codepoints are
assigned, false otherwise

We need to be careful about precise terminology. "Valid" has a defined
meaning for Unicode. A byte sequence can be valid or not as UTF-8. But
a string containing unassigned code points is not not-"valid" as Unicode.

* unicode_version(): version of unicode Postgres is built with
* icu_unicode_version(): version of Unicode ICU is built with

This seems easy enough, but it's not clear what users would actually do
with that.
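
The distinction drawn above, between a byte sequence being valid UTF-8 and a code point being assigned, can be illustrated with a minimal sketch. The helper names (sketch_utf8_decode, sketch_utf8_wellformed) are hypothetical and are not part of the patch or of PostgreSQL. The sketch only checks well-formedness. U+10FFFF, encoded as F4 8F BF BF, passes that check even though the patch's lookup table classifies 0x10fffe..0x10ffff as PG_U_UNASSIGNED.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decode one UTF-8 sequence starting at s (at most len bytes available).
 * Returns the number of bytes consumed, or 0 if the sequence is not
 * well-formed (truncated, overlong, a surrogate, or above U+10FFFF).
 * Hypothetical helper for illustration only.
 */
static size_t
sketch_utf8_decode(const unsigned char *s, size_t len, uint32_t *cp)
{
	size_t		n;
	uint32_t	c;
	uint32_t	min;

	if (len == 0)
		return 0;

	if (s[0] < 0x80)
	{
		*cp = s[0];
		return 1;
	}
	else if ((s[0] & 0xE0) == 0xC0)
	{
		n = 2;
		c = s[0] & 0x1F;
		min = 0x80;
	}
	else if ((s[0] & 0xF0) == 0xE0)
	{
		n = 3;
		c = s[0] & 0x0F;
		min = 0x800;
	}
	else if ((s[0] & 0xF8) == 0xF0)
	{
		n = 4;
		c = s[0] & 0x07;
		min = 0x10000;
	}
	else
		return 0;				/* stray continuation or invalid lead byte */

	if (len < n)
		return 0;				/* truncated sequence */

	for (size_t i = 1; i < n; i++)
	{
		if ((s[i] & 0xC0) != 0x80)
			return 0;			/* not a continuation byte */
		c = (c << 6) | (s[i] & 0x3F);
	}

	/* reject overlong forms, surrogates, and values past U+10FFFF */
	if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
		return 0;

	*cp = c;
	return n;
}

/* True if the whole buffer is a sequence of well-formed UTF-8 code points. */
static bool
sketch_utf8_wellformed(const unsigned char *s, size_t len)
{
	while (len > 0)
	{
		uint32_t	cp;
		size_t		n = sketch_utf8_decode(s, len, &cp);

		if (n == 0)
			return false;
		s += n;
		len -= n;
	}
	return true;
}
```

Whether a decoded code point is assigned is a separate question that depends on the Unicode version of the lookup table, which is why the two checks need different names.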

#53Jeff Davis
pgsql@j-davis.com
In reply to: Peter Eisentraut (#52)
Re: Pre-proposal: unicode normalized text

On Wed, 2023-10-11 at 08:56 +0200, Peter Eisentraut wrote:

On 11.10.23 03:08, Jeff Davis wrote:

   * unicode_is_valid(text): returns true if all codepoints are
assigned, false otherwise

We need to be careful about precise terminology.  "Valid" has a defined
meaning for Unicode.  A byte sequence can be valid or not as UTF-8.  But
a string containing unassigned code points is not not-"valid" as Unicode.

Agreed. Perhaps "unicode_assigned()" is better?

   * unicode_version(): version of unicode Postgres is built with
   * icu_unicode_version(): version of Unicode ICU is built with

This seems easy enough, but it's not clear what users would actually do
with that.

Just there to make it visible. If it affects the semantics (which it
does currently for normalization) it seems wise to have some way to
access the version.

Regards,
Jeff Davis

#54Jeff Davis
pgsql@j-davis.com
In reply to: Peter Eisentraut (#51)
Re: Pre-proposal: unicode normalized text

On Wed, 2023-10-11 at 08:51 +0200, Peter Eisentraut wrote:

I don't see how this would really work in practice.  Whether your data
has unassigned code points or not, when the collations are updated to
the next Unicode version, the collations will have a new version number,
and so you need to run the refresh procedure in any case.

Even with a version number, we don't provide a great refresh procedure
or document how it should be done. In practice, avoiding unassigned
code points might mitigate some kinds of problems, especially for glibc
which has a very coarse version number.

In any case, a CHECK constraint to avoid unassigned code points has
utility to be forward-compatible with normalization, and also might
just be a good sanity check.
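
The kind of check such a constraint would perform can be sketched as a binary search over a sorted table of code point ranges. This is only an illustration under stated assumptions: the type and function names (sketch_range, sketch_lookup, sketch_all_assigned) are hypothetical, the enum is a stand-in for the patch's category enum, and the table is an eight-row excerpt copied from the generated table in the patch. Code points outside the excerpt are reported as unassigned purely because the excerpt is incomplete.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the patch's category enum; only the values used below. */
typedef enum
{
	SKETCH_UNASSIGNED,
	SKETCH_OTHER_LETTER,
	SKETCH_OTHER_SYMBOL,
	SKETCH_DECIMAL_DIGIT_NUMBER
} sketch_category;

typedef struct
{
	uint32_t	first;			/* first code point in the range */
	uint32_t	last;			/* last code point in the range (inclusive) */
	sketch_category category;
} sketch_range;

/* Excerpt of the generated table: sorted, contiguous, non-overlapping. */
static const sketch_range sketch_table[] = {
	{0x01fb00, 0x01fb92, SKETCH_OTHER_SYMBOL},
	{0x01fb93, 0x01fb93, SKETCH_UNASSIGNED},
	{0x01fb94, 0x01fbca, SKETCH_OTHER_SYMBOL},
	{0x01fbcb, 0x01fbef, SKETCH_UNASSIGNED},
	{0x01fbf0, 0x01fbf9, SKETCH_DECIMAL_DIGIT_NUMBER},
	{0x01fbfa, 0x01ffff, SKETCH_UNASSIGNED},
	{0x020000, 0x02a6df, SKETCH_OTHER_LETTER},
	{0x02a6e0, 0x02a6ff, SKETCH_UNASSIGNED}
};

#define SKETCH_TABLE_LEN (sizeof(sketch_table) / sizeof(sketch_table[0]))

/* Binary search for the range containing cp. */
static sketch_category
sketch_lookup(uint32_t cp)
{
	size_t		lo = 0;
	size_t		hi = SKETCH_TABLE_LEN;

	while (lo < hi)
	{
		size_t		mid = lo + (hi - lo) / 2;

		if (cp > sketch_table[mid].last)
			lo = mid + 1;
		else if (cp < sketch_table[mid].first)
			hi = mid;
		else
			return sketch_table[mid].category;
	}

	/* Outside the excerpt: treated as unassigned in this sketch only. */
	return SKETCH_UNASSIGNED;
}

/* True if every code point in the array is assigned per the excerpt. */
static bool
sketch_all_assigned(const uint32_t *cps, size_t n)
{
	for (size_t i = 0; i < n; i++)
	{
		if (sketch_lookup(cps[i]) == SKETCH_UNASSIGNED)
			return false;
	}
	return true;
}
```

Because the ranges are stored as sorted intervals, the lookup cost grows with the logarithm of the number of ranges rather than with the number of code points.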

Regards,
Jeff Davis

#55Jeff Davis
pgsql@j-davis.com
In reply to: Peter Eisentraut (#52)
1 attachment(s)
Re: Pre-proposal: unicode normalized text

On Wed, 2023-10-11 at 08:56 +0200, Peter Eisentraut wrote:

We need to be careful about precise terminology.  "Valid" has a defined
meaning for Unicode.  A byte sequence can be valid or not as UTF-8.  But
a string containing unassigned code points is not not-"valid" as Unicode.

New patch attached, function name is "unicode_assigned".

I believe the patch has utility as-is, but I've been brainstorming a
few more ideas that could build on it:

* Add a per-database option to enforce only storing assigned unicode
code points.

* (More radical) Add a per-database option to normalize all text in
NFC.

* Do character classification in Unicode rather than relying on
glibc/ICU. This would affect regex character classes, etc., but not
affect upper/lower/initcap nor collation. I did some experiments and
the General Category doesn't change a lot: a total of 197 characters
changed their General Category since Unicode 6.0.0, and only 5 since
ICU 11.0.0. I'm not quite sure how to expose this, but it seems like a
nicer way to handle it than tying it into the collation provider.
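
As a rough illustration of the last idea, character classes could be derived from the General Category alone. The sketch below is an assumption, not something the patch does: the enum is a local stand-in that lists only categories appearing in the generated table, the predicate names are hypothetical, and treating "alpha" as "any letter category" is a simplification of how regex classes are usually defined for Unicode.

```c
#include <stdbool.h>

/* Local stand-in listing only categories seen in the generated table. */
typedef enum
{
	SK_UNASSIGNED,
	SK_UPPERCASE_LETTER,
	SK_LOWERCASE_LETTER,
	SK_MODIFIER_LETTER,
	SK_OTHER_LETTER,
	SK_NON_SPACING_MARK,
	SK_COMBINING_SPACING_MARK,
	SK_DECIMAL_DIGIT_NUMBER,
	SK_LETTER_NUMBER,
	SK_OTHER_NUMBER,
	SK_OTHER_PUNCTUATION,
	SK_MATH_SYMBOL,
	SK_CURRENCY_SYMBOL,
	SK_MODIFIER_SYMBOL,
	SK_OTHER_SYMBOL,
	SK_FORMAT_CHAR,
	SK_PRIVATE_USE_CHAR
} sk_category;

/* Simplified "alpha": any of the letter categories listed above. */
static bool
sk_is_alpha(sk_category cat)
{
	switch (cat)
	{
		case SK_UPPERCASE_LETTER:
		case SK_LOWERCASE_LETTER:
		case SK_MODIFIER_LETTER:
		case SK_OTHER_LETTER:
			return true;
		default:
			return false;
	}
}

/* "digit": decimal digits only, not letter numbers or other numbers. */
static bool
sk_is_digit(sk_category cat)
{
	return cat == SK_DECIMAL_DIGIT_NUMBER;
}

/* Simplified "alnum": alpha or digit as defined above. */
static bool
sk_is_alnum(sk_category cat)
{
	return sk_is_alpha(cat) || sk_is_digit(cat);
}
```

Since the predicates depend only on the category, their results change between Unicode versions only for characters whose General Category changes or that are newly assigned.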

Regards,
Jeff Davis

Attachments:

v3-0001-Additional-unicode-primitive-functions.patch (text/x-patch; charset=UTF-8)
From 6dc71653c461cb54d85ddc516529189d9b87a0dd Mon Sep 17 00:00:00 2001
From: Jeff Davis <jeff@j-davis.com>
Date: Thu, 5 Oct 2023 17:01:03 -0700
Subject: [PATCH v3] Additional unicode primitive functions.

Introduce unicode_version(), icu_unicode_version(), and
unicode_assigned().

The latter requires introducing a new lookup table, which is generated
along with the other lookup tables.

Discussion: https://postgr.es/m/CA+TgmoYzYR-yhU6k1XFCADeyj=Oyz2PkVsa3iKv+keM8wp-F_A@mail.gmail.com
---
 doc/src/sgml/func.sgml                        |  113 +-
 src/backend/utils/adt/varlena.c               |   61 +
 src/common/Makefile                           |    1 +
 src/common/meson.build                        |    1 +
 src/common/unicode/Makefile                   |   19 +-
 src/common/unicode/category_test.c            |  103 +
 .../generate-unicode_category_table.pl        |  202 +
 .../unicode/generate-unicode_version.pl       |   48 +
 src/common/unicode/meson.build                |   40 +
 src/common/unicode/norm_test.c                |    2 +-
 src/common/unicode_category.c                 |  197 +
 src/include/catalog/pg_proc.dat               |   12 +
 src/include/common/unicode_category.h         |   57 +
 src/include/common/unicode_category_table.h   | 4039 +++++++++++++++++
 src/include/common/unicode_version.h          |   16 +
 src/test/icu/t/010_database.pl                |    4 +
 src/test/regress/expected/unicode.out         |   18 +
 src/test/regress/sql/unicode.sql              |    4 +
 18 files changed, 4915 insertions(+), 22 deletions(-)
 create mode 100644 src/common/unicode/category_test.c
 create mode 100644 src/common/unicode/generate-unicode_category_table.pl
 create mode 100644 src/common/unicode/generate-unicode_version.pl
 create mode 100644 src/common/unicode_category.c
 create mode 100644 src/include/common/unicode_category.h
 create mode 100644 src/include/common/unicode_category_table.h
 create mode 100644 src/include/common/unicode_version.h

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index affd1254bb..c7fe9cc59e 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -2859,6 +2859,26 @@ repeat('Pg', 4) <returnvalue>PgPgPgPg</returnvalue>
        </para></entry>
       </row>
 
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>unicode_assigned</primary>
+        </indexterm>
+        <function>unicode_assigned</function> ( <type>text</type> )
+        <returnvalue>boolean</returnvalue>
+       </para>
+       <para>
+        Returns <literal>true</literal> if all characters in the string are
+        assigned Unicode codepoints; <literal>false</literal> otherwise. This
+        function can only be used when the server encoding is
+        <literal>UTF8</literal>.
+       </para>
+       <para>
+        <literal>unicode_assigned('abc')</literal>
+        <returnvalue>t</returnvalue>
+       </para></entry>
+      </row>
+
       <row>
        <entry role="func_table_entry"><para role="func_signature">
         <indexterm>
@@ -23428,25 +23448,6 @@ SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS t(ls,n);
         This is equivalent to <function>current_user</function>.
        </para></entry>
       </row>
-
-      <row>
-       <entry role="func_table_entry"><para role="func_signature">
-        <indexterm>
-         <primary>version</primary>
-        </indexterm>
-        <function>version</function> ()
-        <returnvalue>text</returnvalue>
-       </para>
-       <para>
-        Returns a string describing the <productname>PostgreSQL</productname>
-        server's version.  You can also get this information from
-        <xref linkend="guc-server-version"/>, or for a machine-readable
-        version use <xref linkend="guc-server-version-num"/>.  Software
-        developers should use <varname>server_version_num</varname> (available
-        since 8.2) or <xref linkend="libpq-PQserverVersion"/> instead of
-        parsing the text version.
-       </para></entry>
-      </row>
      </tbody>
     </tgroup>
    </table>
@@ -26333,6 +26334,80 @@ SELECT collation for ('foo' COLLATE "de_DE");
 
   </sect2>
 
+  <sect2 id="functions-info-version">
+   <title>Version Information Functions</title>
+
+   <para>
+    The functions shown in <xref linkend="functions-version"/>
+    print version information.
+   </para>
+
+   <table id="functions-version">
+    <title>Version Information Functions</title>
+    <tgroup cols="1">
+     <thead>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        Function
+       </para>
+       <para>
+        Description
+       </para></entry>
+      </row>
+     </thead>
+
+     <tbody>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>version</primary>
+        </indexterm>
+        <function>version</function> ()
+        <returnvalue>text</returnvalue>
+       </para>
+       <para>
+        Returns a string describing the <productname>PostgreSQL</productname>
+        server's version.  You can also get this information from
+        <xref linkend="guc-server-version"/>, or for a machine-readable
+        version use <xref linkend="guc-server-version-num"/>.  Software
+        developers should use <varname>server_version_num</varname> (available
+        since 8.2) or <xref linkend="libpq-PQserverVersion"/> instead of
+        parsing the text version.
+       </para></entry>
+      </row>
+
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>unicode_version</primary>
+        </indexterm>
+        <function>unicode_version</function> ()
+        <returnvalue>text</returnvalue>
+       </para>
+       <para>
+        Returns a string representing the version of Unicode used by
+        <productname>PostgreSQL</productname>.
+       </para></entry>
+      </row>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>icu_unicode_version</primary>
+        </indexterm>
+        <function>icu_unicode_version</function> ()
+        <returnvalue>text</returnvalue>
+       </para>
+       <para>
+        Returns a string representing the version of Unicode used by ICU, if
+        the server was built with ICU support; otherwise returns
+        <literal>NULL</literal>.</para></entry>
+      </row>
+     </tbody>
+    </tgroup>
+   </table>
+
+  </sect2>
+
   </sect1>
 
   <sect1 id="functions-admin">
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index 72e1e24fe0..b69ff26d0f 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -23,7 +23,9 @@
 #include "catalog/pg_type.h"
 #include "common/hashfn.h"
 #include "common/int.h"
+#include "common/unicode_category.h"
 #include "common/unicode_norm.h"
+#include "common/unicode_version.h"
 #include "funcapi.h"
 #include "lib/hyperloglog.h"
 #include "libpq/pqformat.h"
@@ -6239,6 +6241,65 @@ unicode_norm_form_from_string(const char *formstr)
 	return form;
 }
 
+/*
+ * Returns version of Unicode used by Postgres in "major.minor" format. The
+ * third component ("update version") never involves additions to the
+ * character repertoire and is unimportant for most purposes.
+ *
+ * See: https://unicode.org/versions/
+ */
+Datum
+unicode_version(PG_FUNCTION_ARGS)
+{
+	PG_RETURN_TEXT_P(cstring_to_text(PG_UNICODE_VERSION));
+}
+
+/*
+ * Returns version of Unicode used by ICU, if enabled; otherwise NULL.
+ */
+Datum
+icu_unicode_version(PG_FUNCTION_ARGS)
+{
+#ifdef USE_ICU
+	PG_RETURN_TEXT_P(cstring_to_text(U_UNICODE_VERSION));
+#else
+	PG_RETURN_NULL();
+#endif
+}
+
+/*
+ * Check whether the string contains only assigned Unicode code
+ * points. Requires that the database encoding is UTF-8.
+ */
+Datum
+unicode_assigned(PG_FUNCTION_ARGS)
+{
+	text			*input = PG_GETARG_TEXT_PP(0);
+	unsigned char	*p;
+	int				 size;
+
+	if (GetDatabaseEncoding() != PG_UTF8)
+		ereport(ERROR,
+				(errcode(ERRCODE_SYNTAX_ERROR),
+				 errmsg("Unicode categorization can only be performed if server encoding is UTF8")));
+
+	/* count the characters, then check each code point's category */
+	size = pg_mbstrlen_with_len(VARDATA_ANY(input), VARSIZE_ANY_EXHDR(input));
+	p = (unsigned char *) VARDATA_ANY(input);
+	for (int i = 0; i < size; i++)
+	{
+		pg_wchar	uchar	 = utf8_to_unicode(p);
+		int			category = unicode_category(uchar);
+
+		if (category == PG_U_UNASSIGNED)
+			PG_RETURN_BOOL(false);
+
+		p += pg_utf_mblen(p);
+	}
+
+	PG_RETURN_BOOL(true);
+}
+
 Datum
 unicode_normalize_func(PG_FUNCTION_ARGS)
 {
diff --git a/src/common/Makefile b/src/common/Makefile
index 70884be00c..8de31d4763 100644
--- a/src/common/Makefile
+++ b/src/common/Makefile
@@ -78,6 +78,7 @@ OBJS_COMMON = \
 	scram-common.o \
 	string.o \
 	stringinfo.o \
+	unicode_category.o \
 	unicode_norm.o \
 	username.o \
 	wait_error.o \
diff --git a/src/common/meson.build b/src/common/meson.build
index ae05ac63cf..8be145c0fb 100644
--- a/src/common/meson.build
+++ b/src/common/meson.build
@@ -30,6 +30,7 @@ common_sources = files(
   'scram-common.c',
   'string.c',
   'stringinfo.c',
+  'unicode_category.c',
   'unicode_norm.c',
   'username.c',
   'wait_error.c',
diff --git a/src/common/unicode/Makefile b/src/common/unicode/Makefile
index 382da476cf..27a7d5a807 100644
--- a/src/common/unicode/Makefile
+++ b/src/common/unicode/Makefile
@@ -15,11 +15,15 @@ include $(top_builddir)/src/Makefile.global
 override CPPFLAGS := -DFRONTEND -I. $(CPPFLAGS)
 LIBS += $(PTHREAD_LIBS)
 
+LDFLAGS_INTERNAL += $(ICU_LIBS)
+CPPFLAGS += $(ICU_CFLAGS)
+
 # By default, do nothing.
 all:
 
-update-unicode: unicode_norm_table.h unicode_nonspacing_table.h unicode_east_asian_fw_table.h unicode_normprops_table.h unicode_norm_hashfunc.h
+update-unicode: unicode_category_table.h unicode_norm_table.h unicode_nonspacing_table.h unicode_east_asian_fw_table.h unicode_normprops_table.h unicode_norm_hashfunc.h unicode_version.h
 	mv $^ $(top_srcdir)/src/include/common/
+	$(MAKE) category-check
 	$(MAKE) normalization-check
 
 # These files are part of the Unicode Character Database. Download
@@ -28,6 +32,12 @@ update-unicode: unicode_norm_table.h unicode_nonspacing_table.h unicode_east_asi
 UnicodeData.txt EastAsianWidth.txt DerivedNormalizationProps.txt CompositionExclusions.txt NormalizationTest.txt: $(top_builddir)/src/Makefile.global
 	$(DOWNLOAD) https://www.unicode.org/Public/$(UNICODE_VERSION)/ucd/$(@F)
 
+unicode_version.h: generate-unicode_version.pl
+	$(PERL) $< --version $(UNICODE_VERSION)
+
+unicode_category_table.h: generate-unicode_category_table.pl UnicodeData.txt
+	$(PERL) $<
+
 # Generation of conversion tables used for string normalization with
 # UTF-8 strings.
 unicode_norm_hashfunc.h: unicode_norm_table.h
@@ -45,9 +55,14 @@ unicode_normprops_table.h: generate-unicode_normprops_table.pl DerivedNormalizat
 	$(PERL) $^ >$@
 
 # Test suite
+category-check: category_test
+	./category_test
+
 normalization-check: norm_test
 	./norm_test
 
+category_test: category_test.o ../unicode_category.o | submake-common
+
 norm_test: norm_test.o ../unicode_norm.o | submake-common
 
 norm_test.o: norm_test_table.h
@@ -64,7 +79,7 @@ norm_test_table.h: generate-norm_test_table.pl NormalizationTest.txt
 
 
 clean:
-	rm -f $(OBJS) norm_test norm_test.o
+	rm -f $(OBJS) category_test category_test.o norm_test norm_test.o
 
 distclean: clean
 	rm -f UnicodeData.txt EastAsianWidth.txt CompositionExclusions.txt NormalizationTest.txt norm_test_table.h unicode_norm_table.h
diff --git a/src/common/unicode/category_test.c b/src/common/unicode/category_test.c
new file mode 100644
index 0000000000..2cbd4250f9
--- /dev/null
+++ b/src/common/unicode/category_test.c
@@ -0,0 +1,103 @@
+/*-------------------------------------------------------------------------
+ * category_test.c
+ *		Program to test Unicode general category functions.
+ *
+ * Portions Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  src/common/unicode/category_test.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+#ifdef USE_ICU
+#include <unicode/uchar.h>
+#endif
+#include "common/unicode_category.h"
+#include "common/unicode_version.h"
+
+/*
+ * Parse X.Y[.Z] into integer composed from X and Y.
+ */
+static int
+parse_unicode_version(const char *version)
+{
+	int n, major, minor;
+
+	n = sscanf(version, "%d.%d", &major, &minor);
+
+	Assert(n == 2);
+	Assert(major < 0xff && minor < 0xff);
+
+	return major * 100 + minor;
+}
+
+/*
+ * Exhaustively test that the Unicode category for each codepoint matches that
+ * returned by ICU.
+ */
+int
+main(int argc, char **argv)
+{
+#ifdef USE_ICU
+	int		pg_unicode_version = parse_unicode_version(PG_UNICODE_VERSION);
+	int		icu_unicode_version = parse_unicode_version(U_UNICODE_VERSION);
+	int		pg_skipped_codepoints  = 0;
+	int		icu_skipped_codepoints = 0;
+
+	printf("Postgres Unicode Version:\t%s\n", PG_UNICODE_VERSION);
+	printf("ICU Unicode Version:\t\t%s\n", U_UNICODE_VERSION);
+
+	for (UChar32 code = 0; code <= 0x10ffff; code++)
+	{
+		uint8_t pg_category = unicode_category(code);
+		uint8_t icu_category = u_charType(code);
+		if (pg_category != icu_category)
+		{
+			/*
+			 * A version mismatch means that some assigned codepoints in the
+			 * newer version may be unassigned in the older version. That's
+			 * OK, though the test will not cover those codepoints marked
+			 * unassigned in the older version (that is, it will no longer be
+			 * an exhaustive test).
+			 */
+			if (pg_category == PG_U_UNASSIGNED &&
+				pg_unicode_version < icu_unicode_version)
+				pg_skipped_codepoints++;
+			else if (icu_category == PG_U_UNASSIGNED &&
+					 icu_unicode_version < pg_unicode_version)
+				icu_skipped_codepoints++;
+			else
+			{
+				printf("FAILURE for codepoint %06x\n", code);
+				printf("Postgres category:	%02d %s %s\n", pg_category,
+					   unicode_category_short(pg_category),
+					   unicode_category_string(pg_category));
+				printf("ICU category:		%02d %s %s\n", icu_category,
+					   unicode_category_short(icu_category),
+					   unicode_category_string(icu_category));
+				printf("\n");
+				exit(1);
+			}
+		}
+	}
+
+	if (pg_skipped_codepoints > 0)
+		printf("Skipped %d codepoints unassigned in Postgres due to Unicode version mismatch.\n",
+			   pg_skipped_codepoints);
+	if (icu_skipped_codepoints > 0)
+		printf("Skipped %d codepoints unassigned in ICU due to Unicode version mismatch.\n",
+			   icu_skipped_codepoints);
+
+	printf("category_test: All tests successful!\n");
+	exit(0);
+#else
+	printf("ICU support required for test; skipping.\n");
+	exit(0);
+#endif
+}
diff --git a/src/common/unicode/generate-unicode_category_table.pl b/src/common/unicode/generate-unicode_category_table.pl
new file mode 100644
index 0000000000..bec34d591d
--- /dev/null
+++ b/src/common/unicode/generate-unicode_category_table.pl
@@ -0,0 +1,202 @@
+#!/usr/bin/perl
+#
+# Generate a code point category table and its lookup utilities, using
+# Unicode data files as input.
+#
+# Input: UnicodeData.txt
+# Output: unicode_category_table.h
+#
+# Copyright (c) 2000-2023, PostgreSQL Global Development Group
+
+use strict;
+use warnings;
+use Getopt::Long;
+
+use FindBin;
+use lib "$FindBin::RealBin/../../tools/";
+
+my $CATEGORY_UNASSIGNED = 'Cn';
+
+my $output_path = '.';
+
+GetOptions('outdir:s' => \$output_path);
+
+my $output_table_file = "$output_path/unicode_category_table.h";
+
+my $FH;
+
+# Read entries from UnicodeData.txt into a list of codepoint ranges
+# and their general category.
+my @category_ranges = ();
+my $range_start = undef;
+my $range_end = undef;
+my $range_category = undef;
+
+# If between a "<..., First>" entry and a "<..., Last>" entry, the gap in
+# codepoints represents a range, and $gap_category is equal to the
+# category for both (which must match). Otherwise, the gap represents
+# unassigned code points.
+my $gap_category = undef;
+
+open($FH, '<', "$output_path/UnicodeData.txt")
+  or die "Could not open $output_path/UnicodeData.txt: $!.";
+while (my $line = <$FH>)
+{
+	my @elts = split(';', $line);
+	my $code = hex($elts[0]);
+	my $name = $elts[1];
+	my $category = $elts[2];
+
+	die "codepoint out of range" if $code > 0x10FFFF;
+	die "unassigned codepoint in UnicodeData.txt" if $category eq $CATEGORY_UNASSIGNED;
+
+	if (!defined($range_start)) {
+		my $code_str = sprintf "0x%06x", $code;
+		die if defined($range_end) || defined($range_category) || defined($gap_category);
+		die "unexpected first entry <..., Last>" if ($name =~ /Last>/);
+		die "expected 0x000000 for first entry, got $code_str" if $code != 0x000000;
+
+		# initialize
+		$range_start = $code;
+		$range_end = $code;
+		$range_category = $category;
+		if ($name =~ /<.*, First>$/) {
+			$gap_category = $category;
+		} else {
+			$gap_category = $CATEGORY_UNASSIGNED;
+		}
+		next;
+	}
+
+	# Gap in codepoints detected. If it's a different category than
+	# the current range, emit the current range and initialize a new
+	# range representing the gap.
+	if ($range_end + 1 != $code && $range_category ne $gap_category) {
+		push(@category_ranges, {start => $range_start, end => $range_end, category => $range_category});
+		$range_start = $range_end + 1;
+		$range_end = $code - 1;
+		$range_category = $gap_category;
+	}
+
+	# different category; new range
+	if ($range_category ne $category) {
+		push(@category_ranges, {start => $range_start, end => $range_end, category => $range_category});
+		$range_start = $code;
+		$range_end = $code;
+		$range_category = $category;
+	}
+
+	if ($name =~ /<.*, First>$/) {
+		die "<..., First> entry unexpectedly follows <..., Last> entry"
+		  if $gap_category ne $CATEGORY_UNASSIGNED;
+		$gap_category = $category;
+	}
+	elsif ($name =~ /<.*, Last>$/) {
+		die "<..., First> and <..., Last> entries have mismatching general category"
+		  if $gap_category ne $category;
+		$gap_category = $CATEGORY_UNASSIGNED;
+	}
+	else {
+		die "unexpected entry found between <..., First> and <..., Last>"
+		  if $gap_category ne $CATEGORY_UNASSIGNED;
+	}
+
+	$range_end = $code;
+}
+close $FH;
+
+die "<..., First> entry with no corresponding <..., Last> entry"
+  if $gap_category ne $CATEGORY_UNASSIGNED;
+
+# emit final range
+push(@category_ranges, {start => $range_start, end => $range_end, category => $range_category});
+
+# emit range for any unassigned code points after last entry
+if ($range_end < 0x10FFFF) {
+	$range_start = $range_end + 1;
+	$range_end = 0x10FFFF;
+	$range_category = $CATEGORY_UNASSIGNED;
+	push(@category_ranges, {start => $range_start, end => $range_end, category => $range_category});
+}
+
+my $num_ranges = scalar @category_ranges;
+
+# See: https://www.unicode.org/reports/tr44/#General_Category_Values
+my $categories = {
+	Cn => 'PG_U_UNASSIGNED',
+	Lu => 'PG_U_UPPERCASE_LETTER',
+	Ll => 'PG_U_LOWERCASE_LETTER',
+	Lt => 'PG_U_TITLECASE_LETTER',
+	Lm => 'PG_U_MODIFIER_LETTER',
+	Lo => 'PG_U_OTHER_LETTER',
+	Mn => 'PG_U_NON_SPACING_MARK',
+	Me => 'PG_U_ENCLOSING_MARK',
+	Mc => 'PG_U_COMBINING_SPACING_MARK',
+	Nd => 'PG_U_DECIMAL_DIGIT_NUMBER',
+	Nl => 'PG_U_LETTER_NUMBER',
+	No => 'PG_U_OTHER_NUMBER',
+	Zs => 'PG_U_SPACE_SEPARATOR',
+	Zl => 'PG_U_LINE_SEPARATOR',
+	Zp => 'PG_U_PARAGRAPH_SEPARATOR',
+	Cc => 'PG_U_CONTROL_CHAR',
+	Cf => 'PG_U_FORMAT_CHAR',
+	Co => 'PG_U_PRIVATE_USE_CHAR',
+	Cs => 'PG_U_SURROGATE',
+	Pd => 'PG_U_DASH_PUNCTUATION',
+	Ps => 'PG_U_START_PUNCTUATION',
+	Pe => 'PG_U_END_PUNCTUATION',
+	Pc => 'PG_U_CONNECTOR_PUNCTUATION',
+	Po => 'PG_U_OTHER_PUNCTUATION',
+	Sm => 'PG_U_MATH_SYMBOL',
+	Sc => 'PG_U_CURRENCY_SYMBOL',
+	Sk => 'PG_U_MODIFIER_SYMBOL',
+	So => 'PG_U_OTHER_SYMBOL',
+	Pi => 'PG_U_INITIAL_PUNCTUATION',
+	Pf => 'PG_U_FINAL_PUNCTUATION'
+};
+
+# Start writing out the output files
+open my $OT, '>', $output_table_file
+  or die "Could not open output file $output_table_file: $!\n";
+
+print $OT <<HEADER;
+/*-------------------------------------------------------------------------
+ *
+ * unicode_category_table.h
+ *	  Category table for Unicode character classification.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/common/unicode_category_table.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+/*
+ * File auto-generated by src/common/unicode/generate-unicode_category_table.pl,
+ * do not edit. There is deliberately not an #ifndef PG_UNICODE_CATEGORY_TABLE_H
+ * here.
+ */
+typedef struct
+{
+	uint32		first;	/* Unicode codepoint */
+	uint32		last;		/* Unicode codepoint */
+	uint8		category;		/* General Category of codepoint range */
+} pg_category_range;
+
+/* table of Unicode codepoint ranges and their categories */
+static const pg_category_range unicode_categories[$num_ranges] =
+{
+HEADER
+
+my $firsttime = 1;
+foreach my $range (@category_ranges) {
+	printf $OT ",\n" unless $firsttime;
+	$firsttime = 0;
+
+	my $category = $categories->{$range->{category}};
+	die "category missing: $range->{category}" unless $category;
+	printf $OT "\t{0x%06x, 0x%06x, %s}", $range->{start}, $range->{end}, $category;
+}
+print $OT "\n};\n\n";
diff --git a/src/common/unicode/generate-unicode_version.pl b/src/common/unicode/generate-unicode_version.pl
new file mode 100644
index 0000000000..4dd400e32d
--- /dev/null
+++ b/src/common/unicode/generate-unicode_version.pl
@@ -0,0 +1,48 @@
+#!/usr/bin/perl
+#
+# Generate header file with Unicode version used by Postgres.
+#
+# Output: unicode_version.h
+#
+# Copyright (c) 2000-2023, PostgreSQL Global Development Group
+
+use strict;
+use warnings;
+use Getopt::Long;
+
+use FindBin;
+use lib "$FindBin::RealBin/../../tools/";
+
+my $output_path = '.';
+my $version_str = undef;
+
+GetOptions('outdir:s' => \$output_path, 'version:s' => \$version_str);
+
+my @version_parts = split /\./, $version_str;
+
+my $unicode_version_str = sprintf "%d.%d", $version_parts[0], $version_parts[1];
+
+my $output_file = "$output_path/unicode_version.h";
+
+# Start writing out the output files
+open my $OT, '>', $output_file
+  or die "Could not open output file $output_file: $!\n";
+
+print $OT <<HEADER;
+/*-------------------------------------------------------------------------
+ *
+ * unicode_version.h
+ *	  Unicode version used by Postgres.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/common/unicode_version.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#define PG_UNICODE_VERSION		"$unicode_version_str"
+
+
+HEADER
diff --git a/src/common/unicode/meson.build b/src/common/unicode/meson.build
index 357ca2f9fb..6af46122c4 100644
--- a/src/common/unicode/meson.build
+++ b/src/common/unicode/meson.build
@@ -24,6 +24,25 @@ endforeach
 
 update_unicode_targets = []
 
+update_unicode_targets += \
+  custom_target('unicode_version.h',
+    output: ['unicode_version.h'],
+    command: [
+      perl, files('generate-unicode_version.pl'),
+      '--outdir', '@OUTDIR@', '--version', UNICODE_VERSION],
+    build_by_default: false,
+  )
+
+update_unicode_targets += \
+  custom_target('unicode_category_table.h',
+    input: [unicode_data['UnicodeData.txt']],
+    output: ['unicode_category_table.h'],
+    command: [
+      perl, files('generate-unicode_category_table.pl'),
+      '--outdir', '@OUTDIR@', '@INPUT@'],
+    build_by_default: false,
+  )
+
 update_unicode_targets += \
   custom_target('unicode_norm_table.h',
     input: [unicode_data['UnicodeData.txt'], unicode_data['CompositionExclusions.txt']],
@@ -73,6 +92,17 @@ norm_test_table = custom_target('norm_test_table.h',
 
 inc = include_directories('.')
 
+category_test = executable('category_test',
+  ['category_test.c'],
+  dependencies: [frontend_port_code, icu],
+  include_directories: inc,
+  link_with: [common_static, pgport_static],
+  build_by_default: false,
+  kwargs: default_bin_args + {
+    'install': false,
+  }
+)
+
 norm_test = executable('norm_test',
   ['norm_test.c', norm_test_table],
   dependencies: [frontend_port_code],
@@ -86,6 +116,16 @@ norm_test = executable('norm_test',
 
 update_unicode_dep = []
 
+if not meson.is_cross_build()
+  update_unicode_dep += custom_target('category_test.run',
+    output: 'category_test.run',
+    input: update_unicode_targets,
+    command: [category_test, UNICODE_VERSION],
+    build_by_default: false,
+    build_always_stale: true,
+  )
+endif
+
 if not meson.is_cross_build()
   update_unicode_dep += custom_target('norm_test.run',
     output: 'norm_test.run',
diff --git a/src/common/unicode/norm_test.c b/src/common/unicode/norm_test.c
index 809a6dee54..b6097b912a 100644
--- a/src/common/unicode/norm_test.c
+++ b/src/common/unicode/norm_test.c
@@ -81,6 +81,6 @@ main(int argc, char **argv)
 		}
 	}
 
-	printf("All tests successful!\n");
+	printf("norm_test: All tests successful!\n");
 	exit(0);
 }
diff --git a/src/common/unicode_category.c b/src/common/unicode_category.c
new file mode 100644
index 0000000000..b8b8ee8a4e
--- /dev/null
+++ b/src/common/unicode_category.c
@@ -0,0 +1,197 @@
+/*-------------------------------------------------------------------------
+ * unicode_category.c
+ *		Determine general category of Unicode characters.
+ *
+ * Portions Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  src/common/unicode_category.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef FRONTEND
+#include "postgres.h"
+#else
+#include "postgres_fe.h"
+#endif
+
+#include "common/unicode_category.h"
+#include "common/unicode_category_table.h"
+
+/*
+ * Unicode general category for the given codepoint.
+ */
+pg_unicode_category
+unicode_category(pg_wchar ucs)
+{
+	int	min = 0;
+	int	mid;
+	int max = (sizeof(unicode_categories) / sizeof(pg_category_range)) - 1;
+
+	Assert (ucs >= unicode_categories[0].first &&
+			ucs <= unicode_categories[max].last);
+
+	while (max >= min)
+	{
+		mid = (min + max) / 2;
+		if (ucs > unicode_categories[mid].last)
+			min = mid + 1;
+		else if (ucs < unicode_categories[mid].first)
+			max = mid - 1;
+		else
+			return unicode_categories[mid].category;
+	}
+
+	Assert(false);
+	return (pg_unicode_category) -1;
+}
+
+/*
+ * Description of Unicode general category.
+ *
+ * See: https://www.unicode.org/reports/tr44/#General_Category_Values
+ */
+const char *
+unicode_category_string(pg_unicode_category category)
+{
+	switch (category)
+	{
+		case PG_U_UNASSIGNED:
+			return "Unassigned";
+		case PG_U_UPPERCASE_LETTER:
+			return "Uppercase_Letter";
+		case PG_U_LOWERCASE_LETTER:
+			return "Lowercase_Letter";
+		case PG_U_TITLECASE_LETTER:
+			return "Titlecase_Letter";
+		case PG_U_MODIFIER_LETTER:
+			return "Modifier_Letter";
+		case PG_U_OTHER_LETTER:
+			return "Other_Letter";
+		case PG_U_NON_SPACING_MARK:
+			return "Nonspacing_Mark";
+		case PG_U_ENCLOSING_MARK:
+			return "Enclosing_Mark";
+		case PG_U_COMBINING_SPACING_MARK:
+			return "Spacing_Mark";
+		case PG_U_DECIMAL_DIGIT_NUMBER:
+			return "Decimal_Number";
+		case PG_U_LETTER_NUMBER:
+			return "Letter_Number";
+		case PG_U_OTHER_NUMBER:
+			return "Other_Number";
+		case PG_U_SPACE_SEPARATOR:
+			return "Space_Separator";
+		case PG_U_LINE_SEPARATOR:
+			return "Line_Separator";
+		case PG_U_PARAGRAPH_SEPARATOR:
+			return "Paragraph_Separator";
+		case PG_U_CONTROL_CHAR:
+			return "Control";
+		case PG_U_FORMAT_CHAR:
+			return "Format";
+		case PG_U_PRIVATE_USE_CHAR:
+			return "Private_Use";
+		case PG_U_SURROGATE:
+			return "Surrogate";
+		case PG_U_DASH_PUNCTUATION:
+			return "Dash_Punctuation";
+		case PG_U_START_PUNCTUATION:
+			return "Open_Punctuation";
+		case PG_U_END_PUNCTUATION:
+			return "Close_Punctuation";
+		case PG_U_CONNECTOR_PUNCTUATION:
+			return "Connector_Punctuation";
+		case PG_U_OTHER_PUNCTUATION:
+			return "Other_Punctuation";
+		case PG_U_MATH_SYMBOL:
+			return "Math_Symbol";
+		case PG_U_CURRENCY_SYMBOL:
+			return "Currency_Symbol";
+		case PG_U_MODIFIER_SYMBOL:
+			return "Modifier_Symbol";
+		case PG_U_OTHER_SYMBOL:
+			return "Other_Symbol";
+		case PG_U_INITIAL_PUNCTUATION:
+			return "Initial_Punctuation";
+		case PG_U_FINAL_PUNCTUATION:
+			return "Final_Punctuation";
+		default:
+			return "Unrecognized";
+	}
+}
+
+/*
+ * Short code for Unicode general category.
+ *
+ * See: https://www.unicode.org/reports/tr44/#General_Category_Values
+ */
+const char *
+unicode_category_short(pg_unicode_category category)
+{
+	switch (category)
+	{
+		case PG_U_UNASSIGNED:
+			return "Cn";
+		case PG_U_UPPERCASE_LETTER:
+			return "Lu";
+		case PG_U_LOWERCASE_LETTER:
+			return "Ll";
+		case PG_U_TITLECASE_LETTER:
+			return "Lt";
+		case PG_U_MODIFIER_LETTER:
+			return "Lm";
+		case PG_U_OTHER_LETTER:
+			return "Lo";
+		case PG_U_NON_SPACING_MARK:
+			return "Mn";
+		case PG_U_ENCLOSING_MARK:
+			return "Me";
+		case PG_U_COMBINING_SPACING_MARK:
+			return "Mc";
+		case PG_U_DECIMAL_DIGIT_NUMBER:
+			return "Nd";
+		case PG_U_LETTER_NUMBER:
+			return "Nl";
+		case PG_U_OTHER_NUMBER:
+			return "No";
+		case PG_U_SPACE_SEPARATOR:
+			return "Zs";
+		case PG_U_LINE_SEPARATOR:
+			return "Zl";
+		case PG_U_PARAGRAPH_SEPARATOR:
+			return "Zp";
+		case PG_U_CONTROL_CHAR:
+			return "Cc";
+		case PG_U_FORMAT_CHAR:
+			return "Cf";
+		case PG_U_PRIVATE_USE_CHAR:
+			return "Co";
+		case PG_U_SURROGATE:
+			return "Cs";
+		case PG_U_DASH_PUNCTUATION:
+			return "Pd";
+		case PG_U_START_PUNCTUATION:
+			return "Ps";
+		case PG_U_END_PUNCTUATION:
+			return "Pe";
+		case PG_U_CONNECTOR_PUNCTUATION:
+			return "Pc";
+		case PG_U_OTHER_PUNCTUATION:
+			return "Po";
+		case PG_U_MATH_SYMBOL:
+			return "Sm";
+		case PG_U_CURRENCY_SYMBOL:
+			return "Sc";
+		case PG_U_MODIFIER_SYMBOL:
+			return "Sk";
+		case PG_U_OTHER_SYMBOL:
+			return "So";
+		case PG_U_INITIAL_PUNCTUATION:
+			return "Pi";
+		case PG_U_FINAL_PUNCTUATION:
+			return "Pf";
+		default:
+			return "??";
+	}
+}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 72ea4aa8b8..89c7c7ba31 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12019,6 +12019,18 @@
   proname => 'pg_partition_root', prorettype => 'regclass',
   proargtypes => 'regclass', prosrc => 'pg_partition_root' },
 
+{ oid => '4549', descr => 'Unicode version used by Postgres',
+  proname => 'unicode_version', prorettype => 'text', proargtypes => '',
+  prosrc => 'unicode_version' },
+
+{ oid => '6099', descr => 'Unicode version used by ICU, if enabled',
+  proname => 'icu_unicode_version', prorettype => 'text', proargtypes => '',
+  prosrc => 'icu_unicode_version' },
+
+{ oid => '6105', descr => 'check valid Unicode',
+  proname => 'unicode_assigned', prorettype => 'bool', proargtypes => 'text',
+  prosrc => 'unicode_assigned' },
+
 { oid => '4350', descr => 'Unicode normalization',
   proname => 'normalize', prorettype => 'text', proargtypes => 'text text',
   prosrc => 'unicode_normalize_func' },
diff --git a/src/include/common/unicode_category.h b/src/include/common/unicode_category.h
new file mode 100644
index 0000000000..e4301be726
--- /dev/null
+++ b/src/include/common/unicode_category.h
@@ -0,0 +1,57 @@
+/*-------------------------------------------------------------------------
+ *
+ * unicode_category.h
+ *	  Routines for determining the category of Unicode characters.
+ *
+ * These definitions can be used by both frontend and backend code.
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * src/include/common/unicode_category.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef UNICODE_CATEGORY_H
+#define UNICODE_CATEGORY_H
+
+#include "mb/pg_wchar.h"
+
+/* matches corresponding numeric values of UCharCategory, defined by ICU */
+typedef enum pg_unicode_category {
+	PG_U_UNASSIGNED = 0,
+	PG_U_UPPERCASE_LETTER = 1,
+	PG_U_LOWERCASE_LETTER = 2,
+	PG_U_TITLECASE_LETTER = 3,
+	PG_U_MODIFIER_LETTER = 4,
+	PG_U_OTHER_LETTER = 5,
+	PG_U_NON_SPACING_MARK = 6,
+	PG_U_ENCLOSING_MARK = 7,
+	PG_U_COMBINING_SPACING_MARK = 8,
+	PG_U_DECIMAL_DIGIT_NUMBER = 9,
+	PG_U_LETTER_NUMBER = 10,
+	PG_U_OTHER_NUMBER = 11,
+	PG_U_SPACE_SEPARATOR = 12,
+	PG_U_LINE_SEPARATOR = 13,
+	PG_U_PARAGRAPH_SEPARATOR = 14,
+	PG_U_CONTROL_CHAR = 15,
+	PG_U_FORMAT_CHAR = 16,
+	PG_U_PRIVATE_USE_CHAR = 17,
+	PG_U_SURROGATE = 18,
+	PG_U_DASH_PUNCTUATION = 19,
+	PG_U_START_PUNCTUATION = 20,
+	PG_U_END_PUNCTUATION = 21,
+	PG_U_CONNECTOR_PUNCTUATION = 22,
+	PG_U_OTHER_PUNCTUATION = 23,
+	PG_U_MATH_SYMBOL = 24,
+	PG_U_CURRENCY_SYMBOL = 25,
+	PG_U_MODIFIER_SYMBOL = 26,
+	PG_U_OTHER_SYMBOL = 27,
+	PG_U_INITIAL_PUNCTUATION = 28,
+	PG_U_FINAL_PUNCTUATION = 29
+} pg_unicode_category;
+
+extern pg_unicode_category unicode_category(pg_wchar ucs);
+extern const char *unicode_category_string(pg_unicode_category category);
+extern const char *unicode_category_short(pg_unicode_category category);
+
+#endif							/* UNICODE_CATEGORY_H */
diff --git a/src/include/common/unicode_category_table.h b/src/include/common/unicode_category_table.h
new file mode 100644
index 0000000000..3125cbdbf5
--- /dev/null
+++ b/src/include/common/unicode_category_table.h
@@ -0,0 +1,4039 @@
+/*-------------------------------------------------------------------------
+ *
+ * unicode_category_table.h
+ *	  Category table for Unicode character classification.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/common/unicode_category_table.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+/*
+ * File auto-generated by src/common/unicode/generate-unicode_category_table.pl,
+ * do not edit. There is deliberately not an #ifndef PG_UNICODE_CATEGORY_TABLE_H
+ * here.
+ */
+typedef struct
+{
+	uint32		first;	/* Unicode codepoint */
+	uint32		last;		/* Unicode codepoint */
+	uint8		category;		/* General Category of codepoint range */
+} pg_category_range;
+
+/* table of Unicode codepoint ranges and their categories */
+static const pg_category_range unicode_categories[4009] =
+{
+	{0x000000, 0x00001f, PG_U_CONTROL_CHAR},
+	{0x000020, 0x000020, PG_U_SPACE_SEPARATOR},
+	{0x000021, 0x000023, PG_U_OTHER_PUNCTUATION},
+	{0x000024, 0x000024, PG_U_CURRENCY_SYMBOL},
+	{0x000025, 0x000027, PG_U_OTHER_PUNCTUATION},
+	{0x000028, 0x000028, PG_U_START_PUNCTUATION},
+	{0x000029, 0x000029, PG_U_END_PUNCTUATION},
+	{0x00002a, 0x00002a, PG_U_OTHER_PUNCTUATION},
+	{0x00002b, 0x00002b, PG_U_MATH_SYMBOL},
+	{0x00002c, 0x00002c, PG_U_OTHER_PUNCTUATION},
+	{0x00002d, 0x00002d, PG_U_DASH_PUNCTUATION},
+	{0x00002e, 0x00002f, PG_U_OTHER_PUNCTUATION},
+	{0x000030, 0x000039, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00003a, 0x00003b, PG_U_OTHER_PUNCTUATION},
+	{0x00003c, 0x00003e, PG_U_MATH_SYMBOL},
+	{0x00003f, 0x000040, PG_U_OTHER_PUNCTUATION},
+	{0x000041, 0x00005a, PG_U_UPPERCASE_LETTER},
+	{0x00005b, 0x00005b, PG_U_START_PUNCTUATION},
+	{0x00005c, 0x00005c, PG_U_OTHER_PUNCTUATION},
+	{0x00005d, 0x00005d, PG_U_END_PUNCTUATION},
+	{0x00005e, 0x00005e, PG_U_MODIFIER_SYMBOL},
+	{0x00005f, 0x00005f, PG_U_CONNECTOR_PUNCTUATION},
+	{0x000060, 0x000060, PG_U_MODIFIER_SYMBOL},
+	{0x000061, 0x00007a, PG_U_LOWERCASE_LETTER},
+	{0x00007b, 0x00007b, PG_U_START_PUNCTUATION},
+	{0x00007c, 0x00007c, PG_U_MATH_SYMBOL},
+	{0x00007d, 0x00007d, PG_U_END_PUNCTUATION},
+	{0x00007e, 0x00007e, PG_U_MATH_SYMBOL},
+	{0x00007f, 0x00009f, PG_U_CONTROL_CHAR},
+	{0x0000a0, 0x0000a0, PG_U_SPACE_SEPARATOR},
+	{0x0000a1, 0x0000a1, PG_U_OTHER_PUNCTUATION},
+	{0x0000a2, 0x0000a5, PG_U_CURRENCY_SYMBOL},
+	{0x0000a6, 0x0000a6, PG_U_OTHER_SYMBOL},
+	{0x0000a7, 0x0000a7, PG_U_OTHER_PUNCTUATION},
+	{0x0000a8, 0x0000a8, PG_U_MODIFIER_SYMBOL},
+	{0x0000a9, 0x0000a9, PG_U_OTHER_SYMBOL},
+	{0x0000aa, 0x0000aa, PG_U_OTHER_LETTER},
+	{0x0000ab, 0x0000ab, PG_U_INITIAL_PUNCTUATION},
+	{0x0000ac, 0x0000ac, PG_U_MATH_SYMBOL},
+	{0x0000ad, 0x0000ad, PG_U_FORMAT_CHAR},
+	{0x0000ae, 0x0000ae, PG_U_OTHER_SYMBOL},
+	{0x0000af, 0x0000af, PG_U_MODIFIER_SYMBOL},
+	{0x0000b0, 0x0000b0, PG_U_OTHER_SYMBOL},
+	{0x0000b1, 0x0000b1, PG_U_MATH_SYMBOL},
+	{0x0000b2, 0x0000b3, PG_U_OTHER_NUMBER},
+	{0x0000b4, 0x0000b4, PG_U_MODIFIER_SYMBOL},
+	{0x0000b5, 0x0000b5, PG_U_LOWERCASE_LETTER},
+	{0x0000b6, 0x0000b7, PG_U_OTHER_PUNCTUATION},
+	{0x0000b8, 0x0000b8, PG_U_MODIFIER_SYMBOL},
+	{0x0000b9, 0x0000b9, PG_U_OTHER_NUMBER},
+	{0x0000ba, 0x0000ba, PG_U_OTHER_LETTER},
+	{0x0000bb, 0x0000bb, PG_U_FINAL_PUNCTUATION},
+	{0x0000bc, 0x0000be, PG_U_OTHER_NUMBER},
+	{0x0000bf, 0x0000bf, PG_U_OTHER_PUNCTUATION},
+	{0x0000c0, 0x0000d6, PG_U_UPPERCASE_LETTER},
+	{0x0000d7, 0x0000d7, PG_U_MATH_SYMBOL},
+	{0x0000d8, 0x0000de, PG_U_UPPERCASE_LETTER},
+	{0x0000df, 0x0000f6, PG_U_LOWERCASE_LETTER},
+	{0x0000f7, 0x0000f7, PG_U_MATH_SYMBOL},
+	{0x0000f8, 0x0000ff, PG_U_LOWERCASE_LETTER},
+	{0x000100, 0x000100, PG_U_UPPERCASE_LETTER},
+	{0x000101, 0x000101, PG_U_LOWERCASE_LETTER},
+	{0x000102, 0x000102, PG_U_UPPERCASE_LETTER},
+	{0x000103, 0x000103, PG_U_LOWERCASE_LETTER},
+	{0x000104, 0x000104, PG_U_UPPERCASE_LETTER},
+	{0x000105, 0x000105, PG_U_LOWERCASE_LETTER},
+	{0x000106, 0x000106, PG_U_UPPERCASE_LETTER},
+	{0x000107, 0x000107, PG_U_LOWERCASE_LETTER},
+	{0x000108, 0x000108, PG_U_UPPERCASE_LETTER},
+	{0x000109, 0x000109, PG_U_LOWERCASE_LETTER},
+	{0x00010a, 0x00010a, PG_U_UPPERCASE_LETTER},
+	{0x00010b, 0x00010b, PG_U_LOWERCASE_LETTER},
+	{0x00010c, 0x00010c, PG_U_UPPERCASE_LETTER},
+	{0x00010d, 0x00010d, PG_U_LOWERCASE_LETTER},
+	{0x00010e, 0x00010e, PG_U_UPPERCASE_LETTER},
+	{0x00010f, 0x00010f, PG_U_LOWERCASE_LETTER},
+	{0x000110, 0x000110, PG_U_UPPERCASE_LETTER},
+	{0x000111, 0x000111, PG_U_LOWERCASE_LETTER},
+	{0x000112, 0x000112, PG_U_UPPERCASE_LETTER},
+	{0x000113, 0x000113, PG_U_LOWERCASE_LETTER},
+	{0x000114, 0x000114, PG_U_UPPERCASE_LETTER},
+	{0x000115, 0x000115, PG_U_LOWERCASE_LETTER},
+	{0x000116, 0x000116, PG_U_UPPERCASE_LETTER},
+	{0x000117, 0x000117, PG_U_LOWERCASE_LETTER},
+	{0x000118, 0x000118, PG_U_UPPERCASE_LETTER},
+	{0x000119, 0x000119, PG_U_LOWERCASE_LETTER},
+	{0x00011a, 0x00011a, PG_U_UPPERCASE_LETTER},
+	{0x00011b, 0x00011b, PG_U_LOWERCASE_LETTER},
+	{0x00011c, 0x00011c, PG_U_UPPERCASE_LETTER},
+	{0x00011d, 0x00011d, PG_U_LOWERCASE_LETTER},
+	{0x00011e, 0x00011e, PG_U_UPPERCASE_LETTER},
+	{0x00011f, 0x00011f, PG_U_LOWERCASE_LETTER},
+	{0x000120, 0x000120, PG_U_UPPERCASE_LETTER},
+	{0x000121, 0x000121, PG_U_LOWERCASE_LETTER},
+	{0x000122, 0x000122, PG_U_UPPERCASE_LETTER},
+	{0x000123, 0x000123, PG_U_LOWERCASE_LETTER},
+	{0x000124, 0x000124, PG_U_UPPERCASE_LETTER},
+	{0x000125, 0x000125, PG_U_LOWERCASE_LETTER},
+	{0x000126, 0x000126, PG_U_UPPERCASE_LETTER},
+	{0x000127, 0x000127, PG_U_LOWERCASE_LETTER},
+	{0x000128, 0x000128, PG_U_UPPERCASE_LETTER},
+	{0x000129, 0x000129, PG_U_LOWERCASE_LETTER},
+	{0x00012a, 0x00012a, PG_U_UPPERCASE_LETTER},
+	{0x00012b, 0x00012b, PG_U_LOWERCASE_LETTER},
+	{0x00012c, 0x00012c, PG_U_UPPERCASE_LETTER},
+	{0x00012d, 0x00012d, PG_U_LOWERCASE_LETTER},
+	{0x00012e, 0x00012e, PG_U_UPPERCASE_LETTER},
+	{0x00012f, 0x00012f, PG_U_LOWERCASE_LETTER},
+	{0x000130, 0x000130, PG_U_UPPERCASE_LETTER},
+	{0x000131, 0x000131, PG_U_LOWERCASE_LETTER},
+	{0x000132, 0x000132, PG_U_UPPERCASE_LETTER},
+	{0x000133, 0x000133, PG_U_LOWERCASE_LETTER},
+	{0x000134, 0x000134, PG_U_UPPERCASE_LETTER},
+	{0x000135, 0x000135, PG_U_LOWERCASE_LETTER},
+	{0x000136, 0x000136, PG_U_UPPERCASE_LETTER},
+	{0x000137, 0x000138, PG_U_LOWERCASE_LETTER},
+	{0x000139, 0x000139, PG_U_UPPERCASE_LETTER},
+	{0x00013a, 0x00013a, PG_U_LOWERCASE_LETTER},
+	{0x00013b, 0x00013b, PG_U_UPPERCASE_LETTER},
+	{0x00013c, 0x00013c, PG_U_LOWERCASE_LETTER},
+	{0x00013d, 0x00013d, PG_U_UPPERCASE_LETTER},
+	{0x00013e, 0x00013e, PG_U_LOWERCASE_LETTER},
+	{0x00013f, 0x00013f, PG_U_UPPERCASE_LETTER},
+	{0x000140, 0x000140, PG_U_LOWERCASE_LETTER},
+	{0x000141, 0x000141, PG_U_UPPERCASE_LETTER},
+	{0x000142, 0x000142, PG_U_LOWERCASE_LETTER},
+	{0x000143, 0x000143, PG_U_UPPERCASE_LETTER},
+	{0x000144, 0x000144, PG_U_LOWERCASE_LETTER},
+	{0x000145, 0x000145, PG_U_UPPERCASE_LETTER},
+	{0x000146, 0x000146, PG_U_LOWERCASE_LETTER},
+	{0x000147, 0x000147, PG_U_UPPERCASE_LETTER},
+	{0x000148, 0x000149, PG_U_LOWERCASE_LETTER},
+	{0x00014a, 0x00014a, PG_U_UPPERCASE_LETTER},
+	{0x00014b, 0x00014b, PG_U_LOWERCASE_LETTER},
+	{0x00014c, 0x00014c, PG_U_UPPERCASE_LETTER},
+	{0x00014d, 0x00014d, PG_U_LOWERCASE_LETTER},
+	{0x00014e, 0x00014e, PG_U_UPPERCASE_LETTER},
+	{0x00014f, 0x00014f, PG_U_LOWERCASE_LETTER},
+	{0x000150, 0x000150, PG_U_UPPERCASE_LETTER},
+	{0x000151, 0x000151, PG_U_LOWERCASE_LETTER},
+	{0x000152, 0x000152, PG_U_UPPERCASE_LETTER},
+	{0x000153, 0x000153, PG_U_LOWERCASE_LETTER},
+	{0x000154, 0x000154, PG_U_UPPERCASE_LETTER},
+	{0x000155, 0x000155, PG_U_LOWERCASE_LETTER},
+	{0x000156, 0x000156, PG_U_UPPERCASE_LETTER},
+	{0x000157, 0x000157, PG_U_LOWERCASE_LETTER},
+	{0x000158, 0x000158, PG_U_UPPERCASE_LETTER},
+	{0x000159, 0x000159, PG_U_LOWERCASE_LETTER},
+	{0x00015a, 0x00015a, PG_U_UPPERCASE_LETTER},
+	{0x00015b, 0x00015b, PG_U_LOWERCASE_LETTER},
+	{0x00015c, 0x00015c, PG_U_UPPERCASE_LETTER},
+	{0x00015d, 0x00015d, PG_U_LOWERCASE_LETTER},
+	{0x00015e, 0x00015e, PG_U_UPPERCASE_LETTER},
+	{0x00015f, 0x00015f, PG_U_LOWERCASE_LETTER},
+	{0x000160, 0x000160, PG_U_UPPERCASE_LETTER},
+	{0x000161, 0x000161, PG_U_LOWERCASE_LETTER},
+	{0x000162, 0x000162, PG_U_UPPERCASE_LETTER},
+	{0x000163, 0x000163, PG_U_LOWERCASE_LETTER},
+	{0x000164, 0x000164, PG_U_UPPERCASE_LETTER},
+	{0x000165, 0x000165, PG_U_LOWERCASE_LETTER},
+	{0x000166, 0x000166, PG_U_UPPERCASE_LETTER},
+	{0x000167, 0x000167, PG_U_LOWERCASE_LETTER},
+	{0x000168, 0x000168, PG_U_UPPERCASE_LETTER},
+	{0x000169, 0x000169, PG_U_LOWERCASE_LETTER},
+	{0x00016a, 0x00016a, PG_U_UPPERCASE_LETTER},
+	{0x00016b, 0x00016b, PG_U_LOWERCASE_LETTER},
+	{0x00016c, 0x00016c, PG_U_UPPERCASE_LETTER},
+	{0x00016d, 0x00016d, PG_U_LOWERCASE_LETTER},
+	{0x00016e, 0x00016e, PG_U_UPPERCASE_LETTER},
+	{0x00016f, 0x00016f, PG_U_LOWERCASE_LETTER},
+	{0x000170, 0x000170, PG_U_UPPERCASE_LETTER},
+	{0x000171, 0x000171, PG_U_LOWERCASE_LETTER},
+	{0x000172, 0x000172, PG_U_UPPERCASE_LETTER},
+	{0x000173, 0x000173, PG_U_LOWERCASE_LETTER},
+	{0x000174, 0x000174, PG_U_UPPERCASE_LETTER},
+	{0x000175, 0x000175, PG_U_LOWERCASE_LETTER},
+	{0x000176, 0x000176, PG_U_UPPERCASE_LETTER},
+	{0x000177, 0x000177, PG_U_LOWERCASE_LETTER},
+	{0x000178, 0x000179, PG_U_UPPERCASE_LETTER},
+	{0x00017a, 0x00017a, PG_U_LOWERCASE_LETTER},
+	{0x00017b, 0x00017b, PG_U_UPPERCASE_LETTER},
+	{0x00017c, 0x00017c, PG_U_LOWERCASE_LETTER},
+	{0x00017d, 0x00017d, PG_U_UPPERCASE_LETTER},
+	{0x00017e, 0x000180, PG_U_LOWERCASE_LETTER},
+	{0x000181, 0x000182, PG_U_UPPERCASE_LETTER},
+	{0x000183, 0x000183, PG_U_LOWERCASE_LETTER},
+	{0x000184, 0x000184, PG_U_UPPERCASE_LETTER},
+	{0x000185, 0x000185, PG_U_LOWERCASE_LETTER},
+	{0x000186, 0x000187, PG_U_UPPERCASE_LETTER},
+	{0x000188, 0x000188, PG_U_LOWERCASE_LETTER},
+	{0x000189, 0x00018b, PG_U_UPPERCASE_LETTER},
+	{0x00018c, 0x00018d, PG_U_LOWERCASE_LETTER},
+	{0x00018e, 0x000191, PG_U_UPPERCASE_LETTER},
+	{0x000192, 0x000192, PG_U_LOWERCASE_LETTER},
+	{0x000193, 0x000194, PG_U_UPPERCASE_LETTER},
+	{0x000195, 0x000195, PG_U_LOWERCASE_LETTER},
+	{0x000196, 0x000198, PG_U_UPPERCASE_LETTER},
+	{0x000199, 0x00019b, PG_U_LOWERCASE_LETTER},
+	{0x00019c, 0x00019d, PG_U_UPPERCASE_LETTER},
+	{0x00019e, 0x00019e, PG_U_LOWERCASE_LETTER},
+	{0x00019f, 0x0001a0, PG_U_UPPERCASE_LETTER},
+	{0x0001a1, 0x0001a1, PG_U_LOWERCASE_LETTER},
+	{0x0001a2, 0x0001a2, PG_U_UPPERCASE_LETTER},
+	{0x0001a3, 0x0001a3, PG_U_LOWERCASE_LETTER},
+	{0x0001a4, 0x0001a4, PG_U_UPPERCASE_LETTER},
+	{0x0001a5, 0x0001a5, PG_U_LOWERCASE_LETTER},
+	{0x0001a6, 0x0001a7, PG_U_UPPERCASE_LETTER},
+	{0x0001a8, 0x0001a8, PG_U_LOWERCASE_LETTER},
+	{0x0001a9, 0x0001a9, PG_U_UPPERCASE_LETTER},
+	{0x0001aa, 0x0001ab, PG_U_LOWERCASE_LETTER},
+	{0x0001ac, 0x0001ac, PG_U_UPPERCASE_LETTER},
+	{0x0001ad, 0x0001ad, PG_U_LOWERCASE_LETTER},
+	{0x0001ae, 0x0001af, PG_U_UPPERCASE_LETTER},
+	{0x0001b0, 0x0001b0, PG_U_LOWERCASE_LETTER},
+	{0x0001b1, 0x0001b3, PG_U_UPPERCASE_LETTER},
+	{0x0001b4, 0x0001b4, PG_U_LOWERCASE_LETTER},
+	{0x0001b5, 0x0001b5, PG_U_UPPERCASE_LETTER},
+	{0x0001b6, 0x0001b6, PG_U_LOWERCASE_LETTER},
+	{0x0001b7, 0x0001b8, PG_U_UPPERCASE_LETTER},
+	{0x0001b9, 0x0001ba, PG_U_LOWERCASE_LETTER},
+	{0x0001bb, 0x0001bb, PG_U_OTHER_LETTER},
+	{0x0001bc, 0x0001bc, PG_U_UPPERCASE_LETTER},
+	{0x0001bd, 0x0001bf, PG_U_LOWERCASE_LETTER},
+	{0x0001c0, 0x0001c3, PG_U_OTHER_LETTER},
+	{0x0001c4, 0x0001c4, PG_U_UPPERCASE_LETTER},
+	{0x0001c5, 0x0001c5, PG_U_TITLECASE_LETTER},
+	{0x0001c6, 0x0001c6, PG_U_LOWERCASE_LETTER},
+	{0x0001c7, 0x0001c7, PG_U_UPPERCASE_LETTER},
+	{0x0001c8, 0x0001c8, PG_U_TITLECASE_LETTER},
+	{0x0001c9, 0x0001c9, PG_U_LOWERCASE_LETTER},
+	{0x0001ca, 0x0001ca, PG_U_UPPERCASE_LETTER},
+	{0x0001cb, 0x0001cb, PG_U_TITLECASE_LETTER},
+	{0x0001cc, 0x0001cc, PG_U_LOWERCASE_LETTER},
+	{0x0001cd, 0x0001cd, PG_U_UPPERCASE_LETTER},
+	{0x0001ce, 0x0001ce, PG_U_LOWERCASE_LETTER},
+	{0x0001cf, 0x0001cf, PG_U_UPPERCASE_LETTER},
+	{0x0001d0, 0x0001d0, PG_U_LOWERCASE_LETTER},
+	{0x0001d1, 0x0001d1, PG_U_UPPERCASE_LETTER},
+	{0x0001d2, 0x0001d2, PG_U_LOWERCASE_LETTER},
+	{0x0001d3, 0x0001d3, PG_U_UPPERCASE_LETTER},
+	{0x0001d4, 0x0001d4, PG_U_LOWERCASE_LETTER},
+	{0x0001d5, 0x0001d5, PG_U_UPPERCASE_LETTER},
+	{0x0001d6, 0x0001d6, PG_U_LOWERCASE_LETTER},
+	{0x0001d7, 0x0001d7, PG_U_UPPERCASE_LETTER},
+	{0x0001d8, 0x0001d8, PG_U_LOWERCASE_LETTER},
+	{0x0001d9, 0x0001d9, PG_U_UPPERCASE_LETTER},
+	{0x0001da, 0x0001da, PG_U_LOWERCASE_LETTER},
+	{0x0001db, 0x0001db, PG_U_UPPERCASE_LETTER},
+	{0x0001dc, 0x0001dd, PG_U_LOWERCASE_LETTER},
+	{0x0001de, 0x0001de, PG_U_UPPERCASE_LETTER},
+	{0x0001df, 0x0001df, PG_U_LOWERCASE_LETTER},
+	{0x0001e0, 0x0001e0, PG_U_UPPERCASE_LETTER},
+	{0x0001e1, 0x0001e1, PG_U_LOWERCASE_LETTER},
+	{0x0001e2, 0x0001e2, PG_U_UPPERCASE_LETTER},
+	{0x0001e3, 0x0001e3, PG_U_LOWERCASE_LETTER},
+	{0x0001e4, 0x0001e4, PG_U_UPPERCASE_LETTER},
+	{0x0001e5, 0x0001e5, PG_U_LOWERCASE_LETTER},
+	{0x0001e6, 0x0001e6, PG_U_UPPERCASE_LETTER},
+	{0x0001e7, 0x0001e7, PG_U_LOWERCASE_LETTER},
+	{0x0001e8, 0x0001e8, PG_U_UPPERCASE_LETTER},
+	{0x0001e9, 0x0001e9, PG_U_LOWERCASE_LETTER},
+	{0x0001ea, 0x0001ea, PG_U_UPPERCASE_LETTER},
+	{0x0001eb, 0x0001eb, PG_U_LOWERCASE_LETTER},
+	{0x0001ec, 0x0001ec, PG_U_UPPERCASE_LETTER},
+	{0x0001ed, 0x0001ed, PG_U_LOWERCASE_LETTER},
+	{0x0001ee, 0x0001ee, PG_U_UPPERCASE_LETTER},
+	{0x0001ef, 0x0001f0, PG_U_LOWERCASE_LETTER},
+	{0x0001f1, 0x0001f1, PG_U_UPPERCASE_LETTER},
+	{0x0001f2, 0x0001f2, PG_U_TITLECASE_LETTER},
+	{0x0001f3, 0x0001f3, PG_U_LOWERCASE_LETTER},
+	{0x0001f4, 0x0001f4, PG_U_UPPERCASE_LETTER},
+	{0x0001f5, 0x0001f5, PG_U_LOWERCASE_LETTER},
+	{0x0001f6, 0x0001f8, PG_U_UPPERCASE_LETTER},
+	{0x0001f9, 0x0001f9, PG_U_LOWERCASE_LETTER},
+	{0x0001fa, 0x0001fa, PG_U_UPPERCASE_LETTER},
+	{0x0001fb, 0x0001fb, PG_U_LOWERCASE_LETTER},
+	{0x0001fc, 0x0001fc, PG_U_UPPERCASE_LETTER},
+	{0x0001fd, 0x0001fd, PG_U_LOWERCASE_LETTER},
+	{0x0001fe, 0x0001fe, PG_U_UPPERCASE_LETTER},
+	{0x0001ff, 0x0001ff, PG_U_LOWERCASE_LETTER},
+	{0x000200, 0x000200, PG_U_UPPERCASE_LETTER},
+	{0x000201, 0x000201, PG_U_LOWERCASE_LETTER},
+	{0x000202, 0x000202, PG_U_UPPERCASE_LETTER},
+	{0x000203, 0x000203, PG_U_LOWERCASE_LETTER},
+	{0x000204, 0x000204, PG_U_UPPERCASE_LETTER},
+	{0x000205, 0x000205, PG_U_LOWERCASE_LETTER},
+	{0x000206, 0x000206, PG_U_UPPERCASE_LETTER},
+	{0x000207, 0x000207, PG_U_LOWERCASE_LETTER},
+	{0x000208, 0x000208, PG_U_UPPERCASE_LETTER},
+	{0x000209, 0x000209, PG_U_LOWERCASE_LETTER},
+	{0x00020a, 0x00020a, PG_U_UPPERCASE_LETTER},
+	{0x00020b, 0x00020b, PG_U_LOWERCASE_LETTER},
+	{0x00020c, 0x00020c, PG_U_UPPERCASE_LETTER},
+	{0x00020d, 0x00020d, PG_U_LOWERCASE_LETTER},
+	{0x00020e, 0x00020e, PG_U_UPPERCASE_LETTER},
+	{0x00020f, 0x00020f, PG_U_LOWERCASE_LETTER},
+	{0x000210, 0x000210, PG_U_UPPERCASE_LETTER},
+	{0x000211, 0x000211, PG_U_LOWERCASE_LETTER},
+	{0x000212, 0x000212, PG_U_UPPERCASE_LETTER},
+	{0x000213, 0x000213, PG_U_LOWERCASE_LETTER},
+	{0x000214, 0x000214, PG_U_UPPERCASE_LETTER},
+	{0x000215, 0x000215, PG_U_LOWERCASE_LETTER},
+	{0x000216, 0x000216, PG_U_UPPERCASE_LETTER},
+	{0x000217, 0x000217, PG_U_LOWERCASE_LETTER},
+	{0x000218, 0x000218, PG_U_UPPERCASE_LETTER},
+	{0x000219, 0x000219, PG_U_LOWERCASE_LETTER},
+	{0x00021a, 0x00021a, PG_U_UPPERCASE_LETTER},
+	{0x00021b, 0x00021b, PG_U_LOWERCASE_LETTER},
+	{0x00021c, 0x00021c, PG_U_UPPERCASE_LETTER},
+	{0x00021d, 0x00021d, PG_U_LOWERCASE_LETTER},
+	{0x00021e, 0x00021e, PG_U_UPPERCASE_LETTER},
+	{0x00021f, 0x00021f, PG_U_LOWERCASE_LETTER},
+	{0x000220, 0x000220, PG_U_UPPERCASE_LETTER},
+	{0x000221, 0x000221, PG_U_LOWERCASE_LETTER},
+	{0x000222, 0x000222, PG_U_UPPERCASE_LETTER},
+	{0x000223, 0x000223, PG_U_LOWERCASE_LETTER},
+	{0x000224, 0x000224, PG_U_UPPERCASE_LETTER},
+	{0x000225, 0x000225, PG_U_LOWERCASE_LETTER},
+	{0x000226, 0x000226, PG_U_UPPERCASE_LETTER},
+	{0x000227, 0x000227, PG_U_LOWERCASE_LETTER},
+	{0x000228, 0x000228, PG_U_UPPERCASE_LETTER},
+	{0x000229, 0x000229, PG_U_LOWERCASE_LETTER},
+	{0x00022a, 0x00022a, PG_U_UPPERCASE_LETTER},
+	{0x00022b, 0x00022b, PG_U_LOWERCASE_LETTER},
+	{0x00022c, 0x00022c, PG_U_UPPERCASE_LETTER},
+	{0x00022d, 0x00022d, PG_U_LOWERCASE_LETTER},
+	{0x00022e, 0x00022e, PG_U_UPPERCASE_LETTER},
+	{0x00022f, 0x00022f, PG_U_LOWERCASE_LETTER},
+	{0x000230, 0x000230, PG_U_UPPERCASE_LETTER},
+	{0x000231, 0x000231, PG_U_LOWERCASE_LETTER},
+	{0x000232, 0x000232, PG_U_UPPERCASE_LETTER},
+	{0x000233, 0x000239, PG_U_LOWERCASE_LETTER},
+	{0x00023a, 0x00023b, PG_U_UPPERCASE_LETTER},
+	{0x00023c, 0x00023c, PG_U_LOWERCASE_LETTER},
+	{0x00023d, 0x00023e, PG_U_UPPERCASE_LETTER},
+	{0x00023f, 0x000240, PG_U_LOWERCASE_LETTER},
+	{0x000241, 0x000241, PG_U_UPPERCASE_LETTER},
+	{0x000242, 0x000242, PG_U_LOWERCASE_LETTER},
+	{0x000243, 0x000246, PG_U_UPPERCASE_LETTER},
+	{0x000247, 0x000247, PG_U_LOWERCASE_LETTER},
+	{0x000248, 0x000248, PG_U_UPPERCASE_LETTER},
+	{0x000249, 0x000249, PG_U_LOWERCASE_LETTER},
+	{0x00024a, 0x00024a, PG_U_UPPERCASE_LETTER},
+	{0x00024b, 0x00024b, PG_U_LOWERCASE_LETTER},
+	{0x00024c, 0x00024c, PG_U_UPPERCASE_LETTER},
+	{0x00024d, 0x00024d, PG_U_LOWERCASE_LETTER},
+	{0x00024e, 0x00024e, PG_U_UPPERCASE_LETTER},
+	{0x00024f, 0x000293, PG_U_LOWERCASE_LETTER},
+	{0x000294, 0x000294, PG_U_OTHER_LETTER},
+	{0x000295, 0x0002af, PG_U_LOWERCASE_LETTER},
+	{0x0002b0, 0x0002c1, PG_U_MODIFIER_LETTER},
+	{0x0002c2, 0x0002c5, PG_U_MODIFIER_SYMBOL},
+	{0x0002c6, 0x0002d1, PG_U_MODIFIER_LETTER},
+	{0x0002d2, 0x0002df, PG_U_MODIFIER_SYMBOL},
+	{0x0002e0, 0x0002e4, PG_U_MODIFIER_LETTER},
+	{0x0002e5, 0x0002eb, PG_U_MODIFIER_SYMBOL},
+	{0x0002ec, 0x0002ec, PG_U_MODIFIER_LETTER},
+	{0x0002ed, 0x0002ed, PG_U_MODIFIER_SYMBOL},
+	{0x0002ee, 0x0002ee, PG_U_MODIFIER_LETTER},
+	{0x0002ef, 0x0002ff, PG_U_MODIFIER_SYMBOL},
+	{0x000300, 0x00036f, PG_U_NON_SPACING_MARK},
+	{0x000370, 0x000370, PG_U_UPPERCASE_LETTER},
+	{0x000371, 0x000371, PG_U_LOWERCASE_LETTER},
+	{0x000372, 0x000372, PG_U_UPPERCASE_LETTER},
+	{0x000373, 0x000373, PG_U_LOWERCASE_LETTER},
+	{0x000374, 0x000374, PG_U_MODIFIER_LETTER},
+	{0x000375, 0x000375, PG_U_MODIFIER_SYMBOL},
+	{0x000376, 0x000376, PG_U_UPPERCASE_LETTER},
+	{0x000377, 0x000377, PG_U_LOWERCASE_LETTER},
+	{0x000378, 0x000379, PG_U_UNASSIGNED},
+	{0x00037a, 0x00037a, PG_U_MODIFIER_LETTER},
+	{0x00037b, 0x00037d, PG_U_LOWERCASE_LETTER},
+	{0x00037e, 0x00037e, PG_U_OTHER_PUNCTUATION},
+	{0x00037f, 0x00037f, PG_U_UPPERCASE_LETTER},
+	{0x000380, 0x000383, PG_U_UNASSIGNED},
+	{0x000384, 0x000385, PG_U_MODIFIER_SYMBOL},
+	{0x000386, 0x000386, PG_U_UPPERCASE_LETTER},
+	{0x000387, 0x000387, PG_U_OTHER_PUNCTUATION},
+	{0x000388, 0x00038a, PG_U_UPPERCASE_LETTER},
+	{0x00038b, 0x00038b, PG_U_UNASSIGNED},
+	{0x00038c, 0x00038c, PG_U_UPPERCASE_LETTER},
+	{0x00038d, 0x00038d, PG_U_UNASSIGNED},
+	{0x00038e, 0x00038f, PG_U_UPPERCASE_LETTER},
+	{0x000390, 0x000390, PG_U_LOWERCASE_LETTER},
+	{0x000391, 0x0003a1, PG_U_UPPERCASE_LETTER},
+	{0x0003a2, 0x0003a2, PG_U_UNASSIGNED},
+	{0x0003a3, 0x0003ab, PG_U_UPPERCASE_LETTER},
+	{0x0003ac, 0x0003ce, PG_U_LOWERCASE_LETTER},
+	{0x0003cf, 0x0003cf, PG_U_UPPERCASE_LETTER},
+	{0x0003d0, 0x0003d1, PG_U_LOWERCASE_LETTER},
+	{0x0003d2, 0x0003d4, PG_U_UPPERCASE_LETTER},
+	{0x0003d5, 0x0003d7, PG_U_LOWERCASE_LETTER},
+	{0x0003d8, 0x0003d8, PG_U_UPPERCASE_LETTER},
+	{0x0003d9, 0x0003d9, PG_U_LOWERCASE_LETTER},
+	{0x0003da, 0x0003da, PG_U_UPPERCASE_LETTER},
+	{0x0003db, 0x0003db, PG_U_LOWERCASE_LETTER},
+	{0x0003dc, 0x0003dc, PG_U_UPPERCASE_LETTER},
+	{0x0003dd, 0x0003dd, PG_U_LOWERCASE_LETTER},
+	{0x0003de, 0x0003de, PG_U_UPPERCASE_LETTER},
+	{0x0003df, 0x0003df, PG_U_LOWERCASE_LETTER},
+	{0x0003e0, 0x0003e0, PG_U_UPPERCASE_LETTER},
+	{0x0003e1, 0x0003e1, PG_U_LOWERCASE_LETTER},
+	{0x0003e2, 0x0003e2, PG_U_UPPERCASE_LETTER},
+	{0x0003e3, 0x0003e3, PG_U_LOWERCASE_LETTER},
+	{0x0003e4, 0x0003e4, PG_U_UPPERCASE_LETTER},
+	{0x0003e5, 0x0003e5, PG_U_LOWERCASE_LETTER},
+	{0x0003e6, 0x0003e6, PG_U_UPPERCASE_LETTER},
+	{0x0003e7, 0x0003e7, PG_U_LOWERCASE_LETTER},
+	{0x0003e8, 0x0003e8, PG_U_UPPERCASE_LETTER},
+	{0x0003e9, 0x0003e9, PG_U_LOWERCASE_LETTER},
+	{0x0003ea, 0x0003ea, PG_U_UPPERCASE_LETTER},
+	{0x0003eb, 0x0003eb, PG_U_LOWERCASE_LETTER},
+	{0x0003ec, 0x0003ec, PG_U_UPPERCASE_LETTER},
+	{0x0003ed, 0x0003ed, PG_U_LOWERCASE_LETTER},
+	{0x0003ee, 0x0003ee, PG_U_UPPERCASE_LETTER},
+	{0x0003ef, 0x0003f3, PG_U_LOWERCASE_LETTER},
+	{0x0003f4, 0x0003f4, PG_U_UPPERCASE_LETTER},
+	{0x0003f5, 0x0003f5, PG_U_LOWERCASE_LETTER},
+	{0x0003f6, 0x0003f6, PG_U_MATH_SYMBOL},
+	{0x0003f7, 0x0003f7, PG_U_UPPERCASE_LETTER},
+	{0x0003f8, 0x0003f8, PG_U_LOWERCASE_LETTER},
+	{0x0003f9, 0x0003fa, PG_U_UPPERCASE_LETTER},
+	{0x0003fb, 0x0003fc, PG_U_LOWERCASE_LETTER},
+	{0x0003fd, 0x00042f, PG_U_UPPERCASE_LETTER},
+	{0x000430, 0x00045f, PG_U_LOWERCASE_LETTER},
+	{0x000460, 0x000460, PG_U_UPPERCASE_LETTER},
+	{0x000461, 0x000461, PG_U_LOWERCASE_LETTER},
+	{0x000462, 0x000462, PG_U_UPPERCASE_LETTER},
+	{0x000463, 0x000463, PG_U_LOWERCASE_LETTER},
+	{0x000464, 0x000464, PG_U_UPPERCASE_LETTER},
+	{0x000465, 0x000465, PG_U_LOWERCASE_LETTER},
+	{0x000466, 0x000466, PG_U_UPPERCASE_LETTER},
+	{0x000467, 0x000467, PG_U_LOWERCASE_LETTER},
+	{0x000468, 0x000468, PG_U_UPPERCASE_LETTER},
+	{0x000469, 0x000469, PG_U_LOWERCASE_LETTER},
+	{0x00046a, 0x00046a, PG_U_UPPERCASE_LETTER},
+	{0x00046b, 0x00046b, PG_U_LOWERCASE_LETTER},
+	{0x00046c, 0x00046c, PG_U_UPPERCASE_LETTER},
+	{0x00046d, 0x00046d, PG_U_LOWERCASE_LETTER},
+	{0x00046e, 0x00046e, PG_U_UPPERCASE_LETTER},
+	{0x00046f, 0x00046f, PG_U_LOWERCASE_LETTER},
+	{0x000470, 0x000470, PG_U_UPPERCASE_LETTER},
+	{0x000471, 0x000471, PG_U_LOWERCASE_LETTER},
+	{0x000472, 0x000472, PG_U_UPPERCASE_LETTER},
+	{0x000473, 0x000473, PG_U_LOWERCASE_LETTER},
+	{0x000474, 0x000474, PG_U_UPPERCASE_LETTER},
+	{0x000475, 0x000475, PG_U_LOWERCASE_LETTER},
+	{0x000476, 0x000476, PG_U_UPPERCASE_LETTER},
+	{0x000477, 0x000477, PG_U_LOWERCASE_LETTER},
+	{0x000478, 0x000478, PG_U_UPPERCASE_LETTER},
+	{0x000479, 0x000479, PG_U_LOWERCASE_LETTER},
+	{0x00047a, 0x00047a, PG_U_UPPERCASE_LETTER},
+	{0x00047b, 0x00047b, PG_U_LOWERCASE_LETTER},
+	{0x00047c, 0x00047c, PG_U_UPPERCASE_LETTER},
+	{0x00047d, 0x00047d, PG_U_LOWERCASE_LETTER},
+	{0x00047e, 0x00047e, PG_U_UPPERCASE_LETTER},
+	{0x00047f, 0x00047f, PG_U_LOWERCASE_LETTER},
+	{0x000480, 0x000480, PG_U_UPPERCASE_LETTER},
+	{0x000481, 0x000481, PG_U_LOWERCASE_LETTER},
+	{0x000482, 0x000482, PG_U_OTHER_SYMBOL},
+	{0x000483, 0x000487, PG_U_NON_SPACING_MARK},
+	{0x000488, 0x000489, PG_U_ENCLOSING_MARK},
+	{0x00048a, 0x00048a, PG_U_UPPERCASE_LETTER},
+	{0x00048b, 0x00048b, PG_U_LOWERCASE_LETTER},
+	{0x00048c, 0x00048c, PG_U_UPPERCASE_LETTER},
+	{0x00048d, 0x00048d, PG_U_LOWERCASE_LETTER},
+	{0x00048e, 0x00048e, PG_U_UPPERCASE_LETTER},
+	{0x00048f, 0x00048f, PG_U_LOWERCASE_LETTER},
+	{0x000490, 0x000490, PG_U_UPPERCASE_LETTER},
+	{0x000491, 0x000491, PG_U_LOWERCASE_LETTER},
+	{0x000492, 0x000492, PG_U_UPPERCASE_LETTER},
+	{0x000493, 0x000493, PG_U_LOWERCASE_LETTER},
+	{0x000494, 0x000494, PG_U_UPPERCASE_LETTER},
+	{0x000495, 0x000495, PG_U_LOWERCASE_LETTER},
+	{0x000496, 0x000496, PG_U_UPPERCASE_LETTER},
+	{0x000497, 0x000497, PG_U_LOWERCASE_LETTER},
+	{0x000498, 0x000498, PG_U_UPPERCASE_LETTER},
+	{0x000499, 0x000499, PG_U_LOWERCASE_LETTER},
+	{0x00049a, 0x00049a, PG_U_UPPERCASE_LETTER},
+	{0x00049b, 0x00049b, PG_U_LOWERCASE_LETTER},
+	{0x00049c, 0x00049c, PG_U_UPPERCASE_LETTER},
+	{0x00049d, 0x00049d, PG_U_LOWERCASE_LETTER},
+	{0x00049e, 0x00049e, PG_U_UPPERCASE_LETTER},
+	{0x00049f, 0x00049f, PG_U_LOWERCASE_LETTER},
+	{0x0004a0, 0x0004a0, PG_U_UPPERCASE_LETTER},
+	{0x0004a1, 0x0004a1, PG_U_LOWERCASE_LETTER},
+	{0x0004a2, 0x0004a2, PG_U_UPPERCASE_LETTER},
+	{0x0004a3, 0x0004a3, PG_U_LOWERCASE_LETTER},
+	{0x0004a4, 0x0004a4, PG_U_UPPERCASE_LETTER},
+	{0x0004a5, 0x0004a5, PG_U_LOWERCASE_LETTER},
+	{0x0004a6, 0x0004a6, PG_U_UPPERCASE_LETTER},
+	{0x0004a7, 0x0004a7, PG_U_LOWERCASE_LETTER},
+	{0x0004a8, 0x0004a8, PG_U_UPPERCASE_LETTER},
+	{0x0004a9, 0x0004a9, PG_U_LOWERCASE_LETTER},
+	{0x0004aa, 0x0004aa, PG_U_UPPERCASE_LETTER},
+	{0x0004ab, 0x0004ab, PG_U_LOWERCASE_LETTER},
+	{0x0004ac, 0x0004ac, PG_U_UPPERCASE_LETTER},
+	{0x0004ad, 0x0004ad, PG_U_LOWERCASE_LETTER},
+	{0x0004ae, 0x0004ae, PG_U_UPPERCASE_LETTER},
+	{0x0004af, 0x0004af, PG_U_LOWERCASE_LETTER},
+	{0x0004b0, 0x0004b0, PG_U_UPPERCASE_LETTER},
+	{0x0004b1, 0x0004b1, PG_U_LOWERCASE_LETTER},
+	{0x0004b2, 0x0004b2, PG_U_UPPERCASE_LETTER},
+	{0x0004b3, 0x0004b3, PG_U_LOWERCASE_LETTER},
+	{0x0004b4, 0x0004b4, PG_U_UPPERCASE_LETTER},
+	{0x0004b5, 0x0004b5, PG_U_LOWERCASE_LETTER},
+	{0x0004b6, 0x0004b6, PG_U_UPPERCASE_LETTER},
+	{0x0004b7, 0x0004b7, PG_U_LOWERCASE_LETTER},
+	{0x0004b8, 0x0004b8, PG_U_UPPERCASE_LETTER},
+	{0x0004b9, 0x0004b9, PG_U_LOWERCASE_LETTER},
+	{0x0004ba, 0x0004ba, PG_U_UPPERCASE_LETTER},
+	{0x0004bb, 0x0004bb, PG_U_LOWERCASE_LETTER},
+	{0x0004bc, 0x0004bc, PG_U_UPPERCASE_LETTER},
+	{0x0004bd, 0x0004bd, PG_U_LOWERCASE_LETTER},
+	{0x0004be, 0x0004be, PG_U_UPPERCASE_LETTER},
+	{0x0004bf, 0x0004bf, PG_U_LOWERCASE_LETTER},
+	{0x0004c0, 0x0004c1, PG_U_UPPERCASE_LETTER},
+	{0x0004c2, 0x0004c2, PG_U_LOWERCASE_LETTER},
+	{0x0004c3, 0x0004c3, PG_U_UPPERCASE_LETTER},
+	{0x0004c4, 0x0004c4, PG_U_LOWERCASE_LETTER},
+	{0x0004c5, 0x0004c5, PG_U_UPPERCASE_LETTER},
+	{0x0004c6, 0x0004c6, PG_U_LOWERCASE_LETTER},
+	{0x0004c7, 0x0004c7, PG_U_UPPERCASE_LETTER},
+	{0x0004c8, 0x0004c8, PG_U_LOWERCASE_LETTER},
+	{0x0004c9, 0x0004c9, PG_U_UPPERCASE_LETTER},
+	{0x0004ca, 0x0004ca, PG_U_LOWERCASE_LETTER},
+	{0x0004cb, 0x0004cb, PG_U_UPPERCASE_LETTER},
+	{0x0004cc, 0x0004cc, PG_U_LOWERCASE_LETTER},
+	{0x0004cd, 0x0004cd, PG_U_UPPERCASE_LETTER},
+	{0x0004ce, 0x0004cf, PG_U_LOWERCASE_LETTER},
+	{0x0004d0, 0x0004d0, PG_U_UPPERCASE_LETTER},
+	{0x0004d1, 0x0004d1, PG_U_LOWERCASE_LETTER},
+	{0x0004d2, 0x0004d2, PG_U_UPPERCASE_LETTER},
+	{0x0004d3, 0x0004d3, PG_U_LOWERCASE_LETTER},
+	{0x0004d4, 0x0004d4, PG_U_UPPERCASE_LETTER},
+	{0x0004d5, 0x0004d5, PG_U_LOWERCASE_LETTER},
+	{0x0004d6, 0x0004d6, PG_U_UPPERCASE_LETTER},
+	{0x0004d7, 0x0004d7, PG_U_LOWERCASE_LETTER},
+	{0x0004d8, 0x0004d8, PG_U_UPPERCASE_LETTER},
+	{0x0004d9, 0x0004d9, PG_U_LOWERCASE_LETTER},
+	{0x0004da, 0x0004da, PG_U_UPPERCASE_LETTER},
+	{0x0004db, 0x0004db, PG_U_LOWERCASE_LETTER},
+	{0x0004dc, 0x0004dc, PG_U_UPPERCASE_LETTER},
+	{0x0004dd, 0x0004dd, PG_U_LOWERCASE_LETTER},
+	{0x0004de, 0x0004de, PG_U_UPPERCASE_LETTER},
+	{0x0004df, 0x0004df, PG_U_LOWERCASE_LETTER},
+	{0x0004e0, 0x0004e0, PG_U_UPPERCASE_LETTER},
+	{0x0004e1, 0x0004e1, PG_U_LOWERCASE_LETTER},
+	{0x0004e2, 0x0004e2, PG_U_UPPERCASE_LETTER},
+	{0x0004e3, 0x0004e3, PG_U_LOWERCASE_LETTER},
+	{0x0004e4, 0x0004e4, PG_U_UPPERCASE_LETTER},
+	{0x0004e5, 0x0004e5, PG_U_LOWERCASE_LETTER},
+	{0x0004e6, 0x0004e6, PG_U_UPPERCASE_LETTER},
+	{0x0004e7, 0x0004e7, PG_U_LOWERCASE_LETTER},
+	{0x0004e8, 0x0004e8, PG_U_UPPERCASE_LETTER},
+	{0x0004e9, 0x0004e9, PG_U_LOWERCASE_LETTER},
+	{0x0004ea, 0x0004ea, PG_U_UPPERCASE_LETTER},
+	{0x0004eb, 0x0004eb, PG_U_LOWERCASE_LETTER},
+	{0x0004ec, 0x0004ec, PG_U_UPPERCASE_LETTER},
+	{0x0004ed, 0x0004ed, PG_U_LOWERCASE_LETTER},
+	{0x0004ee, 0x0004ee, PG_U_UPPERCASE_LETTER},
+	{0x0004ef, 0x0004ef, PG_U_LOWERCASE_LETTER},
+	{0x0004f0, 0x0004f0, PG_U_UPPERCASE_LETTER},
+	{0x0004f1, 0x0004f1, PG_U_LOWERCASE_LETTER},
+	{0x0004f2, 0x0004f2, PG_U_UPPERCASE_LETTER},
+	{0x0004f3, 0x0004f3, PG_U_LOWERCASE_LETTER},
+	{0x0004f4, 0x0004f4, PG_U_UPPERCASE_LETTER},
+	{0x0004f5, 0x0004f5, PG_U_LOWERCASE_LETTER},
+	{0x0004f6, 0x0004f6, PG_U_UPPERCASE_LETTER},
+	{0x0004f7, 0x0004f7, PG_U_LOWERCASE_LETTER},
+	{0x0004f8, 0x0004f8, PG_U_UPPERCASE_LETTER},
+	{0x0004f9, 0x0004f9, PG_U_LOWERCASE_LETTER},
+	{0x0004fa, 0x0004fa, PG_U_UPPERCASE_LETTER},
+	{0x0004fb, 0x0004fb, PG_U_LOWERCASE_LETTER},
+	{0x0004fc, 0x0004fc, PG_U_UPPERCASE_LETTER},
+	{0x0004fd, 0x0004fd, PG_U_LOWERCASE_LETTER},
+	{0x0004fe, 0x0004fe, PG_U_UPPERCASE_LETTER},
+	{0x0004ff, 0x0004ff, PG_U_LOWERCASE_LETTER},
+	{0x000500, 0x000500, PG_U_UPPERCASE_LETTER},
+	{0x000501, 0x000501, PG_U_LOWERCASE_LETTER},
+	{0x000502, 0x000502, PG_U_UPPERCASE_LETTER},
+	{0x000503, 0x000503, PG_U_LOWERCASE_LETTER},
+	{0x000504, 0x000504, PG_U_UPPERCASE_LETTER},
+	{0x000505, 0x000505, PG_U_LOWERCASE_LETTER},
+	{0x000506, 0x000506, PG_U_UPPERCASE_LETTER},
+	{0x000507, 0x000507, PG_U_LOWERCASE_LETTER},
+	{0x000508, 0x000508, PG_U_UPPERCASE_LETTER},
+	{0x000509, 0x000509, PG_U_LOWERCASE_LETTER},
+	{0x00050a, 0x00050a, PG_U_UPPERCASE_LETTER},
+	{0x00050b, 0x00050b, PG_U_LOWERCASE_LETTER},
+	{0x00050c, 0x00050c, PG_U_UPPERCASE_LETTER},
+	{0x00050d, 0x00050d, PG_U_LOWERCASE_LETTER},
+	{0x00050e, 0x00050e, PG_U_UPPERCASE_LETTER},
+	{0x00050f, 0x00050f, PG_U_LOWERCASE_LETTER},
+	{0x000510, 0x000510, PG_U_UPPERCASE_LETTER},
+	{0x000511, 0x000511, PG_U_LOWERCASE_LETTER},
+	{0x000512, 0x000512, PG_U_UPPERCASE_LETTER},
+	{0x000513, 0x000513, PG_U_LOWERCASE_LETTER},
+	{0x000514, 0x000514, PG_U_UPPERCASE_LETTER},
+	{0x000515, 0x000515, PG_U_LOWERCASE_LETTER},
+	{0x000516, 0x000516, PG_U_UPPERCASE_LETTER},
+	{0x000517, 0x000517, PG_U_LOWERCASE_LETTER},
+	{0x000518, 0x000518, PG_U_UPPERCASE_LETTER},
+	{0x000519, 0x000519, PG_U_LOWERCASE_LETTER},
+	{0x00051a, 0x00051a, PG_U_UPPERCASE_LETTER},
+	{0x00051b, 0x00051b, PG_U_LOWERCASE_LETTER},
+	{0x00051c, 0x00051c, PG_U_UPPERCASE_LETTER},
+	{0x00051d, 0x00051d, PG_U_LOWERCASE_LETTER},
+	{0x00051e, 0x00051e, PG_U_UPPERCASE_LETTER},
+	{0x00051f, 0x00051f, PG_U_LOWERCASE_LETTER},
+	{0x000520, 0x000520, PG_U_UPPERCASE_LETTER},
+	{0x000521, 0x000521, PG_U_LOWERCASE_LETTER},
+	{0x000522, 0x000522, PG_U_UPPERCASE_LETTER},
+	{0x000523, 0x000523, PG_U_LOWERCASE_LETTER},
+	{0x000524, 0x000524, PG_U_UPPERCASE_LETTER},
+	{0x000525, 0x000525, PG_U_LOWERCASE_LETTER},
+	{0x000526, 0x000526, PG_U_UPPERCASE_LETTER},
+	{0x000527, 0x000527, PG_U_LOWERCASE_LETTER},
+	{0x000528, 0x000528, PG_U_UPPERCASE_LETTER},
+	{0x000529, 0x000529, PG_U_LOWERCASE_LETTER},
+	{0x00052a, 0x00052a, PG_U_UPPERCASE_LETTER},
+	{0x00052b, 0x00052b, PG_U_LOWERCASE_LETTER},
+	{0x00052c, 0x00052c, PG_U_UPPERCASE_LETTER},
+	{0x00052d, 0x00052d, PG_U_LOWERCASE_LETTER},
+	{0x00052e, 0x00052e, PG_U_UPPERCASE_LETTER},
+	{0x00052f, 0x00052f, PG_U_LOWERCASE_LETTER},
+	{0x000530, 0x000530, PG_U_UNASSIGNED},
+	{0x000531, 0x000556, PG_U_UPPERCASE_LETTER},
+	{0x000557, 0x000558, PG_U_UNASSIGNED},
+	{0x000559, 0x000559, PG_U_MODIFIER_LETTER},
+	{0x00055a, 0x00055f, PG_U_OTHER_PUNCTUATION},
+	{0x000560, 0x000588, PG_U_LOWERCASE_LETTER},
+	{0x000589, 0x000589, PG_U_OTHER_PUNCTUATION},
+	{0x00058a, 0x00058a, PG_U_DASH_PUNCTUATION},
+	{0x00058b, 0x00058c, PG_U_UNASSIGNED},
+	{0x00058d, 0x00058e, PG_U_OTHER_SYMBOL},
+	{0x00058f, 0x00058f, PG_U_CURRENCY_SYMBOL},
+	{0x000590, 0x000590, PG_U_UNASSIGNED},
+	{0x000591, 0x0005bd, PG_U_NON_SPACING_MARK},
+	{0x0005be, 0x0005be, PG_U_DASH_PUNCTUATION},
+	{0x0005bf, 0x0005bf, PG_U_NON_SPACING_MARK},
+	{0x0005c0, 0x0005c0, PG_U_OTHER_PUNCTUATION},
+	{0x0005c1, 0x0005c2, PG_U_NON_SPACING_MARK},
+	{0x0005c3, 0x0005c3, PG_U_OTHER_PUNCTUATION},
+	{0x0005c4, 0x0005c5, PG_U_NON_SPACING_MARK},
+	{0x0005c6, 0x0005c6, PG_U_OTHER_PUNCTUATION},
+	{0x0005c7, 0x0005c7, PG_U_NON_SPACING_MARK},
+	{0x0005c8, 0x0005cf, PG_U_UNASSIGNED},
+	{0x0005d0, 0x0005ea, PG_U_OTHER_LETTER},
+	{0x0005eb, 0x0005ee, PG_U_UNASSIGNED},
+	{0x0005ef, 0x0005f2, PG_U_OTHER_LETTER},
+	{0x0005f3, 0x0005f4, PG_U_OTHER_PUNCTUATION},
+	{0x0005f5, 0x0005ff, PG_U_UNASSIGNED},
+	{0x000600, 0x000605, PG_U_FORMAT_CHAR},
+	{0x000606, 0x000608, PG_U_MATH_SYMBOL},
+	{0x000609, 0x00060a, PG_U_OTHER_PUNCTUATION},
+	{0x00060b, 0x00060b, PG_U_CURRENCY_SYMBOL},
+	{0x00060c, 0x00060d, PG_U_OTHER_PUNCTUATION},
+	{0x00060e, 0x00060f, PG_U_OTHER_SYMBOL},
+	{0x000610, 0x00061a, PG_U_NON_SPACING_MARK},
+	{0x00061b, 0x00061b, PG_U_OTHER_PUNCTUATION},
+	{0x00061c, 0x00061c, PG_U_FORMAT_CHAR},
+	{0x00061d, 0x00061f, PG_U_OTHER_PUNCTUATION},
+	{0x000620, 0x00063f, PG_U_OTHER_LETTER},
+	{0x000640, 0x000640, PG_U_MODIFIER_LETTER},
+	{0x000641, 0x00064a, PG_U_OTHER_LETTER},
+	{0x00064b, 0x00065f, PG_U_NON_SPACING_MARK},
+	{0x000660, 0x000669, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00066a, 0x00066d, PG_U_OTHER_PUNCTUATION},
+	{0x00066e, 0x00066f, PG_U_OTHER_LETTER},
+	{0x000670, 0x000670, PG_U_NON_SPACING_MARK},
+	{0x000671, 0x0006d3, PG_U_OTHER_LETTER},
+	{0x0006d4, 0x0006d4, PG_U_OTHER_PUNCTUATION},
+	{0x0006d5, 0x0006d5, PG_U_OTHER_LETTER},
+	{0x0006d6, 0x0006dc, PG_U_NON_SPACING_MARK},
+	{0x0006dd, 0x0006dd, PG_U_FORMAT_CHAR},
+	{0x0006de, 0x0006de, PG_U_OTHER_SYMBOL},
+	{0x0006df, 0x0006e4, PG_U_NON_SPACING_MARK},
+	{0x0006e5, 0x0006e6, PG_U_MODIFIER_LETTER},
+	{0x0006e7, 0x0006e8, PG_U_NON_SPACING_MARK},
+	{0x0006e9, 0x0006e9, PG_U_OTHER_SYMBOL},
+	{0x0006ea, 0x0006ed, PG_U_NON_SPACING_MARK},
+	{0x0006ee, 0x0006ef, PG_U_OTHER_LETTER},
+	{0x0006f0, 0x0006f9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0006fa, 0x0006fc, PG_U_OTHER_LETTER},
+	{0x0006fd, 0x0006fe, PG_U_OTHER_SYMBOL},
+	{0x0006ff, 0x0006ff, PG_U_OTHER_LETTER},
+	{0x000700, 0x00070d, PG_U_OTHER_PUNCTUATION},
+	{0x00070e, 0x00070e, PG_U_UNASSIGNED},
+	{0x00070f, 0x00070f, PG_U_FORMAT_CHAR},
+	{0x000710, 0x000710, PG_U_OTHER_LETTER},
+	{0x000711, 0x000711, PG_U_NON_SPACING_MARK},
+	{0x000712, 0x00072f, PG_U_OTHER_LETTER},
+	{0x000730, 0x00074a, PG_U_NON_SPACING_MARK},
+	{0x00074b, 0x00074c, PG_U_UNASSIGNED},
+	{0x00074d, 0x0007a5, PG_U_OTHER_LETTER},
+	{0x0007a6, 0x0007b0, PG_U_NON_SPACING_MARK},
+	{0x0007b1, 0x0007b1, PG_U_OTHER_LETTER},
+	{0x0007b2, 0x0007bf, PG_U_UNASSIGNED},
+	{0x0007c0, 0x0007c9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0007ca, 0x0007ea, PG_U_OTHER_LETTER},
+	{0x0007eb, 0x0007f3, PG_U_NON_SPACING_MARK},
+	{0x0007f4, 0x0007f5, PG_U_MODIFIER_LETTER},
+	{0x0007f6, 0x0007f6, PG_U_OTHER_SYMBOL},
+	{0x0007f7, 0x0007f9, PG_U_OTHER_PUNCTUATION},
+	{0x0007fa, 0x0007fa, PG_U_MODIFIER_LETTER},
+	{0x0007fb, 0x0007fc, PG_U_UNASSIGNED},
+	{0x0007fd, 0x0007fd, PG_U_NON_SPACING_MARK},
+	{0x0007fe, 0x0007ff, PG_U_CURRENCY_SYMBOL},
+	{0x000800, 0x000815, PG_U_OTHER_LETTER},
+	{0x000816, 0x000819, PG_U_NON_SPACING_MARK},
+	{0x00081a, 0x00081a, PG_U_MODIFIER_LETTER},
+	{0x00081b, 0x000823, PG_U_NON_SPACING_MARK},
+	{0x000824, 0x000824, PG_U_MODIFIER_LETTER},
+	{0x000825, 0x000827, PG_U_NON_SPACING_MARK},
+	{0x000828, 0x000828, PG_U_MODIFIER_LETTER},
+	{0x000829, 0x00082d, PG_U_NON_SPACING_MARK},
+	{0x00082e, 0x00082f, PG_U_UNASSIGNED},
+	{0x000830, 0x00083e, PG_U_OTHER_PUNCTUATION},
+	{0x00083f, 0x00083f, PG_U_UNASSIGNED},
+	{0x000840, 0x000858, PG_U_OTHER_LETTER},
+	{0x000859, 0x00085b, PG_U_NON_SPACING_MARK},
+	{0x00085c, 0x00085d, PG_U_UNASSIGNED},
+	{0x00085e, 0x00085e, PG_U_OTHER_PUNCTUATION},
+	{0x00085f, 0x00085f, PG_U_UNASSIGNED},
+	{0x000860, 0x00086a, PG_U_OTHER_LETTER},
+	{0x00086b, 0x00086f, PG_U_UNASSIGNED},
+	{0x000870, 0x000887, PG_U_OTHER_LETTER},
+	{0x000888, 0x000888, PG_U_MODIFIER_SYMBOL},
+	{0x000889, 0x00088e, PG_U_OTHER_LETTER},
+	{0x00088f, 0x00088f, PG_U_UNASSIGNED},
+	{0x000890, 0x000891, PG_U_FORMAT_CHAR},
+	{0x000892, 0x000897, PG_U_UNASSIGNED},
+	{0x000898, 0x00089f, PG_U_NON_SPACING_MARK},
+	{0x0008a0, 0x0008c8, PG_U_OTHER_LETTER},
+	{0x0008c9, 0x0008c9, PG_U_MODIFIER_LETTER},
+	{0x0008ca, 0x0008e1, PG_U_NON_SPACING_MARK},
+	{0x0008e2, 0x0008e2, PG_U_FORMAT_CHAR},
+	{0x0008e3, 0x000902, PG_U_NON_SPACING_MARK},
+	{0x000903, 0x000903, PG_U_COMBINING_SPACING_MARK},
+	{0x000904, 0x000939, PG_U_OTHER_LETTER},
+	{0x00093a, 0x00093a, PG_U_NON_SPACING_MARK},
+	{0x00093b, 0x00093b, PG_U_COMBINING_SPACING_MARK},
+	{0x00093c, 0x00093c, PG_U_NON_SPACING_MARK},
+	{0x00093d, 0x00093d, PG_U_OTHER_LETTER},
+	{0x00093e, 0x000940, PG_U_COMBINING_SPACING_MARK},
+	{0x000941, 0x000948, PG_U_NON_SPACING_MARK},
+	{0x000949, 0x00094c, PG_U_COMBINING_SPACING_MARK},
+	{0x00094d, 0x00094d, PG_U_NON_SPACING_MARK},
+	{0x00094e, 0x00094f, PG_U_COMBINING_SPACING_MARK},
+	{0x000950, 0x000950, PG_U_OTHER_LETTER},
+	{0x000951, 0x000957, PG_U_NON_SPACING_MARK},
+	{0x000958, 0x000961, PG_U_OTHER_LETTER},
+	{0x000962, 0x000963, PG_U_NON_SPACING_MARK},
+	{0x000964, 0x000965, PG_U_OTHER_PUNCTUATION},
+	{0x000966, 0x00096f, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000970, 0x000970, PG_U_OTHER_PUNCTUATION},
+	{0x000971, 0x000971, PG_U_MODIFIER_LETTER},
+	{0x000972, 0x000980, PG_U_OTHER_LETTER},
+	{0x000981, 0x000981, PG_U_NON_SPACING_MARK},
+	{0x000982, 0x000983, PG_U_COMBINING_SPACING_MARK},
+	{0x000984, 0x000984, PG_U_UNASSIGNED},
+	{0x000985, 0x00098c, PG_U_OTHER_LETTER},
+	{0x00098d, 0x00098e, PG_U_UNASSIGNED},
+	{0x00098f, 0x000990, PG_U_OTHER_LETTER},
+	{0x000991, 0x000992, PG_U_UNASSIGNED},
+	{0x000993, 0x0009a8, PG_U_OTHER_LETTER},
+	{0x0009a9, 0x0009a9, PG_U_UNASSIGNED},
+	{0x0009aa, 0x0009b0, PG_U_OTHER_LETTER},
+	{0x0009b1, 0x0009b1, PG_U_UNASSIGNED},
+	{0x0009b2, 0x0009b2, PG_U_OTHER_LETTER},
+	{0x0009b3, 0x0009b5, PG_U_UNASSIGNED},
+	{0x0009b6, 0x0009b9, PG_U_OTHER_LETTER},
+	{0x0009ba, 0x0009bb, PG_U_UNASSIGNED},
+	{0x0009bc, 0x0009bc, PG_U_NON_SPACING_MARK},
+	{0x0009bd, 0x0009bd, PG_U_OTHER_LETTER},
+	{0x0009be, 0x0009c0, PG_U_COMBINING_SPACING_MARK},
+	{0x0009c1, 0x0009c4, PG_U_NON_SPACING_MARK},
+	{0x0009c5, 0x0009c6, PG_U_UNASSIGNED},
+	{0x0009c7, 0x0009c8, PG_U_COMBINING_SPACING_MARK},
+	{0x0009c9, 0x0009ca, PG_U_UNASSIGNED},
+	{0x0009cb, 0x0009cc, PG_U_COMBINING_SPACING_MARK},
+	{0x0009cd, 0x0009cd, PG_U_NON_SPACING_MARK},
+	{0x0009ce, 0x0009ce, PG_U_OTHER_LETTER},
+	{0x0009cf, 0x0009d6, PG_U_UNASSIGNED},
+	{0x0009d7, 0x0009d7, PG_U_COMBINING_SPACING_MARK},
+	{0x0009d8, 0x0009db, PG_U_UNASSIGNED},
+	{0x0009dc, 0x0009dd, PG_U_OTHER_LETTER},
+	{0x0009de, 0x0009de, PG_U_UNASSIGNED},
+	{0x0009df, 0x0009e1, PG_U_OTHER_LETTER},
+	{0x0009e2, 0x0009e3, PG_U_NON_SPACING_MARK},
+	{0x0009e4, 0x0009e5, PG_U_UNASSIGNED},
+	{0x0009e6, 0x0009ef, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0009f0, 0x0009f1, PG_U_OTHER_LETTER},
+	{0x0009f2, 0x0009f3, PG_U_CURRENCY_SYMBOL},
+	{0x0009f4, 0x0009f9, PG_U_OTHER_NUMBER},
+	{0x0009fa, 0x0009fa, PG_U_OTHER_SYMBOL},
+	{0x0009fb, 0x0009fb, PG_U_CURRENCY_SYMBOL},
+	{0x0009fc, 0x0009fc, PG_U_OTHER_LETTER},
+	{0x0009fd, 0x0009fd, PG_U_OTHER_PUNCTUATION},
+	{0x0009fe, 0x0009fe, PG_U_NON_SPACING_MARK},
+	{0x0009ff, 0x000a00, PG_U_UNASSIGNED},
+	{0x000a01, 0x000a02, PG_U_NON_SPACING_MARK},
+	{0x000a03, 0x000a03, PG_U_COMBINING_SPACING_MARK},
+	{0x000a04, 0x000a04, PG_U_UNASSIGNED},
+	{0x000a05, 0x000a0a, PG_U_OTHER_LETTER},
+	{0x000a0b, 0x000a0e, PG_U_UNASSIGNED},
+	{0x000a0f, 0x000a10, PG_U_OTHER_LETTER},
+	{0x000a11, 0x000a12, PG_U_UNASSIGNED},
+	{0x000a13, 0x000a28, PG_U_OTHER_LETTER},
+	{0x000a29, 0x000a29, PG_U_UNASSIGNED},
+	{0x000a2a, 0x000a30, PG_U_OTHER_LETTER},
+	{0x000a31, 0x000a31, PG_U_UNASSIGNED},
+	{0x000a32, 0x000a33, PG_U_OTHER_LETTER},
+	{0x000a34, 0x000a34, PG_U_UNASSIGNED},
+	{0x000a35, 0x000a36, PG_U_OTHER_LETTER},
+	{0x000a37, 0x000a37, PG_U_UNASSIGNED},
+	{0x000a38, 0x000a39, PG_U_OTHER_LETTER},
+	{0x000a3a, 0x000a3b, PG_U_UNASSIGNED},
+	{0x000a3c, 0x000a3c, PG_U_NON_SPACING_MARK},
+	{0x000a3d, 0x000a3d, PG_U_UNASSIGNED},
+	{0x000a3e, 0x000a40, PG_U_COMBINING_SPACING_MARK},
+	{0x000a41, 0x000a42, PG_U_NON_SPACING_MARK},
+	{0x000a43, 0x000a46, PG_U_UNASSIGNED},
+	{0x000a47, 0x000a48, PG_U_NON_SPACING_MARK},
+	{0x000a49, 0x000a4a, PG_U_UNASSIGNED},
+	{0x000a4b, 0x000a4d, PG_U_NON_SPACING_MARK},
+	{0x000a4e, 0x000a50, PG_U_UNASSIGNED},
+	{0x000a51, 0x000a51, PG_U_NON_SPACING_MARK},
+	{0x000a52, 0x000a58, PG_U_UNASSIGNED},
+	{0x000a59, 0x000a5c, PG_U_OTHER_LETTER},
+	{0x000a5d, 0x000a5d, PG_U_UNASSIGNED},
+	{0x000a5e, 0x000a5e, PG_U_OTHER_LETTER},
+	{0x000a5f, 0x000a65, PG_U_UNASSIGNED},
+	{0x000a66, 0x000a6f, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000a70, 0x000a71, PG_U_NON_SPACING_MARK},
+	{0x000a72, 0x000a74, PG_U_OTHER_LETTER},
+	{0x000a75, 0x000a75, PG_U_NON_SPACING_MARK},
+	{0x000a76, 0x000a76, PG_U_OTHER_PUNCTUATION},
+	{0x000a77, 0x000a80, PG_U_UNASSIGNED},
+	{0x000a81, 0x000a82, PG_U_NON_SPACING_MARK},
+	{0x000a83, 0x000a83, PG_U_COMBINING_SPACING_MARK},
+	{0x000a84, 0x000a84, PG_U_UNASSIGNED},
+	{0x000a85, 0x000a8d, PG_U_OTHER_LETTER},
+	{0x000a8e, 0x000a8e, PG_U_UNASSIGNED},
+	{0x000a8f, 0x000a91, PG_U_OTHER_LETTER},
+	{0x000a92, 0x000a92, PG_U_UNASSIGNED},
+	{0x000a93, 0x000aa8, PG_U_OTHER_LETTER},
+	{0x000aa9, 0x000aa9, PG_U_UNASSIGNED},
+	{0x000aaa, 0x000ab0, PG_U_OTHER_LETTER},
+	{0x000ab1, 0x000ab1, PG_U_UNASSIGNED},
+	{0x000ab2, 0x000ab3, PG_U_OTHER_LETTER},
+	{0x000ab4, 0x000ab4, PG_U_UNASSIGNED},
+	{0x000ab5, 0x000ab9, PG_U_OTHER_LETTER},
+	{0x000aba, 0x000abb, PG_U_UNASSIGNED},
+	{0x000abc, 0x000abc, PG_U_NON_SPACING_MARK},
+	{0x000abd, 0x000abd, PG_U_OTHER_LETTER},
+	{0x000abe, 0x000ac0, PG_U_COMBINING_SPACING_MARK},
+	{0x000ac1, 0x000ac5, PG_U_NON_SPACING_MARK},
+	{0x000ac6, 0x000ac6, PG_U_UNASSIGNED},
+	{0x000ac7, 0x000ac8, PG_U_NON_SPACING_MARK},
+	{0x000ac9, 0x000ac9, PG_U_COMBINING_SPACING_MARK},
+	{0x000aca, 0x000aca, PG_U_UNASSIGNED},
+	{0x000acb, 0x000acc, PG_U_COMBINING_SPACING_MARK},
+	{0x000acd, 0x000acd, PG_U_NON_SPACING_MARK},
+	{0x000ace, 0x000acf, PG_U_UNASSIGNED},
+	{0x000ad0, 0x000ad0, PG_U_OTHER_LETTER},
+	{0x000ad1, 0x000adf, PG_U_UNASSIGNED},
+	{0x000ae0, 0x000ae1, PG_U_OTHER_LETTER},
+	{0x000ae2, 0x000ae3, PG_U_NON_SPACING_MARK},
+	{0x000ae4, 0x000ae5, PG_U_UNASSIGNED},
+	{0x000ae6, 0x000aef, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000af0, 0x000af0, PG_U_OTHER_PUNCTUATION},
+	{0x000af1, 0x000af1, PG_U_CURRENCY_SYMBOL},
+	{0x000af2, 0x000af8, PG_U_UNASSIGNED},
+	{0x000af9, 0x000af9, PG_U_OTHER_LETTER},
+	{0x000afa, 0x000aff, PG_U_NON_SPACING_MARK},
+	{0x000b00, 0x000b00, PG_U_UNASSIGNED},
+	{0x000b01, 0x000b01, PG_U_NON_SPACING_MARK},
+	{0x000b02, 0x000b03, PG_U_COMBINING_SPACING_MARK},
+	{0x000b04, 0x000b04, PG_U_UNASSIGNED},
+	{0x000b05, 0x000b0c, PG_U_OTHER_LETTER},
+	{0x000b0d, 0x000b0e, PG_U_UNASSIGNED},
+	{0x000b0f, 0x000b10, PG_U_OTHER_LETTER},
+	{0x000b11, 0x000b12, PG_U_UNASSIGNED},
+	{0x000b13, 0x000b28, PG_U_OTHER_LETTER},
+	{0x000b29, 0x000b29, PG_U_UNASSIGNED},
+	{0x000b2a, 0x000b30, PG_U_OTHER_LETTER},
+	{0x000b31, 0x000b31, PG_U_UNASSIGNED},
+	{0x000b32, 0x000b33, PG_U_OTHER_LETTER},
+	{0x000b34, 0x000b34, PG_U_UNASSIGNED},
+	{0x000b35, 0x000b39, PG_U_OTHER_LETTER},
+	{0x000b3a, 0x000b3b, PG_U_UNASSIGNED},
+	{0x000b3c, 0x000b3c, PG_U_NON_SPACING_MARK},
+	{0x000b3d, 0x000b3d, PG_U_OTHER_LETTER},
+	{0x000b3e, 0x000b3e, PG_U_COMBINING_SPACING_MARK},
+	{0x000b3f, 0x000b3f, PG_U_NON_SPACING_MARK},
+	{0x000b40, 0x000b40, PG_U_COMBINING_SPACING_MARK},
+	{0x000b41, 0x000b44, PG_U_NON_SPACING_MARK},
+	{0x000b45, 0x000b46, PG_U_UNASSIGNED},
+	{0x000b47, 0x000b48, PG_U_COMBINING_SPACING_MARK},
+	{0x000b49, 0x000b4a, PG_U_UNASSIGNED},
+	{0x000b4b, 0x000b4c, PG_U_COMBINING_SPACING_MARK},
+	{0x000b4d, 0x000b4d, PG_U_NON_SPACING_MARK},
+	{0x000b4e, 0x000b54, PG_U_UNASSIGNED},
+	{0x000b55, 0x000b56, PG_U_NON_SPACING_MARK},
+	{0x000b57, 0x000b57, PG_U_COMBINING_SPACING_MARK},
+	{0x000b58, 0x000b5b, PG_U_UNASSIGNED},
+	{0x000b5c, 0x000b5d, PG_U_OTHER_LETTER},
+	{0x000b5e, 0x000b5e, PG_U_UNASSIGNED},
+	{0x000b5f, 0x000b61, PG_U_OTHER_LETTER},
+	{0x000b62, 0x000b63, PG_U_NON_SPACING_MARK},
+	{0x000b64, 0x000b65, PG_U_UNASSIGNED},
+	{0x000b66, 0x000b6f, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000b70, 0x000b70, PG_U_OTHER_SYMBOL},
+	{0x000b71, 0x000b71, PG_U_OTHER_LETTER},
+	{0x000b72, 0x000b77, PG_U_OTHER_NUMBER},
+	{0x000b78, 0x000b81, PG_U_UNASSIGNED},
+	{0x000b82, 0x000b82, PG_U_NON_SPACING_MARK},
+	{0x000b83, 0x000b83, PG_U_OTHER_LETTER},
+	{0x000b84, 0x000b84, PG_U_UNASSIGNED},
+	{0x000b85, 0x000b8a, PG_U_OTHER_LETTER},
+	{0x000b8b, 0x000b8d, PG_U_UNASSIGNED},
+	{0x000b8e, 0x000b90, PG_U_OTHER_LETTER},
+	{0x000b91, 0x000b91, PG_U_UNASSIGNED},
+	{0x000b92, 0x000b95, PG_U_OTHER_LETTER},
+	{0x000b96, 0x000b98, PG_U_UNASSIGNED},
+	{0x000b99, 0x000b9a, PG_U_OTHER_LETTER},
+	{0x000b9b, 0x000b9b, PG_U_UNASSIGNED},
+	{0x000b9c, 0x000b9c, PG_U_OTHER_LETTER},
+	{0x000b9d, 0x000b9d, PG_U_UNASSIGNED},
+	{0x000b9e, 0x000b9f, PG_U_OTHER_LETTER},
+	{0x000ba0, 0x000ba2, PG_U_UNASSIGNED},
+	{0x000ba3, 0x000ba4, PG_U_OTHER_LETTER},
+	{0x000ba5, 0x000ba7, PG_U_UNASSIGNED},
+	{0x000ba8, 0x000baa, PG_U_OTHER_LETTER},
+	{0x000bab, 0x000bad, PG_U_UNASSIGNED},
+	{0x000bae, 0x000bb9, PG_U_OTHER_LETTER},
+	{0x000bba, 0x000bbd, PG_U_UNASSIGNED},
+	{0x000bbe, 0x000bbf, PG_U_COMBINING_SPACING_MARK},
+	{0x000bc0, 0x000bc0, PG_U_NON_SPACING_MARK},
+	{0x000bc1, 0x000bc2, PG_U_COMBINING_SPACING_MARK},
+	{0x000bc3, 0x000bc5, PG_U_UNASSIGNED},
+	{0x000bc6, 0x000bc8, PG_U_COMBINING_SPACING_MARK},
+	{0x000bc9, 0x000bc9, PG_U_UNASSIGNED},
+	{0x000bca, 0x000bcc, PG_U_COMBINING_SPACING_MARK},
+	{0x000bcd, 0x000bcd, PG_U_NON_SPACING_MARK},
+	{0x000bce, 0x000bcf, PG_U_UNASSIGNED},
+	{0x000bd0, 0x000bd0, PG_U_OTHER_LETTER},
+	{0x000bd1, 0x000bd6, PG_U_UNASSIGNED},
+	{0x000bd7, 0x000bd7, PG_U_COMBINING_SPACING_MARK},
+	{0x000bd8, 0x000be5, PG_U_UNASSIGNED},
+	{0x000be6, 0x000bef, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000bf0, 0x000bf2, PG_U_OTHER_NUMBER},
+	{0x000bf3, 0x000bf8, PG_U_OTHER_SYMBOL},
+	{0x000bf9, 0x000bf9, PG_U_CURRENCY_SYMBOL},
+	{0x000bfa, 0x000bfa, PG_U_OTHER_SYMBOL},
+	{0x000bfb, 0x000bff, PG_U_UNASSIGNED},
+	{0x000c00, 0x000c00, PG_U_NON_SPACING_MARK},
+	{0x000c01, 0x000c03, PG_U_COMBINING_SPACING_MARK},
+	{0x000c04, 0x000c04, PG_U_NON_SPACING_MARK},
+	{0x000c05, 0x000c0c, PG_U_OTHER_LETTER},
+	{0x000c0d, 0x000c0d, PG_U_UNASSIGNED},
+	{0x000c0e, 0x000c10, PG_U_OTHER_LETTER},
+	{0x000c11, 0x000c11, PG_U_UNASSIGNED},
+	{0x000c12, 0x000c28, PG_U_OTHER_LETTER},
+	{0x000c29, 0x000c29, PG_U_UNASSIGNED},
+	{0x000c2a, 0x000c39, PG_U_OTHER_LETTER},
+	{0x000c3a, 0x000c3b, PG_U_UNASSIGNED},
+	{0x000c3c, 0x000c3c, PG_U_NON_SPACING_MARK},
+	{0x000c3d, 0x000c3d, PG_U_OTHER_LETTER},
+	{0x000c3e, 0x000c40, PG_U_NON_SPACING_MARK},
+	{0x000c41, 0x000c44, PG_U_COMBINING_SPACING_MARK},
+	{0x000c45, 0x000c45, PG_U_UNASSIGNED},
+	{0x000c46, 0x000c48, PG_U_NON_SPACING_MARK},
+	{0x000c49, 0x000c49, PG_U_UNASSIGNED},
+	{0x000c4a, 0x000c4d, PG_U_NON_SPACING_MARK},
+	{0x000c4e, 0x000c54, PG_U_UNASSIGNED},
+	{0x000c55, 0x000c56, PG_U_NON_SPACING_MARK},
+	{0x000c57, 0x000c57, PG_U_UNASSIGNED},
+	{0x000c58, 0x000c5a, PG_U_OTHER_LETTER},
+	{0x000c5b, 0x000c5c, PG_U_UNASSIGNED},
+	{0x000c5d, 0x000c5d, PG_U_OTHER_LETTER},
+	{0x000c5e, 0x000c5f, PG_U_UNASSIGNED},
+	{0x000c60, 0x000c61, PG_U_OTHER_LETTER},
+	{0x000c62, 0x000c63, PG_U_NON_SPACING_MARK},
+	{0x000c64, 0x000c65, PG_U_UNASSIGNED},
+	{0x000c66, 0x000c6f, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000c70, 0x000c76, PG_U_UNASSIGNED},
+	{0x000c77, 0x000c77, PG_U_OTHER_PUNCTUATION},
+	{0x000c78, 0x000c7e, PG_U_OTHER_NUMBER},
+	{0x000c7f, 0x000c7f, PG_U_OTHER_SYMBOL},
+	{0x000c80, 0x000c80, PG_U_OTHER_LETTER},
+	{0x000c81, 0x000c81, PG_U_NON_SPACING_MARK},
+	{0x000c82, 0x000c83, PG_U_COMBINING_SPACING_MARK},
+	{0x000c84, 0x000c84, PG_U_OTHER_PUNCTUATION},
+	{0x000c85, 0x000c8c, PG_U_OTHER_LETTER},
+	{0x000c8d, 0x000c8d, PG_U_UNASSIGNED},
+	{0x000c8e, 0x000c90, PG_U_OTHER_LETTER},
+	{0x000c91, 0x000c91, PG_U_UNASSIGNED},
+	{0x000c92, 0x000ca8, PG_U_OTHER_LETTER},
+	{0x000ca9, 0x000ca9, PG_U_UNASSIGNED},
+	{0x000caa, 0x000cb3, PG_U_OTHER_LETTER},
+	{0x000cb4, 0x000cb4, PG_U_UNASSIGNED},
+	{0x000cb5, 0x000cb9, PG_U_OTHER_LETTER},
+	{0x000cba, 0x000cbb, PG_U_UNASSIGNED},
+	{0x000cbc, 0x000cbc, PG_U_NON_SPACING_MARK},
+	{0x000cbd, 0x000cbd, PG_U_OTHER_LETTER},
+	{0x000cbe, 0x000cbe, PG_U_COMBINING_SPACING_MARK},
+	{0x000cbf, 0x000cbf, PG_U_NON_SPACING_MARK},
+	{0x000cc0, 0x000cc4, PG_U_COMBINING_SPACING_MARK},
+	{0x000cc5, 0x000cc5, PG_U_UNASSIGNED},
+	{0x000cc6, 0x000cc6, PG_U_NON_SPACING_MARK},
+	{0x000cc7, 0x000cc8, PG_U_COMBINING_SPACING_MARK},
+	{0x000cc9, 0x000cc9, PG_U_UNASSIGNED},
+	{0x000cca, 0x000ccb, PG_U_COMBINING_SPACING_MARK},
+	{0x000ccc, 0x000ccd, PG_U_NON_SPACING_MARK},
+	{0x000cce, 0x000cd4, PG_U_UNASSIGNED},
+	{0x000cd5, 0x000cd6, PG_U_COMBINING_SPACING_MARK},
+	{0x000cd7, 0x000cdc, PG_U_UNASSIGNED},
+	{0x000cdd, 0x000cde, PG_U_OTHER_LETTER},
+	{0x000cdf, 0x000cdf, PG_U_UNASSIGNED},
+	{0x000ce0, 0x000ce1, PG_U_OTHER_LETTER},
+	{0x000ce2, 0x000ce3, PG_U_NON_SPACING_MARK},
+	{0x000ce4, 0x000ce5, PG_U_UNASSIGNED},
+	{0x000ce6, 0x000cef, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000cf0, 0x000cf0, PG_U_UNASSIGNED},
+	{0x000cf1, 0x000cf2, PG_U_OTHER_LETTER},
+	{0x000cf3, 0x000cf3, PG_U_COMBINING_SPACING_MARK},
+	{0x000cf4, 0x000cff, PG_U_UNASSIGNED},
+	{0x000d00, 0x000d01, PG_U_NON_SPACING_MARK},
+	{0x000d02, 0x000d03, PG_U_COMBINING_SPACING_MARK},
+	{0x000d04, 0x000d0c, PG_U_OTHER_LETTER},
+	{0x000d0d, 0x000d0d, PG_U_UNASSIGNED},
+	{0x000d0e, 0x000d10, PG_U_OTHER_LETTER},
+	{0x000d11, 0x000d11, PG_U_UNASSIGNED},
+	{0x000d12, 0x000d3a, PG_U_OTHER_LETTER},
+	{0x000d3b, 0x000d3c, PG_U_NON_SPACING_MARK},
+	{0x000d3d, 0x000d3d, PG_U_OTHER_LETTER},
+	{0x000d3e, 0x000d40, PG_U_COMBINING_SPACING_MARK},
+	{0x000d41, 0x000d44, PG_U_NON_SPACING_MARK},
+	{0x000d45, 0x000d45, PG_U_UNASSIGNED},
+	{0x000d46, 0x000d48, PG_U_COMBINING_SPACING_MARK},
+	{0x000d49, 0x000d49, PG_U_UNASSIGNED},
+	{0x000d4a, 0x000d4c, PG_U_COMBINING_SPACING_MARK},
+	{0x000d4d, 0x000d4d, PG_U_NON_SPACING_MARK},
+	{0x000d4e, 0x000d4e, PG_U_OTHER_LETTER},
+	{0x000d4f, 0x000d4f, PG_U_OTHER_SYMBOL},
+	{0x000d50, 0x000d53, PG_U_UNASSIGNED},
+	{0x000d54, 0x000d56, PG_U_OTHER_LETTER},
+	{0x000d57, 0x000d57, PG_U_COMBINING_SPACING_MARK},
+	{0x000d58, 0x000d5e, PG_U_OTHER_NUMBER},
+	{0x000d5f, 0x000d61, PG_U_OTHER_LETTER},
+	{0x000d62, 0x000d63, PG_U_NON_SPACING_MARK},
+	{0x000d64, 0x000d65, PG_U_UNASSIGNED},
+	{0x000d66, 0x000d6f, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000d70, 0x000d78, PG_U_OTHER_NUMBER},
+	{0x000d79, 0x000d79, PG_U_OTHER_SYMBOL},
+	{0x000d7a, 0x000d7f, PG_U_OTHER_LETTER},
+	{0x000d80, 0x000d80, PG_U_UNASSIGNED},
+	{0x000d81, 0x000d81, PG_U_NON_SPACING_MARK},
+	{0x000d82, 0x000d83, PG_U_COMBINING_SPACING_MARK},
+	{0x000d84, 0x000d84, PG_U_UNASSIGNED},
+	{0x000d85, 0x000d96, PG_U_OTHER_LETTER},
+	{0x000d97, 0x000d99, PG_U_UNASSIGNED},
+	{0x000d9a, 0x000db1, PG_U_OTHER_LETTER},
+	{0x000db2, 0x000db2, PG_U_UNASSIGNED},
+	{0x000db3, 0x000dbb, PG_U_OTHER_LETTER},
+	{0x000dbc, 0x000dbc, PG_U_UNASSIGNED},
+	{0x000dbd, 0x000dbd, PG_U_OTHER_LETTER},
+	{0x000dbe, 0x000dbf, PG_U_UNASSIGNED},
+	{0x000dc0, 0x000dc6, PG_U_OTHER_LETTER},
+	{0x000dc7, 0x000dc9, PG_U_UNASSIGNED},
+	{0x000dca, 0x000dca, PG_U_NON_SPACING_MARK},
+	{0x000dcb, 0x000dce, PG_U_UNASSIGNED},
+	{0x000dcf, 0x000dd1, PG_U_COMBINING_SPACING_MARK},
+	{0x000dd2, 0x000dd4, PG_U_NON_SPACING_MARK},
+	{0x000dd5, 0x000dd5, PG_U_UNASSIGNED},
+	{0x000dd6, 0x000dd6, PG_U_NON_SPACING_MARK},
+	{0x000dd7, 0x000dd7, PG_U_UNASSIGNED},
+	{0x000dd8, 0x000ddf, PG_U_COMBINING_SPACING_MARK},
+	{0x000de0, 0x000de5, PG_U_UNASSIGNED},
+	{0x000de6, 0x000def, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000df0, 0x000df1, PG_U_UNASSIGNED},
+	{0x000df2, 0x000df3, PG_U_COMBINING_SPACING_MARK},
+	{0x000df4, 0x000df4, PG_U_OTHER_PUNCTUATION},
+	{0x000df5, 0x000e00, PG_U_UNASSIGNED},
+	{0x000e01, 0x000e30, PG_U_OTHER_LETTER},
+	{0x000e31, 0x000e31, PG_U_NON_SPACING_MARK},
+	{0x000e32, 0x000e33, PG_U_OTHER_LETTER},
+	{0x000e34, 0x000e3a, PG_U_NON_SPACING_MARK},
+	{0x000e3b, 0x000e3e, PG_U_UNASSIGNED},
+	{0x000e3f, 0x000e3f, PG_U_CURRENCY_SYMBOL},
+	{0x000e40, 0x000e45, PG_U_OTHER_LETTER},
+	{0x000e46, 0x000e46, PG_U_MODIFIER_LETTER},
+	{0x000e47, 0x000e4e, PG_U_NON_SPACING_MARK},
+	{0x000e4f, 0x000e4f, PG_U_OTHER_PUNCTUATION},
+	{0x000e50, 0x000e59, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000e5a, 0x000e5b, PG_U_OTHER_PUNCTUATION},
+	{0x000e5c, 0x000e80, PG_U_UNASSIGNED},
+	{0x000e81, 0x000e82, PG_U_OTHER_LETTER},
+	{0x000e83, 0x000e83, PG_U_UNASSIGNED},
+	{0x000e84, 0x000e84, PG_U_OTHER_LETTER},
+	{0x000e85, 0x000e85, PG_U_UNASSIGNED},
+	{0x000e86, 0x000e8a, PG_U_OTHER_LETTER},
+	{0x000e8b, 0x000e8b, PG_U_UNASSIGNED},
+	{0x000e8c, 0x000ea3, PG_U_OTHER_LETTER},
+	{0x000ea4, 0x000ea4, PG_U_UNASSIGNED},
+	{0x000ea5, 0x000ea5, PG_U_OTHER_LETTER},
+	{0x000ea6, 0x000ea6, PG_U_UNASSIGNED},
+	{0x000ea7, 0x000eb0, PG_U_OTHER_LETTER},
+	{0x000eb1, 0x000eb1, PG_U_NON_SPACING_MARK},
+	{0x000eb2, 0x000eb3, PG_U_OTHER_LETTER},
+	{0x000eb4, 0x000ebc, PG_U_NON_SPACING_MARK},
+	{0x000ebd, 0x000ebd, PG_U_OTHER_LETTER},
+	{0x000ebe, 0x000ebf, PG_U_UNASSIGNED},
+	{0x000ec0, 0x000ec4, PG_U_OTHER_LETTER},
+	{0x000ec5, 0x000ec5, PG_U_UNASSIGNED},
+	{0x000ec6, 0x000ec6, PG_U_MODIFIER_LETTER},
+	{0x000ec7, 0x000ec7, PG_U_UNASSIGNED},
+	{0x000ec8, 0x000ece, PG_U_NON_SPACING_MARK},
+	{0x000ecf, 0x000ecf, PG_U_UNASSIGNED},
+	{0x000ed0, 0x000ed9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000eda, 0x000edb, PG_U_UNASSIGNED},
+	{0x000edc, 0x000edf, PG_U_OTHER_LETTER},
+	{0x000ee0, 0x000eff, PG_U_UNASSIGNED},
+	{0x000f00, 0x000f00, PG_U_OTHER_LETTER},
+	{0x000f01, 0x000f03, PG_U_OTHER_SYMBOL},
+	{0x000f04, 0x000f12, PG_U_OTHER_PUNCTUATION},
+	{0x000f13, 0x000f13, PG_U_OTHER_SYMBOL},
+	{0x000f14, 0x000f14, PG_U_OTHER_PUNCTUATION},
+	{0x000f15, 0x000f17, PG_U_OTHER_SYMBOL},
+	{0x000f18, 0x000f19, PG_U_NON_SPACING_MARK},
+	{0x000f1a, 0x000f1f, PG_U_OTHER_SYMBOL},
+	{0x000f20, 0x000f29, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x000f2a, 0x000f33, PG_U_OTHER_NUMBER},
+	{0x000f34, 0x000f34, PG_U_OTHER_SYMBOL},
+	{0x000f35, 0x000f35, PG_U_NON_SPACING_MARK},
+	{0x000f36, 0x000f36, PG_U_OTHER_SYMBOL},
+	{0x000f37, 0x000f37, PG_U_NON_SPACING_MARK},
+	{0x000f38, 0x000f38, PG_U_OTHER_SYMBOL},
+	{0x000f39, 0x000f39, PG_U_NON_SPACING_MARK},
+	{0x000f3a, 0x000f3a, PG_U_START_PUNCTUATION},
+	{0x000f3b, 0x000f3b, PG_U_END_PUNCTUATION},
+	{0x000f3c, 0x000f3c, PG_U_START_PUNCTUATION},
+	{0x000f3d, 0x000f3d, PG_U_END_PUNCTUATION},
+	{0x000f3e, 0x000f3f, PG_U_COMBINING_SPACING_MARK},
+	{0x000f40, 0x000f47, PG_U_OTHER_LETTER},
+	{0x000f48, 0x000f48, PG_U_UNASSIGNED},
+	{0x000f49, 0x000f6c, PG_U_OTHER_LETTER},
+	{0x000f6d, 0x000f70, PG_U_UNASSIGNED},
+	{0x000f71, 0x000f7e, PG_U_NON_SPACING_MARK},
+	{0x000f7f, 0x000f7f, PG_U_COMBINING_SPACING_MARK},
+	{0x000f80, 0x000f84, PG_U_NON_SPACING_MARK},
+	{0x000f85, 0x000f85, PG_U_OTHER_PUNCTUATION},
+	{0x000f86, 0x000f87, PG_U_NON_SPACING_MARK},
+	{0x000f88, 0x000f8c, PG_U_OTHER_LETTER},
+	{0x000f8d, 0x000f97, PG_U_NON_SPACING_MARK},
+	{0x000f98, 0x000f98, PG_U_UNASSIGNED},
+	{0x000f99, 0x000fbc, PG_U_NON_SPACING_MARK},
+	{0x000fbd, 0x000fbd, PG_U_UNASSIGNED},
+	{0x000fbe, 0x000fc5, PG_U_OTHER_SYMBOL},
+	{0x000fc6, 0x000fc6, PG_U_NON_SPACING_MARK},
+	{0x000fc7, 0x000fcc, PG_U_OTHER_SYMBOL},
+	{0x000fcd, 0x000fcd, PG_U_UNASSIGNED},
+	{0x000fce, 0x000fcf, PG_U_OTHER_SYMBOL},
+	{0x000fd0, 0x000fd4, PG_U_OTHER_PUNCTUATION},
+	{0x000fd5, 0x000fd8, PG_U_OTHER_SYMBOL},
+	{0x000fd9, 0x000fda, PG_U_OTHER_PUNCTUATION},
+	{0x000fdb, 0x000fff, PG_U_UNASSIGNED},
+	{0x001000, 0x00102a, PG_U_OTHER_LETTER},
+	{0x00102b, 0x00102c, PG_U_COMBINING_SPACING_MARK},
+	{0x00102d, 0x001030, PG_U_NON_SPACING_MARK},
+	{0x001031, 0x001031, PG_U_COMBINING_SPACING_MARK},
+	{0x001032, 0x001037, PG_U_NON_SPACING_MARK},
+	{0x001038, 0x001038, PG_U_COMBINING_SPACING_MARK},
+	{0x001039, 0x00103a, PG_U_NON_SPACING_MARK},
+	{0x00103b, 0x00103c, PG_U_COMBINING_SPACING_MARK},
+	{0x00103d, 0x00103e, PG_U_NON_SPACING_MARK},
+	{0x00103f, 0x00103f, PG_U_OTHER_LETTER},
+	{0x001040, 0x001049, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00104a, 0x00104f, PG_U_OTHER_PUNCTUATION},
+	{0x001050, 0x001055, PG_U_OTHER_LETTER},
+	{0x001056, 0x001057, PG_U_COMBINING_SPACING_MARK},
+	{0x001058, 0x001059, PG_U_NON_SPACING_MARK},
+	{0x00105a, 0x00105d, PG_U_OTHER_LETTER},
+	{0x00105e, 0x001060, PG_U_NON_SPACING_MARK},
+	{0x001061, 0x001061, PG_U_OTHER_LETTER},
+	{0x001062, 0x001064, PG_U_COMBINING_SPACING_MARK},
+	{0x001065, 0x001066, PG_U_OTHER_LETTER},
+	{0x001067, 0x00106d, PG_U_COMBINING_SPACING_MARK},
+	{0x00106e, 0x001070, PG_U_OTHER_LETTER},
+	{0x001071, 0x001074, PG_U_NON_SPACING_MARK},
+	{0x001075, 0x001081, PG_U_OTHER_LETTER},
+	{0x001082, 0x001082, PG_U_NON_SPACING_MARK},
+	{0x001083, 0x001084, PG_U_COMBINING_SPACING_MARK},
+	{0x001085, 0x001086, PG_U_NON_SPACING_MARK},
+	{0x001087, 0x00108c, PG_U_COMBINING_SPACING_MARK},
+	{0x00108d, 0x00108d, PG_U_NON_SPACING_MARK},
+	{0x00108e, 0x00108e, PG_U_OTHER_LETTER},
+	{0x00108f, 0x00108f, PG_U_COMBINING_SPACING_MARK},
+	{0x001090, 0x001099, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00109a, 0x00109c, PG_U_COMBINING_SPACING_MARK},
+	{0x00109d, 0x00109d, PG_U_NON_SPACING_MARK},
+	{0x00109e, 0x00109f, PG_U_OTHER_SYMBOL},
+	{0x0010a0, 0x0010c5, PG_U_UPPERCASE_LETTER},
+	{0x0010c6, 0x0010c6, PG_U_UNASSIGNED},
+	{0x0010c7, 0x0010c7, PG_U_UPPERCASE_LETTER},
+	{0x0010c8, 0x0010cc, PG_U_UNASSIGNED},
+	{0x0010cd, 0x0010cd, PG_U_UPPERCASE_LETTER},
+	{0x0010ce, 0x0010cf, PG_U_UNASSIGNED},
+	{0x0010d0, 0x0010fa, PG_U_LOWERCASE_LETTER},
+	{0x0010fb, 0x0010fb, PG_U_OTHER_PUNCTUATION},
+	{0x0010fc, 0x0010fc, PG_U_MODIFIER_LETTER},
+	{0x0010fd, 0x0010ff, PG_U_LOWERCASE_LETTER},
+	{0x001100, 0x001248, PG_U_OTHER_LETTER},
+	{0x001249, 0x001249, PG_U_UNASSIGNED},
+	{0x00124a, 0x00124d, PG_U_OTHER_LETTER},
+	{0x00124e, 0x00124f, PG_U_UNASSIGNED},
+	{0x001250, 0x001256, PG_U_OTHER_LETTER},
+	{0x001257, 0x001257, PG_U_UNASSIGNED},
+	{0x001258, 0x001258, PG_U_OTHER_LETTER},
+	{0x001259, 0x001259, PG_U_UNASSIGNED},
+	{0x00125a, 0x00125d, PG_U_OTHER_LETTER},
+	{0x00125e, 0x00125f, PG_U_UNASSIGNED},
+	{0x001260, 0x001288, PG_U_OTHER_LETTER},
+	{0x001289, 0x001289, PG_U_UNASSIGNED},
+	{0x00128a, 0x00128d, PG_U_OTHER_LETTER},
+	{0x00128e, 0x00128f, PG_U_UNASSIGNED},
+	{0x001290, 0x0012b0, PG_U_OTHER_LETTER},
+	{0x0012b1, 0x0012b1, PG_U_UNASSIGNED},
+	{0x0012b2, 0x0012b5, PG_U_OTHER_LETTER},
+	{0x0012b6, 0x0012b7, PG_U_UNASSIGNED},
+	{0x0012b8, 0x0012be, PG_U_OTHER_LETTER},
+	{0x0012bf, 0x0012bf, PG_U_UNASSIGNED},
+	{0x0012c0, 0x0012c0, PG_U_OTHER_LETTER},
+	{0x0012c1, 0x0012c1, PG_U_UNASSIGNED},
+	{0x0012c2, 0x0012c5, PG_U_OTHER_LETTER},
+	{0x0012c6, 0x0012c7, PG_U_UNASSIGNED},
+	{0x0012c8, 0x0012d6, PG_U_OTHER_LETTER},
+	{0x0012d7, 0x0012d7, PG_U_UNASSIGNED},
+	{0x0012d8, 0x001310, PG_U_OTHER_LETTER},
+	{0x001311, 0x001311, PG_U_UNASSIGNED},
+	{0x001312, 0x001315, PG_U_OTHER_LETTER},
+	{0x001316, 0x001317, PG_U_UNASSIGNED},
+	{0x001318, 0x00135a, PG_U_OTHER_LETTER},
+	{0x00135b, 0x00135c, PG_U_UNASSIGNED},
+	{0x00135d, 0x00135f, PG_U_NON_SPACING_MARK},
+	{0x001360, 0x001368, PG_U_OTHER_PUNCTUATION},
+	{0x001369, 0x00137c, PG_U_OTHER_NUMBER},
+	{0x00137d, 0x00137f, PG_U_UNASSIGNED},
+	{0x001380, 0x00138f, PG_U_OTHER_LETTER},
+	{0x001390, 0x001399, PG_U_OTHER_SYMBOL},
+	{0x00139a, 0x00139f, PG_U_UNASSIGNED},
+	{0x0013a0, 0x0013f5, PG_U_UPPERCASE_LETTER},
+	{0x0013f6, 0x0013f7, PG_U_UNASSIGNED},
+	{0x0013f8, 0x0013fd, PG_U_LOWERCASE_LETTER},
+	{0x0013fe, 0x0013ff, PG_U_UNASSIGNED},
+	{0x001400, 0x001400, PG_U_DASH_PUNCTUATION},
+	{0x001401, 0x00166c, PG_U_OTHER_LETTER},
+	{0x00166d, 0x00166d, PG_U_OTHER_SYMBOL},
+	{0x00166e, 0x00166e, PG_U_OTHER_PUNCTUATION},
+	{0x00166f, 0x00167f, PG_U_OTHER_LETTER},
+	{0x001680, 0x001680, PG_U_SPACE_SEPARATOR},
+	{0x001681, 0x00169a, PG_U_OTHER_LETTER},
+	{0x00169b, 0x00169b, PG_U_START_PUNCTUATION},
+	{0x00169c, 0x00169c, PG_U_END_PUNCTUATION},
+	{0x00169d, 0x00169f, PG_U_UNASSIGNED},
+	{0x0016a0, 0x0016ea, PG_U_OTHER_LETTER},
+	{0x0016eb, 0x0016ed, PG_U_OTHER_PUNCTUATION},
+	{0x0016ee, 0x0016f0, PG_U_LETTER_NUMBER},
+	{0x0016f1, 0x0016f8, PG_U_OTHER_LETTER},
+	{0x0016f9, 0x0016ff, PG_U_UNASSIGNED},
+	{0x001700, 0x001711, PG_U_OTHER_LETTER},
+	{0x001712, 0x001714, PG_U_NON_SPACING_MARK},
+	{0x001715, 0x001715, PG_U_COMBINING_SPACING_MARK},
+	{0x001716, 0x00171e, PG_U_UNASSIGNED},
+	{0x00171f, 0x001731, PG_U_OTHER_LETTER},
+	{0x001732, 0x001733, PG_U_NON_SPACING_MARK},
+	{0x001734, 0x001734, PG_U_COMBINING_SPACING_MARK},
+	{0x001735, 0x001736, PG_U_OTHER_PUNCTUATION},
+	{0x001737, 0x00173f, PG_U_UNASSIGNED},
+	{0x001740, 0x001751, PG_U_OTHER_LETTER},
+	{0x001752, 0x001753, PG_U_NON_SPACING_MARK},
+	{0x001754, 0x00175f, PG_U_UNASSIGNED},
+	{0x001760, 0x00176c, PG_U_OTHER_LETTER},
+	{0x00176d, 0x00176d, PG_U_UNASSIGNED},
+	{0x00176e, 0x001770, PG_U_OTHER_LETTER},
+	{0x001771, 0x001771, PG_U_UNASSIGNED},
+	{0x001772, 0x001773, PG_U_NON_SPACING_MARK},
+	{0x001774, 0x00177f, PG_U_UNASSIGNED},
+	{0x001780, 0x0017b3, PG_U_OTHER_LETTER},
+	{0x0017b4, 0x0017b5, PG_U_NON_SPACING_MARK},
+	{0x0017b6, 0x0017b6, PG_U_COMBINING_SPACING_MARK},
+	{0x0017b7, 0x0017bd, PG_U_NON_SPACING_MARK},
+	{0x0017be, 0x0017c5, PG_U_COMBINING_SPACING_MARK},
+	{0x0017c6, 0x0017c6, PG_U_NON_SPACING_MARK},
+	{0x0017c7, 0x0017c8, PG_U_COMBINING_SPACING_MARK},
+	{0x0017c9, 0x0017d3, PG_U_NON_SPACING_MARK},
+	{0x0017d4, 0x0017d6, PG_U_OTHER_PUNCTUATION},
+	{0x0017d7, 0x0017d7, PG_U_MODIFIER_LETTER},
+	{0x0017d8, 0x0017da, PG_U_OTHER_PUNCTUATION},
+	{0x0017db, 0x0017db, PG_U_CURRENCY_SYMBOL},
+	{0x0017dc, 0x0017dc, PG_U_OTHER_LETTER},
+	{0x0017dd, 0x0017dd, PG_U_NON_SPACING_MARK},
+	{0x0017de, 0x0017df, PG_U_UNASSIGNED},
+	{0x0017e0, 0x0017e9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0017ea, 0x0017ef, PG_U_UNASSIGNED},
+	{0x0017f0, 0x0017f9, PG_U_OTHER_NUMBER},
+	{0x0017fa, 0x0017ff, PG_U_UNASSIGNED},
+	{0x001800, 0x001805, PG_U_OTHER_PUNCTUATION},
+	{0x001806, 0x001806, PG_U_DASH_PUNCTUATION},
+	{0x001807, 0x00180a, PG_U_OTHER_PUNCTUATION},
+	{0x00180b, 0x00180d, PG_U_NON_SPACING_MARK},
+	{0x00180e, 0x00180e, PG_U_FORMAT_CHAR},
+	{0x00180f, 0x00180f, PG_U_NON_SPACING_MARK},
+	{0x001810, 0x001819, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00181a, 0x00181f, PG_U_UNASSIGNED},
+	{0x001820, 0x001842, PG_U_OTHER_LETTER},
+	{0x001843, 0x001843, PG_U_MODIFIER_LETTER},
+	{0x001844, 0x001878, PG_U_OTHER_LETTER},
+	{0x001879, 0x00187f, PG_U_UNASSIGNED},
+	{0x001880, 0x001884, PG_U_OTHER_LETTER},
+	{0x001885, 0x001886, PG_U_NON_SPACING_MARK},
+	{0x001887, 0x0018a8, PG_U_OTHER_LETTER},
+	{0x0018a9, 0x0018a9, PG_U_NON_SPACING_MARK},
+	{0x0018aa, 0x0018aa, PG_U_OTHER_LETTER},
+	{0x0018ab, 0x0018af, PG_U_UNASSIGNED},
+	{0x0018b0, 0x0018f5, PG_U_OTHER_LETTER},
+	{0x0018f6, 0x0018ff, PG_U_UNASSIGNED},
+	{0x001900, 0x00191e, PG_U_OTHER_LETTER},
+	{0x00191f, 0x00191f, PG_U_UNASSIGNED},
+	{0x001920, 0x001922, PG_U_NON_SPACING_MARK},
+	{0x001923, 0x001926, PG_U_COMBINING_SPACING_MARK},
+	{0x001927, 0x001928, PG_U_NON_SPACING_MARK},
+	{0x001929, 0x00192b, PG_U_COMBINING_SPACING_MARK},
+	{0x00192c, 0x00192f, PG_U_UNASSIGNED},
+	{0x001930, 0x001931, PG_U_COMBINING_SPACING_MARK},
+	{0x001932, 0x001932, PG_U_NON_SPACING_MARK},
+	{0x001933, 0x001938, PG_U_COMBINING_SPACING_MARK},
+	{0x001939, 0x00193b, PG_U_NON_SPACING_MARK},
+	{0x00193c, 0x00193f, PG_U_UNASSIGNED},
+	{0x001940, 0x001940, PG_U_OTHER_SYMBOL},
+	{0x001941, 0x001943, PG_U_UNASSIGNED},
+	{0x001944, 0x001945, PG_U_OTHER_PUNCTUATION},
+	{0x001946, 0x00194f, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x001950, 0x00196d, PG_U_OTHER_LETTER},
+	{0x00196e, 0x00196f, PG_U_UNASSIGNED},
+	{0x001970, 0x001974, PG_U_OTHER_LETTER},
+	{0x001975, 0x00197f, PG_U_UNASSIGNED},
+	{0x001980, 0x0019ab, PG_U_OTHER_LETTER},
+	{0x0019ac, 0x0019af, PG_U_UNASSIGNED},
+	{0x0019b0, 0x0019c9, PG_U_OTHER_LETTER},
+	{0x0019ca, 0x0019cf, PG_U_UNASSIGNED},
+	{0x0019d0, 0x0019d9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0019da, 0x0019da, PG_U_OTHER_NUMBER},
+	{0x0019db, 0x0019dd, PG_U_UNASSIGNED},
+	{0x0019de, 0x0019ff, PG_U_OTHER_SYMBOL},
+	{0x001a00, 0x001a16, PG_U_OTHER_LETTER},
+	{0x001a17, 0x001a18, PG_U_NON_SPACING_MARK},
+	{0x001a19, 0x001a1a, PG_U_COMBINING_SPACING_MARK},
+	{0x001a1b, 0x001a1b, PG_U_NON_SPACING_MARK},
+	{0x001a1c, 0x001a1d, PG_U_UNASSIGNED},
+	{0x001a1e, 0x001a1f, PG_U_OTHER_PUNCTUATION},
+	{0x001a20, 0x001a54, PG_U_OTHER_LETTER},
+	{0x001a55, 0x001a55, PG_U_COMBINING_SPACING_MARK},
+	{0x001a56, 0x001a56, PG_U_NON_SPACING_MARK},
+	{0x001a57, 0x001a57, PG_U_COMBINING_SPACING_MARK},
+	{0x001a58, 0x001a5e, PG_U_NON_SPACING_MARK},
+	{0x001a5f, 0x001a5f, PG_U_UNASSIGNED},
+	{0x001a60, 0x001a60, PG_U_NON_SPACING_MARK},
+	{0x001a61, 0x001a61, PG_U_COMBINING_SPACING_MARK},
+	{0x001a62, 0x001a62, PG_U_NON_SPACING_MARK},
+	{0x001a63, 0x001a64, PG_U_COMBINING_SPACING_MARK},
+	{0x001a65, 0x001a6c, PG_U_NON_SPACING_MARK},
+	{0x001a6d, 0x001a72, PG_U_COMBINING_SPACING_MARK},
+	{0x001a73, 0x001a7c, PG_U_NON_SPACING_MARK},
+	{0x001a7d, 0x001a7e, PG_U_UNASSIGNED},
+	{0x001a7f, 0x001a7f, PG_U_NON_SPACING_MARK},
+	{0x001a80, 0x001a89, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x001a8a, 0x001a8f, PG_U_UNASSIGNED},
+	{0x001a90, 0x001a99, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x001a9a, 0x001a9f, PG_U_UNASSIGNED},
+	{0x001aa0, 0x001aa6, PG_U_OTHER_PUNCTUATION},
+	{0x001aa7, 0x001aa7, PG_U_MODIFIER_LETTER},
+	{0x001aa8, 0x001aad, PG_U_OTHER_PUNCTUATION},
+	{0x001aae, 0x001aaf, PG_U_UNASSIGNED},
+	{0x001ab0, 0x001abd, PG_U_NON_SPACING_MARK},
+	{0x001abe, 0x001abe, PG_U_ENCLOSING_MARK},
+	{0x001abf, 0x001ace, PG_U_NON_SPACING_MARK},
+	{0x001acf, 0x001aff, PG_U_UNASSIGNED},
+	{0x001b00, 0x001b03, PG_U_NON_SPACING_MARK},
+	{0x001b04, 0x001b04, PG_U_COMBINING_SPACING_MARK},
+	{0x001b05, 0x001b33, PG_U_OTHER_LETTER},
+	{0x001b34, 0x001b34, PG_U_NON_SPACING_MARK},
+	{0x001b35, 0x001b35, PG_U_COMBINING_SPACING_MARK},
+	{0x001b36, 0x001b3a, PG_U_NON_SPACING_MARK},
+	{0x001b3b, 0x001b3b, PG_U_COMBINING_SPACING_MARK},
+	{0x001b3c, 0x001b3c, PG_U_NON_SPACING_MARK},
+	{0x001b3d, 0x001b41, PG_U_COMBINING_SPACING_MARK},
+	{0x001b42, 0x001b42, PG_U_NON_SPACING_MARK},
+	{0x001b43, 0x001b44, PG_U_COMBINING_SPACING_MARK},
+	{0x001b45, 0x001b4c, PG_U_OTHER_LETTER},
+	{0x001b4d, 0x001b4f, PG_U_UNASSIGNED},
+	{0x001b50, 0x001b59, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x001b5a, 0x001b60, PG_U_OTHER_PUNCTUATION},
+	{0x001b61, 0x001b6a, PG_U_OTHER_SYMBOL},
+	{0x001b6b, 0x001b73, PG_U_NON_SPACING_MARK},
+	{0x001b74, 0x001b7c, PG_U_OTHER_SYMBOL},
+	{0x001b7d, 0x001b7e, PG_U_OTHER_PUNCTUATION},
+	{0x001b7f, 0x001b7f, PG_U_UNASSIGNED},
+	{0x001b80, 0x001b81, PG_U_NON_SPACING_MARK},
+	{0x001b82, 0x001b82, PG_U_COMBINING_SPACING_MARK},
+	{0x001b83, 0x001ba0, PG_U_OTHER_LETTER},
+	{0x001ba1, 0x001ba1, PG_U_COMBINING_SPACING_MARK},
+	{0x001ba2, 0x001ba5, PG_U_NON_SPACING_MARK},
+	{0x001ba6, 0x001ba7, PG_U_COMBINING_SPACING_MARK},
+	{0x001ba8, 0x001ba9, PG_U_NON_SPACING_MARK},
+	{0x001baa, 0x001baa, PG_U_COMBINING_SPACING_MARK},
+	{0x001bab, 0x001bad, PG_U_NON_SPACING_MARK},
+	{0x001bae, 0x001baf, PG_U_OTHER_LETTER},
+	{0x001bb0, 0x001bb9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x001bba, 0x001be5, PG_U_OTHER_LETTER},
+	{0x001be6, 0x001be6, PG_U_NON_SPACING_MARK},
+	{0x001be7, 0x001be7, PG_U_COMBINING_SPACING_MARK},
+	{0x001be8, 0x001be9, PG_U_NON_SPACING_MARK},
+	{0x001bea, 0x001bec, PG_U_COMBINING_SPACING_MARK},
+	{0x001bed, 0x001bed, PG_U_NON_SPACING_MARK},
+	{0x001bee, 0x001bee, PG_U_COMBINING_SPACING_MARK},
+	{0x001bef, 0x001bf1, PG_U_NON_SPACING_MARK},
+	{0x001bf2, 0x001bf3, PG_U_COMBINING_SPACING_MARK},
+	{0x001bf4, 0x001bfb, PG_U_UNASSIGNED},
+	{0x001bfc, 0x001bff, PG_U_OTHER_PUNCTUATION},
+	{0x001c00, 0x001c23, PG_U_OTHER_LETTER},
+	{0x001c24, 0x001c2b, PG_U_COMBINING_SPACING_MARK},
+	{0x001c2c, 0x001c33, PG_U_NON_SPACING_MARK},
+	{0x001c34, 0x001c35, PG_U_COMBINING_SPACING_MARK},
+	{0x001c36, 0x001c37, PG_U_NON_SPACING_MARK},
+	{0x001c38, 0x001c3a, PG_U_UNASSIGNED},
+	{0x001c3b, 0x001c3f, PG_U_OTHER_PUNCTUATION},
+	{0x001c40, 0x001c49, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x001c4a, 0x001c4c, PG_U_UNASSIGNED},
+	{0x001c4d, 0x001c4f, PG_U_OTHER_LETTER},
+	{0x001c50, 0x001c59, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x001c5a, 0x001c77, PG_U_OTHER_LETTER},
+	{0x001c78, 0x001c7d, PG_U_MODIFIER_LETTER},
+	{0x001c7e, 0x001c7f, PG_U_OTHER_PUNCTUATION},
+	{0x001c80, 0x001c88, PG_U_LOWERCASE_LETTER},
+	{0x001c89, 0x001c8f, PG_U_UNASSIGNED},
+	{0x001c90, 0x001cba, PG_U_UPPERCASE_LETTER},
+	{0x001cbb, 0x001cbc, PG_U_UNASSIGNED},
+	{0x001cbd, 0x001cbf, PG_U_UPPERCASE_LETTER},
+	{0x001cc0, 0x001cc7, PG_U_OTHER_PUNCTUATION},
+	{0x001cc8, 0x001ccf, PG_U_UNASSIGNED},
+	{0x001cd0, 0x001cd2, PG_U_NON_SPACING_MARK},
+	{0x001cd3, 0x001cd3, PG_U_OTHER_PUNCTUATION},
+	{0x001cd4, 0x001ce0, PG_U_NON_SPACING_MARK},
+	{0x001ce1, 0x001ce1, PG_U_COMBINING_SPACING_MARK},
+	{0x001ce2, 0x001ce8, PG_U_NON_SPACING_MARK},
+	{0x001ce9, 0x001cec, PG_U_OTHER_LETTER},
+	{0x001ced, 0x001ced, PG_U_NON_SPACING_MARK},
+	{0x001cee, 0x001cf3, PG_U_OTHER_LETTER},
+	{0x001cf4, 0x001cf4, PG_U_NON_SPACING_MARK},
+	{0x001cf5, 0x001cf6, PG_U_OTHER_LETTER},
+	{0x001cf7, 0x001cf7, PG_U_COMBINING_SPACING_MARK},
+	{0x001cf8, 0x001cf9, PG_U_NON_SPACING_MARK},
+	{0x001cfa, 0x001cfa, PG_U_OTHER_LETTER},
+	{0x001cfb, 0x001cff, PG_U_UNASSIGNED},
+	{0x001d00, 0x001d2b, PG_U_LOWERCASE_LETTER},
+	{0x001d2c, 0x001d6a, PG_U_MODIFIER_LETTER},
+	{0x001d6b, 0x001d77, PG_U_LOWERCASE_LETTER},
+	{0x001d78, 0x001d78, PG_U_MODIFIER_LETTER},
+	{0x001d79, 0x001d9a, PG_U_LOWERCASE_LETTER},
+	{0x001d9b, 0x001dbf, PG_U_MODIFIER_LETTER},
+	{0x001dc0, 0x001dff, PG_U_NON_SPACING_MARK},
+	{0x001e00, 0x001e00, PG_U_UPPERCASE_LETTER},
+	{0x001e01, 0x001e01, PG_U_LOWERCASE_LETTER},
+	{0x001e02, 0x001e02, PG_U_UPPERCASE_LETTER},
+	{0x001e03, 0x001e03, PG_U_LOWERCASE_LETTER},
+	{0x001e04, 0x001e04, PG_U_UPPERCASE_LETTER},
+	{0x001e05, 0x001e05, PG_U_LOWERCASE_LETTER},
+	{0x001e06, 0x001e06, PG_U_UPPERCASE_LETTER},
+	{0x001e07, 0x001e07, PG_U_LOWERCASE_LETTER},
+	{0x001e08, 0x001e08, PG_U_UPPERCASE_LETTER},
+	{0x001e09, 0x001e09, PG_U_LOWERCASE_LETTER},
+	{0x001e0a, 0x001e0a, PG_U_UPPERCASE_LETTER},
+	{0x001e0b, 0x001e0b, PG_U_LOWERCASE_LETTER},
+	{0x001e0c, 0x001e0c, PG_U_UPPERCASE_LETTER},
+	{0x001e0d, 0x001e0d, PG_U_LOWERCASE_LETTER},
+	{0x001e0e, 0x001e0e, PG_U_UPPERCASE_LETTER},
+	{0x001e0f, 0x001e0f, PG_U_LOWERCASE_LETTER},
+	{0x001e10, 0x001e10, PG_U_UPPERCASE_LETTER},
+	{0x001e11, 0x001e11, PG_U_LOWERCASE_LETTER},
+	{0x001e12, 0x001e12, PG_U_UPPERCASE_LETTER},
+	{0x001e13, 0x001e13, PG_U_LOWERCASE_LETTER},
+	{0x001e14, 0x001e14, PG_U_UPPERCASE_LETTER},
+	{0x001e15, 0x001e15, PG_U_LOWERCASE_LETTER},
+	{0x001e16, 0x001e16, PG_U_UPPERCASE_LETTER},
+	{0x001e17, 0x001e17, PG_U_LOWERCASE_LETTER},
+	{0x001e18, 0x001e18, PG_U_UPPERCASE_LETTER},
+	{0x001e19, 0x001e19, PG_U_LOWERCASE_LETTER},
+	{0x001e1a, 0x001e1a, PG_U_UPPERCASE_LETTER},
+	{0x001e1b, 0x001e1b, PG_U_LOWERCASE_LETTER},
+	{0x001e1c, 0x001e1c, PG_U_UPPERCASE_LETTER},
+	{0x001e1d, 0x001e1d, PG_U_LOWERCASE_LETTER},
+	{0x001e1e, 0x001e1e, PG_U_UPPERCASE_LETTER},
+	{0x001e1f, 0x001e1f, PG_U_LOWERCASE_LETTER},
+	{0x001e20, 0x001e20, PG_U_UPPERCASE_LETTER},
+	{0x001e21, 0x001e21, PG_U_LOWERCASE_LETTER},
+	{0x001e22, 0x001e22, PG_U_UPPERCASE_LETTER},
+	{0x001e23, 0x001e23, PG_U_LOWERCASE_LETTER},
+	{0x001e24, 0x001e24, PG_U_UPPERCASE_LETTER},
+	{0x001e25, 0x001e25, PG_U_LOWERCASE_LETTER},
+	{0x001e26, 0x001e26, PG_U_UPPERCASE_LETTER},
+	{0x001e27, 0x001e27, PG_U_LOWERCASE_LETTER},
+	{0x001e28, 0x001e28, PG_U_UPPERCASE_LETTER},
+	{0x001e29, 0x001e29, PG_U_LOWERCASE_LETTER},
+	{0x001e2a, 0x001e2a, PG_U_UPPERCASE_LETTER},
+	{0x001e2b, 0x001e2b, PG_U_LOWERCASE_LETTER},
+	{0x001e2c, 0x001e2c, PG_U_UPPERCASE_LETTER},
+	{0x001e2d, 0x001e2d, PG_U_LOWERCASE_LETTER},
+	{0x001e2e, 0x001e2e, PG_U_UPPERCASE_LETTER},
+	{0x001e2f, 0x001e2f, PG_U_LOWERCASE_LETTER},
+	{0x001e30, 0x001e30, PG_U_UPPERCASE_LETTER},
+	{0x001e31, 0x001e31, PG_U_LOWERCASE_LETTER},
+	{0x001e32, 0x001e32, PG_U_UPPERCASE_LETTER},
+	{0x001e33, 0x001e33, PG_U_LOWERCASE_LETTER},
+	{0x001e34, 0x001e34, PG_U_UPPERCASE_LETTER},
+	{0x001e35, 0x001e35, PG_U_LOWERCASE_LETTER},
+	{0x001e36, 0x001e36, PG_U_UPPERCASE_LETTER},
+	{0x001e37, 0x001e37, PG_U_LOWERCASE_LETTER},
+	{0x001e38, 0x001e38, PG_U_UPPERCASE_LETTER},
+	{0x001e39, 0x001e39, PG_U_LOWERCASE_LETTER},
+	{0x001e3a, 0x001e3a, PG_U_UPPERCASE_LETTER},
+	{0x001e3b, 0x001e3b, PG_U_LOWERCASE_LETTER},
+	{0x001e3c, 0x001e3c, PG_U_UPPERCASE_LETTER},
+	{0x001e3d, 0x001e3d, PG_U_LOWERCASE_LETTER},
+	{0x001e3e, 0x001e3e, PG_U_UPPERCASE_LETTER},
+	{0x001e3f, 0x001e3f, PG_U_LOWERCASE_LETTER},
+	{0x001e40, 0x001e40, PG_U_UPPERCASE_LETTER},
+	{0x001e41, 0x001e41, PG_U_LOWERCASE_LETTER},
+	{0x001e42, 0x001e42, PG_U_UPPERCASE_LETTER},
+	{0x001e43, 0x001e43, PG_U_LOWERCASE_LETTER},
+	{0x001e44, 0x001e44, PG_U_UPPERCASE_LETTER},
+	{0x001e45, 0x001e45, PG_U_LOWERCASE_LETTER},
+	{0x001e46, 0x001e46, PG_U_UPPERCASE_LETTER},
+	{0x001e47, 0x001e47, PG_U_LOWERCASE_LETTER},
+	{0x001e48, 0x001e48, PG_U_UPPERCASE_LETTER},
+	{0x001e49, 0x001e49, PG_U_LOWERCASE_LETTER},
+	{0x001e4a, 0x001e4a, PG_U_UPPERCASE_LETTER},
+	{0x001e4b, 0x001e4b, PG_U_LOWERCASE_LETTER},
+	{0x001e4c, 0x001e4c, PG_U_UPPERCASE_LETTER},
+	{0x001e4d, 0x001e4d, PG_U_LOWERCASE_LETTER},
+	{0x001e4e, 0x001e4e, PG_U_UPPERCASE_LETTER},
+	{0x001e4f, 0x001e4f, PG_U_LOWERCASE_LETTER},
+	{0x001e50, 0x001e50, PG_U_UPPERCASE_LETTER},
+	{0x001e51, 0x001e51, PG_U_LOWERCASE_LETTER},
+	{0x001e52, 0x001e52, PG_U_UPPERCASE_LETTER},
+	{0x001e53, 0x001e53, PG_U_LOWERCASE_LETTER},
+	{0x001e54, 0x001e54, PG_U_UPPERCASE_LETTER},
+	{0x001e55, 0x001e55, PG_U_LOWERCASE_LETTER},
+	{0x001e56, 0x001e56, PG_U_UPPERCASE_LETTER},
+	{0x001e57, 0x001e57, PG_U_LOWERCASE_LETTER},
+	{0x001e58, 0x001e58, PG_U_UPPERCASE_LETTER},
+	{0x001e59, 0x001e59, PG_U_LOWERCASE_LETTER},
+	{0x001e5a, 0x001e5a, PG_U_UPPERCASE_LETTER},
+	{0x001e5b, 0x001e5b, PG_U_LOWERCASE_LETTER},
+	{0x001e5c, 0x001e5c, PG_U_UPPERCASE_LETTER},
+	{0x001e5d, 0x001e5d, PG_U_LOWERCASE_LETTER},
+	{0x001e5e, 0x001e5e, PG_U_UPPERCASE_LETTER},
+	{0x001e5f, 0x001e5f, PG_U_LOWERCASE_LETTER},
+	{0x001e60, 0x001e60, PG_U_UPPERCASE_LETTER},
+	{0x001e61, 0x001e61, PG_U_LOWERCASE_LETTER},
+	{0x001e62, 0x001e62, PG_U_UPPERCASE_LETTER},
+	{0x001e63, 0x001e63, PG_U_LOWERCASE_LETTER},
+	{0x001e64, 0x001e64, PG_U_UPPERCASE_LETTER},
+	{0x001e65, 0x001e65, PG_U_LOWERCASE_LETTER},
+	{0x001e66, 0x001e66, PG_U_UPPERCASE_LETTER},
+	{0x001e67, 0x001e67, PG_U_LOWERCASE_LETTER},
+	{0x001e68, 0x001e68, PG_U_UPPERCASE_LETTER},
+	{0x001e69, 0x001e69, PG_U_LOWERCASE_LETTER},
+	{0x001e6a, 0x001e6a, PG_U_UPPERCASE_LETTER},
+	{0x001e6b, 0x001e6b, PG_U_LOWERCASE_LETTER},
+	{0x001e6c, 0x001e6c, PG_U_UPPERCASE_LETTER},
+	{0x001e6d, 0x001e6d, PG_U_LOWERCASE_LETTER},
+	{0x001e6e, 0x001e6e, PG_U_UPPERCASE_LETTER},
+	{0x001e6f, 0x001e6f, PG_U_LOWERCASE_LETTER},
+	{0x001e70, 0x001e70, PG_U_UPPERCASE_LETTER},
+	{0x001e71, 0x001e71, PG_U_LOWERCASE_LETTER},
+	{0x001e72, 0x001e72, PG_U_UPPERCASE_LETTER},
+	{0x001e73, 0x001e73, PG_U_LOWERCASE_LETTER},
+	{0x001e74, 0x001e74, PG_U_UPPERCASE_LETTER},
+	{0x001e75, 0x001e75, PG_U_LOWERCASE_LETTER},
+	{0x001e76, 0x001e76, PG_U_UPPERCASE_LETTER},
+	{0x001e77, 0x001e77, PG_U_LOWERCASE_LETTER},
+	{0x001e78, 0x001e78, PG_U_UPPERCASE_LETTER},
+	{0x001e79, 0x001e79, PG_U_LOWERCASE_LETTER},
+	{0x001e7a, 0x001e7a, PG_U_UPPERCASE_LETTER},
+	{0x001e7b, 0x001e7b, PG_U_LOWERCASE_LETTER},
+	{0x001e7c, 0x001e7c, PG_U_UPPERCASE_LETTER},
+	{0x001e7d, 0x001e7d, PG_U_LOWERCASE_LETTER},
+	{0x001e7e, 0x001e7e, PG_U_UPPERCASE_LETTER},
+	{0x001e7f, 0x001e7f, PG_U_LOWERCASE_LETTER},
+	{0x001e80, 0x001e80, PG_U_UPPERCASE_LETTER},
+	{0x001e81, 0x001e81, PG_U_LOWERCASE_LETTER},
+	{0x001e82, 0x001e82, PG_U_UPPERCASE_LETTER},
+	{0x001e83, 0x001e83, PG_U_LOWERCASE_LETTER},
+	{0x001e84, 0x001e84, PG_U_UPPERCASE_LETTER},
+	{0x001e85, 0x001e85, PG_U_LOWERCASE_LETTER},
+	{0x001e86, 0x001e86, PG_U_UPPERCASE_LETTER},
+	{0x001e87, 0x001e87, PG_U_LOWERCASE_LETTER},
+	{0x001e88, 0x001e88, PG_U_UPPERCASE_LETTER},
+	{0x001e89, 0x001e89, PG_U_LOWERCASE_LETTER},
+	{0x001e8a, 0x001e8a, PG_U_UPPERCASE_LETTER},
+	{0x001e8b, 0x001e8b, PG_U_LOWERCASE_LETTER},
+	{0x001e8c, 0x001e8c, PG_U_UPPERCASE_LETTER},
+	{0x001e8d, 0x001e8d, PG_U_LOWERCASE_LETTER},
+	{0x001e8e, 0x001e8e, PG_U_UPPERCASE_LETTER},
+	{0x001e8f, 0x001e8f, PG_U_LOWERCASE_LETTER},
+	{0x001e90, 0x001e90, PG_U_UPPERCASE_LETTER},
+	{0x001e91, 0x001e91, PG_U_LOWERCASE_LETTER},
+	{0x001e92, 0x001e92, PG_U_UPPERCASE_LETTER},
+	{0x001e93, 0x001e93, PG_U_LOWERCASE_LETTER},
+	{0x001e94, 0x001e94, PG_U_UPPERCASE_LETTER},
+	{0x001e95, 0x001e9d, PG_U_LOWERCASE_LETTER},
+	{0x001e9e, 0x001e9e, PG_U_UPPERCASE_LETTER},
+	{0x001e9f, 0x001e9f, PG_U_LOWERCASE_LETTER},
+	{0x001ea0, 0x001ea0, PG_U_UPPERCASE_LETTER},
+	{0x001ea1, 0x001ea1, PG_U_LOWERCASE_LETTER},
+	{0x001ea2, 0x001ea2, PG_U_UPPERCASE_LETTER},
+	{0x001ea3, 0x001ea3, PG_U_LOWERCASE_LETTER},
+	{0x001ea4, 0x001ea4, PG_U_UPPERCASE_LETTER},
+	{0x001ea5, 0x001ea5, PG_U_LOWERCASE_LETTER},
+	{0x001ea6, 0x001ea6, PG_U_UPPERCASE_LETTER},
+	{0x001ea7, 0x001ea7, PG_U_LOWERCASE_LETTER},
+	{0x001ea8, 0x001ea8, PG_U_UPPERCASE_LETTER},
+	{0x001ea9, 0x001ea9, PG_U_LOWERCASE_LETTER},
+	{0x001eaa, 0x001eaa, PG_U_UPPERCASE_LETTER},
+	{0x001eab, 0x001eab, PG_U_LOWERCASE_LETTER},
+	{0x001eac, 0x001eac, PG_U_UPPERCASE_LETTER},
+	{0x001ead, 0x001ead, PG_U_LOWERCASE_LETTER},
+	{0x001eae, 0x001eae, PG_U_UPPERCASE_LETTER},
+	{0x001eaf, 0x001eaf, PG_U_LOWERCASE_LETTER},
+	{0x001eb0, 0x001eb0, PG_U_UPPERCASE_LETTER},
+	{0x001eb1, 0x001eb1, PG_U_LOWERCASE_LETTER},
+	{0x001eb2, 0x001eb2, PG_U_UPPERCASE_LETTER},
+	{0x001eb3, 0x001eb3, PG_U_LOWERCASE_LETTER},
+	{0x001eb4, 0x001eb4, PG_U_UPPERCASE_LETTER},
+	{0x001eb5, 0x001eb5, PG_U_LOWERCASE_LETTER},
+	{0x001eb6, 0x001eb6, PG_U_UPPERCASE_LETTER},
+	{0x001eb7, 0x001eb7, PG_U_LOWERCASE_LETTER},
+	{0x001eb8, 0x001eb8, PG_U_UPPERCASE_LETTER},
+	{0x001eb9, 0x001eb9, PG_U_LOWERCASE_LETTER},
+	{0x001eba, 0x001eba, PG_U_UPPERCASE_LETTER},
+	{0x001ebb, 0x001ebb, PG_U_LOWERCASE_LETTER},
+	{0x001ebc, 0x001ebc, PG_U_UPPERCASE_LETTER},
+	{0x001ebd, 0x001ebd, PG_U_LOWERCASE_LETTER},
+	{0x001ebe, 0x001ebe, PG_U_UPPERCASE_LETTER},
+	{0x001ebf, 0x001ebf, PG_U_LOWERCASE_LETTER},
+	{0x001ec0, 0x001ec0, PG_U_UPPERCASE_LETTER},
+	{0x001ec1, 0x001ec1, PG_U_LOWERCASE_LETTER},
+	{0x001ec2, 0x001ec2, PG_U_UPPERCASE_LETTER},
+	{0x001ec3, 0x001ec3, PG_U_LOWERCASE_LETTER},
+	{0x001ec4, 0x001ec4, PG_U_UPPERCASE_LETTER},
+	{0x001ec5, 0x001ec5, PG_U_LOWERCASE_LETTER},
+	{0x001ec6, 0x001ec6, PG_U_UPPERCASE_LETTER},
+	{0x001ec7, 0x001ec7, PG_U_LOWERCASE_LETTER},
+	{0x001ec8, 0x001ec8, PG_U_UPPERCASE_LETTER},
+	{0x001ec9, 0x001ec9, PG_U_LOWERCASE_LETTER},
+	{0x001eca, 0x001eca, PG_U_UPPERCASE_LETTER},
+	{0x001ecb, 0x001ecb, PG_U_LOWERCASE_LETTER},
+	{0x001ecc, 0x001ecc, PG_U_UPPERCASE_LETTER},
+	{0x001ecd, 0x001ecd, PG_U_LOWERCASE_LETTER},
+	{0x001ece, 0x001ece, PG_U_UPPERCASE_LETTER},
+	{0x001ecf, 0x001ecf, PG_U_LOWERCASE_LETTER},
+	{0x001ed0, 0x001ed0, PG_U_UPPERCASE_LETTER},
+	{0x001ed1, 0x001ed1, PG_U_LOWERCASE_LETTER},
+	{0x001ed2, 0x001ed2, PG_U_UPPERCASE_LETTER},
+	{0x001ed3, 0x001ed3, PG_U_LOWERCASE_LETTER},
+	{0x001ed4, 0x001ed4, PG_U_UPPERCASE_LETTER},
+	{0x001ed5, 0x001ed5, PG_U_LOWERCASE_LETTER},
+	{0x001ed6, 0x001ed6, PG_U_UPPERCASE_LETTER},
+	{0x001ed7, 0x001ed7, PG_U_LOWERCASE_LETTER},
+	{0x001ed8, 0x001ed8, PG_U_UPPERCASE_LETTER},
+	{0x001ed9, 0x001ed9, PG_U_LOWERCASE_LETTER},
+	{0x001eda, 0x001eda, PG_U_UPPERCASE_LETTER},
+	{0x001edb, 0x001edb, PG_U_LOWERCASE_LETTER},
+	{0x001edc, 0x001edc, PG_U_UPPERCASE_LETTER},
+	{0x001edd, 0x001edd, PG_U_LOWERCASE_LETTER},
+	{0x001ede, 0x001ede, PG_U_UPPERCASE_LETTER},
+	{0x001edf, 0x001edf, PG_U_LOWERCASE_LETTER},
+	{0x001ee0, 0x001ee0, PG_U_UPPERCASE_LETTER},
+	{0x001ee1, 0x001ee1, PG_U_LOWERCASE_LETTER},
+	{0x001ee2, 0x001ee2, PG_U_UPPERCASE_LETTER},
+	{0x001ee3, 0x001ee3, PG_U_LOWERCASE_LETTER},
+	{0x001ee4, 0x001ee4, PG_U_UPPERCASE_LETTER},
+	{0x001ee5, 0x001ee5, PG_U_LOWERCASE_LETTER},
+	{0x001ee6, 0x001ee6, PG_U_UPPERCASE_LETTER},
+	{0x001ee7, 0x001ee7, PG_U_LOWERCASE_LETTER},
+	{0x001ee8, 0x001ee8, PG_U_UPPERCASE_LETTER},
+	{0x001ee9, 0x001ee9, PG_U_LOWERCASE_LETTER},
+	{0x001eea, 0x001eea, PG_U_UPPERCASE_LETTER},
+	{0x001eeb, 0x001eeb, PG_U_LOWERCASE_LETTER},
+	{0x001eec, 0x001eec, PG_U_UPPERCASE_LETTER},
+	{0x001eed, 0x001eed, PG_U_LOWERCASE_LETTER},
+	{0x001eee, 0x001eee, PG_U_UPPERCASE_LETTER},
+	{0x001eef, 0x001eef, PG_U_LOWERCASE_LETTER},
+	{0x001ef0, 0x001ef0, PG_U_UPPERCASE_LETTER},
+	{0x001ef1, 0x001ef1, PG_U_LOWERCASE_LETTER},
+	{0x001ef2, 0x001ef2, PG_U_UPPERCASE_LETTER},
+	{0x001ef3, 0x001ef3, PG_U_LOWERCASE_LETTER},
+	{0x001ef4, 0x001ef4, PG_U_UPPERCASE_LETTER},
+	{0x001ef5, 0x001ef5, PG_U_LOWERCASE_LETTER},
+	{0x001ef6, 0x001ef6, PG_U_UPPERCASE_LETTER},
+	{0x001ef7, 0x001ef7, PG_U_LOWERCASE_LETTER},
+	{0x001ef8, 0x001ef8, PG_U_UPPERCASE_LETTER},
+	{0x001ef9, 0x001ef9, PG_U_LOWERCASE_LETTER},
+	{0x001efa, 0x001efa, PG_U_UPPERCASE_LETTER},
+	{0x001efb, 0x001efb, PG_U_LOWERCASE_LETTER},
+	{0x001efc, 0x001efc, PG_U_UPPERCASE_LETTER},
+	{0x001efd, 0x001efd, PG_U_LOWERCASE_LETTER},
+	{0x001efe, 0x001efe, PG_U_UPPERCASE_LETTER},
+	{0x001eff, 0x001f07, PG_U_LOWERCASE_LETTER},
+	{0x001f08, 0x001f0f, PG_U_UPPERCASE_LETTER},
+	{0x001f10, 0x001f15, PG_U_LOWERCASE_LETTER},
+	{0x001f16, 0x001f17, PG_U_UNASSIGNED},
+	{0x001f18, 0x001f1d, PG_U_UPPERCASE_LETTER},
+	{0x001f1e, 0x001f1f, PG_U_UNASSIGNED},
+	{0x001f20, 0x001f27, PG_U_LOWERCASE_LETTER},
+	{0x001f28, 0x001f2f, PG_U_UPPERCASE_LETTER},
+	{0x001f30, 0x001f37, PG_U_LOWERCASE_LETTER},
+	{0x001f38, 0x001f3f, PG_U_UPPERCASE_LETTER},
+	{0x001f40, 0x001f45, PG_U_LOWERCASE_LETTER},
+	{0x001f46, 0x001f47, PG_U_UNASSIGNED},
+	{0x001f48, 0x001f4d, PG_U_UPPERCASE_LETTER},
+	{0x001f4e, 0x001f4f, PG_U_UNASSIGNED},
+	{0x001f50, 0x001f57, PG_U_LOWERCASE_LETTER},
+	{0x001f58, 0x001f58, PG_U_UNASSIGNED},
+	{0x001f59, 0x001f59, PG_U_UPPERCASE_LETTER},
+	{0x001f5a, 0x001f5a, PG_U_UNASSIGNED},
+	{0x001f5b, 0x001f5b, PG_U_UPPERCASE_LETTER},
+	{0x001f5c, 0x001f5c, PG_U_UNASSIGNED},
+	{0x001f5d, 0x001f5d, PG_U_UPPERCASE_LETTER},
+	{0x001f5e, 0x001f5e, PG_U_UNASSIGNED},
+	{0x001f5f, 0x001f5f, PG_U_UPPERCASE_LETTER},
+	{0x001f60, 0x001f67, PG_U_LOWERCASE_LETTER},
+	{0x001f68, 0x001f6f, PG_U_UPPERCASE_LETTER},
+	{0x001f70, 0x001f7d, PG_U_LOWERCASE_LETTER},
+	{0x001f7e, 0x001f7f, PG_U_UNASSIGNED},
+	{0x001f80, 0x001f87, PG_U_LOWERCASE_LETTER},
+	{0x001f88, 0x001f8f, PG_U_TITLECASE_LETTER},
+	{0x001f90, 0x001f97, PG_U_LOWERCASE_LETTER},
+	{0x001f98, 0x001f9f, PG_U_TITLECASE_LETTER},
+	{0x001fa0, 0x001fa7, PG_U_LOWERCASE_LETTER},
+	{0x001fa8, 0x001faf, PG_U_TITLECASE_LETTER},
+	{0x001fb0, 0x001fb4, PG_U_LOWERCASE_LETTER},
+	{0x001fb5, 0x001fb5, PG_U_UNASSIGNED},
+	{0x001fb6, 0x001fb7, PG_U_LOWERCASE_LETTER},
+	{0x001fb8, 0x001fbb, PG_U_UPPERCASE_LETTER},
+	{0x001fbc, 0x001fbc, PG_U_TITLECASE_LETTER},
+	{0x001fbd, 0x001fbd, PG_U_MODIFIER_SYMBOL},
+	{0x001fbe, 0x001fbe, PG_U_LOWERCASE_LETTER},
+	{0x001fbf, 0x001fc1, PG_U_MODIFIER_SYMBOL},
+	{0x001fc2, 0x001fc4, PG_U_LOWERCASE_LETTER},
+	{0x001fc5, 0x001fc5, PG_U_UNASSIGNED},
+	{0x001fc6, 0x001fc7, PG_U_LOWERCASE_LETTER},
+	{0x001fc8, 0x001fcb, PG_U_UPPERCASE_LETTER},
+	{0x001fcc, 0x001fcc, PG_U_TITLECASE_LETTER},
+	{0x001fcd, 0x001fcf, PG_U_MODIFIER_SYMBOL},
+	{0x001fd0, 0x001fd3, PG_U_LOWERCASE_LETTER},
+	{0x001fd4, 0x001fd5, PG_U_UNASSIGNED},
+	{0x001fd6, 0x001fd7, PG_U_LOWERCASE_LETTER},
+	{0x001fd8, 0x001fdb, PG_U_UPPERCASE_LETTER},
+	{0x001fdc, 0x001fdc, PG_U_UNASSIGNED},
+	{0x001fdd, 0x001fdf, PG_U_MODIFIER_SYMBOL},
+	{0x001fe0, 0x001fe7, PG_U_LOWERCASE_LETTER},
+	{0x001fe8, 0x001fec, PG_U_UPPERCASE_LETTER},
+	{0x001fed, 0x001fef, PG_U_MODIFIER_SYMBOL},
+	{0x001ff0, 0x001ff1, PG_U_UNASSIGNED},
+	{0x001ff2, 0x001ff4, PG_U_LOWERCASE_LETTER},
+	{0x001ff5, 0x001ff5, PG_U_UNASSIGNED},
+	{0x001ff6, 0x001ff7, PG_U_LOWERCASE_LETTER},
+	{0x001ff8, 0x001ffb, PG_U_UPPERCASE_LETTER},
+	{0x001ffc, 0x001ffc, PG_U_TITLECASE_LETTER},
+	{0x001ffd, 0x001ffe, PG_U_MODIFIER_SYMBOL},
+	{0x001fff, 0x001fff, PG_U_UNASSIGNED},
+	{0x002000, 0x00200a, PG_U_SPACE_SEPARATOR},
+	{0x00200b, 0x00200f, PG_U_FORMAT_CHAR},
+	{0x002010, 0x002015, PG_U_DASH_PUNCTUATION},
+	{0x002016, 0x002017, PG_U_OTHER_PUNCTUATION},
+	{0x002018, 0x002018, PG_U_INITIAL_PUNCTUATION},
+	{0x002019, 0x002019, PG_U_FINAL_PUNCTUATION},
+	{0x00201a, 0x00201a, PG_U_START_PUNCTUATION},
+	{0x00201b, 0x00201c, PG_U_INITIAL_PUNCTUATION},
+	{0x00201d, 0x00201d, PG_U_FINAL_PUNCTUATION},
+	{0x00201e, 0x00201e, PG_U_START_PUNCTUATION},
+	{0x00201f, 0x00201f, PG_U_INITIAL_PUNCTUATION},
+	{0x002020, 0x002027, PG_U_OTHER_PUNCTUATION},
+	{0x002028, 0x002028, PG_U_LINE_SEPARATOR},
+	{0x002029, 0x002029, PG_U_PARAGRAPH_SEPARATOR},
+	{0x00202a, 0x00202e, PG_U_FORMAT_CHAR},
+	{0x00202f, 0x00202f, PG_U_SPACE_SEPARATOR},
+	{0x002030, 0x002038, PG_U_OTHER_PUNCTUATION},
+	{0x002039, 0x002039, PG_U_INITIAL_PUNCTUATION},
+	{0x00203a, 0x00203a, PG_U_FINAL_PUNCTUATION},
+	{0x00203b, 0x00203e, PG_U_OTHER_PUNCTUATION},
+	{0x00203f, 0x002040, PG_U_CONNECTOR_PUNCTUATION},
+	{0x002041, 0x002043, PG_U_OTHER_PUNCTUATION},
+	{0x002044, 0x002044, PG_U_MATH_SYMBOL},
+	{0x002045, 0x002045, PG_U_START_PUNCTUATION},
+	{0x002046, 0x002046, PG_U_END_PUNCTUATION},
+	{0x002047, 0x002051, PG_U_OTHER_PUNCTUATION},
+	{0x002052, 0x002052, PG_U_MATH_SYMBOL},
+	{0x002053, 0x002053, PG_U_OTHER_PUNCTUATION},
+	{0x002054, 0x002054, PG_U_CONNECTOR_PUNCTUATION},
+	{0x002055, 0x00205e, PG_U_OTHER_PUNCTUATION},
+	{0x00205f, 0x00205f, PG_U_SPACE_SEPARATOR},
+	{0x002060, 0x002064, PG_U_FORMAT_CHAR},
+	{0x002065, 0x002065, PG_U_UNASSIGNED},
+	{0x002066, 0x00206f, PG_U_FORMAT_CHAR},
+	{0x002070, 0x002070, PG_U_OTHER_NUMBER},
+	{0x002071, 0x002071, PG_U_MODIFIER_LETTER},
+	{0x002072, 0x002073, PG_U_UNASSIGNED},
+	{0x002074, 0x002079, PG_U_OTHER_NUMBER},
+	{0x00207a, 0x00207c, PG_U_MATH_SYMBOL},
+	{0x00207d, 0x00207d, PG_U_START_PUNCTUATION},
+	{0x00207e, 0x00207e, PG_U_END_PUNCTUATION},
+	{0x00207f, 0x00207f, PG_U_MODIFIER_LETTER},
+	{0x002080, 0x002089, PG_U_OTHER_NUMBER},
+	{0x00208a, 0x00208c, PG_U_MATH_SYMBOL},
+	{0x00208d, 0x00208d, PG_U_START_PUNCTUATION},
+	{0x00208e, 0x00208e, PG_U_END_PUNCTUATION},
+	{0x00208f, 0x00208f, PG_U_UNASSIGNED},
+	{0x002090, 0x00209c, PG_U_MODIFIER_LETTER},
+	{0x00209d, 0x00209f, PG_U_UNASSIGNED},
+	{0x0020a0, 0x0020c0, PG_U_CURRENCY_SYMBOL},
+	{0x0020c1, 0x0020cf, PG_U_UNASSIGNED},
+	{0x0020d0, 0x0020dc, PG_U_NON_SPACING_MARK},
+	{0x0020dd, 0x0020e0, PG_U_ENCLOSING_MARK},
+	{0x0020e1, 0x0020e1, PG_U_NON_SPACING_MARK},
+	{0x0020e2, 0x0020e4, PG_U_ENCLOSING_MARK},
+	{0x0020e5, 0x0020f0, PG_U_NON_SPACING_MARK},
+	{0x0020f1, 0x0020ff, PG_U_UNASSIGNED},
+	{0x002100, 0x002101, PG_U_OTHER_SYMBOL},
+	{0x002102, 0x002102, PG_U_UPPERCASE_LETTER},
+	{0x002103, 0x002106, PG_U_OTHER_SYMBOL},
+	{0x002107, 0x002107, PG_U_UPPERCASE_LETTER},
+	{0x002108, 0x002109, PG_U_OTHER_SYMBOL},
+	{0x00210a, 0x00210a, PG_U_LOWERCASE_LETTER},
+	{0x00210b, 0x00210d, PG_U_UPPERCASE_LETTER},
+	{0x00210e, 0x00210f, PG_U_LOWERCASE_LETTER},
+	{0x002110, 0x002112, PG_U_UPPERCASE_LETTER},
+	{0x002113, 0x002113, PG_U_LOWERCASE_LETTER},
+	{0x002114, 0x002114, PG_U_OTHER_SYMBOL},
+	{0x002115, 0x002115, PG_U_UPPERCASE_LETTER},
+	{0x002116, 0x002117, PG_U_OTHER_SYMBOL},
+	{0x002118, 0x002118, PG_U_MATH_SYMBOL},
+	{0x002119, 0x00211d, PG_U_UPPERCASE_LETTER},
+	{0x00211e, 0x002123, PG_U_OTHER_SYMBOL},
+	{0x002124, 0x002124, PG_U_UPPERCASE_LETTER},
+	{0x002125, 0x002125, PG_U_OTHER_SYMBOL},
+	{0x002126, 0x002126, PG_U_UPPERCASE_LETTER},
+	{0x002127, 0x002127, PG_U_OTHER_SYMBOL},
+	{0x002128, 0x002128, PG_U_UPPERCASE_LETTER},
+	{0x002129, 0x002129, PG_U_OTHER_SYMBOL},
+	{0x00212a, 0x00212d, PG_U_UPPERCASE_LETTER},
+	{0x00212e, 0x00212e, PG_U_OTHER_SYMBOL},
+	{0x00212f, 0x00212f, PG_U_LOWERCASE_LETTER},
+	{0x002130, 0x002133, PG_U_UPPERCASE_LETTER},
+	{0x002134, 0x002134, PG_U_LOWERCASE_LETTER},
+	{0x002135, 0x002138, PG_U_OTHER_LETTER},
+	{0x002139, 0x002139, PG_U_LOWERCASE_LETTER},
+	{0x00213a, 0x00213b, PG_U_OTHER_SYMBOL},
+	{0x00213c, 0x00213d, PG_U_LOWERCASE_LETTER},
+	{0x00213e, 0x00213f, PG_U_UPPERCASE_LETTER},
+	{0x002140, 0x002144, PG_U_MATH_SYMBOL},
+	{0x002145, 0x002145, PG_U_UPPERCASE_LETTER},
+	{0x002146, 0x002149, PG_U_LOWERCASE_LETTER},
+	{0x00214a, 0x00214a, PG_U_OTHER_SYMBOL},
+	{0x00214b, 0x00214b, PG_U_MATH_SYMBOL},
+	{0x00214c, 0x00214d, PG_U_OTHER_SYMBOL},
+	{0x00214e, 0x00214e, PG_U_LOWERCASE_LETTER},
+	{0x00214f, 0x00214f, PG_U_OTHER_SYMBOL},
+	{0x002150, 0x00215f, PG_U_OTHER_NUMBER},
+	{0x002160, 0x002182, PG_U_LETTER_NUMBER},
+	{0x002183, 0x002183, PG_U_UPPERCASE_LETTER},
+	{0x002184, 0x002184, PG_U_LOWERCASE_LETTER},
+	{0x002185, 0x002188, PG_U_LETTER_NUMBER},
+	{0x002189, 0x002189, PG_U_OTHER_NUMBER},
+	{0x00218a, 0x00218b, PG_U_OTHER_SYMBOL},
+	{0x00218c, 0x00218f, PG_U_UNASSIGNED},
+	{0x002190, 0x002194, PG_U_MATH_SYMBOL},
+	{0x002195, 0x002199, PG_U_OTHER_SYMBOL},
+	{0x00219a, 0x00219b, PG_U_MATH_SYMBOL},
+	{0x00219c, 0x00219f, PG_U_OTHER_SYMBOL},
+	{0x0021a0, 0x0021a0, PG_U_MATH_SYMBOL},
+	{0x0021a1, 0x0021a2, PG_U_OTHER_SYMBOL},
+	{0x0021a3, 0x0021a3, PG_U_MATH_SYMBOL},
+	{0x0021a4, 0x0021a5, PG_U_OTHER_SYMBOL},
+	{0x0021a6, 0x0021a6, PG_U_MATH_SYMBOL},
+	{0x0021a7, 0x0021ad, PG_U_OTHER_SYMBOL},
+	{0x0021ae, 0x0021ae, PG_U_MATH_SYMBOL},
+	{0x0021af, 0x0021cd, PG_U_OTHER_SYMBOL},
+	{0x0021ce, 0x0021cf, PG_U_MATH_SYMBOL},
+	{0x0021d0, 0x0021d1, PG_U_OTHER_SYMBOL},
+	{0x0021d2, 0x0021d2, PG_U_MATH_SYMBOL},
+	{0x0021d3, 0x0021d3, PG_U_OTHER_SYMBOL},
+	{0x0021d4, 0x0021d4, PG_U_MATH_SYMBOL},
+	{0x0021d5, 0x0021f3, PG_U_OTHER_SYMBOL},
+	{0x0021f4, 0x0022ff, PG_U_MATH_SYMBOL},
+	{0x002300, 0x002307, PG_U_OTHER_SYMBOL},
+	{0x002308, 0x002308, PG_U_START_PUNCTUATION},
+	{0x002309, 0x002309, PG_U_END_PUNCTUATION},
+	{0x00230a, 0x00230a, PG_U_START_PUNCTUATION},
+	{0x00230b, 0x00230b, PG_U_END_PUNCTUATION},
+	{0x00230c, 0x00231f, PG_U_OTHER_SYMBOL},
+	{0x002320, 0x002321, PG_U_MATH_SYMBOL},
+	{0x002322, 0x002328, PG_U_OTHER_SYMBOL},
+	{0x002329, 0x002329, PG_U_START_PUNCTUATION},
+	{0x00232a, 0x00232a, PG_U_END_PUNCTUATION},
+	{0x00232b, 0x00237b, PG_U_OTHER_SYMBOL},
+	{0x00237c, 0x00237c, PG_U_MATH_SYMBOL},
+	{0x00237d, 0x00239a, PG_U_OTHER_SYMBOL},
+	{0x00239b, 0x0023b3, PG_U_MATH_SYMBOL},
+	{0x0023b4, 0x0023db, PG_U_OTHER_SYMBOL},
+	{0x0023dc, 0x0023e1, PG_U_MATH_SYMBOL},
+	{0x0023e2, 0x002426, PG_U_OTHER_SYMBOL},
+	{0x002427, 0x00243f, PG_U_UNASSIGNED},
+	{0x002440, 0x00244a, PG_U_OTHER_SYMBOL},
+	{0x00244b, 0x00245f, PG_U_UNASSIGNED},
+	{0x002460, 0x00249b, PG_U_OTHER_NUMBER},
+	{0x00249c, 0x0024e9, PG_U_OTHER_SYMBOL},
+	{0x0024ea, 0x0024ff, PG_U_OTHER_NUMBER},
+	{0x002500, 0x0025b6, PG_U_OTHER_SYMBOL},
+	{0x0025b7, 0x0025b7, PG_U_MATH_SYMBOL},
+	{0x0025b8, 0x0025c0, PG_U_OTHER_SYMBOL},
+	{0x0025c1, 0x0025c1, PG_U_MATH_SYMBOL},
+	{0x0025c2, 0x0025f7, PG_U_OTHER_SYMBOL},
+	{0x0025f8, 0x0025ff, PG_U_MATH_SYMBOL},
+	{0x002600, 0x00266e, PG_U_OTHER_SYMBOL},
+	{0x00266f, 0x00266f, PG_U_MATH_SYMBOL},
+	{0x002670, 0x002767, PG_U_OTHER_SYMBOL},
+	{0x002768, 0x002768, PG_U_START_PUNCTUATION},
+	{0x002769, 0x002769, PG_U_END_PUNCTUATION},
+	{0x00276a, 0x00276a, PG_U_START_PUNCTUATION},
+	{0x00276b, 0x00276b, PG_U_END_PUNCTUATION},
+	{0x00276c, 0x00276c, PG_U_START_PUNCTUATION},
+	{0x00276d, 0x00276d, PG_U_END_PUNCTUATION},
+	{0x00276e, 0x00276e, PG_U_START_PUNCTUATION},
+	{0x00276f, 0x00276f, PG_U_END_PUNCTUATION},
+	{0x002770, 0x002770, PG_U_START_PUNCTUATION},
+	{0x002771, 0x002771, PG_U_END_PUNCTUATION},
+	{0x002772, 0x002772, PG_U_START_PUNCTUATION},
+	{0x002773, 0x002773, PG_U_END_PUNCTUATION},
+	{0x002774, 0x002774, PG_U_START_PUNCTUATION},
+	{0x002775, 0x002775, PG_U_END_PUNCTUATION},
+	{0x002776, 0x002793, PG_U_OTHER_NUMBER},
+	{0x002794, 0x0027bf, PG_U_OTHER_SYMBOL},
+	{0x0027c0, 0x0027c4, PG_U_MATH_SYMBOL},
+	{0x0027c5, 0x0027c5, PG_U_START_PUNCTUATION},
+	{0x0027c6, 0x0027c6, PG_U_END_PUNCTUATION},
+	{0x0027c7, 0x0027e5, PG_U_MATH_SYMBOL},
+	{0x0027e6, 0x0027e6, PG_U_START_PUNCTUATION},
+	{0x0027e7, 0x0027e7, PG_U_END_PUNCTUATION},
+	{0x0027e8, 0x0027e8, PG_U_START_PUNCTUATION},
+	{0x0027e9, 0x0027e9, PG_U_END_PUNCTUATION},
+	{0x0027ea, 0x0027ea, PG_U_START_PUNCTUATION},
+	{0x0027eb, 0x0027eb, PG_U_END_PUNCTUATION},
+	{0x0027ec, 0x0027ec, PG_U_START_PUNCTUATION},
+	{0x0027ed, 0x0027ed, PG_U_END_PUNCTUATION},
+	{0x0027ee, 0x0027ee, PG_U_START_PUNCTUATION},
+	{0x0027ef, 0x0027ef, PG_U_END_PUNCTUATION},
+	{0x0027f0, 0x0027ff, PG_U_MATH_SYMBOL},
+	{0x002800, 0x0028ff, PG_U_OTHER_SYMBOL},
+	{0x002900, 0x002982, PG_U_MATH_SYMBOL},
+	{0x002983, 0x002983, PG_U_START_PUNCTUATION},
+	{0x002984, 0x002984, PG_U_END_PUNCTUATION},
+	{0x002985, 0x002985, PG_U_START_PUNCTUATION},
+	{0x002986, 0x002986, PG_U_END_PUNCTUATION},
+	{0x002987, 0x002987, PG_U_START_PUNCTUATION},
+	{0x002988, 0x002988, PG_U_END_PUNCTUATION},
+	{0x002989, 0x002989, PG_U_START_PUNCTUATION},
+	{0x00298a, 0x00298a, PG_U_END_PUNCTUATION},
+	{0x00298b, 0x00298b, PG_U_START_PUNCTUATION},
+	{0x00298c, 0x00298c, PG_U_END_PUNCTUATION},
+	{0x00298d, 0x00298d, PG_U_START_PUNCTUATION},
+	{0x00298e, 0x00298e, PG_U_END_PUNCTUATION},
+	{0x00298f, 0x00298f, PG_U_START_PUNCTUATION},
+	{0x002990, 0x002990, PG_U_END_PUNCTUATION},
+	{0x002991, 0x002991, PG_U_START_PUNCTUATION},
+	{0x002992, 0x002992, PG_U_END_PUNCTUATION},
+	{0x002993, 0x002993, PG_U_START_PUNCTUATION},
+	{0x002994, 0x002994, PG_U_END_PUNCTUATION},
+	{0x002995, 0x002995, PG_U_START_PUNCTUATION},
+	{0x002996, 0x002996, PG_U_END_PUNCTUATION},
+	{0x002997, 0x002997, PG_U_START_PUNCTUATION},
+	{0x002998, 0x002998, PG_U_END_PUNCTUATION},
+	{0x002999, 0x0029d7, PG_U_MATH_SYMBOL},
+	{0x0029d8, 0x0029d8, PG_U_START_PUNCTUATION},
+	{0x0029d9, 0x0029d9, PG_U_END_PUNCTUATION},
+	{0x0029da, 0x0029da, PG_U_START_PUNCTUATION},
+	{0x0029db, 0x0029db, PG_U_END_PUNCTUATION},
+	{0x0029dc, 0x0029fb, PG_U_MATH_SYMBOL},
+	{0x0029fc, 0x0029fc, PG_U_START_PUNCTUATION},
+	{0x0029fd, 0x0029fd, PG_U_END_PUNCTUATION},
+	{0x0029fe, 0x002aff, PG_U_MATH_SYMBOL},
+	{0x002b00, 0x002b2f, PG_U_OTHER_SYMBOL},
+	{0x002b30, 0x002b44, PG_U_MATH_SYMBOL},
+	{0x002b45, 0x002b46, PG_U_OTHER_SYMBOL},
+	{0x002b47, 0x002b4c, PG_U_MATH_SYMBOL},
+	{0x002b4d, 0x002b73, PG_U_OTHER_SYMBOL},
+	{0x002b74, 0x002b75, PG_U_UNASSIGNED},
+	{0x002b76, 0x002b95, PG_U_OTHER_SYMBOL},
+	{0x002b96, 0x002b96, PG_U_UNASSIGNED},
+	{0x002b97, 0x002bff, PG_U_OTHER_SYMBOL},
+	{0x002c00, 0x002c2f, PG_U_UPPERCASE_LETTER},
+	{0x002c30, 0x002c5f, PG_U_LOWERCASE_LETTER},
+	{0x002c60, 0x002c60, PG_U_UPPERCASE_LETTER},
+	{0x002c61, 0x002c61, PG_U_LOWERCASE_LETTER},
+	{0x002c62, 0x002c64, PG_U_UPPERCASE_LETTER},
+	{0x002c65, 0x002c66, PG_U_LOWERCASE_LETTER},
+	{0x002c67, 0x002c67, PG_U_UPPERCASE_LETTER},
+	{0x002c68, 0x002c68, PG_U_LOWERCASE_LETTER},
+	{0x002c69, 0x002c69, PG_U_UPPERCASE_LETTER},
+	{0x002c6a, 0x002c6a, PG_U_LOWERCASE_LETTER},
+	{0x002c6b, 0x002c6b, PG_U_UPPERCASE_LETTER},
+	{0x002c6c, 0x002c6c, PG_U_LOWERCASE_LETTER},
+	{0x002c6d, 0x002c70, PG_U_UPPERCASE_LETTER},
+	{0x002c71, 0x002c71, PG_U_LOWERCASE_LETTER},
+	{0x002c72, 0x002c72, PG_U_UPPERCASE_LETTER},
+	{0x002c73, 0x002c74, PG_U_LOWERCASE_LETTER},
+	{0x002c75, 0x002c75, PG_U_UPPERCASE_LETTER},
+	{0x002c76, 0x002c7b, PG_U_LOWERCASE_LETTER},
+	{0x002c7c, 0x002c7d, PG_U_MODIFIER_LETTER},
+	{0x002c7e, 0x002c80, PG_U_UPPERCASE_LETTER},
+	{0x002c81, 0x002c81, PG_U_LOWERCASE_LETTER},
+	{0x002c82, 0x002c82, PG_U_UPPERCASE_LETTER},
+	{0x002c83, 0x002c83, PG_U_LOWERCASE_LETTER},
+	{0x002c84, 0x002c84, PG_U_UPPERCASE_LETTER},
+	{0x002c85, 0x002c85, PG_U_LOWERCASE_LETTER},
+	{0x002c86, 0x002c86, PG_U_UPPERCASE_LETTER},
+	{0x002c87, 0x002c87, PG_U_LOWERCASE_LETTER},
+	{0x002c88, 0x002c88, PG_U_UPPERCASE_LETTER},
+	{0x002c89, 0x002c89, PG_U_LOWERCASE_LETTER},
+	{0x002c8a, 0x002c8a, PG_U_UPPERCASE_LETTER},
+	{0x002c8b, 0x002c8b, PG_U_LOWERCASE_LETTER},
+	{0x002c8c, 0x002c8c, PG_U_UPPERCASE_LETTER},
+	{0x002c8d, 0x002c8d, PG_U_LOWERCASE_LETTER},
+	{0x002c8e, 0x002c8e, PG_U_UPPERCASE_LETTER},
+	{0x002c8f, 0x002c8f, PG_U_LOWERCASE_LETTER},
+	{0x002c90, 0x002c90, PG_U_UPPERCASE_LETTER},
+	{0x002c91, 0x002c91, PG_U_LOWERCASE_LETTER},
+	{0x002c92, 0x002c92, PG_U_UPPERCASE_LETTER},
+	{0x002c93, 0x002c93, PG_U_LOWERCASE_LETTER},
+	{0x002c94, 0x002c94, PG_U_UPPERCASE_LETTER},
+	{0x002c95, 0x002c95, PG_U_LOWERCASE_LETTER},
+	{0x002c96, 0x002c96, PG_U_UPPERCASE_LETTER},
+	{0x002c97, 0x002c97, PG_U_LOWERCASE_LETTER},
+	{0x002c98, 0x002c98, PG_U_UPPERCASE_LETTER},
+	{0x002c99, 0x002c99, PG_U_LOWERCASE_LETTER},
+	{0x002c9a, 0x002c9a, PG_U_UPPERCASE_LETTER},
+	{0x002c9b, 0x002c9b, PG_U_LOWERCASE_LETTER},
+	{0x002c9c, 0x002c9c, PG_U_UPPERCASE_LETTER},
+	{0x002c9d, 0x002c9d, PG_U_LOWERCASE_LETTER},
+	{0x002c9e, 0x002c9e, PG_U_UPPERCASE_LETTER},
+	{0x002c9f, 0x002c9f, PG_U_LOWERCASE_LETTER},
+	{0x002ca0, 0x002ca0, PG_U_UPPERCASE_LETTER},
+	{0x002ca1, 0x002ca1, PG_U_LOWERCASE_LETTER},
+	{0x002ca2, 0x002ca2, PG_U_UPPERCASE_LETTER},
+	{0x002ca3, 0x002ca3, PG_U_LOWERCASE_LETTER},
+	{0x002ca4, 0x002ca4, PG_U_UPPERCASE_LETTER},
+	{0x002ca5, 0x002ca5, PG_U_LOWERCASE_LETTER},
+	{0x002ca6, 0x002ca6, PG_U_UPPERCASE_LETTER},
+	{0x002ca7, 0x002ca7, PG_U_LOWERCASE_LETTER},
+	{0x002ca8, 0x002ca8, PG_U_UPPERCASE_LETTER},
+	{0x002ca9, 0x002ca9, PG_U_LOWERCASE_LETTER},
+	{0x002caa, 0x002caa, PG_U_UPPERCASE_LETTER},
+	{0x002cab, 0x002cab, PG_U_LOWERCASE_LETTER},
+	{0x002cac, 0x002cac, PG_U_UPPERCASE_LETTER},
+	{0x002cad, 0x002cad, PG_U_LOWERCASE_LETTER},
+	{0x002cae, 0x002cae, PG_U_UPPERCASE_LETTER},
+	{0x002caf, 0x002caf, PG_U_LOWERCASE_LETTER},
+	{0x002cb0, 0x002cb0, PG_U_UPPERCASE_LETTER},
+	{0x002cb1, 0x002cb1, PG_U_LOWERCASE_LETTER},
+	{0x002cb2, 0x002cb2, PG_U_UPPERCASE_LETTER},
+	{0x002cb3, 0x002cb3, PG_U_LOWERCASE_LETTER},
+	{0x002cb4, 0x002cb4, PG_U_UPPERCASE_LETTER},
+	{0x002cb5, 0x002cb5, PG_U_LOWERCASE_LETTER},
+	{0x002cb6, 0x002cb6, PG_U_UPPERCASE_LETTER},
+	{0x002cb7, 0x002cb7, PG_U_LOWERCASE_LETTER},
+	{0x002cb8, 0x002cb8, PG_U_UPPERCASE_LETTER},
+	{0x002cb9, 0x002cb9, PG_U_LOWERCASE_LETTER},
+	{0x002cba, 0x002cba, PG_U_UPPERCASE_LETTER},
+	{0x002cbb, 0x002cbb, PG_U_LOWERCASE_LETTER},
+	{0x002cbc, 0x002cbc, PG_U_UPPERCASE_LETTER},
+	{0x002cbd, 0x002cbd, PG_U_LOWERCASE_LETTER},
+	{0x002cbe, 0x002cbe, PG_U_UPPERCASE_LETTER},
+	{0x002cbf, 0x002cbf, PG_U_LOWERCASE_LETTER},
+	{0x002cc0, 0x002cc0, PG_U_UPPERCASE_LETTER},
+	{0x002cc1, 0x002cc1, PG_U_LOWERCASE_LETTER},
+	{0x002cc2, 0x002cc2, PG_U_UPPERCASE_LETTER},
+	{0x002cc3, 0x002cc3, PG_U_LOWERCASE_LETTER},
+	{0x002cc4, 0x002cc4, PG_U_UPPERCASE_LETTER},
+	{0x002cc5, 0x002cc5, PG_U_LOWERCASE_LETTER},
+	{0x002cc6, 0x002cc6, PG_U_UPPERCASE_LETTER},
+	{0x002cc7, 0x002cc7, PG_U_LOWERCASE_LETTER},
+	{0x002cc8, 0x002cc8, PG_U_UPPERCASE_LETTER},
+	{0x002cc9, 0x002cc9, PG_U_LOWERCASE_LETTER},
+	{0x002cca, 0x002cca, PG_U_UPPERCASE_LETTER},
+	{0x002ccb, 0x002ccb, PG_U_LOWERCASE_LETTER},
+	{0x002ccc, 0x002ccc, PG_U_UPPERCASE_LETTER},
+	{0x002ccd, 0x002ccd, PG_U_LOWERCASE_LETTER},
+	{0x002cce, 0x002cce, PG_U_UPPERCASE_LETTER},
+	{0x002ccf, 0x002ccf, PG_U_LOWERCASE_LETTER},
+	{0x002cd0, 0x002cd0, PG_U_UPPERCASE_LETTER},
+	{0x002cd1, 0x002cd1, PG_U_LOWERCASE_LETTER},
+	{0x002cd2, 0x002cd2, PG_U_UPPERCASE_LETTER},
+	{0x002cd3, 0x002cd3, PG_U_LOWERCASE_LETTER},
+	{0x002cd4, 0x002cd4, PG_U_UPPERCASE_LETTER},
+	{0x002cd5, 0x002cd5, PG_U_LOWERCASE_LETTER},
+	{0x002cd6, 0x002cd6, PG_U_UPPERCASE_LETTER},
+	{0x002cd7, 0x002cd7, PG_U_LOWERCASE_LETTER},
+	{0x002cd8, 0x002cd8, PG_U_UPPERCASE_LETTER},
+	{0x002cd9, 0x002cd9, PG_U_LOWERCASE_LETTER},
+	{0x002cda, 0x002cda, PG_U_UPPERCASE_LETTER},
+	{0x002cdb, 0x002cdb, PG_U_LOWERCASE_LETTER},
+	{0x002cdc, 0x002cdc, PG_U_UPPERCASE_LETTER},
+	{0x002cdd, 0x002cdd, PG_U_LOWERCASE_LETTER},
+	{0x002cde, 0x002cde, PG_U_UPPERCASE_LETTER},
+	{0x002cdf, 0x002cdf, PG_U_LOWERCASE_LETTER},
+	{0x002ce0, 0x002ce0, PG_U_UPPERCASE_LETTER},
+	{0x002ce1, 0x002ce1, PG_U_LOWERCASE_LETTER},
+	{0x002ce2, 0x002ce2, PG_U_UPPERCASE_LETTER},
+	{0x002ce3, 0x002ce4, PG_U_LOWERCASE_LETTER},
+	{0x002ce5, 0x002cea, PG_U_OTHER_SYMBOL},
+	{0x002ceb, 0x002ceb, PG_U_UPPERCASE_LETTER},
+	{0x002cec, 0x002cec, PG_U_LOWERCASE_LETTER},
+	{0x002ced, 0x002ced, PG_U_UPPERCASE_LETTER},
+	{0x002cee, 0x002cee, PG_U_LOWERCASE_LETTER},
+	{0x002cef, 0x002cf1, PG_U_NON_SPACING_MARK},
+	{0x002cf2, 0x002cf2, PG_U_UPPERCASE_LETTER},
+	{0x002cf3, 0x002cf3, PG_U_LOWERCASE_LETTER},
+	{0x002cf4, 0x002cf8, PG_U_UNASSIGNED},
+	{0x002cf9, 0x002cfc, PG_U_OTHER_PUNCTUATION},
+	{0x002cfd, 0x002cfd, PG_U_OTHER_NUMBER},
+	{0x002cfe, 0x002cff, PG_U_OTHER_PUNCTUATION},
+	{0x002d00, 0x002d25, PG_U_LOWERCASE_LETTER},
+	{0x002d26, 0x002d26, PG_U_UNASSIGNED},
+	{0x002d27, 0x002d27, PG_U_LOWERCASE_LETTER},
+	{0x002d28, 0x002d2c, PG_U_UNASSIGNED},
+	{0x002d2d, 0x002d2d, PG_U_LOWERCASE_LETTER},
+	{0x002d2e, 0x002d2f, PG_U_UNASSIGNED},
+	{0x002d30, 0x002d67, PG_U_OTHER_LETTER},
+	{0x002d68, 0x002d6e, PG_U_UNASSIGNED},
+	{0x002d6f, 0x002d6f, PG_U_MODIFIER_LETTER},
+	{0x002d70, 0x002d70, PG_U_OTHER_PUNCTUATION},
+	{0x002d71, 0x002d7e, PG_U_UNASSIGNED},
+	{0x002d7f, 0x002d7f, PG_U_NON_SPACING_MARK},
+	{0x002d80, 0x002d96, PG_U_OTHER_LETTER},
+	{0x002d97, 0x002d9f, PG_U_UNASSIGNED},
+	{0x002da0, 0x002da6, PG_U_OTHER_LETTER},
+	{0x002da7, 0x002da7, PG_U_UNASSIGNED},
+	{0x002da8, 0x002dae, PG_U_OTHER_LETTER},
+	{0x002daf, 0x002daf, PG_U_UNASSIGNED},
+	{0x002db0, 0x002db6, PG_U_OTHER_LETTER},
+	{0x002db7, 0x002db7, PG_U_UNASSIGNED},
+	{0x002db8, 0x002dbe, PG_U_OTHER_LETTER},
+	{0x002dbf, 0x002dbf, PG_U_UNASSIGNED},
+	{0x002dc0, 0x002dc6, PG_U_OTHER_LETTER},
+	{0x002dc7, 0x002dc7, PG_U_UNASSIGNED},
+	{0x002dc8, 0x002dce, PG_U_OTHER_LETTER},
+	{0x002dcf, 0x002dcf, PG_U_UNASSIGNED},
+	{0x002dd0, 0x002dd6, PG_U_OTHER_LETTER},
+	{0x002dd7, 0x002dd7, PG_U_UNASSIGNED},
+	{0x002dd8, 0x002dde, PG_U_OTHER_LETTER},
+	{0x002ddf, 0x002ddf, PG_U_UNASSIGNED},
+	{0x002de0, 0x002dff, PG_U_NON_SPACING_MARK},
+	{0x002e00, 0x002e01, PG_U_OTHER_PUNCTUATION},
+	{0x002e02, 0x002e02, PG_U_INITIAL_PUNCTUATION},
+	{0x002e03, 0x002e03, PG_U_FINAL_PUNCTUATION},
+	{0x002e04, 0x002e04, PG_U_INITIAL_PUNCTUATION},
+	{0x002e05, 0x002e05, PG_U_FINAL_PUNCTUATION},
+	{0x002e06, 0x002e08, PG_U_OTHER_PUNCTUATION},
+	{0x002e09, 0x002e09, PG_U_INITIAL_PUNCTUATION},
+	{0x002e0a, 0x002e0a, PG_U_FINAL_PUNCTUATION},
+	{0x002e0b, 0x002e0b, PG_U_OTHER_PUNCTUATION},
+	{0x002e0c, 0x002e0c, PG_U_INITIAL_PUNCTUATION},
+	{0x002e0d, 0x002e0d, PG_U_FINAL_PUNCTUATION},
+	{0x002e0e, 0x002e16, PG_U_OTHER_PUNCTUATION},
+	{0x002e17, 0x002e17, PG_U_DASH_PUNCTUATION},
+	{0x002e18, 0x002e19, PG_U_OTHER_PUNCTUATION},
+	{0x002e1a, 0x002e1a, PG_U_DASH_PUNCTUATION},
+	{0x002e1b, 0x002e1b, PG_U_OTHER_PUNCTUATION},
+	{0x002e1c, 0x002e1c, PG_U_INITIAL_PUNCTUATION},
+	{0x002e1d, 0x002e1d, PG_U_FINAL_PUNCTUATION},
+	{0x002e1e, 0x002e1f, PG_U_OTHER_PUNCTUATION},
+	{0x002e20, 0x002e20, PG_U_INITIAL_PUNCTUATION},
+	{0x002e21, 0x002e21, PG_U_FINAL_PUNCTUATION},
+	{0x002e22, 0x002e22, PG_U_START_PUNCTUATION},
+	{0x002e23, 0x002e23, PG_U_END_PUNCTUATION},
+	{0x002e24, 0x002e24, PG_U_START_PUNCTUATION},
+	{0x002e25, 0x002e25, PG_U_END_PUNCTUATION},
+	{0x002e26, 0x002e26, PG_U_START_PUNCTUATION},
+	{0x002e27, 0x002e27, PG_U_END_PUNCTUATION},
+	{0x002e28, 0x002e28, PG_U_START_PUNCTUATION},
+	{0x002e29, 0x002e29, PG_U_END_PUNCTUATION},
+	{0x002e2a, 0x002e2e, PG_U_OTHER_PUNCTUATION},
+	{0x002e2f, 0x002e2f, PG_U_MODIFIER_LETTER},
+	{0x002e30, 0x002e39, PG_U_OTHER_PUNCTUATION},
+	{0x002e3a, 0x002e3b, PG_U_DASH_PUNCTUATION},
+	{0x002e3c, 0x002e3f, PG_U_OTHER_PUNCTUATION},
+	{0x002e40, 0x002e40, PG_U_DASH_PUNCTUATION},
+	{0x002e41, 0x002e41, PG_U_OTHER_PUNCTUATION},
+	{0x002e42, 0x002e42, PG_U_START_PUNCTUATION},
+	{0x002e43, 0x002e4f, PG_U_OTHER_PUNCTUATION},
+	{0x002e50, 0x002e51, PG_U_OTHER_SYMBOL},
+	{0x002e52, 0x002e54, PG_U_OTHER_PUNCTUATION},
+	{0x002e55, 0x002e55, PG_U_START_PUNCTUATION},
+	{0x002e56, 0x002e56, PG_U_END_PUNCTUATION},
+	{0x002e57, 0x002e57, PG_U_START_PUNCTUATION},
+	{0x002e58, 0x002e58, PG_U_END_PUNCTUATION},
+	{0x002e59, 0x002e59, PG_U_START_PUNCTUATION},
+	{0x002e5a, 0x002e5a, PG_U_END_PUNCTUATION},
+	{0x002e5b, 0x002e5b, PG_U_START_PUNCTUATION},
+	{0x002e5c, 0x002e5c, PG_U_END_PUNCTUATION},
+	{0x002e5d, 0x002e5d, PG_U_DASH_PUNCTUATION},
+	{0x002e5e, 0x002e7f, PG_U_UNASSIGNED},
+	{0x002e80, 0x002e99, PG_U_OTHER_SYMBOL},
+	{0x002e9a, 0x002e9a, PG_U_UNASSIGNED},
+	{0x002e9b, 0x002ef3, PG_U_OTHER_SYMBOL},
+	{0x002ef4, 0x002eff, PG_U_UNASSIGNED},
+	{0x002f00, 0x002fd5, PG_U_OTHER_SYMBOL},
+	{0x002fd6, 0x002fef, PG_U_UNASSIGNED},
+	{0x002ff0, 0x002fff, PG_U_OTHER_SYMBOL},
+	{0x003000, 0x003000, PG_U_SPACE_SEPARATOR},
+	{0x003001, 0x003003, PG_U_OTHER_PUNCTUATION},
+	{0x003004, 0x003004, PG_U_OTHER_SYMBOL},
+	{0x003005, 0x003005, PG_U_MODIFIER_LETTER},
+	{0x003006, 0x003006, PG_U_OTHER_LETTER},
+	{0x003007, 0x003007, PG_U_LETTER_NUMBER},
+	{0x003008, 0x003008, PG_U_START_PUNCTUATION},
+	{0x003009, 0x003009, PG_U_END_PUNCTUATION},
+	{0x00300a, 0x00300a, PG_U_START_PUNCTUATION},
+	{0x00300b, 0x00300b, PG_U_END_PUNCTUATION},
+	{0x00300c, 0x00300c, PG_U_START_PUNCTUATION},
+	{0x00300d, 0x00300d, PG_U_END_PUNCTUATION},
+	{0x00300e, 0x00300e, PG_U_START_PUNCTUATION},
+	{0x00300f, 0x00300f, PG_U_END_PUNCTUATION},
+	{0x003010, 0x003010, PG_U_START_PUNCTUATION},
+	{0x003011, 0x003011, PG_U_END_PUNCTUATION},
+	{0x003012, 0x003013, PG_U_OTHER_SYMBOL},
+	{0x003014, 0x003014, PG_U_START_PUNCTUATION},
+	{0x003015, 0x003015, PG_U_END_PUNCTUATION},
+	{0x003016, 0x003016, PG_U_START_PUNCTUATION},
+	{0x003017, 0x003017, PG_U_END_PUNCTUATION},
+	{0x003018, 0x003018, PG_U_START_PUNCTUATION},
+	{0x003019, 0x003019, PG_U_END_PUNCTUATION},
+	{0x00301a, 0x00301a, PG_U_START_PUNCTUATION},
+	{0x00301b, 0x00301b, PG_U_END_PUNCTUATION},
+	{0x00301c, 0x00301c, PG_U_DASH_PUNCTUATION},
+	{0x00301d, 0x00301d, PG_U_START_PUNCTUATION},
+	{0x00301e, 0x00301f, PG_U_END_PUNCTUATION},
+	{0x003020, 0x003020, PG_U_OTHER_SYMBOL},
+	{0x003021, 0x003029, PG_U_LETTER_NUMBER},
+	{0x00302a, 0x00302d, PG_U_NON_SPACING_MARK},
+	{0x00302e, 0x00302f, PG_U_COMBINING_SPACING_MARK},
+	{0x003030, 0x003030, PG_U_DASH_PUNCTUATION},
+	{0x003031, 0x003035, PG_U_MODIFIER_LETTER},
+	{0x003036, 0x003037, PG_U_OTHER_SYMBOL},
+	{0x003038, 0x00303a, PG_U_LETTER_NUMBER},
+	{0x00303b, 0x00303b, PG_U_MODIFIER_LETTER},
+	{0x00303c, 0x00303c, PG_U_OTHER_LETTER},
+	{0x00303d, 0x00303d, PG_U_OTHER_PUNCTUATION},
+	{0x00303e, 0x00303f, PG_U_OTHER_SYMBOL},
+	{0x003040, 0x003040, PG_U_UNASSIGNED},
+	{0x003041, 0x003096, PG_U_OTHER_LETTER},
+	{0x003097, 0x003098, PG_U_UNASSIGNED},
+	{0x003099, 0x00309a, PG_U_NON_SPACING_MARK},
+	{0x00309b, 0x00309c, PG_U_MODIFIER_SYMBOL},
+	{0x00309d, 0x00309e, PG_U_MODIFIER_LETTER},
+	{0x00309f, 0x00309f, PG_U_OTHER_LETTER},
+	{0x0030a0, 0x0030a0, PG_U_DASH_PUNCTUATION},
+	{0x0030a1, 0x0030fa, PG_U_OTHER_LETTER},
+	{0x0030fb, 0x0030fb, PG_U_OTHER_PUNCTUATION},
+	{0x0030fc, 0x0030fe, PG_U_MODIFIER_LETTER},
+	{0x0030ff, 0x0030ff, PG_U_OTHER_LETTER},
+	{0x003100, 0x003104, PG_U_UNASSIGNED},
+	{0x003105, 0x00312f, PG_U_OTHER_LETTER},
+	{0x003130, 0x003130, PG_U_UNASSIGNED},
+	{0x003131, 0x00318e, PG_U_OTHER_LETTER},
+	{0x00318f, 0x00318f, PG_U_UNASSIGNED},
+	{0x003190, 0x003191, PG_U_OTHER_SYMBOL},
+	{0x003192, 0x003195, PG_U_OTHER_NUMBER},
+	{0x003196, 0x00319f, PG_U_OTHER_SYMBOL},
+	{0x0031a0, 0x0031bf, PG_U_OTHER_LETTER},
+	{0x0031c0, 0x0031e3, PG_U_OTHER_SYMBOL},
+	{0x0031e4, 0x0031ee, PG_U_UNASSIGNED},
+	{0x0031ef, 0x0031ef, PG_U_OTHER_SYMBOL},
+	{0x0031f0, 0x0031ff, PG_U_OTHER_LETTER},
+	{0x003200, 0x00321e, PG_U_OTHER_SYMBOL},
+	{0x00321f, 0x00321f, PG_U_UNASSIGNED},
+	{0x003220, 0x003229, PG_U_OTHER_NUMBER},
+	{0x00322a, 0x003247, PG_U_OTHER_SYMBOL},
+	{0x003248, 0x00324f, PG_U_OTHER_NUMBER},
+	{0x003250, 0x003250, PG_U_OTHER_SYMBOL},
+	{0x003251, 0x00325f, PG_U_OTHER_NUMBER},
+	{0x003260, 0x00327f, PG_U_OTHER_SYMBOL},
+	{0x003280, 0x003289, PG_U_OTHER_NUMBER},
+	{0x00328a, 0x0032b0, PG_U_OTHER_SYMBOL},
+	{0x0032b1, 0x0032bf, PG_U_OTHER_NUMBER},
+	{0x0032c0, 0x0033ff, PG_U_OTHER_SYMBOL},
+	{0x003400, 0x004dbf, PG_U_OTHER_LETTER},
+	{0x004dc0, 0x004dff, PG_U_OTHER_SYMBOL},
+	{0x004e00, 0x00a014, PG_U_OTHER_LETTER},
+	{0x00a015, 0x00a015, PG_U_MODIFIER_LETTER},
+	{0x00a016, 0x00a48c, PG_U_OTHER_LETTER},
+	{0x00a48d, 0x00a48f, PG_U_UNASSIGNED},
+	{0x00a490, 0x00a4c6, PG_U_OTHER_SYMBOL},
+	{0x00a4c7, 0x00a4cf, PG_U_UNASSIGNED},
+	{0x00a4d0, 0x00a4f7, PG_U_OTHER_LETTER},
+	{0x00a4f8, 0x00a4fd, PG_U_MODIFIER_LETTER},
+	{0x00a4fe, 0x00a4ff, PG_U_OTHER_PUNCTUATION},
+	{0x00a500, 0x00a60b, PG_U_OTHER_LETTER},
+	{0x00a60c, 0x00a60c, PG_U_MODIFIER_LETTER},
+	{0x00a60d, 0x00a60f, PG_U_OTHER_PUNCTUATION},
+	{0x00a610, 0x00a61f, PG_U_OTHER_LETTER},
+	{0x00a620, 0x00a629, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00a62a, 0x00a62b, PG_U_OTHER_LETTER},
+	{0x00a62c, 0x00a63f, PG_U_UNASSIGNED},
+	{0x00a640, 0x00a640, PG_U_UPPERCASE_LETTER},
+	{0x00a641, 0x00a641, PG_U_LOWERCASE_LETTER},
+	{0x00a642, 0x00a642, PG_U_UPPERCASE_LETTER},
+	{0x00a643, 0x00a643, PG_U_LOWERCASE_LETTER},
+	{0x00a644, 0x00a644, PG_U_UPPERCASE_LETTER},
+	{0x00a645, 0x00a645, PG_U_LOWERCASE_LETTER},
+	{0x00a646, 0x00a646, PG_U_UPPERCASE_LETTER},
+	{0x00a647, 0x00a647, PG_U_LOWERCASE_LETTER},
+	{0x00a648, 0x00a648, PG_U_UPPERCASE_LETTER},
+	{0x00a649, 0x00a649, PG_U_LOWERCASE_LETTER},
+	{0x00a64a, 0x00a64a, PG_U_UPPERCASE_LETTER},
+	{0x00a64b, 0x00a64b, PG_U_LOWERCASE_LETTER},
+	{0x00a64c, 0x00a64c, PG_U_UPPERCASE_LETTER},
+	{0x00a64d, 0x00a64d, PG_U_LOWERCASE_LETTER},
+	{0x00a64e, 0x00a64e, PG_U_UPPERCASE_LETTER},
+	{0x00a64f, 0x00a64f, PG_U_LOWERCASE_LETTER},
+	{0x00a650, 0x00a650, PG_U_UPPERCASE_LETTER},
+	{0x00a651, 0x00a651, PG_U_LOWERCASE_LETTER},
+	{0x00a652, 0x00a652, PG_U_UPPERCASE_LETTER},
+	{0x00a653, 0x00a653, PG_U_LOWERCASE_LETTER},
+	{0x00a654, 0x00a654, PG_U_UPPERCASE_LETTER},
+	{0x00a655, 0x00a655, PG_U_LOWERCASE_LETTER},
+	{0x00a656, 0x00a656, PG_U_UPPERCASE_LETTER},
+	{0x00a657, 0x00a657, PG_U_LOWERCASE_LETTER},
+	{0x00a658, 0x00a658, PG_U_UPPERCASE_LETTER},
+	{0x00a659, 0x00a659, PG_U_LOWERCASE_LETTER},
+	{0x00a65a, 0x00a65a, PG_U_UPPERCASE_LETTER},
+	{0x00a65b, 0x00a65b, PG_U_LOWERCASE_LETTER},
+	{0x00a65c, 0x00a65c, PG_U_UPPERCASE_LETTER},
+	{0x00a65d, 0x00a65d, PG_U_LOWERCASE_LETTER},
+	{0x00a65e, 0x00a65e, PG_U_UPPERCASE_LETTER},
+	{0x00a65f, 0x00a65f, PG_U_LOWERCASE_LETTER},
+	{0x00a660, 0x00a660, PG_U_UPPERCASE_LETTER},
+	{0x00a661, 0x00a661, PG_U_LOWERCASE_LETTER},
+	{0x00a662, 0x00a662, PG_U_UPPERCASE_LETTER},
+	{0x00a663, 0x00a663, PG_U_LOWERCASE_LETTER},
+	{0x00a664, 0x00a664, PG_U_UPPERCASE_LETTER},
+	{0x00a665, 0x00a665, PG_U_LOWERCASE_LETTER},
+	{0x00a666, 0x00a666, PG_U_UPPERCASE_LETTER},
+	{0x00a667, 0x00a667, PG_U_LOWERCASE_LETTER},
+	{0x00a668, 0x00a668, PG_U_UPPERCASE_LETTER},
+	{0x00a669, 0x00a669, PG_U_LOWERCASE_LETTER},
+	{0x00a66a, 0x00a66a, PG_U_UPPERCASE_LETTER},
+	{0x00a66b, 0x00a66b, PG_U_LOWERCASE_LETTER},
+	{0x00a66c, 0x00a66c, PG_U_UPPERCASE_LETTER},
+	{0x00a66d, 0x00a66d, PG_U_LOWERCASE_LETTER},
+	{0x00a66e, 0x00a66e, PG_U_OTHER_LETTER},
+	{0x00a66f, 0x00a66f, PG_U_NON_SPACING_MARK},
+	{0x00a670, 0x00a672, PG_U_ENCLOSING_MARK},
+	{0x00a673, 0x00a673, PG_U_OTHER_PUNCTUATION},
+	{0x00a674, 0x00a67d, PG_U_NON_SPACING_MARK},
+	{0x00a67e, 0x00a67e, PG_U_OTHER_PUNCTUATION},
+	{0x00a67f, 0x00a67f, PG_U_MODIFIER_LETTER},
+	{0x00a680, 0x00a680, PG_U_UPPERCASE_LETTER},
+	{0x00a681, 0x00a681, PG_U_LOWERCASE_LETTER},
+	{0x00a682, 0x00a682, PG_U_UPPERCASE_LETTER},
+	{0x00a683, 0x00a683, PG_U_LOWERCASE_LETTER},
+	{0x00a684, 0x00a684, PG_U_UPPERCASE_LETTER},
+	{0x00a685, 0x00a685, PG_U_LOWERCASE_LETTER},
+	{0x00a686, 0x00a686, PG_U_UPPERCASE_LETTER},
+	{0x00a687, 0x00a687, PG_U_LOWERCASE_LETTER},
+	{0x00a688, 0x00a688, PG_U_UPPERCASE_LETTER},
+	{0x00a689, 0x00a689, PG_U_LOWERCASE_LETTER},
+	{0x00a68a, 0x00a68a, PG_U_UPPERCASE_LETTER},
+	{0x00a68b, 0x00a68b, PG_U_LOWERCASE_LETTER},
+	{0x00a68c, 0x00a68c, PG_U_UPPERCASE_LETTER},
+	{0x00a68d, 0x00a68d, PG_U_LOWERCASE_LETTER},
+	{0x00a68e, 0x00a68e, PG_U_UPPERCASE_LETTER},
+	{0x00a68f, 0x00a68f, PG_U_LOWERCASE_LETTER},
+	{0x00a690, 0x00a690, PG_U_UPPERCASE_LETTER},
+	{0x00a691, 0x00a691, PG_U_LOWERCASE_LETTER},
+	{0x00a692, 0x00a692, PG_U_UPPERCASE_LETTER},
+	{0x00a693, 0x00a693, PG_U_LOWERCASE_LETTER},
+	{0x00a694, 0x00a694, PG_U_UPPERCASE_LETTER},
+	{0x00a695, 0x00a695, PG_U_LOWERCASE_LETTER},
+	{0x00a696, 0x00a696, PG_U_UPPERCASE_LETTER},
+	{0x00a697, 0x00a697, PG_U_LOWERCASE_LETTER},
+	{0x00a698, 0x00a698, PG_U_UPPERCASE_LETTER},
+	{0x00a699, 0x00a699, PG_U_LOWERCASE_LETTER},
+	{0x00a69a, 0x00a69a, PG_U_UPPERCASE_LETTER},
+	{0x00a69b, 0x00a69b, PG_U_LOWERCASE_LETTER},
+	{0x00a69c, 0x00a69d, PG_U_MODIFIER_LETTER},
+	{0x00a69e, 0x00a69f, PG_U_NON_SPACING_MARK},
+	{0x00a6a0, 0x00a6e5, PG_U_OTHER_LETTER},
+	{0x00a6e6, 0x00a6ef, PG_U_LETTER_NUMBER},
+	{0x00a6f0, 0x00a6f1, PG_U_NON_SPACING_MARK},
+	{0x00a6f2, 0x00a6f7, PG_U_OTHER_PUNCTUATION},
+	{0x00a6f8, 0x00a6ff, PG_U_UNASSIGNED},
+	{0x00a700, 0x00a716, PG_U_MODIFIER_SYMBOL},
+	{0x00a717, 0x00a71f, PG_U_MODIFIER_LETTER},
+	{0x00a720, 0x00a721, PG_U_MODIFIER_SYMBOL},
+	{0x00a722, 0x00a722, PG_U_UPPERCASE_LETTER},
+	{0x00a723, 0x00a723, PG_U_LOWERCASE_LETTER},
+	{0x00a724, 0x00a724, PG_U_UPPERCASE_LETTER},
+	{0x00a725, 0x00a725, PG_U_LOWERCASE_LETTER},
+	{0x00a726, 0x00a726, PG_U_UPPERCASE_LETTER},
+	{0x00a727, 0x00a727, PG_U_LOWERCASE_LETTER},
+	{0x00a728, 0x00a728, PG_U_UPPERCASE_LETTER},
+	{0x00a729, 0x00a729, PG_U_LOWERCASE_LETTER},
+	{0x00a72a, 0x00a72a, PG_U_UPPERCASE_LETTER},
+	{0x00a72b, 0x00a72b, PG_U_LOWERCASE_LETTER},
+	{0x00a72c, 0x00a72c, PG_U_UPPERCASE_LETTER},
+	{0x00a72d, 0x00a72d, PG_U_LOWERCASE_LETTER},
+	{0x00a72e, 0x00a72e, PG_U_UPPERCASE_LETTER},
+	{0x00a72f, 0x00a731, PG_U_LOWERCASE_LETTER},
+	{0x00a732, 0x00a732, PG_U_UPPERCASE_LETTER},
+	{0x00a733, 0x00a733, PG_U_LOWERCASE_LETTER},
+	{0x00a734, 0x00a734, PG_U_UPPERCASE_LETTER},
+	{0x00a735, 0x00a735, PG_U_LOWERCASE_LETTER},
+	{0x00a736, 0x00a736, PG_U_UPPERCASE_LETTER},
+	{0x00a737, 0x00a737, PG_U_LOWERCASE_LETTER},
+	{0x00a738, 0x00a738, PG_U_UPPERCASE_LETTER},
+	{0x00a739, 0x00a739, PG_U_LOWERCASE_LETTER},
+	{0x00a73a, 0x00a73a, PG_U_UPPERCASE_LETTER},
+	{0x00a73b, 0x00a73b, PG_U_LOWERCASE_LETTER},
+	{0x00a73c, 0x00a73c, PG_U_UPPERCASE_LETTER},
+	{0x00a73d, 0x00a73d, PG_U_LOWERCASE_LETTER},
+	{0x00a73e, 0x00a73e, PG_U_UPPERCASE_LETTER},
+	{0x00a73f, 0x00a73f, PG_U_LOWERCASE_LETTER},
+	{0x00a740, 0x00a740, PG_U_UPPERCASE_LETTER},
+	{0x00a741, 0x00a741, PG_U_LOWERCASE_LETTER},
+	{0x00a742, 0x00a742, PG_U_UPPERCASE_LETTER},
+	{0x00a743, 0x00a743, PG_U_LOWERCASE_LETTER},
+	{0x00a744, 0x00a744, PG_U_UPPERCASE_LETTER},
+	{0x00a745, 0x00a745, PG_U_LOWERCASE_LETTER},
+	{0x00a746, 0x00a746, PG_U_UPPERCASE_LETTER},
+	{0x00a747, 0x00a747, PG_U_LOWERCASE_LETTER},
+	{0x00a748, 0x00a748, PG_U_UPPERCASE_LETTER},
+	{0x00a749, 0x00a749, PG_U_LOWERCASE_LETTER},
+	{0x00a74a, 0x00a74a, PG_U_UPPERCASE_LETTER},
+	{0x00a74b, 0x00a74b, PG_U_LOWERCASE_LETTER},
+	{0x00a74c, 0x00a74c, PG_U_UPPERCASE_LETTER},
+	{0x00a74d, 0x00a74d, PG_U_LOWERCASE_LETTER},
+	{0x00a74e, 0x00a74e, PG_U_UPPERCASE_LETTER},
+	{0x00a74f, 0x00a74f, PG_U_LOWERCASE_LETTER},
+	{0x00a750, 0x00a750, PG_U_UPPERCASE_LETTER},
+	{0x00a751, 0x00a751, PG_U_LOWERCASE_LETTER},
+	{0x00a752, 0x00a752, PG_U_UPPERCASE_LETTER},
+	{0x00a753, 0x00a753, PG_U_LOWERCASE_LETTER},
+	{0x00a754, 0x00a754, PG_U_UPPERCASE_LETTER},
+	{0x00a755, 0x00a755, PG_U_LOWERCASE_LETTER},
+	{0x00a756, 0x00a756, PG_U_UPPERCASE_LETTER},
+	{0x00a757, 0x00a757, PG_U_LOWERCASE_LETTER},
+	{0x00a758, 0x00a758, PG_U_UPPERCASE_LETTER},
+	{0x00a759, 0x00a759, PG_U_LOWERCASE_LETTER},
+	{0x00a75a, 0x00a75a, PG_U_UPPERCASE_LETTER},
+	{0x00a75b, 0x00a75b, PG_U_LOWERCASE_LETTER},
+	{0x00a75c, 0x00a75c, PG_U_UPPERCASE_LETTER},
+	{0x00a75d, 0x00a75d, PG_U_LOWERCASE_LETTER},
+	{0x00a75e, 0x00a75e, PG_U_UPPERCASE_LETTER},
+	{0x00a75f, 0x00a75f, PG_U_LOWERCASE_LETTER},
+	{0x00a760, 0x00a760, PG_U_UPPERCASE_LETTER},
+	{0x00a761, 0x00a761, PG_U_LOWERCASE_LETTER},
+	{0x00a762, 0x00a762, PG_U_UPPERCASE_LETTER},
+	{0x00a763, 0x00a763, PG_U_LOWERCASE_LETTER},
+	{0x00a764, 0x00a764, PG_U_UPPERCASE_LETTER},
+	{0x00a765, 0x00a765, PG_U_LOWERCASE_LETTER},
+	{0x00a766, 0x00a766, PG_U_UPPERCASE_LETTER},
+	{0x00a767, 0x00a767, PG_U_LOWERCASE_LETTER},
+	{0x00a768, 0x00a768, PG_U_UPPERCASE_LETTER},
+	{0x00a769, 0x00a769, PG_U_LOWERCASE_LETTER},
+	{0x00a76a, 0x00a76a, PG_U_UPPERCASE_LETTER},
+	{0x00a76b, 0x00a76b, PG_U_LOWERCASE_LETTER},
+	{0x00a76c, 0x00a76c, PG_U_UPPERCASE_LETTER},
+	{0x00a76d, 0x00a76d, PG_U_LOWERCASE_LETTER},
+	{0x00a76e, 0x00a76e, PG_U_UPPERCASE_LETTER},
+	{0x00a76f, 0x00a76f, PG_U_LOWERCASE_LETTER},
+	{0x00a770, 0x00a770, PG_U_MODIFIER_LETTER},
+	{0x00a771, 0x00a778, PG_U_LOWERCASE_LETTER},
+	{0x00a779, 0x00a779, PG_U_UPPERCASE_LETTER},
+	{0x00a77a, 0x00a77a, PG_U_LOWERCASE_LETTER},
+	{0x00a77b, 0x00a77b, PG_U_UPPERCASE_LETTER},
+	{0x00a77c, 0x00a77c, PG_U_LOWERCASE_LETTER},
+	{0x00a77d, 0x00a77e, PG_U_UPPERCASE_LETTER},
+	{0x00a77f, 0x00a77f, PG_U_LOWERCASE_LETTER},
+	{0x00a780, 0x00a780, PG_U_UPPERCASE_LETTER},
+	{0x00a781, 0x00a781, PG_U_LOWERCASE_LETTER},
+	{0x00a782, 0x00a782, PG_U_UPPERCASE_LETTER},
+	{0x00a783, 0x00a783, PG_U_LOWERCASE_LETTER},
+	{0x00a784, 0x00a784, PG_U_UPPERCASE_LETTER},
+	{0x00a785, 0x00a785, PG_U_LOWERCASE_LETTER},
+	{0x00a786, 0x00a786, PG_U_UPPERCASE_LETTER},
+	{0x00a787, 0x00a787, PG_U_LOWERCASE_LETTER},
+	{0x00a788, 0x00a788, PG_U_MODIFIER_LETTER},
+	{0x00a789, 0x00a78a, PG_U_MODIFIER_SYMBOL},
+	{0x00a78b, 0x00a78b, PG_U_UPPERCASE_LETTER},
+	{0x00a78c, 0x00a78c, PG_U_LOWERCASE_LETTER},
+	{0x00a78d, 0x00a78d, PG_U_UPPERCASE_LETTER},
+	{0x00a78e, 0x00a78e, PG_U_LOWERCASE_LETTER},
+	{0x00a78f, 0x00a78f, PG_U_OTHER_LETTER},
+	{0x00a790, 0x00a790, PG_U_UPPERCASE_LETTER},
+	{0x00a791, 0x00a791, PG_U_LOWERCASE_LETTER},
+	{0x00a792, 0x00a792, PG_U_UPPERCASE_LETTER},
+	{0x00a793, 0x00a795, PG_U_LOWERCASE_LETTER},
+	{0x00a796, 0x00a796, PG_U_UPPERCASE_LETTER},
+	{0x00a797, 0x00a797, PG_U_LOWERCASE_LETTER},
+	{0x00a798, 0x00a798, PG_U_UPPERCASE_LETTER},
+	{0x00a799, 0x00a799, PG_U_LOWERCASE_LETTER},
+	{0x00a79a, 0x00a79a, PG_U_UPPERCASE_LETTER},
+	{0x00a79b, 0x00a79b, PG_U_LOWERCASE_LETTER},
+	{0x00a79c, 0x00a79c, PG_U_UPPERCASE_LETTER},
+	{0x00a79d, 0x00a79d, PG_U_LOWERCASE_LETTER},
+	{0x00a79e, 0x00a79e, PG_U_UPPERCASE_LETTER},
+	{0x00a79f, 0x00a79f, PG_U_LOWERCASE_LETTER},
+	{0x00a7a0, 0x00a7a0, PG_U_UPPERCASE_LETTER},
+	{0x00a7a1, 0x00a7a1, PG_U_LOWERCASE_LETTER},
+	{0x00a7a2, 0x00a7a2, PG_U_UPPERCASE_LETTER},
+	{0x00a7a3, 0x00a7a3, PG_U_LOWERCASE_LETTER},
+	{0x00a7a4, 0x00a7a4, PG_U_UPPERCASE_LETTER},
+	{0x00a7a5, 0x00a7a5, PG_U_LOWERCASE_LETTER},
+	{0x00a7a6, 0x00a7a6, PG_U_UPPERCASE_LETTER},
+	{0x00a7a7, 0x00a7a7, PG_U_LOWERCASE_LETTER},
+	{0x00a7a8, 0x00a7a8, PG_U_UPPERCASE_LETTER},
+	{0x00a7a9, 0x00a7a9, PG_U_LOWERCASE_LETTER},
+	{0x00a7aa, 0x00a7ae, PG_U_UPPERCASE_LETTER},
+	{0x00a7af, 0x00a7af, PG_U_LOWERCASE_LETTER},
+	{0x00a7b0, 0x00a7b4, PG_U_UPPERCASE_LETTER},
+	{0x00a7b5, 0x00a7b5, PG_U_LOWERCASE_LETTER},
+	{0x00a7b6, 0x00a7b6, PG_U_UPPERCASE_LETTER},
+	{0x00a7b7, 0x00a7b7, PG_U_LOWERCASE_LETTER},
+	{0x00a7b8, 0x00a7b8, PG_U_UPPERCASE_LETTER},
+	{0x00a7b9, 0x00a7b9, PG_U_LOWERCASE_LETTER},
+	{0x00a7ba, 0x00a7ba, PG_U_UPPERCASE_LETTER},
+	{0x00a7bb, 0x00a7bb, PG_U_LOWERCASE_LETTER},
+	{0x00a7bc, 0x00a7bc, PG_U_UPPERCASE_LETTER},
+	{0x00a7bd, 0x00a7bd, PG_U_LOWERCASE_LETTER},
+	{0x00a7be, 0x00a7be, PG_U_UPPERCASE_LETTER},
+	{0x00a7bf, 0x00a7bf, PG_U_LOWERCASE_LETTER},
+	{0x00a7c0, 0x00a7c0, PG_U_UPPERCASE_LETTER},
+	{0x00a7c1, 0x00a7c1, PG_U_LOWERCASE_LETTER},
+	{0x00a7c2, 0x00a7c2, PG_U_UPPERCASE_LETTER},
+	{0x00a7c3, 0x00a7c3, PG_U_LOWERCASE_LETTER},
+	{0x00a7c4, 0x00a7c7, PG_U_UPPERCASE_LETTER},
+	{0x00a7c8, 0x00a7c8, PG_U_LOWERCASE_LETTER},
+	{0x00a7c9, 0x00a7c9, PG_U_UPPERCASE_LETTER},
+	{0x00a7ca, 0x00a7ca, PG_U_LOWERCASE_LETTER},
+	{0x00a7cb, 0x00a7cf, PG_U_UNASSIGNED},
+	{0x00a7d0, 0x00a7d0, PG_U_UPPERCASE_LETTER},
+	{0x00a7d1, 0x00a7d1, PG_U_LOWERCASE_LETTER},
+	{0x00a7d2, 0x00a7d2, PG_U_UNASSIGNED},
+	{0x00a7d3, 0x00a7d3, PG_U_LOWERCASE_LETTER},
+	{0x00a7d4, 0x00a7d4, PG_U_UNASSIGNED},
+	{0x00a7d5, 0x00a7d5, PG_U_LOWERCASE_LETTER},
+	{0x00a7d6, 0x00a7d6, PG_U_UPPERCASE_LETTER},
+	{0x00a7d7, 0x00a7d7, PG_U_LOWERCASE_LETTER},
+	{0x00a7d8, 0x00a7d8, PG_U_UPPERCASE_LETTER},
+	{0x00a7d9, 0x00a7d9, PG_U_LOWERCASE_LETTER},
+	{0x00a7da, 0x00a7f1, PG_U_UNASSIGNED},
+	{0x00a7f2, 0x00a7f4, PG_U_MODIFIER_LETTER},
+	{0x00a7f5, 0x00a7f5, PG_U_UPPERCASE_LETTER},
+	{0x00a7f6, 0x00a7f6, PG_U_LOWERCASE_LETTER},
+	{0x00a7f7, 0x00a7f7, PG_U_OTHER_LETTER},
+	{0x00a7f8, 0x00a7f9, PG_U_MODIFIER_LETTER},
+	{0x00a7fa, 0x00a7fa, PG_U_LOWERCASE_LETTER},
+	{0x00a7fb, 0x00a801, PG_U_OTHER_LETTER},
+	{0x00a802, 0x00a802, PG_U_NON_SPACING_MARK},
+	{0x00a803, 0x00a805, PG_U_OTHER_LETTER},
+	{0x00a806, 0x00a806, PG_U_NON_SPACING_MARK},
+	{0x00a807, 0x00a80a, PG_U_OTHER_LETTER},
+	{0x00a80b, 0x00a80b, PG_U_NON_SPACING_MARK},
+	{0x00a80c, 0x00a822, PG_U_OTHER_LETTER},
+	{0x00a823, 0x00a824, PG_U_COMBINING_SPACING_MARK},
+	{0x00a825, 0x00a826, PG_U_NON_SPACING_MARK},
+	{0x00a827, 0x00a827, PG_U_COMBINING_SPACING_MARK},
+	{0x00a828, 0x00a82b, PG_U_OTHER_SYMBOL},
+	{0x00a82c, 0x00a82c, PG_U_NON_SPACING_MARK},
+	{0x00a82d, 0x00a82f, PG_U_UNASSIGNED},
+	{0x00a830, 0x00a835, PG_U_OTHER_NUMBER},
+	{0x00a836, 0x00a837, PG_U_OTHER_SYMBOL},
+	{0x00a838, 0x00a838, PG_U_CURRENCY_SYMBOL},
+	{0x00a839, 0x00a839, PG_U_OTHER_SYMBOL},
+	{0x00a83a, 0x00a83f, PG_U_UNASSIGNED},
+	{0x00a840, 0x00a873, PG_U_OTHER_LETTER},
+	{0x00a874, 0x00a877, PG_U_OTHER_PUNCTUATION},
+	{0x00a878, 0x00a87f, PG_U_UNASSIGNED},
+	{0x00a880, 0x00a881, PG_U_COMBINING_SPACING_MARK},
+	{0x00a882, 0x00a8b3, PG_U_OTHER_LETTER},
+	{0x00a8b4, 0x00a8c3, PG_U_COMBINING_SPACING_MARK},
+	{0x00a8c4, 0x00a8c5, PG_U_NON_SPACING_MARK},
+	{0x00a8c6, 0x00a8cd, PG_U_UNASSIGNED},
+	{0x00a8ce, 0x00a8cf, PG_U_OTHER_PUNCTUATION},
+	{0x00a8d0, 0x00a8d9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00a8da, 0x00a8df, PG_U_UNASSIGNED},
+	{0x00a8e0, 0x00a8f1, PG_U_NON_SPACING_MARK},
+	{0x00a8f2, 0x00a8f7, PG_U_OTHER_LETTER},
+	{0x00a8f8, 0x00a8fa, PG_U_OTHER_PUNCTUATION},
+	{0x00a8fb, 0x00a8fb, PG_U_OTHER_LETTER},
+	{0x00a8fc, 0x00a8fc, PG_U_OTHER_PUNCTUATION},
+	{0x00a8fd, 0x00a8fe, PG_U_OTHER_LETTER},
+	{0x00a8ff, 0x00a8ff, PG_U_NON_SPACING_MARK},
+	{0x00a900, 0x00a909, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00a90a, 0x00a925, PG_U_OTHER_LETTER},
+	{0x00a926, 0x00a92d, PG_U_NON_SPACING_MARK},
+	{0x00a92e, 0x00a92f, PG_U_OTHER_PUNCTUATION},
+	{0x00a930, 0x00a946, PG_U_OTHER_LETTER},
+	{0x00a947, 0x00a951, PG_U_NON_SPACING_MARK},
+	{0x00a952, 0x00a953, PG_U_COMBINING_SPACING_MARK},
+	{0x00a954, 0x00a95e, PG_U_UNASSIGNED},
+	{0x00a95f, 0x00a95f, PG_U_OTHER_PUNCTUATION},
+	{0x00a960, 0x00a97c, PG_U_OTHER_LETTER},
+	{0x00a97d, 0x00a97f, PG_U_UNASSIGNED},
+	{0x00a980, 0x00a982, PG_U_NON_SPACING_MARK},
+	{0x00a983, 0x00a983, PG_U_COMBINING_SPACING_MARK},
+	{0x00a984, 0x00a9b2, PG_U_OTHER_LETTER},
+	{0x00a9b3, 0x00a9b3, PG_U_NON_SPACING_MARK},
+	{0x00a9b4, 0x00a9b5, PG_U_COMBINING_SPACING_MARK},
+	{0x00a9b6, 0x00a9b9, PG_U_NON_SPACING_MARK},
+	{0x00a9ba, 0x00a9bb, PG_U_COMBINING_SPACING_MARK},
+	{0x00a9bc, 0x00a9bd, PG_U_NON_SPACING_MARK},
+	{0x00a9be, 0x00a9c0, PG_U_COMBINING_SPACING_MARK},
+	{0x00a9c1, 0x00a9cd, PG_U_OTHER_PUNCTUATION},
+	{0x00a9ce, 0x00a9ce, PG_U_UNASSIGNED},
+	{0x00a9cf, 0x00a9cf, PG_U_MODIFIER_LETTER},
+	{0x00a9d0, 0x00a9d9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00a9da, 0x00a9dd, PG_U_UNASSIGNED},
+	{0x00a9de, 0x00a9df, PG_U_OTHER_PUNCTUATION},
+	{0x00a9e0, 0x00a9e4, PG_U_OTHER_LETTER},
+	{0x00a9e5, 0x00a9e5, PG_U_NON_SPACING_MARK},
+	{0x00a9e6, 0x00a9e6, PG_U_MODIFIER_LETTER},
+	{0x00a9e7, 0x00a9ef, PG_U_OTHER_LETTER},
+	{0x00a9f0, 0x00a9f9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00a9fa, 0x00a9fe, PG_U_OTHER_LETTER},
+	{0x00a9ff, 0x00a9ff, PG_U_UNASSIGNED},
+	{0x00aa00, 0x00aa28, PG_U_OTHER_LETTER},
+	{0x00aa29, 0x00aa2e, PG_U_NON_SPACING_MARK},
+	{0x00aa2f, 0x00aa30, PG_U_COMBINING_SPACING_MARK},
+	{0x00aa31, 0x00aa32, PG_U_NON_SPACING_MARK},
+	{0x00aa33, 0x00aa34, PG_U_COMBINING_SPACING_MARK},
+	{0x00aa35, 0x00aa36, PG_U_NON_SPACING_MARK},
+	{0x00aa37, 0x00aa3f, PG_U_UNASSIGNED},
+	{0x00aa40, 0x00aa42, PG_U_OTHER_LETTER},
+	{0x00aa43, 0x00aa43, PG_U_NON_SPACING_MARK},
+	{0x00aa44, 0x00aa4b, PG_U_OTHER_LETTER},
+	{0x00aa4c, 0x00aa4c, PG_U_NON_SPACING_MARK},
+	{0x00aa4d, 0x00aa4d, PG_U_COMBINING_SPACING_MARK},
+	{0x00aa4e, 0x00aa4f, PG_U_UNASSIGNED},
+	{0x00aa50, 0x00aa59, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00aa5a, 0x00aa5b, PG_U_UNASSIGNED},
+	{0x00aa5c, 0x00aa5f, PG_U_OTHER_PUNCTUATION},
+	{0x00aa60, 0x00aa6f, PG_U_OTHER_LETTER},
+	{0x00aa70, 0x00aa70, PG_U_MODIFIER_LETTER},
+	{0x00aa71, 0x00aa76, PG_U_OTHER_LETTER},
+	{0x00aa77, 0x00aa79, PG_U_OTHER_SYMBOL},
+	{0x00aa7a, 0x00aa7a, PG_U_OTHER_LETTER},
+	{0x00aa7b, 0x00aa7b, PG_U_COMBINING_SPACING_MARK},
+	{0x00aa7c, 0x00aa7c, PG_U_NON_SPACING_MARK},
+	{0x00aa7d, 0x00aa7d, PG_U_COMBINING_SPACING_MARK},
+	{0x00aa7e, 0x00aaaf, PG_U_OTHER_LETTER},
+	{0x00aab0, 0x00aab0, PG_U_NON_SPACING_MARK},
+	{0x00aab1, 0x00aab1, PG_U_OTHER_LETTER},
+	{0x00aab2, 0x00aab4, PG_U_NON_SPACING_MARK},
+	{0x00aab5, 0x00aab6, PG_U_OTHER_LETTER},
+	{0x00aab7, 0x00aab8, PG_U_NON_SPACING_MARK},
+	{0x00aab9, 0x00aabd, PG_U_OTHER_LETTER},
+	{0x00aabe, 0x00aabf, PG_U_NON_SPACING_MARK},
+	{0x00aac0, 0x00aac0, PG_U_OTHER_LETTER},
+	{0x00aac1, 0x00aac1, PG_U_NON_SPACING_MARK},
+	{0x00aac2, 0x00aac2, PG_U_OTHER_LETTER},
+	{0x00aac3, 0x00aada, PG_U_UNASSIGNED},
+	{0x00aadb, 0x00aadc, PG_U_OTHER_LETTER},
+	{0x00aadd, 0x00aadd, PG_U_MODIFIER_LETTER},
+	{0x00aade, 0x00aadf, PG_U_OTHER_PUNCTUATION},
+	{0x00aae0, 0x00aaea, PG_U_OTHER_LETTER},
+	{0x00aaeb, 0x00aaeb, PG_U_COMBINING_SPACING_MARK},
+	{0x00aaec, 0x00aaed, PG_U_NON_SPACING_MARK},
+	{0x00aaee, 0x00aaef, PG_U_COMBINING_SPACING_MARK},
+	{0x00aaf0, 0x00aaf1, PG_U_OTHER_PUNCTUATION},
+	{0x00aaf2, 0x00aaf2, PG_U_OTHER_LETTER},
+	{0x00aaf3, 0x00aaf4, PG_U_MODIFIER_LETTER},
+	{0x00aaf5, 0x00aaf5, PG_U_COMBINING_SPACING_MARK},
+	{0x00aaf6, 0x00aaf6, PG_U_NON_SPACING_MARK},
+	{0x00aaf7, 0x00ab00, PG_U_UNASSIGNED},
+	{0x00ab01, 0x00ab06, PG_U_OTHER_LETTER},
+	{0x00ab07, 0x00ab08, PG_U_UNASSIGNED},
+	{0x00ab09, 0x00ab0e, PG_U_OTHER_LETTER},
+	{0x00ab0f, 0x00ab10, PG_U_UNASSIGNED},
+	{0x00ab11, 0x00ab16, PG_U_OTHER_LETTER},
+	{0x00ab17, 0x00ab1f, PG_U_UNASSIGNED},
+	{0x00ab20, 0x00ab26, PG_U_OTHER_LETTER},
+	{0x00ab27, 0x00ab27, PG_U_UNASSIGNED},
+	{0x00ab28, 0x00ab2e, PG_U_OTHER_LETTER},
+	{0x00ab2f, 0x00ab2f, PG_U_UNASSIGNED},
+	{0x00ab30, 0x00ab5a, PG_U_LOWERCASE_LETTER},
+	{0x00ab5b, 0x00ab5b, PG_U_MODIFIER_SYMBOL},
+	{0x00ab5c, 0x00ab5f, PG_U_MODIFIER_LETTER},
+	{0x00ab60, 0x00ab68, PG_U_LOWERCASE_LETTER},
+	{0x00ab69, 0x00ab69, PG_U_MODIFIER_LETTER},
+	{0x00ab6a, 0x00ab6b, PG_U_MODIFIER_SYMBOL},
+	{0x00ab6c, 0x00ab6f, PG_U_UNASSIGNED},
+	{0x00ab70, 0x00abbf, PG_U_LOWERCASE_LETTER},
+	{0x00abc0, 0x00abe2, PG_U_OTHER_LETTER},
+	{0x00abe3, 0x00abe4, PG_U_COMBINING_SPACING_MARK},
+	{0x00abe5, 0x00abe5, PG_U_NON_SPACING_MARK},
+	{0x00abe6, 0x00abe7, PG_U_COMBINING_SPACING_MARK},
+	{0x00abe8, 0x00abe8, PG_U_NON_SPACING_MARK},
+	{0x00abe9, 0x00abea, PG_U_COMBINING_SPACING_MARK},
+	{0x00abeb, 0x00abeb, PG_U_OTHER_PUNCTUATION},
+	{0x00abec, 0x00abec, PG_U_COMBINING_SPACING_MARK},
+	{0x00abed, 0x00abed, PG_U_NON_SPACING_MARK},
+	{0x00abee, 0x00abef, PG_U_UNASSIGNED},
+	{0x00abf0, 0x00abf9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00abfa, 0x00abff, PG_U_UNASSIGNED},
+	{0x00ac00, 0x00d7a3, PG_U_OTHER_LETTER},
+	{0x00d7a4, 0x00d7af, PG_U_UNASSIGNED},
+	{0x00d7b0, 0x00d7c6, PG_U_OTHER_LETTER},
+	{0x00d7c7, 0x00d7ca, PG_U_UNASSIGNED},
+	{0x00d7cb, 0x00d7fb, PG_U_OTHER_LETTER},
+	{0x00d7fc, 0x00d7ff, PG_U_UNASSIGNED},
+	{0x00d800, 0x00dfff, PG_U_SURROGATE},
+	{0x00e000, 0x00f8ff, PG_U_PRIVATE_USE_CHAR},
+	{0x00f900, 0x00fa6d, PG_U_OTHER_LETTER},
+	{0x00fa6e, 0x00fa6f, PG_U_UNASSIGNED},
+	{0x00fa70, 0x00fad9, PG_U_OTHER_LETTER},
+	{0x00fada, 0x00faff, PG_U_UNASSIGNED},
+	{0x00fb00, 0x00fb06, PG_U_LOWERCASE_LETTER},
+	{0x00fb07, 0x00fb12, PG_U_UNASSIGNED},
+	{0x00fb13, 0x00fb17, PG_U_LOWERCASE_LETTER},
+	{0x00fb18, 0x00fb1c, PG_U_UNASSIGNED},
+	{0x00fb1d, 0x00fb1d, PG_U_OTHER_LETTER},
+	{0x00fb1e, 0x00fb1e, PG_U_NON_SPACING_MARK},
+	{0x00fb1f, 0x00fb28, PG_U_OTHER_LETTER},
+	{0x00fb29, 0x00fb29, PG_U_MATH_SYMBOL},
+	{0x00fb2a, 0x00fb36, PG_U_OTHER_LETTER},
+	{0x00fb37, 0x00fb37, PG_U_UNASSIGNED},
+	{0x00fb38, 0x00fb3c, PG_U_OTHER_LETTER},
+	{0x00fb3d, 0x00fb3d, PG_U_UNASSIGNED},
+	{0x00fb3e, 0x00fb3e, PG_U_OTHER_LETTER},
+	{0x00fb3f, 0x00fb3f, PG_U_UNASSIGNED},
+	{0x00fb40, 0x00fb41, PG_U_OTHER_LETTER},
+	{0x00fb42, 0x00fb42, PG_U_UNASSIGNED},
+	{0x00fb43, 0x00fb44, PG_U_OTHER_LETTER},
+	{0x00fb45, 0x00fb45, PG_U_UNASSIGNED},
+	{0x00fb46, 0x00fbb1, PG_U_OTHER_LETTER},
+	{0x00fbb2, 0x00fbc2, PG_U_MODIFIER_SYMBOL},
+	{0x00fbc3, 0x00fbd2, PG_U_UNASSIGNED},
+	{0x00fbd3, 0x00fd3d, PG_U_OTHER_LETTER},
+	{0x00fd3e, 0x00fd3e, PG_U_END_PUNCTUATION},
+	{0x00fd3f, 0x00fd3f, PG_U_START_PUNCTUATION},
+	{0x00fd40, 0x00fd4f, PG_U_OTHER_SYMBOL},
+	{0x00fd50, 0x00fd8f, PG_U_OTHER_LETTER},
+	{0x00fd90, 0x00fd91, PG_U_UNASSIGNED},
+	{0x00fd92, 0x00fdc7, PG_U_OTHER_LETTER},
+	{0x00fdc8, 0x00fdce, PG_U_UNASSIGNED},
+	{0x00fdcf, 0x00fdcf, PG_U_OTHER_SYMBOL},
+	{0x00fdd0, 0x00fdef, PG_U_UNASSIGNED},
+	{0x00fdf0, 0x00fdfb, PG_U_OTHER_LETTER},
+	{0x00fdfc, 0x00fdfc, PG_U_CURRENCY_SYMBOL},
+	{0x00fdfd, 0x00fdff, PG_U_OTHER_SYMBOL},
+	{0x00fe00, 0x00fe0f, PG_U_NON_SPACING_MARK},
+	{0x00fe10, 0x00fe16, PG_U_OTHER_PUNCTUATION},
+	{0x00fe17, 0x00fe17, PG_U_START_PUNCTUATION},
+	{0x00fe18, 0x00fe18, PG_U_END_PUNCTUATION},
+	{0x00fe19, 0x00fe19, PG_U_OTHER_PUNCTUATION},
+	{0x00fe1a, 0x00fe1f, PG_U_UNASSIGNED},
+	{0x00fe20, 0x00fe2f, PG_U_NON_SPACING_MARK},
+	{0x00fe30, 0x00fe30, PG_U_OTHER_PUNCTUATION},
+	{0x00fe31, 0x00fe32, PG_U_DASH_PUNCTUATION},
+	{0x00fe33, 0x00fe34, PG_U_CONNECTOR_PUNCTUATION},
+	{0x00fe35, 0x00fe35, PG_U_START_PUNCTUATION},
+	{0x00fe36, 0x00fe36, PG_U_END_PUNCTUATION},
+	{0x00fe37, 0x00fe37, PG_U_START_PUNCTUATION},
+	{0x00fe38, 0x00fe38, PG_U_END_PUNCTUATION},
+	{0x00fe39, 0x00fe39, PG_U_START_PUNCTUATION},
+	{0x00fe3a, 0x00fe3a, PG_U_END_PUNCTUATION},
+	{0x00fe3b, 0x00fe3b, PG_U_START_PUNCTUATION},
+	{0x00fe3c, 0x00fe3c, PG_U_END_PUNCTUATION},
+	{0x00fe3d, 0x00fe3d, PG_U_START_PUNCTUATION},
+	{0x00fe3e, 0x00fe3e, PG_U_END_PUNCTUATION},
+	{0x00fe3f, 0x00fe3f, PG_U_START_PUNCTUATION},
+	{0x00fe40, 0x00fe40, PG_U_END_PUNCTUATION},
+	{0x00fe41, 0x00fe41, PG_U_START_PUNCTUATION},
+	{0x00fe42, 0x00fe42, PG_U_END_PUNCTUATION},
+	{0x00fe43, 0x00fe43, PG_U_START_PUNCTUATION},
+	{0x00fe44, 0x00fe44, PG_U_END_PUNCTUATION},
+	{0x00fe45, 0x00fe46, PG_U_OTHER_PUNCTUATION},
+	{0x00fe47, 0x00fe47, PG_U_START_PUNCTUATION},
+	{0x00fe48, 0x00fe48, PG_U_END_PUNCTUATION},
+	{0x00fe49, 0x00fe4c, PG_U_OTHER_PUNCTUATION},
+	{0x00fe4d, 0x00fe4f, PG_U_CONNECTOR_PUNCTUATION},
+	{0x00fe50, 0x00fe52, PG_U_OTHER_PUNCTUATION},
+	{0x00fe53, 0x00fe53, PG_U_UNASSIGNED},
+	{0x00fe54, 0x00fe57, PG_U_OTHER_PUNCTUATION},
+	{0x00fe58, 0x00fe58, PG_U_DASH_PUNCTUATION},
+	{0x00fe59, 0x00fe59, PG_U_START_PUNCTUATION},
+	{0x00fe5a, 0x00fe5a, PG_U_END_PUNCTUATION},
+	{0x00fe5b, 0x00fe5b, PG_U_START_PUNCTUATION},
+	{0x00fe5c, 0x00fe5c, PG_U_END_PUNCTUATION},
+	{0x00fe5d, 0x00fe5d, PG_U_START_PUNCTUATION},
+	{0x00fe5e, 0x00fe5e, PG_U_END_PUNCTUATION},
+	{0x00fe5f, 0x00fe61, PG_U_OTHER_PUNCTUATION},
+	{0x00fe62, 0x00fe62, PG_U_MATH_SYMBOL},
+	{0x00fe63, 0x00fe63, PG_U_DASH_PUNCTUATION},
+	{0x00fe64, 0x00fe66, PG_U_MATH_SYMBOL},
+	{0x00fe67, 0x00fe67, PG_U_UNASSIGNED},
+	{0x00fe68, 0x00fe68, PG_U_OTHER_PUNCTUATION},
+	{0x00fe69, 0x00fe69, PG_U_CURRENCY_SYMBOL},
+	{0x00fe6a, 0x00fe6b, PG_U_OTHER_PUNCTUATION},
+	{0x00fe6c, 0x00fe6f, PG_U_UNASSIGNED},
+	{0x00fe70, 0x00fe74, PG_U_OTHER_LETTER},
+	{0x00fe75, 0x00fe75, PG_U_UNASSIGNED},
+	{0x00fe76, 0x00fefc, PG_U_OTHER_LETTER},
+	{0x00fefd, 0x00fefe, PG_U_UNASSIGNED},
+	{0x00feff, 0x00feff, PG_U_FORMAT_CHAR},
+	{0x00ff00, 0x00ff00, PG_U_UNASSIGNED},
+	{0x00ff01, 0x00ff03, PG_U_OTHER_PUNCTUATION},
+	{0x00ff04, 0x00ff04, PG_U_CURRENCY_SYMBOL},
+	{0x00ff05, 0x00ff07, PG_U_OTHER_PUNCTUATION},
+	{0x00ff08, 0x00ff08, PG_U_START_PUNCTUATION},
+	{0x00ff09, 0x00ff09, PG_U_END_PUNCTUATION},
+	{0x00ff0a, 0x00ff0a, PG_U_OTHER_PUNCTUATION},
+	{0x00ff0b, 0x00ff0b, PG_U_MATH_SYMBOL},
+	{0x00ff0c, 0x00ff0c, PG_U_OTHER_PUNCTUATION},
+	{0x00ff0d, 0x00ff0d, PG_U_DASH_PUNCTUATION},
+	{0x00ff0e, 0x00ff0f, PG_U_OTHER_PUNCTUATION},
+	{0x00ff10, 0x00ff19, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x00ff1a, 0x00ff1b, PG_U_OTHER_PUNCTUATION},
+	{0x00ff1c, 0x00ff1e, PG_U_MATH_SYMBOL},
+	{0x00ff1f, 0x00ff20, PG_U_OTHER_PUNCTUATION},
+	{0x00ff21, 0x00ff3a, PG_U_UPPERCASE_LETTER},
+	{0x00ff3b, 0x00ff3b, PG_U_START_PUNCTUATION},
+	{0x00ff3c, 0x00ff3c, PG_U_OTHER_PUNCTUATION},
+	{0x00ff3d, 0x00ff3d, PG_U_END_PUNCTUATION},
+	{0x00ff3e, 0x00ff3e, PG_U_MODIFIER_SYMBOL},
+	{0x00ff3f, 0x00ff3f, PG_U_CONNECTOR_PUNCTUATION},
+	{0x00ff40, 0x00ff40, PG_U_MODIFIER_SYMBOL},
+	{0x00ff41, 0x00ff5a, PG_U_LOWERCASE_LETTER},
+	{0x00ff5b, 0x00ff5b, PG_U_START_PUNCTUATION},
+	{0x00ff5c, 0x00ff5c, PG_U_MATH_SYMBOL},
+	{0x00ff5d, 0x00ff5d, PG_U_END_PUNCTUATION},
+	{0x00ff5e, 0x00ff5e, PG_U_MATH_SYMBOL},
+	{0x00ff5f, 0x00ff5f, PG_U_START_PUNCTUATION},
+	{0x00ff60, 0x00ff60, PG_U_END_PUNCTUATION},
+	{0x00ff61, 0x00ff61, PG_U_OTHER_PUNCTUATION},
+	{0x00ff62, 0x00ff62, PG_U_START_PUNCTUATION},
+	{0x00ff63, 0x00ff63, PG_U_END_PUNCTUATION},
+	{0x00ff64, 0x00ff65, PG_U_OTHER_PUNCTUATION},
+	{0x00ff66, 0x00ff6f, PG_U_OTHER_LETTER},
+	{0x00ff70, 0x00ff70, PG_U_MODIFIER_LETTER},
+	{0x00ff71, 0x00ff9d, PG_U_OTHER_LETTER},
+	{0x00ff9e, 0x00ff9f, PG_U_MODIFIER_LETTER},
+	{0x00ffa0, 0x00ffbe, PG_U_OTHER_LETTER},
+	{0x00ffbf, 0x00ffc1, PG_U_UNASSIGNED},
+	{0x00ffc2, 0x00ffc7, PG_U_OTHER_LETTER},
+	{0x00ffc8, 0x00ffc9, PG_U_UNASSIGNED},
+	{0x00ffca, 0x00ffcf, PG_U_OTHER_LETTER},
+	{0x00ffd0, 0x00ffd1, PG_U_UNASSIGNED},
+	{0x00ffd2, 0x00ffd7, PG_U_OTHER_LETTER},
+	{0x00ffd8, 0x00ffd9, PG_U_UNASSIGNED},
+	{0x00ffda, 0x00ffdc, PG_U_OTHER_LETTER},
+	{0x00ffdd, 0x00ffdf, PG_U_UNASSIGNED},
+	{0x00ffe0, 0x00ffe1, PG_U_CURRENCY_SYMBOL},
+	{0x00ffe2, 0x00ffe2, PG_U_MATH_SYMBOL},
+	{0x00ffe3, 0x00ffe3, PG_U_MODIFIER_SYMBOL},
+	{0x00ffe4, 0x00ffe4, PG_U_OTHER_SYMBOL},
+	{0x00ffe5, 0x00ffe6, PG_U_CURRENCY_SYMBOL},
+	{0x00ffe7, 0x00ffe7, PG_U_UNASSIGNED},
+	{0x00ffe8, 0x00ffe8, PG_U_OTHER_SYMBOL},
+	{0x00ffe9, 0x00ffec, PG_U_MATH_SYMBOL},
+	{0x00ffed, 0x00ffee, PG_U_OTHER_SYMBOL},
+	{0x00ffef, 0x00fff8, PG_U_UNASSIGNED},
+	{0x00fff9, 0x00fffb, PG_U_FORMAT_CHAR},
+	{0x00fffc, 0x00fffd, PG_U_OTHER_SYMBOL},
+	{0x00fffe, 0x00ffff, PG_U_UNASSIGNED},
+	{0x010000, 0x01000b, PG_U_OTHER_LETTER},
+	{0x01000c, 0x01000c, PG_U_UNASSIGNED},
+	{0x01000d, 0x010026, PG_U_OTHER_LETTER},
+	{0x010027, 0x010027, PG_U_UNASSIGNED},
+	{0x010028, 0x01003a, PG_U_OTHER_LETTER},
+	{0x01003b, 0x01003b, PG_U_UNASSIGNED},
+	{0x01003c, 0x01003d, PG_U_OTHER_LETTER},
+	{0x01003e, 0x01003e, PG_U_UNASSIGNED},
+	{0x01003f, 0x01004d, PG_U_OTHER_LETTER},
+	{0x01004e, 0x01004f, PG_U_UNASSIGNED},
+	{0x010050, 0x01005d, PG_U_OTHER_LETTER},
+	{0x01005e, 0x01007f, PG_U_UNASSIGNED},
+	{0x010080, 0x0100fa, PG_U_OTHER_LETTER},
+	{0x0100fb, 0x0100ff, PG_U_UNASSIGNED},
+	{0x010100, 0x010102, PG_U_OTHER_PUNCTUATION},
+	{0x010103, 0x010106, PG_U_UNASSIGNED},
+	{0x010107, 0x010133, PG_U_OTHER_NUMBER},
+	{0x010134, 0x010136, PG_U_UNASSIGNED},
+	{0x010137, 0x01013f, PG_U_OTHER_SYMBOL},
+	{0x010140, 0x010174, PG_U_LETTER_NUMBER},
+	{0x010175, 0x010178, PG_U_OTHER_NUMBER},
+	{0x010179, 0x010189, PG_U_OTHER_SYMBOL},
+	{0x01018a, 0x01018b, PG_U_OTHER_NUMBER},
+	{0x01018c, 0x01018e, PG_U_OTHER_SYMBOL},
+	{0x01018f, 0x01018f, PG_U_UNASSIGNED},
+	{0x010190, 0x01019c, PG_U_OTHER_SYMBOL},
+	{0x01019d, 0x01019f, PG_U_UNASSIGNED},
+	{0x0101a0, 0x0101a0, PG_U_OTHER_SYMBOL},
+	{0x0101a1, 0x0101cf, PG_U_UNASSIGNED},
+	{0x0101d0, 0x0101fc, PG_U_OTHER_SYMBOL},
+	{0x0101fd, 0x0101fd, PG_U_NON_SPACING_MARK},
+	{0x0101fe, 0x01027f, PG_U_UNASSIGNED},
+	{0x010280, 0x01029c, PG_U_OTHER_LETTER},
+	{0x01029d, 0x01029f, PG_U_UNASSIGNED},
+	{0x0102a0, 0x0102d0, PG_U_OTHER_LETTER},
+	{0x0102d1, 0x0102df, PG_U_UNASSIGNED},
+	{0x0102e0, 0x0102e0, PG_U_NON_SPACING_MARK},
+	{0x0102e1, 0x0102fb, PG_U_OTHER_NUMBER},
+	{0x0102fc, 0x0102ff, PG_U_UNASSIGNED},
+	{0x010300, 0x01031f, PG_U_OTHER_LETTER},
+	{0x010320, 0x010323, PG_U_OTHER_NUMBER},
+	{0x010324, 0x01032c, PG_U_UNASSIGNED},
+	{0x01032d, 0x010340, PG_U_OTHER_LETTER},
+	{0x010341, 0x010341, PG_U_LETTER_NUMBER},
+	{0x010342, 0x010349, PG_U_OTHER_LETTER},
+	{0x01034a, 0x01034a, PG_U_LETTER_NUMBER},
+	{0x01034b, 0x01034f, PG_U_UNASSIGNED},
+	{0x010350, 0x010375, PG_U_OTHER_LETTER},
+	{0x010376, 0x01037a, PG_U_NON_SPACING_MARK},
+	{0x01037b, 0x01037f, PG_U_UNASSIGNED},
+	{0x010380, 0x01039d, PG_U_OTHER_LETTER},
+	{0x01039e, 0x01039e, PG_U_UNASSIGNED},
+	{0x01039f, 0x01039f, PG_U_OTHER_PUNCTUATION},
+	{0x0103a0, 0x0103c3, PG_U_OTHER_LETTER},
+	{0x0103c4, 0x0103c7, PG_U_UNASSIGNED},
+	{0x0103c8, 0x0103cf, PG_U_OTHER_LETTER},
+	{0x0103d0, 0x0103d0, PG_U_OTHER_PUNCTUATION},
+	{0x0103d1, 0x0103d5, PG_U_LETTER_NUMBER},
+	{0x0103d6, 0x0103ff, PG_U_UNASSIGNED},
+	{0x010400, 0x010427, PG_U_UPPERCASE_LETTER},
+	{0x010428, 0x01044f, PG_U_LOWERCASE_LETTER},
+	{0x010450, 0x01049d, PG_U_OTHER_LETTER},
+	{0x01049e, 0x01049f, PG_U_UNASSIGNED},
+	{0x0104a0, 0x0104a9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0104aa, 0x0104af, PG_U_UNASSIGNED},
+	{0x0104b0, 0x0104d3, PG_U_UPPERCASE_LETTER},
+	{0x0104d4, 0x0104d7, PG_U_UNASSIGNED},
+	{0x0104d8, 0x0104fb, PG_U_LOWERCASE_LETTER},
+	{0x0104fc, 0x0104ff, PG_U_UNASSIGNED},
+	{0x010500, 0x010527, PG_U_OTHER_LETTER},
+	{0x010528, 0x01052f, PG_U_UNASSIGNED},
+	{0x010530, 0x010563, PG_U_OTHER_LETTER},
+	{0x010564, 0x01056e, PG_U_UNASSIGNED},
+	{0x01056f, 0x01056f, PG_U_OTHER_PUNCTUATION},
+	{0x010570, 0x01057a, PG_U_UPPERCASE_LETTER},
+	{0x01057b, 0x01057b, PG_U_UNASSIGNED},
+	{0x01057c, 0x01058a, PG_U_UPPERCASE_LETTER},
+	{0x01058b, 0x01058b, PG_U_UNASSIGNED},
+	{0x01058c, 0x010592, PG_U_UPPERCASE_LETTER},
+	{0x010593, 0x010593, PG_U_UNASSIGNED},
+	{0x010594, 0x010595, PG_U_UPPERCASE_LETTER},
+	{0x010596, 0x010596, PG_U_UNASSIGNED},
+	{0x010597, 0x0105a1, PG_U_LOWERCASE_LETTER},
+	{0x0105a2, 0x0105a2, PG_U_UNASSIGNED},
+	{0x0105a3, 0x0105b1, PG_U_LOWERCASE_LETTER},
+	{0x0105b2, 0x0105b2, PG_U_UNASSIGNED},
+	{0x0105b3, 0x0105b9, PG_U_LOWERCASE_LETTER},
+	{0x0105ba, 0x0105ba, PG_U_UNASSIGNED},
+	{0x0105bb, 0x0105bc, PG_U_LOWERCASE_LETTER},
+	{0x0105bd, 0x0105ff, PG_U_UNASSIGNED},
+	{0x010600, 0x010736, PG_U_OTHER_LETTER},
+	{0x010737, 0x01073f, PG_U_UNASSIGNED},
+	{0x010740, 0x010755, PG_U_OTHER_LETTER},
+	{0x010756, 0x01075f, PG_U_UNASSIGNED},
+	{0x010760, 0x010767, PG_U_OTHER_LETTER},
+	{0x010768, 0x01077f, PG_U_UNASSIGNED},
+	{0x010780, 0x010785, PG_U_MODIFIER_LETTER},
+	{0x010786, 0x010786, PG_U_UNASSIGNED},
+	{0x010787, 0x0107b0, PG_U_MODIFIER_LETTER},
+	{0x0107b1, 0x0107b1, PG_U_UNASSIGNED},
+	{0x0107b2, 0x0107ba, PG_U_MODIFIER_LETTER},
+	{0x0107bb, 0x0107ff, PG_U_UNASSIGNED},
+	{0x010800, 0x010805, PG_U_OTHER_LETTER},
+	{0x010806, 0x010807, PG_U_UNASSIGNED},
+	{0x010808, 0x010808, PG_U_OTHER_LETTER},
+	{0x010809, 0x010809, PG_U_UNASSIGNED},
+	{0x01080a, 0x010835, PG_U_OTHER_LETTER},
+	{0x010836, 0x010836, PG_U_UNASSIGNED},
+	{0x010837, 0x010838, PG_U_OTHER_LETTER},
+	{0x010839, 0x01083b, PG_U_UNASSIGNED},
+	{0x01083c, 0x01083c, PG_U_OTHER_LETTER},
+	{0x01083d, 0x01083e, PG_U_UNASSIGNED},
+	{0x01083f, 0x010855, PG_U_OTHER_LETTER},
+	{0x010856, 0x010856, PG_U_UNASSIGNED},
+	{0x010857, 0x010857, PG_U_OTHER_PUNCTUATION},
+	{0x010858, 0x01085f, PG_U_OTHER_NUMBER},
+	{0x010860, 0x010876, PG_U_OTHER_LETTER},
+	{0x010877, 0x010878, PG_U_OTHER_SYMBOL},
+	{0x010879, 0x01087f, PG_U_OTHER_NUMBER},
+	{0x010880, 0x01089e, PG_U_OTHER_LETTER},
+	{0x01089f, 0x0108a6, PG_U_UNASSIGNED},
+	{0x0108a7, 0x0108af, PG_U_OTHER_NUMBER},
+	{0x0108b0, 0x0108df, PG_U_UNASSIGNED},
+	{0x0108e0, 0x0108f2, PG_U_OTHER_LETTER},
+	{0x0108f3, 0x0108f3, PG_U_UNASSIGNED},
+	{0x0108f4, 0x0108f5, PG_U_OTHER_LETTER},
+	{0x0108f6, 0x0108fa, PG_U_UNASSIGNED},
+	{0x0108fb, 0x0108ff, PG_U_OTHER_NUMBER},
+	{0x010900, 0x010915, PG_U_OTHER_LETTER},
+	{0x010916, 0x01091b, PG_U_OTHER_NUMBER},
+	{0x01091c, 0x01091e, PG_U_UNASSIGNED},
+	{0x01091f, 0x01091f, PG_U_OTHER_PUNCTUATION},
+	{0x010920, 0x010939, PG_U_OTHER_LETTER},
+	{0x01093a, 0x01093e, PG_U_UNASSIGNED},
+	{0x01093f, 0x01093f, PG_U_OTHER_PUNCTUATION},
+	{0x010940, 0x01097f, PG_U_UNASSIGNED},
+	{0x010980, 0x0109b7, PG_U_OTHER_LETTER},
+	{0x0109b8, 0x0109bb, PG_U_UNASSIGNED},
+	{0x0109bc, 0x0109bd, PG_U_OTHER_NUMBER},
+	{0x0109be, 0x0109bf, PG_U_OTHER_LETTER},
+	{0x0109c0, 0x0109cf, PG_U_OTHER_NUMBER},
+	{0x0109d0, 0x0109d1, PG_U_UNASSIGNED},
+	{0x0109d2, 0x0109ff, PG_U_OTHER_NUMBER},
+	{0x010a00, 0x010a00, PG_U_OTHER_LETTER},
+	{0x010a01, 0x010a03, PG_U_NON_SPACING_MARK},
+	{0x010a04, 0x010a04, PG_U_UNASSIGNED},
+	{0x010a05, 0x010a06, PG_U_NON_SPACING_MARK},
+	{0x010a07, 0x010a0b, PG_U_UNASSIGNED},
+	{0x010a0c, 0x010a0f, PG_U_NON_SPACING_MARK},
+	{0x010a10, 0x010a13, PG_U_OTHER_LETTER},
+	{0x010a14, 0x010a14, PG_U_UNASSIGNED},
+	{0x010a15, 0x010a17, PG_U_OTHER_LETTER},
+	{0x010a18, 0x010a18, PG_U_UNASSIGNED},
+	{0x010a19, 0x010a35, PG_U_OTHER_LETTER},
+	{0x010a36, 0x010a37, PG_U_UNASSIGNED},
+	{0x010a38, 0x010a3a, PG_U_NON_SPACING_MARK},
+	{0x010a3b, 0x010a3e, PG_U_UNASSIGNED},
+	{0x010a3f, 0x010a3f, PG_U_NON_SPACING_MARK},
+	{0x010a40, 0x010a48, PG_U_OTHER_NUMBER},
+	{0x010a49, 0x010a4f, PG_U_UNASSIGNED},
+	{0x010a50, 0x010a58, PG_U_OTHER_PUNCTUATION},
+	{0x010a59, 0x010a5f, PG_U_UNASSIGNED},
+	{0x010a60, 0x010a7c, PG_U_OTHER_LETTER},
+	{0x010a7d, 0x010a7e, PG_U_OTHER_NUMBER},
+	{0x010a7f, 0x010a7f, PG_U_OTHER_PUNCTUATION},
+	{0x010a80, 0x010a9c, PG_U_OTHER_LETTER},
+	{0x010a9d, 0x010a9f, PG_U_OTHER_NUMBER},
+	{0x010aa0, 0x010abf, PG_U_UNASSIGNED},
+	{0x010ac0, 0x010ac7, PG_U_OTHER_LETTER},
+	{0x010ac8, 0x010ac8, PG_U_OTHER_SYMBOL},
+	{0x010ac9, 0x010ae4, PG_U_OTHER_LETTER},
+	{0x010ae5, 0x010ae6, PG_U_NON_SPACING_MARK},
+	{0x010ae7, 0x010aea, PG_U_UNASSIGNED},
+	{0x010aeb, 0x010aef, PG_U_OTHER_NUMBER},
+	{0x010af0, 0x010af6, PG_U_OTHER_PUNCTUATION},
+	{0x010af7, 0x010aff, PG_U_UNASSIGNED},
+	{0x010b00, 0x010b35, PG_U_OTHER_LETTER},
+	{0x010b36, 0x010b38, PG_U_UNASSIGNED},
+	{0x010b39, 0x010b3f, PG_U_OTHER_PUNCTUATION},
+	{0x010b40, 0x010b55, PG_U_OTHER_LETTER},
+	{0x010b56, 0x010b57, PG_U_UNASSIGNED},
+	{0x010b58, 0x010b5f, PG_U_OTHER_NUMBER},
+	{0x010b60, 0x010b72, PG_U_OTHER_LETTER},
+	{0x010b73, 0x010b77, PG_U_UNASSIGNED},
+	{0x010b78, 0x010b7f, PG_U_OTHER_NUMBER},
+	{0x010b80, 0x010b91, PG_U_OTHER_LETTER},
+	{0x010b92, 0x010b98, PG_U_UNASSIGNED},
+	{0x010b99, 0x010b9c, PG_U_OTHER_PUNCTUATION},
+	{0x010b9d, 0x010ba8, PG_U_UNASSIGNED},
+	{0x010ba9, 0x010baf, PG_U_OTHER_NUMBER},
+	{0x010bb0, 0x010bff, PG_U_UNASSIGNED},
+	{0x010c00, 0x010c48, PG_U_OTHER_LETTER},
+	{0x010c49, 0x010c7f, PG_U_UNASSIGNED},
+	{0x010c80, 0x010cb2, PG_U_UPPERCASE_LETTER},
+	{0x010cb3, 0x010cbf, PG_U_UNASSIGNED},
+	{0x010cc0, 0x010cf2, PG_U_LOWERCASE_LETTER},
+	{0x010cf3, 0x010cf9, PG_U_UNASSIGNED},
+	{0x010cfa, 0x010cff, PG_U_OTHER_NUMBER},
+	{0x010d00, 0x010d23, PG_U_OTHER_LETTER},
+	{0x010d24, 0x010d27, PG_U_NON_SPACING_MARK},
+	{0x010d28, 0x010d2f, PG_U_UNASSIGNED},
+	{0x010d30, 0x010d39, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x010d3a, 0x010e5f, PG_U_UNASSIGNED},
+	{0x010e60, 0x010e7e, PG_U_OTHER_NUMBER},
+	{0x010e7f, 0x010e7f, PG_U_UNASSIGNED},
+	{0x010e80, 0x010ea9, PG_U_OTHER_LETTER},
+	{0x010eaa, 0x010eaa, PG_U_UNASSIGNED},
+	{0x010eab, 0x010eac, PG_U_NON_SPACING_MARK},
+	{0x010ead, 0x010ead, PG_U_DASH_PUNCTUATION},
+	{0x010eae, 0x010eaf, PG_U_UNASSIGNED},
+	{0x010eb0, 0x010eb1, PG_U_OTHER_LETTER},
+	{0x010eb2, 0x010efc, PG_U_UNASSIGNED},
+	{0x010efd, 0x010eff, PG_U_NON_SPACING_MARK},
+	{0x010f00, 0x010f1c, PG_U_OTHER_LETTER},
+	{0x010f1d, 0x010f26, PG_U_OTHER_NUMBER},
+	{0x010f27, 0x010f27, PG_U_OTHER_LETTER},
+	{0x010f28, 0x010f2f, PG_U_UNASSIGNED},
+	{0x010f30, 0x010f45, PG_U_OTHER_LETTER},
+	{0x010f46, 0x010f50, PG_U_NON_SPACING_MARK},
+	{0x010f51, 0x010f54, PG_U_OTHER_NUMBER},
+	{0x010f55, 0x010f59, PG_U_OTHER_PUNCTUATION},
+	{0x010f5a, 0x010f6f, PG_U_UNASSIGNED},
+	{0x010f70, 0x010f81, PG_U_OTHER_LETTER},
+	{0x010f82, 0x010f85, PG_U_NON_SPACING_MARK},
+	{0x010f86, 0x010f89, PG_U_OTHER_PUNCTUATION},
+	{0x010f8a, 0x010faf, PG_U_UNASSIGNED},
+	{0x010fb0, 0x010fc4, PG_U_OTHER_LETTER},
+	{0x010fc5, 0x010fcb, PG_U_OTHER_NUMBER},
+	{0x010fcc, 0x010fdf, PG_U_UNASSIGNED},
+	{0x010fe0, 0x010ff6, PG_U_OTHER_LETTER},
+	{0x010ff7, 0x010fff, PG_U_UNASSIGNED},
+	{0x011000, 0x011000, PG_U_COMBINING_SPACING_MARK},
+	{0x011001, 0x011001, PG_U_NON_SPACING_MARK},
+	{0x011002, 0x011002, PG_U_COMBINING_SPACING_MARK},
+	{0x011003, 0x011037, PG_U_OTHER_LETTER},
+	{0x011038, 0x011046, PG_U_NON_SPACING_MARK},
+	{0x011047, 0x01104d, PG_U_OTHER_PUNCTUATION},
+	{0x01104e, 0x011051, PG_U_UNASSIGNED},
+	{0x011052, 0x011065, PG_U_OTHER_NUMBER},
+	{0x011066, 0x01106f, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x011070, 0x011070, PG_U_NON_SPACING_MARK},
+	{0x011071, 0x011072, PG_U_OTHER_LETTER},
+	{0x011073, 0x011074, PG_U_NON_SPACING_MARK},
+	{0x011075, 0x011075, PG_U_OTHER_LETTER},
+	{0x011076, 0x01107e, PG_U_UNASSIGNED},
+	{0x01107f, 0x011081, PG_U_NON_SPACING_MARK},
+	{0x011082, 0x011082, PG_U_COMBINING_SPACING_MARK},
+	{0x011083, 0x0110af, PG_U_OTHER_LETTER},
+	{0x0110b0, 0x0110b2, PG_U_COMBINING_SPACING_MARK},
+	{0x0110b3, 0x0110b6, PG_U_NON_SPACING_MARK},
+	{0x0110b7, 0x0110b8, PG_U_COMBINING_SPACING_MARK},
+	{0x0110b9, 0x0110ba, PG_U_NON_SPACING_MARK},
+	{0x0110bb, 0x0110bc, PG_U_OTHER_PUNCTUATION},
+	{0x0110bd, 0x0110bd, PG_U_FORMAT_CHAR},
+	{0x0110be, 0x0110c1, PG_U_OTHER_PUNCTUATION},
+	{0x0110c2, 0x0110c2, PG_U_NON_SPACING_MARK},
+	{0x0110c3, 0x0110cc, PG_U_UNASSIGNED},
+	{0x0110cd, 0x0110cd, PG_U_FORMAT_CHAR},
+	{0x0110ce, 0x0110cf, PG_U_UNASSIGNED},
+	{0x0110d0, 0x0110e8, PG_U_OTHER_LETTER},
+	{0x0110e9, 0x0110ef, PG_U_UNASSIGNED},
+	{0x0110f0, 0x0110f9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0110fa, 0x0110ff, PG_U_UNASSIGNED},
+	{0x011100, 0x011102, PG_U_NON_SPACING_MARK},
+	{0x011103, 0x011126, PG_U_OTHER_LETTER},
+	{0x011127, 0x01112b, PG_U_NON_SPACING_MARK},
+	{0x01112c, 0x01112c, PG_U_COMBINING_SPACING_MARK},
+	{0x01112d, 0x011134, PG_U_NON_SPACING_MARK},
+	{0x011135, 0x011135, PG_U_UNASSIGNED},
+	{0x011136, 0x01113f, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x011140, 0x011143, PG_U_OTHER_PUNCTUATION},
+	{0x011144, 0x011144, PG_U_OTHER_LETTER},
+	{0x011145, 0x011146, PG_U_COMBINING_SPACING_MARK},
+	{0x011147, 0x011147, PG_U_OTHER_LETTER},
+	{0x011148, 0x01114f, PG_U_UNASSIGNED},
+	{0x011150, 0x011172, PG_U_OTHER_LETTER},
+	{0x011173, 0x011173, PG_U_NON_SPACING_MARK},
+	{0x011174, 0x011175, PG_U_OTHER_PUNCTUATION},
+	{0x011176, 0x011176, PG_U_OTHER_LETTER},
+	{0x011177, 0x01117f, PG_U_UNASSIGNED},
+	{0x011180, 0x011181, PG_U_NON_SPACING_MARK},
+	{0x011182, 0x011182, PG_U_COMBINING_SPACING_MARK},
+	{0x011183, 0x0111b2, PG_U_OTHER_LETTER},
+	{0x0111b3, 0x0111b5, PG_U_COMBINING_SPACING_MARK},
+	{0x0111b6, 0x0111be, PG_U_NON_SPACING_MARK},
+	{0x0111bf, 0x0111c0, PG_U_COMBINING_SPACING_MARK},
+	{0x0111c1, 0x0111c4, PG_U_OTHER_LETTER},
+	{0x0111c5, 0x0111c8, PG_U_OTHER_PUNCTUATION},
+	{0x0111c9, 0x0111cc, PG_U_NON_SPACING_MARK},
+	{0x0111cd, 0x0111cd, PG_U_OTHER_PUNCTUATION},
+	{0x0111ce, 0x0111ce, PG_U_COMBINING_SPACING_MARK},
+	{0x0111cf, 0x0111cf, PG_U_NON_SPACING_MARK},
+	{0x0111d0, 0x0111d9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0111da, 0x0111da, PG_U_OTHER_LETTER},
+	{0x0111db, 0x0111db, PG_U_OTHER_PUNCTUATION},
+	{0x0111dc, 0x0111dc, PG_U_OTHER_LETTER},
+	{0x0111dd, 0x0111df, PG_U_OTHER_PUNCTUATION},
+	{0x0111e0, 0x0111e0, PG_U_UNASSIGNED},
+	{0x0111e1, 0x0111f4, PG_U_OTHER_NUMBER},
+	{0x0111f5, 0x0111ff, PG_U_UNASSIGNED},
+	{0x011200, 0x011211, PG_U_OTHER_LETTER},
+	{0x011212, 0x011212, PG_U_UNASSIGNED},
+	{0x011213, 0x01122b, PG_U_OTHER_LETTER},
+	{0x01122c, 0x01122e, PG_U_COMBINING_SPACING_MARK},
+	{0x01122f, 0x011231, PG_U_NON_SPACING_MARK},
+	{0x011232, 0x011233, PG_U_COMBINING_SPACING_MARK},
+	{0x011234, 0x011234, PG_U_NON_SPACING_MARK},
+	{0x011235, 0x011235, PG_U_COMBINING_SPACING_MARK},
+	{0x011236, 0x011237, PG_U_NON_SPACING_MARK},
+	{0x011238, 0x01123d, PG_U_OTHER_PUNCTUATION},
+	{0x01123e, 0x01123e, PG_U_NON_SPACING_MARK},
+	{0x01123f, 0x011240, PG_U_OTHER_LETTER},
+	{0x011241, 0x011241, PG_U_NON_SPACING_MARK},
+	{0x011242, 0x01127f, PG_U_UNASSIGNED},
+	{0x011280, 0x011286, PG_U_OTHER_LETTER},
+	{0x011287, 0x011287, PG_U_UNASSIGNED},
+	{0x011288, 0x011288, PG_U_OTHER_LETTER},
+	{0x011289, 0x011289, PG_U_UNASSIGNED},
+	{0x01128a, 0x01128d, PG_U_OTHER_LETTER},
+	{0x01128e, 0x01128e, PG_U_UNASSIGNED},
+	{0x01128f, 0x01129d, PG_U_OTHER_LETTER},
+	{0x01129e, 0x01129e, PG_U_UNASSIGNED},
+	{0x01129f, 0x0112a8, PG_U_OTHER_LETTER},
+	{0x0112a9, 0x0112a9, PG_U_OTHER_PUNCTUATION},
+	{0x0112aa, 0x0112af, PG_U_UNASSIGNED},
+	{0x0112b0, 0x0112de, PG_U_OTHER_LETTER},
+	{0x0112df, 0x0112df, PG_U_NON_SPACING_MARK},
+	{0x0112e0, 0x0112e2, PG_U_COMBINING_SPACING_MARK},
+	{0x0112e3, 0x0112ea, PG_U_NON_SPACING_MARK},
+	{0x0112eb, 0x0112ef, PG_U_UNASSIGNED},
+	{0x0112f0, 0x0112f9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0112fa, 0x0112ff, PG_U_UNASSIGNED},
+	{0x011300, 0x011301, PG_U_NON_SPACING_MARK},
+	{0x011302, 0x011303, PG_U_COMBINING_SPACING_MARK},
+	{0x011304, 0x011304, PG_U_UNASSIGNED},
+	{0x011305, 0x01130c, PG_U_OTHER_LETTER},
+	{0x01130d, 0x01130e, PG_U_UNASSIGNED},
+	{0x01130f, 0x011310, PG_U_OTHER_LETTER},
+	{0x011311, 0x011312, PG_U_UNASSIGNED},
+	{0x011313, 0x011328, PG_U_OTHER_LETTER},
+	{0x011329, 0x011329, PG_U_UNASSIGNED},
+	{0x01132a, 0x011330, PG_U_OTHER_LETTER},
+	{0x011331, 0x011331, PG_U_UNASSIGNED},
+	{0x011332, 0x011333, PG_U_OTHER_LETTER},
+	{0x011334, 0x011334, PG_U_UNASSIGNED},
+	{0x011335, 0x011339, PG_U_OTHER_LETTER},
+	{0x01133a, 0x01133a, PG_U_UNASSIGNED},
+	{0x01133b, 0x01133c, PG_U_NON_SPACING_MARK},
+	{0x01133d, 0x01133d, PG_U_OTHER_LETTER},
+	{0x01133e, 0x01133f, PG_U_COMBINING_SPACING_MARK},
+	{0x011340, 0x011340, PG_U_NON_SPACING_MARK},
+	{0x011341, 0x011344, PG_U_COMBINING_SPACING_MARK},
+	{0x011345, 0x011346, PG_U_UNASSIGNED},
+	{0x011347, 0x011348, PG_U_COMBINING_SPACING_MARK},
+	{0x011349, 0x01134a, PG_U_UNASSIGNED},
+	{0x01134b, 0x01134d, PG_U_COMBINING_SPACING_MARK},
+	{0x01134e, 0x01134f, PG_U_UNASSIGNED},
+	{0x011350, 0x011350, PG_U_OTHER_LETTER},
+	{0x011351, 0x011356, PG_U_UNASSIGNED},
+	{0x011357, 0x011357, PG_U_COMBINING_SPACING_MARK},
+	{0x011358, 0x01135c, PG_U_UNASSIGNED},
+	{0x01135d, 0x011361, PG_U_OTHER_LETTER},
+	{0x011362, 0x011363, PG_U_COMBINING_SPACING_MARK},
+	{0x011364, 0x011365, PG_U_UNASSIGNED},
+	{0x011366, 0x01136c, PG_U_NON_SPACING_MARK},
+	{0x01136d, 0x01136f, PG_U_UNASSIGNED},
+	{0x011370, 0x011374, PG_U_NON_SPACING_MARK},
+	{0x011375, 0x0113ff, PG_U_UNASSIGNED},
+	{0x011400, 0x011434, PG_U_OTHER_LETTER},
+	{0x011435, 0x011437, PG_U_COMBINING_SPACING_MARK},
+	{0x011438, 0x01143f, PG_U_NON_SPACING_MARK},
+	{0x011440, 0x011441, PG_U_COMBINING_SPACING_MARK},
+	{0x011442, 0x011444, PG_U_NON_SPACING_MARK},
+	{0x011445, 0x011445, PG_U_COMBINING_SPACING_MARK},
+	{0x011446, 0x011446, PG_U_NON_SPACING_MARK},
+	{0x011447, 0x01144a, PG_U_OTHER_LETTER},
+	{0x01144b, 0x01144f, PG_U_OTHER_PUNCTUATION},
+	{0x011450, 0x011459, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x01145a, 0x01145b, PG_U_OTHER_PUNCTUATION},
+	{0x01145c, 0x01145c, PG_U_UNASSIGNED},
+	{0x01145d, 0x01145d, PG_U_OTHER_PUNCTUATION},
+	{0x01145e, 0x01145e, PG_U_NON_SPACING_MARK},
+	{0x01145f, 0x011461, PG_U_OTHER_LETTER},
+	{0x011462, 0x01147f, PG_U_UNASSIGNED},
+	{0x011480, 0x0114af, PG_U_OTHER_LETTER},
+	{0x0114b0, 0x0114b2, PG_U_COMBINING_SPACING_MARK},
+	{0x0114b3, 0x0114b8, PG_U_NON_SPACING_MARK},
+	{0x0114b9, 0x0114b9, PG_U_COMBINING_SPACING_MARK},
+	{0x0114ba, 0x0114ba, PG_U_NON_SPACING_MARK},
+	{0x0114bb, 0x0114be, PG_U_COMBINING_SPACING_MARK},
+	{0x0114bf, 0x0114c0, PG_U_NON_SPACING_MARK},
+	{0x0114c1, 0x0114c1, PG_U_COMBINING_SPACING_MARK},
+	{0x0114c2, 0x0114c3, PG_U_NON_SPACING_MARK},
+	{0x0114c4, 0x0114c5, PG_U_OTHER_LETTER},
+	{0x0114c6, 0x0114c6, PG_U_OTHER_PUNCTUATION},
+	{0x0114c7, 0x0114c7, PG_U_OTHER_LETTER},
+	{0x0114c8, 0x0114cf, PG_U_UNASSIGNED},
+	{0x0114d0, 0x0114d9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0114da, 0x01157f, PG_U_UNASSIGNED},
+	{0x011580, 0x0115ae, PG_U_OTHER_LETTER},
+	{0x0115af, 0x0115b1, PG_U_COMBINING_SPACING_MARK},
+	{0x0115b2, 0x0115b5, PG_U_NON_SPACING_MARK},
+	{0x0115b6, 0x0115b7, PG_U_UNASSIGNED},
+	{0x0115b8, 0x0115bb, PG_U_COMBINING_SPACING_MARK},
+	{0x0115bc, 0x0115bd, PG_U_NON_SPACING_MARK},
+	{0x0115be, 0x0115be, PG_U_COMBINING_SPACING_MARK},
+	{0x0115bf, 0x0115c0, PG_U_NON_SPACING_MARK},
+	{0x0115c1, 0x0115d7, PG_U_OTHER_PUNCTUATION},
+	{0x0115d8, 0x0115db, PG_U_OTHER_LETTER},
+	{0x0115dc, 0x0115dd, PG_U_NON_SPACING_MARK},
+	{0x0115de, 0x0115ff, PG_U_UNASSIGNED},
+	{0x011600, 0x01162f, PG_U_OTHER_LETTER},
+	{0x011630, 0x011632, PG_U_COMBINING_SPACING_MARK},
+	{0x011633, 0x01163a, PG_U_NON_SPACING_MARK},
+	{0x01163b, 0x01163c, PG_U_COMBINING_SPACING_MARK},
+	{0x01163d, 0x01163d, PG_U_NON_SPACING_MARK},
+	{0x01163e, 0x01163e, PG_U_COMBINING_SPACING_MARK},
+	{0x01163f, 0x011640, PG_U_NON_SPACING_MARK},
+	{0x011641, 0x011643, PG_U_OTHER_PUNCTUATION},
+	{0x011644, 0x011644, PG_U_OTHER_LETTER},
+	{0x011645, 0x01164f, PG_U_UNASSIGNED},
+	{0x011650, 0x011659, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x01165a, 0x01165f, PG_U_UNASSIGNED},
+	{0x011660, 0x01166c, PG_U_OTHER_PUNCTUATION},
+	{0x01166d, 0x01167f, PG_U_UNASSIGNED},
+	{0x011680, 0x0116aa, PG_U_OTHER_LETTER},
+	{0x0116ab, 0x0116ab, PG_U_NON_SPACING_MARK},
+	{0x0116ac, 0x0116ac, PG_U_COMBINING_SPACING_MARK},
+	{0x0116ad, 0x0116ad, PG_U_NON_SPACING_MARK},
+	{0x0116ae, 0x0116af, PG_U_COMBINING_SPACING_MARK},
+	{0x0116b0, 0x0116b5, PG_U_NON_SPACING_MARK},
+	{0x0116b6, 0x0116b6, PG_U_COMBINING_SPACING_MARK},
+	{0x0116b7, 0x0116b7, PG_U_NON_SPACING_MARK},
+	{0x0116b8, 0x0116b8, PG_U_OTHER_LETTER},
+	{0x0116b9, 0x0116b9, PG_U_OTHER_PUNCTUATION},
+	{0x0116ba, 0x0116bf, PG_U_UNASSIGNED},
+	{0x0116c0, 0x0116c9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0116ca, 0x0116ff, PG_U_UNASSIGNED},
+	{0x011700, 0x01171a, PG_U_OTHER_LETTER},
+	{0x01171b, 0x01171c, PG_U_UNASSIGNED},
+	{0x01171d, 0x01171f, PG_U_NON_SPACING_MARK},
+	{0x011720, 0x011721, PG_U_COMBINING_SPACING_MARK},
+	{0x011722, 0x011725, PG_U_NON_SPACING_MARK},
+	{0x011726, 0x011726, PG_U_COMBINING_SPACING_MARK},
+	{0x011727, 0x01172b, PG_U_NON_SPACING_MARK},
+	{0x01172c, 0x01172f, PG_U_UNASSIGNED},
+	{0x011730, 0x011739, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x01173a, 0x01173b, PG_U_OTHER_NUMBER},
+	{0x01173c, 0x01173e, PG_U_OTHER_PUNCTUATION},
+	{0x01173f, 0x01173f, PG_U_OTHER_SYMBOL},
+	{0x011740, 0x011746, PG_U_OTHER_LETTER},
+	{0x011747, 0x0117ff, PG_U_UNASSIGNED},
+	{0x011800, 0x01182b, PG_U_OTHER_LETTER},
+	{0x01182c, 0x01182e, PG_U_COMBINING_SPACING_MARK},
+	{0x01182f, 0x011837, PG_U_NON_SPACING_MARK},
+	{0x011838, 0x011838, PG_U_COMBINING_SPACING_MARK},
+	{0x011839, 0x01183a, PG_U_NON_SPACING_MARK},
+	{0x01183b, 0x01183b, PG_U_OTHER_PUNCTUATION},
+	{0x01183c, 0x01189f, PG_U_UNASSIGNED},
+	{0x0118a0, 0x0118bf, PG_U_UPPERCASE_LETTER},
+	{0x0118c0, 0x0118df, PG_U_LOWERCASE_LETTER},
+	{0x0118e0, 0x0118e9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x0118ea, 0x0118f2, PG_U_OTHER_NUMBER},
+	{0x0118f3, 0x0118fe, PG_U_UNASSIGNED},
+	{0x0118ff, 0x011906, PG_U_OTHER_LETTER},
+	{0x011907, 0x011908, PG_U_UNASSIGNED},
+	{0x011909, 0x011909, PG_U_OTHER_LETTER},
+	{0x01190a, 0x01190b, PG_U_UNASSIGNED},
+	{0x01190c, 0x011913, PG_U_OTHER_LETTER},
+	{0x011914, 0x011914, PG_U_UNASSIGNED},
+	{0x011915, 0x011916, PG_U_OTHER_LETTER},
+	{0x011917, 0x011917, PG_U_UNASSIGNED},
+	{0x011918, 0x01192f, PG_U_OTHER_LETTER},
+	{0x011930, 0x011935, PG_U_COMBINING_SPACING_MARK},
+	{0x011936, 0x011936, PG_U_UNASSIGNED},
+	{0x011937, 0x011938, PG_U_COMBINING_SPACING_MARK},
+	{0x011939, 0x01193a, PG_U_UNASSIGNED},
+	{0x01193b, 0x01193c, PG_U_NON_SPACING_MARK},
+	{0x01193d, 0x01193d, PG_U_COMBINING_SPACING_MARK},
+	{0x01193e, 0x01193e, PG_U_NON_SPACING_MARK},
+	{0x01193f, 0x01193f, PG_U_OTHER_LETTER},
+	{0x011940, 0x011940, PG_U_COMBINING_SPACING_MARK},
+	{0x011941, 0x011941, PG_U_OTHER_LETTER},
+	{0x011942, 0x011942, PG_U_COMBINING_SPACING_MARK},
+	{0x011943, 0x011943, PG_U_NON_SPACING_MARK},
+	{0x011944, 0x011946, PG_U_OTHER_PUNCTUATION},
+	{0x011947, 0x01194f, PG_U_UNASSIGNED},
+	{0x011950, 0x011959, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x01195a, 0x01199f, PG_U_UNASSIGNED},
+	{0x0119a0, 0x0119a7, PG_U_OTHER_LETTER},
+	{0x0119a8, 0x0119a9, PG_U_UNASSIGNED},
+	{0x0119aa, 0x0119d0, PG_U_OTHER_LETTER},
+	{0x0119d1, 0x0119d3, PG_U_COMBINING_SPACING_MARK},
+	{0x0119d4, 0x0119d7, PG_U_NON_SPACING_MARK},
+	{0x0119d8, 0x0119d9, PG_U_UNASSIGNED},
+	{0x0119da, 0x0119db, PG_U_NON_SPACING_MARK},
+	{0x0119dc, 0x0119df, PG_U_COMBINING_SPACING_MARK},
+	{0x0119e0, 0x0119e0, PG_U_NON_SPACING_MARK},
+	{0x0119e1, 0x0119e1, PG_U_OTHER_LETTER},
+	{0x0119e2, 0x0119e2, PG_U_OTHER_PUNCTUATION},
+	{0x0119e3, 0x0119e3, PG_U_OTHER_LETTER},
+	{0x0119e4, 0x0119e4, PG_U_COMBINING_SPACING_MARK},
+	{0x0119e5, 0x0119ff, PG_U_UNASSIGNED},
+	{0x011a00, 0x011a00, PG_U_OTHER_LETTER},
+	{0x011a01, 0x011a0a, PG_U_NON_SPACING_MARK},
+	{0x011a0b, 0x011a32, PG_U_OTHER_LETTER},
+	{0x011a33, 0x011a38, PG_U_NON_SPACING_MARK},
+	{0x011a39, 0x011a39, PG_U_COMBINING_SPACING_MARK},
+	{0x011a3a, 0x011a3a, PG_U_OTHER_LETTER},
+	{0x011a3b, 0x011a3e, PG_U_NON_SPACING_MARK},
+	{0x011a3f, 0x011a46, PG_U_OTHER_PUNCTUATION},
+	{0x011a47, 0x011a47, PG_U_NON_SPACING_MARK},
+	{0x011a48, 0x011a4f, PG_U_UNASSIGNED},
+	{0x011a50, 0x011a50, PG_U_OTHER_LETTER},
+	{0x011a51, 0x011a56, PG_U_NON_SPACING_MARK},
+	{0x011a57, 0x011a58, PG_U_COMBINING_SPACING_MARK},
+	{0x011a59, 0x011a5b, PG_U_NON_SPACING_MARK},
+	{0x011a5c, 0x011a89, PG_U_OTHER_LETTER},
+	{0x011a8a, 0x011a96, PG_U_NON_SPACING_MARK},
+	{0x011a97, 0x011a97, PG_U_COMBINING_SPACING_MARK},
+	{0x011a98, 0x011a99, PG_U_NON_SPACING_MARK},
+	{0x011a9a, 0x011a9c, PG_U_OTHER_PUNCTUATION},
+	{0x011a9d, 0x011a9d, PG_U_OTHER_LETTER},
+	{0x011a9e, 0x011aa2, PG_U_OTHER_PUNCTUATION},
+	{0x011aa3, 0x011aaf, PG_U_UNASSIGNED},
+	{0x011ab0, 0x011af8, PG_U_OTHER_LETTER},
+	{0x011af9, 0x011aff, PG_U_UNASSIGNED},
+	{0x011b00, 0x011b09, PG_U_OTHER_PUNCTUATION},
+	{0x011b0a, 0x011bff, PG_U_UNASSIGNED},
+	{0x011c00, 0x011c08, PG_U_OTHER_LETTER},
+	{0x011c09, 0x011c09, PG_U_UNASSIGNED},
+	{0x011c0a, 0x011c2e, PG_U_OTHER_LETTER},
+	{0x011c2f, 0x011c2f, PG_U_COMBINING_SPACING_MARK},
+	{0x011c30, 0x011c36, PG_U_NON_SPACING_MARK},
+	{0x011c37, 0x011c37, PG_U_UNASSIGNED},
+	{0x011c38, 0x011c3d, PG_U_NON_SPACING_MARK},
+	{0x011c3e, 0x011c3e, PG_U_COMBINING_SPACING_MARK},
+	{0x011c3f, 0x011c3f, PG_U_NON_SPACING_MARK},
+	{0x011c40, 0x011c40, PG_U_OTHER_LETTER},
+	{0x011c41, 0x011c45, PG_U_OTHER_PUNCTUATION},
+	{0x011c46, 0x011c4f, PG_U_UNASSIGNED},
+	{0x011c50, 0x011c59, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x011c5a, 0x011c6c, PG_U_OTHER_NUMBER},
+	{0x011c6d, 0x011c6f, PG_U_UNASSIGNED},
+	{0x011c70, 0x011c71, PG_U_OTHER_PUNCTUATION},
+	{0x011c72, 0x011c8f, PG_U_OTHER_LETTER},
+	{0x011c90, 0x011c91, PG_U_UNASSIGNED},
+	{0x011c92, 0x011ca7, PG_U_NON_SPACING_MARK},
+	{0x011ca8, 0x011ca8, PG_U_UNASSIGNED},
+	{0x011ca9, 0x011ca9, PG_U_COMBINING_SPACING_MARK},
+	{0x011caa, 0x011cb0, PG_U_NON_SPACING_MARK},
+	{0x011cb1, 0x011cb1, PG_U_COMBINING_SPACING_MARK},
+	{0x011cb2, 0x011cb3, PG_U_NON_SPACING_MARK},
+	{0x011cb4, 0x011cb4, PG_U_COMBINING_SPACING_MARK},
+	{0x011cb5, 0x011cb6, PG_U_NON_SPACING_MARK},
+	{0x011cb7, 0x011cff, PG_U_UNASSIGNED},
+	{0x011d00, 0x011d06, PG_U_OTHER_LETTER},
+	{0x011d07, 0x011d07, PG_U_UNASSIGNED},
+	{0x011d08, 0x011d09, PG_U_OTHER_LETTER},
+	{0x011d0a, 0x011d0a, PG_U_UNASSIGNED},
+	{0x011d0b, 0x011d30, PG_U_OTHER_LETTER},
+	{0x011d31, 0x011d36, PG_U_NON_SPACING_MARK},
+	{0x011d37, 0x011d39, PG_U_UNASSIGNED},
+	{0x011d3a, 0x011d3a, PG_U_NON_SPACING_MARK},
+	{0x011d3b, 0x011d3b, PG_U_UNASSIGNED},
+	{0x011d3c, 0x011d3d, PG_U_NON_SPACING_MARK},
+	{0x011d3e, 0x011d3e, PG_U_UNASSIGNED},
+	{0x011d3f, 0x011d45, PG_U_NON_SPACING_MARK},
+	{0x011d46, 0x011d46, PG_U_OTHER_LETTER},
+	{0x011d47, 0x011d47, PG_U_NON_SPACING_MARK},
+	{0x011d48, 0x011d4f, PG_U_UNASSIGNED},
+	{0x011d50, 0x011d59, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x011d5a, 0x011d5f, PG_U_UNASSIGNED},
+	{0x011d60, 0x011d65, PG_U_OTHER_LETTER},
+	{0x011d66, 0x011d66, PG_U_UNASSIGNED},
+	{0x011d67, 0x011d68, PG_U_OTHER_LETTER},
+	{0x011d69, 0x011d69, PG_U_UNASSIGNED},
+	{0x011d6a, 0x011d89, PG_U_OTHER_LETTER},
+	{0x011d8a, 0x011d8e, PG_U_COMBINING_SPACING_MARK},
+	{0x011d8f, 0x011d8f, PG_U_UNASSIGNED},
+	{0x011d90, 0x011d91, PG_U_NON_SPACING_MARK},
+	{0x011d92, 0x011d92, PG_U_UNASSIGNED},
+	{0x011d93, 0x011d94, PG_U_COMBINING_SPACING_MARK},
+	{0x011d95, 0x011d95, PG_U_NON_SPACING_MARK},
+	{0x011d96, 0x011d96, PG_U_COMBINING_SPACING_MARK},
+	{0x011d97, 0x011d97, PG_U_NON_SPACING_MARK},
+	{0x011d98, 0x011d98, PG_U_OTHER_LETTER},
+	{0x011d99, 0x011d9f, PG_U_UNASSIGNED},
+	{0x011da0, 0x011da9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x011daa, 0x011edf, PG_U_UNASSIGNED},
+	{0x011ee0, 0x011ef2, PG_U_OTHER_LETTER},
+	{0x011ef3, 0x011ef4, PG_U_NON_SPACING_MARK},
+	{0x011ef5, 0x011ef6, PG_U_COMBINING_SPACING_MARK},
+	{0x011ef7, 0x011ef8, PG_U_OTHER_PUNCTUATION},
+	{0x011ef9, 0x011eff, PG_U_UNASSIGNED},
+	{0x011f00, 0x011f01, PG_U_NON_SPACING_MARK},
+	{0x011f02, 0x011f02, PG_U_OTHER_LETTER},
+	{0x011f03, 0x011f03, PG_U_COMBINING_SPACING_MARK},
+	{0x011f04, 0x011f10, PG_U_OTHER_LETTER},
+	{0x011f11, 0x011f11, PG_U_UNASSIGNED},
+	{0x011f12, 0x011f33, PG_U_OTHER_LETTER},
+	{0x011f34, 0x011f35, PG_U_COMBINING_SPACING_MARK},
+	{0x011f36, 0x011f3a, PG_U_NON_SPACING_MARK},
+	{0x011f3b, 0x011f3d, PG_U_UNASSIGNED},
+	{0x011f3e, 0x011f3f, PG_U_COMBINING_SPACING_MARK},
+	{0x011f40, 0x011f40, PG_U_NON_SPACING_MARK},
+	{0x011f41, 0x011f41, PG_U_COMBINING_SPACING_MARK},
+	{0x011f42, 0x011f42, PG_U_NON_SPACING_MARK},
+	{0x011f43, 0x011f4f, PG_U_OTHER_PUNCTUATION},
+	{0x011f50, 0x011f59, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x011f5a, 0x011faf, PG_U_UNASSIGNED},
+	{0x011fb0, 0x011fb0, PG_U_OTHER_LETTER},
+	{0x011fb1, 0x011fbf, PG_U_UNASSIGNED},
+	{0x011fc0, 0x011fd4, PG_U_OTHER_NUMBER},
+	{0x011fd5, 0x011fdc, PG_U_OTHER_SYMBOL},
+	{0x011fdd, 0x011fe0, PG_U_CURRENCY_SYMBOL},
+	{0x011fe1, 0x011ff1, PG_U_OTHER_SYMBOL},
+	{0x011ff2, 0x011ffe, PG_U_UNASSIGNED},
+	{0x011fff, 0x011fff, PG_U_OTHER_PUNCTUATION},
+	{0x012000, 0x012399, PG_U_OTHER_LETTER},
+	{0x01239a, 0x0123ff, PG_U_UNASSIGNED},
+	{0x012400, 0x01246e, PG_U_LETTER_NUMBER},
+	{0x01246f, 0x01246f, PG_U_UNASSIGNED},
+	{0x012470, 0x012474, PG_U_OTHER_PUNCTUATION},
+	{0x012475, 0x01247f, PG_U_UNASSIGNED},
+	{0x012480, 0x012543, PG_U_OTHER_LETTER},
+	{0x012544, 0x012f8f, PG_U_UNASSIGNED},
+	{0x012f90, 0x012ff0, PG_U_OTHER_LETTER},
+	{0x012ff1, 0x012ff2, PG_U_OTHER_PUNCTUATION},
+	{0x012ff3, 0x012fff, PG_U_UNASSIGNED},
+	{0x013000, 0x01342f, PG_U_OTHER_LETTER},
+	{0x013430, 0x01343f, PG_U_FORMAT_CHAR},
+	{0x013440, 0x013440, PG_U_NON_SPACING_MARK},
+	{0x013441, 0x013446, PG_U_OTHER_LETTER},
+	{0x013447, 0x013455, PG_U_NON_SPACING_MARK},
+	{0x013456, 0x0143ff, PG_U_UNASSIGNED},
+	{0x014400, 0x014646, PG_U_OTHER_LETTER},
+	{0x014647, 0x0167ff, PG_U_UNASSIGNED},
+	{0x016800, 0x016a38, PG_U_OTHER_LETTER},
+	{0x016a39, 0x016a3f, PG_U_UNASSIGNED},
+	{0x016a40, 0x016a5e, PG_U_OTHER_LETTER},
+	{0x016a5f, 0x016a5f, PG_U_UNASSIGNED},
+	{0x016a60, 0x016a69, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x016a6a, 0x016a6d, PG_U_UNASSIGNED},
+	{0x016a6e, 0x016a6f, PG_U_OTHER_PUNCTUATION},
+	{0x016a70, 0x016abe, PG_U_OTHER_LETTER},
+	{0x016abf, 0x016abf, PG_U_UNASSIGNED},
+	{0x016ac0, 0x016ac9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x016aca, 0x016acf, PG_U_UNASSIGNED},
+	{0x016ad0, 0x016aed, PG_U_OTHER_LETTER},
+	{0x016aee, 0x016aef, PG_U_UNASSIGNED},
+	{0x016af0, 0x016af4, PG_U_NON_SPACING_MARK},
+	{0x016af5, 0x016af5, PG_U_OTHER_PUNCTUATION},
+	{0x016af6, 0x016aff, PG_U_UNASSIGNED},
+	{0x016b00, 0x016b2f, PG_U_OTHER_LETTER},
+	{0x016b30, 0x016b36, PG_U_NON_SPACING_MARK},
+	{0x016b37, 0x016b3b, PG_U_OTHER_PUNCTUATION},
+	{0x016b3c, 0x016b3f, PG_U_OTHER_SYMBOL},
+	{0x016b40, 0x016b43, PG_U_MODIFIER_LETTER},
+	{0x016b44, 0x016b44, PG_U_OTHER_PUNCTUATION},
+	{0x016b45, 0x016b45, PG_U_OTHER_SYMBOL},
+	{0x016b46, 0x016b4f, PG_U_UNASSIGNED},
+	{0x016b50, 0x016b59, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x016b5a, 0x016b5a, PG_U_UNASSIGNED},
+	{0x016b5b, 0x016b61, PG_U_OTHER_NUMBER},
+	{0x016b62, 0x016b62, PG_U_UNASSIGNED},
+	{0x016b63, 0x016b77, PG_U_OTHER_LETTER},
+	{0x016b78, 0x016b7c, PG_U_UNASSIGNED},
+	{0x016b7d, 0x016b8f, PG_U_OTHER_LETTER},
+	{0x016b90, 0x016e3f, PG_U_UNASSIGNED},
+	{0x016e40, 0x016e5f, PG_U_UPPERCASE_LETTER},
+	{0x016e60, 0x016e7f, PG_U_LOWERCASE_LETTER},
+	{0x016e80, 0x016e96, PG_U_OTHER_NUMBER},
+	{0x016e97, 0x016e9a, PG_U_OTHER_PUNCTUATION},
+	{0x016e9b, 0x016eff, PG_U_UNASSIGNED},
+	{0x016f00, 0x016f4a, PG_U_OTHER_LETTER},
+	{0x016f4b, 0x016f4e, PG_U_UNASSIGNED},
+	{0x016f4f, 0x016f4f, PG_U_NON_SPACING_MARK},
+	{0x016f50, 0x016f50, PG_U_OTHER_LETTER},
+	{0x016f51, 0x016f87, PG_U_COMBINING_SPACING_MARK},
+	{0x016f88, 0x016f8e, PG_U_UNASSIGNED},
+	{0x016f8f, 0x016f92, PG_U_NON_SPACING_MARK},
+	{0x016f93, 0x016f9f, PG_U_MODIFIER_LETTER},
+	{0x016fa0, 0x016fdf, PG_U_UNASSIGNED},
+	{0x016fe0, 0x016fe1, PG_U_MODIFIER_LETTER},
+	{0x016fe2, 0x016fe2, PG_U_OTHER_PUNCTUATION},
+	{0x016fe3, 0x016fe3, PG_U_MODIFIER_LETTER},
+	{0x016fe4, 0x016fe4, PG_U_NON_SPACING_MARK},
+	{0x016fe5, 0x016fef, PG_U_UNASSIGNED},
+	{0x016ff0, 0x016ff1, PG_U_COMBINING_SPACING_MARK},
+	{0x016ff2, 0x016fff, PG_U_UNASSIGNED},
+	{0x017000, 0x0187f7, PG_U_OTHER_LETTER},
+	{0x0187f8, 0x0187ff, PG_U_UNASSIGNED},
+	{0x018800, 0x018cd5, PG_U_OTHER_LETTER},
+	{0x018cd6, 0x018cff, PG_U_UNASSIGNED},
+	{0x018d00, 0x018d08, PG_U_OTHER_LETTER},
+	{0x018d09, 0x01afef, PG_U_UNASSIGNED},
+	{0x01aff0, 0x01aff3, PG_U_MODIFIER_LETTER},
+	{0x01aff4, 0x01aff4, PG_U_UNASSIGNED},
+	{0x01aff5, 0x01affb, PG_U_MODIFIER_LETTER},
+	{0x01affc, 0x01affc, PG_U_UNASSIGNED},
+	{0x01affd, 0x01affe, PG_U_MODIFIER_LETTER},
+	{0x01afff, 0x01afff, PG_U_UNASSIGNED},
+	{0x01b000, 0x01b122, PG_U_OTHER_LETTER},
+	{0x01b123, 0x01b131, PG_U_UNASSIGNED},
+	{0x01b132, 0x01b132, PG_U_OTHER_LETTER},
+	{0x01b133, 0x01b14f, PG_U_UNASSIGNED},
+	{0x01b150, 0x01b152, PG_U_OTHER_LETTER},
+	{0x01b153, 0x01b154, PG_U_UNASSIGNED},
+	{0x01b155, 0x01b155, PG_U_OTHER_LETTER},
+	{0x01b156, 0x01b163, PG_U_UNASSIGNED},
+	{0x01b164, 0x01b167, PG_U_OTHER_LETTER},
+	{0x01b168, 0x01b16f, PG_U_UNASSIGNED},
+	{0x01b170, 0x01b2fb, PG_U_OTHER_LETTER},
+	{0x01b2fc, 0x01bbff, PG_U_UNASSIGNED},
+	{0x01bc00, 0x01bc6a, PG_U_OTHER_LETTER},
+	{0x01bc6b, 0x01bc6f, PG_U_UNASSIGNED},
+	{0x01bc70, 0x01bc7c, PG_U_OTHER_LETTER},
+	{0x01bc7d, 0x01bc7f, PG_U_UNASSIGNED},
+	{0x01bc80, 0x01bc88, PG_U_OTHER_LETTER},
+	{0x01bc89, 0x01bc8f, PG_U_UNASSIGNED},
+	{0x01bc90, 0x01bc99, PG_U_OTHER_LETTER},
+	{0x01bc9a, 0x01bc9b, PG_U_UNASSIGNED},
+	{0x01bc9c, 0x01bc9c, PG_U_OTHER_SYMBOL},
+	{0x01bc9d, 0x01bc9e, PG_U_NON_SPACING_MARK},
+	{0x01bc9f, 0x01bc9f, PG_U_OTHER_PUNCTUATION},
+	{0x01bca0, 0x01bca3, PG_U_FORMAT_CHAR},
+	{0x01bca4, 0x01ceff, PG_U_UNASSIGNED},
+	{0x01cf00, 0x01cf2d, PG_U_NON_SPACING_MARK},
+	{0x01cf2e, 0x01cf2f, PG_U_UNASSIGNED},
+	{0x01cf30, 0x01cf46, PG_U_NON_SPACING_MARK},
+	{0x01cf47, 0x01cf4f, PG_U_UNASSIGNED},
+	{0x01cf50, 0x01cfc3, PG_U_OTHER_SYMBOL},
+	{0x01cfc4, 0x01cfff, PG_U_UNASSIGNED},
+	{0x01d000, 0x01d0f5, PG_U_OTHER_SYMBOL},
+	{0x01d0f6, 0x01d0ff, PG_U_UNASSIGNED},
+	{0x01d100, 0x01d126, PG_U_OTHER_SYMBOL},
+	{0x01d127, 0x01d128, PG_U_UNASSIGNED},
+	{0x01d129, 0x01d164, PG_U_OTHER_SYMBOL},
+	{0x01d165, 0x01d166, PG_U_COMBINING_SPACING_MARK},
+	{0x01d167, 0x01d169, PG_U_NON_SPACING_MARK},
+	{0x01d16a, 0x01d16c, PG_U_OTHER_SYMBOL},
+	{0x01d16d, 0x01d172, PG_U_COMBINING_SPACING_MARK},
+	{0x01d173, 0x01d17a, PG_U_FORMAT_CHAR},
+	{0x01d17b, 0x01d182, PG_U_NON_SPACING_MARK},
+	{0x01d183, 0x01d184, PG_U_OTHER_SYMBOL},
+	{0x01d185, 0x01d18b, PG_U_NON_SPACING_MARK},
+	{0x01d18c, 0x01d1a9, PG_U_OTHER_SYMBOL},
+	{0x01d1aa, 0x01d1ad, PG_U_NON_SPACING_MARK},
+	{0x01d1ae, 0x01d1ea, PG_U_OTHER_SYMBOL},
+	{0x01d1eb, 0x01d1ff, PG_U_UNASSIGNED},
+	{0x01d200, 0x01d241, PG_U_OTHER_SYMBOL},
+	{0x01d242, 0x01d244, PG_U_NON_SPACING_MARK},
+	{0x01d245, 0x01d245, PG_U_OTHER_SYMBOL},
+	{0x01d246, 0x01d2bf, PG_U_UNASSIGNED},
+	{0x01d2c0, 0x01d2d3, PG_U_OTHER_NUMBER},
+	{0x01d2d4, 0x01d2df, PG_U_UNASSIGNED},
+	{0x01d2e0, 0x01d2f3, PG_U_OTHER_NUMBER},
+	{0x01d2f4, 0x01d2ff, PG_U_UNASSIGNED},
+	{0x01d300, 0x01d356, PG_U_OTHER_SYMBOL},
+	{0x01d357, 0x01d35f, PG_U_UNASSIGNED},
+	{0x01d360, 0x01d378, PG_U_OTHER_NUMBER},
+	{0x01d379, 0x01d3ff, PG_U_UNASSIGNED},
+	{0x01d400, 0x01d419, PG_U_UPPERCASE_LETTER},
+	{0x01d41a, 0x01d433, PG_U_LOWERCASE_LETTER},
+	{0x01d434, 0x01d44d, PG_U_UPPERCASE_LETTER},
+	{0x01d44e, 0x01d454, PG_U_LOWERCASE_LETTER},
+	{0x01d455, 0x01d455, PG_U_UNASSIGNED},
+	{0x01d456, 0x01d467, PG_U_LOWERCASE_LETTER},
+	{0x01d468, 0x01d481, PG_U_UPPERCASE_LETTER},
+	{0x01d482, 0x01d49b, PG_U_LOWERCASE_LETTER},
+	{0x01d49c, 0x01d49c, PG_U_UPPERCASE_LETTER},
+	{0x01d49d, 0x01d49d, PG_U_UNASSIGNED},
+	{0x01d49e, 0x01d49f, PG_U_UPPERCASE_LETTER},
+	{0x01d4a0, 0x01d4a1, PG_U_UNASSIGNED},
+	{0x01d4a2, 0x01d4a2, PG_U_UPPERCASE_LETTER},
+	{0x01d4a3, 0x01d4a4, PG_U_UNASSIGNED},
+	{0x01d4a5, 0x01d4a6, PG_U_UPPERCASE_LETTER},
+	{0x01d4a7, 0x01d4a8, PG_U_UNASSIGNED},
+	{0x01d4a9, 0x01d4ac, PG_U_UPPERCASE_LETTER},
+	{0x01d4ad, 0x01d4ad, PG_U_UNASSIGNED},
+	{0x01d4ae, 0x01d4b5, PG_U_UPPERCASE_LETTER},
+	{0x01d4b6, 0x01d4b9, PG_U_LOWERCASE_LETTER},
+	{0x01d4ba, 0x01d4ba, PG_U_UNASSIGNED},
+	{0x01d4bb, 0x01d4bb, PG_U_LOWERCASE_LETTER},
+	{0x01d4bc, 0x01d4bc, PG_U_UNASSIGNED},
+	{0x01d4bd, 0x01d4c3, PG_U_LOWERCASE_LETTER},
+	{0x01d4c4, 0x01d4c4, PG_U_UNASSIGNED},
+	{0x01d4c5, 0x01d4cf, PG_U_LOWERCASE_LETTER},
+	{0x01d4d0, 0x01d4e9, PG_U_UPPERCASE_LETTER},
+	{0x01d4ea, 0x01d503, PG_U_LOWERCASE_LETTER},
+	{0x01d504, 0x01d505, PG_U_UPPERCASE_LETTER},
+	{0x01d506, 0x01d506, PG_U_UNASSIGNED},
+	{0x01d507, 0x01d50a, PG_U_UPPERCASE_LETTER},
+	{0x01d50b, 0x01d50c, PG_U_UNASSIGNED},
+	{0x01d50d, 0x01d514, PG_U_UPPERCASE_LETTER},
+	{0x01d515, 0x01d515, PG_U_UNASSIGNED},
+	{0x01d516, 0x01d51c, PG_U_UPPERCASE_LETTER},
+	{0x01d51d, 0x01d51d, PG_U_UNASSIGNED},
+	{0x01d51e, 0x01d537, PG_U_LOWERCASE_LETTER},
+	{0x01d538, 0x01d539, PG_U_UPPERCASE_LETTER},
+	{0x01d53a, 0x01d53a, PG_U_UNASSIGNED},
+	{0x01d53b, 0x01d53e, PG_U_UPPERCASE_LETTER},
+	{0x01d53f, 0x01d53f, PG_U_UNASSIGNED},
+	{0x01d540, 0x01d544, PG_U_UPPERCASE_LETTER},
+	{0x01d545, 0x01d545, PG_U_UNASSIGNED},
+	{0x01d546, 0x01d546, PG_U_UPPERCASE_LETTER},
+	{0x01d547, 0x01d549, PG_U_UNASSIGNED},
+	{0x01d54a, 0x01d550, PG_U_UPPERCASE_LETTER},
+	{0x01d551, 0x01d551, PG_U_UNASSIGNED},
+	{0x01d552, 0x01d56b, PG_U_LOWERCASE_LETTER},
+	{0x01d56c, 0x01d585, PG_U_UPPERCASE_LETTER},
+	{0x01d586, 0x01d59f, PG_U_LOWERCASE_LETTER},
+	{0x01d5a0, 0x01d5b9, PG_U_UPPERCASE_LETTER},
+	{0x01d5ba, 0x01d5d3, PG_U_LOWERCASE_LETTER},
+	{0x01d5d4, 0x01d5ed, PG_U_UPPERCASE_LETTER},
+	{0x01d5ee, 0x01d607, PG_U_LOWERCASE_LETTER},
+	{0x01d608, 0x01d621, PG_U_UPPERCASE_LETTER},
+	{0x01d622, 0x01d63b, PG_U_LOWERCASE_LETTER},
+	{0x01d63c, 0x01d655, PG_U_UPPERCASE_LETTER},
+	{0x01d656, 0x01d66f, PG_U_LOWERCASE_LETTER},
+	{0x01d670, 0x01d689, PG_U_UPPERCASE_LETTER},
+	{0x01d68a, 0x01d6a5, PG_U_LOWERCASE_LETTER},
+	{0x01d6a6, 0x01d6a7, PG_U_UNASSIGNED},
+	{0x01d6a8, 0x01d6c0, PG_U_UPPERCASE_LETTER},
+	{0x01d6c1, 0x01d6c1, PG_U_MATH_SYMBOL},
+	{0x01d6c2, 0x01d6da, PG_U_LOWERCASE_LETTER},
+	{0x01d6db, 0x01d6db, PG_U_MATH_SYMBOL},
+	{0x01d6dc, 0x01d6e1, PG_U_LOWERCASE_LETTER},
+	{0x01d6e2, 0x01d6fa, PG_U_UPPERCASE_LETTER},
+	{0x01d6fb, 0x01d6fb, PG_U_MATH_SYMBOL},
+	{0x01d6fc, 0x01d714, PG_U_LOWERCASE_LETTER},
+	{0x01d715, 0x01d715, PG_U_MATH_SYMBOL},
+	{0x01d716, 0x01d71b, PG_U_LOWERCASE_LETTER},
+	{0x01d71c, 0x01d734, PG_U_UPPERCASE_LETTER},
+	{0x01d735, 0x01d735, PG_U_MATH_SYMBOL},
+	{0x01d736, 0x01d74e, PG_U_LOWERCASE_LETTER},
+	{0x01d74f, 0x01d74f, PG_U_MATH_SYMBOL},
+	{0x01d750, 0x01d755, PG_U_LOWERCASE_LETTER},
+	{0x01d756, 0x01d76e, PG_U_UPPERCASE_LETTER},
+	{0x01d76f, 0x01d76f, PG_U_MATH_SYMBOL},
+	{0x01d770, 0x01d788, PG_U_LOWERCASE_LETTER},
+	{0x01d789, 0x01d789, PG_U_MATH_SYMBOL},
+	{0x01d78a, 0x01d78f, PG_U_LOWERCASE_LETTER},
+	{0x01d790, 0x01d7a8, PG_U_UPPERCASE_LETTER},
+	{0x01d7a9, 0x01d7a9, PG_U_MATH_SYMBOL},
+	{0x01d7aa, 0x01d7c2, PG_U_LOWERCASE_LETTER},
+	{0x01d7c3, 0x01d7c3, PG_U_MATH_SYMBOL},
+	{0x01d7c4, 0x01d7c9, PG_U_LOWERCASE_LETTER},
+	{0x01d7ca, 0x01d7ca, PG_U_UPPERCASE_LETTER},
+	{0x01d7cb, 0x01d7cb, PG_U_LOWERCASE_LETTER},
+	{0x01d7cc, 0x01d7cd, PG_U_UNASSIGNED},
+	{0x01d7ce, 0x01d7ff, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x01d800, 0x01d9ff, PG_U_OTHER_SYMBOL},
+	{0x01da00, 0x01da36, PG_U_NON_SPACING_MARK},
+	{0x01da37, 0x01da3a, PG_U_OTHER_SYMBOL},
+	{0x01da3b, 0x01da6c, PG_U_NON_SPACING_MARK},
+	{0x01da6d, 0x01da74, PG_U_OTHER_SYMBOL},
+	{0x01da75, 0x01da75, PG_U_NON_SPACING_MARK},
+	{0x01da76, 0x01da83, PG_U_OTHER_SYMBOL},
+	{0x01da84, 0x01da84, PG_U_NON_SPACING_MARK},
+	{0x01da85, 0x01da86, PG_U_OTHER_SYMBOL},
+	{0x01da87, 0x01da8b, PG_U_OTHER_PUNCTUATION},
+	{0x01da8c, 0x01da9a, PG_U_UNASSIGNED},
+	{0x01da9b, 0x01da9f, PG_U_NON_SPACING_MARK},
+	{0x01daa0, 0x01daa0, PG_U_UNASSIGNED},
+	{0x01daa1, 0x01daaf, PG_U_NON_SPACING_MARK},
+	{0x01dab0, 0x01deff, PG_U_UNASSIGNED},
+	{0x01df00, 0x01df09, PG_U_LOWERCASE_LETTER},
+	{0x01df0a, 0x01df0a, PG_U_OTHER_LETTER},
+	{0x01df0b, 0x01df1e, PG_U_LOWERCASE_LETTER},
+	{0x01df1f, 0x01df24, PG_U_UNASSIGNED},
+	{0x01df25, 0x01df2a, PG_U_LOWERCASE_LETTER},
+	{0x01df2b, 0x01dfff, PG_U_UNASSIGNED},
+	{0x01e000, 0x01e006, PG_U_NON_SPACING_MARK},
+	{0x01e007, 0x01e007, PG_U_UNASSIGNED},
+	{0x01e008, 0x01e018, PG_U_NON_SPACING_MARK},
+	{0x01e019, 0x01e01a, PG_U_UNASSIGNED},
+	{0x01e01b, 0x01e021, PG_U_NON_SPACING_MARK},
+	{0x01e022, 0x01e022, PG_U_UNASSIGNED},
+	{0x01e023, 0x01e024, PG_U_NON_SPACING_MARK},
+	{0x01e025, 0x01e025, PG_U_UNASSIGNED},
+	{0x01e026, 0x01e02a, PG_U_NON_SPACING_MARK},
+	{0x01e02b, 0x01e02f, PG_U_UNASSIGNED},
+	{0x01e030, 0x01e06d, PG_U_MODIFIER_LETTER},
+	{0x01e06e, 0x01e08e, PG_U_UNASSIGNED},
+	{0x01e08f, 0x01e08f, PG_U_NON_SPACING_MARK},
+	{0x01e090, 0x01e0ff, PG_U_UNASSIGNED},
+	{0x01e100, 0x01e12c, PG_U_OTHER_LETTER},
+	{0x01e12d, 0x01e12f, PG_U_UNASSIGNED},
+	{0x01e130, 0x01e136, PG_U_NON_SPACING_MARK},
+	{0x01e137, 0x01e13d, PG_U_MODIFIER_LETTER},
+	{0x01e13e, 0x01e13f, PG_U_UNASSIGNED},
+	{0x01e140, 0x01e149, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x01e14a, 0x01e14d, PG_U_UNASSIGNED},
+	{0x01e14e, 0x01e14e, PG_U_OTHER_LETTER},
+	{0x01e14f, 0x01e14f, PG_U_OTHER_SYMBOL},
+	{0x01e150, 0x01e28f, PG_U_UNASSIGNED},
+	{0x01e290, 0x01e2ad, PG_U_OTHER_LETTER},
+	{0x01e2ae, 0x01e2ae, PG_U_NON_SPACING_MARK},
+	{0x01e2af, 0x01e2bf, PG_U_UNASSIGNED},
+	{0x01e2c0, 0x01e2eb, PG_U_OTHER_LETTER},
+	{0x01e2ec, 0x01e2ef, PG_U_NON_SPACING_MARK},
+	{0x01e2f0, 0x01e2f9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x01e2fa, 0x01e2fe, PG_U_UNASSIGNED},
+	{0x01e2ff, 0x01e2ff, PG_U_CURRENCY_SYMBOL},
+	{0x01e300, 0x01e4cf, PG_U_UNASSIGNED},
+	{0x01e4d0, 0x01e4ea, PG_U_OTHER_LETTER},
+	{0x01e4eb, 0x01e4eb, PG_U_MODIFIER_LETTER},
+	{0x01e4ec, 0x01e4ef, PG_U_NON_SPACING_MARK},
+	{0x01e4f0, 0x01e4f9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x01e4fa, 0x01e7df, PG_U_UNASSIGNED},
+	{0x01e7e0, 0x01e7e6, PG_U_OTHER_LETTER},
+	{0x01e7e7, 0x01e7e7, PG_U_UNASSIGNED},
+	{0x01e7e8, 0x01e7eb, PG_U_OTHER_LETTER},
+	{0x01e7ec, 0x01e7ec, PG_U_UNASSIGNED},
+	{0x01e7ed, 0x01e7ee, PG_U_OTHER_LETTER},
+	{0x01e7ef, 0x01e7ef, PG_U_UNASSIGNED},
+	{0x01e7f0, 0x01e7fe, PG_U_OTHER_LETTER},
+	{0x01e7ff, 0x01e7ff, PG_U_UNASSIGNED},
+	{0x01e800, 0x01e8c4, PG_U_OTHER_LETTER},
+	{0x01e8c5, 0x01e8c6, PG_U_UNASSIGNED},
+	{0x01e8c7, 0x01e8cf, PG_U_OTHER_NUMBER},
+	{0x01e8d0, 0x01e8d6, PG_U_NON_SPACING_MARK},
+	{0x01e8d7, 0x01e8ff, PG_U_UNASSIGNED},
+	{0x01e900, 0x01e921, PG_U_UPPERCASE_LETTER},
+	{0x01e922, 0x01e943, PG_U_LOWERCASE_LETTER},
+	{0x01e944, 0x01e94a, PG_U_NON_SPACING_MARK},
+	{0x01e94b, 0x01e94b, PG_U_MODIFIER_LETTER},
+	{0x01e94c, 0x01e94f, PG_U_UNASSIGNED},
+	{0x01e950, 0x01e959, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x01e95a, 0x01e95d, PG_U_UNASSIGNED},
+	{0x01e95e, 0x01e95f, PG_U_OTHER_PUNCTUATION},
+	{0x01e960, 0x01ec70, PG_U_UNASSIGNED},
+	{0x01ec71, 0x01ecab, PG_U_OTHER_NUMBER},
+	{0x01ecac, 0x01ecac, PG_U_OTHER_SYMBOL},
+	{0x01ecad, 0x01ecaf, PG_U_OTHER_NUMBER},
+	{0x01ecb0, 0x01ecb0, PG_U_CURRENCY_SYMBOL},
+	{0x01ecb1, 0x01ecb4, PG_U_OTHER_NUMBER},
+	{0x01ecb5, 0x01ed00, PG_U_UNASSIGNED},
+	{0x01ed01, 0x01ed2d, PG_U_OTHER_NUMBER},
+	{0x01ed2e, 0x01ed2e, PG_U_OTHER_SYMBOL},
+	{0x01ed2f, 0x01ed3d, PG_U_OTHER_NUMBER},
+	{0x01ed3e, 0x01edff, PG_U_UNASSIGNED},
+	{0x01ee00, 0x01ee03, PG_U_OTHER_LETTER},
+	{0x01ee04, 0x01ee04, PG_U_UNASSIGNED},
+	{0x01ee05, 0x01ee1f, PG_U_OTHER_LETTER},
+	{0x01ee20, 0x01ee20, PG_U_UNASSIGNED},
+	{0x01ee21, 0x01ee22, PG_U_OTHER_LETTER},
+	{0x01ee23, 0x01ee23, PG_U_UNASSIGNED},
+	{0x01ee24, 0x01ee24, PG_U_OTHER_LETTER},
+	{0x01ee25, 0x01ee26, PG_U_UNASSIGNED},
+	{0x01ee27, 0x01ee27, PG_U_OTHER_LETTER},
+	{0x01ee28, 0x01ee28, PG_U_UNASSIGNED},
+	{0x01ee29, 0x01ee32, PG_U_OTHER_LETTER},
+	{0x01ee33, 0x01ee33, PG_U_UNASSIGNED},
+	{0x01ee34, 0x01ee37, PG_U_OTHER_LETTER},
+	{0x01ee38, 0x01ee38, PG_U_UNASSIGNED},
+	{0x01ee39, 0x01ee39, PG_U_OTHER_LETTER},
+	{0x01ee3a, 0x01ee3a, PG_U_UNASSIGNED},
+	{0x01ee3b, 0x01ee3b, PG_U_OTHER_LETTER},
+	{0x01ee3c, 0x01ee41, PG_U_UNASSIGNED},
+	{0x01ee42, 0x01ee42, PG_U_OTHER_LETTER},
+	{0x01ee43, 0x01ee46, PG_U_UNASSIGNED},
+	{0x01ee47, 0x01ee47, PG_U_OTHER_LETTER},
+	{0x01ee48, 0x01ee48, PG_U_UNASSIGNED},
+	{0x01ee49, 0x01ee49, PG_U_OTHER_LETTER},
+	{0x01ee4a, 0x01ee4a, PG_U_UNASSIGNED},
+	{0x01ee4b, 0x01ee4b, PG_U_OTHER_LETTER},
+	{0x01ee4c, 0x01ee4c, PG_U_UNASSIGNED},
+	{0x01ee4d, 0x01ee4f, PG_U_OTHER_LETTER},
+	{0x01ee50, 0x01ee50, PG_U_UNASSIGNED},
+	{0x01ee51, 0x01ee52, PG_U_OTHER_LETTER},
+	{0x01ee53, 0x01ee53, PG_U_UNASSIGNED},
+	{0x01ee54, 0x01ee54, PG_U_OTHER_LETTER},
+	{0x01ee55, 0x01ee56, PG_U_UNASSIGNED},
+	{0x01ee57, 0x01ee57, PG_U_OTHER_LETTER},
+	{0x01ee58, 0x01ee58, PG_U_UNASSIGNED},
+	{0x01ee59, 0x01ee59, PG_U_OTHER_LETTER},
+	{0x01ee5a, 0x01ee5a, PG_U_UNASSIGNED},
+	{0x01ee5b, 0x01ee5b, PG_U_OTHER_LETTER},
+	{0x01ee5c, 0x01ee5c, PG_U_UNASSIGNED},
+	{0x01ee5d, 0x01ee5d, PG_U_OTHER_LETTER},
+	{0x01ee5e, 0x01ee5e, PG_U_UNASSIGNED},
+	{0x01ee5f, 0x01ee5f, PG_U_OTHER_LETTER},
+	{0x01ee60, 0x01ee60, PG_U_UNASSIGNED},
+	{0x01ee61, 0x01ee62, PG_U_OTHER_LETTER},
+	{0x01ee63, 0x01ee63, PG_U_UNASSIGNED},
+	{0x01ee64, 0x01ee64, PG_U_OTHER_LETTER},
+	{0x01ee65, 0x01ee66, PG_U_UNASSIGNED},
+	{0x01ee67, 0x01ee6a, PG_U_OTHER_LETTER},
+	{0x01ee6b, 0x01ee6b, PG_U_UNASSIGNED},
+	{0x01ee6c, 0x01ee72, PG_U_OTHER_LETTER},
+	{0x01ee73, 0x01ee73, PG_U_UNASSIGNED},
+	{0x01ee74, 0x01ee77, PG_U_OTHER_LETTER},
+	{0x01ee78, 0x01ee78, PG_U_UNASSIGNED},
+	{0x01ee79, 0x01ee7c, PG_U_OTHER_LETTER},
+	{0x01ee7d, 0x01ee7d, PG_U_UNASSIGNED},
+	{0x01ee7e, 0x01ee7e, PG_U_OTHER_LETTER},
+	{0x01ee7f, 0x01ee7f, PG_U_UNASSIGNED},
+	{0x01ee80, 0x01ee89, PG_U_OTHER_LETTER},
+	{0x01ee8a, 0x01ee8a, PG_U_UNASSIGNED},
+	{0x01ee8b, 0x01ee9b, PG_U_OTHER_LETTER},
+	{0x01ee9c, 0x01eea0, PG_U_UNASSIGNED},
+	{0x01eea1, 0x01eea3, PG_U_OTHER_LETTER},
+	{0x01eea4, 0x01eea4, PG_U_UNASSIGNED},
+	{0x01eea5, 0x01eea9, PG_U_OTHER_LETTER},
+	{0x01eeaa, 0x01eeaa, PG_U_UNASSIGNED},
+	{0x01eeab, 0x01eebb, PG_U_OTHER_LETTER},
+	{0x01eebc, 0x01eeef, PG_U_UNASSIGNED},
+	{0x01eef0, 0x01eef1, PG_U_MATH_SYMBOL},
+	{0x01eef2, 0x01efff, PG_U_UNASSIGNED},
+	{0x01f000, 0x01f02b, PG_U_OTHER_SYMBOL},
+	{0x01f02c, 0x01f02f, PG_U_UNASSIGNED},
+	{0x01f030, 0x01f093, PG_U_OTHER_SYMBOL},
+	{0x01f094, 0x01f09f, PG_U_UNASSIGNED},
+	{0x01f0a0, 0x01f0ae, PG_U_OTHER_SYMBOL},
+	{0x01f0af, 0x01f0b0, PG_U_UNASSIGNED},
+	{0x01f0b1, 0x01f0bf, PG_U_OTHER_SYMBOL},
+	{0x01f0c0, 0x01f0c0, PG_U_UNASSIGNED},
+	{0x01f0c1, 0x01f0cf, PG_U_OTHER_SYMBOL},
+	{0x01f0d0, 0x01f0d0, PG_U_UNASSIGNED},
+	{0x01f0d1, 0x01f0f5, PG_U_OTHER_SYMBOL},
+	{0x01f0f6, 0x01f0ff, PG_U_UNASSIGNED},
+	{0x01f100, 0x01f10c, PG_U_OTHER_NUMBER},
+	{0x01f10d, 0x01f1ad, PG_U_OTHER_SYMBOL},
+	{0x01f1ae, 0x01f1e5, PG_U_UNASSIGNED},
+	{0x01f1e6, 0x01f202, PG_U_OTHER_SYMBOL},
+	{0x01f203, 0x01f20f, PG_U_UNASSIGNED},
+	{0x01f210, 0x01f23b, PG_U_OTHER_SYMBOL},
+	{0x01f23c, 0x01f23f, PG_U_UNASSIGNED},
+	{0x01f240, 0x01f248, PG_U_OTHER_SYMBOL},
+	{0x01f249, 0x01f24f, PG_U_UNASSIGNED},
+	{0x01f250, 0x01f251, PG_U_OTHER_SYMBOL},
+	{0x01f252, 0x01f25f, PG_U_UNASSIGNED},
+	{0x01f260, 0x01f265, PG_U_OTHER_SYMBOL},
+	{0x01f266, 0x01f2ff, PG_U_UNASSIGNED},
+	{0x01f300, 0x01f3fa, PG_U_OTHER_SYMBOL},
+	{0x01f3fb, 0x01f3ff, PG_U_MODIFIER_SYMBOL},
+	{0x01f400, 0x01f6d7, PG_U_OTHER_SYMBOL},
+	{0x01f6d8, 0x01f6db, PG_U_UNASSIGNED},
+	{0x01f6dc, 0x01f6ec, PG_U_OTHER_SYMBOL},
+	{0x01f6ed, 0x01f6ef, PG_U_UNASSIGNED},
+	{0x01f6f0, 0x01f6fc, PG_U_OTHER_SYMBOL},
+	{0x01f6fd, 0x01f6ff, PG_U_UNASSIGNED},
+	{0x01f700, 0x01f776, PG_U_OTHER_SYMBOL},
+	{0x01f777, 0x01f77a, PG_U_UNASSIGNED},
+	{0x01f77b, 0x01f7d9, PG_U_OTHER_SYMBOL},
+	{0x01f7da, 0x01f7df, PG_U_UNASSIGNED},
+	{0x01f7e0, 0x01f7eb, PG_U_OTHER_SYMBOL},
+	{0x01f7ec, 0x01f7ef, PG_U_UNASSIGNED},
+	{0x01f7f0, 0x01f7f0, PG_U_OTHER_SYMBOL},
+	{0x01f7f1, 0x01f7ff, PG_U_UNASSIGNED},
+	{0x01f800, 0x01f80b, PG_U_OTHER_SYMBOL},
+	{0x01f80c, 0x01f80f, PG_U_UNASSIGNED},
+	{0x01f810, 0x01f847, PG_U_OTHER_SYMBOL},
+	{0x01f848, 0x01f84f, PG_U_UNASSIGNED},
+	{0x01f850, 0x01f859, PG_U_OTHER_SYMBOL},
+	{0x01f85a, 0x01f85f, PG_U_UNASSIGNED},
+	{0x01f860, 0x01f887, PG_U_OTHER_SYMBOL},
+	{0x01f888, 0x01f88f, PG_U_UNASSIGNED},
+	{0x01f890, 0x01f8ad, PG_U_OTHER_SYMBOL},
+	{0x01f8ae, 0x01f8af, PG_U_UNASSIGNED},
+	{0x01f8b0, 0x01f8b1, PG_U_OTHER_SYMBOL},
+	{0x01f8b2, 0x01f8ff, PG_U_UNASSIGNED},
+	{0x01f900, 0x01fa53, PG_U_OTHER_SYMBOL},
+	{0x01fa54, 0x01fa5f, PG_U_UNASSIGNED},
+	{0x01fa60, 0x01fa6d, PG_U_OTHER_SYMBOL},
+	{0x01fa6e, 0x01fa6f, PG_U_UNASSIGNED},
+	{0x01fa70, 0x01fa7c, PG_U_OTHER_SYMBOL},
+	{0x01fa7d, 0x01fa7f, PG_U_UNASSIGNED},
+	{0x01fa80, 0x01fa88, PG_U_OTHER_SYMBOL},
+	{0x01fa89, 0x01fa8f, PG_U_UNASSIGNED},
+	{0x01fa90, 0x01fabd, PG_U_OTHER_SYMBOL},
+	{0x01fabe, 0x01fabe, PG_U_UNASSIGNED},
+	{0x01fabf, 0x01fac5, PG_U_OTHER_SYMBOL},
+	{0x01fac6, 0x01facd, PG_U_UNASSIGNED},
+	{0x01face, 0x01fadb, PG_U_OTHER_SYMBOL},
+	{0x01fadc, 0x01fadf, PG_U_UNASSIGNED},
+	{0x01fae0, 0x01fae8, PG_U_OTHER_SYMBOL},
+	{0x01fae9, 0x01faef, PG_U_UNASSIGNED},
+	{0x01faf0, 0x01faf8, PG_U_OTHER_SYMBOL},
+	{0x01faf9, 0x01faff, PG_U_UNASSIGNED},
+	{0x01fb00, 0x01fb92, PG_U_OTHER_SYMBOL},
+	{0x01fb93, 0x01fb93, PG_U_UNASSIGNED},
+	{0x01fb94, 0x01fbca, PG_U_OTHER_SYMBOL},
+	{0x01fbcb, 0x01fbef, PG_U_UNASSIGNED},
+	{0x01fbf0, 0x01fbf9, PG_U_DECIMAL_DIGIT_NUMBER},
+	{0x01fbfa, 0x01ffff, PG_U_UNASSIGNED},
+	{0x020000, 0x02a6df, PG_U_OTHER_LETTER},
+	{0x02a6e0, 0x02a6ff, PG_U_UNASSIGNED},
+	{0x02a700, 0x02b739, PG_U_OTHER_LETTER},
+	{0x02b73a, 0x02b73f, PG_U_UNASSIGNED},
+	{0x02b740, 0x02b81d, PG_U_OTHER_LETTER},
+	{0x02b81e, 0x02b81f, PG_U_UNASSIGNED},
+	{0x02b820, 0x02cea1, PG_U_OTHER_LETTER},
+	{0x02cea2, 0x02ceaf, PG_U_UNASSIGNED},
+	{0x02ceb0, 0x02ebe0, PG_U_OTHER_LETTER},
+	{0x02ebe1, 0x02ebef, PG_U_UNASSIGNED},
+	{0x02ebf0, 0x02ee5d, PG_U_OTHER_LETTER},
+	{0x02ee5e, 0x02f7ff, PG_U_UNASSIGNED},
+	{0x02f800, 0x02fa1d, PG_U_OTHER_LETTER},
+	{0x02fa1e, 0x02ffff, PG_U_UNASSIGNED},
+	{0x030000, 0x03134a, PG_U_OTHER_LETTER},
+	{0x03134b, 0x03134f, PG_U_UNASSIGNED},
+	{0x031350, 0x0323af, PG_U_OTHER_LETTER},
+	{0x0323b0, 0x0e0000, PG_U_UNASSIGNED},
+	{0x0e0001, 0x0e0001, PG_U_FORMAT_CHAR},
+	{0x0e0002, 0x0e001f, PG_U_UNASSIGNED},
+	{0x0e0020, 0x0e007f, PG_U_FORMAT_CHAR},
+	{0x0e0080, 0x0e00ff, PG_U_UNASSIGNED},
+	{0x0e0100, 0x0e01ef, PG_U_NON_SPACING_MARK},
+	{0x0e01f0, 0x0effff, PG_U_UNASSIGNED},
+	{0x0f0000, 0x0ffffd, PG_U_PRIVATE_USE_CHAR},
+	{0x0ffffe, 0x0fffff, PG_U_UNASSIGNED},
+	{0x100000, 0x10fffd, PG_U_PRIVATE_USE_CHAR},
+	{0x10fffe, 0x10ffff, PG_U_UNASSIGNED}
+};
+
diff --git a/src/include/common/unicode_version.h b/src/include/common/unicode_version.h
new file mode 100644
index 0000000000..8e3d484ae6
--- /dev/null
+++ b/src/include/common/unicode_version.h
@@ -0,0 +1,16 @@
+/*-------------------------------------------------------------------------
+ *
+ * unicode_version.h
+ *	  Unicode version used by Postgres.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/common/unicode_version.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#define PG_UNICODE_VERSION		"15.1"
+
+
diff --git a/src/test/icu/t/010_database.pl b/src/test/icu/t/010_database.pl
index 0e9446cebe..67fc3bbf19 100644
--- a/src/test/icu/t/010_database.pl
+++ b/src/test/icu/t/010_database.pl
@@ -27,6 +27,10 @@ CREATE TABLE icu (def text, en text COLLATE "en-x-icu", upfirst text COLLATE upp
 INSERT INTO icu VALUES ('a', 'a', 'a'), ('b', 'b', 'b'), ('A', 'A', 'A'), ('B', 'B', 'B');
 });
 
+is( $node1->safe_psql('dbicu', q{SELECT icu_unicode_version() IS NOT NULL}),
+	qq(t),
+	'ICU unicode version defined');
+
 is( $node1->safe_psql('dbicu', q{SELECT def FROM icu ORDER BY def}),
 	qq(A
 a
diff --git a/src/test/regress/expected/unicode.out b/src/test/regress/expected/unicode.out
index f2713a2326..1e06de2264 100644
--- a/src/test/regress/expected/unicode.out
+++ b/src/test/regress/expected/unicode.out
@@ -8,6 +8,24 @@ SELECT U&'\0061\0308bc' <> U&'\00E4bc' COLLATE "C" AS sanity_check;
  t
 (1 row)
 
+SELECT unicode_version() IS NOT NULL;
+ ?column? 
+----------
+ t
+(1 row)
+
+SELECT unicode_assigned(U&'abc');
+ unicode_assigned 
+------------------
+ t
+(1 row)
+
+SELECT unicode_assigned(U&'abc\+10FFFF');
+ unicode_assigned 
+------------------
+ f
+(1 row)
+
 SELECT normalize('');
  normalize 
 -----------
diff --git a/src/test/regress/sql/unicode.sql b/src/test/regress/sql/unicode.sql
index 63cd523f85..e50adb68ed 100644
--- a/src/test/regress/sql/unicode.sql
+++ b/src/test/regress/sql/unicode.sql
@@ -5,6 +5,10 @@ SELECT getdatabaseencoding() <> 'UTF8' AS skip_test \gset
 
 SELECT U&'\0061\0308bc' <> U&'\00E4bc' COLLATE "C" AS sanity_check;
 
+SELECT unicode_version() IS NOT NULL;
+SELECT unicode_assigned(U&'abc');
+SELECT unicode_assigned(U&'abc\+10FFFF');
+
 SELECT normalize('');
 SELECT normalize(U&'\0061\0308\24D1c') = U&'\00E4\24D1c' COLLATE "C" AS test_default;
 SELECT normalize(U&'\0061\0308\24D1c', NFC) = U&'\00E4\24D1c' COLLATE "C" AS test_nfc;
-- 
2.34.1

#56Daniel Verite
daniel@manitou-mail.org
In reply to: Jeff Davis (#55)
Re: Pre-proposal: unicode normalized text

Jeff Davis wrote:

I believe the patch has utility as-is, but I've been brainstorming a
few more ideas that could build on it:

* Add a per-database option to enforce only storing assigned unicode
code points.

There's a problem in the fact that the set of assigned code points is
expanding with every Unicode release, which happens about every year.

If we had this option in Postgres 11 released in 2018 it would use
Unicode 11, and in 2023 this feature would reject thousands of code
points that have been assigned since then.

Aside from that, aborting a transaction because there's an
unassigned code point in a string feels like doing too much,
too late.
The programs that want to filter out unwanted code points
do it before they hit the database, client-side.

Best regards,
--
Daniel Vérité
https://postgresql.verite.pro/
Twitter: @DanielVerite

#57Robert Haas
robertmhaas@gmail.com
In reply to: Daniel Verite (#56)
Re: Pre-proposal: unicode normalized text

On Tue, Oct 17, 2023 at 11:07 AM Daniel Verite <daniel@manitou-mail.org> wrote:

There's a problem in the fact that the set of assigned code points is
expanding with every Unicode release, which happens about every year.

If we had this option in Postgres 11 released in 2018 it would use
Unicode 11, and in 2023 this feature would reject thousands of code
points that have been assigned since then.

Are code points assigned from a gapless sequence? That is, is the
implementation of codepoint_is_assigned(char) just 'codepoint <
SOME_VALUE' and SOME_VALUE increases over time?

If so, we could consider having a function that lets you specify the
bound as an input parameter. But whether anyone would use it, or know
how to set that input parameter, is questionable. The real issue here
is whether you can figure out which of the code points that you could
put into the database already have collation definitions.

--
Robert Haas
EDB: http://www.enterprisedb.com

#58Isaac Morland
isaac.morland@gmail.com
In reply to: Robert Haas (#57)
Re: Pre-proposal: unicode normalized text

On Tue, 17 Oct 2023 at 11:15, Robert Haas <robertmhaas@gmail.com> wrote:

Are code points assigned from a gapless sequence? That is, is the
implementation of codepoint_is_assigned(char) just 'codepoint <
SOME_VALUE' and SOME_VALUE increases over time?

Not even close. Code points are organized in blocks, e.g. for mathematical
symbols or Ethiopic script. Sometimes new blocks are added, sometimes new
characters are added to existing blocks. Where they go is a combination of
convenience, history, and planning.
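
Since assignment is not a single upper bound, a membership test needs a table of ranges rather than one comparison. The following is a minimal sketch of that idea, assuming a sorted table of non-overlapping ranges searched by binary search. The `demo_` names and the trimmed category enum are hypothetical and exist only for illustration; the rows are a small excerpt copied from the table in the patch, and this is not claimed to be how the patch's own unicode_category() is implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down category enum (illustration only). */
typedef enum
{
	DEMO_UNASSIGNED,
	DEMO_OTHER_LETTER,
	DEMO_OTHER_SYMBOL,
	DEMO_DECIMAL_DIGIT_NUMBER
} demo_category;

typedef struct
{
	uint32_t	first;			/* first code point in the range */
	uint32_t	last;			/* last code point, inclusive */
	demo_category category;
} demo_range;

/* Excerpt of rows from the patch's table; sorted and non-overlapping. */
static const demo_range demo_table[] = {
	{0x01fb00, 0x01fb92, DEMO_OTHER_SYMBOL},
	{0x01fb93, 0x01fb93, DEMO_UNASSIGNED},
	{0x01fb94, 0x01fbca, DEMO_OTHER_SYMBOL},
	{0x01fbcb, 0x01fbef, DEMO_UNASSIGNED},
	{0x01fbf0, 0x01fbf9, DEMO_DECIMAL_DIGIT_NUMBER},
	{0x01fbfa, 0x01ffff, DEMO_UNASSIGNED},
	{0x020000, 0x02a6df, DEMO_OTHER_LETTER}
};

/* Binary search for the range containing cp. */
static demo_category
demo_lookup(uint32_t cp)
{
	size_t		lo = 0;
	size_t		hi = sizeof(demo_table) / sizeof(demo_table[0]);

	assert(hi > 0);
	while (lo < hi)
	{
		size_t		mid = lo + (hi - lo) / 2;

		if (cp < demo_table[mid].first)
			hi = mid;
		else if (cp > demo_table[mid].last)
			lo = mid + 1;
		else
			return demo_table[mid].category;
	}

	/*
	 * Not covered by this excerpt.  A complete table covers the whole code
	 * space, so this fallback only matters for the truncated demo.
	 */
	return DEMO_UNASSIGNED;
}

static bool
demo_is_assigned(uint32_t cp)
{
	return demo_lookup(cp) != DEMO_UNASSIGNED;
}
```

Note the single unassigned hole at U+1FB93 in the middle of an otherwise assigned block: exactly the kind of gap that a 'codepoint < SOME_VALUE' test cannot express.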

#59Robert Haas
robertmhaas@gmail.com
In reply to: Isaac Morland (#58)
Re: Pre-proposal: unicode normalized text

On Tue, Oct 17, 2023 at 11:38 AM Isaac Morland <isaac.morland@gmail.com> wrote:

On Tue, 17 Oct 2023 at 11:15, Robert Haas <robertmhaas@gmail.com> wrote:

Are code points assigned from a gapless sequence? That is, is the
implementation of codepoint_is_assigned(char) just 'codepoint <
SOME_VALUE' and SOME_VALUE increases over time?

Not even close. Code points are organized in blocks, e.g. for mathematical symbols or Ethiopic script. Sometimes new blocks are added, sometimes new characters are added to existing blocks. Where they go is a combination of convenience, history, and planning.

Ah. Good to know.

--
Robert Haas
EDB: http://www.enterprisedb.com

#60Jeff Davis
pgsql@j-davis.com
In reply to: Daniel Verite (#56)
Re: Pre-proposal: unicode normalized text

On Tue, 2023-10-17 at 17:07 +0200, Daniel Verite wrote:

There's a problem in the fact that the set of assigned code points is
expanding with every Unicode release, which happens about every year.

If we had this option in Postgres 11 released in 2018 it would use
Unicode 11, and in 2023 this feature would reject thousands of code
points that have been assigned since then.

That wouldn't be good for everyone, but might it be good for some
users?

We already expose normalization functions. If users are depending on
normalization, and they have unassigned code points in their system,
that will break when we update Unicode. By restricting themselves to
assigned code points, normalization is guaranteed to be
forward-compatible.

Regards,
Jeff Davis

#61Jeff Davis
pgsql@j-davis.com
In reply to: Jeff Davis (#55)
Re: Pre-proposal: unicode normalized text

On Mon, 2023-10-16 at 20:32 -0700, Jeff Davis wrote:

On Wed, 2023-10-11 at 08:56 +0200, Peter Eisentraut wrote:

We need to be careful about precise terminology.  "Valid" has a
defined meaning for Unicode.  A byte sequence can be valid or not as
UTF-8.  But a string containing unassigned code points is not
not-"valid" as Unicode.

New patch attached, function name is "unicode_assigned".

I plan to commit something like v3 early next week unless someone else
has additional comments or I missed a concern.

Regards,
Jeff Davis

#62Thomas Munro
thomas.munro@gmail.com
In reply to: Jeff Davis (#61)
Re: Pre-proposal: unicode normalized text

bowerbird and hammerkop didn't like commit a02b37fc. They're still
using the old 3rd build system that is not tested by CI. It's due for
removal in the 17 cycle IIUC but in the meantime I guess the new
codegen script needs to be invoked by something under src/tools/msvc?

varlena.obj : error LNK2019: unresolved external symbol
unicode_category referenced in function unicode_assigned
[H:\\prog\\bf\\root\\HEAD\\pgsql.build\\postgres.vcxproj]

#63Nico Williams
nico@cryptonector.com
In reply to: Robert Haas (#41)
Re: Pre-proposal: unicode normalized text

On Fri, Oct 06, 2023 at 02:37:06PM -0400, Robert Haas wrote:

Sure, because TEXT in PG doesn't have codeset+encoding as part of it --
it's whatever the database's encoding is. Collation can and should be a
property of a column, since for Unicode it wouldn't be reasonable to
make that part of the type. But codeset+encoding should really be a
property of the type if PG were to support more than one. IMO.

No, what I mean is, you can't just be like "oh, the varlena will be
different in memory than on disk" as if that were no big deal.

It would have to be the same in memory as on disk, indeed, but you might
need new types in C as well for that.

I agree that, as an alternative to encoding being a column property,
it could instead be completely a type property, meaning that if you
want to store, say, LATIN1 text in your UTF-8 database, you first
create a latin1text data type and then use it, rather than, as in the
model I proposed, creating a text column and then applying a setting
like ENCODING latin1 to it. I think that there might be some problems

Yes, that was the idea.

with that model, but it could also have some benefits. [...]

Mainly, I think, whether you want PG to do automatic codeset conversions
(ugly and problematic) or not, like for when using text functions.

Automatic codeset conversions are problematic because a) they can be lossy
(so what to do when they are?) and b) automatic type conversions can be
surprising.
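
To make the "lossy" point concrete, here is a small sketch of a strict conversion from Unicode code points to LATIN1 (ISO 8859-1), which can only represent U+0000 through U+00FF. The function names are hypothetical and this is not PostgreSQL's conversion code; it only shows that a converter has to either fail or substitute something when it meets a character such as U+20AC (the euro sign).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ISO 8859-1 maps one-to-one onto U+0000..U+00FF, one byte per code point. */
static bool
demo_latin1_representable(uint32_t cp)
{
	return cp <= 0xFF;
}

/*
 * Hypothetical strict converter: writes one byte per code point into out
 * (which must have room for n bytes) and returns false at the first code
 * point that LATIN1 cannot represent, instead of silently substituting.
 */
static bool
demo_to_latin1(const uint32_t *cps, size_t n, unsigned char *out)
{
	for (size_t i = 0; i < n; i++)
	{
		if (!demo_latin1_representable(cps[i]))
			return false;
		out[i] = (unsigned char) cps[i];
	}
	return true;
}
```

Whatever the caller does on a false return (raise an error, substitute '?', drop the character) is the policy question raised above; none of the choices round-trips.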

Ultimately the client would have to do its own codeset conversions, if
it wants them, or treat text in codesets other than its local one as
blobs and leave it for a higher app layer to deal with.

I wouldn't want to propose automatic codeset conversions. If you'd want
that then you might as well declare it has to all be UTF-8 and say no to
any other codesets.

But, even if we were all convinced that this kind of feature was good
to add, I think it would almost certainly be wrong to invent new
varlena features along the way.

Yes.

Nico
--

#64Nico Williams
nico@cryptonector.com
In reply to: Robert Haas (#11)
Re: Pre-proposal: unicode normalized text

On Wed, Oct 04, 2023 at 01:16:22PM -0400, Robert Haas wrote:

There's a very popular commercial database where, or so I have been
led to believe, any byte sequence at all is accepted when you try to
put values into the database. [...]

In other circles we call this "just-use-8".

ZFS, for example, has an option to require that filenames be valid
UTF-8 or not, and if not it will accept any garbage (other than ASCII
NUL and /, for obvious reasons).

For filesystems the situation is a bit dire because:

- strings at the system call boundary have never been tagged with a
codeset (in the beginning there was only ASCII)
- there has never been a standard codeset to use at the system call
boundary,
- there have been multiple codesets in use for decades

so filesystems have to be prepared to be tolerant of garbage, at least
until only Unicode is left (UTF-16 on Windows filesystems, UTF-8 for
most others).

This is another reason that ZFS has form-insensitive/form-preserving
behavior: if you want to use non-UTF-8 filenames then names or
substrings thereof that look like valid UTF-8 won't accidentally be
broken by normalization.

If PG never tagged strings with codesets on the wire then PG has the
same problem, especially since there are multiple implementations of the
PG wire protocol.

So I can see why a "popular database" might want to take this approach.

For the longer run though, either move to supporting only UTF-8, or
allow multiple text types each with a codeset specified in its type.

At any rate, if we were to go in the direction of rejecting code
points that aren't yet assigned, or aren't yet known to the collation
library, that's another way for data loading to fail. Which feels like
very defensible behavior, but not what everyone wants, or is used to.

Yes. See points about ZFS. I do think ZFS struck a good balance.

PG could take the ZFS approach and add functions for use in CHECK
constraints that enforce valid UTF-8, valid Unicode (no use of
unassigned codepoints, no use of private use codepoints not configured
into the database), etc.
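
As a rough illustration of what such enforcement functions would check,
here is a Python sketch. The names is_valid_utf8 and only_assigned are
hypothetical, and the allow_private_use flag is a simplification of
"private use codepoints not configured into the database". "Assigned"
here means assigned in the Unicode version bundled with the Python
interpreter, just as it would depend on the Unicode version PG is built
against.

```python
import unicodedata


def is_valid_utf8(raw: bytes) -> bool:
    """True if raw is well-formed UTF-8 (no overlong forms, no surrogates)."""
    try:
        raw.decode("utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False


def only_assigned(text: str, allow_private_use: bool = False) -> bool:
    """True if text has no unassigned (Cn) code points and, unless
    allow_private_use is set, no private-use (Co) code points either."""
    rejected = {"Cn"} if allow_private_use else {"Cn", "Co"}
    return all(unicodedata.category(ch) not in rejected for ch in text)


print(is_valid_utf8(b"\xc3\xa1"))   # well-formed encoding of U+00E1
print(is_valid_utf8(b"\xc0\xaf"))   # overlong encoding, rejected
print(only_assigned("\ue000"))      # private-use code point, rejected by default
```

A CHECK constraint would call the equivalent SQL-level functions on the
column value, so enforcement stays opt-in per column rather than global.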

Coming back to the "just-use-8" thing, a database could have a text type
where the codeset is not specified, one or more text types where the
codeset is specified, manual or automatic codeset conversions, and
whatever enforcement functions make sense. Provided that the type
information is not lost at the edges.

Whether we ever get to a core data type -- and more importantly,
whether anyone uses it -- I'm not sure.

Same here.

A TEXTutf8 type (whatever name you want to give it) could be useful as a
way to a) opt into heavier enforcement w/o having to write CHECK
constraints, b) documentation of intent, all provided that the type is
not lost on the wire nor in memory.

Support for other codesets is less important.

Nico
--

#65Nico Williams
nico@cryptonector.com
In reply to: Daniel Verite (#56)
Re: Pre-proposal: unicode normalized text

On Tue, Oct 17, 2023 at 05:07:40PM +0200, Daniel Verite wrote:

* Add a per-database option to enforce only storing assigned unicode
code points.

There's a problem in the fact that the set of assigned code points is
expanding with every Unicode release, which happens about every year.

If we had this option in Postgres 11 released in 2018 it would use
Unicode 11, and in 2023 this feature would reject thousands of code
points that have been assigned since then.

Yes, and that's desirable if PG were to normalize text as Jeff proposes,
since then PG wouldn't know how to normalize text containing codepoints
assigned after that. At that point to use those codepoints you'd have
to upgrade PG -- not too unreasonable.

Nico
--

#66Nico Williams
nico@cryptonector.com
In reply to: Jeff Davis (#17)
Re: Pre-proposal: unicode normalized text

On Wed, Oct 04, 2023 at 01:15:03PM -0700, Jeff Davis wrote:

The fact that there are multiple types of normalization and multiple
notions of equality doesn't make this easier.

And then there's text that isn't normalized to any of them.

NFC is really the only one that makes sense.

Yes.

Most input modes produce NFC, though there may be scripts (like Hangul)
where input modes might produce NFD, so I wouldn't say NFC is universal.

Unfortunately HFS+ uses NFD so NFD can leak into places naturally enough
through OS X.
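
For anyone who wants to see the difference, a small Python sketch using
the standard unicodedata module shows one Hangul syllable in both forms:

```python
import unicodedata

syllable = "\ud55c"  # HANGUL SYLLABLE HAN, a single precomposed code point
nfd = unicodedata.normalize("NFD", syllable)   # decomposes into conjoining jamo
nfc = unicodedata.normalize("NFC", nfd)        # recomposes the syllable

print([f"U+{ord(c):04X}" for c in nfd])        # ['U+1112', 'U+1161', 'U+11AB']
print(len(syllable.encode("utf-8")), len(nfd.encode("utf-8")))  # 3 9
print(nfc == syllable)                         # True
```

The two forms are canonically equivalent, but the NFD one takes three
times the bytes in UTF-8 and will not compare equal under memcmp().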

I believe that having a kind of text data type where it's stored in NFC
and compared with memcmp() would be a good place for many users to be --
probably most users. It's got all the performance and stability
benefits of memcmp(), with slightly richer semantics. It's less likely
that someone malicious can confuse the database by using different
representations of the same character.
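
A minimal Python sketch of the idea, assuming normalization happens once
on input and byte equality stands in for memcmp() (the function name
store is made up for illustration):

```python
import unicodedata


def store(text: str) -> bytes:
    """Normalize to NFC once on input; the result is what gets stored and compared."""
    return unicodedata.normalize("NFC", text).encode("utf-8")


decomposed = "a\u0301"    # 'a' followed by COMBINING ACUTE ACCENT
precomposed = "\u00e1"    # LATIN SMALL LETTER A WITH ACUTE

# The raw byte sequences differ, so a memcmp()-style comparison says "not equal".
print(decomposed.encode("utf-8") == precomposed.encode("utf-8"))  # False
# After normalizing on input, plain byte equality gives the expected answer.
print(store(decomposed) == store(precomposed))                    # True
```

The cost is paid once at input time; every later comparison, hash, or
index lookup works on the stored bytes without knowing about Unicode.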

The problem is that it's not universally better for everyone: there are
certainly users who would prefer that the codepoints they send to the
database are preserved exactly, and also users who would like to be
able to use unassigned code points.

The alternative is form insensitivity, where you compare strings as
equal even if they aren't memcmp()-equal, as long as they are equal when
normalized. This can be made fast, though not as fast as memcmp().
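
A sketch of such a comparison in Python, with the two obvious fast
paths. The function name is hypothetical and this is not how ZFS
implements it; it only illustrates the technique.

```python
import unicodedata


def form_insensitive_eq(a: str, b: str) -> bool:
    """Compare two strings as equal if their NFC forms are equal."""
    # Fast path 1: identical code point sequences need no normalization.
    if a == b:
        return True
    # Fast path 2: ASCII is unchanged by NFC, so two unequal all-ASCII
    # strings cannot become equal. Both sides must be ASCII for this to
    # hold, since a non-ASCII character can normalize to an ASCII one.
    if a.isascii() and b.isascii():
        return False
    # Slow path: normalize both sides and compare.
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(form_insensitive_eq("a\u0301", "\u00e1"))  # True
print(form_insensitive_eq("\u212a", "K"))        # True: KELVIN SIGN normalizes to 'K'
print(form_insensitive_eq("\u00e1", "a"))        # False
```

The stored strings are left exactly as entered, so this is
form-preserving as well as form-insensitive.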

The problem with form insensitivity is that you might have to implement
it in numerous places. In ZFS there are only a few, but in a database
every index type, for example, will need to hook in form insensitivity.
If so then that complexity would be a good argument to just normalize.

Nico
--

#67Jeff Davis
pgsql@j-davis.com
In reply to: Thomas Munro (#62)
Re: Pre-proposal: unicode normalized text

On Fri, 2023-11-03 at 10:51 +1300, Thomas Munro wrote:

bowerbird and hammerkop didn't like commit a02b37fc.  They're still
using the old 3rd build system that is not tested by CI.  It's due for
removal in the 17 cycle IIUC but in the meantime I guess the new
codegen script needs to be invoked by something under src/tools/msvc?

  varlena.obj : error LNK2019: unresolved external symbol
unicode_category referenced in function unicode_assigned
[H:\\prog\\bf\\root\\HEAD\\pgsql.build\\postgres.vcxproj]

I think I just need to add unicode_category.c to @pgcommonallfiles in
Mkvcbuild.pm. I'll do a trial commit tomorrow and see if that fixes it
unless someone has a better suggestion.

Regards,
Jeff Davis

#68David Rowley
dgrowleyml@gmail.com
In reply to: Jeff Davis (#67)
Re: Pre-proposal: unicode normalized text

On Fri, 3 Nov 2023 at 20:49, Jeff Davis <pgsql@j-davis.com> wrote:

On Fri, 2023-11-03 at 10:51 +1300, Thomas Munro wrote:

bowerbird and hammerkop didn't like commit a02b37fc. They're still
using the old 3rd build system that is not tested by CI. It's due for
removal in the 17 cycle IIUC but in the meantime I guess the new
codegen script needs to be invoked by something under src/tools/msvc?

varlena.obj : error LNK2019: unresolved external symbol
unicode_category referenced in function unicode_assigned
[H:\\prog\\bf\\root\\HEAD\\pgsql.build\\postgres.vcxproj]

I think I just need to add unicode_category.c to @pgcommonallfiles in
Mkvcbuild.pm. I'll do a trial commit tomorrow and see if that fixes it
unless someone has a better suggestion.

(I didn't realise this was being discussed.)

Thomas mentioned this to me earlier today. After looking I also
concluded that unicode_category.c needed to be added to
@pgcommonallfiles. After looking at the time, I didn't expect you to
be around so opted just to push that to fix the MSVC buildfarm
members.

Sorry for the duplicate effort and/or stepping on your toes.

David

#69John Naylor
johncnaylorls@gmail.com
In reply to: Jeff Davis (#61)
Re: Pre-proposal: unicode normalized text

On Sat, Oct 28, 2023 at 4:15 AM Jeff Davis <pgsql@j-davis.com> wrote:

I plan to commit something like v3 early next week unless someone else
has additional comments or I missed a concern.

Hi Jeff, is the CF entry titled "Unicode character general category
functions" ready to be marked committed?

#70Jeff Davis
pgsql@j-davis.com
In reply to: David Rowley (#68)
Re: Pre-proposal: unicode normalized text

On Fri, 2023-11-03 at 21:01 +1300, David Rowley wrote:

Thomas mentioned this to me earlier today. After looking I also
concluded that unicode_category.c needed to be added to
@pgcommonallfiles. After looking at the time, I didn't expect you to
be around so opted just to push that to fix the MSVC buildfarm
members.

Sorry for the duplicate effort and/or stepping on your toes.

Thank you, no apology necessary.

Regards,
Jeff Davis

#71Jeff Davis
pgsql@j-davis.com
In reply to: John Naylor (#69)
Re: Pre-proposal: unicode normalized text

On Fri, 2023-11-03 at 17:11 +0700, John Naylor wrote:

On Sat, Oct 28, 2023 at 4:15 AM Jeff Davis <pgsql@j-davis.com> wrote:

I plan to commit something like v3 early next week unless someone else
has additional comments or I missed a concern.

Hi Jeff, is the CF entry titled "Unicode character general category
functions" ready to be marked committed?

Done, thank you.

Regards,
Jeff Davis

#72Phil Krylov
phil@krylov.eu
In reply to: Chapman Flack (#20)
Re: Pre-proposal: unicode normalized text

On 2023-10-04 23:32, Chapman Flack wrote:

Well, for what reason does anybody run PG now with the encoding set
to anything besides UTF-8? I don't really have my finger on that pulse.
Could it be that it bloats common strings in their local script, and
with enough of those to store, it could matter to use the local
encoding that stores them more economically?

I do use CP1251 for storing some data which is coming in as XMLs in
CP1251, and thus definitely fits. In UTF-8, that data would take exactly
2x the size on disks (before compression, and pglz/lz4 won't help much
with that).
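
The size difference is easy to reproduce with Python's built-in codecs.
The helper name sizes is made up, and the sample strings are
illustrative, not taken from the data described above.

```python
def sizes(text: str) -> tuple[int, int]:
    """Return (bytes in CP1251, bytes in UTF-8) for the same text."""
    return len(text.encode("cp1251")), len(text.encode("utf-8"))


print(sizes("Привет"))       # (6, 12): pure Cyrillic doubles in UTF-8
print(sizes("Привет, мир"))  # (11, 20): ASCII punctuation stays one byte
print(sizes("<a>"))          # (3, 3): ASCII markup is the same size in both
```

So the ratio is exactly 2x for the Cyrillic letters themselves and
somewhat lower overall once ASCII markup and punctuation are counted.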

-- Ph.

#73Thomas Munro
thomas.munro@gmail.com
In reply to: David Rowley (#68)
Re: Pre-proposal: unicode normalized text

On Fri, Nov 3, 2023 at 9:01 PM David Rowley <dgrowleyml@gmail.com> wrote:

On Fri, 3 Nov 2023 at 20:49, Jeff Davis <pgsql@j-davis.com> wrote:

On Fri, 2023-11-03 at 10:51 +1300, Thomas Munro wrote:

bowerbird and hammerkop didn't like commit a02b37fc. They're still
using the old 3rd build system that is not tested by CI. It's due for
removal in the 17 cycle IIUC but in the meantime I guess the new
codegen script needs to be invoked by something under src/tools/msvc?

varlena.obj : error LNK2019: unresolved external symbol
unicode_category referenced in function unicode_assigned
[H:\\prog\\bf\\root\\HEAD\\pgsql.build\\postgres.vcxproj]

I think I just need to add unicode_category.c to @pgcommonallfiles in
Mkvcbuild.pm. I'll do a trial commit tomorrow and see if that fixes it
unless someone has a better suggestion.

(I didn't realise this was being discussed.)

Thomas mentioned this to me earlier today. After looking I also
concluded that unicode_category.c needed to be added to
@pgcommonallfiles. After looking at the time, I didn't expect you to
be around so opted just to push that to fix the MSVC buildfarm
members.

Shouldn't it be added unconditionally near unicode_norm.c? It looks
like it was accidentally made conditional on openssl, which might
explain why it worked for David but not for bowerbird.

#74David Rowley
dgrowleyml@gmail.com
In reply to: Thomas Munro (#73)
Re: Pre-proposal: unicode normalized text

On Sat, 4 Nov 2023 at 10:57, Thomas Munro <thomas.munro@gmail.com> wrote:

On Fri, Nov 3, 2023 at 9:01 PM David Rowley <dgrowleyml@gmail.com> wrote:

On Fri, 3 Nov 2023 at 20:49, Jeff Davis <pgsql@j-davis.com> wrote:

I think I just need to add unicode_category.c to @pgcommonallfiles in
Mkvcbuild.pm. I'll do a trial commit tomorrow and see if that fixes it
unless someone has a better suggestion.

Thomas mentioned this to me earlier today. After looking I also
concluded that unicode_category.c needed to be added to
@pgcommonallfiles. After looking at the time, I didn't expect you to
be around so opted just to push that to fix the MSVC buildfarm
members.

Shouldn't it be added unconditionally near unicode_norm.c? It looks
like it was accidentally made conditional on openssl, which might
explain why it worked for David but not for bowerbird.

Well, I did that one pretty poorly :-(

I've just pushed a fix for that. Thanks.

David

#75Jeff Davis
pgsql@j-davis.com
In reply to: Robert Haas (#3)
1 attachment(s)
Re: Pre-proposal: unicode normalized text

On Mon, 2023-10-02 at 16:06 -0400, Robert Haas wrote:

It seems to me that this overlooks one of the major points of Jeff's
proposal, which is that we don't reject text input that contains
unassigned code points. That decision turns out to be really painful.

Attached is an implementation of a per-database option STRICT_UNICODE
which enforces the use of assigned code points only.

Not everyone would want to use it. There are lots of applications that
accept free-form text, and that may include recently-assigned code
points not yet recognized by Postgres.

But it would offer protection/stability for some databases. It makes it
possible to have a hard guarantee that Unicode normalization is
stable[1]. And it may also mitigate the risk of collation changes --
using unassigned code points carries a high risk that the collation
order changes as soon as the collation provider recognizes the
assignment. (Though assigned code points can change, too, so limiting
yourself to assigned code points is only a mitigation.)

I worry slightly that users will think at first that they want only
assigned code points, and then later figure out that the application
has increased in scope and now takes all kinds of free-form text. In
that case, the user can "ALTER DATABASE ... STRICT_UNICODE FALSE", and
follow up with some "CHECK (unicode_assigned(...))" constraints on the
particular fields that they'd like to protect.
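
The rules for when the setting may be used, as described in the commit
message of the attached patch, can be summarized with a toy model in
Python. This is only a restatement of those rules, not the patch's
code, and the function names are made up.

```python
def create_allowed(strict: bool, encoding: str,
                   template: str, template_strict: bool) -> bool:
    """Would CREATE DATABASE accept this STRICT_UNICODE setting?"""
    if not strict:
        return True        # a non-strict database is always acceptable
    if encoding != "UTF8":
        return False       # the option is only valid for UTF8
    # A non-strict template might already contain unassigned code points;
    # template0 is the exception.
    return template_strict or template == "template0"


def alter_allowed(currently_strict: bool, requested_strict: bool) -> bool:
    """Would ALTER DATABASE accept the change? Only false -> true is refused."""
    return currently_strict or not requested_strict


print(create_allowed(True, "UTF8", "template1", False))  # False
print(create_allowed(True, "UTF8", "template0", False))  # True
print(alter_allowed(True, False))                        # True
print(alter_allowed(False, True))                        # False
```

The asymmetry in alter_allowed is the important part: relaxing the
setting is always safe, while tightening it would require checking
every existing value.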

There's some weirdness that the set of assigned code points as Postgres
sees it may not match what a collation provider sees due to differing
Unicode versions. That's not great -- perhaps we could check that code
points are considered assigned by *both* Postgres and ICU. I don't know
if there's a way to tell if libc considers a code point to be assigned.

Regards,
Jeff Davis

[1]: https://www.unicode.org/policies/stability_policy.html#Normalization

Attachments:

v1-0001-CREATE-DATABASE-.-STRICT_UNICODE.patch (text/x-patch; charset=UTF-8)
From 54a15ee4ac5d5f437f4d536d724e1fa9e535fd50 Mon Sep 17 00:00:00 2001
From: Jeff Davis <jeff@j-davis.com>
Date: Thu, 29 Feb 2024 13:13:58 -0800
Subject: [PATCH v1] CREATE DATABASE ... STRICT_UNICODE.

Introduce new per-database option STRICT_UNICODE, which causes
Postgres to reject any textual value containing unassigned code
points. (Surrogate halves were already rejected because they are
invalid for UTF-8.)

"Unassigned" means unassigned as of the version of Unicode that
Postgres is based on; that is, the version returned by the SQL
function unicode_version().

By rejecting unassigned code points, it helps stabilize the database
against semantic changes across Postgres versions resulting from
assignment of previously-unassigned code points. For instance, Unicode
normalization is only stable across Unicode versions when using
assigned code points.

New databases may use STRICT_UNICODE if the template also uses
STRICT_UNICODE, or if the template is template0. An existing database
may be altered to disable STRICT_UNICODE (and therefore allow
unassigned code points), but may not be altered to enable
STRICT_UNICODE (because existing values may contain unassigned code
points).

Discussion: https://postgr.es/m/f30b58657ceb71d5be032decf4058d454cc1df74.camel%40j-davis.com
---
 doc/src/sgml/ref/alter_database.sgml  | 33 ++++++++++++++
 doc/src/sgml/ref/create_database.sgml | 23 ++++++++++
 doc/src/sgml/ref/createdb.sgml        | 23 ++++++++++
 doc/src/sgml/ref/initdb.sgml          | 23 ++++++++++
 src/backend/commands/dbcommands.c     | 64 ++++++++++++++++++++++++---
 src/backend/utils/adt/oracle_compat.c | 16 +++++++
 src/backend/utils/adt/pg_locale.c     |  3 ++
 src/backend/utils/adt/varlena.c       | 35 +++++++++++++++
 src/backend/utils/init/postinit.c     |  2 +
 src/bin/initdb/initdb.c               | 21 +++++++++
 src/bin/pg_dump/pg_dump.c             | 12 +++++
 src/bin/psql/describe.c               | 11 +++++
 src/bin/scripts/createdb.c            | 15 +++++++
 src/include/catalog/pg_database.dat   |  1 +
 src/include/catalog/pg_database.h     |  3 ++
 src/include/utils/pg_locale.h         |  3 ++
 16 files changed, 281 insertions(+), 7 deletions(-)

diff --git a/doc/src/sgml/ref/alter_database.sgml b/doc/src/sgml/ref/alter_database.sgml
index 2479c41e8d..07e42dbdd4 100644
--- a/doc/src/sgml/ref/alter_database.sgml
+++ b/doc/src/sgml/ref/alter_database.sgml
@@ -25,6 +25,7 @@ ALTER DATABASE <replaceable class="parameter">name</replaceable> [ [ WITH ] <rep
 
 <phrase>where <replaceable class="parameter">option</replaceable> can be:</phrase>
 
+    STRICT_UNICODE <replaceable class="parameter">strict_unicode</replaceable>
     ALLOW_CONNECTIONS <replaceable class="parameter">allowconn</replaceable>
     CONNECTION LIMIT <replaceable class="parameter">connlimit</replaceable>
     IS_TEMPLATE <replaceable class="parameter">istemplate</replaceable>
@@ -112,6 +113,38 @@ ALTER DATABASE <replaceable class="parameter">name</replaceable> RESET ALL
       </listitem>
      </varlistentry>
 
+      <varlistentry>
+       <term><replaceable class="parameter">strict_unicode</replaceable></term>
+       <listitem>
+        <para>
+         If <literal>true</literal>, specifies that the database will
+         reject Unicode code points that are unassigned as of the version of
+         Unicode returned by <function>unicode_version()</function> (See <xref
+         linkend="functions-version"/>). Only valid if the encoding is
+         <literal>UTF8</literal>.
+        </para>
+        <para>
+         This setting may be changed from <literal>true</literal> to
+         <literal>false</literal> to enable storing textual values containing
+         unassigned Unicode code points. However, this setting may not be
+         changed from <literal>false</literal> to <literal>true</literal>,
+         because existing textual values in the database might contain
+         unassigned Unicode code points. A changed setting is recognized in
+         new connections.
+        </para>
+        <note>
+         <para>
+          This option affects all textual fields in the database, and
+          should only be used when the applications control the text
+          input. Furthermore, it may not be possible to use recently-assigned
+          code points if <productname>PostgreSQL</productname> is based on an
+          older version of Unicode that does not yet recognize the new
+          assignments.
+         </para>
+        </note>
+       </listitem>
+      </varlistentry>
+
       <varlistentry>
        <term><replaceable class="parameter">allowconn</replaceable></term>
        <listitem>
diff --git a/doc/src/sgml/ref/create_database.sgml b/doc/src/sgml/ref/create_database.sgml
index 72927960eb..c546789d28 100644
--- a/doc/src/sgml/ref/create_database.sgml
+++ b/doc/src/sgml/ref/create_database.sgml
@@ -25,6 +25,7 @@ CREATE DATABASE <replaceable class="parameter">name</replaceable>
     [ WITH ] [ OWNER [=] <replaceable class="parameter">user_name</replaceable> ]
            [ TEMPLATE [=] <replaceable class="parameter">template</replaceable> ]
            [ ENCODING [=] <replaceable class="parameter">encoding</replaceable> ]
+           [ STRICT_UNICODE [=] <replaceable class="parameter">strict_unicode</replaceable> ]
            [ STRATEGY [=] <replaceable class="parameter">strategy</replaceable> ]
            [ LOCALE [=] <replaceable class="parameter">locale</replaceable> ]
            [ LC_COLLATE [=] <replaceable class="parameter">lc_collate</replaceable> ]
@@ -120,6 +121,28 @@ CREATE DATABASE <replaceable class="parameter">name</replaceable>
        </para>
       </listitem>
      </varlistentry>
+     <varlistentry id="create-database-strict-unicode">
+      <term><replaceable class="parameter">strict_unicode</replaceable></term>
+      <listitem>
+       <para>
+        If <literal>true</literal>, specifies that the database will
+        reject Unicode code points that are unassigned as of the version of
+        Unicode returned by <function>unicode_version()</function> (See <xref
+        linkend="functions-version"/>). Only valid if the encoding is
+        <literal>UTF8</literal>.
+       </para>
+       <note>
+        <para>
+         This option affects all textual fields in the database, and
+         should only be used when the applications control the text
+         input. Furthermore, it may not be possible to use recently-assigned
+         code points if <productname>PostgreSQL</productname> is based on an
+         older version of Unicode that does not yet recognize the new
+         assignments.
+        </para>
+       </note>
+      </listitem>
+     </varlistentry>
      <varlistentry id="create-database-strategy" xreflabel="CREATE DATABASE STRATEGY">
       <term><replaceable class="parameter">strategy</replaceable></term>
       <listitem>
diff --git a/doc/src/sgml/ref/createdb.sgml b/doc/src/sgml/ref/createdb.sgml
index e4647d5ce7..d2b8014b59 100644
--- a/doc/src/sgml/ref/createdb.sgml
+++ b/doc/src/sgml/ref/createdb.sgml
@@ -118,6 +118,29 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>--strict-unicode</option></term>
+      <listitem>
+       <para>
+        Specifies that the database will reject Unicode code points that are
+        unassigned as of the version of Unicode returned by
+        <function>unicode_version()</function> (See <xref
+        linkend="functions-version"/>). Only valid if the encoding is
+        <literal>UTF8</literal>.
+       </para>
+       <note>
+        <para>
+         This option affects all textual fields in the database, and should
+         only be used when the applications control the text
+         input. Furthermore, it may not be possible to use recently-assigned
+         code points if <productname>PostgreSQL</productname> is based on an
+         older version of Unicode that does not yet recognize the new
+         assignments.
+        </para>
+       </note>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>-l <replaceable class="parameter">locale</replaceable></option></term>
       <term><option>--locale=<replaceable class="parameter">locale</replaceable></option></term>
diff --git a/doc/src/sgml/ref/initdb.sgml b/doc/src/sgml/ref/initdb.sgml
index cd75cae10e..4242aea278 100644
--- a/doc/src/sgml/ref/initdb.sgml
+++ b/doc/src/sgml/ref/initdb.sgml
@@ -227,6 +227,29 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry id="app-initdb-option-strict-unicode">
+      <term><option>--strict-unicode</option></term>
+      <listitem>
+       <para>
+        Specifies that the initial databases will reject Unicode code points
+        that are unassigned as of the version of Unicode returned by
+        <function>unicode_version()</function> (See <xref
+        linkend="functions-version"/>). Only valid if the encoding is
+        <literal>UTF8</literal>.
+       </para>
+       <note>
+        <para>
+         This option affects all textual fields in the initial databases, and
+         should only be used when the applications control the text
+         input. Furthermore, it may not be possible to use recently-assigned
+         code points if <productname>PostgreSQL</productname> is based on an
+         older version of Unicode that does not yet recognize the new
+         assignments.
+        </para>
+       </note>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="app-initdb-allow-group-access" xreflabel="group access">
       <term><option>-g</option></term>
       <term><option>--allow-group-access</option></term>
diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c
index b1327de71e..9524d4447c 100644
--- a/src/backend/commands/dbcommands.c
+++ b/src/backend/commands/dbcommands.c
@@ -116,7 +116,8 @@ static void movedb(const char *dbname, const char *tblspcname);
 static void movedb_failure_callback(int code, Datum arg);
 static bool get_db_info(const char *name, LOCKMODE lockmode,
 						Oid *dbIdP, Oid *ownerIdP,
-						int *encodingP, bool *dbIsTemplateP, bool *dbAllowConnP, bool *dbHasLoginEvtP,
+						int *encodingP, bool *dbstrictunicodeP, bool *dbIsTemplateP,
+						bool *dbAllowConnP, bool *dbHasLoginEvtP,
 						TransactionId *dbFrozenXidP, MultiXactId *dbMinMultiP,
 						Oid *dbTablespace, char **dbCollate, char **dbCtype, char **dbIculocale,
 						char **dbIcurules,
@@ -673,6 +674,7 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
 	Oid			src_dboid;
 	Oid			src_owner;
 	int			src_encoding = -1;
+	bool		src_strictunicode = false;
 	char	   *src_collate = NULL;
 	char	   *src_ctype = NULL;
 	char	   *src_iculocale = NULL;
@@ -697,6 +699,7 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
 	DefElem    *downer = NULL;
 	DefElem    *dtemplate = NULL;
 	DefElem    *dencoding = NULL;
+	DefElem	   *dstrictunicode = NULL;
 	DefElem    *dlocale = NULL;
 	DefElem    *dcollate = NULL;
 	DefElem    *dctype = NULL;
@@ -718,6 +721,7 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
 	char		dblocprovider = '\0';
 	char	   *canonname;
 	int			encoding = -1;
+	bool		dbstrictunicode = false;
 	bool		dbistemplate = false;
 	bool		dballowconnections = true;
 	int			dbconnlimit = DATCONNLIMIT_UNLIMITED;
@@ -756,6 +760,12 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
 				errorConflictingDefElem(defel, pstate);
 			dencoding = defel;
 		}
+		else if (strcmp(defel->defname, "strict_unicode") == 0)
+		{
+			if (dstrictunicode)
+				errorConflictingDefElem(defel, pstate);
+			dstrictunicode = defel;
+		}
 		else if (strcmp(defel->defname, "locale") == 0)
 		{
 			if (dlocale)
@@ -893,6 +903,8 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
 						 parser_errposition(pstate, dencoding->location)));
 		}
 	}
+	if (dstrictunicode)
+		dbstrictunicode = defGetBoolean(dstrictunicode);
 	if (dlocale && dlocale->arg)
 	{
 		dbcollate = defGetString(dlocale);
@@ -968,7 +980,7 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
 		dbtemplate = "template1";	/* Default template database name */
 
 	if (!get_db_info(dbtemplate, ShareLock,
-					 &src_dboid, &src_owner, &src_encoding,
+					 &src_dboid, &src_owner, &src_encoding, &src_strictunicode,
 					 &src_istemplate, &src_allowconn, &src_hasloginevt,
 					 &src_frozenxid, &src_minmxid, &src_deftablespace,
 					 &src_collate, &src_ctype, &src_iculocale, &src_icurules, &src_locprovider,
@@ -1021,6 +1033,8 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
 	/* If encoding or locales are defaulted, use source's setting */
 	if (encoding < 0)
 		encoding = src_encoding;
+	if (!dstrictunicode)
+		dbstrictunicode  = src_strictunicode;
 	if (dbcollate == NULL)
 		dbcollate = src_collate;
 	if (dbctype == NULL)
@@ -1057,6 +1071,12 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
 				 errhint("If the locale name is specific to ICU, use ICU_LOCALE.")));
 	dbctype = canonname;
 
+	if (dbstrictunicode && encoding != PG_UTF8)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("encoding \"%s\" does not support STRICT_UNICODE",
+						pg_encoding_to_char(encoding))));
+
 	check_encoding_locale_matches(encoding, dbcollate, dbctype);
 
 	if (dblocprovider == COLLPROVIDER_ICU)
@@ -1131,6 +1151,12 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
 							pg_encoding_to_char(src_encoding)),
 					 errhint("Use the same encoding as in the template database, or use template0 as template.")));
 
+		if (dbstrictunicode && !src_strictunicode)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+					 errmsg("STRICT_UNICODE is incompatible with the template database"),
+					 errhint("Use a template database with STRICT_UNICODE, or use template0 as template.")));
+
 		if (strcmp(dbcollate, src_collate) != 0)
 			ereport(ERROR,
 					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -1373,6 +1399,7 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
 		DirectFunctionCall1(namein, CStringGetDatum(dbname));
 	new_record[Anum_pg_database_datdba - 1] = ObjectIdGetDatum(datdba);
 	new_record[Anum_pg_database_encoding - 1] = Int32GetDatum(encoding);
+	new_record[Anum_pg_database_datstrictunicode - 1] = BoolGetDatum(dbstrictunicode);
 	new_record[Anum_pg_database_datlocprovider - 1] = CharGetDatum(dblocprovider);
 	new_record[Anum_pg_database_datistemplate - 1] = BoolGetDatum(dbistemplate);
 	new_record[Anum_pg_database_datallowconn - 1] = BoolGetDatum(dballowconnections);
@@ -1604,7 +1631,7 @@ dropdb(const char *dbname, bool missing_ok, bool force)
 	 */
 	pgdbrel = table_open(DatabaseRelationId, RowExclusiveLock);
 
-	if (!get_db_info(dbname, AccessExclusiveLock, &db_id, NULL, NULL,
+	if (!get_db_info(dbname, AccessExclusiveLock, &db_id, NULL, NULL, NULL,
 					 &db_istemplate, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL))
 	{
 		if (!missing_ok)
@@ -1819,7 +1846,7 @@ RenameDatabase(const char *oldname, const char *newname)
 	 */
 	rel = table_open(DatabaseRelationId, RowExclusiveLock);
 
-	if (!get_db_info(oldname, AccessExclusiveLock, &db_id, NULL, NULL, NULL,
+	if (!get_db_info(oldname, AccessExclusiveLock, &db_id, NULL, NULL, NULL, NULL,
 					 NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL))
 		ereport(ERROR,
 				(errcode(ERRCODE_UNDEFINED_DATABASE),
@@ -1929,7 +1956,7 @@ movedb(const char *dbname, const char *tblspcname)
 	 */
 	pgdbrel = table_open(DatabaseRelationId, RowExclusiveLock);
 
-	if (!get_db_info(dbname, AccessExclusiveLock, &db_id, NULL, NULL, NULL,
+	if (!get_db_info(dbname, AccessExclusiveLock, &db_id, NULL, NULL, NULL, NULL,
 					 NULL, NULL, NULL, NULL, &src_tblspcoid, NULL, NULL, NULL, NULL, NULL, NULL))
 		ereport(ERROR,
 				(errcode(ERRCODE_UNDEFINED_DATABASE),
@@ -2274,9 +2301,11 @@ AlterDatabase(ParseState *pstate, AlterDatabaseStmt *stmt, bool isTopLevel)
 	ScanKeyData scankey;
 	SysScanDesc scan;
 	ListCell   *option;
+	bool		dbstrictunicode = false;
 	bool		dbistemplate = false;
 	bool		dballowconnections = true;
 	int			dbconnlimit = DATCONNLIMIT_UNLIMITED;
+	DefElem    *dstrictunicode = NULL;
 	DefElem    *distemplate = NULL;
 	DefElem    *dallowconnections = NULL;
 	DefElem    *dconnlimit = NULL;
@@ -2290,7 +2319,13 @@ AlterDatabase(ParseState *pstate, AlterDatabaseStmt *stmt, bool isTopLevel)
 	{
 		DefElem    *defel = (DefElem *) lfirst(option);
 
-		if (strcmp(defel->defname, "is_template") == 0)
+		if (strcmp(defel->defname, "strict_unicode") == 0)
+		{
+			if (dstrictunicode)
+				errorConflictingDefElem(defel, pstate);
+			dstrictunicode = defel;
+		}
+		else if (strcmp(defel->defname, "is_template") == 0)
 		{
 			if (distemplate)
 				errorConflictingDefElem(defel, pstate);
@@ -2340,6 +2375,8 @@ AlterDatabase(ParseState *pstate, AlterDatabaseStmt *stmt, bool isTopLevel)
 		return InvalidOid;
 	}
 
+	if (dstrictunicode && dstrictunicode->arg)
+		dbstrictunicode = defGetBoolean(dstrictunicode);
 	if (distemplate && distemplate->arg)
 		dbistemplate = defGetBoolean(distemplate);
 	if (dallowconnections && dallowconnections->arg)
@@ -2400,6 +2437,15 @@ AlterDatabase(ParseState *pstate, AlterDatabaseStmt *stmt, bool isTopLevel)
 	/*
 	 * Build an updated tuple, perusing the information just obtained
 	 */
+	if (dstrictunicode)
+	{
+		if (dbstrictunicode && !datform->datstrictunicode)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+					 errmsg("STRICT_UNICODE cannot be enabled on an existing database")));
+		new_record[Anum_pg_database_datstrictunicode - 1] = BoolGetDatum(dbstrictunicode);
+		new_record_repl[Anum_pg_database_datstrictunicode - 1] = true;
+	}
 	if (distemplate)
 	{
 		new_record[Anum_pg_database_datistemplate - 1] = BoolGetDatum(dbistemplate);
@@ -2695,7 +2741,8 @@ pg_database_collation_actual_version(PG_FUNCTION_ARGS)
 static bool
 get_db_info(const char *name, LOCKMODE lockmode,
 			Oid *dbIdP, Oid *ownerIdP,
-			int *encodingP, bool *dbIsTemplateP, bool *dbAllowConnP, bool *dbHasLoginEvtP,
+			int *encodingP, bool *strictunicodeP, bool *dbIsTemplateP,
+			bool *dbAllowConnP, bool *dbHasLoginEvtP,
 			TransactionId *dbFrozenXidP, MultiXactId *dbMinMultiP,
 			Oid *dbTablespace, char **dbCollate, char **dbCtype, char **dbIculocale,
 			char **dbIcurules,
@@ -2777,6 +2824,9 @@ get_db_info(const char *name, LOCKMODE lockmode,
 				/* character encoding */
 				if (encodingP)
 					*encodingP = dbform->encoding;
+				/* reject unassigned code points? (UTF-8 only) */
+				if (strictunicodeP)
+					*strictunicodeP = dbform->datstrictunicode;
 				/* allowed as template? */
 				if (dbIsTemplateP)
 					*dbIsTemplateP = dbform->datistemplate;
diff --git a/src/backend/utils/adt/oracle_compat.c b/src/backend/utils/adt/oracle_compat.c
index b126a7d460..d7061f964f 100644
--- a/src/backend/utils/adt/oracle_compat.c
+++ b/src/backend/utils/adt/oracle_compat.c
@@ -16,11 +16,13 @@
 #include "postgres.h"
 
 #include "common/int.h"
+#include "common/unicode_category.h"
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "utils/builtins.h"
 #include "utils/formatting.h"
 #include "utils/memutils.h"
+#include "utils/pg_locale.h"
 #include "varatt.h"
 
 
@@ -1030,6 +1032,7 @@ chr			(PG_FUNCTION_ARGS)
 		/* for Unicode we treat the argument as a code point */
 		int			bytes;
 		unsigned char *wch;
+		pg_unicode_category category;
 
 		/*
 		 * We only allow valid Unicode code points; per RFC3629 that stops at
@@ -1042,6 +1045,19 @@ chr			(PG_FUNCTION_ARGS)
 					 errmsg("requested character too large for encoding: %u",
 							cvalue)));
 
+		if (database_strict_unicode)
+		{
+			category = unicode_category(cvalue);
+			if (category == PG_U_UNASSIGNED)
+				ereport(ERROR,
+						errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+						errmsg("unassigned Unicode code point: %06X", cvalue));
+			else if (category == PG_U_SURROGATE)
+				ereport(ERROR,
+						errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+						errmsg("Unicode code point is surrogate: %06X", cvalue));
+		}
+
 		if (cvalue > 0xffff)
 			bytes = 4;
 		else if (cvalue > 0x07ff)
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index 79b59b0af7..8ac9a35226 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -114,6 +114,9 @@ char	   *localized_full_days[7 + 1];
 char	   *localized_abbrev_months[12 + 1];
 char	   *localized_full_months[12 + 1];
 
+/* reject unassigned code points? (UTF-8 only) */
+bool database_strict_unicode = false;
+
 /* is the databases's LC_CTYPE the C locale? */
 bool		database_ctype_is_c = false;
 
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index 543afb66e5..e659a54c80 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -138,6 +138,7 @@ static char *text_position_next_internal(char *start_ptr, TextPositionState *sta
 static char *text_position_get_match_ptr(TextPositionState *state);
 static int	text_position_get_match_pos(TextPositionState *state);
 static void text_position_cleanup(TextPositionState *state);
+static void check_strict_unicode(text *input);
 static void check_collation_set(Oid collid);
 static int	text_cmp(text *arg1, text *arg2, Oid collid);
 static bytea *bytea_catenate(bytea *t1, bytea *t2);
@@ -200,6 +201,7 @@ cstring_to_text_with_len(const char *s, int len)
 	SET_VARSIZE(result, len + VARHDRSZ);
 	memcpy(VARDATA(result), s, len);
 
+	check_strict_unicode(result);
 	return result;
 }
 
@@ -609,6 +611,7 @@ textrecv(PG_FUNCTION_ARGS)
 
 	result = cstring_to_text_with_len(str, nbytes);
 	pfree(str);
+
 	PG_RETURN_TEXT_P(result);
 }
 
@@ -6298,6 +6301,38 @@ unicode_assigned(PG_FUNCTION_ARGS)
 	PG_RETURN_BOOL(true);
 }
 
+static void
+check_strict_unicode(text *input)
+{
+	unsigned char *p;
+	int			size;
+
+	if (!database_strict_unicode)
+		return;
+
+	Assert(GetDatabaseEncoding() == PG_UTF8);
+
+	/* count the characters, then check each code point's category */
+	size = pg_mbstrlen_with_len(VARDATA_ANY(input), VARSIZE_ANY_EXHDR(input));
+	p = (unsigned char *) VARDATA_ANY(input);
+	for (int i = 0; i < size; i++)
+	{
+		pg_wchar	code = utf8_to_unicode(p);
+		pg_unicode_category category = unicode_category(code);
+
+		if (category == PG_U_UNASSIGNED)
+			ereport(ERROR,
+					errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+					errmsg("unassigned Unicode code point: %06X", code));
+		else if (category == PG_U_SURROGATE)
+			ereport(ERROR,
+					errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+					errmsg("Unicode code point is surrogate: %06X", code));
+
+		p += pg_utf_mblen(p);
+	}
+}
+
 Datum
 unicode_normalize_func(PG_FUNCTION_ARGS)
 {
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 5ffe9bdd98..045e8c07aa 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -401,6 +401,8 @@ CheckMyDatabase(const char *name, bool am_superuser, bool override_allow_connect
 	SetConfigOption("client_encoding", GetDatabaseEncodingName(),
 					PGC_BACKEND, PGC_S_DYNAMIC_DEFAULT);
 
+	database_strict_unicode = dbform->datstrictunicode;
+
 	/* assign locale variables */
 	datum = SysCacheGetAttrNotNull(DATABASEOID, tup, Anum_pg_database_datcollate);
 	collate = TextDatumGetCString(datum);
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index ac409b0006..2418a7ba5b 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -93,6 +93,13 @@ typedef struct _stringlist
 	struct _stringlist *next;
 } _stringlist;
 
+enum trivalue
+{
+	TRI_DEFAULT,
+	TRI_NO,
+	TRI_YES
+};
+
 static const char *const auth_methods_host[] = {
 	"trust", "reject", "scram-sha-256", "md5", "password", "ident", "radius",
 #ifdef ENABLE_GSS
@@ -149,6 +156,7 @@ static char *icu_locale = NULL;
 static char *icu_rules = NULL;
 static const char *default_text_search_config = NULL;
 static char *username = NULL;
+static enum trivalue strict_unicode = TRI_DEFAULT;
 static bool pwprompt = false;
 static char *pwfilename = NULL;
 static char *superuser_password = NULL;
@@ -1509,6 +1517,9 @@ bootstrap_template1(void)
 	bki_lines = replace_token(bki_lines, "ENCODING",
 							  encodingid_to_string(encodingid));
 
+	bki_lines = replace_token(bki_lines, "STRICT_UNICODE",
+							  (strict_unicode == TRI_YES) ? "TRUE" : "FALSE");
+
 	bki_lines = replace_token(bki_lines, "LC_COLLATE",
 							  escape_quotes_bki(lc_collate));
 
@@ -2432,6 +2443,8 @@ usage(const char *progname)
 	printf(_("      --auth-local=METHOD   default authentication method for local-socket connections\n"));
 	printf(_(" [-D, --pgdata=]DATADIR     location for this database cluster\n"));
 	printf(_("  -E, --encoding=ENCODING   set default encoding for new databases\n"));
+	printf(_("      --no-strict-unicode   disable strict unicode\n"));
+	printf(_("      --strict-unicode      enable strict unicode\n"));
 	printf(_("  -g, --allow-group-access  allow group read/execute on data directory\n"));
 	printf(_("      --icu-locale=LOCALE   set ICU locale ID for new databases\n"));
 	printf(_("      --icu-rules=RULES     set additional ICU collation rules for new databases\n"));
@@ -3102,6 +3115,8 @@ main(int argc, char *argv[])
 		{"icu-locale", required_argument, NULL, 16},
 		{"icu-rules", required_argument, NULL, 17},
 		{"sync-method", required_argument, NULL, 18},
+		{"no-strict-unicode", no_argument, NULL, 19},
+		{"strict-unicode", no_argument, NULL, 20},
 		{NULL, 0, NULL, 0}
 	};
 
@@ -3286,6 +3301,12 @@ main(int argc, char *argv[])
 				if (!parse_sync_method(optarg, &sync_method))
 					exit(1);
 				break;
+			case 19:
+				strict_unicode = TRI_NO;
+				break;
+			case 20:
+				strict_unicode = TRI_YES;
+				break;
 			default:
 				/* getopt_long already emitted a complaint */
 				pg_log_error_hint("Try \"%s --help\" for more information.", progname);
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 2225a12718..7b028c0be3 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -2981,6 +2981,7 @@ dumpDatabase(Archive *fout)
 				i_datname,
 				i_datdba,
 				i_encoding,
+				i_datstrictunicode,
 				i_datlocprovider,
 				i_collate,
 				i_ctype,
@@ -3000,6 +3001,7 @@ dumpDatabase(Archive *fout)
 	const char *datname,
 			   *dba,
 			   *encoding,
+			   *datstrictunicode,
 			   *datlocprovider,
 			   *collate,
 			   *ctype,
@@ -3035,6 +3037,10 @@ dumpDatabase(Archive *fout)
 		appendPQExpBufferStr(dbQry, "daticurules, ");
 	else
 		appendPQExpBufferStr(dbQry, "NULL AS daticurules, ");
+	if (fout->remoteVersion >= 170000)
+		appendPQExpBufferStr(dbQry, "datstrictunicode, ");
+	else
+		appendPQExpBufferStr(dbQry, "'f' AS datstrictunicode, ");
 	appendPQExpBufferStr(dbQry,
 						 "(SELECT spcname FROM pg_tablespace t WHERE t.oid = dattablespace) AS tablespace, "
 						 "shobj_description(oid, 'pg_database') AS description "
@@ -3048,6 +3054,7 @@ dumpDatabase(Archive *fout)
 	i_datname = PQfnumber(res, "datname");
 	i_datdba = PQfnumber(res, "datdba");
 	i_encoding = PQfnumber(res, "encoding");
+	i_datstrictunicode = PQfnumber(res, "datstrictunicode");
 	i_datlocprovider = PQfnumber(res, "datlocprovider");
 	i_collate = PQfnumber(res, "datcollate");
 	i_ctype = PQfnumber(res, "datctype");
@@ -3067,6 +3074,7 @@ dumpDatabase(Archive *fout)
 	datname = PQgetvalue(res, 0, i_datname);
 	dba = getRoleName(PQgetvalue(res, 0, i_datdba));
 	encoding = PQgetvalue(res, 0, i_encoding);
+	datstrictunicode = PQgetvalue(res, 0, i_datstrictunicode);
 	datlocprovider = PQgetvalue(res, 0, i_datlocprovider);
 	collate = PQgetvalue(res, 0, i_collate);
 	ctype = PQgetvalue(res, 0, i_ctype);
@@ -3111,6 +3119,10 @@ dumpDatabase(Archive *fout)
 		appendStringLiteralAH(creaQry, encoding, fout);
 	}
 
+	if (strcmp(datstrictunicode, "t") == 0)
+		appendPQExpBufferStr(creaQry, " STRICT_UNICODE = TRUE");
+	else
+		appendPQExpBufferStr(creaQry, " STRICT_UNICODE = FALSE");
 	appendPQExpBufferStr(creaQry, " LOCALE_PROVIDER = ");
 	if (datlocprovider[0] == 'c')
 		appendPQExpBufferStr(creaQry, "libc");
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index b6a4eb1d56..0873bddbb6 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -953,6 +953,17 @@ listAllDbs(const char *pattern, bool verbose)
 		appendPQExpBuffer(&buf,
 						  "  NULL as \"%s\",\n",
 						  gettext_noop("ICU Rules"));
+	if (verbose)
+	{
+		if (pset.sversion >= 170000)
+			appendPQExpBuffer(&buf,
+							  "  d.datstrictunicode as \"%s\",\n",
+							  gettext_noop("Strict Unicode"));
+		else
+			appendPQExpBuffer(&buf,
+							  "  'f' as \"%s\",\n",
+							  gettext_noop("Strict Unicode"));
+	}
 	appendPQExpBufferStr(&buf, "  ");
 	printACLColumn(&buf, "d.datacl");
 	if (verbose)
diff --git a/src/bin/scripts/createdb.c b/src/bin/scripts/createdb.c
index 14970a6a5f..3f8f8d27fb 100644
--- a/src/bin/scripts/createdb.c
+++ b/src/bin/scripts/createdb.c
@@ -42,6 +42,8 @@ main(int argc, char *argv[])
 		{"locale-provider", required_argument, NULL, 4},
 		{"icu-locale", required_argument, NULL, 5},
 		{"icu-rules", required_argument, NULL, 6},
+		{"no-strict-unicode", no_argument, NULL, 7},
+		{"strict-unicode", no_argument, NULL, 8},
 		{NULL, 0, NULL, 0}
 	};
 
@@ -55,6 +57,7 @@ main(int argc, char *argv[])
 	char	   *host = NULL;
 	char	   *port = NULL;
 	char	   *username = NULL;
+	enum trivalue strict_unicode = TRI_DEFAULT;
 	enum trivalue prompt_password = TRI_DEFAULT;
 	ConnParams	cparams;
 	bool		echo = false;
@@ -139,6 +142,12 @@ main(int argc, char *argv[])
 			case 6:
 				icu_rules = pg_strdup(optarg);
 				break;
+			case 7:
+				strict_unicode = TRI_NO;
+				break;
+			case 8:
+				strict_unicode = TRI_YES;
+				break;
 			default:
 				/* getopt_long already emitted a complaint */
 				pg_log_error_hint("Try \"%s --help\" for more information.", progname);
@@ -207,6 +216,12 @@ main(int argc, char *argv[])
 		appendPQExpBufferStr(&sql, " ENCODING ");
 		appendStringLiteralConn(&sql, encoding, conn);
 	}
+	if (strict_unicode != TRI_DEFAULT)
+	{
+		const char *val = (strict_unicode == TRI_YES) ? "TRUE" : "FALSE";
+		appendPQExpBufferStr(&sql, " STRICT_UNICODE ");
+		appendStringLiteralConn(&sql, val, conn);
+	}
 	if (strategy)
 		appendPQExpBuffer(&sql, " STRATEGY %s", fmtId(strategy));
 	if (template)
diff --git a/src/include/catalog/pg_database.dat b/src/include/catalog/pg_database.dat
index 4306e8a3e8..330f11133d 100644
--- a/src/include/catalog/pg_database.dat
+++ b/src/include/catalog/pg_database.dat
@@ -15,6 +15,7 @@
 { oid => '1', oid_symbol => 'Template1DbOid',
   descr => 'default template for new databases',
   datname => 'template1', encoding => 'ENCODING',
+  datstrictunicode => 'STRICT_UNICODE',
   datlocprovider => 'LOCALE_PROVIDER', datistemplate => 't',
   datallowconn => 't', dathasloginevt => 'f', datconnlimit => '-1', datfrozenxid => '0',
   datminmxid => '1', dattablespace => 'pg_default', datcollate => 'LC_COLLATE',
diff --git a/src/include/catalog/pg_database.h b/src/include/catalog/pg_database.h
index 014baa7bab..21b512818b 100644
--- a/src/include/catalog/pg_database.h
+++ b/src/include/catalog/pg_database.h
@@ -52,6 +52,9 @@ CATALOG(pg_database,1262,DatabaseRelationId) BKI_SHARED_RELATION BKI_ROWTYPE_OID
 	/* database has login event triggers? */
 	bool		dathasloginevt;
 
+	/* reject unassigned code points? (UTF-8 only) */
+	bool		datstrictunicode BKI_DEFAULT(false);
+
 	/*
 	 * Max connections allowed. Negative values have special meaning, see
 	 * DATCONNLIMIT_* defines below.
diff --git a/src/include/utils/pg_locale.h b/src/include/utils/pg_locale.h
index 28c925b5af..f48853b98f 100644
--- a/src/include/utils/pg_locale.h
+++ b/src/include/utils/pg_locale.h
@@ -48,6 +48,9 @@ extern PGDLLIMPORT char *localized_full_days[];
 extern PGDLLIMPORT char *localized_abbrev_months[];
 extern PGDLLIMPORT char *localized_full_months[];
 
+/* reject unassigned code points? (UTF-8 only) */
+extern PGDLLIMPORT bool database_strict_unicode;
+
 /* is the databases's LC_CTYPE the C locale? */
 extern PGDLLIMPORT bool database_ctype_is_c;
 
-- 
2.34.1

#76Jeff Davis
pgsql@j-davis.com
In reply to: Jeff Davis (#75)
Re: Pre-proposal: unicode normalized text

On Thu, 2024-02-29 at 17:02 -0800, Jeff Davis wrote:

> Attached is an implementation of a per-database option STRICT_UNICODE
> which enforces the use of assigned code points only.

The CF app doesn't seem to point at the latest patch:

/messages/by-id/a0e85aca6e03042881924c4b31a840a915a9d349.camel@j-davis.com

which is perhaps why nobody has looked at it yet.

But in any case, I'm OK if this gets bumped to 18. I still think it's a
good feature, but some of the value will come later in v18 anyway, when
I plan to propose support for case folding. Case folding is a version
of lowercasing with compatibility guarantees when you only use assigned
code points.
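
For anyone skimming the thread without reading the patch, here is a rough
standalone sketch of the kind of check STRICT_UNICODE performs on text
input: walk the UTF-8 string, decode each code point, and reject surrogates
and unassigned code points. Everything named here (cp_status,
toy_is_assigned, decode_utf8, check_strict) is hypothetical and exists only
for illustration. In particular, toy_is_assigned is a stand-in that treats
just two small ranges as assigned; the patch itself consults
unicode_category() and the real Unicode tables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes; the patch raises ereport(ERROR) instead. */
typedef enum { CP_OK, CP_UNASSIGNED, CP_SURROGATE, CP_INVALID } cp_status;

/*
 * Toy stand-in for a real category lookup (assumption, not the patch's
 * logic): only U+0000..U+024F and U+0300..U+036F count as assigned here.
 */
static bool
toy_is_assigned(uint32_t cp)
{
	return cp <= 0x024F || (cp >= 0x0300 && cp <= 0x036F);
}

/*
 * Decode one UTF-8 sequence. Returns the number of bytes consumed, or 0 if
 * the sequence is truncated or malformed. Overlong forms are not rejected;
 * this is a sketch, not a full validator.
 */
static size_t
decode_utf8(const unsigned char *s, size_t len, uint32_t *cp)
{
	size_t		n;
	uint32_t	c;

	if (len == 0)
		return 0;
	if (s[0] < 0x80)
	{
		*cp = s[0];
		return 1;
	}
	else if ((s[0] & 0xE0) == 0xC0) { n = 2; c = s[0] & 0x1F; }
	else if ((s[0] & 0xF0) == 0xE0) { n = 3; c = s[0] & 0x0F; }
	else if ((s[0] & 0xF8) == 0xF0) { n = 4; c = s[0] & 0x07; }
	else
		return 0;

	if (len < n)
		return 0;
	for (size_t i = 1; i < n; i++)
	{
		if ((s[i] & 0xC0) != 0x80)
			return 0;
		c = (c << 6) | (s[i] & 0x3F);
	}
	*cp = c;
	return n;
}

/* Check every code point; on failure, *bad holds the offending code point. */
static cp_status
check_strict(const char *str, size_t len, uint32_t *bad)
{
	const unsigned char *p = (const unsigned char *) str;

	assert(str != NULL && bad != NULL);
	while (len > 0)
	{
		uint32_t	cp;
		size_t		n = decode_utf8(p, len, &cp);

		if (n == 0)
			return CP_INVALID;
		if (cp >= 0xD800 && cp <= 0xDFFF)
		{
			*bad = cp;
			return CP_SURROGATE;
		}
		if (!toy_is_assigned(cp))
		{
			*bad = cp;
			return CP_UNASSIGNED;
		}
		p += n;
		len -= n;
	}
	return CP_OK;
}
```

With this sketch, both spellings of 'á' from the first message (U+0061
U+0301 and U+00E1) pass, while a string containing U+0378 is rejected
because it falls outside the toy ranges.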

Regards,
Jeff Davis

#77Jeff Davis
pgsql@j-davis.com
In reply to: Jeff Davis (#75)
Re: Pre-proposal: unicode normalized text

On Thu, 2024-02-29 at 17:02 -0800, Jeff Davis wrote:

> Attached is an implementation of a per-database option STRICT_UNICODE
> which enforces the use of assigned code points only.

I'm withdrawing this patch due to lack of interest.

Regards,
Jeff Davis