Type modifiers for DOMAIN

Started by Takahiro Itagakiabout 16 years ago11 messages
#1Takahiro Itagaki
itagaki.takahiro@oss.ntt.co.jp

Hi,

I'm trying to use DOMAIN as just synonym types for database migration.
For example,
=# CREATE DOMAIN varchar2 AS pg_catalog.varchar;
=# CREATE DOMAIN number AS pg_catalog.numeric;

Domains were created successfully, but I cannot use type modifiers for them.
=# CREATE TABLE tbl (v varchar2(10));
ERROR: type modifier is not allowed for type "varchar2"

What reason do we have not to inherit typmodin/typmodout from the base type?
I found a comment in DefineDomain(),
/* Domains never accept typmods, so no typmodin/typmodout needed */
but can we relax the restriction? This feature would be useful for migration
from other DBMSes that have non-standard data types.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center

#2Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Takahiro Itagaki (#1)
Re: Type modifiers for DOMAIN

On Wed, Jan 6, 2010 at 1:12 AM, Takahiro Itagaki
<itagaki.takahiro@oss.ntt.co.jp> wrote:

What reason do we have not to inherit typmodin/typmodout from the base type?
I found a comment in DefineDomain(),
   /* Domains never accept typmods, so no typmodin/typmodout needed */
but can we relax the restriction? This feature would be useful for migration
from other DBMSes that have non-standard data types.

+1

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Takahiro Itagaki (#1)
Re: Type modifiers for DOMAIN

Takahiro Itagaki <itagaki.takahiro@oss.ntt.co.jp> writes:

Domains were created successfully, but I cannot use type modifiers for them.
=# CREATE TABLE tbl (v varchar2(10));
ERROR: type modifier is not allowed for type "varchar2"

What reason do we have not to inherit typmodin/typmodout from the base type?

Because the domain is supposed to be opaque as to exactly what its
underlying type is. In particular, you're supposed to do this:

CREATE DOMAIN varchar2 AS pg_catalog.varchar(10);

If you look in the SQL spec you will not find any suggestion that it
should work the way you propose.

regards, tom lane

#4Josh Berkus
josh@agliodbs.com
In reply to: Takahiro Itagaki (#1)
Re: Type modifiers for DOMAIN

/* Domains never accept typmods, so no typmodin/typmodout needed */
but can we relax the restriction? This feature would be useful for migration
from other DBMSes that have non-standard data types.

For migration, wouldn't it be adequate simply to ignore the typemod? Or
to allow it as part of the type name? e.g.:

create domain "varchar2(10)" AS varchar
check ( length(value) <= 10 );

I know that wouldn't actually work, but you see what I'm getting at?

--Josh Berkus

#5Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Takahiro Itagaki (#1)
Re: Type modifiers for DOMAIN

Takahiro Itagaki <itagaki.takahiro@oss.ntt.co.jp> wrote:

This feature would be useful for migration
from other DBMSes that have non-standard data types.

Domains can indeed be useful to allow migration -- if they are used
as intended. If you identify all of the unique data domains on your
source platform and define the columns with domains rather than bare
types, you can easily identify the appropriate mapping on some other
platform and declare the appropriate domains there. Migration is
then easy.

-Kevin

#6Takahiro Itagaki
itagaki.takahiro@oss.ntt.co.jp
In reply to: Tom Lane (#3)
Re: Type modifiers for DOMAIN

Tom Lane <tgl@sss.pgh.pa.us> wrote:

Because the domain is supposed to be opaque as to exactly what its
underlying type is. In particular, you're supposed to do this:

CREATE DOMAIN varchar2 AS pg_catalog.varchar(10);

If you look in the SQL spec you will not find any suggestion that it
should work the way you propose.

Hmmm, it means we need to create domains for each length of character types.
If we allowed domains with pass-through-modifiers, it could save codes
than CREATE (scalar) TYPE.

=# CREATE DOMAIN digit_varchar AS varchar ( <pass-through-modifiers> )
CHECK (VALUE ~ E'^\\d*$');
=# CREATE TABLE tbl (digit10 digit_varchar(10));

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center

#7Chuck McDevitt
cmcdevitt@greenplum.com
In reply to: Tom Lane (#3)
Bug with PATHs having non-ASCII characters

Just an FYI regarding this bug:

http://archives.postgresql.org/pgsql-bugs/2009-12/msg00267.php

Windows always uses UNICODE to store file and directory names.

The wide-char version of any WIN32 API call will accept or return data in UTF-16 encoded Unicode, regardless of the local environment's single-byte (MBCS) encoding settings (codepage).

So in the windows environment, at least, you can always be sure how to handle file/directory/path that includes non-ASCII characters.

It's a little bit of a pain to use the wide-char API calls from PostgreSQL, but converting UTF-16 from and to UTF-8 is pretty easy and a guaranteed 1:1 mapping.

P.s. The non-wide-char version of the Win32 API is just a bunch of wrappers that convert the char data to/from UTF-16 based on the current codepage. The wide-char API is the native one.
To force the call to the wide-char API version, you just add W to the end of the function name (adding A forces it the other way).

#8Takahiro Itagaki
itagaki.takahiro@oss.ntt.co.jp
In reply to: Chuck McDevitt (#7)
1 attachment(s)
Re: Bug with PATHs having non-ASCII characters

Chuck McDevitt <cmcdevitt@greenplum.com> wrote:

Just an FYI regarding this bug:
http://archives.postgresql.org/pgsql-bugs/2009-12/msg00267.php

The wide-char version of any WIN32 API call will accept or return
data in UTF-16 encoded Unicode, regardless of the local environment's
single-byte (MBCS) encoding settings (codepage).

I have a Windows-specific patch for open(), attached for reference.
But we need to consider about other issues:

- We need to consider about not only only open(), but also opendir(),
stat() and symlink().

- An entirely-different fix is needed for non-Windows platforms.
Probably we will convert encodings from GetDatabaseEncoding() to
GetPlatformEncoding() in MBCS, but this is not needed on Windows.
We should consider avoiding random ifdef blocks for the switching.

- Those conversions are not free. We might need to avoid conversions
for paths under $PGDATA because we only use ascii names in the server.
I used a test with IS_HIGHBIT_SET in the attached patch, but I'm not
sure whether it is the best method.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center

Attachments:

pgwin32_open.patchapplication/octet-stream; name=pgwin32_open.patchDownload
diff -cprN head/src/port/open.c eventlog/src/port/open.c
*** head/src/port/open.c	2009-06-11 23:49:15.000000000 +0900
--- eventlog/src/port/open.c	2009-09-15 12:31:24.556451161 +0900
***************
*** 23,28 ****
--- 23,31 ----
  #include <fcntl.h>
  #include <assert.h>
  
+ #ifndef FRONTEND
+ #include "mb/pg_wchar.h"
+ #endif
  
  static int
  openFlagsToCreateFileFlags(int openFlags)
*************** pgwin32_open(const char *fileName, int f
*** 65,70 ****
--- 68,78 ----
  	HANDLE		h = INVALID_HANDLE_VALUE;
  	SECURITY_ATTRIBUTES sa;
  	int			loops = 0;
+ 	DWORD		dwDesiredAccess;
+ 	DWORD		dwShareMode;
+ 	DWORD		dwCreationDisposition;
+ 	DWORD		dwFlagsAndAttributes;
+ 	WCHAR	   *wFileName = NULL;
  
  	/* Check that we can handle the request */
  	assert((fileFlags & ((O_RDONLY | O_WRONLY | O_RDWR) | O_APPEND |
*************** pgwin32_open(const char *fileName, int f
*** 72,97 ****
  						 _O_SHORT_LIVED | O_DSYNC | O_DIRECT |
  		  (O_CREAT | O_TRUNC | O_EXCL) | (O_TEXT | O_BINARY))) == fileFlags);
  
- 	sa.nLength = sizeof(sa);
- 	sa.bInheritHandle = TRUE;
- 	sa.lpSecurityDescriptor = NULL;
- 
- 	while ((h = CreateFile(fileName,
  	/* cannot use O_RDONLY, as it == 0 */
! 					  (fileFlags & O_RDWR) ? (GENERIC_WRITE | GENERIC_READ) :
! 					 ((fileFlags & O_WRONLY) ? GENERIC_WRITE : GENERIC_READ),
  	/* These flags allow concurrent rename/unlink */
! 					(FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE),
! 						   &sa,
! 						   openFlagsToCreateFileFlags(fileFlags),
! 						   FILE_ATTRIBUTE_NORMAL |
  					 ((fileFlags & O_RANDOM) ? FILE_FLAG_RANDOM_ACCESS : 0) |
  			   ((fileFlags & O_SEQUENTIAL) ? FILE_FLAG_SEQUENTIAL_SCAN : 0) |
  			  ((fileFlags & _O_SHORT_LIVED) ? FILE_ATTRIBUTE_TEMPORARY : 0) |
  				((fileFlags & O_TEMPORARY) ? FILE_FLAG_DELETE_ON_CLOSE : 0) |
  					  ((fileFlags & O_DIRECT) ? FILE_FLAG_NO_BUFFERING : 0) |
! 					   ((fileFlags & O_DSYNC) ? FILE_FLAG_WRITE_THROUGH : 0),
! 						   NULL)) == INVALID_HANDLE_VALUE)
  	{
  		/*
  		 * Sharing violation or locking error can indicate antivirus, backup
--- 80,126 ----
  						 _O_SHORT_LIVED | O_DSYNC | O_DIRECT |
  		  (O_CREAT | O_TRUNC | O_EXCL) | (O_TEXT | O_BINARY))) == fileFlags);
  
  	/* cannot use O_RDONLY, as it == 0 */
! 	dwDesiredAccess = (fileFlags & O_RDWR) ? (GENERIC_WRITE | GENERIC_READ) :
! 						((fileFlags & O_WRONLY) ? GENERIC_WRITE : GENERIC_READ);
  	/* These flags allow concurrent rename/unlink */
! 	dwShareMode = (FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE);
!     dwCreationDisposition = openFlagsToCreateFileFlags(fileFlags);
!     dwFlagsAndAttributes = FILE_ATTRIBUTE_NORMAL |
  					 ((fileFlags & O_RANDOM) ? FILE_FLAG_RANDOM_ACCESS : 0) |
  			   ((fileFlags & O_SEQUENTIAL) ? FILE_FLAG_SEQUENTIAL_SCAN : 0) |
  			  ((fileFlags & _O_SHORT_LIVED) ? FILE_ATTRIBUTE_TEMPORARY : 0) |
  				((fileFlags & O_TEMPORARY) ? FILE_FLAG_DELETE_ON_CLOSE : 0) |
  					  ((fileFlags & O_DIRECT) ? FILE_FLAG_NO_BUFFERING : 0) |
! 					   ((fileFlags & O_DSYNC) ? FILE_FLAG_WRITE_THROUGH : 0);
! 
! 	sa.nLength = sizeof(sa);
! 	sa.bInheritHandle = TRUE;
! 	sa.lpSecurityDescriptor = NULL;
! 
! #ifndef FRONTEND
! 	/*
! 	 * Use wide-character file name only if the database encoding doesn't match
! 	 * to the platform encoding and the path contains any multi-byte characters.
! 	 */
! 	if (GetDatabaseEncoding() != GetPlatformEncoding())
! 	{
! 		int		len;
! 		bool	hasMBChar = false;
! 
! 		for (len = 0; fileName[len]; len++)
! 			hasMBChar |= IS_HIGHBIT_SET(fileName[len]);
! 		if (hasMBChar)
! 			wFileName = pgwin32_toUTF16(fileName, len, NULL);
! 	}
! #endif
! 
! 	while ((h = (wFileName != NULL
! 				? CreateFileW(wFileName, dwDesiredAccess, dwShareMode, &sa,
! 						dwCreationDisposition, dwFlagsAndAttributes, NULL)
! 				: CreateFileA(fileName, dwDesiredAccess, dwShareMode, &sa,
! 						dwCreationDisposition, dwFlagsAndAttributes, NULL))
! 		   ) == INVALID_HANDLE_VALUE)
  	{
  		/*
  		 * Sharing violation or locking error can indicate antivirus, backup
*************** pgwin32_open(const char *fileName, int f
*** 119,128 ****
--- 148,166 ----
  				continue;
  		}
  
+ #ifndef FRONTEND
+ 		if (wFileName)
+ 			pfree(wFileName);
+ #endif
  		_dosmaperr(err);
  		return -1;
  	}
  
+ #ifndef FRONTEND
+ 	if (wFileName)
+ 		pfree(wFileName);
+ #endif
+ 
  	/* _open_osfhandle will, on error, set errno accordingly */
  	if ((fd = _open_osfhandle((long) h, fileFlags & O_APPEND)) < 0)
  		CloseHandle(h);			/* will not affect errno */
#9Greg Stark
gsstark@mit.edu
In reply to: Chuck McDevitt (#7)
Re: Bug with PATHs having non-ASCII characters

On Thu, Jan 7, 2010 at 12:57 AM, Chuck McDevitt <cmcdevitt@greenplum.com> wrote:

Windows always uses UNICODE to store file and directory names.

So what does that mean when the filesystem is a shared filesystem or
one mounted in Windows but originally written out by another OS?

I think the answer is that it interprets the data on disk as being in
an encoding the user claims it is and if it isn't then things go bad.
I'm not sure what that means for writable filesystems when you try
writing a unicode character that encoding can't encode though.

--
greg

#10Magnus Hagander
magnus@hagander.net
In reply to: Takahiro Itagaki (#8)
Re: Bug with PATHs having non-ASCII characters

On Thu, Jan 7, 2010 at 02:37, Takahiro Itagaki
<itagaki.takahiro@oss.ntt.co.jp> wrote:

Chuck McDevitt <cmcdevitt@greenplum.com> wrote:

Just an FYI regarding this bug:
http://archives.postgresql.org/pgsql-bugs/2009-12/msg00267.php

The wide-char version of any WIN32 API call will accept or return
data in UTF-16 encoded Unicode, regardless of the local environment's
single-byte (MBCS) encoding settings (codepage).

I have a Windows-specific patch for open(), attached for reference.
But we need to consider about other issues:

 - We need to consider about not only only open(), but also opendir(),
   stat() and symlink().

 - An entirely-different fix is needed for non-Windows platforms.
   Probably we will convert encodings from GetDatabaseEncoding() to
   GetPlatformEncoding() in MBCS, but this is not needed on Windows.
   We should consider avoiding random ifdef blocks for the switching.

Shouldn't we develop this with "multi-platform" in mind from the
start, instead of doing a Windows specific patch? It may be that we
end up with two completely different codepaths, but more likely we can
share some of it between them?

 - Those conversions are not free. We might need to avoid conversions
   for paths under $PGDATA because we only use ascii names in the server.
   I used a test with IS_HIGHBIT_SET in the attached patch, but I'm not
   sure whether it is the best method.

If we're going to end up with our own wrapper anyway, we can just pass
an extra parameter to it saying if we want conversion or not? That way
we can avoid doing it for cases where we know it's safe, but do it
when user-input is included?

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#11Alvaro Herrera
alvherre@commandprompt.com
In reply to: Magnus Hagander (#10)
Re: Bug with PATHs having non-ASCII characters

Magnus Hagander wrote:

On Thu, Jan 7, 2010 at 02:37, Takahiro Itagaki
<itagaki.takahiro@oss.ntt.co.jp> wrote:

I have a Windows-specific patch for open(), attached for reference.
But we need to consider about other issues:

�- We need to consider about not only only open(), but also opendir(),
� �stat() and symlink().

�- An entirely-different fix is needed for non-Windows platforms.
� �Probably we will convert encodings from GetDatabaseEncoding() to
� �GetPlatformEncoding() in MBCS, but this is not needed on Windows.
� �We should consider avoiding random ifdef blocks for the switching.

Shouldn't we develop this with "multi-platform" in mind from the
start, instead of doing a Windows specific patch? It may be that we
end up with two completely different codepaths, but more likely we can
share some of it between them?

I'm not sure there's a lot you can do in platforms other than Windows.
On Windows the filenames are all Unicode and you know you can get them
using UTF16. On other platforms it could be anything, and it is
certainly not server_encoding; even different filesystems can use
different encodings.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.