Re: [SQL] Slow Inserts Again

Started by Frank Mortonover 26 years ago7 messages
#1Frank Morton
fmorton@base2inc.com

I have tried many combinations of things to speed this
up as you all have suggested. I have had no success
using "copy" at all because of problems with quotes
and other punctuation in the data.

I must tell you, this doesn't sound reasonable to me. It's usually very
easy, if you already have a program that writes down the fields, to make
sure it scans the contents thereof and adds a backslash before each tab,
newline and backslash in every one of the fields.

Is there a doc somewhere about what characters are treated
in some special way when doing a copy?

#2Herouth Maoz
herouth@oumail.openu.ac.il
In reply to: Frank Morton (#1)

At 17:10 +0300 on 03/05/1999, Frank Morton wrote:

Is there a doc somewhere about what characters are treated
in some special way when doing a copy?

Yes, the documentation of the COPY command. The essence is that if you use
the default delimiter (tab), you need to put a backslash before each tab,
newline and backslash in each of the text fields. Oh, and null fields
should be converted to \N.

It's all in the docs. Why don't you try to copy some of the rows into a
temporary table, and dump that table using pg_dump -a -t table_name dbname?
It may give you a clue.

Herouth

--
Herouth Maoz, Internet developer.
Open University of Israel - Telem project
http://telem.openu.ac.il/~herutma

#3Ross J. Reedstrom
reedstrm@rice.edu
In reply to: Herouth Maoz (#2)
pg_dump bug (was Re: [SQL] Slow Inserts Again)

Herouth Maoz wrote:

It's all in the docs. Why don't you try to copy some of the rows into a
temporary table, and dump that table using pg_dump -a -t table_name dbname?
It may give you a clue.

this is unrelated to the slow insert bug, but Herouth's suggestion
reminded me of something that needs to be looked at before 6.5 is out of
beta: pg_dump seems to have problems with mixed case tablenames. There
doesn't seem to be a way to send a quoted tablename into pg_dump as the
value for a -t option, in 6.4.2 (example below). Can someone try this on
6.5beta? I know some issues with quoting output of pg_dump (i.e. COPY)
was addressed, I'm wondering if input parsing got touched.

Actually, groveling through the source, it looks like even 6.4.2 should
"do the right thing": the query is built with the class (table) name
wrapped with fmtId(), which should do exactly this quoting. Anyone else
see this?

P.S. shouldn't the non existence of the table handed to pg_dump raise a
user visible error?

P.P.S. How does one go about setting up a second version of PG to test
on the same machine, without interference with the production (older)
version? I've only got the one machine to test on.

test=> create table TestTable (a int, b text);
CREATE
test=> create table "TestTable" (c int, d text);
CREATE
test=> \q
$ pg_dump -t TestTable test
CREATE TABLE "testtable" (
"a" int4,
"b" text);
COPY "testtable" FROM stdin;
\.
$ pg_dump -t "TestTable" test
CREATE TABLE "testtable" (
"a" int4,
"b" text);
COPY "testtable" FROM stdin;
\.
$ pg_dump -t \"TestTable\" test
$ pg_dump test
CREATE TABLE "testtable" (
"a" int4,
"b" text);
CREATE TABLE "TestTable" (
"c" int4,
"d" text);
COPY "testtable" FROM stdin;
\.
COPY "TestTable" FROM stdin;
\.

--
Ross J. Reedstrom, Ph.D., <reedstrm@rice.edu>
NSBRI Research Scientist/Programmer
Computer and Information Technology Institute
Rice University, 6100 S. Main St., Houston, TX 77005

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ross J. Reedstrom (#3)
Re: [HACKERS] pg_dump bug (was Re: [SQL] Slow Inserts Again)

"Ross J. Reedstrom" <reedstrm@rice.edu> writes:

P.S. shouldn't the non existence of the table handed to pg_dump raise a
user visible error?

Probably.

P.P.S. How does one go about setting up a second version of PG to test
on the same machine, without interference with the production (older)
version? I've only got the one machine to test on.

No problem, I do that all the time. When running "configure", override
the default install location *and* the default port number. For
example, I build test versions using

--with-pgport=5440 --prefix=/users/postgres/testversion

(You might also want --enable-cassert when working with beta code.)

Build, and make install as per normal --- it will go into the directory
you specified as the prefix. Before doing initdb or starting the test
postmaster, make sure you have environment set up to reference the test
location; the critical stuff is PATH, PGLIB, PGDATA, and USER (initdb
uses USER to decide who'll be superuser).

regards, tom lane

#5Thomas Lockhart
lockhart@alumni.caltech.edu
In reply to: Herouth Maoz (#2)
1 attachment(s)
Re: [HACKERS] pg_dump bug (was Re: [SQL] Slow Inserts Again)

... pg_dump seems to have problems with mixed case tablenames. There
doesn't seem to be a way to send a quoted tablename into pg_dump as the
value for a -t option, in 6.4.2 (example below). Can someone try this on
6.5beta? I know some issues with quoting output of pg_dump (i.e. COPY)
was addressed, I'm wondering if input parsing got touched.

pg_dump explicitly converts all table names to lowercase. I've got a
patch which looks for a table name which starts with a double quote,
and suppresses the case conversion if so:

[postgres@golem pg_dump]$ pg_dump -t '"MixedCase"' postgres
CREATE TABLE "MixedCase" (
"i" int4);
COPY "MixedCase" FROM stdin;
1
2
\.

Patch enclosed for you to try. Bruce, any reason not to apply this to
the tree?

- Tom

--
Thomas Lockhart lockhart@alumni.caltech.edu
South Pasadena, California

Attachments:

pg_dump.c.patchtext/plain; charset=us-ascii; name=pg_dump.c.patchDownload
*** pg_dump.c.orig	Thu Apr 15 05:08:53 1999
--- pg_dump.c	Tue May  4 13:47:01 1999
***************
*** 606,615 ****
  					int			i;
  
  					tablename = strdup(optarg);
! 					for (i = 0; tablename[i]; i++)
! 						if (isascii((unsigned char) tablename[i]) &&
! 							isupper(tablename[i]))
! 							tablename[i] = tolower(tablename[i]);
  				}
  				break;
  			case 'v':			/* verbose */
--- 606,626 ----
  					int			i;
  
  					tablename = strdup(optarg);
! 					/* quoted string? Then strip quotes and preserve case... */
! 					if (tablename[0] == '"')
! 					{
! 						strcpy(tablename, &tablename[1]);
! 						if (*(tablename+strlen(tablename)-1) == '"')
! 							*(tablename+strlen(tablename)-1) = '\0';
! 					}
! 					/* otherwise, convert table name to lowercase... */
! 					else
! 					{
! 						for (i = 0; tablename[i]; i++)
! 							if (isascii((unsigned char) tablename[i]) &&
! 								isupper(tablename[i]))
! 								tablename[i] = tolower(tablename[i]);
! 					}
  				}
  				break;
  			case 'v':			/* verbose */
#6Bruce Momjian
maillist@candle.pha.pa.us
In reply to: Thomas Lockhart (#5)
Re: [HACKERS] pg_dump bug (was Re: [SQL] Slow Inserts Again)

Apply, please. It is a bug fix.

... pg_dump seems to have problems with mixed case tablenames. There
doesn't seem to be a way to send a quoted tablename into pg_dump as the
value for a -t option, in 6.4.2 (example below). Can someone try this on
6.5beta? I know some issues with quoting output of pg_dump (i.e. COPY)
was addressed, I'm wondering if input parsing got touched.

pg_dump explicitly converts all table names to lowercase. I've got a
patch which looks for a table name which starts with a double quote,
and suppresses the case conversion if so:

[postgres@golem pg_dump]$ pg_dump -t '"MixedCase"' postgres
CREATE TABLE "MixedCase" (
"i" int4);
COPY "MixedCase" FROM stdin;
1
2
\.

Patch enclosed for you to try. Bruce, any reason not to apply this to
the tree?

- Tom

--
Thomas Lockhart lockhart@alumni.caltech.edu
South Pasadena, California

*** pg_dump.c.orig	Thu Apr 15 05:08:53 1999
--- pg_dump.c	Tue May  4 13:47:01 1999
***************
*** 606,615 ****
int			i;
tablename = strdup(optarg);
! 					for (i = 0; tablename[i]; i++)
! 						if (isascii((unsigned char) tablename[i]) &&
! 							isupper(tablename[i]))
! 							tablename[i] = tolower(tablename[i]);
}
break;
case 'v':			/* verbose */
--- 606,626 ----
int			i;

tablename = strdup(optarg);
! /* quoted string? Then strip quotes and preserve case... */
! if (tablename[0] == '"')
! {
! strcpy(tablename, &tablename[1]);
! if (*(tablename+strlen(tablename)-1) == '"')
! *(tablename+strlen(tablename)-1) = '\0';
! }
! /* otherwise, convert table name to lowercase... */
! else
! {
! for (i = 0; tablename[i]; i++)
! if (isascii((unsigned char) tablename[i]) &&
! isupper(tablename[i]))
! tablename[i] = tolower(tablename[i]);
! }
}
break;
case 'v': /* verbose */

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#7Thomas Lockhart
lockhart@alumni.caltech.edu
In reply to: Bruce Momjian (#6)
Re: [HACKERS] pg_dump bug (was Re: [SQL] Slow Inserts Again)

Apply, please. It is a bug fix.

... pg_dump seems to have problems with mixed case tablenames.

pg_dump explicitly converts all table names to lowercase. I've got a
patch which looks for a table name which starts with a double quote,
and suppresses the case conversion if so:

Already done :)

- Tom

--
Thomas Lockhart lockhart@alumni.caltech.edu
South Pasadena, California