Take skip header out of a loop in COPY FROM

Started by Surafel Temesgenover 6 years ago5 messages
#1Surafel Temesgen
surafel3000@gmail.com
1 attachment(s)

Hello,

Even if skipping header is done only once its checked and skipped in a
loop. If I don’t miss something it can be done out side a loop like
attached patch

regards

Surafel

Attachments:

outing-skip-header-from-loop-v1.patchtext/x-patch; charset=US-ASCII; name=outing-skip-header-from-loop-v1.patchDownload
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 4f04d122c3..4e7709d7bf 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -3004,6 +3004,13 @@ CopyFrom(CopyState cstate)
 	errcallback.previous = error_context_stack;
 	error_context_stack = &errcallback;
 
+	/* on input just throw the header line away */
+	if ( cstate->header_line)
+	{
+		cstate->cur_lineno++;
+		(void) CopyReadLine(cstate);
+	}
+
 	for (;;)
 	{
 		TupleTableSlot *myslot;
@@ -3642,14 +3649,6 @@ NextCopyFromRawFields(CopyState cstate, char ***fields, int *nfields)
 	/* only available for text or csv input */
 	Assert(!cstate->binary);
 
-	/* on input just throw the header line away */
-	if (cstate->cur_lineno == 0 && cstate->header_line)
-	{
-		cstate->cur_lineno++;
-		if (CopyReadLine(cstate))
-			return false;		/* done */
-	}
-
 	cstate->cur_lineno++;
 
 	/* Actually read the line into memory here */
#2Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Surafel Temesgen (#1)
Re: Take skip header out of a loop in COPY FROM

On 22/08/2019 11:31, Surafel Temesgen wrote:

Hello,

Even if skipping header is done only once its checked and skipped in a
loop. If I don’t miss something it can be done out side a loop like
attached patch

You may be on to something, but if we move it to CopyFrom(), as in your
patch, then it won't get executed e.g. from the calls in file_fdw.
file_fdw calls BeginCopyFrom(), followed by NextCopyFrom(); it doesn't
use CopyFrom().

- Heikki

#3Adam Lee
ali@pivotal.io
In reply to: Heikki Linnakangas (#2)
Re: Take skip header out of a loop in COPY FROM

On Thu, Aug 22, 2019 at 11:48:31AM +0300, Heikki Linnakangas wrote:

On 22/08/2019 11:31, Surafel Temesgen wrote:

Hello,

Even if skipping header is done only once its checked and skipped in a
loop. If I don’t miss something it can be done out side a loop like
attached patch

You may be on to something, but if we move it to CopyFrom(), as in your
patch, then it won't get executed e.g. from the calls in file_fdw. file_fdw
calls BeginCopyFrom(), followed by NextCopyFrom(); it doesn't use
CopyFrom().

- Heikki

Yes.

My next thought is to call unlikely() here, but we don't have it...
/messages/by-id/CABRT9RC-AUuQL6txxsoOkLxjK1iTpyexpbizRF4Zxny1GXASGg@mail.gmail.com

--
Adam Lee

#4Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Adam Lee (#3)
Re: Take skip header out of a loop in COPY FROM

On 22/08/2019 12:54, Adam Lee wrote:

My next thought is to call unlikely() here, but we don't have it...
/messages/by-id/CABRT9RC-AUuQL6txxsoOkLxjK1iTpyexpbizRF4Zxny1GXASGg@mail.gmail.com

We do, actually, since commit aa3ca5e3dd in v10.

Not sure it's worth the trouble here. Optimizing COPY in general would
be good, even small speedups there are helpful because everyone uses
COPY, but without some evidence I don't believe particular branch is
even measurable.

- Heikki

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#4)
Re: Take skip header out of a loop in COPY FROM

Heikki Linnakangas <hlinnaka@iki.fi> writes:

On 22/08/2019 12:54, Adam Lee wrote:

My next thought is to call unlikely() here, but we don't have it...

We do, actually, since commit aa3ca5e3dd in v10.

Not sure it's worth the trouble here. Optimizing COPY in general would
be good, even small speedups there are helpful because everyone uses
COPY, but without some evidence I don't believe particular branch is
even measurable.

I concur that there's no reason to think that this if-test has a
measurable performance cost. We're about to do CopyReadLine which
certainly has way more than one branch's worth of processing in it.

If we want to get involved with sprinkling unlikely() calls into
copy.c, the inner per-character or per-field loops would be the
place to look for wins IMO.

I'm going to mark this CF entry as Returned With Feedback.

regards, tom lane