why do we need create tuplestore for each fetch?

Started by 高增琦about 14 years ago4 messages

pgf00a@gmail.com

about 14 years ago

Hi everyone,

I found this several days ago when I try to debug a "fetch" of cursor.
And I have sent a mail to this list, but no one reply...
Maybe this is a very simple problem, please help me, thanks a lot...

Here is the example:
create table t (a int);
insert into t values (1),(3),(5),(7),(9);
insert into t select a+1 from t;
begin;
declare c cursor for select * from t order by a;
fetch 3 in c;
fetch 3 in c;
fetch 3 in c;

In 'PortalRun', a fetch stmt will be treated with PORTAL_UTIL_SELECT,
and then a tuplestore will be created in 'FillPortalStore' in the
fetch stmt's portal.

In 'FillPortalStore', all result will be store at that tuplestore,
Then, go back to 'PortalRun'; next, 'PortalRunSelect' will send this
results to client...

My problem is: why do we need create that tuplestore as an
middle storeage? why do not we just send these result to clent
at the first time?

Thank you very much.

--
GaoZengqi
pgf00a@gmail.com
zengqigao@gmail.com

robertmhaas@gmail.com

about 14 years ago

In reply to: 高增琦 (#1)

Re: why do we need create tuplestore for each fetch?

On Thu, Dec 15, 2011 at 8:30 AM, 高增琦 <pgf00a@gmail.com> wrote:

I found this several days ago when I try to debug a "fetch" of cursor.
And I have sent a mail to this list, but no one reply...
Maybe this is a very simple problem, please help me, thanks a lot...

Here is the example:
    create table t (a int);
    insert into t values (1),(3),(5),(7),(9);
    insert into t select a+1 from t;
    begin;
    declare c cursor for select * from t order by a;
    fetch 3 in c;
    fetch 3 in c;
    fetch 3 in c;

In 'PortalRun', a fetch stmt will be treated with PORTAL_UTIL_SELECT,
and then a tuplestore will be created in 'FillPortalStore' in the
fetch stmt's portal.

In 'FillPortalStore', all result will be store at that tuplestore,
Then, go back to 'PortalRun'; next, 'PortalRunSelect' will send this
results to client...

My problem is: why do we need create that tuplestore as an
middle storeage? why do not we just send these result to clent
at the first time?

Good question. I wouldn't expect it to matter very much for a
three-row fetch, but maybe it does for larger ones? What is your
motivation for investigating this?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

pgf00a@gmail.com

about 14 years ago

In reply to: Robert Haas (#2)

Re: why do we need create tuplestore for each fetch?

Thanks for you reply.

I found query without cursor is faster then query with server-side cursor
and several fetches.
But I have a large result set to retrieve from database. I have to choose
server-side cursor
to avoid out-of-memory problem.

When I try to debug the cursor and fetch, I found this unexpected behavior.
I think maybe
the tuplestore slows the cursor. (maybe I should do some profile later)

I want to change the code, but I am afraid there are important reasons for
the tuplestore.
Therefore, I post it to this list for help: why create tuplestore for each
fetch?

p.s. a large fetch may turn tuplestore to use buffer file, and slow the
performance very much.

On Mon, Dec 19, 2011 at 9:06 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Dec 15, 2011 at 8:30 AM, 高增琦 <pgf00a@gmail.com> wrote:

I found this several days ago when I try to debug a "fetch" of cursor.
And I have sent a mail to this list, but no one reply...
Maybe this is a very simple problem, please help me, thanks a lot...

Here is the example:
create table t (a int);
insert into t values (1),(3),(5),(7),(9);
insert into t select a+1 from t;
begin;
declare c cursor for select * from t order by a;
fetch 3 in c;
fetch 3 in c;
fetch 3 in c;

In 'PortalRun', a fetch stmt will be treated with PORTAL_UTIL_SELECT,
and then a tuplestore will be created in 'FillPortalStore' in the
fetch stmt's portal.

In 'FillPortalStore', all result will be store at that tuplestore,
Then, go back to 'PortalRun'; next, 'PortalRunSelect' will send this
results to client...

My problem is: why do we need create that tuplestore as an
middle storeage? why do not we just send these result to clent
at the first time?

Good question. I wouldn't expect it to matter very much for a
three-row fetch, but maybe it does for larger ones? What is your
motivation for investigating this?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
GaoZengqi
pgf00a@gmail.com
zengqigao@gmail.com

tgl@sss.pgh.pa.us

about 14 years ago

In reply to: 高增琦 (#3)

Re: why do we need create tuplestore for each fetch?

=?UTF-8?B?6auY5aKe55Cm?= <pgf00a@gmail.com> writes:

Here is the example:
create table t (a int);
insert into t values (1),(3),(5),(7),(9);
insert into t select a+1 from t;
begin;
declare c cursor for select * from t order by a;
fetch 3 in c;
fetch 3 in c;
fetch 3 in c;

In 'PortalRun', a fetch stmt will be treated with PORTAL_UTIL_SELECT,
and then a tuplestore will be created in 'FillPortalStore' in the
fetch stmt's portal.

How are you trying to do the fetches, PQexec("fetch 3 in c") ?
That is an inherently inefficient way to do things, and trying to shave
a few cycles off the intermediate tuplestore isn't going to fix that.
The general overhead of parsing a new SQL command is probably going to
swamp the costs of a tuplestore, especially if it's too small to spill
to disk (and if it isn't, you really do need the tuplestore mechanism,
slow or not).

If you want to get a speed improvement there would probably be a lot
more bang for the buck in extending libpq to support protocol-level
portal access. It does already have PQdescribePortal, but for some
reason not anything for "fetch N rows from portal so-and-so". Not
sure whether it's worth providing explicit portal open/close commands
separate from PQexec'ing DECLARE CURSOR and CLOSE, but maybe at the
margins those steps would be worth improving too.

regards, tom lane