weird interaction between asynchronous queries and pg_sleep

Started by Merlin Moncurealmost 5 years ago2 messages
#1Merlin Moncure
mmoncure@gmail.com

Consider the following snippet

create table data as select generate_series(1,1000000) s;

do $d$
begin
PERFORM * FROM dblink_connect('test','');

PERFORM * from dblink_send_query('test', 'SELECT * FROM data');

LOOP
if dblink_is_busy('test') = 0
THEN
PERFORM * FROM dblink_get_result('test') AS R(V int);
PERFORM * FROM dblink_get_result('test') AS R(V int);
RETURN;
END IF;

PERFORM pg_sleep(.001);
END LOOP;

PERFORM * FROM dblink_disconnect('test');
END;
$d$;

What's interesting here is that, when I vary the sleep parameter, I get:
0: .4 seconds (per top, this is busywait), same as running synchronous.
0.000001: 1.4 seconds
0.001: 2.4 seconds
0.01: 10.6 seconds
0.1: does not terminate

This effect is only noticeable when the remote query is returning
volumes of data. My question is, is there any way to sleep loop
client side without giving up 3x performance penalty? Why is that
that when more local sleep queries are executed, performance improves?

merlin

#2Merlin Moncure
mmoncure@gmail.com
In reply to: Merlin Moncure (#1)
Re: weird interaction between asynchronous queries and pg_sleep

On Thu, Apr 8, 2021 at 1:05 PM Merlin Moncure <mmoncure@gmail.com> wrote:

This effect is only noticeable when the remote query is returning
volumes of data. My question is, is there any way to sleep loop
client side without giving up 3x performance penalty? Why is that
that when more local sleep queries are executed, performance improves?

Looking at this more, it looks like that when sleeping with pg_sleep,
libpq does not receive the data. I think for this type of pattern to
work correctly, dblink would need a custom sleep function wrapping
poll (or epoll) that consumes input on the socket when signalled read
ready.

merlin