often PREPARE can generate high load (and sometimes minutes-long unavailability)

Started by Pavel Stehule, almost 12 years ago. 11 messages.
#1 Pavel Stehule
pavel.stehule@gmail.com

Hello

I got an example of code that generates relatively high load with a minimal
number of connections.

This code is more or less bad (it repeatedly re-prepares the same statement),
but some applications use prepared statements as protection against SQL
injection, and they can hit the same pattern.

Pseudocode (I can send a test case privately):

Script a:

-- A, B are in RAM
for i in 1 .. N loop
    insert into A values ();
    for j in 1 .. M loop
        insert into B values ();
    end loop;
end loop;

Script b:

-- the query is extremely fast; it usually returns 0 or 1 rows
40 threads execute:
while true loop
    pr = PREPARE SELECT * FROM A LEFT JOIN B ON ..
    EXECUTE pr(...)
    sleep(10 ms)
end loop
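
To make the pseudocode concrete, here is a minimal sketch of both scripts
as I would run them with psql and pgbench. The table and column names
(a.id, b.a_id), the loop counts, and the parameter value are hypothetical
stand-ins, because the real test case is private:

-- script a (psql): bulk inserts with hypothetical loop counts
DO $$
BEGIN
    FOR i IN 1 .. 1000 LOOP
        INSERT INTO a DEFAULT VALUES;
        FOR j IN 1 .. 100 LOOP
            INSERT INTO b DEFAULT VALUES;
        END LOOP;
    END LOOP;
END
$$;

-- script b (pgbench): save as scriptb.sql and run with something like
--     pgbench -n -f scriptb.sql -c 40 -T 300
-- so that every iteration re-prepares, and therefore re-plans, the query
PREPARE pr AS SELECT * FROM a LEFT JOIN b ON a.id = b.a_id WHERE a.id = $1;
EXECUTE pr(1);
DEALLOCATE pr;
\sleep 10 ms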

Running both scripts together can produce high load with a minimal number of
executed queries.

354246.00 93.0%  s_lock                    /usr/lib/postgresql/9.2/bin/postgres
 10503.00  2.8%  LWLockRelease             /usr/lib/postgresql/9.2/bin/postgres
  8802.00  2.3%  LWLockAcquire             /usr/lib/postgresql/9.2/bin/postgres
   828.00  0.2%  _raw_spin_lock            [kernel.kallsyms]
   559.00  0.1%  _raw_spin_lock_irqsave    [kernel.kallsyms]
   340.00  0.1%  switch_mm                 [kernel.kallsyms]
   305.00  0.1%  poll_schedule_timeout     [kernel.kallsyms]
   274.00  0.1%  native_write_msr_safe     [kernel.kallsyms]
   257.00  0.1%  _raw_spin_lock_irq        [kernel.kallsyms]
   238.00  0.1%  apic_timer_interrupt      [kernel.kallsyms]
   236.00  0.1%  __schedule                [kernel.kallsyms]
   213.00  0.1%  HeapTupleSatisfiesMVCC

With systemtap I got this list of lwlock waits:

light weight locks
lockname         mode       count  avg (time)
DynamicLocks     Exclusive   2804        1025
DynamicLocks     Shared       106         130
ProcArrayLock    Exclusive     63      963551
ProcArrayLock    Shared        50        4160
LockMgrLocks     Exclusive     18         159
IndividualLock   Exclusive      2           7

There are relatively few, but very long, ProcArrayLock lwlock waits.

This issue is very pathological on fast computers with more than 8 CPUs. It
was detected after a migration from 8.4 to 9.2 (but tested with the same
result on 9.0), and I see it on today's updated 9.4 devel as well.

When I moved the PREPARE out of the loop, the described issue was gone. But
when I used EXECUTE IMMEDIATELY, the issue was back. So it looks like it is
related to the planner, ...
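
To illustrate the variant that behaves well (the same hypothetical names
as in the sketch above): prepare once per connection, and run only the
EXECUTE inside the loop:

-- once per connection:
PREPARE pr AS SELECT * FROM a LEFT JOIN b ON a.id = b.a_id WHERE a.id = $1;

-- inside the loop, only:
EXECUTE pr(1);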

Regards

Pavel

#2 Jeff Janes
jeff.janes@gmail.com
In reply to: Pavel Stehule (#1)
Re: often PREPARE can generate high load (and sometimes minutes-long unavailability)

On Sun, Feb 23, 2014 at 11:04 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Hello

I got an example of code that generates relatively high load with a minimal
number of connections.

This code is more or less bad (it repeatedly re-prepares the same statement),
but some applications use prepared statements as protection against SQL
injection, and they can hit the same pattern.

Pseudocode (I can send a test case privately):

Script a:

-- A, B are in RAM
for i in 1 .. N loop
    insert into A values ();
    for j in 1 .. M loop
        insert into B values ();
    end loop;
end loop;

Script b:

-- the query is extremely fast; it usually returns 0 or 1 rows
40 threads execute:
while true loop
    pr = PREPARE SELECT * FROM A LEFT JOIN B ON ..
    EXECUTE pr(...)
    sleep(10 ms)
end loop

Digging through uncommitted tuples at the top or bottom of an index (which
happens during planning, especially when planning merge joins) is very
contentious. Tom proposed changing the snapshot used for planning to a dirty
snapshot, but the proposal didn't go anywhere because no one did the testing
to confirm that it solved the problem in the field. Perhaps you can help
do that.
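
As a minimal sketch of the mechanism (with hypothetical names, not your
exact schema): one session leaves a pile of uncommitted rows at the end of
an index, and planning in a second session then has to step over every one
of them, checking each tuple's inserting transaction:

-- session 1: leave many uncommitted tuples at the top of an index on b.a_id
BEGIN;
INSERT INTO b (a_id) SELECT g FROM generate_series(1000001, 1100000) g;
-- transaction deliberately left open

-- session 2: merge-join costing probes the actual extremes of b.a_id
-- (get_actual_variable_range) and has to inspect all those in-progress entries
EXPLAIN SELECT * FROM a LEFT JOIN b ON a.id = b.a_id;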

See:

"[PERFORM] Performance bug in prepared statement binding in 9.2?" and
several related threads.

Cheers,

Jeff

#3 Pavel Stehule
pavel.stehule@gmail.com
In reply to: Jeff Janes (#2)
Re: often PREPARE can generate high load (and sometimes minutes-long unavailability)

2014-02-23 20:35 GMT+01:00 Jeff Janes <jeff.janes@gmail.com>:

On Sun, Feb 23, 2014 at 11:04 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Hello

I got an example of code that generates relatively high load with a minimal
number of connections.

This code is more or less bad (it repeatedly re-prepares the same statement),
but some applications use prepared statements as protection against SQL
injection, and they can hit the same pattern.

Pseudocode (I can send a test case privately):

Script a:

-- A, B are in RAM
for i in 1 .. N loop
    insert into A values ();
    for j in 1 .. M loop
        insert into B values ();
    end loop;
end loop;

Script b:

-- the query is extremely fast; it usually returns 0 or 1 rows
40 threads execute:
while true loop
    pr = PREPARE SELECT * FROM A LEFT JOIN B ON ..
    EXECUTE pr(...)
    sleep(10 ms)
end loop

Digging through uncommitted tuples at the top or bottom of an index (which
happens during planning, especially when planning merge joins) is very
contentious. Tom proposed changing the snapshot used for planning to a dirty
snapshot, but the proposal didn't go anywhere because no one did the testing
to confirm that it solved the problem in the field. Perhaps you can help
do that.

I am able to test some patches. Thank you for the info.

Regards

Pavel


See:

"[PERFORM] Performance bug in prepared statement binding in 9.2?" and
several related threads.

Cheers,

Jeff

#4 Pavel Stehule
pavel.stehule@gmail.com
In reply to: Jeff Janes (#2)
Re: often PREPARE can generate high load (and sometimes minutes-long unavailability)

2014-02-23 20:35 GMT+01:00 Jeff Janes <jeff.janes@gmail.com>:

On Sun, Feb 23, 2014 at 11:04 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Hello

I got an example of code that generates relatively high load with a minimal
number of connections.

This code is more or less bad (it repeatedly re-prepares the same statement),
but some applications use prepared statements as protection against SQL
injection, and they can hit the same pattern.

Pseudocode (I can send a test case privately):

Script a:

-- A, B are in RAM
for i in 1 .. N loop
    insert into A values ();
    for j in 1 .. M loop
        insert into B values ();
    end loop;
end loop;

Script b:

-- the query is extremely fast; it usually returns 0 or 1 rows
40 threads execute:
while true loop
    pr = PREPARE SELECT * FROM A LEFT JOIN B ON ..
    EXECUTE pr(...)
    sleep(10 ms)
end loop

Digging through uncommitted tuples at the top or bottom of an index (which
happens during planning, especially when planning merge joins) is very
contentious. Tom proposed changing the snapshot used for planning to a dirty
snapshot, but the proposal didn't go anywhere because no one did the testing
to confirm that it solved the problem in the field. Perhaps you can help
do that.

See:

"[PERFORM] Performance bug in prepared statement binding in 9.2?" and
several related threads.

Yes, it is very similar. Only it is a little bit worse: on 16 CPUs it can
produce 20-60 minutes of unavailability.

regards

Pavel


Cheers,

Jeff

#5 Andres Freund
andres@2ndquadrant.com
In reply to: Pavel Stehule (#1)
Re: often PREPARE can generate high load (and sometimes minutes-long unavailability)

Hi,

On 2014-02-23 20:04:39 +0100, Pavel Stehule wrote:

There are relatively few, but very long, ProcArrayLock lwlock waits.

This issue is very pathological on fast computers with more than 8 CPUs. It
was detected after a migration from 8.4 to 9.2 (but tested with the same
result on 9.0), and I see it on today's updated 9.4 devel as well.

When I moved the PREPARE out of the loop, the described issue was gone. But
when I used EXECUTE IMMEDIATELY, the issue was back. So it looks like it is
related to the planner, ...

In addition to the issue Jeff mentioned, I'd suggest trying the same
workload with repeatable read. That can do *wonders* because of the
reduced number of snapshots.
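
A minimal sketch of that, assuming the same hypothetical schema as
upthread; one snapshot then serves the whole transaction instead of one
per statement:

BEGIN ISOLATION LEVEL REPEATABLE READ;
PREPARE pr AS SELECT * FROM a LEFT JOIN b ON a.id = b.a_id WHERE a.id = $1;
EXECUTE pr(1);
DEALLOCATE pr;
COMMIT;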

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#6 Pavel Stehule
pavel.stehule@gmail.com
In reply to: Andres Freund (#5)
Re: often PREPARE can generate high load (and sometimes minutes-long unavailability)

2014-02-23 21:32 GMT+01:00 Andres Freund <andres@2ndquadrant.com>:

Hi,

On 2014-02-23 20:04:39 +0100, Pavel Stehule wrote:

There are relatively few, but very long, ProcArrayLock lwlock waits.

This issue is very pathological on fast computers with more than 8 CPUs. It
was detected after a migration from 8.4 to 9.2 (but tested with the same
result on 9.0), and I see it on today's updated 9.4 devel as well.

When I moved the PREPARE out of the loop, the described issue was gone. But
when I used EXECUTE IMMEDIATELY, the issue was back. So it looks like it is
related to the planner, ...

In addition to the issue Jeff mentioned, I'd suggest trying the same
workload with repeatable read. That can do *wonders* because of the
reduced number of snapshots.

I tested it, and it doesn't help.

Is there some patch related to this issue that I can test?


Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#7 Andres Freund
andres@2ndquadrant.com
In reply to: Pavel Stehule (#1)
Re: often PREPARE can generate high load (and sometimes minutes-long unavailability)

On 2014-02-23 20:04:39 +0100, Pavel Stehule wrote:

354246.00 93.0%  s_lock                    /usr/lib/postgresql/9.2/bin/postgres
 10503.00  2.8%  LWLockRelease             /usr/lib/postgresql/9.2/bin/postgres
  8802.00  2.3%  LWLockAcquire             /usr/lib/postgresql/9.2/bin/postgres
   828.00  0.2%  _raw_spin_lock            [kernel.kallsyms]
   559.00  0.1%  _raw_spin_lock_irqsave    [kernel.kallsyms]
   340.00  0.1%  switch_mm                 [kernel.kallsyms]
   305.00  0.1%  poll_schedule_timeout     [kernel.kallsyms]
   274.00  0.1%  native_write_msr_safe     [kernel.kallsyms]
   257.00  0.1%  _raw_spin_lock_irq        [kernel.kallsyms]
   238.00  0.1%  apic_timer_interrupt      [kernel.kallsyms]
   236.00  0.1%  __schedule                [kernel.kallsyms]
   213.00  0.1%  HeapTupleSatisfiesMVCC

With systemtap I got this list of lwlock waits:

light weight locks
lockname         mode       count  avg (time)
DynamicLocks     Exclusive   2804        1025
DynamicLocks     Shared       106         130
ProcArrayLock    Exclusive     63      963551
ProcArrayLock    Shared        50        4160
LockMgrLocks     Exclusive     18         159
IndividualLock   Exclusive      2           7

There are relatively few, but very long, ProcArrayLock lwlock waits.

It's odd that there are so many exclusive ProcArrayLock
acquisitions... A hierarchical profile would be interesting. I'd
suggest compiling postgres with -fno-omit-frame-pointer and doing a
profile with perf.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#8 Pavel Stehule
pavel.stehule@gmail.com
In reply to: Andres Freund (#7)
Re: often PREPARE can generate high load (and sometimes minutes-long unavailability)

2014-02-24 16:09 GMT+01:00 Andres Freund <andres@2ndquadrant.com>:

On 2014-02-23 20:04:39 +0100, Pavel Stehule wrote:

354246.00 93.0%  s_lock                    /usr/lib/postgresql/9.2/bin/postgres
 10503.00  2.8%  LWLockRelease             /usr/lib/postgresql/9.2/bin/postgres
  8802.00  2.3%  LWLockAcquire             /usr/lib/postgresql/9.2/bin/postgres
   828.00  0.2%  _raw_spin_lock            [kernel.kallsyms]
   559.00  0.1%  _raw_spin_lock_irqsave    [kernel.kallsyms]
   340.00  0.1%  switch_mm                 [kernel.kallsyms]
   305.00  0.1%  poll_schedule_timeout     [kernel.kallsyms]
   274.00  0.1%  native_write_msr_safe     [kernel.kallsyms]
   257.00  0.1%  _raw_spin_lock_irq        [kernel.kallsyms]
   238.00  0.1%  apic_timer_interrupt      [kernel.kallsyms]
   236.00  0.1%  __schedule                [kernel.kallsyms]
   213.00  0.1%  HeapTupleSatisfiesMVCC

With systemtap I got this list of lwlock waits:

light weight locks
lockname         mode       count  avg (time)
DynamicLocks     Exclusive   2804        1025
DynamicLocks     Shared       106         130
ProcArrayLock    Exclusive     63      963551
ProcArrayLock    Shared        50        4160
LockMgrLocks     Exclusive     18         159
IndividualLock   Exclusive      2           7

There are relatively few, but very long, ProcArrayLock lwlock waits.

It's odd that there are so many exclusive ProcArrayLock
acquisitions... A hierarchical profile would be interesting. I'd
suggest compiling postgres with -fno-omit-frame-pointer and doing a
profile with perf.

I have no experience with perf, so maybe this is not what you want:

-  19.59%       postmaster  postgres
   - s_lock
      - 55.06% LWLockAcquire
         + 99.84% TransactionIdIsInProgress
      - 44.63% LWLockRelease
         + 99.91% TransactionIdIsInProgress
-  13.84%       postmaster  postgres
   - tas
      - 97.97% s_lock
         + 55.01% LWLockAcquire
         + 44.99% LWLockRelease
      - 1.10% LWLockAcquire
         + 99.89% TransactionIdIsInProgress
      - 0.93% LWLockRelease
         + 99.93% TransactionIdIsInProgress

-  19.59%  postmaster  postgres
   - s_lock
      - 55.06% LWLockAcquire
         - 99.84% TransactionIdIsInProgress
              HeapTupleSatisfiesMVCC
              heap_hot_search_buffer
              index_fetch_heap
              index_getnext
              get_actual_variable_range
              ineq_histogram_selectivity
              scalarineqsel
              mergejoinscansel
              cached_scansel
              initial_cost_mergejoin
              try_mergejoin_path
              sort_inner_and_outer
              add_paths_to_joinrel
              make_join_rel
              make_rels_by_clause_joins
              join_search_one_level
              standard_join_search
              make_rel_from_joinlist
              make_one_rel
              query_planner
              grouping_planner
              subquery_planner
              standard_planner
              planner
              pg_plan_query
              pg_plan_queries
              BuildCachedPlan
              GetCachedPlan
              exec_bind_message
              PostgresMain
              ExitPostmaster
              BackendStartup
              ServerLoop
              PostmasterMain
              startup_hacks

regards

Pavel

Greetings,


Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#9 Jeff Janes
jeff.janes@gmail.com
In reply to: Pavel Stehule (#6)
Re: often PREPARE can generate high load (and sometimes minutes-long unavailability)

On Mon, Feb 24, 2014 at 7:02 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

2014-02-23 21:32 GMT+01:00 Andres Freund <andres@2ndquadrant.com>:

Hi,

On 2014-02-23 20:04:39 +0100, Pavel Stehule wrote:

There are relatively few, but very long, ProcArrayLock lwlock waits.

This issue is very pathological on fast computers with more than 8 CPUs. It
was detected after a migration from 8.4 to 9.2 (but tested with the same
result on 9.0), and I see it on today's updated 9.4 devel as well.

When I moved the PREPARE out of the loop, the described issue was gone. But
when I used EXECUTE IMMEDIATELY, the issue was back. So it looks like it is
related to the planner, ...

In addition to the issue Jeff mentioned, I'd suggest trying the same
workload with repeatable read. That can do *wonders* because of the
reduced number of snapshots.

I tested it, and it doesn't help.

Is there some patch related to this issue that I can test?

This is the one that I was referring to:

/messages/by-id/11927.1384199294@sss.pgh.pa.us

Cheers,

Jeff

#10 Pavel Stehule
pavel.stehule@gmail.com
In reply to: Jeff Janes (#9)
Re: often PREPARE can generate high load (and sometimes minutes-long unavailability)

Hello

2014-02-24 21:31 GMT+01:00 Jeff Janes <jeff.janes@gmail.com>:

On Mon, Feb 24, 2014 at 7:02 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

2014-02-23 21:32 GMT+01:00 Andres Freund <andres@2ndquadrant.com>:

Hi,

On 2014-02-23 20:04:39 +0100, Pavel Stehule wrote:

There are relatively few, but very long, ProcArrayLock lwlock waits.

This issue is very pathological on fast computers with more than 8 CPUs. It
was detected after a migration from 8.4 to 9.2 (but tested with the same
result on 9.0), and I see it on today's updated 9.4 devel as well.

When I moved the PREPARE out of the loop, the described issue was gone. But
when I used EXECUTE IMMEDIATELY, the issue was back. So it looks like it is
related to the planner, ...

In addition to the issue Jeff mentioned, I'd suggest trying the same
workload with repeatable read. That can do *wonders* because of the
reduced number of snapshots.

I tested it, and it doesn't help.

Is there some patch related to this issue that I can test?

This is the one that I was referring to:

/messages/by-id/11927.1384199294@sss.pgh.pa.us

I tested this patch with great success. The waits on ProcArrayLock are gone,
and throughput is as expected.

For the described use case this is seriously interesting.

Regards

Pavel

light weight locks
lockname         mode       count    avg
DynamicLocks     Exclusive   8055   5003
DynamicLocks     Shared      1666     50
LockMgrLocks     Exclusive    129     36
IndividualLock   Exclusive     34     48
IndividualLock   Shared        21      7
BufFreelistLock  Exclusive     12      8
WALWriteLock     Exclusive      1  38194
ProcArrayLock    Shared         1      8


Cheers,

Jeff

#11 Pavel Stehule
pavel.stehule@gmail.com
In reply to: Pavel Stehule (#10)
1 attachment(s)
Re: often PREPARE can generate high load (and sometimes minutes-long unavailability)

2014-02-25 11:29 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:

Hello

2014-02-24 21:31 GMT+01:00 Jeff Janes <jeff.janes@gmail.com>:

On Mon, Feb 24, 2014 at 7:02 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

2014-02-23 21:32 GMT+01:00 Andres Freund <andres@2ndquadrant.com>:

Hi,

On 2014-02-23 20:04:39 +0100, Pavel Stehule wrote:

There are relatively few, but very long, ProcArrayLock lwlock waits.

This issue is very pathological on fast computers with more than 8 CPUs. It
was detected after a migration from 8.4 to 9.2 (but tested with the same
result on 9.0), and I see it on today's updated 9.4 devel as well.

When I moved the PREPARE out of the loop, the described issue was gone. But
when I used EXECUTE IMMEDIATELY, the issue was back. So it looks like it is
related to the planner, ...

In addition to the issue Jeff mentioned, I'd suggest trying the same
workload with repeatable read. That can do *wonders* because of the
reduced number of snapshots.

I tested it, and it doesn't help.

Is there some patch related to this issue that I can test?

This is the one that I was referring to:

/messages/by-id/11927.1384199294@sss.pgh.pa.us

I tested this patch with great success. The waits on ProcArrayLock are gone,
and throughput is as expected.

For the described use case this is seriously interesting.

Here is an update of the patch for 9.4.

Regards

Pavel


Regards

Pavel

light weight locks
lockname         mode       count    avg
DynamicLocks     Exclusive   8055   5003
DynamicLocks     Shared      1666     50
LockMgrLocks     Exclusive    129     36
IndividualLock   Exclusive     34     48
IndividualLock   Shared        21      7
BufFreelistLock  Exclusive     12      8
WALWriteLock     Exclusive      1  38194
ProcArrayLock    Shared         1      8

Cheers,

Jeff

Attachments:

sel_func.patch (text/x-patch; charset=US-ASCII)
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
new file mode 100644
index d525ca4..261473d
*** a/src/backend/utils/adt/selfuncs.c
--- b/src/backend/utils/adt/selfuncs.c
*************** get_actual_variable_range(PlannerInfo *r
*** 4958,4963 ****
--- 4958,4964 ----
  			HeapTuple	tup;
  			Datum		values[INDEX_MAX_KEYS];
  			bool		isnull[INDEX_MAX_KEYS];
+ 			SnapshotData	SnapshotDirty;
  
  			estate = CreateExecutorState();
  			econtext = GetPerTupleExprContext(estate);
*************** get_actual_variable_range(PlannerInfo *r
*** 4980,4985 ****
--- 4981,4987 ----
  			slot = MakeSingleTupleTableSlot(RelationGetDescr(heapRel));
  			econtext->ecxt_scantuple = slot;
  			get_typlenbyval(vardata->atttype, &typLen, &typByVal);
+ 			InitDirtySnapshot(SnapshotDirty);
  
  			/* set up an IS NOT NULL scan key so that we ignore nulls */
  			ScanKeyEntryInitialize(&scankeys[0],
*************** get_actual_variable_range(PlannerInfo *r
*** 4997,5003 ****
  			if (min)
  			{
  				index_scan = index_beginscan(heapRel, indexRel,
! 											 GetActiveSnapshot(), 1, 0);
  				index_rescan(index_scan, scankeys, 1, NULL, 0);
  
  				/* Fetch first tuple in sortop's direction */
--- 4999,5005 ----
  			if (min)
  			{
  				index_scan = index_beginscan(heapRel, indexRel,
! 											 &SnapshotDirty, 1, 0);
  				index_rescan(index_scan, scankeys, 1, NULL, 0);
  
  				/* Fetch first tuple in sortop's direction */
*************** get_actual_variable_range(PlannerInfo *r
*** 5029,5035 ****
  			if (max && have_data)
  			{
  				index_scan = index_beginscan(heapRel, indexRel,
! 											 GetActiveSnapshot(), 1, 0);
  				index_rescan(index_scan, scankeys, 1, NULL, 0);
  
  				/* Fetch first tuple in reverse direction */
--- 5031,5037 ----
  			if (max && have_data)
  			{
  				index_scan = index_beginscan(heapRel, indexRel,
! 											 &SnapshotDirty, 1, 0);
  				index_rescan(index_scan, scankeys, 1, NULL, 0);
  
  				/* Fetch first tuple in reverse direction */