Inconsistent performance with LIKE and bind variable on long-lived connection

Started by Steven Grimmalmost 9 years ago12 messagesgeneral
Jump to latest
#1Steven Grimm
sgrimm@thesegovia.com

PostgreSQL 9.6.3 on OS X Sierra, JDBC driver version 42.1.1.

I noticed that one of my queries was slowing down after a few invocations.
Narrowed it down to an issue with bind variables and LIKE conditions. Very
consistently, on a given connection, the first 9 times a SELECT containing
a LIKE whose value is a bind variable is executed, the query returns
results quickly. But starting with the tenth execution, it slows down by a
factor of 20 and stays slow until the connection is closed.

The slowdown doesn't happen with bind variables on equality conditions, and
it doesn't happen if a constant is used in the LIKE condition rather than a
bind variable.

The below Java code reproduces the problem 100% of the time on my system.
It populates a test table with 10000 rows if the table doesn't already
exist, then runs a simple two-condition SELECT with different combinations
of bind variables and constants. Each query is run 20 times and its
execution times in milliseconds are displayed.

On my system I get output like this:

Two bind variables 57 22 8 5 9 9 10 13 8 144 151 236 198
204 197 197 152 126 108 102
Equality bind variable 5 5 5 5 5 5 7 5 4 6 5 4
5 5 4 4 5 5 5 5
LIKE bind variable 5 5 5 5 9 5 5 12 6 111 106 107 108
121 110 101 107 108 113 108
No bind variables 5 5 4 5 5 5 4 4 5 5 4 5
5 4 6 5 4 5 7 4

In other words, when a bind variable is used in the LIKE condition, the
query suddenly goes from taking 5-15 milliseconds to taking 100+
milliseconds. When the query is run in psql, it takes 5-10 milliseconds.

I'm not sure if this is a problem with the JDBC driver or the server, but
it's certainly unexpected!

---

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class BindTest {
static String url = "jdbc:postgresql:test";

private static void init() {
try (Connection conn = DriverManager.getConnection(url)) {
try (PreparedStatement stmt = conn.prepareStatement(
"CREATE TABLE test (col1 TEXT, col2 TEXT, PRIMARY KEY (col1,
col2))")) {
stmt.execute();
}

try (PreparedStatement stmt = conn.prepareStatement("INSERT INTO test
VALUES (?,?)")) {
stmt.setString(1, "xyz");

for (int i = 0; i< 10000; i++) {
stmt.setInt(2, i);
stmt.addBatch();
}

stmt.executeBatch();
}

try (PreparedStatement stmt = conn.prepareStatement("VACUUM
ANALYZE")) {
stmt.execute();
}
} catch (SQLException e) {
// Table exists, so don't populate it
}
}

private static void logTime(String name, PreparedStatement stmt) throws
SQLException {
StringBuilder out = new StringBuilder(String.format("%-22s", name));

for (int i = 0; i< 20; i++) {
long startTime = System.currentTimeMillis();
ResultSet rs = stmt.executeQuery();
while (rs.next()) {
rs.getString(1);
}
long endTime = System.currentTimeMillis();
rs.close();

out.append(String.format(" %3d", endTime - startTime));
}

stmt.close();

System.out.println(out);
}

public static void main(String[] args) throws Exception {
init();

try (Connection conn = DriverManager.getConnection(url)) {
PreparedStatement stmt = conn.prepareStatement(
"SELECT col2 FROM test WHERE col1 = ? AND col2 LIKE ? ORDER BY
col2");
stmt.setString(1, "xyz");
stmt.setString(2, "%");
logTime("Two bind variables", stmt);

stmt = conn.prepareStatement(
"SELECT col2 FROM test WHERE col1 = ? AND col2 LIKE '%' ORDER BY
col2");
stmt.setString(1, "xyz");
logTime("Equality bind variable", stmt);

stmt = conn.prepareStatement(
"SELECT col2 FROM test WHERE col1 = 'xyz' AND col2 LIKE ? ORDER
BY col2");
stmt.setString(1, "%");
logTime("LIKE bind variable", stmt);

stmt = conn.prepareStatement(
"SELECT col2 FROM test WHERE col1 = 'xyz' AND col2 LIKE '%' ORDER
BY col2");
logTime("No bind variables", stmt);
}
}
}

#2Alban Hertroys
haramrae@gmail.com
In reply to: Steven Grimm (#1)
Re: Inconsistent performance with LIKE and bind variable on long-lived connection

On 10 Jun 2017, at 5:37, Steven Grimm <sgrimm@thesegovia.com> wrote:

[…]

I notice that you're declaring your ResultSet variable inside the loop, which means that you create and destroy it frequently. I've been told that this is a pattern that the GC has trouble keeping up with (although that was around the Java 5 era), so you might be seeing the effects of memory churn in your client instead of in the database.

I modified your function to not do that anymore, does that make a difference?

Note; the long variables are scalar instead of objects. I don't think they need the same treatment, but it can't hurt.

private static void logTime(String name, PreparedStatement stmt) throws SQLException {
StringBuilder out = new StringBuilder(String.format("%-22s", name));
ResultSet rs;
long startTime, endTime;

for (int i = 0; i< 20; i++) {
startTime = System.currentTimeMillis();
rs = stmt.executeQuery();
while (rs.next()) {
rs.getString(1);
}
endTime = System.currentTimeMillis();
rs.close();

out.append(String.format(" %3d", endTime - startTime));
}

stmt.close();

System.out.println(out);
}

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3Steven Grimm
sgrimm@thesegovia.com
In reply to: Alban Hertroys (#2)
Re: Inconsistent performance with LIKE and bind variable on long-lived connection

On Sat, Jun 10, 2017 at 1:46 AM, Alban Hertroys <haramrae@gmail.com> wrote:

I notice that you're declaring your ResultSet variable inside the loop,

which means that you create and destroy it frequently. I've been told that
this is a pattern that the GC has trouble keeping up with (although that
was around the Java 5 era), so you might be seeing the effects of memory
churn in your client instead of in the database.

Sadly, no, it doesn't help. I'd be pretty surprised if that was it, though,
given that it consistently happens with the bind variable and never happens
without; surely the result set's GC behavior would be the same in either
case?

For grins, I tried running with the -verbosegc option to log GC activity:

Two bind variables 48 16 13 8 6 5 6 7 5 115 110 109 132
108 110 113 109 113 108 108
Equality bind variable 5 6 6 5 6 6 6 6 7 6 8 8 5
6 4 5 5 5 5 5
[GC (Allocation Failure) 65536K->2293K(251392K), 0.0035075 secs]
LIKE bind variable 5 5 6 5 6 5 5 6 6 110 107 112 116
118 107 112 115 105 104 104
No bind variables 5 5 4 5 5 4 5 5 4 6 5 6 5
8 4 4 4 4 5 4

So there's only one GC run and it takes about 3 milliseconds. That result
is the same whether the ResultSet is declared inside or outside the loop.

-Steve

#4rob stone
floriparob@gmail.com
In reply to: Steven Grimm (#3)
Re: Inconsistent performance with LIKE and bind variable on long-lived connection

Hi

On Sat, 2017-06-10 at 09:17 -0700, Steven Grimm wrote:

On Sat, Jun 10, 2017 at 1:46 AM, Alban Hertroys <haramrae@gmail.com>
wrote:

I notice that you're declaring your ResultSet variable inside the

loop, which means that you create and destroy it frequently. I've
been told that this is a pattern that the GC has trouble keeping up
with (although that was around the Java 5 era), so you might be
seeing the effects of memory churn in your client instead of in the
database.

Sadly, no, it doesn't help. I'd be pretty surprised if that was it,
though, given that it consistently happens with the bind variable and
never happens without; surely the result set's GC behavior would be
the same in either case?

For grins, I tried running with the -verbosegc option to log GC
activity:

Two bind variables      48  16  13   8   6   5   6   7   5 115 110
109 132 108 110 113 109 113 108 108
Equality bind variable   5   6   6   5   6   6   6   6   7   6   8  
8   5   6   4   5   5   5   5   5
[GC (Allocation Failure)  65536K->2293K(251392K), 0.0035075 secs]
LIKE bind variable       5   5   6   5   6   5   5   6   6 110 107
112 116 118 107 112 115 105 104 104
No bind variables        5   5   4   5   5   4   5   5   4   6   5  
6   5   8   4   4   4   4   5   4

So there's only one GC run and it takes about 3 milliseconds. That
result is the same whether the ResultSet is declared inside or
outside the loop.

-Steve

I ran your test and here are my results:-

:~/softdev/java_apps$ java BindTest
Two bind
variables     402  15  13  14  13  13  15  14   9  39  46  45  41  39  
41  38  39  38  40  38
Equality bind
variable   9  15   9  10  12  13  10  13  10   9   9   9   7   8   7   
8   8  10   8   7
LIKE bind
variable       9   9   9   8  10   8   8   8   8  40  40  41  37  38  3
9  39  39  42  38  38
No bind
variables       13   9   9   9   9  11  10   8   9   8   9  10   8   9 
  9   8   9   9   8  12
:~/softdev/java_apps$ java BindTest
Two bind
variables      57  17  17  15  15  14  13  14  13  38  47  42  40  39  
39  41  37  39  38  37
Equality bind
variable   8   9   9  10  14  15  10  13  10  10  10   8   8   9  13   
8   9   8   9   8
LIKE bind
variable      10   9  11  10   9   8   9   8   8  39  38  43  43  39  3
9  37  38  38  43  39
No bind
variables        9   9  11   9   9   9   8   9   9   8   9  10   9  15 
 10   9   9   9   9  13

Using Java version 1.8.0_131

postgresql-9.4.1212.jar

PostgreSQL 9.6.3 on x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-
18) 6.3.0 20170516, 64-bit

You'll notice that the first time it ran the database was "cold" and it
took 402, whereas the second time it dropped to 57.

If I have time today I might alter it to used named variables and see
if that makes a difference.

Cheers,
Rob

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#5Steven Grimm
sgrimm@thesegovia.com
In reply to: rob stone (#4)
Re: Inconsistent performance with LIKE and bind variable on long-lived connection

The problem doesn't appear to be specific to the JDBC driver. Tried a quick
version of this in Python for grins with a database that was already
populated by the Java code (sadly, the psycopg2 library doesn't directly
support prepared statements):

import psycopg2
import time

conn = psycopg2.connect('dbname=test')
cur = conn.cursor()
cur.execute('PREPARE myplan AS '
'SELECT col2 FROM test WHERE col1 = $1 AND col2 LIKE $2 ORDER
BY col2')
times = []

for i in range(0, 20):
start_time = time.time()
cur.execute('EXECUTE myplan (%s, %s)', ('xyz', '%'))
cur.fetchall()
end_time = time.time()
times.append(int((end_time - start_time) * 1000))

print(times)

The output looks similar to the pattern in the Java test code, though it
gets slow after 5 iterations rather than 9:

[7, 6, 6, 5, 6, 102, 104, 111, 107, 109, 108, 114, 102, 107, 107, 134, 102,
106, 108, 103]

#6David G. Johnston
david.g.johnston@gmail.com
In reply to: Steven Grimm (#5)
Re: Inconsistent performance with LIKE and bind variable on long-lived connection

On Saturday, June 10, 2017, Steven Grimm <sgrimm@thesegovia.com> wrote:

The problem doesn't appear to be specific to the JDBC driver. Tried a
quick version of this in Python for grins with a database that was already
populated by the Java code (sadly, the psycopg2 library doesn't directly
support prepared statements):

https://www.postgresql.org/docs/current/static/sql-prepare.html

Specifically, the notes section.

David J.

#7Steven Grimm
sgrimm@thesegovia.com
In reply to: David G. Johnston (#6)
Re: Inconsistent performance with LIKE and bind variable on long-lived connection

On Sat, Jun 10, 2017 at 11:10 PM, David G. Johnston <
david.g.johnston@gmail.com> wrote:

https://www.postgresql.org/docs/current/static/sql-prepare.html

Specifically, the notes section.

That seems to fit the behavior. Thanks; I wasn't aware of that feature of
prepared statements. I changed the Python code to do EXPLAIN ANALYZE
EXECUTE rather than EXECUTE, and I do indeed see a change in plan after the
fifth iteration:

(first five iterations)

Index Only Scan using test_pkey on test (cost=0.29..476.29 rows=9999
width=4) (actual time=0.058..2.439 rows=10000 loops=1)
Index Cond: (col1 = 'xyz'::text)
Filter: (col2 ~~ '%'::text)
Heap Fetches: 0
Execution time: 2.957 ms

(iterations 6+)

Sort (cost=205.41..205.54 rows=50 width=4) (actual time=104.986..105.784
rows=10000 loops=1)
Sort Key: col2
Sort Method: quicksort Memory: 853kB
-> Seq Scan on test (cost=0.00..204.00 rows=50 width=4) (actual
time=0.014..2.100 rows=10000 loops=1)
Filter: ((col2 ~~ $2) AND (col1 = $1))
Execution time: 106.282 ms

So the problem here may be that the cost estimate for the generic execution
plan is way off in the case of a LIKE bind variable that matches a large
number of rows. I did make sure to have the Java code do a VACUUM ANALYZE
after doing its inserts, just to eliminate lack of statistics as a possible
explanation. Maybe the incorrect row count estimate (50 instead of 10000)
is causing it to think the quicksort will be a lot cheaper than it ends up
being with the actual rows?

Interesting that the Java version switches to the suboptimal plan after 9
iterations rather than 5. I don't know how to get the JDBC driver to do an
EXPLAIN on a prepared statement, so I can't confirm that the same thing is
happening there, but it seems plausible. Happy to try that if there's a way
to do it.

-Steve

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Steven Grimm (#7)
Re: Inconsistent performance with LIKE and bind variable on long-lived connection

Steven Grimm <sgrimm@thesegovia.com> writes:

That seems to fit the behavior. Thanks; I wasn't aware of that feature of
prepared statements. I changed the Python code to do EXPLAIN ANALYZE
EXECUTE rather than EXECUTE, and I do indeed see a change in plan after the
fifth iteration:

(first five iterations)

Index Only Scan using test_pkey on test (cost=0.29..476.29 rows=9999
width=4) (actual time=0.058..2.439 rows=10000 loops=1)
Index Cond: (col1 = 'xyz'::text)
Filter: (col2 ~~ '%'::text)
Heap Fetches: 0
Execution time: 2.957 ms

(iterations 6+)

Sort (cost=205.41..205.54 rows=50 width=4) (actual time=104.986..105.784
rows=10000 loops=1)
Sort Key: col2
Sort Method: quicksort Memory: 853kB
-> Seq Scan on test (cost=0.00..204.00 rows=50 width=4) (actual
time=0.014..2.100 rows=10000 loops=1)
Filter: ((col2 ~~ $2) AND (col1 = $1))
Execution time: 106.282 ms

So the problem here may be that the cost estimate for the generic execution
plan is way off in the case of a LIKE bind variable that matches a large
number of rows. I did make sure to have the Java code do a VACUUM ANALYZE
after doing its inserts, just to eliminate lack of statistics as a possible
explanation. Maybe the incorrect row count estimate (50 instead of 10000)
is causing it to think the quicksort will be a lot cheaper than it ends up
being with the actual rows?

Right. The plancache code is designed to switch to a generic plan if
that doesn't seem to save anything compared to a custom plan that's
built for the specific parameter value(s). But there's an implicit
assumption in that design that knowledge of the specific values can
only make the plan better not worse. In this case, knowing the values
causes the planner to know that the plan will scan much more of the
table than its generic estimate would guess, so it ends up with a
higher cost estimate. But that's not because the plan is "worse",
it's because the rowcount estimate is closer to reality.

I've seen this problem once or twice before, but I'm not entirely sure
what to do about it. Disentangling the two effects seems hard. One idea
is to reject the generic plan if it comes out cheaper than the average
custom plan, on the grounds that this must reflect less expensive (but
inaccurate) conclusions about the rowcount not a genuinely better plan.
But I have a feeling that that's too simplistic --- in particular I'm not
sure it does the right thing when the custom plan number is indeed an
average over several different custom plans. (That is, if you were
throwing a mix of selective and unselective patterns at this query,
so that some of the custom plans were indexscans and some were seqscans,
I'm not very sure what would happen with a rule like that.)

Interesting that the Java version switches to the suboptimal plan after 9
iterations rather than 5. I don't know how to get the JDBC driver to do an
EXPLAIN on a prepared statement, so I can't confirm that the same thing is
happening there, but it seems plausible. Happy to try that if there's a way
to do it.

Yeah, I've been watching this thread and trying to figure out how to
explain that part; I suspected a cause of this form but couldn't
make that theory match the 9-iterations observation. (I still can't.)

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#8)
Re: Inconsistent performance with LIKE and bind variable on long-lived connection

I wrote:

Right. The plancache code is designed to switch to a generic plan if
that doesn't seem to save anything compared to a custom plan that's
built for the specific parameter value(s).

Er, -ENOCAFFEINE. That's backwards of course. I think the rest of
what I wrote is okay.

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#10Steven Grimm
sgrimm@thesegovia.com
In reply to: Tom Lane (#8)
Re: Inconsistent performance with LIKE and bind variable on long-lived connection

On Sun, Jun 11, 2017 at 8:21 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Yeah, I've been watching this thread and trying to figure out how to
explain that part; I suspected a cause of this form but couldn't
make that theory match the 9-iterations observation. (I still can't.)

I walked through the Java code in a debugger just now and have an
explanation for the 5 vs. 9 discrepancy. The JDBC driver keeps a cache of
queries that have been passed to a connection's prepareStatement() method,
and inlines the bind values the first 4 times it sees a query in the hopes
of reducing overhead on one-off queries. So I believe the sequence ends up
being:

1-4: JDBC driver inlines the values, server sees no bind variables
5: JDBC driver prepares the statement, server sees bind variables and tries
generic plan
6+: JDBC driver reuses the existing prepared statement from iteration 5
10: Server has seen the query 5 times before and switches to the custom plan

As for the broader problem, at the risk of being hopelessly naive about how
all this works internally: Could the discrepancy between the estimated and
actual row counts be tracked and fed back into the planner somehow?

-Steve

#11Pavel Stehule
pavel.stehule@gmail.com
In reply to: Steven Grimm (#10)
Re: Inconsistent performance with LIKE and bind variable on long-lived connection

2017-06-11 18:34 GMT+02:00 Steven Grimm <sgrimm@thesegovia.com>:

On Sun, Jun 11, 2017 at 8:21 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Yeah, I've been watching this thread and trying to figure out how to
explain that part; I suspected a cause of this form but couldn't
make that theory match the 9-iterations observation. (I still can't.)

I walked through the Java code in a debugger just now and have an
explanation for the 5 vs. 9 discrepancy. The JDBC driver keeps a cache of
queries that have been passed to a connection's prepareStatement() method,
and inlines the bind values the first 4 times it sees a query in the hopes
of reducing overhead on one-off queries. So I believe the sequence ends up
being:

1-4: JDBC driver inlines the values, server sees no bind variables

yes - this is client side prepared statement - prepareThreshold
https://jdbc.postgresql.org/documentation/head/connect.html

Regards

Pavel

5: JDBC driver prepares the statement, server sees bind variables and tries

Show quoted text

generic plan
6+: JDBC driver reuses the existing prepared statement from iteration 5
10: Server has seen the query 5 times before and switches to the custom
plan

As for the broader problem, at the risk of being hopelessly naive about
how all this works internally: Could the discrepancy between the estimated
and actual row counts be tracked and fed back into the planner somehow?

-Steve

#12David G. Johnston
david.g.johnston@gmail.com
In reply to: Steven Grimm (#7)
Re: Inconsistent performance with LIKE and bind variable on long-lived connection

On Sun, Jun 11, 2017 at 1:10 AM, Steven Grimm <sgrimm@thesegovia.com> wrote:

(first five iterations)

Index Only Scan using test_pkey on test (cost=0.29..476.29 rows=9999
width=4) (actual time=0.058..2.439 rows=10000 loops=1)
Index Cond: (col1 = 'xyz'::text)
Filter: (col2 ~~ '%'::text)
Heap Fetches: 0
Execution time: 2.957 ms

(iterations 6+)

Sort (cost=205.41..205.54 rows=50 width=4) (actual time=104.986..105.784
rows=10000 loops=1)
Sort Key: col2
Sort Method: quicksort Memory: 853kB
-> Seq Scan on test (cost=0.00..204.00 rows=50 width=4) (actual
time=0.014..2.100 rows=10000 loops=1)
Filter: ((col2 ~~ $2) AND (col1 = $1))
Execution time: 106.282 ms

​Can you convince the planner to use the IOS for the generic plan too; by
setting "​enable_seqscan=false" and maybe "enable_sort=false"?

I get what Tom's saying generally but I'm surprised it would throw away an
IOS plan for a sequential scan + sort when it thinks there are fewer rows
that will actually match. I've generally read that the closer to the whole
table you expect to retrieve the more advantageous a sequential scan is but
this exhibits the opposite behavior.

IOW, I'm wondering why of the various generic plans why this one is
considered the cheapest in the first place. Is the measure "cost per row"
an invalid metric to consider - since in the above the specific plan is
0.048 compared to 0.244 for the generic one?

Or, should we at least add memory of actual executions for a given set of
bindings? It would probably be acceptable for a pattern like: "5 4 5 6 4
20 4 5 6" to show up - we try the specific plan 5 times, then we try the
generic one and see that we got worse, and so we go back to the specific
plan. Assuming the actual supplied inputs don't change as is the case in
the example.

David J.