php and postgres - too many queries too fast?
Hey everybody. I'm running postgresql 8.2 on a windows 2k3 server
machine. I have a table (two tables actually, in table and table_import
format). The _import table has all text type fields, while the main
table has datatypes defined.
I wrote a PHP script that checks the various type fields (e.g. integer
is a number, date is a date, etc.) and sets a bool flag is_ok to true
for the row if the data all checks out. The script gets a dump of the
data from a SELECT * statement, then takes each row, verifies the
various data fields (successfully, I've tested) and sets a variable
baddata. At the end of the checking, if baddata still equals 0, then it
crafts an UPDATE statement to change the value of is_ok. There are a
relatively small amount of rows (~1500, small time to you guys I'm
sure), and the script runs fairly fast.
OK, enough of the background, here's my issue. For some lovely reason,
even though my script reports running an UPDATE query 1563 times (out of
1566 rows), only 316 rows have is_ok set to TRUE. I've tried a few
times, changing this and that, and it always updates those 316 rows (no
reason for this, the data is actually really good when it comes in, I'm
just trying to build an extra layer of assuredness). Of particular note,
I'm trying to keep the script fairly uniform and work for all our
tables, so the UPDATE statement looks something like
UPDATE table SET is_ok = 'TRUE' WHERE var1 = value1 AND var2 = value2
AND var3 = value3.....
for every field in the record (I would have it base it on the primary
key, but the field names and locations are different for each table). Is
it possible that I'm trying to run too many queries at once (or rather
rapid succession)? I've tried encapsulating the queries in a BEGIN ..
COMMIT transaction which improved my speed quite a bit, but it's still
updating only those rows.
I know that it's entirely possible that the problem lies in the PHP, or
the network, or the web server configuration, or the moon phase, but is
there anything here that jumps out at anybody as a possible cause?
TIA
Thomas R. Hart II
tomhart@coopfed.org
On 11/5/07, Tom Hart <tomhart@coopfed.org> wrote:
Hey everybody. I'm running postgresql 8.2 on a windows 2k3 server
machine. I have a table (two tables actually, in table and table_import
format). The _import table has all text type fields, while the main
table has datatypes defined.I wrote a PHP script that checks the various type fields (e.g. integer
is a number, date is a date, etc.) and sets a bool flag is_ok to true
for the row if the data all checks out. The script gets a dump of the
data from a SELECT * statement, then takes each row, verifies the
various data fields (successfully, I've tested) and sets a variable
baddata. At the end of the checking, if baddata still equals 0, then it
crafts an UPDATE statement to change the value of is_ok. There are a
relatively small amount of rows (~1500, small time to you guys I'm
sure), and the script runs fairly fast.OK, enough of the background, here's my issue. For some lovely reason,
even though my script reports running an UPDATE query 1563 times (out of
1566 rows), only 316 rows have is_ok set to TRUE. I've tried a few
times, changing this and that, and it always updates those 316 rows (no
reason for this, the data is actually really good when it comes in, I'm
just trying to build an extra layer of assuredness). Of particular note,
I'm trying to keep the script fairly uniform and work for all our
tables, so the UPDATE statement looks something likeUPDATE table SET is_ok = 'TRUE' WHERE var1 = value1 AND var2 = value2
AND var3 = value3.....
Can you create a repeatable test version of this? I.e. have one row
that does and one row that doesn't update? And give it to the list as
a .sql file to be loaded along with the queries you're using?
I understand the general idea of what you're saying, but I'm afraid
there's not enough detail in your post to really help.
Tom Hart wrote:
[snip]
OK, enough of the background, here's my issue. For some lovely reason,
even though my script reports running an UPDATE query 1563 times (out of
1566 rows), only 316 rows have is_ok set to TRUE. I've tried a few
times, changing this and that, and it always updates those 316 rows (no
reason for this, the data is actually really good when it comes in, I'm
just trying to build an extra layer of assuredness). Of particular note,
I'm trying to keep the script fairly uniform and work for all our
tables, so the UPDATE statement looks something likeUPDATE table SET is_ok = 'TRUE' WHERE var1 = value1 AND var2 = value2
AND var3 = value3.....Thomas R. Hart II
tomhart@coopfed.org
Have you run one of these queries via psql or something other than php?
I doubt its a "too many too fast" thing. I'd guess a logic error
someplace.
Why 1563 queries? Can you get the row's modified per query? If you're
tables looks like:
var1 | var2 | var3
--------------------
a | b | a
a | b | c
Would you fire of two query's like:
UPDATE table SET is_ok = 'TRUE' WHERE var1 = 'a' AND var2 = 'b' and var3
= 'a;
UPDATE table SET is_ok = 'TRUE' WHERE var1 = 'a' AND var2 = 'b' and var3
= 'c;
if so, do you generate the update's on the fly?
-Andy
On Mon, 2007-11-05 at 17:18 -0500, Tom Hart wrote:
UPDATE table SET is_ok = 'TRUE' WHERE var1 = value1 AND var2 = value2
AND var3 = value3.....
As others have said, you need to narrow the problem down a bit more
before we can provide useful help.
However, a wild guess might be that some of your fields contain NULLs.
In SQL, NULL=NULL is _not_ true (more specifically, it is NULL).
To see what I mean, do "SELECT 1 WHERE NULL=NULL", it will return 0
rows.
Regards,
Jeff Davis
andy wrote:
Tom Hart wrote:
[snip]OK, enough of the background, here's my issue. For some lovely
reason, even though my script reports running an UPDATE query 1563
times (out of 1566 rows), only 316 rows have is_ok set to TRUE. I've
tried a few times, changing this and that, and it always updates
those 316 rows (no reason for this, the data is actually really good
when it comes in, I'm just trying to build an extra layer of
assuredness). Of particular note, I'm trying to keep the script
fairly uniform and work for all our tables, so the UPDATE statement
looks something likeUPDATE table SET is_ok = 'TRUE' WHERE var1 = value1 AND var2 = value2
AND var3 = value3.....Thomas R. Hart II
tomhart@coopfed.orgHave you run one of these queries via psql or something other than
php? I doubt its a "too many too fast" thing. I'd guess a logic
error someplace.Why 1563 queries? Can you get the row's modified per query? If
you're tables looks like:var1 | var2 | var3
--------------------
a | b | a
a | b | cWould you fire of two query's like:
UPDATE table SET is_ok = 'TRUE' WHERE var1 = 'a' AND var2 = 'b' and
var3 = 'a;UPDATE table SET is_ok = 'TRUE' WHERE var1 = 'a' AND var2 = 'b' and
var3 = 'c;if so, do you generate the update's on the fly?
-Andy
---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match
I thought I had run the queries through psql (it's been a long one :-)
and when I tried to verify, I was able to find my problem (yes, I am an
idiot).
It turns out that for some reason it didn't like to UPDATE when I was
using a text type field (specifically an empty text field) in the WHERE
clause. To remedy this, I instructed PHP to not use a field in the WHERE
clause if the destination type was 'text', and now we're working
beautifully (2.405 seconds to run the script through 1566 rows, running
updates on 1563 of them). Now I just need to figure out what's going on
with those 3 rogue rows.
Sorry I hadn't checked all the bases thoroughly, but now they definitely
are belong to us. Thanks for the help and have a good night.
Tom Hart wrote:
It turns out that for some reason it didn't like to UPDATE when I was
using a text type field (specifically an empty text field) in the WHERE
clause. To remedy this, I instructed PHP to not use a field in the WHERE
clause if the destination type was 'text', and now we're working
beautifully (2.405 seconds to run the script through 1566 rows, running
updates on 1563 of them). Now I just need to figure out what's going on
with those 3 rogue rows.
Its almost certainly not an empty text field, but a NULL one, see Jeff's
reply to your original message.
--
Richard Huxton
Archonet Ltd