Keyword list sanity check

Started by Heikki Linnakangasover 16 years ago18 messages
#1Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
1 attachment(s)

I wrote a little perl script to perform a basic sanity check to keywords
in gram.y and kwlist.h. It checks that all lists are in alphabetical
order, all keywords present in gram.y are listed in kwlist.h in the
right category, and conversely that all keywords listed in kwlist.h are
listed in gram.y.

It found one minor issue already:

$ perl src/tools/check_keywords.pl
'SCHEMA' after 'SERVER' in unreserved_keyword list is misplaced

SERVER is not in the right place in gram.y, it should go between
SERIALIZABLE and SERVER. I'll fix that.

I'll put this into src/tools. It's heavily dependent on the format of
the lists in gram.y and kwlist.h but if it bitrots due to changes in
those files, we can either fix it or just remove it if it's not deemed
useful anymore.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Attachments:

check_keywords.plapplication/x-perl; name=check_keywords.plDownload
#2Greg Stark
greg.stark@enterprisedb.com
In reply to: Heikki Linnakangas (#1)
Re: Keyword list sanity check

I had previously considered adding an assertion in the backend to
check they're sorted properly. That would be less formatting dependent
and would be only a couple lines of C.

I don't think we can do that with the gram.y check though.

--
Greg

On 28 Apr 2009, at 09:33, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com

Show quoted text

wrote:

I wrote a little perl script to perform a basic sanity check to
keywords in gram.y and kwlist.h. It checks that all lists are in
alphabetical order, all keywords present in gram.y are listed in
kwlist.h in the right category, and conversely that all keywords
listed in kwlist.h are listed in gram.y.

It found one minor issue already:

$ perl src/tools/check_keywords.pl
'SCHEMA' after 'SERVER' in unreserved_keyword list is misplaced

SERVER is not in the right place in gram.y, it should go between
SERIALIZABLE and SERVER. I'll fix that.

I'll put this into src/tools. It's heavily dependent on the format
of the lists in gram.y and kwlist.h but if it bitrots due to changes
in those files, we can either fix it or just remove it if it's not
deemed useful anymore.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
<check_keywords.pl>

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Greg Stark (#2)
Re: Keyword list sanity check

Greg Stark wrote:

I had previously considered adding an assertion in the backend to check
they're sorted properly. That would be less formatting dependent and
would be only a couple lines of C.

I don't think we can do that with the gram.y check though.

Well, the ordering in gram.y is just pro forma anyway. Checking that all
keywords are present and correctly labeled in kwlist.h is more
important. What's still missing is to check that all keywords defined
with "%token <keyword>" in gram.y are present in one of the keyword lists.

Hmm, it just occurred to me that this script is only a few steps away
from actually generating kwlist.h from gram.y, instead of merely
cross-checking them.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#4David Fetter
david@fetter.org
In reply to: Heikki Linnakangas (#3)
Re: Keyword list sanity check

On Tue, Apr 28, 2009 at 11:57:19AM +0300, Heikki Linnakangas wrote:

Greg Stark wrote:

I had previously considered adding an assertion in the backend to
check they're sorted properly. That would be less formatting
dependent and would be only a couple lines of C.

I don't think we can do that with the gram.y check though.

Well, the ordering in gram.y is just pro forma anyway. Checking that
all keywords are present and correctly labeled in kwlist.h is more
important. What's still missing is to check that all keywords
defined with "%token <keyword>" in gram.y are present in one of the
keyword lists.

Hmm, it just occurred to me that this script is only a few steps
away from actually generating kwlist.h from gram.y, instead of
merely cross-checking them.

Single Point of Truth is good :)

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#5David Fetter
david@fetter.org
In reply to: Heikki Linnakangas (#1)
Re: Keyword list sanity check

On Tue, Apr 28, 2009 at 11:33:28AM +0300, Heikki Linnakangas wrote:

I wrote a little perl script to perform a basic sanity check to
keywords in gram.y and kwlist.h. It checks that all lists are in
alphabetical order, all keywords present in gram.y are listed in
kwlist.h in the right category, and conversely that all keywords
listed in kwlist.h are listed in gram.y.

It found one minor issue already:

$ perl src/tools/check_keywords.pl 'SCHEMA' after 'SERVER' in
unreserved_keyword list is misplaced

SERVER is not in the right place in gram.y, it should go between
SERIALIZABLE and SERVER. I'll fix that.

I'll put this into src/tools. It's heavily dependent on the format
of the lists in gram.y and kwlist.h but if it bitrots due to
changes in those files, we can either fix it or just remove it if
it's not deemed useful anymore.

Please clean up this code at least to the point where it's
strict-clean, which means putting "use strict;" right after the
shebang line and not checking it in until it runs that way.

I tried, but couldn't make heads or tails of the thing, given all the
unused- and similarly-named variables, failure to indent, etc. I
don't know how to put this gently, but if you checked in C code with
quality like this, you'd have it bounced with derision.

I'd also like to propose that "strict clean" be a minimum code quality
metric for any Perl code in our code base. A lot of what's in there
is just about impossible to maintain.

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#6Laurent Laborde
kerdezixe@gmail.com
In reply to: Heikki Linnakangas (#1)
1 attachment(s)
Re: Keyword list sanity check

On Tue, Apr 28, 2009 at 10:33 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

I wrote a little perl script to perform a basic sanity check to keywords in
gram.y and kwlist.h. It checks that all lists are in alphabetical order, all
keywords present in gram.y are listed in kwlist.h in the right category, and
conversely that all keywords listed in kwlist.h are listed in gram.y.

Friendly greetings !
Here is a new version of check_keywords.pl :
- perl -w and "use strict" enabled (and all the fixes that come with it)
- minor cleaning

--
Laurent Laborde
Sysadmin at jfg://networks
http://www.over-blog.com/

Attachments:

check_keywords.plapplication/x-perl; name=check_keywords.plDownload
#7Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Laurent Laborde (#6)
Re: Keyword list sanity check

Laurent Laborde wrote:

On Tue, Apr 28, 2009 at 10:33 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

I wrote a little perl script to perform a basic sanity check to keywords in
gram.y and kwlist.h. It checks that all lists are in alphabetical order, all
keywords present in gram.y are listed in kwlist.h in the right category, and
conversely that all keywords listed in kwlist.h are listed in gram.y.

Friendly greetings !
Here is a new version of check_keywords.pl :
- perl -w and "use strict" enabled (and all the fixes that come with it)
- minor cleaning

Thanks, applied!

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#8Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: David Fetter (#5)
Re: Keyword list sanity check

David Fetter wrote:

Please clean up this code at least to the point where it's
strict-clean, which means putting "use strict;" right after the
shebang line and not checking it in until it runs that way.

I tried, but couldn't make heads or tails of the thing, given all the
unused- and similarly-named variables, failure to indent, etc. I
don't know how to put this gently, but if you checked in C code with
quality like this, you'd have it bounced with derision.

Yep, I'm completely novice at perl. I've checked in Laurent's patch to
clean it up, which at least makes it strict-clean.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#9Robert Haas
robertmhaas@gmail.com
In reply to: David Fetter (#5)
Re: Keyword list sanity check

On Thu, Apr 30, 2009 at 3:27 AM, David Fetter <david@fetter.org> wrote:

On Tue, Apr 28, 2009 at 11:33:28AM +0300, Heikki Linnakangas wrote:

I wrote a little perl script to perform a basic sanity check to
keywords  in gram.y and kwlist.h. It checks that all lists are in
alphabetical  order, all keywords present in gram.y are listed in
kwlist.h in the  right category, and conversely that all keywords
listed in kwlist.h are  listed in gram.y.

It found one minor issue already:

$ perl src/tools/check_keywords.pl 'SCHEMA' after 'SERVER' in
unreserved_keyword list is misplaced

SERVER is not in the right place in gram.y, it should go between
SERIALIZABLE and SERVER. I'll fix that.

I'll put this into src/tools. It's heavily dependent on the format
of  the lists in gram.y and kwlist.h but if it bitrots due to
changes in  those files, we can either fix it or just remove it if
it's not deemed  useful anymore.

Please clean up this code at least to the point where it's
strict-clean, which means putting "use strict;" right after the
shebang line and not checking it in until it runs that way.

And "use warnings;", too.

...Robert

#10Peter Eisentraut
peter_e@gmx.net
In reply to: David Fetter (#5)
Re: Keyword list sanity check

On Thursday 30 April 2009 10:27:45 David Fetter wrote:

I'd also like to propose that "strict clean" be a minimum code quality
metric for any Perl code in our code base. A lot of what's in there
is just about impossible to maintain.

use strict and use warnings, I think, although with use warnings I have
occasionally run into the trouble of some old versions not supporting it (only
via perl -w).

#11Andy Lester
andy@petdance.com
In reply to: David Fetter (#5)
Re: Keyword list sanity check

On Apr 30, 2009, at 2:27 AM, David Fetter wrote:

Please clean up this code at least to the point where it's
strict-clean, which means putting "use strict;" right after the
shebang line and not checking it in until it runs that way.

I can take care of this, David. Shouldn't be too tough.

--
Andy Lester => andy@petdance.com => www.petdance.com => AIM:petdance

#12Bruce Momjian
bruce@momjian.us
In reply to: David Fetter (#5)
Re: Keyword list sanity check

David Fetter wrote:

On Tue, Apr 28, 2009 at 11:33:28AM +0300, Heikki Linnakangas wrote:

I wrote a little perl script to perform a basic sanity check to
keywords in gram.y and kwlist.h. It checks that all lists are in
alphabetical order, all keywords present in gram.y are listed in
kwlist.h in the right category, and conversely that all keywords
listed in kwlist.h are listed in gram.y.

It found one minor issue already:

$ perl src/tools/check_keywords.pl 'SCHEMA' after 'SERVER' in
unreserved_keyword list is misplaced

SERVER is not in the right place in gram.y, it should go between
SERIALIZABLE and SERVER. I'll fix that.

I'll put this into src/tools. It's heavily dependent on the format
of the lists in gram.y and kwlist.h but if it bitrots due to
changes in those files, we can either fix it or just remove it if
it's not deemed useful anymore.

Please clean up this code at least to the point where it's
strict-clean, which means putting "use strict;" right after the
shebang line and not checking it in until it runs that way.

I tried, but couldn't make heads or tails of the thing, given all the
unused- and similarly-named variables, failure to indent, etc. I
don't know how to put this gently, but if you checked in C code with
quality like this, you'd have it bounced with derision.

I'd also like to propose that "strict clean" be a minimum code quality
metric for any Perl code in our code base. A lot of what's in there
is just about impossible to maintain.

Good suggestion; I see this has been done:

revision 1.2
date: 2009/04/30 10:26:35; author: heikki; state: Exp; lines: +24 -11
Clean up check_keywords.pl script, making it 'strict' and removing a few
leftover unused variables.

Laurent Laborde

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#13Andy Lester
andy@petdance.com
In reply to: Robert Haas (#9)
Re: Keyword list sanity check

On Apr 30, 2009, at 6:41 AM, Robert Haas wrote:

Please clean up this code at least to the point where it's
strict-clean, which means putting "use strict;" right after the
shebang line and not checking it in until it runs that way.

And "use warnings;", too.

I'll prob'ly come up with a policy file for Perl::Critic and a make
target for perlcritic.

xoa

--
Andy Lester => andy@petdance.com => www.theworkinggeek.com => AIM:petdance

#14Bruce Momjian
bruce@momjian.us
In reply to: Robert Haas (#9)
Re: Keyword list sanity check

Robert Haas wrote:

On Thu, Apr 30, 2009 at 3:27 AM, David Fetter <david@fetter.org> wrote:

On Tue, Apr 28, 2009 at 11:33:28AM +0300, Heikki Linnakangas wrote:

I wrote a little perl script to perform a basic sanity check to
keywords ?in gram.y and kwlist.h. It checks that all lists are in
alphabetical ?order, all keywords present in gram.y are listed in
kwlist.h in the ?right category, and conversely that all keywords
listed in kwlist.h are ?listed in gram.y.

It found one minor issue already:

$ perl src/tools/check_keywords.pl 'SCHEMA' after 'SERVER' in
unreserved_keyword list is misplaced

SERVER is not in the right place in gram.y, it should go between
SERIALIZABLE and SERVER. I'll fix that.

I'll put this into src/tools. It's heavily dependent on the format
of ?the lists in gram.y and kwlist.h but if it bitrots due to
changes in ?those files, we can either fix it or just remove it if
it's not deemed ?useful anymore.

Please clean up this code at least to the point where it's
strict-clean, which means putting "use strict;" right after the
shebang line and not checking it in until it runs that way.

And "use warnings;", too.

I now see at the top of the CVS file:

#!/usr/bin/perl -w

use strict;

so it seems it has both now.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#15David Fetter
david@fetter.org
In reply to: Andy Lester (#13)
Re: Keyword list sanity check

On Thu, Apr 30, 2009 at 11:39:33AM -0500, Andy Lester wrote:

On Apr 30, 2009, at 6:41 AM, Robert Haas wrote:

Please clean up this code at least to the point where it's
strict-clean, which means putting "use strict;" right after the
shebang line and not checking it in until it runs that way.

And "use warnings;", too.

I'll prob'ly come up with a policy file for Perl::Critic and a make
target for perlcritic.

The current code has a bunch of 5s in it, so it's a target-rich
environment :)

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#16David Fetter
david@fetter.org
In reply to: David Fetter (#15)
1 attachment(s)
Re: Keyword list sanity check

On Thu, Apr 30, 2009 at 09:40:50AM -0700, David Fetter wrote:

On Thu, Apr 30, 2009 at 11:39:33AM -0500, Andy Lester wrote:

On Apr 30, 2009, at 6:41 AM, Robert Haas wrote:

Please clean up this code at least to the point where it's
strict-clean, which means putting "use strict;" right after the
shebang line and not checking it in until it runs that way.

And "use warnings;", too.

I'll prob'ly come up with a policy file for Perl::Critic and a make
target for perlcritic.

The current code has a bunch of 5s in it, so it's a target-rich
environment :)

Here's a patch that gets it to pass perlcritic -4 and still (as far as
I can tell) work.

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachments:

check_keywords.difftext/plain; charset=us-asciiDownload
diff --git a/src/tools/check_keywords.pl b/src/tools/check_keywords.pl
index 8d0d962..a5a01d2 100755
--- a/src/tools/check_keywords.pl
+++ b/src/tools/check_keywords.pl
@@ -1,111 +1,109 @@
-#!/usr/bin/perl -w
+#!/usr/bin/perl
 
 use strict;
+use warnings;
+use diagnostics;
+use Carp;
 
 # Check that the keyword lists in gram.y and kwlist.h are sane. Run from
 # the top directory, or pass a path to a top directory as argument.
 #
 # $PostgreSQL$
 
-my $path;
+local $, = ' ';        # set output field separator
+local $\ = "\n";        # set output record separator
 
-if (@ARGV) {
-	$path = $ARGV[0];
-	shift @ARGV;
-} else {
-	$path = "."; 
-}
-
-$[ = 1;			# set array base to 1
-$, = ' ';		# set output field separator
-$\ = "\n";		# set output record separator
+my $path =  $ARGV[0] || '.';
 
-my %keyword_categories;
-$keyword_categories{'unreserved_keyword'} = 'UNRESERVED_KEYWORD';
-$keyword_categories{'col_name_keyword'} = 'COL_NAME_KEYWORD';
-$keyword_categories{'type_func_name_keyword'} = 'TYPE_FUNC_NAME_KEYWORD';
-$keyword_categories{'reserved_keyword'} = 'RESERVED_KEYWORD';
+my %keyword_categories = (
+    unreserved_keyword     => 'UNRESERVED_KEYWORD',
+    col_name_keyword       => 'COL_NAME_KEYWORD',
+    type_func_name_keyword => 'TYPE_FUNC_NAME_KEYWORD',
+    reserved_keyword       => 'RESERVED_KEYWORD',
+);
 
 my $gram_filename = "$path/src/backend/parser/gram.y";
-open(GRAM, $gram_filename) || die("Could not open : $gram_filename");
 
-my ($S, $s, $k, $n, $kcat);
+my ($S, $s, $k, $kcat);
 my $comment;
-my @arr;
 my %keywords;
 
-line: while (<GRAM>) {
-    chomp;	# strip record separator
+open my $gram, '<', $gram_filename or croak "Could not open $gram_filename: $!";
+my @grammar = <$gram>;
+close $gram;
+line: foreach (@grammar) {
+    chomp;    # strip record separator
 
     $S = $_;
     # Make sure any braces are split
-    $s = '{', $S =~ s/$s/ { /g;
-    $s = '}', $S =~ s/$s/ } /g;
+    $s = '{'; $S =~ s/$s/ { /xg;
+    $s = '}'; $S =~ s/$s/ } /xg;
     # Any comments are split
-    $s = '[/][*]', $S =~ s#$s# /* #g;
-    $s = '[*][/]', $S =~ s#$s# */ #g;
+    $s = '[/][*]'; $S =~ s#$s# /* #xg;
+    $s = '[*][/]'; $S =~ s#$s# */ #xg;
 
     if (!($kcat)) {
-	# Is this the beginning of a keyword list?
-	foreach $k (keys %keyword_categories) {
-	    if ($S =~ m/^($k):/) {
-		$kcat = $k;
-		next line;
-	    }
-	}
-	next line;
+    # Is this the beginning of a keyword list?
+    foreach my $k (keys %keyword_categories) {
+        if ($S =~ m/^($k):/x) {
+            $kcat = $k;
+            next line;
+        }
+    }
+    next line;
     }
 
     # Now split the line into individual fields
-    $n = (@arr = split(' ', $S));
+    my @arr = split(' ', $S);
+    
+    my %comment_switch = (
+        '*/' => 0,
+        '/*' => 1,
+    );
 
     # Ok, we're in a keyword list. Go through each field in turn
-    for (my $fieldIndexer = 1; $fieldIndexer <= $n; $fieldIndexer++) {
-	if ($arr[$fieldIndexer] eq '*/' && $comment) {
-	    $comment = 0;
-	    next;
-	}
-	elsif ($comment) {
-	    next;
-	}
-	elsif ($arr[$fieldIndexer] eq '/*') {
-	    # start of a multiline comment
-	    $comment = 1;
-	    next;
-	}
-	elsif ($arr[$fieldIndexer] eq '//') {
-	    next line;
-	}
-
-	if ($arr[$fieldIndexer] eq ';') {
-	    # end of keyword list
-	    $kcat = '';
-	    next;
-	}
-
-	if ($arr[$fieldIndexer] eq '|') {
-	    next;
-	}
-	
-	# Put this keyword into the right list
-	push @{$keywords{$kcat}}, $arr[$fieldIndexer];
+    for (0..$#arr) {
+        if ($arr[$_] eq '//') {
+            next line;
+        }
+
+        if (exists $comment_switch{$arr[$_]}) {
+            $comment = $comment_switch{$arr[$_]};
+            next;
+        }
+
+        if ($comment) {
+            next;
+        }
+
+        if ($arr[$_] eq ';') {
+            # end of keyword list
+            $kcat = '';
+            next;
+        }
+
+        if ($arr[$_] eq '|') {
+            next;
+        }
+
+        # Put this keyword into the right list
+        push @{$keywords{$kcat}}, $arr[$_];
     }
 }
-close GRAM;
 
 # Check that all keywords are in alphabetical order
-my ($prevkword, $kword, $bare_kword);
-foreach $kcat (keys %keyword_categories) {
+my ($prevkword, $bare_kword);
+foreach my $kcat (keys %keyword_categories) {
     $prevkword = '';
 
-    foreach $kword (@{$keywords{$kcat}}) {
-	# Some keyword have a _P suffix. Remove it for the comparison.
-	$bare_kword = $kword;
-	$bare_kword =~ s/_P$//;
-	if ($bare_kword le $prevkword) {
-	    print "'$bare_kword' after '$prevkword' in $kcat list is misplaced";
-	}
-	$prevkword = $bare_kword;
+    foreach my $kword (@{$keywords{$kcat}}) {
+    # Some keyword have a _P suffix. Remove it for the comparison.
+    $bare_kword = $kword;
+    $bare_kword =~ s/_P$//x;
+    if ($bare_kword le $prevkword) {
+        print "'$bare_kword' after '$prevkword' in $kcat list is misplaced";
+    }
+    $prevkword = $bare_kword;
     }
 }
 
@@ -115,7 +113,7 @@ foreach $kcat (keys %keyword_categories) {
 # with a dummy value.
 my %kwhashes;
 while ( my ($kcat, $kcat_id) = each(%keyword_categories) ) {
-    @arr = @{$keywords{$kcat}};
+    my @arr = @{$keywords{$kcat}};
 
     my $hash;
     foreach my $item (@arr) { $hash->{$item} = 1 }
@@ -126,66 +124,69 @@ while ( my ($kcat, $kcat_id) = each(%keyword_categories) ) {
 # Now read in kwlist.h
 
 my $kwlist_filename = "$path/src/include/parser/kwlist.h";
-open(KWLIST, $kwlist_filename) || die("Could not open : $kwlist_filename");
 
 my $prevkwstring = '';
 my $bare_kwname;
 my %kwhash;
-kwlist_line: while (<KWLIST>) {
-    my($line) = $_;
-
-    if ($line =~ /^PG_KEYWORD\(\"(.*)\", (.*), (.*)\)/)
-    {
-	my($kwstring) = $1;
-	my($kwname) = $2;
-	my($kwcat_id) = $3;
-
-	# Check that the list is in alphabetical order
-	if ($kwstring le $prevkwstring) {
-	    print "'$kwstring' after '$prevkwstring' in kwlist.h is misplaced";
-	}
-	$prevkwstring = $kwstring;
-
-	# Check that the keyword string is valid: all lower-case ASCII chars
-	if ($kwstring !~ /^[a-z_]*$/) {
-	    print "'$kwstring' is not a valid keyword string, must be all lower-case ASCII chars";
-	}
-
-	# Check that the keyword name is valid: all upper-case ASCII chars
-	if ($kwname !~ /^[A-Z_]*$/) {
-	    print "'$kwname' is not a valid keyword name, must be all upper-case ASCII chars";
-	}
-
-	# Check that the keyword string matches keyword name
-	$bare_kwname = $kwname;
-	$bare_kwname =~ s/_P$//;
-	if ($bare_kwname ne uc($kwstring)) {
-	    print "keyword name '$kwname' doesn't match keyword string '$kwstring'";
-	}
-
-	# Check that the keyword is present in the grammar
-	%kwhash = %{$kwhashes{$kwcat_id}};
-
-	if (!(%kwhash))	{
-	    #print "Unknown kwcat_id: $kwcat_id";
-	} else {
-	    if (!($kwhash{$kwname})) {
-		print "'$kwname' not present in $kwcat_id section of gram.y";
-	    } else {
-		# Remove it from the hash, so that we can complain at the end
-		# if there's keywords left that were not found in kwlist.h
-		delete $kwhashes{$kwcat_id}->{$kwname};
-	    }
-	}
+
+open my $kwlist, '<', $kwlist_filename or croak "Could not open $kwlist_filename: $!";
+my @kwlist_lines = <$kwlist>;
+close $kwlist;
+kwlist_line: foreach (@kwlist_lines) {
+    my($kwstring, $kwname, $kwcat_id);
+    if (m{^PG_KEYWORD \( \"(.*)\", \s (.*), \s (.*) \) }x) {
+        ($kwstring, $kwname, $kwcat_id) = ($1, $2, $3);
+    }
+    else {
+        next kwlist_line;
+    }
+
+    # Check that the list is in alphabetical order
+    if ($kwstring le $prevkwstring) {
+        print "'$kwstring' after '$prevkwstring' in kwlist.h is misplaced";
+    }
+    $prevkwstring = $kwstring;
+
+    # Check that the keyword string is valid: all lower-case ASCII chars
+    if ($kwstring !~ /^[a-z_]*$/x) {
+        print "'$kwstring' is not a valid keyword string, must be all lower-case ASCII chars";
+    }
+
+    # Check that the keyword name is valid: all upper-case ASCII chars
+    if ($kwname !~ /^[A-Z_]*$/x) {
+        print "'$kwname' is not a valid keyword name, must be all upper-case ASCII chars";
+    }
+
+    # Check that the keyword string matches keyword name
+    $bare_kwname = $kwname;
+    $bare_kwname =~ s/_P$//x;
+    if ($bare_kwname ne uc($kwstring)) {
+        print "keyword name '$kwname' doesn't match keyword string '$kwstring'";
+    }
+
+    # Check that the keyword is present in the grammar
+    %kwhash = %{$kwhashes{$kwcat_id}};
+
+    if (!(%kwhash))    {
+        #print "Unknown kwcat_id: $kwcat_id";
+    }
+    else {
+        if (!($kwhash{$kwname})) {
+            print "'$kwname' not present in $kwcat_id section of gram.y";
+        }
+        else {
+            # Remove it from the hash, so that we can complain at the end
+            # if there's keywords left that were not found in kwlist.h
+            delete $kwhashes{$kwcat_id}->{$kwname};
+        }
     }
 }
-close KWLIST;
 
 # Check that we've paired up all keywords from gram.y with lines in kwlist.h
 while ( my ($kwcat, $kwcat_id) = each(%keyword_categories) ) {
     %kwhash = %{$kwhashes{$kwcat_id}};
 
     for my $kw ( keys %kwhash ) {
-	print "'$kw' found in gram.y $kwcat category, but not in kwlist.h"
+        print "'$kw' found in gram.y $kwcat category, but not in kwlist.h"
     }
 }
#17Andy Lester
andy@petdance.com
In reply to: David Fetter (#16)
Re: Keyword list sanity check

On Apr 30, 2009, at 2:11 PM, David Fetter wrote:

Here's a patch that gets it to pass perlcritic -4 and still (as far as
I can tell) work.

Tell ya what. Let me at it and I'll give a larger, more inclusive
patch.

xoxo,
Andy

--
Andy Lester => andy@petdance.com => www.theworkinggeek.com => AIM:petdance

#18Andrew Dunstan
andrew@dunslane.net
In reply to: Peter Eisentraut (#10)
Re: Keyword list sanity check

Peter Eisentraut wrote:

On Thursday 30 April 2009 10:27:45 David Fetter wrote:

I'd also like to propose that "strict clean" be a minimum code quality
metric for any Perl code in our code base. A lot of what's in there
is just about impossible to maintain.

use strict and use warnings, I think, although with use warnings I have
occasionally run into the trouble of some old versions not supporting it (only
via perl -w).

Right. I think strict mode is probably sufficient for utility code like
this.

Also note that we maintain the usual postgres indentation standards in
the perl MSVC stuff, by using perltidy thus:

perltidy -b -bl -nsfs -naws -l=100 -ole=unix *.pl *.pm

cheers

andrew