Every development project needs a code review tool, and most of the time
the code rules and conventions are very specific to the architecture.
Using regular expressions (the java.util.regex package, available since
JDK 1.4), we can model the code rules and write our own code review tool.
Regular expressions are made up of normal characters and metacharacters. Normal characters include uppercase and lowercase letters and digits. The metacharacters have special meanings and are introduced below.
In the simplest case, a regular expression looks like a standard search string. For example, the regular expression "check" contains no metacharacters. Used as a search, it will find a match in "checking" and "typechecking" but not in "Testing".
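The literal search described above can be tried with a few lines of Java. This is a minimal sketch; the class name LiteralSearch and the method containsCheck are made up for illustration.

```java
import java.util.regex.Pattern;

public class LiteralSearch {
    // A regular expression with no metacharacters behaves like a plain search string.
    private static final Pattern CHECK = Pattern.compile("check");

    /** Returns true if "check" occurs anywhere in the input. */
    public static boolean containsCheck(String input) {
        return CHECK.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(containsCheck("checking"));     // true
        System.out.println(containsCheck("typechecking")); // true
        System.out.println(containsCheck("Testing"));      // false
    }
}
```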
To make good use of regular expressions it is critical to understand metacharacters. Some of the metacharacters that can be used are . $ ^ * + and so on. For further details about regular expressions, see the JavaWorld article mentioned in the resources section.
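As a rough illustration of the metacharacters listed above, the following sketch tries each one against a short input. The class MetacharDemo and its helper found are hypothetical names, not part of the tool.

```java
import java.util.regex.Pattern;

public class MetacharDemo {
    /** Returns true if the regular expression matches anywhere in the input. */
    public static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(found("^check", "checking"));     // true:  ^ anchors at the start
        System.out.println(found("^check", "typechecking")); // false: "check" is not at the start
        System.out.println(found("ing$", "checking"));       // true:  $ anchors at the end
        System.out.println(found("c.eck", "check"));         // true:  . matches any one character
        System.out.println(found("che*ck", "chck"));         // true:  * means zero or more 'e'
        System.out.println(found("che+ck", "chck"));         // false: + means one or more 'e'
    }
}
```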
Pattern & Matcher : These are the classes in the java.util.regex package that
are used to compile a regular expression and find the matches for it.
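// Pattern and Matcher are in java.util.regex
Pattern pattern = Pattern.compile(regexp);
Matcher matcher = pattern.matcher(str);
// find() looks for the pattern anywhere in str;
// use matcher.matches() instead if the whole of str must match
if (matcher.find()) {
    // print the violation
}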
A regular expression, specified as a string, is the input that gets compiled into an instance of the Pattern class. The pattern can then be used to create a matcher, which matches arbitrary character sequences (any CharSequence, such as String or StringBuffer) against the regular expression. All of the state involved in matching resides in the matcher.
A Matcher can be used to perform three different kinds of match operations:
matches method: attempts to match the entire input sequence against the pattern
lookingAt method: attempts to match the input sequence against the pattern, starting at the beginning, without requiring the whole sequence to match
find method: scans the input sequence looking for the next subsequence that matches the pattern
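The difference between the three operations is easiest to see side by side. The sketch below uses the pattern "check" from earlier; the class MatchModes and its method names are invented for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchModes {
    private static final Pattern CHECK = Pattern.compile("check");

    /** matches(): the entire input must match the pattern. */
    public static boolean matchesWhole(String input) {
        return CHECK.matcher(input).matches();
    }

    /** lookingAt(): the input must start with a match, but may continue after it. */
    public static boolean matchesPrefix(String input) {
        return CHECK.matcher(input).lookingAt();
    }

    /** find(): each call moves on to the next matching subsequence. */
    public static int countOccurrences(String input) {
        Matcher matcher = CHECK.matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole("check"));                    // true
        System.out.println(matchesWhole("checking"));                 // false
        System.out.println(matchesPrefix("checking"));                // true
        System.out.println(matchesPrefix("typechecking"));            // false
        System.out.println(countOccurrences("typechecking a check")); // 2
    }
}
```

For a review tool, find is usually the operation of interest, since a violation can appear anywhere in a line or file.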
After a successful match, the group method can be used to obtain the matched text from the input sequence.
An overloaded version of group takes an integer group number and returns the text captured by that group. Group zero stands for the entire match.
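To show how group numbers relate to the parentheses in a pattern, here is a small sketch that pulls the exception type and variable name out of a catch clause. The pattern, the class GroupDemo and the method describeCatch are illustrative assumptions, not rules taken from the tool.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupDemo {
    // Group 1 captures the exception type, group 2 the variable name.
    private static final Pattern CATCH =
            Pattern.compile("catch\\s*\\(\\s*(\\w+)\\s+(\\w+)\\s*\\)");

    /** Returns {whole match, group 1, group 2}, or null if the line has no catch clause. */
    public static String[] describeCatch(String line) {
        Matcher matcher = CATCH.matcher(line);
        if (!matcher.find()) {
            return null;
        }
        // group() with no argument is the same as group(0): the entire match
        return new String[] { matcher.group(), matcher.group(1), matcher.group(2) };
    }

    public static void main(String[] args) {
        String[] parts = describeCatch("} catch (Exception e) {");
        System.out.println(parts[0]); // catch (Exception e)
        System.out.println(parts[1]); // Exception
        System.out.println(parts[2]); // e
    }
}
```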
Details about the Tool:
CodeRule : The attributes are the name of the rule, the type of Java file to
which the rule applies, and the regular expression. The rules are read first,
and then each Java file in the directory is scanned against every rule; each
match is reported as a violation of that rule.
Group/Single ParserMode : You can specify the parser mode for the rule, that is,
whether the whole file is scanned at once for this rule or line by line.
Type of File : Which types of files (by suffix) need to be scanned for a particular rule.
Custom Rules : You can extend the tool with custom rules by implementing the ICheck interface.
There are other options as well, such as printing the offending code along with the violation line number and other details.
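The pieces described above might fit together roughly as follows. This is only a sketch under stated assumptions: the article does not show the tool's source, so the ParserMode enum, the fields of CodeRule, and the check method on ICheck are all guesses made for illustration, and the code assumes a recent JDK rather than JDK 1.4.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CodeReviewSketch {

    /** Assumed parser modes: SINGLE scans line by line, GROUP scans the whole file at once. */
    public enum ParserMode { SINGLE, GROUP }

    /** Assumed shape of the extension point; the real ICheck signature is not given. */
    public interface ICheck {
        List<String> check(String fileName, String content);
    }

    /** A rule made of a name, a file suffix, a regular expression and a parser mode. */
    public static class CodeRule implements ICheck {
        private final String name;
        private final String fileSuffix;
        private final Pattern pattern;
        private final ParserMode mode;

        public CodeRule(String name, String fileSuffix, String regex, ParserMode mode) {
            this.name = name;
            this.fileSuffix = fileSuffix;
            this.pattern = Pattern.compile(regex);
            this.mode = mode;
        }

        public List<String> check(String fileName, String content) {
            List<String> violations = new ArrayList<>();
            if (!fileName.endsWith(fileSuffix)) {
                return violations; // the rule does not apply to this type of file
            }
            if (mode == ParserMode.GROUP) {
                // Whole-file mode: a single match may span several lines.
                Matcher matcher = pattern.matcher(content);
                while (matcher.find()) {
                    violations.add(name + " in " + fileName + ": " + matcher.group());
                }
            } else {
                // Line-by-line mode: report the line number of each offending line.
                String[] lines = content.split("\\r?\\n");
                for (int i = 0; i < lines.length; i++) {
                    if (pattern.matcher(lines[i]).find()) {
                        violations.add(name + " in " + fileName + " at line " + (i + 1)
                                + ": " + lines[i].trim());
                    }
                }
            }
            return violations;
        }
    }
}
```

In this sketch, a rule such as new CodeRule("NoSystemOut", ".java", "System\\.out\\.println", ParserMode.SINGLE) would flag each line that prints to standard output, while a rule for empty catch blocks would need ParserMode.GROUP because the braces may sit on different lines.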
This tool can help catch issues that can be modeled as regular expressions in large code bases. Using Perl or any other scripting language to capture these issues is certainly another alternative.