Matthew's Development Review

Monday, August 18, 2008

Microsoft Word, Javadoc, and Perforce filenames

I've come across an interesting part of the Java documentation article "How to Write Doc Comments for the Javadoc Tool".

Quote:

Troubleshooting Curly Quotes (Microsoft Word)

Problem - A problem occurs if you are working in an editor that defaults to curly (rather than straight) single and double quotes, such as Microsoft Word on a PC -- the quotes disappear when displayed in some browsers (such as Unix Netscape). So a phrase like "the display's characteristics" becomes "the displays characteristics."

The illegal characters are the following:

* 146 - right single quote
* 147 - left double quote
* 148 - right double quote

What should be used instead is:

* 39 - straight single quote
* 34 - straight quote

Preventive Solution - The reason the "illegal" quotes occurred was that a default Word option is "Change 'Straight Quotes' to 'Smart Quotes'". If you turn this off, you get the appropriate straight quotes when you type.

Fixing the Curly Quotes - Microsoft Word has several save options -- use "Save As Text Only" to change the quotes back to straight quotes. Be sure to use the correct option:

* "Save As Text Only With Line Breaks" - inserts a space at the end of each line, and keeps curly quotes.
* "Save As Text Only" - does not insert a space at the end of each line, and changes curly quotes to straight quotes.
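
For text that has already been pasted into source files, the same substitution can be scripted. A minimal Java sketch, assuming the text has been decoded from Windows-1252 so that codes 146, 147, and 148 arrive as U+2019, U+201C, and U+201D. The class and method names are hypothetical, and U+2018 (code 145, the left single quote) is an extra assumption that is not in the quoted list:

```java
// Hypothetical helper: replaces Word-style curly quotes with straight quotes.
class CurlyQuoteFixer {

    static String straighten(String text) {
        return text
                .replace('\u2018', '\'')  // left single quote (code 145, not in the quoted list)
                .replace('\u2019', '\'')  // right single quote (code 146)
                .replace('\u201C', '"')   // left double quote (code 147)
                .replace('\u201D', '"');  // right double quote (code 148)
    }

    public static void main(String[] args) {
        // The sample phrase from the quoted article, wrapped in curly double quotes.
        String pasted = "\u201Cthe display\u2019s characteristics\u201D";
        System.out.println(straighten(pasted));
    }
}
```

Working on decoded text keeps the sketch independent of the file encoding. Raw files would need to be read with the right charset first.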

We've encountered this issue before when developers copy and paste documentation from design documents into code. The troubling part is when it exists in an SCM and needs to be corrected for all versions and all files that contain the illegal character. I ran into this situation with Perforce and had to go back to several checkpoints (> 2GB), correcting them by writing regexes to find all of the illegal characters and replace them with the correct ones. We even encountered character codes 96 (base 16) "–" and FB (base 16) "û", pasted from Microsoft Word documents, in filenames as well as in the files themselves. This presented us with a real issue when it came to running our maintenance jobs with Perforce.

I just thought I'd share the following work I've done. Hopefully someone out there can utilize it.

Regex used to process checkpoints:

Convert 96 (base 16) "–" (en dash) to 2D (base 16) "-" (hyphen). Example: //depot/project/some– filename.cat
Remove FB (base 16) "û" entirely, rather than converting it to 20 (base 16) " " (space). Example: //depot/project/docs/some û other doc.doc@ 1 65539

# Find results and print
perl -n -e '/^(.*)([\xfb])(.*)$/ && print "$1$2$3\n"' checkpoint.1 > correction.1/checkpoint.1.FB
perl -n -e '/^(.*)([\x96])(.*)$/ && print "$1$2$3\n"' checkpoint.1 > correction.1/checkpoint.1.96

perl -n -e '/^(.*)([\xfb])(.*)$/ && print "$1$2$3\n"' checkpoint.2 > checkpoint.2.FB
perl -n -e '/^(.*)([\x96])(.*)$/ && print "$1$2$3\n"' checkpoint.2 > checkpoint.2.96

# Replace all
perl -pe 's/\xfb//g' checkpoint.1 > checkpoint.1.fb_removed
perl -pe 's/\x96/\x2d/g' checkpoint.1.fb_removed > checkpoint.1.96_and_fb_removed

p4d -jr /path/to/perforce/corrected/checkpoint/checkpoint.1.96_and_fb_removed
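
The two substitutions can also be done without Perl. A minimal Java sketch of the same byte-level pass, assuming the checkpoint is treated as raw bytes. The class name and the command-line arguments are hypothetical, and this has not been run against a real checkpoint:

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class CheckpointByteFilter {

    // Copies in to out, dropping byte 0xFB and turning byte 0x96 into 0x2D ("-").
    static void filter(InputStream in, OutputStream out) throws IOException {
        int b;
        while ((b = in.read()) != -1) {
            if (b == 0xFB) {
                continue;                     // same effect as s/\xfb//g
            }
            out.write(b == 0x96 ? 0x2D : b);  // same effect as s/\x96/\x2d/g
        }
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        // args[0]: original checkpoint, args[1]: corrected copy (both hypothetical paths).
        try (InputStream in = new BufferedInputStream(new FileInputStream(args[0]));
             OutputStream out = new BufferedOutputStream(new FileOutputStream(args[1]))) {
            filter(in, out);
        }
    }
}
```

Like the Perl one-liners, this writes a corrected copy and leaves the original checkpoint untouched, so the result can be compared before it is replayed with p4d -jr.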

Regex to help parse Perforce specs and logs:

.*@$//.*$@.*

\1 db file
\2 workspace spec
\3 perforce spec

$//.*$@+.*@+.*

$//.*$\#.*

Turn depot paths from the Perforce maintenance log into mv commands that rename the lowercased file back to its original-case name:

//$.*$/$.*$\#[0-9].*
\1 directory
\2 filename
mv /path/to/perforce/\1/\L\2 /path/to/perforce/\1/\e\2
filenameMap.put("/path/to/perforce/\1/\e\2", "/path/to/perforce/\1/\L\2");

To also capture the file type (e.g. text or binary+l):
//$.*$/$.*$\#[0-9].*($.*$).*
\1 directory
\2 filename
\3 type
filenameList.add(new FileRenameGroup("/path/to/perforce/\1/\e\2", "/path/to/perforce/\1/\L\2", "\3"));
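
For comparison, the same capture can be sketched with java.util.regex. This assumes a log line of the form //directory/filename#revision, optionally followed by a file type in parentheses. That line format, the class name, and the sample path are assumptions for illustration, not actual Perforce output:

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DepotLineParser {

    // Group 1: directory, group 2: filename, group 3: optional type such as text or binary+l.
    private static final Pattern DEPOT_LINE =
            Pattern.compile("^//(.*)/([^/#]+)#[0-9]+[^(]*(?:\\(([^)]*)\\))?.*$");

    // Returns {original path, lowercased path, type (or null)}, or null if the line does not match.
    static String[] parse(String line, String root) {
        Matcher m = DEPOT_LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        String directory = m.group(1);
        String filename = m.group(2);
        return new String[] {
            root + "/" + directory + "/" + filename,
            root + "/" + directory + "/" + filename.toLowerCase(Locale.ROOT),
            m.group(3)
        };
    }

    public static void main(String[] args) {
        // Hypothetical log line; prints an mv command from the lowercase name to the original name.
        String[] parts = parse("//depot/project/ReadMe.TXT#3 (binary+l)", "/path/to/perforce");
        System.out.println("mv " + parts[1] + " " + parts[0]);
    }
}
```

As in the editor regexes, only the filename is lowercased and the directory part is left alone.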

A Java utility program to check the files to be renamed, based on the regex above. As posted, it only counts which of the mapped files exist:

import java.io.File;
import java.util.HashMap;
import java.util.Map;

public class PerforcemaintRenameFiles {
    public static void main(String[] args) {
        // Key: correct-case filename, value: lowercase filename.
        // Entries are generated with the regex above; this one is a placeholder.
        Map<String, String> filenameMap = new HashMap<String, String>();
        filenameMap.put(
                "/absolute/path/to/filename/containing/bad/character/file.bad,filename",
                "/absolute/path/to/filename/containing/bad/character/file.correct.filename");

        int filenameCorrectionsCount = 0;
        int filenameTotal = 0;
        int filenameLowercaseCount = 0;

        for (Map.Entry<String, String> entry : filenameMap.entrySet()) {
            String filenameCorrectCase = entry.getKey();
            String filenameWrongLowercase = entry.getValue();

            // The entries checked on disk carry a ",d" suffix.
            File correctCaseFile = new File(filenameCorrectCase + ",d");
            File wrongLowercaseFile = new File(filenameWrongLowercase + ",d");

            if (correctCaseFile.exists()) {
                filenameCorrectionsCount++;
            }

            if (wrongLowercaseFile.exists()) {
                filenameLowercaseCount++;
            } else {
                System.out.println("File: " + wrongLowercaseFile.getAbsolutePath() + " does not exist.");
            }
            filenameTotal++;
        }

        // Print the summary, one count per line.
        System.out.println("There were " + filenameTotal + " total files.");
        System.out.println(filenameCorrectionsCount + " correct files exist.");
        System.out.println(filenameLowercaseCount + " lower case files exist.");
    }
}
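
The program above only counts files. A hedged sketch of the rename step itself, using java.nio.file (Java 7 and later). The class and method names are hypothetical, the ",d" suffix is carried over from the code above, and the rename runs from the lowercase name to the correct-case name, as in the mv command earlier:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;

class ArchiveRenamer {

    // For each entry (key: correct-case name, value: lowercase name), moves "<lowercase>,d"
    // to "<correct>,d" when the source exists and the target does not. Returns the number moved.
    static int renameAll(Map<String, String> filenameMap) throws IOException {
        int renamed = 0;
        for (Map.Entry<String, String> entry : filenameMap.entrySet()) {
            Path target = Paths.get(entry.getKey() + ",d");
            Path source = Paths.get(entry.getValue() + ",d");
            if (Files.exists(source) && Files.notExists(target)) {
                Files.move(source, target);
                renamed++;
            } else {
                System.out.println("Skipped: " + source);
            }
        }
        return renamed;
    }
}
```

On a case-insensitive file system the two names refer to the same entry, so this sketch would skip them. It is only meant for a case-sensitive file system, and only on a copy of the depot files.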

Wednesday, May 21, 2008

Full Speed Ahead: New Developer Machine

I've recently bought and built a new high-performance machine. Previously I did all of my development on my Inspiron 9300 with 2GB of DDR2, but with my recent work on application servers and development environments it has become too slow to be productive on. Hopefully this new machine will prove to be worth the investment.

I gave myself a budget of approximately $1000. The original intent wasn't a gaming machine, although that's what the specs look like. I researched many sources, such as articles, forums, and benchmarks, to find the best products for my requirements as a multitasking developer who may be running application servers, running tests, and developing at the same time.

Specifications
Motherboard: Gigabyte GA-P35-DS3L
Processor: Intel Core 2 Quad Q9450 2.66GHz 12MB Cache 1333MHz FSB
Hard Drive: Western Digital Raptor X 150GB 10,000 RPM
Memory: 4GB (2x2GB) XMS2 Corsair 800MHz
Monitor: 24" Samsung 245BW
Video Card: EVGA 8600GT 256MB DDR3
Case: Antec Three Hundred
PSU: Thermaltake Purepower 500W
Keyboard: Microsoft Natural 4000 Ergonomics Keyboard
CD/DVD: Samsung 20X DVD Burner with LightScribe

I'm pretty sure these look like specs for a gaming machine, but that's not the intent. The primary intent was to let me be more efficient and productive when working. It's in preparation for some work on a startup idea. I was originally going to get the Q9300 processor, but decided to upgrade a little. The Q9300 would have cost me $279, but the Q9450 was only $299. The difference is that the Q9450 has a 12MB L2 cache vs. 6MB and runs at 2.66GHz vs. 2.5GHz. If I decide to optimize further, it's an easier road ahead of me because the Q9450 overclocks more easily.

Future Options
I'm usually very interested in squeezing every bit out of my machines (e.g. overclocking). I'd like to overclock my processor to at least 3GHz with the current RAM. I was originally going to buy 4GB (2x2GB) of Corsair Dominator 1066MHz memory, but decided to go with the cheaper option shown above.

Balance
The tough problem is balancing the time I spend building, overclocking, and optimizing my machine against productive time on other things such as development, requirements planning, and similar activities.

A First Look
I'll post the rest, but these are a start.

Full speed ahead!

Wednesday, May 14, 2008

Load Testing Tools: JMeter vs. The Grinder

This is an attempt at reviewing load testing tools in an evaluation format (JMeter vs. The Grinder). At my organization we have the following requirements for a load testing tool.

Load test an application (e.g. HTTP) with approx. 2000 concurrent users with approx. 500 transactions/sec.
Parameterize URL for session id (e.g. Modifying URL parameters in the request based on previous test data)
Measure performance of logical page transactions/sets of requests. This could count an entire page load as a single transaction. The entire page load may have html, javascript, css, and images but the page should be regarded as a single unit.
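
As a rough sanity check on the first requirement, the two numbers imply how often each simulated user has to act. A small sketch of that arithmetic, assuming the load is spread evenly across users (the class and method names are illustrative):

```java
class LoadTargets {

    // Seconds between transactions for one user if the total rate is shared evenly.
    static double secondsBetweenTransactions(int concurrentUsers, double transactionsPerSecond) {
        return concurrentUsers / transactionsPerSecond;
    }

    public static void main(String[] args) {
        // The approximate targets from the first requirement.
        double pacing = secondsBetweenTransactions(2000, 500.0);
        System.out.println("Each user starts one transaction every " + pacing + " seconds");
    }
}
```

With the approximate targets above, each user starts one transaction about every 4 seconds. If a transaction is a whole page load, the underlying request rate is higher by however many requests a page makes.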

For the two tools I've researched and used I will focus on the following:

Usability -- How usable the tool is (e.g. Functional GUI, fulfills goals specified above). It is important to reflect upon the target audience when looking at usability. Is it functional for a tester, QA lead, or developer?
Extensibility -- How extensible the tool is (e.g. Does it offer scripting support?).
Experience -- My experience with the tool and any comments while trying to offer an unbiased point of view.

For this article I am using JMeter 2.3.1 and The Grinder 3.0.1.

Note: I am by no means an expert in either tool. This is simply a limited review of both tools. Please correct me if I am wrong in my statements.

JMeter
JMeter is a load testing tool for testing functional behavior and measuring performance. A user interface is provided to record, create, and run test scenarios.

Usability
The interface is very easy to use. However, the included components can make creating a test complicated. The tool was initially easy to figure out, which I believe is a product of its decent tutorial. Setting up a proxy is easy enough, but it isn't enabled out of the box. It is very usable from a non-technical perspective. I trust that if the tool were given to a QA team, the documentation would be enough to start performance/load testing.

Extensibility
Many components are provided for building a test, and they cover the standard tasks. However, it isn't always easy to understand how a component works, or to tell whether it will behave as you expect.

Experience
In my limited experience, JMeter is not viable for complex load testing solutions. The documentation is plentiful, but not very helpful. The tool is extensible only up to a point, beyond which its underlying design gets in the way. Given the above requirements, JMeter isn't extensible enough. If your needs for a load testing tool involve less customization, it should work just fine.

The Grinder
The Grinder is a tool designed to work in a distributed agent environment for load testing an application from many machines. It allows for easy scripting and customization of test scripts using Jython. It is very easy to load test and define logical transactions by redefining the way your data is recorded so it can then be analyzed.
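
The idea of a logical transaction can be sketched independently of either tool. The following Java example is not The Grinder's API (its scripts are written in Jython). It is a hypothetical illustration in which a page load is a list of steps and only the total elapsed time is recorded:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LogicalTransaction {

    private final String name;
    private final List<Runnable> steps;

    LogicalTransaction(String name, List<Runnable> steps) {
        this.name = name;
        this.steps = new ArrayList<Runnable>(steps);
    }

    // Runs every step in order and returns one elapsed time, in nanoseconds, for the whole group.
    long run() {
        long start = System.nanoTime();
        for (Runnable step : steps) {
            step.run();
        }
        return System.nanoTime() - start;
    }

    String name() {
        return name;
    }

    public static void main(String[] args) {
        // Placeholder steps; a real test would issue the HTML, JavaScript, CSS, and image requests.
        Runnable fetchHtml = () -> System.out.println("GET page.html");
        Runnable fetchCss = () -> System.out.println("GET style.css");
        LogicalTransaction page = new LogicalTransaction("page load", Arrays.asList(fetchHtml, fetchCss));
        long nanos = page.run();
        System.out.println(page.name() + " took " + nanos + " ns as a single transaction");
    }
}
```

Recording one figure per group is what lets an entire page load count as a single unit, as in the third requirement.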

Usability
The Grinder is seemingly aimed at the developer. It provides a usable interface that provides various metrics for collecting samples and looking at result data. The metrics that are given aren't too advanced, but it provides facilities for data to be exported into a higher quality analysis tool.

Extensibility
This is an area where The Grinder excels. It is very easy to customize the test scripts in Jython. This even includes importing other Jython scripts for other types of addons (e.g. for randomizing logins for multiple users. See below.)

Experience
In my limited experience, The Grinder has been an easy tool to work with. A strength is that it is very extensible, and scripts can be written free-form, as noted above. There are scripts available for various kinds of functionality. When implementing login for multiple users, it was very easy to adapt the existing recorded and modified script. If using Jython is a problem, then the tool is probably not the best choice.

Conclusion
After looking at both applications, I chose The Grinder as the tool to go with, given my team's requirements for a load testing tool. If your requirements are different and don't call for customization, then JMeter would be just as good.

Note: The Grinder has been in use and has so far been working great.

References

http://blackanvil.blogspot.com/2006/06/shootout-load-runner-vs-grinder-vs.html

http://blackanvil.blogspot.com/2006/11/grinder-addressing-warts.html