To carry out my idea, I need a language checker, and I happened to know that OpenOffice.org has one: LanguageTool.

Today, I tested JPype to get a Java VM and a Java package running from Python. The original Java example is:


JLanguageTool langTool = new JLanguageTool(Language.ENGLISH);
langTool.activateDefaultPatternRules();
List<RuleMatch> matches = langTool.check("A sentence " +
    "with a error in the Hitchhiker's Guide tot he Galaxy");
for (RuleMatch match : matches) {
  System.out.println("Potential error at line " +
      match.getEndLine() + ", column " +
      match.getColumn() + ": " + match.getMessage());
  System.out.println("Suggested correction: " +
      match.getSuggestedReplacements());
}

With help from JPype, the equivalent Python code is:


#!/usr/bin/python
# Simple proof test of using Java package in Python

from jpype import JPackage, startJVM, shutdownJVM


startJVM("/opt/sun-jdk-1.6.0.13/jre/lib/amd64/server/libjvm.so",
"-Djava.class.path=LanguageTool.jar")
#startJVM("/path/to/libjvm.so", "-Djava.class.path=/path/to/LanguageTool.jar")

LT = JPackage('de').danielnaber.languagetool
langTool = LT.JLanguageTool(LT.Language.ENGLISH)
langTool.activateDefaultPatternRules()
matches = langTool.check("A sentence with a error " +
                         "in the Hitchhiker's Guide tot he Galaxy")
for match in matches:
    print "Potential error at line ", match.getEndLine(),\
        ", column ", match.getColumn(), ": ", match.getMessage()
    print "Suggested correction: ", match.getSuggestedReplacements()

shutdownJVM()

If you want to run it yourself, you need to set two paths. The first is the path to the Java VM library; the one in the example is for Gentoo. On Linux, you can locate it with find / -name libjvm.so. The second is the path to LanguageTool.jar, which you can find inside the ZIP (oxt) package [1].
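
If you would rather do that search from Python, here is a minimal sketch using only the standard library. The function name find_libjvm and the starting directory in the usage comment are my own choices for illustration, not part of JPype or LanguageTool.

```python
import os


def find_libjvm(root, filename="libjvm.so"):
    """Return a sorted list of paths under root whose file name is filename."""
    hits = []
    # os.walk silently skips directories it is not allowed to read.
    for dirpath, _dirnames, filenames in os.walk(root):
        if filename in filenames:
            hits.append(os.path.join(dirpath, filename))
    return sorted(hits)


# Hypothetical usage: "/opt" is only a guess based on the Gentoo path in the
# example; try "/usr/lib/jvm" or "/" if Java is installed elsewhere.
# for path in find_libjvm("/opt"):
#     print(path)
```

A system may have several JVMs installed, so the function returns every match and leaves the choice to you.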

Actually, I don’t have to load the package at all: LanguageTool also supports a web server mode. But I think I prefer to load it if I am going to use it. I know there is also Jython, a Python implementation that runs on the Java VM, but I have never touched it.

As for the results, they look good:


Potential error at line 0 , column 16 : Use <suggestion>an</suggestion> instead of 'a' if the following word starts with a vowel sound, e.g. 'an article', 'an hour'
Suggested correction: [an]
Potential error at line 0 , column 50 : Did you mean <suggestion>to the</suggestion>?
Suggested correction: [to the]
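
The messages carry <suggestion> tags, which are a bit noisy on a terminal. Below is a small sketch for cleaning them up with a regular expression. The helper names plain_message and suggestions_in are made up for this example, and both work on a plain Python string, so convert the value with str() first if your JPype version hands back a Java string object.

```python
import re

# Matches the opening or closing <suggestion> tag seen in the output above.
_TAG = re.compile(r"</?suggestion>")
# Captures the text between a pair of <suggestion> tags.
_SUGGESTION = re.compile(r"<suggestion>(.*?)</suggestion>")


def plain_message(message):
    """Return the message with the <suggestion> tags removed."""
    return _TAG.sub("", message)


def suggestions_in(message):
    """Return the list of suggestions embedded in the message."""
    return _SUGGESTION.findall(message)


print(plain_message("Did you mean <suggestion>to the</suggestion>?"))
```

This only handles the <suggestion> tag seen in the two messages above; I have not checked whether other rules emit different markup.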

However, the processing time is quite long. On my computer (Core 2 Duo, 1.83 GHz), it took around 300 ms to 500 ms.
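
To see where the time goes, it may help to time each call separately, since the first call could include one-time setup cost. Here is a rough sketch with the standard timeit module; time_calls is a hypothetical helper, and it is not necessarily how I got the numbers above.

```python
from timeit import default_timer


def time_calls(func, repeat=5):
    """Call func() repeat times; return each call's duration in milliseconds."""
    durations = []
    for _ in range(repeat):
        start = default_timer()
        func()
        durations.append((default_timer() - start) * 1000.0)
    return durations


# Hypothetical usage with the langTool object from the example:
# for ms in time_calls(lambda: langTool.check("A sentence with a error")):
#     print("%.1f ms" % ms)
```

If the later calls turn out much faster than the first, the cost is mostly warm-up rather than the check itself.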

I have not looked for other checkers yet. Maybe I will find one, perhaps even one written in Python?

[1] http://sourceforge.net/project/showfiles.php?group_id=27298&g=1 is gone.