While looking for a way to carry out my idea, I remembered that OpenOffice.org also has a language checker, LanguageTool.
Today, I tested JPype to get the Java VM and the LanguageTool package running from Python. The original Java example is:
JLanguageTool langTool = new JLanguageTool(Language.ENGLISH);
langTool.activateDefaultPatternRules();
List<RuleMatch> matches = langTool.check("A sentence " +
    "with a error in the Hitchhiker's Guide tot he Galaxy");
for (RuleMatch match : matches) {
  System.out.println("Potential error at line " +
      match.getEndLine() + ", column " +
      match.getColumn() + ": " + match.getMessage());
  System.out.println("Suggested correction: " +
      match.getSuggestedReplacements());
}
With the help of JPype, the equivalent code is:
#!/usr/bin/python
# Simple proof-of-concept test of using a Java package in Python
from jpype import JPackage, startJVM, shutdownJVM
startJVM("/opt/sun-jdk-1.6.0.13/jre/lib/amd64/server/libjvm.so",
         "-Djava.class.path=LanguageTool.jar")
#startJVM("/path/to/libjvm.so", "-Djava.class.path=/path/to/LanguageTool.jar")
LT = JPackage('de').danielnaber.languagetool
langTool = LT.JLanguageTool(LT.Language.ENGLISH)
langTool.activateDefaultPatternRules()
matches = langTool.check("A sentence with a error " +
                         "in the Hitchhiker's Guide tot he Galaxy")
for match in matches:
    print "Potential error at line ", match.getEndLine(),\
          ", column ", match.getColumn(), ": ", match.getMessage()
    print "Suggested correction: ", match.getSuggestedReplacements()
shutdownJVM()
If you want to run it yourself, you need to set two paths. The first is the path to the Java VM; the one in the example is for Gentoo. On Linux, you can locate it with find / -name libjvm.so. The second is the path to LanguageTool.jar, which you can find in the ZIP (oxt) [1].
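If you would rather do that search from Python than from the shell, here is a minimal sketch using only the standard library. The function name find_libjvm and the "/opt" starting directory are my own illustrative choices, not part of JPype or LanguageTool. JPype also provides getDefaultJVMPath(), which may make the search unnecessary on some systems.

```python
import os


def find_libjvm(root, name="libjvm.so"):
    """Return the paths of every file called `name` found under `root`."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        if name in filenames:
            hits.append(os.path.join(dirpath, name))
    return sorted(hits)


if __name__ == "__main__":
    # "/opt" is only an example starting point; "/" also works but is slow.
    for path in find_libjvm("/opt"):
        print(path)
```

Any of the printed paths can then be passed as the first argument to startJVM.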
Actually, I don't have to load the package at all, since LanguageTool also supports a web server mode. But I think I would prefer to load it directly if I am going to use it. I know there is also Jython, a Python implementation that runs on the JVM, but I have never touched it.
As for the results, they look good:
Potential error at line 0 , column 16 : Use <suggestion>an</suggestion> instead of 'a' if the following word starts with a vowel sound, e.g. 'an article', 'an hour'
Suggested correction: [an]
Potential error at line 0 , column 50 : Did you mean <suggestion>to the</suggestion>?
Suggested correction: [to the]
However, the processing time is quite long. On my computer (Core 2 Duo, 1.83 GHz), it took around 300 ms to 500 ms.
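To see how much of that time is the check itself rather than JVM startup, the call can be timed on its own. This is only a sketch: time_call is a hypothetical helper, and fake_check is a stand-in so the snippet runs without a JVM. In the real script you would pass langTool.check instead.

```python
from timeit import default_timer


def time_call(func, arg, repeat=5):
    """Call func(arg) `repeat` times; return each duration in milliseconds."""
    durations = []
    for _ in range(repeat):
        start = default_timer()
        func(arg)
        durations.append((default_timer() - start) * 1000.0)
    return durations


def fake_check(text):
    # Hypothetical stand-in for langTool.check, so no JVM is needed here.
    return [word for word in text.split() if word == "a"]


timings = time_call(fake_check, "A sentence with a error", repeat=3)
print("min %.3f ms, max %.3f ms" % (min(timings), max(timings)))
```

Repeating the call also shows whether the first check is slower than later ones.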
I have not looked for other checkers yet. Maybe I will find one, perhaps even one written in Python?
[1] http://sourceforge.net/project/showfiles.php?group_id=27298&g=1 is gone.