Does anyone know of any CL libraries for dealing with Microsoft Word files? Tools for creating them, reading from them, parsing them, converting them to plain text or other formats, things like that?
Thanks,
-Mark
Mark,
I don't know of any Common Lisp libraries, but the Apache Software Foundation has a Java library for this, Apache POI (http://poi.apache.org/). I used it several years ago and it worked well, though I was not reading .doc files.
Hope this helps,
Chris
On 2/28/11 8:46 PM, Mark H. David wrote:
> Does anyone know of any CL libraries for dealing with Microsoft Word files? Tools for creating them, reading from them, parsing them, converting them to plain text or other formats, things like that?
> Thanks,
> -Mark
pro mailing list pro@common-lisp.net http://common-lisp.net/cgi-bin/mailman/listinfo/pro
On Mon, 28 Feb 2011, Mark H. David wrote:
> Does anyone know of any CL libraries for dealing with Microsoft Word files? Tools for creating them, reading from them, parsing them, converting them to plain text or other formats, things like that?
I suspect that RDNZL might provide the best results. You can use it to hook into the beast itself.
Your other approach is to hook into the code for another office suite such as Open/LibreOffice, AbiWord, or KWord.
In addition to Apache POI, there is also wvWare, but it doesn't support the new XML formats...
Right when the libraries were becoming good at doc, MS went and changed formats. Funny coincidence, that.
Later, Daniel
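For the newer XML-based format mentioned above, a .docx file is a ZIP package whose main body lives in word/document.xml, so basic text extraction needs no Word-specific library. The following is a minimal Python sketch of that idea, not a CL library and not something proposed in this thread. The function name docx_to_text is made up for illustration, and the sketch ignores headers, footnotes, tabs, and explicit line breaks.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml in .docx packages.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_to_text(source):
    """Return the plain text of a .docx file, one line per paragraph.

    `source` is a path or a binary file-like object. (Hypothetical helper.)
    """
    with zipfile.ZipFile(source) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    paragraphs = []
    # Each <w:p> is a paragraph; its visible text sits in <w:t> elements.
    for para in root.iter(f"{{{W_NS}}}p"):
        runs = [node.text or "" for node in para.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)
```

Since only a ZIP reader and an XML parser are involved, the same approach should be reproducible in CL with whatever ZIP and XML libraries are at hand.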
A few years back I used the standalone 'antiword' binary to convert .doc files to plaintext. It seemed to work pretty well.
-Shaneal
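A minimal sketch of the antiword approach, in Python for illustration: it shells out to the binary and captures standard output. The helper names are hypothetical, antiword is assumed to be installed and on PATH, and the UTF-8 decoding is an assumption, since antiword's output encoding depends on its character mapping.

```python
import shutil
import subprocess


def antiword_command(doc_path, executable="antiword"):
    """Build the argument list for dumping one .doc file as text on stdout."""
    return [executable, str(doc_path)]


def doc_to_text(doc_path):
    """Run antiword on `doc_path` and return its plain-text output."""
    if shutil.which("antiword") is None:
        raise FileNotFoundError("antiword not found on PATH")
    result = subprocess.run(
        antiword_command(doc_path),
        capture_output=True,
        check=True,  # raise CalledProcessError if antiword reports failure
    )
    # Assumption: the output is UTF-8; undecodable bytes are replaced.
    return result.stdout.decode("utf-8", errors="replace")
```

The same pattern (spawn the converter, read its standard output) should carry over to CL through any run-program facility.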
On Mon, Feb 28, 2011 at 5:20 PM, Daniel Herring dherring@tentpost.com wrote:
> On Mon, 28 Feb 2011, Mark H. David wrote:
> > Does anyone know of any CL libraries for dealing with Microsoft Word files? Tools for creating them, reading from them, parsing them, converting them to plain text or other formats, things like that?
> I suspect that RDNZL might provide the best results. You can use it to hook into the beast itself.
> Your other approach is to hook into the code for another office suite such as Open/LibreOffice, AbiWord, or KWord.
> In addition to Apache POI, there is also wvWare, but it doesn't support the new XML formats...
> Right when the libraries were becoming good at doc, MS went and changed formats. Funny coincidence, that.
> Later, Daniel
I have some code for parsing and generating RTF that I used when doing my books. It worked pretty well but is nowhere near polished or well packaged.
-Peter
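This is not Peter's code, only a rough illustration of why generating RTF is tractable: an RTF file is plain text with a small header, three characters that need escaping (backslash and the two braces), and \uN escapes for non-ASCII characters. The Python sketch below works under those assumptions; rtf_escape and make_rtf are hypothetical names, and the font table is an arbitrary choice.

```python
def rtf_escape(text):
    """Escape `text` for use inside an RTF document body."""
    out = []
    for ch in text:
        if ch in "\\{}":
            out.append("\\" + ch)       # RTF syntax characters
        elif ch == "\n":
            out.append("\\par\n")       # newline becomes a paragraph break
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # \uN takes a signed 16-bit UTF-16 code unit, followed by one
            # fallback character ("?") for readers that ignore \u.
            units = ch.encode("utf-16-le")
            for i in range(0, len(units), 2):
                unit = int.from_bytes(units[i:i + 2], "little")
                if unit > 32767:
                    unit -= 65536
                out.append(f"\\u{unit}?")
    return "".join(out)


def make_rtf(text):
    """Wrap escaped `text` in a minimal single-font RTF document."""
    header = "{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0 Times New Roman;}}\\f0\\fs24 "
    return header + rtf_escape(text) + "}"
```

Writing the result of make_rtf to a file with an .rtf extension should give something Word can open, which covers the "creating them" part of the question without touching the binary .doc format.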
On Mon, Feb 28, 2011 at 3:46 PM, Mark H. David mhd@yv.org wrote:
> Does anyone know of any CL libraries for dealing with Microsoft Word files? Tools for creating them, reading from them, parsing them, converting them to plain text or other formats, things like that?
> Thanks,
> -Mark
On Mon, 28 Feb 2011 18:46:54 -0500 "Mark H. David" mhd@yv.org wrote:
> Does anyone know of any CL libraries for dealing with Microsoft Word files? Tools for creating them, reading from them, parsing them, converting them to plain text or other formats, things like that?
Unfortunately also not CL, but other utilities to look into would be antiword and OdfConverter (which also claims to support OOXML).
On Tue, Mar 1, 2011 at 12:46 AM, Mark H. David mhd@yv.org wrote:
> Does anyone know of any CL libraries for dealing with Microsoft Word files? Tools for creating them, reading from them, parsing them, converting them to plain text or other formats, things like that?
> Thanks,
> -Mark
I know some time ago Knut Olav Bøhmer was fiddling with OpenOffice.org and ABCL. I don't know if that ever resulted in a library suiting your purpose.
hth, Alessio