[pro] Common Lisp Library for MS Word files (.doc or .docx)?
Does anyone know of any CL libraries for dealing with Microsoft Word files? Tools for creating them, reading from them, parsing them, converting them to plain text or other formats, things like that? Thanks, -Mark
Mark,
I don't know of any Common Lisp libraries, but the Apache Foundation has a Java library for that, Apache POI: http://poi.apache.org/
I used it several years ago and it worked well, though I was not reading .doc files.
Hope this helps,
Chris
_______________________________________________ pro mailing list pro@common-lisp.net http://common-lisp.net/cgi-bin/mailman/listinfo/pro
On Mon, 28 Feb 2011, Mark H. David wrote:
Does anyone know of any CL libraries for dealing with Microsoft Word files? Tools for creating them, reading from them, parsing them, converting them to plain text or other formats, things like that?
I suspect that RDNZL might provide the best results. You can use it to hook into the beast itself.
Your other approach is to hook into the code for another office suite such as Open/LibreOffice, AbiWord, or KWord.
In addition to Apache POI, there is also wvWare, but it doesn't support the new XML formats...
Right when the libraries were becoming good at doc, MS went and changed formats. Funny coincidence, that.
Later,
Daniel
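For readers wondering what the "new XML formats" involve: a .docx file is a ZIP archive whose main body lives in `word/document.xml`. The following is only a minimal sketch in Python (not Common Lisp, and not something proposed in this thread) of pulling plain text out of such a file with the standard library. The function name `docx_to_text` is made up for illustration, and the sketch ignores headers, footnotes, tables as tables, and all formatting.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by elements in word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_to_text(source):
    """Return the text of a .docx file, one line per paragraph.

    `source` may be a file path or a binary file-like object.
    Hypothetical helper for illustration only.
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is split across one or more <w:t> elements.
        runs = [node.text or "" for node in para.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)
```

The same two steps (unzip, then walk the XML for `w:p` and `w:t` elements) should carry over to any language with ZIP and XML libraries, which is presumably where a Common Lisp implementation would start.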
A few years back I used the standalone 'antiword' binary to convert .doc files to plaintext. It seemed to work pretty well.
-Shaneal
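The antiword suggestion above amounts to shelling out to an external converter and capturing what it prints. Here is a rough sketch of that pattern in Python, assuming `antiword` is installed and on the PATH and that it writes the converted text to standard output when given a file name. The wrapper name `doc_to_text` and its `converter` parameter are hypothetical, chosen only for this example.

```python
import shutil
import subprocess


def doc_to_text(path, converter="antiword"):
    """Run an external converter on `path` and return its standard output as text.

    Hypothetical wrapper: `converter` is any program that takes a file name
    as its only argument and prints plain text to standard output.
    """
    if shutil.which(converter) is None:
        raise FileNotFoundError(f"{converter!r} not found on PATH")
    result = subprocess.run(
        [converter, path],
        capture_output=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout.decode("utf-8", errors="replace")
```

The equivalent from Common Lisp would be whatever run-program facility the implementation or a portability layer provides. The trade-off is a dependency on an external binary in exchange for not having to parse the .doc format at all.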
I have some code that I used when doing my books for parsing and generating RTF. Worked pretty well but is nowhere near polished or well packaged.
-Peter
-- Peter Seibel http://www.codequarterly.com/
On Mon, 28 Feb 2011 18:46:54 -0500 "Mark H. David" <mhd@yv.org> wrote:
Does anyone know of any CL libraries for dealing with Microsoft Word files? Tools for creating them, reading from them, parsing them, converting them to plain text or other formats, things like that?
Unfortunately also not CL, but other utilities to look into would be antiword and OdfConverter (which also claims to support OOXML). -- Matt
On Tue, Mar 1, 2011 at 12:46 AM, Mark H. David <mhd@yv.org> wrote:
Does anyone know of any CL libraries for dealing with Microsoft Word files? Tools for creating them, reading from them, parsing them, converting them to plain text or other formats, things like that? Thanks, -Mark
I know some time ago Knut Olav Bøhmer was fiddling with OpenOffice.org and ABCL. I don't know if that ever resulted in a library suiting your purpose. hth, Alessio
participants (7)
- Alessio Stalla
- Chris Perkins
- Daniel Herring
- Mark H. David
- Matthew Mondor
- Peter Seibel
- Shaneal Manek