catdoc ported to Windows
15 September 2009, by Ben
Recently I had to automatically extract text from a bunch of Word documents under Windows. I liked the looks of catdoc, but didn’t see a native Win32 port around. The source code looked so very close to compiling under MinGW, so I made the few minor changes necessary and got it working (catdoc, catppt, and xls2csv). Native Win32 executables, support for long filenames, etc.
Basically all I did was:
- Add a glob function from the BSD-licensed unixem library.
- Change a few of the `#ifdef __MSDOS__` to `#if defined(__MSDOS__) || defined(_WIN32)`.
- Make one or two other minor changes to `fileutil.c`, including the `exe_dir()` function.
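To illustrate the second and third points, here is a minimal sketch of the widened preprocessor guard applied to a directory-lookup helper. This is not the actual catdoc code: `DIR_SEP` and `dir_of()` are hypothetical names made up for this example, and the real `exe_dir()` in `fileutil.c` may work quite differently.

```c
#include <stddef.h>
#include <string.h>

/* A guard that tests only __MSDOS__ is skipped by MinGW, which defines
 * _WIN32 instead. Widening the condition covers both toolchains. */
#if defined(__MSDOS__) || defined(_WIN32)
#define DIR_SEP '\\'   /* hypothetical macro name, for illustration only */
#else
#define DIR_SEP '/'
#endif

/* Hypothetical helper (not catdoc's exe_dir()): copy the directory part
 * of `path` into `buf`. Returns 0 on success, -1 if `buf` is too small.
 * A path with no separator yields an empty string. */
int dir_of(const char *path, char *buf, size_t buflen)
{
    const char *last = strrchr(path, DIR_SEP);
    size_t len;

#if defined(__MSDOS__) || defined(_WIN32)
    /* Windows accepts forward slashes too, so use whichever comes last. */
    const char *alt = strrchr(path, '/');
    if (alt != NULL && (last == NULL || alt > last))
        last = alt;
#endif

    len = (last == NULL) ? 0 : (size_t)(last - path);
    if (len + 1 > buflen)
        return -1;

    memcpy(buf, path, len);
    buf[len] = '\0';
    return 0;
}
```

With a guard like this, the same source builds under DOS compilers, MinGW, and Unix-like systems without a separate Windows code path.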
Nothing special, and it’s not perfect. But here is a zip of the compiled binaries and (GPL-licensed) source code, just for you:
3 comments (oldest first)
Excellent! You really fixed some bugs in the 16-bit catdoc. I have two questions: 1) What happens with docx files? 2) I have some problems with Spanish; what do I need to do? Thanks
Hi Ernesto, I don’t think catdoc supports docx at all, so you’ll have to use another tool for that. Unfortunately I can’t help you with the Spanish issue. -Ben
you are awesome dude, this totally saved me a huge headache, as C++ is my weakest language. thank you!!