I'm tasked with converting a series of tables from .doc and .docx files to .xls, but I have not managed to find an efficient way to do this. The tables may sit in between other text.
I have looked at pywin32, xlwt, and a couple of other libraries, but it seems I have to go through a lot of steps.
Any tips for converting tables from a *.doc/*.docx file to an *.xls file?
I'm assuming you have too many documents to copy/paste, and that you seek a pragmatic solution for internal use. A solution that:
- opens each file in Word in batch mode
- saves the file as HTML, using the .xls extension
- optionally, you write a little script to cut everything outside the <table> tags in the HTML
- the HTML file will open in Excel by default, and you click away the warning.
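The "cut everything outside the tags" step could look roughly like the sketch below. It is only an illustration: the function name extract_tables is made up, it assumes the tables are not nested, and it assumes a regular expression is good enough for the HTML that Word writes.

```python
import re

# Matches each <table ...>...</table> block; DOTALL lets "." span newlines.
# Assumption: tables are not nested (a nested table would be cut short).
TABLE_RE = re.compile(r"<table\b.*?</table>", re.IGNORECASE | re.DOTALL)


def extract_tables(html: str) -> str:
    """Return a minimal HTML document holding only the tables found in html."""
    tables = TABLE_RE.findall(html)
    body = "\n".join(tables)
    return "<html><body>\n" + body + "\n</body></html>"
```

Note that this drops the original head section too, including any charset declaration, so files with non-ASCII text may need extra care.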
Create a macro in Word such as this:
    Sub BatchSaveAs()
        ' Set output_dir appropriately
        ChangeFileOpenDirectory "output_dir"
        ' Strip the extension (.doc or .docx) and append .xls
        outDocName = Left(ActiveDocument.Name, InStrRev(ActiveDocument.Name, ".") - 1) & ".xls"
        ActiveDocument.SaveAs FileName:=outDocName, FileFormat:= _
            wdFormatFilteredHTML, LockComments:=False, Password:="", AddToRecentFiles _
            :=True, WritePassword:="", ReadOnlyRecommended:=False, EmbedTrueTypeFonts _
            :=False, SaveNativePictureFormat:=False, SaveFormsData:=False, _
            SaveAsAOCELetter:=False
        ActiveWindow.View.Type = wdWebView
        Application.Quit SaveChanges:=wdDoNotSaveChanges
    End Sub
Now you can run Word in batch mode through a script that calls it once for each input file:
    winword file_name /mBatchSaveAs
(You may need to use full path names.)
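The driver script can be in anything; here is a rough Python sketch. The function names are made up for illustration, and it assumes winword is on the PATH (otherwise pass the full path to winword.exe) and that no other Word instance is already open.

```python
import subprocess
from pathlib import Path


def build_command(winword: str, doc: Path, macro: str = "BatchSaveAs") -> list:
    """Command line that opens doc in Word and runs macro via the /m switch."""
    return [winword, str(doc), "/m" + macro]


def convert_all(input_dir: str, winword: str = "winword") -> None:
    """Run Word once per .doc/.docx file in input_dir, one after another."""
    for doc in sorted(Path(input_dir).iterdir()):
        if doc.name.startswith("~$"):
            continue  # skip the lock files Word leaves next to open documents
        if doc.suffix.lower() in (".doc", ".docx"):
            # subprocess.run waits until Word exits (the macro calls Application.Quit)
            subprocess.run(build_command(winword, doc.resolve()), check=True)
```

Calling convert_all with the folder that holds the documents then processes them one at a time, using full path names as suggested above.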
If the warning on opening the HTML / Excel files is not OK, write a little Python script to run Excel in batch mode. This shows how to run Excel from Python:
Python COM between Python and Excel
Some tricks I found useful: take care of clean-up; the code you need looks like VBA code, and if you're not good at VBA, record the macro you want and modify it into Python syntax.
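As a rough idea of what such a script might look like, here is a sketch using pywin32. The function names are made up, it assumes Windows with Excel installed, and 56 is Excel's XlFileFormat value for the binary .xls format (xlExcel8).

```python
import os
from pathlib import Path

XL_EXCEL8 = 56  # Excel's XlFileFormat value for the binary .xls format


def target_path(src: str, out_dir: str) -> Path:
    """Where the real .xls goes: same file name, inside out_dir."""
    return Path(out_dir) / Path(src).name


def resave_as_xls(src: str, out_dir: str) -> None:
    """Open an HTML-in-.xls file in Excel and save it as a true .xls workbook."""
    import win32com.client  # pywin32; Windows with Excel installed is assumed

    excel = win32com.client.Dispatch("Excel.Application")
    excel.Visible = False
    excel.DisplayAlerts = False  # ask Excel not to show prompts
    try:
        wb = excel.Workbooks.Open(os.path.abspath(src))
        wb.SaveAs(os.path.abspath(target_path(src, out_dir)), FileFormat=XL_EXCEL8)
        wb.Close(SaveChanges=False)
    finally:
        excel.Quit()  # clean-up: do not leave a hidden Excel process behind
```

Saving into a separate output folder avoids overwriting the HTML file that Excel currently has open.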