Friday, March 30, 2018

Python docx Highlight Matching Text in Microsoft Word Doc

"""Usage : python /home/you/script/this.py file.docx 'phrase to highlight'  # notice the quotes
"""

from docx import Document
from docx.enum.text import WD_COLOR_INDEX
import sys

source = sys.argv[1]
phrase = " ".join(sys.argv[2:]).strip("'")

doc = Document( source )

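for para in doc.paragraphs :
    # Only the first occurrence of the phrase in each paragraph is handled.
    start = para.text.find( phrase )
    if start > -1 :
        pre = para.text[:start]
        post = para.text[start+len(phrase):]
        # Assigning para.text replaces all existing runs with a single run.
        para.text = pre
        # Keep the run returned by add_run instead of indexing para.runs.
        run = para.add_run(phrase)
        run.font.highlight_color = WD_COLOR_INDEX.YELLOW
        para.add_run(post)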

doc.save( source )
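
The script above highlights only the first occurrence of one phrase per paragraph, and running it again for a second phrase reassigns para.text, which drops the earlier highlight. Below is a minimal sketch of one way around that, not a definitive fix: it splits the paragraph text once for all phrases, then rebuilds the runs in a single pass. The helper names split_on_phrases and highlight_paragraph are made up for this example, and, like the original script, it discards any existing character formatting in paragraphs it rewrites.

```python
import re


def split_on_phrases(text, phrases):
    """Split text into (segment, is_match) pairs that cover the whole string."""
    phrases = [p for p in phrases if p]
    if not phrases:
        return [(text, False)] if text else []
    # Longest phrases first, so the longer alternative wins when two overlap.
    ordered = sorted(phrases, key=len, reverse=True)
    pattern = "|".join(re.escape(p) for p in ordered)
    segments = []
    pos = 0
    for match in re.finditer(pattern, text):
        if match.start() > pos:
            segments.append((text[pos:match.start()], False))
        segments.append((match.group(), True))
        pos = match.end()
    if pos < len(text):
        segments.append((text[pos:], False))
    return segments


def highlight_paragraph(para, phrases):
    """Rebuild one python-docx paragraph, highlighting every match in yellow."""
    # Imported here so split_on_phrases works without python-docx installed.
    from docx.enum.text import WD_COLOR_INDEX

    segments = split_on_phrases(para.text, phrases)
    if not any(is_match for _, is_match in segments):
        return  # leave paragraphs without a match untouched
    para.clear()  # removes all runs; paragraph-level style is kept
    for segment, is_match in segments:
        run = para.add_run(segment)
        if is_match:
            run.font.highlight_color = WD_COLOR_INDEX.YELLOW
```

To use it in the script, replace the body of the loop with a single call such as highlight_paragraph(para, [phrase]), or pass several phrases in the list. Matching is case-sensitive, as in the original.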

3 comments:

Anonymous said...

Nice blog! Is your theme custom made or did you download it from somewhere?
A theme like yours with a few simple tweaks would really
make my blog stand out. Please let me know where you got your theme.
Thanks

Anonymous said...

Hi,
I tried using this code to highlight a word in my document. Strangely, it works only when I have one word to highlight and one instance of it in a paragraph. Could you suggest a method for highlighting words when I have two separate words that need to be highlighted within a paragraph, or two occurrences of the same word within one paragraph?

Akash Chauhan said...

This has a small issue: when you have multiple phrases in a single paragraph, only the last one is highlighted.
That is, the style information for the previous highlight is not preserved when we try to highlight multiple things in one paragraph.