I have a VBA-based tool that, for each row in an Excel sheet, opens a predefined Word document and performs a number of find-replace operations. This has been working well so far, but I am starting to feel the limitations of this approach.
The Word Find-Replace action only accepts strings of up to 255 characters, so I have to perform complex string-splitting shenanigans whenever I want to insert longer strings.
The performance leaves a lot to be desired and doesn't scale super well.
Is there a way to do this more efficiently?
I am aware that the core problem is working with the Word document and Word operations in the first place. When I do the same with TXT input/output, I can work directly with the string in memory, which is far faster. In Word, however, I need to worry about things like preserving formatting, so I can't just read the whole document into a variable. I understand that a Word document is basically a zipped package of XML files, but I have no idea how to get to the underlying XML or markup in a reasonable manner.
Edit: OK, so sharing the code even though it is not going to be very revealing:
This is part of the main function:
For row = 2 To rowLast
    'feed dictionary of dynamic variables
    Set dictionary = Get_Dictionary
    'open the template and suppress screen redraws while merging
    With appWD
        .Documents.Open Filename:=sPathTemplate
        .Activate
        .ScreenUpdating = False
    End With
    Call Merge(appWD, Opn, Cls, dictionary)
    With appWD.ActiveDocument
        .Save
        .Close
    End With
Next row
The called procedure is this:
Sub Merge(appWD As Object, sOpn As String, sCls As String, dictionary As Object)
    Dim Key As Variant
    Dim bRepeat As Boolean
    Dim iCounter As Long
    Do
        With appWD.Selection.Find
            .Wrap = 1 'wdFindContinue; set before .Execute so it applies to every search
            For Each Key In dictionary
                .Text = sOpn & Key & sCls
                .Execute ReplaceWith:=dictionary(Key), Replace:=2 'wdReplaceAll
            Next Key
            'search for closing symbol to confirm whether an additional run is necessary (.Execute returns True when Find is successful)
            .Text = sCls
            bRepeat = .Execute
        End With
        iCounter = iCounter + 1
        'give up after 50 passes to avoid an endless loop
        If iCounter = 50 Then
            MsgBox "Error in merging."
            End
        End If
    Loop While bRepeat = True
End Sub
To explain, I am using an opening and a closing delimiter to mark the sections to be replaced. The procedure loops through a dictionary of key-value pairs and searches for each key wrapped in the delimiters in order to replace it. It also repeats the whole pass for as long as a closing delimiter remains in the document, which gives a kind of recursion and allows for nested replacements:
[foo] -> "blah"
[blahbar] -> "hello world"
[[foo]bar] -> "hello world"
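To make the nested case concrete, here is a minimal sketch of the same idea applied to a plain string. It is an illustration only, not the tool's actual code: ResolveTokens and DemoResolveTokens are hypothetical names, the dictionary is assumed to be a late-bound Scripting.Dictionary, and the loop stops when a pass changes nothing rather than when no closing delimiter is found.

```vba
Option Explicit

' Hypothetical helper (not the tool's actual code): resolves delimited tokens
' in a plain string, repeating passes so nested tokens such as [[foo]bar] work.
Function ResolveTokens(ByVal sText As String, ByVal dict As Object, _
                       ByVal sOpn As String, ByVal sCls As String, _
                       Optional ByVal maxPasses As Long = 50) As String
    Dim vKey As Variant
    Dim sBefore As String
    Dim pass As Long

    For pass = 1 To maxPasses
        sBefore = sText
        ' One pass: replace every known token with its value.
        For Each vKey In dict.Keys
            sText = Replace(sText, sOpn & vKey & sCls, dict(vKey))
        Next vKey
        ' Stop as soon as a full pass changes nothing.
        If sText = sBefore Then Exit For
    Next pass

    ResolveTokens = sText
End Function

Sub DemoResolveTokens()
    Dim dict As Object
    Set dict = CreateObject("Scripting.Dictionary")   ' assumption: Windows, late binding
    dict("foo") = "blah"
    dict("blahbar") = "hello world"

    ' [[foo]bar] becomes [blahbar], which then becomes hello world.
    Debug.Print ResolveTokens("Say [[foo]bar]!", dict, "[", "]")   ' prints "Say hello world!"
End Sub
```

Stopping on "no change" instead of "no closing delimiter left" means a stray delimiter in ordinary text ends the loop after one extra pass instead of running into the 50-pass limit.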
This all works perfectly fine under normal circumstances: a single document takes a second or three to produce. But when I have multiple layers of recursion and a lot of find-replace tokens, the resource requirements grow geometrically and start impacting performance. A single document may then take a couple of minutes, which is a problem when I need 500 documents.
The bottleneck, however, is Word's Find-Replace functionality. I have another set of procedures that follow exactly the same principles but work with pure strings (to achieve the same thing in an email, where I can access the underlying HTML body as a text string). There, the find-replace operation is barely affected and remains lightning fast even with layers of recursion and lots of tokens.
So basically I need a way to work with the text in the Word document as with a regular string, avoiding Word's native Find-Replace functionality and using VBA's Replace function instead. However, I can't simply read the Word document text into a string, because I often need to preserve the document formatting, so I am hoping for a way to access the raw Word XML data instead.
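A minimal, untested sketch of the kind of string-level access meant here. It assumes that the Word object model's Range.WordOpenXML property (the range as a Flat OPC XML string) and Range.InsertXML method can be used for a round trip like this. MergeViaXml and XmlEscape are hypothetical names, and whether this is actually faster than Find-Replace is an open question, not a measured result.

```vba
Option Explicit

' Hypothetical helper: escapes the characters that are special in XML text.
Function XmlEscape(ByVal s As String) As String
    s = Replace(s, "&", "&amp;")   ' must run first so later entities are not re-escaped
    s = Replace(s, "<", "&lt;")
    s = Replace(s, ">", "&gt;")
    XmlEscape = s
End Function

' Hypothetical sketch: one pass of token replacement on the document's XML string.
' doc is assumed to be an open Word.Document, dict a Scripting.Dictionary.
Sub MergeViaXml(ByVal doc As Object, ByVal dict As Object, _
                ByVal sOpn As String, ByVal sCls As String)
    Dim sXml As String
    Dim vKey As Variant

    ' Whole document body as a single Flat OPC XML string.
    sXml = doc.Content.WordOpenXML

    ' Plain VBA string replacement; no 255-character limit applies here.
    For Each vKey In dict.Keys
        sXml = Replace(sXml, XmlEscape(sOpn & vKey & sCls), XmlEscape(dict(vKey)))
    Next vKey

    ' Write the modified XML back over the document body.
    doc.Content.InsertXML sXml
End Sub
```

One known caveat with this approach: a token is only found if Word stored it in a single text run. Proofing marks or a formatting change in the middle of a token split it across several XML elements, and a plain Replace will then miss it.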