How To Remove Html Tags From Word Content?
I know there are a couple of threads about this that suggest simply using Regex.Replace(input, "<.*?>", String.Empty); but I can't use that on text written in a Word document. My code is lik
Solution 1:
Try the following:
Convert the text containing the HTML markup to a plain string using
string unFormatted = paragrapf2.ToString(SaveOptions.DisableFormatting);
and then replace the content of paragrapf2 with the unFormatted string.
Solution 2:
With some help from the comments, I arrived at the following working solution:
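// Reset any formatting constraints left over from a previous search.
findObject.ClearFormatting();
// Word wildcard pattern: a literal "<", any characters, a literal ">".
findObject.Text = @"\<*\>";
findObject.MatchWildcards = true;
// Replace every match with an empty string, i.e. delete the tag.
findObject.Replacement.ClearFormatting();
findObject.Replacement.Text = "";
object replaceAll = Word.WdReplace.wdReplaceAll;
// The 11th argument of Execute is Replace; all others keep their defaults.
findObject.Execute(ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing,
    ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing,
    ref replaceAll, ref oMissing, ref oMissing, ref oMissing, ref oMissing);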
This uses the search pattern \<*\>, which contains the wildcard character *, so findObject.MatchWildcards must be set to true.
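For comparison, the Word wildcard pattern \<*\> plays roughly the same role as the regular expression <.*?> mentioned in the question. Below is a minimal sketch of that regex applied to a plain .NET string, outside of Word. The TagStripper class and the StripTags method are hypothetical names used only for illustration, and the sketch assumes the tags contain no ">" inside attribute values.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagStripper
{
    // Lazy pattern: "<", then as few characters as possible, then ">".
    // Singleline lets "." match newlines, so tags split across lines are also removed.
    private static readonly Regex TagPattern =
        new Regex("<.*?>", RegexOptions.Singleline);

    // Hypothetical helper: returns the input with every tag-like span removed.
    public static string StripTags(string input)
    {
        return TagPattern.Replace(input, string.Empty);
    }

    public static void Main()
    {
        // prints "Hello world"
        Console.WriteLine(StripTags("<p>Hello <b>world</b></p>"));
    }
}
```

This only works once the text is available as a string. For content that lives in the Word document itself, the Find/Replace approach above edits the document in place.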