A project manager gets a job returned with a bunch of trash in the target text. Oh no! But never fear... regular expressions can rescue this mess.
Many situations arise in translation workflows where source or target data need “interventions”, and regular expressions (“regex”) are often the way to get there. In memoQ, though, you don’t need to become an expert in that arcane syntax; instead, you can use prepared, tested and documented libraries of expressions for quality assurance work to extend the power of that working environment while you do what you do best: find the right words for communication in translation.
Contents and notes:
0:00 Introduction
0:56 Advanced Find & Replace dialog
2:11 "Escaping" characters in regex
3:01 Square brackets for lists in regex
3:42 The period character in regex
3:55 The plus character in regex
4:20 Making the plus non-greedy
5:26 The Find expression
5:37 Using the regex Find & Replace
7:45 Filtering the target text for more trash
8:15 The Regex Assistant for memoQ
9:38 A regex filter expression for target text
Examples of trash text:
[g id="3103" mmq78catalogvalue="<cf bold=True>" mmq78shortcatalogvalue="<cf bold=True>"}
{g mmq78catalogvalue="</cf>" mmq78shortcatalogvalue="<cf bold=True>"]
The FIND expression for finding and replacing those weird tag structures is [\[\{]g.+?[\]\}]
The Replace field must be EMPTY.
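To see what that Find expression does outside memoQ, here is a minimal sketch in Python. Python's `re` module stands in for memoQ's own regex engine, which may differ in edge cases. The sample segment, the function name `strip_trash_tags` and the `GREEDY` comparison pattern are made up for illustration. Substituting an empty string mimics leaving the Replace field empty.

```python
import re

# The Find expression from the post: an opening [ or {, the letter g,
# the shortest possible run of characters, then a closing ] or }.
TRASH_TAG = re.compile(r"[\[\{]g.+?[\]\}]")

# Same expression without the "?" after the plus, for comparison only.
GREEDY = re.compile(r"[\[\{]g.+[\]\}]")


def strip_trash_tags(segment: str) -> str:
    """Delete every match, which is what an empty Replace field does."""
    return TRASH_TAG.sub("", segment)


# Hypothetical target segment built around the two trash examples above.
sample = (
    'Press '
    '[g id="3103" mmq78catalogvalue="<cf bold=True>" mmq78shortcatalogvalue="<cf bold=True>"}'
    'Start'
    '{g mmq78catalogvalue="</cf>" mmq78shortcatalogvalue="<cf bold=True>"]'
    ' to continue.'
)

cleaned = strip_trash_tags(sample)
print(cleaned)                 # Press Start to continue.
print(GREEDY.sub("", sample))  # the greedy version also deletes "Start"
```

The non-greedy `.+?` stops at the first closing bracket it meets, so each trash tag is removed on its own. The greedy `.+` runs on to the last closing bracket in the segment and takes the real text between the two tags with it.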
Possible filtering expressions to use after the deletion of unwanted tags:
[\[\]\{\}]
This finds any segment that contains a square or curly bracket.
[\[\]\{\}“”]
This extends the filter to catch stray curly quote marks as well.
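The two filter expressions can be tried the same way. This is only a sketch under the same assumption that Python's `re` behaves like memoQ's engine for these simple character classes. The segment list and the helper `flag` are hypothetical.

```python
import re

# Filter 1: any square or curly bracket left behind.
BRACKETS = re.compile(r"[\[\]\{\}]")

# Filter 2: the same brackets plus curly quote marks.
BRACKETS_OR_QUOTES = re.compile(r"[\[\]\{\}“”]")


def flag(segments, pattern):
    """Return the segments in which the pattern matches anywhere."""
    return [s for s in segments if pattern.search(s)]


# Hypothetical target segments after the unwanted tags were deleted.
segments = [
    "Press Start to continue.",
    "Press Start} to continue.",
    "Press “Start to continue.",
]

print(flag(segments, BRACKETS))
print(flag(segments, BRACKETS_OR_QUOTES))
```

The first filter flags only the segment with the leftover `}`. The second also flags the segment with the stray opening curly quote.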