Cleaning up a mess with regex in memoQ

memoQ's translation and editing environment lets you identify and correct many problems using regular expressions.

A project manager gets a job returned with a bunch of trash in the target text. Oh no! But never fear... regular expressions can rescue this mess.

So many situations arise in translation workflows where source or target data need “interventions”. And regular expressions (“regex”) are the way to get there. But in memoQ, you don’t need to become an expert in that arcane syntax; instead you can use prepared, tested and documented libraries of expressions for quality assurance work to extend the power of that working environment while you do what you do best: find the right words for communication in translation.

Contents and notes:

0:00 Introduction
0:56 Advanced Find & Replace dialog
2:11 "Escaping" characters in regex
3:01 Square brackets for lists in regex
3:42 The period character in regex
3:55 The plus character in regex
4:20 Making the plus non-greedy
5:26 The Find expression
5:37 Using the regex Find & Replace
7:45 Filtering the target text for more trash
8:15 The Regex Assistant for memoQ
9:38 A regex filter expression for target text

Examples of trash text:

[g id="3103" mmq78catalogvalue="<cf bold=True>" mmq78shortcatalogvalue="<cf bold=True>"}

{g mmq78catalogvalue="</cf>" mmq78shortcatalogvalue="<cf bold=True>"]

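The Find expression for locating those malformed tag structures is
[\[\{]g.+?[\]\}]
It matches an opening square or curly bracket followed by g, then as few characters as possible up to the first closing square or curly bracket.
The Replace field must be EMPTY, so that each match is deleted rather than replaced.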
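
To try the expression outside memoQ, here is a minimal sketch in Python. Python's re module stands in for memoQ's regex engine, which is an assumption, although a pattern this simple should behave the same way in the common regex flavors. The sample segment and the names FIND, GREEDY and segment are made up for illustration; only the two trash tags come from the examples above. The greedy variant is included to show why the question mark after the plus matters.

```python
import re

# The Find expression from the post, unchanged.
FIND = re.compile(r'[\[\{]g.+?[\]\}]')

# Same expression without the "?", so the plus is greedy (for comparison only).
GREEDY = re.compile(r'[\[\{]g.+[\]\}]')

# Hypothetical target segment: the two trash tags wrapped around real text.
segment = (
    '[g id="3103" mmq78catalogvalue="<cf bold=True>" mmq78shortcatalogvalue="<cf bold=True>"}'
    'Important'
    '{g mmq78catalogvalue="</cf>" mmq78shortcatalogvalue="<cf bold=True>"]'
    ': please read'
)

# Replacing with an empty string mirrors leaving the Replace field empty.
cleaned = FIND.sub('', segment)
over_deleted = GREEDY.sub('', segment)

print(cleaned)       # Important: please read
print(over_deleted)  # : please read
```

The non-greedy version stops at the first closing bracket, so each tag is removed separately and the word between them survives. The greedy version runs on to the last closing bracket in the segment and takes the real text with it.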

Possible filtering expressions to use after the deletion of unwanted tags:

[\[\]\{\}]
This finds any segment that contains a square or curly bracket.

[\[\]\{\}“”]
This extends the filter to find stray curly quote marks as well.
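
The effect of the two filter expressions can be sketched in Python as well. Again, Python's re module is only a stand-in for memoQ's filter field, and the sample segments and the helper filter_segments are hypothetical.

```python
import re

# The two filter expressions from the post, unchanged.
BRACKETS = re.compile(r'[\[\]\{\}]')
BRACKETS_AND_QUOTES = re.compile(r'[\[\]\{\}“”]')

# Hypothetical target segments after the trash tags have been deleted.
targets = [
    'Important: please read',
    'Leftover bracket] here',
    'A stray ”quote',
]

def filter_segments(pattern, segments):
    """Keep only the segments in which the pattern matches somewhere."""
    return [s for s in segments if pattern.search(s)]

with_brackets = filter_segments(BRACKETS, targets)
with_brackets_or_quotes = filter_segments(BRACKETS_AND_QUOTES, targets)

print(with_brackets)            # ['Leftover bracket] here']
print(with_brackets_or_quotes)  # ['Leftover bracket] here', 'A stray ”quote']
```

A segment shows up in the filtered list as soon as it contains a single character from the set, which makes these expressions a quick check for anything the Find & Replace step left behind.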

memoQuickies Substack
Regex
The memoQ Regex Assistant and regular expression-based solutions
Authors
Kevin Lossner