In two recent contributions, I discussed using regular expressions to identify possible monetary expressions in the texts we translate or in archival records such as translation memories. I also showed how even experts may stumble a bit with a very general approach to such problems, because human expression of these things takes so many forms.
Bottom line: if you find a regex that works well for your typical and maybe even atypical source language documents, guard it as a treasure to save your sanity and avoid hours of tired squinting at scrolling text to find shitty little inconsistencies long after midnight to meet that apocalyptic 7 a.m. delivery deadline.
The memoQ Regex Assistant can help you guard those treasures. The YouTube video here is from my first public webinar on that integrated library tool for regular expressions, which enables people who know nothing about writing expressions (as well as those of us who are good at it) to find and apply proven solutions to specific problems with ease.
The five minutes after the starting point of the link (at around the 1-hour mark of a 90-minute lecture) show how I use the plain language names for my stored problem-solving tools to identify places in a DeepL trashlation where currency amounts from the German source text wrongly still have decimal commas rather than the correct decimal points in the English text. The placement of the currency symbols is also wrong in many places. With just a few clicks, I can fix all the trouble spots in a 90,000-word text!
I have somewhere around 15 hours of recorded webinars on subject matter like this, but for want of time to edit and index it all, only this webinar and a few short clips have been made public. This recording is, in fact, hardly edited and lacks the time-coded index I usually like to provide for easy reference in long videos. Later this year, though, I’ll be doing an updated version of the course with better organization of reference links and some other useful tools for teaching and practice. Stay tuned.
So the problem shown in the video was that German currency expressions like 3 € or 2,50–5,50 € in the source text were left that way in English rather than converted to the proper forms €3 or €2.50–5.50 that are expected.
As a first step, I retrieved an expression from the Regex Assistant that I had saved in my personal library under the name “Fix decimal commas to periods for currency”:
(\d+),(\d{2})\b
— the Find expression
$1.$2
— the Replace expression
After all the bad decimal markers were fixed, the euro signs (€) were moved to the front of the amounts, with no space in between, by this expression named “Fix currency ranges with trailing euro sign”:
(\d[\d\-\.]+?)\s?€
— the Find expression
€$1
— the Replace expression
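If you want to try the two steps outside memoQ, here is a minimal sketch in Python. It assumes that Python's `re` module treats these particular patterns the same way memoQ's .NET engine does, and it writes the replacements as `\1` and `\2` where memoQ uses `$1` and `$2`. The function name, the sample strings and the pattern called `TRAILING_EURO_VARIANT` are made up for illustration and are not part of the stored library entries.

```python
import re

# Find/Replace pair 1 (memoQ Replace: $1.$2)
DECIMAL_COMMA = re.compile(r"(\d+),(\d{2})\b")
# Find/Replace pair 2 (memoQ Replace: €$1)
TRAILING_EURO = re.compile(r"(\d[\d\-\.]+?)\s?€")
# Hypothetical variant, not from the post: "*?" also admits single-digit
# amounts, and the en dash is added to the character class.
TRAILING_EURO_VARIANT = re.compile(r"(\d[\d\-–\.]*?)\s?€")


def fix_currency(text: str, euro_pattern: re.Pattern = TRAILING_EURO) -> str:
    """Convert decimal commas to points, then move a trailing € to the front."""
    text = DECIMAL_COMMA.sub(r"\1.\2", text)
    return euro_pattern.sub(r"€\1", text)


print(fix_currency("2,50-5,50 € or 12,00 €"))
# prints: €2.50-5.50 or €12.00
print(fix_currency("3 € or 2,50–5,50 €"))
# prints: 3 € or 2.50–€5.50
print(fix_currency("3 € or 2,50–5,50 €", TRAILING_EURO_VARIANT))
# prints: €3 or €2.50–5.50
```

Two things are worth checking against your own texts. In this sketch, the second Find expression as printed needs at least two characters in the amount, so a lone `3 €` is left alone, and its character class lists the plain hyphen but not the en dash (–), so an en dash range gets the euro sign in the middle. If you see the same behavior in memoQ, a variant along the lines of `TRAILING_EURO_VARIANT` may be a starting point, but test it on real data before storing it.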
If you ended up with a messy mix of trailing euro signs, currency codes and words like
2,50–5,50€ or 2,50–5,50 EUR or 2,50–5,50 euros
because the writing team has no discipline (as with many an annual report I’ve translated), then the last expression can be adapted to something like
(\d[\d\-\.]+?)\s?(?i)(€|euro?s?)
— the Find expression
€$1
— the Replace expression
which is also case-insensitive for the currency marker, thanks to the (?i) modifier. You can store it in your personal solutions library with whatever name, labels and description work for you. Rinse and repeat for any other currencies relevant to your work.
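The adapted expression can also be tried outside memoQ with a short Python sketch. One assumption to flag: the `(?i)` switch is written here as a scoped group, `(?i:...)`, because current Python versions do not accept an inline flag in the middle of a pattern, whereas memoQ's .NET engine takes the original form. The function name and sample string are illustrative only, and the sample uses plain hyphens and decimal points, as if the decimal commas had already been fixed.

```python
import re

# Adapted Find expression (memoQ Replace: €$1).
# (?i:...) limits case-insensitivity to the currency marker.
MIXED_EURO = re.compile(r"(\d[\d\-\.]+?)\s?(?i:€|euro?s?)")


def normalize_euro_markers(text: str) -> str:
    """Replace a trailing €, EUR, euro or euros with a leading € sign."""
    return MIXED_EURO.sub(r"€\1", text)


sample = "2.50-5.50€ or 2.50-5.50 EUR or 2.50-5.50 euros"
print(normalize_euro_markers(sample))
# prints: €2.50-5.50 or €2.50-5.50 or €2.50-5.50
```

Because `euro?s?` has no trailing word boundary, it could also match the first letters of a word such as "European" directly after a number. If that matters in your texts, writing the alternation as `(?:€|euro?s?\b)` is one possible guard.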
Note that you don’t need to know how to write regexes to do all this. Just get a solution that works from someone or someplace reliable and organize it so you and/or your team can find it again when needed.
Did I mention that these labeled collections of solutions can be selectively exported and shared with others? Here’s a video that shows how to make such a shared resource part of your working memoQ installation:
But what if you have a similar “money changing” problem and need to correct decimal periods to decimal commas, or you want to move the currency markers from the front of the numbers to the back? Well, I keep those regexes in my library too, and I’ll include them in my upcoming memoQ regex solutions book (I’ve also given these libraries away in a lot of webinars and tutoring sessions). If you need something like that now, just ask for it in the comments below this post or in a direct message, and ye shall receive.
There’s another way to deal with these conversions that I’ll talk about in my next post: auto-translation rules. These create insertable, green-marked “hits” in the Translation results pane of the translation and editing window of memoQ, offer predictive typing text if you use that feature, and warn of incorrect formatting as you translate. They can also be used in a QA profile to check those expressions automatically in a text before delivery.
And you’ll see my own approach to creating, documenting and maintaining these auto-translation rules, so they can serve you well and adapt easily to future challenges you may encounter.