LiveDocs corpora: one repository, many languages.
Escape the confines of strictly bilingual translation memories!
memoQ LiveDocs corpora are sometimes described as an alternative to translation memories, and bilingual content in a corpus will indeed provide matches of every kind. But a corpus can also hold and index monolingual texts for searching and reading, as well as any kind of file in its original format for reference or other uses, which makes LiveDocs much more than a sort of TM.
But did you realize that a corpus can hold any number of languages?
And you can access any of that content while working on your project, whatever its language pair. But there’s a catch…
In an active project, the content available for matches and concordance reference is limited by a scheme that confuses many users. Often I hear that someone “cannot see” a document known to be in a given corpus. This will be due to the language variants chosen in the project, so it’s worthwhile to think very carefully about your choice of a generic language versus a specific variant like UK or US English. The crazy graphic below shows how that works, based on the teaching corpus shown above, when that corpus is used in a project with UK English as its source language and Swiss German as the target:
Confusing as a swarm of bats in a dust storm, right? But it follows a simple principle: content accessible for matches in a project will either have the language variant in the project or it will use the generic variant of that language. So, for example, you’ll never see monolingual US English content or bilingual content with US English in a project involving UK English (unless the source and target languages of the project are EN-US and EN-UK, and yes I know that GB is used as a marker instead of UK).
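For readers who think better in code, the principle above can be written as a small toy model. This is only a sketch of the rule as described in this post, not anything from memoQ itself: the function names and the language codes such as "en-GB" and "de-CH" are made up for illustration, and I am assuming that a bilingual document counts regardless of which of its two languages is the source. The text above does not say what happens when the project itself uses a generic language, so the sketch simply applies the stated rule literally in that case.

```python
def generic(code: str) -> str:
    """Return the generic language part of a code, e.g. 'en-GB' -> 'en'."""
    return code.split("-")[0].lower()


def compatible(doc_lang: str, project_lang: str) -> bool:
    """A document language fits a project language if it is the same
    variant or the generic form of that language."""
    doc, proj = doc_lang.lower(), project_lang.lower()
    return doc == proj or doc == generic(proj)


def visible_for_matches(doc_langs: list[str], source: str, target: str) -> bool:
    """Toy check: would a corpus document be offered for matches?

    doc_langs holds one code (monolingual) or two codes (bilingual).
    Assumption: direction of a bilingual document is ignored.
    """
    if len(doc_langs) == 1:
        lang = doc_langs[0]
        return compatible(lang, source) or compatible(lang, target)
    if len(doc_langs) == 2:
        a, b = doc_langs
        return (compatible(a, source) and compatible(b, target)) or (
            compatible(a, target) and compatible(b, source)
        )
    raise ValueError("expected one or two language codes")


# Project from the example: UK English source, Swiss German target.
for langs in (["en-GB"], ["en"], ["en-US"], ["en", "de"], ["en-US", "de-CH"]):
    print(langs, visible_for_matches(langs, "en-GB", "de-CH"))
```

In this model, the US English documents drop out of a UK English project, while anything marked with generic English or generic German stays available.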
If you need to see other content in that corpus (in any language), it can be put on a searchable and readable tab by opening the document in the corpus via the Resource Console (LiveDocs > corpus name), but that content will never appear in the translation results. You can copy and paste what you like from the open tab, however.
An export of the test corpus shown in this article can be downloaded here and can be used freely for non-commercial learning purposes. Otherwise, all rights are reserved by the authors where relevant.
If there’s a problem with the download, please let me know. I’ll be using that example file in later posts.