Discussion about this post

José

Or just go here:

https://porkopek.github.io/Multisearch/#/align-texts

Paste the source text, paste the target text, and click Align; the TMX is ready to export in seconds.
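For readers who have not looked inside a TMX file, here is a minimal sketch of what such an export might contain once the pairs are aligned. It is not how the tool linked above works internally; `build_tmx`, the header values, and the language codes are illustrative assumptions, and only the element and attribute names come from TMX 1.4.

```python
import xml.etree.ElementTree as ET

# ElementTree serializes this namespaced attribute as xml:lang.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(pairs, src_lang="en", tgt_lang="es"):
    """Return a TMX 1.4 string for a list of (source, target) pairs.

    Hypothetical helper: the header values below are placeholders.
    """
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "example-script",   # placeholder tool name
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "plain-text",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for src, tgt in pairs:
        tu = ET.SubElement(body, "tu")  # one translation unit per pair
        for lang, text in ((src_lang, src), (tgt_lang, tgt)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


if __name__ == "__main__":
    print(build_tmx([("Hello.", "Hola.")]))
```

Special characters such as `&` are escaped by ElementTree on output, so the segment text can be passed in as-is.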

Joe Zhou

Thank you, Kevin, for your detailed explanation.

What you mentioned pertains more to a special case. There are two other scenarios to consider.

First, the same document may contain the entire source text followed by the entire translation.

In this case, you can easily split the source text and the translation into two separate documents by copying and pasting, and then align them for import into the TM. The point of aligning is to allow post-alignment editing: checking the alignment quality and splitting or merging segments as needed. memoQ's LiveDocs also supports aligning document pairs, and the aligned pairs can be used much like a TM. Automatic alignment is rarely 100% accurate, though, so it still needs manual editing based on the actual content. This scenario is not the main focus of the current discussion.

Second, the same document may contain a paragraph of the source text followed by a paragraph of the translation (arranged side by side or top to bottom).

Here, we are referring to paragraphs, not segments. The specific formats could be:

First paragraph of the source text - First paragraph of the translation

......

Nth paragraph of the source text - Nth paragraph of the translation

Or:

First paragraph of the source text

First paragraph of the translation

......

Nth paragraph of the source text

Nth paragraph of the translation
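As a rough illustration of the two layouts above, the sketch below turns either one into a list of (source, translation) paragraph pairs. It assumes that paragraphs are separated by blank lines in the top-bottom layout and that the side-by-side layout uses a tab between the two columns; both are assumptions, not something stated in the thread, and the function names are made up for this example.

```python
import re


def split_paragraphs(text):
    """Split text on blank lines and drop empty blocks."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def deinterleave(paragraphs):
    """Top-bottom layout: source, translation, source, translation, ..."""
    if len(paragraphs) % 2 != 0:
        raise ValueError("odd number of paragraphs; the layout is not strictly alternating")
    return list(zip(paragraphs[0::2], paragraphs[1::2]))


def split_side_by_side(lines, sep="\t"):
    """Side-by-side layout: 'source<sep>translation' on each line (sep is assumed)."""
    pairs = []
    for line in lines:
        src, found, tgt = line.partition(sep)
        if not found:
            raise ValueError(f"separator not found in line: {line!r}")
        pairs.append((src.strip(), tgt.strip()))
    return pairs


if __name__ == "__main__":
    doc = "Source one.\n\nTranslation one.\n\nSource two.\n\nTranslation two.\n"
    for pair in deinterleave(split_paragraphs(doc)):
        print(pair)
```

The strict even-count check is deliberate: if a paragraph is missing on either side, failing early is safer than silently pairing the wrong paragraphs.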

How should this text format be handled? A TM is normally populated from aligned segments, whereas this kind of document is aligned only at the paragraph level. There may even be cases where the paragraph pairing is not one-to-one, with one paragraph on one side corresponding to several on the other. The method you mentioned doesn't seem to address this situation directly; at the very least, it requires converting the paragraph alignment into segment alignment first (how would that be achieved?).

If the alignment feature supported importing a single bilingual document, the process would be much like aligning two documents: the source text and the translation would be split into segments automatically (automatic splitting isn't 100% accurate) and then checked and corrected in the alignment editor. This method is more intuitive and matches what most CAT tool users are already used to.
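One naive way to go from paragraph pairs to segment pairs, sketched under heavy assumptions: split each paragraph into sentences with a simple punctuation rule, pair the sentences one-to-one when both sides yield the same count, and otherwise keep the whole paragraph pair and flag it for manual review. This is not what memoQ or any other CAT tool does; the function names are hypothetical, and the splitter will mis-split abbreviations and decimals such as "3.5".

```python
import re

# Naive rule: a sentence is a run of non-terminators followed by
# optional terminators (Latin or CJK) and trailing whitespace.
_SENTENCE = re.compile(r"[^.!?。！？]+[.!?。！？]*\s*")


def split_sentences(paragraph):
    """Very rough sentence splitter; an assumption, not a real segmenter."""
    return [m.strip() for m in _SENTENCE.findall(paragraph) if m.strip()]


def to_segment_pairs(paragraph_pairs):
    """Return (source, target, needs_review) tuples.

    Sentences are paired 1:1 only when both sides have the same count;
    otherwise the paragraph pair is kept whole and flagged for review.
    """
    result = []
    for src_par, tgt_par in paragraph_pairs:
        src = split_sentences(src_par)
        tgt = split_sentences(tgt_par)
        if src and len(src) == len(tgt):
            result.extend((s, t, False) for s, t in zip(src, tgt))
        else:
            result.append((src_par, tgt_par, True))
    return result


if __name__ == "__main__":
    pairs = [
        ("Hello world. How are you?", "你好，世界。你好吗？"),
        ("One. Two.", "只有一句。"),
    ]
    for row in to_segment_pairs(pairs):
        print(row)
```

Equal sentence counts do not guarantee a correct pairing, so even the unflagged rows would still need the kind of checking in an alignment editor described above.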
