There are situations in which it is useful to reimport a file into Deja Vu X (DVX), although part of the translation has already been done. From the information in the MDB (Memory Database), it should be possible to recover most or all of the earlier translations as exact matches.
To get this closer to "all" than to "most", I propose the algorithm AutoJoin, for inclusion in DVX as an option.
Note: AutoJoin could also be useful in combination with AutoPropagate!
Situations in which a re-import may be useful include the following:
Word file: the client suddenly remembers that they want an uncleaned Trados file. (Bad example, because joining is not possible in Trados Workbench files imported into DVX. Should/could it be made possible in this case?)
Large number of typos and spelling errors. Better to correct these in the source file first, to avoid MDB fouling.
Hard returns in the middle of sentences, due to an incompetent writer or to conversion from PDF, directly or via OCR. These hard returns should be removed before import, but in an annoyingly large number of cases, some are overlooked and only discovered during translation.
RTF/Doc issue, see 2081221a (hyperlink)
Autosend to MDB not switched on.
Minimum length of segments to be saved in the MDB. (Does that setting still exist in DVX, or in DV3 only?)
Numbers in segment: not counted for minimum length? (DV3, retest for DVX).
Manual joins during the translation process.
if ( (source segment gives fuzzy match from MDB)
     and
     (source segment, disregarding codes, exactly matches the first part of found MDB match, i.e. the MDB match is longer but otherwise the same) )
then
{
    see if joining the next segment in the project to the current segment creates:
        either an exact match (disregarding codes)
            if so, make that join permanent in the project file
        or a partial match of the same type as before, but the initial part that exactly matches is longer than before
            if so: continue joining more following segments from the MDB, and try again.
}
Note: the above pseudocode isn't well structured and contains an implicit "go to", so it should be re-structured to go in the direction of a proper real implementation. But hopefully it makes the idea clear.
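As one possible restructuring, the idea can be sketched as a loop without the implicit "go to". This is only a sketch under several assumptions, not how DVX works internally: the function names (normalize, is_proper_prefix, auto_join) are hypothetical, embedded codes are assumed to look like {1}, the MDB is reduced to a plain list of source sentences, and the "fuzzy match whose first part matches exactly" condition is approximated by a simple prefix test.

```python
import re
from typing import List, Set

# Assumption: embedded codes look like {1}, {2}, ... in the segment text.
CODE_RE = re.compile(r"\{\d+\}")


def normalize(text: str) -> str:
    """Drop embedded codes and collapse whitespace, so that segments
    are compared 'disregarding codes'."""
    return " ".join(CODE_RE.sub("", text).split())


def is_proper_prefix(text: str, known: Set[str]) -> bool:
    """True if some MDB source is longer than `text` but starts with it."""
    return any(k != text and k.startswith(text) for k in known)


def auto_join(segments: List[str], mdb_sources: List[str]) -> List[str]:
    """Return the project segments with joins applied wherever a run of
    consecutive segments adds up to an exact MDB match."""
    known = {normalize(s) for s in mdb_sources}
    result: List[str] = []
    i = 0
    while i < len(segments):
        joined = segments[i]
        j = i
        exact = False
        # Keep joining while the text so far is the start of a longer MDB entry.
        while j + 1 < len(segments) and is_proper_prefix(normalize(joined), known):
            j += 1
            joined = joined + " " + segments[j]
            if normalize(joined) in known:
                exact = True  # exact match: make the join permanent
                break
        if exact:
            result.append(joined)
            i = j + 1
        else:
            # No exact match was reached: leave the segments as they were.
            result.append(segments[i])
            i += 1
    return result


if __name__ == "__main__":
    project = [
        "What you could do is the following:",
        "e.g.",
        "you could continue and try again.",
    ]
    mdb = ["What you could do is the following: e.g. you could continue and try again."]
    print(auto_join(project, mdb))
```

In this sketch a join is only kept when it ends in an exact match; if the chain of partial matches breaks off, the segments are left untouched, so a failed attempt costs nothing.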
The fuzzy match on the longer MDB segment should take precedence over a possible shorter exact match, which may be present due to an earlier "send to MDB" before the manual join was done.
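This precedence rule could be expressed roughly as follows. The function name pick_candidate and the representation of the MDB as a plain list of source sentences are assumptions made for illustration only; a real implementation would presumably fall back to the shorter exact match if the join attempt on the longer entry fails.

```python
from typing import List, Optional, Tuple


def pick_candidate(segment: str, mdb_sources: List[str]) -> Optional[Tuple[str, str]]:
    """Decide which MDB entry to pursue for `segment`.

    Returns ("prefix", entry) when a longer entry starts with the segment,
    ("exact", entry) when only an identical entry exists, or None.
    """
    longer = [s for s in mdb_sources if s != segment and s.startswith(segment)]
    if longer:
        # A longer entry takes precedence: it suggests a manual join was done
        # after the shorter segment had already been sent to the MDB.
        return ("prefix", max(longer, key=len))
    if segment in mdb_sources:
        return ("exact", segment)
    return None
```

With both "What you could do is the following:" and the full joined sentence in the MDB, this returns the joined sentence as a "prefix" candidate, so the join is attempted before the shorter exact match is accepted.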
Well, real life, I made it up. That's why it sounds stupid and looks ugly. But it may still make sense. I hope.
Original source sentence:
What you could do is the following: e.g. you could continue and try again.
Segmented as:
What you could do is the following:
e.g.
you could continue and try again.
Partial translation to Dutch:
Then the translator decides to perform two joins (ctrl-J), and translates the result as a single segment:
The MDB (as a result of AutoSend) now contains the following pairs:
Now for some unrelated reason, the translator reimports the source document, and pretranslates, hoping to recover the translations already done. Pretranslate finds the partial source segment:
Reverse case: two consecutive sentences (project ATS 20090121) (why were they segmented like that?) that both occur exactly in the MDB, yet Pretranslate does not even find a fuzzy match, and instead assembles a translation from all sorts of loose fragments.
Workarounds are available, so the problem is not severe.
The issue applies to DVX build 7.5.303, and also applied to DV3 builds such as 3.0.38.
© 2008 R. Harmsen