DVX cannot export Word file, or
Word cannot open RTF file exported by DVX

2, 21 & 24 December 2008

Summary

Sometimes DVX cannot export a translated Word file from a translation project. It reports this as follows:

Deja Vu X was unable to export the following files: – list of files –

DVX doesn't crash; it continues normally after the message.

At other times, if the project involves an RTF file, DVX does export it, but MS Word cannot open the resulting RTF file.

Although workarounds are available, I consider the problem severe: it usually only becomes apparent shortly before a deadline (even if trial exports were attempted earlier), and the workarounds may involve lengthy procedures.

The problem is now reproducible in a small hand-crafted example file.

Detailed description

RTF

In the first project I did with DVX, I had manually converted the Word file to be translated to an RTF file. (RTF = Rich Text Format, a file format developed by Microsoft, and the format that Deja Vu uses internally in its project database, to be able to reconstruct the translated Word files.)

The reason for my manual conversion is irrelevant for this error report.

I tried to open the exported RTF file with various MS Word versions, and they failed to open it. Details are here.

Thanks to a workaround I found (the one that uses WordPad), I managed to resolve the situation and meet my deadline.

Word

On another occasion, in the third project I did with DVX, I encountered the other variant of the problem: DVX failed to export a Word file. The cause is probably the same. To export a Word file, DVX (as did DV3) first creates an RTF file from the data in its project database. It then instructs MS Word (via an OLE interface or a similar mechanism) to open that RTF file and save it in Word format. If Word reports back that it cannot open the RTF file, without saying why, all DVX can do is report that the export failed, without being more specific about possible causes or remedies.

This is very inconvenient to the user, but it's not Atril's fault.

Workarounds

Word

If the problem occurs in a Word file you are translating with DVX, and you want to use workaround no. 1 or 2 as described in the next section, you must first convert the file to RTF (Rich Text Format) yourself.
This is not necessary for workaround no. 3.

To convert to RTF, open the Word file in Microsoft Word and save it as RTF. Then import that file into DVX, pretranslate, and hope that every segment is a 100% exact match. (That isn't always the case!) Fix the non-exact matches, and export the RTF file from DVX. You can then use one of the workarounds for RTF (no. 1 or 2) described below.

RTF

  1. Open the RTF file in Vista's WordPad, save it as RTF, then open it again in Word. (Perhaps this also works in XP's WordPad, but I didn't test it.)

    Here is the result in my example.

  2. Open the file in Notepad. You'll see the underlying codes of the RTF format. There is no need to understand them, so don't be alarmed by what you see. Skip to the end of the file and add some closing braces there (one, two or four braces; try as necessary).
    (A closing brace is the character that looks like '}'. It's known as "accolade" in French and Dutch.)

    Save, close, then open again in Word.

    Here is the result in my example.

    This workaround I owe to Walter Weyne of Atril Benelux, who proposed it on 20 November 2008 in Yahoo group "Vertalers", in this message. (That message is in Dutch and only subscribers to that translators' discussion group can read it.)

  3. If the cause of the problem really is unescaped opening braces (i.e. "{" that should be "\{"), as described below, you could try to correct them in the DVX project before exporting the file. The only trouble is: how do you find them? That can be hard if there are many such third-party codes, nearly all of them correct. But see checks.
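Workaround no. 2 amounts to closing RTF groups that were left open. The following is a minimal Python sketch of that idea, not part of DVX or Word: it counts how many groups are still open at the end of an RTF text and appends that many closing braces. The function names and the sample string are made up for illustration, and the sketch assumes the file contains no binary (\bin) data. A balanced file is not necessarily a valid one, so work on a copy.

```python
def brace_depth(rtf: str) -> int:
    """Return the number of RTF groups still open at the end of the text."""
    depth = 0
    i = 0
    while i < len(rtf):
        ch = rtf[i]
        if ch == "\\":
            # Skip the backslash and the character it protects,
            # so that \{, \} and \\ are not counted as group delimiters.
            i += 2
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        i += 1
    return depth


def close_open_groups(rtf: str) -> str:
    """Append one '}' for every group that is still open."""
    missing = brace_depth(rtf)
    return rtf + "}" * missing if missing > 0 else rtf


if __name__ == "__main__":
    # Hypothetical fragment: the outer group is never closed.
    broken = r"{\rtf1 {\b bold} text \{ literal"
    print(brace_depth(broken))
    print(close_open_groups(broken))
```

This replaces the trial and error of adding one, two or four braces by hand; it says nothing about where in the file the group was left open.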

Probable cause

Idea

When trying to fix the problem in my third serious DVX project ever, I got this idea that unescaped braces could be the cause (or: a cause!). I managed to find two such cases, added the missing backslash ('\'), tried to export again, and the problem was gone!

Small hand-crafted file

I remembered that experience, and later created this hand-crafted file, similar to the real one. In essence, it is a bilingual file, similar to the files that my first translation client, who is still very important to me, regularly sends me. The real thing is of course more complicated and much smarter than this dumbed-down example, which I created specifically to demonstrate the problem.

The hidden text in the sample file is my doing; it isn't part of the real format. I added it to make this format easier to reconcile with Deja Vu (version 3 and X alike).

Such files contain codes (for italics, bold, bullets etc.) between braces, for example {F2}. The actual meaning can be different each time.

Escaping the braces

When Deja Vu imports such files, it converts { to \{ and } to \}. Upon export, it removes the backslashes again. It does this, of course, to ensure that such real braces in the text (there to indicate codes or for any other reason) do not interfere with Deja Vu's own codes, of the type {1}, {2} etc.
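The round trip described above can be sketched in a few lines of Python. This only illustrates the principle and is not Deja Vu's actual code: the function names are invented, and the sketch assumes the source text contains no backslashes of its own and that the escaping happens before Deja Vu inserts its own {1}-style codes.

```python
def escape_braces(text: str) -> str:
    """On import: protect literal braces with a backslash."""
    return text.replace("{", "\\{").replace("}", "\\}")


def unescape_braces(text: str) -> str:
    """On export: remove the protecting backslashes again."""
    return text.replace("\\{", "{").replace("\\}", "}")


if __name__ == "__main__":
    source = "Press {F2} to continue"
    stored = escape_braces(source)
    print(stored)
    print(unescape_braces(stored) == source)
```

If the translator deletes one of the backslashes in the stored text, the export step can no longer tell that brace apart from the start of a Deja Vu code.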

This Word file I imported into this DVX project file. When translating this trivial file, I deliberately made the mistake I must have inadvertently made in those real projects involving maybe hundreds of segments:

I removed the backslash before the opening brace, so \{ became {.

Deja Vu X (7.5.303) allows this, although the Check Embedded Codes function (Ctrl-Shift-F8) detects it if it is run. Then, when exporting the RTF data stored in its project database, Deja Vu somehow creates code that Word cannot interpret. Perhaps it interprets the unprotected (unescaped, in Unix terms) brace as the start of a Deja Vu code, which then has no valid RTF codes behind it. Or maybe it does something else. I looked at the valid and the invalid RTF code, but it is far too complicated for me to draw any conclusions. Maybe the Atril developers can.

Anyhow, the resulting file cannot be opened by Word versions as described.

On the other hand, if the situation is simply avoided, i.e. blocked before it can occur, it doesn't really matter anymore what might happen if it did occur.

Note that the project contains both a Word file and an RTF file with the same contents. I made the mistake in a different code in each of the two translations, just to see whether that made any difference. It didn't.

It isn't certain that this brace issue is the only thing that can cause problems of this type. There may be others.

Possible solutions in DVX

Several possible improvements in DVX come to mind, in order to avoid this problem.

Keyboard handling

  1. When the user deletes a backslash character ( \ ), either by pressing the Del key with the cursor before the '\' or by pressing Backspace with the cursor after it, DVX could check whether the next character is a '{' or '}' and, if so, remove that character as well.

  2. DVX could make the Delete key ineffective when the cursor is before a '\', and make the Backspace ineffective when the cursor is after '\'.

    This is the approach that Deja Vu 3 (a.k.a. DV3) follows (behaviour verified in version 3.0.38). The user can still remove the backslash and brace together ("\{" or "\}") by highlighting the two and then pressing the Del key. This way, it is also possible to remove the backslash and leave the brace, if for some obscure reason the user might really want that.

    Deja Vu X (7.5.303) has no such keyboard safeguards.
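The two keyboard policies above can be compared in a small Python sketch. It models only the Del key on a plain string; the function name, the policy names "block" and "pair", and the sample text are hypothetical, and nothing here reflects how DV3 or DVX is actually implemented.

```python
def press_delete(text: str, cursor: int, policy: str = "block") -> str:
    """Return the text after Del is pressed with the cursor before text[cursor]."""
    if cursor >= len(text):
        return text  # nothing to the right of the cursor
    next_char = text[cursor + 1 : cursor + 2]
    is_escape = text[cursor] == "\\" and next_char in ("{", "}")
    if not is_escape:
        # Ordinary character: delete it as usual.
        return text[:cursor] + text[cursor + 1 :]
    if policy == "block":
        # Option 2 (the DV3 approach): the key has no effect.
        return text
    if policy == "pair":
        # Option 1: remove the backslash together with its brace.
        return text[:cursor] + text[cursor + 2 :]
    raise ValueError(f"unknown policy: {policy}")


if __name__ == "__main__":
    segment = r"a\{F2\}"
    print(press_delete(segment, 1, "block"))
    print(press_delete(segment, 1, "pair"))
```

Under either policy a single keystroke can never leave a bare brace behind, which is the situation that leads to the failed export.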

Quality assurance checks

  1. When the backslash before the brace has been removed and the segment is committed (Ctrl-Down Arrow), DV3 immediately marks the segment as having bad codes (red colour). DVX doesn't (no red icon to the left of the source language segment). If DVX did this too, there would be a greater chance that the user would notice the error and correct it right away.

  2. The function Check Embedded Codes (Ctrl-Shift-F8, or Ctrl-Shift-C in DV3) does find these situations. So in hindsight, I could and should have used this in workaround no. 3.

    To verify this behaviour, this larger sample file might come in handy.
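For readers who want to see what such a check involves, here is a minimal Python sketch that lists the bare braces in a segment. It is not the algorithm behind Check Embedded Codes, which I don't know. It assumes that the only valid unescaped braces are Deja Vu codes of the form {1}, {2} etc.; the function name and the sample segment are invented.

```python
import re

# Alternatives are tried in this order at each position:
#   1. a backslash plus the character it protects (e.g. \{ or \})
#   2. a Deja Vu style code such as {1} or {12}
#   3. any remaining brace, which is therefore unescaped and suspect
TOKEN = re.compile(r"\\.|\{\d+\}|[{}]", re.DOTALL)


def find_bare_braces(segment: str) -> list[tuple[int, str]]:
    """Return (position, brace) for every brace that is neither escaped nor part of a {n} code."""
    return [
        (match.start(), match.group())
        for match in TOKEN.finditer(segment)
        if match.group() in ("{", "}")
    ]


if __name__ == "__main__":
    # The backslash before the first "{F2" has been deleted by mistake.
    segment = r"{1}Press {F2\} to \{F3\} continue{2}"
    print(find_bare_braces(segment))
```

Running a check like this over every target segment before export would point directly at the segments to repair for workaround no. 3.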

Checks on export

  1. DVX should internally run Check Embedded Codes again before attempting to export a file. It should not allow the export to go through as long as code errors are still present. DV3 (3.0.38) had such a safeguard; DVX (7.5.303) doesn't.

Severity

Because the problem typically surfaces near a deadline, it causes stress, and I consider it severe. Workarounds are available, but they often take considerable extra time, which is exactly what is scarce near a deadline. In my opinion, the availability of these workarounds therefore doesn't justify calling the problem minor.

There is also the argument of principle that a software tool should never generate invalid files if it can be avoided.

History

I found the problem in DVX version 7.5.303.