Odd

23 and 24 December 2021

Old habits

An idiom that’s still in my fingers, or in their MCU (Muscle Control Unit), is od -ch. Show what’s in a file, as characters (if possible), and in hexadecimal. Plain and simple.

But it doesn’t work. Not any more. Or it does, but incorrectly, and undocumented. A test on Linux Mint 20.1 and FreeBSD 12.2: when I type:
od -ch↩
then:
This.↩ ctrl-d
this is the result:

0000000    T   h   i   s   .  \n
            6854    7369    0a2e
0000006

The man page under Linux Mint, by GNU, doesn’t mention the option -h at all; the FreeBSD one does:

-h, -x  Output hexadecimal shorts.  Equivalent to -t x2.

So they take two characters (strictly: bytes, but in the olden days, that was the same) and interpret the two of them as a 16-bit short integer in the machine’s byte order. I can never remember which of big-endian and little-endian is which, but judging by the output this is little-endian: the first byte in the file is the least significant one, so it ends up on the right when the number is printed. Anyhow, the characters are hexed in the wrong order: full stop-newline is hex 2e-0a, not 0a-2e. And h and i are 68 and 69 in hex, but it’s hard to see that when the codes are so far apart.
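To see where the swapped pairs come from, here is a minimal Python sketch of what the -h option appears to do. It is not od’s actual code; the function name hex_shorts is made up for the illustration, and a trailing odd byte is simply ignored here.

```python
def hex_shorts(data, byteorder):
    """Read data two bytes at a time as 16-bit integers, return 4-digit hex strings.

    byteorder is "little" or "big". A trailing odd byte is ignored in this sketch.
    """
    return [
        format(int.from_bytes(data[i:i + 2], byteorder), "04x")
        for i in range(0, len(data) - 1, 2)
    ]


data = b"This.\n"
print(hex_shorts(data, "little"))  # ['6854', '7369', '0a2e']
print(hex_shorts(data, "big"))     # ['5468', '6973', '2e0a']
```

The little-endian line matches the od -ch output above. Only the big-endian reading keeps the hex in the same order as the characters.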

How can such a display be useful? Who would ever need this?

I expressly remember that the -h in od -ch used to refer to bytes, not shorts, so the hex was always in the same order as the chars, regardless of endianness. My first encounter with Unix was in 1985, on a machine that supported some kind of BSD, but also AT&T System V, and some policy-maker in the company where I worked had decided the latter would be our standard.

Would that explain the difference? System V had h = byte and in BSD it was h = short? I don’t know and I can’t remember when, where and in which implementation I first noticed the change. Cygwin maybe?

The behaviour I prefer can still be achieved like this:
od -c -t x1
or, as I discovered yesterday, much shorter as:
od -ctx1
But it’s cumbersome, to remember and to type, in comparison to od -ch. The output:

0000000   T   h   i   s   .  \n
         54  68  69  73  2e  0a
0000006

That makes sense.
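For comparison, a byte-by-byte dump needs no notion of endianness at all. The following Python sketch only approximates the layout of od -c -t x1; the name dump_bytes and the small escape table are my own assumptions, not taken from od.

```python
def dump_bytes(data):
    """Return (char_line, hex_line): each byte as a character and as 2-digit hex."""
    escapes = {0x09: "\\t", 0x0A: "\\n", 0x0D: "\\r"}
    chars = []
    for b in data:
        if b in escapes:
            chars.append(escapes[b])        # common control characters
        elif 0x20 <= b < 0x7F:
            chars.append(chr(b))            # printable ASCII
        else:
            chars.append(format(b, "03o"))  # anything else as 3-digit octal
    char_line = "".join(c.rjust(4) for c in chars)
    hex_line = "".join(format(b, "02x").rjust(4) for b in data)
    return char_line, hex_line


char_line, hex_line = dump_bytes(b"This.\n")
print(char_line)
print(hex_line)
```

Each hex code sits directly under its character, in file order, whatever the machine’s byte order.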

Multibyte

Of course, in 2021, a (now possibly wide) character and a byte are no longer equivalent. I am quite aware of that, and I wrote some tools of my own. But that is overkill in a simple ASCII-only situation, like in the following chapter.

Octal

There is also an option -b, not -o as I thought, that displays bytes in octal. That used to make sense, as od started as an octal tool; in fact od is short for “octal dump”.
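As a quick illustration of bytes in octal, this Python sketch prints each byte of the same test input as three octal digits. The function name octal_bytes is only for this example.

```python
def octal_bytes(data):
    """Return each byte of data as a 3-digit octal string."""
    return [format(b, "03o") for b in data]


print(" ".join(octal_bytes(b"This.\n")))  # 124 150 151 163 056 012
```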

That dates from the days of early Unix development, on some DEC machine that had 36-bit words, to be used as five 7-bit ASCII characters with one bit left over (for parity, perhaps), or as four 9-bit bytes. Or some such; my memory of what I once read somewhere may be inaccurate.

Nano shorthand

When still working with Windows, until August 2019, my text editor of choice was mdiNotepad, written by Tom Kostiainen. It had a simple and effective way to type often occurring strings with just a few keystrokes. I don’t remember how exactly it worked, but it did.

Now under Linux, and for longer still on my web server under FreeBSD, my editor of choice is nano. Its current versions (I have 4.8 under Mint, 5.8 under FreeBSD) are quite powerful, with search-and-replace, also using regular expressions when preferred, full Unicode support, syntax highlighting, etc. But it doesn’t have a usable macro facility.

Yes, you can record and replay one series of keystrokes, but that is forgotten as soon as you leave the program. That makes it virtually unusable. Macros are perhaps also called shortcuts, fast keys, or snippets – a term I learned recently, in the context of Gnome’s and Ubuntu’s Gedit. And Sublime Text also has that.

So I thought nano didn’t have that. But it does! Search strings in that link: “record a macro”, and Marco Diego, who is actually one of the maintainers of nano. You can bind keystrokes to a string, a command, or a combination of both. It’s great! See the output of man 5 nanorc, the section entitled REBINDING KEYS, for full details. I used it to get this:

   <a href=""
     target=></a>

In my ~/.config/nano/nanorc I have this to achieve that:

bind F21 "   <a href=""^M     target=></a>^[[D^[[D^[[D^[[D^[[D^[[A" all

Some noteworthy points:

F21 stands for shift-F9, so add 12. The docs say it works on some machines. Does on mine.
Note that nano quite smartly handles my embedded "" within a string enclosed in " ", without requiring any escape method.
The string can include cursor movements. I used this so after invoking the “a href” syntax, the cursor is put between the two double quotes enclosing the URL, ready for typing or (more commonly) pasting.
To enter a cursor command, first type M-V (the M is for meta, M- stands for holding the left Alt-key, or pressing Esc once). This is to indicate that the next key press is to be interpreted verbatim. So pressing Alt-V-leftarrow places the cursor-left escape sequence into the file.

Unfortunately, if you need six such cursor commands, that somewhat complicated sequence has to be repeated six times, including the M-V for verbatim. However normally, this is a one-time operation, so it doesn’t matter much.
The string resulting from the shortcut key press can extend over lines. In my example,

   <a href=""
     target=></a>

I want the target and string on a separate line from the one containing the URL, because the latter can be rather long. So in nano, while creating my ~/.nanorc or ~/.config/nano/nanorc, I typed M-V-return. In the editor display this looks like ^M, but it’s different from an actual caret (^) followed by an M. When saving the file, nano writes a return character to the file: \r, CR, or in hex: 0d (decimal 13).

A complication may occur when you read the config file into nano again, perhaps to amend it, because nano, seeing the \r character on its own, if that is in the first line of the file, thinks this is a text file in old Apple Macintosh style. Windows uses \r\n as its line terminator, Unix has \n, and the classic Mac OS used just \r. When saving the file, nano turns all other line ends into \r as well, making the file invalid as a config file for nano. A config file must follow the Unix way of indicating line ends.
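One way to check what happened to a saved config file is to count the three kinds of line ends in its raw bytes. This is a rough Python sketch; count_line_ends is a made-up helper, and the sample bytes stand in for a real nanorc, which you would read with open(path, "rb").read().

```python
def count_line_ends(data):
    """Count CRLF pairs, lone CRs and lone LFs in a bytes object."""
    crlf = data.count(b"\r\n")
    return {
        "crlf": crlf,
        "cr": data.count(b"\r") - crlf,  # CRs that are not part of a CRLF
        "lf": data.count(b"\n") - crlf,  # LFs that are not part of a CRLF
    }


# A healthy file: one verbatim CR inside the bound string, Unix line ends.
good = b'bind F21 "a\rb" all\nset tabsize 4\n'
# The same file after being saved with Mac line ends.
bad = b'bind F21 "a\rb" all\rset tabsize 4\r'

print(count_line_ends(good))  # {'crlf': 0, 'cr': 1, 'lf': 2}
print(count_line_ends(bad))   # {'crlf': 0, 'cr': 3, 'lf': 0}
```

A few lone CRs next to plenty of LFs are fine: those are the verbatim returns inside the strings. No LFs at all suggests the file was saved in Mac format.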

Solution: when done editing, save the file by pressing ^O (ctrl-o), not ctrl-s, then use M-M to disable Mac line ends. Open the file again, and repair any verbatim ^M line ends within a string, as described above.
To verify what was happening, I used od, which reminded me to write this article about its unfortunate behaviour with option -h. Some people would insert the phrase “The rest is history” here, but I find that a stupid cliché, so I don’t.