Continued from the previous.
This is still about the 2nd of July 2021, when I was compiling Mikio
Hirabayashi’s QDBM version 1.8.78 and Hyper Estraier version 1.4.13,
under Ubuntu Server 20.04.2 LTS (GNU/Linux 5.4.0-77-generic x86_64),
using gcc version 9.3.0-17 as the C compiler. The compiler options, as
set by the makefile, were:
-Wall -pedantic -fPIC -fsigned-char -O3 -fomit-frame-pointer -fforce-addr -minline-all-stringops
There are several compiler warnings about ignored return values, for
functions like nice, system, and, more suspiciously, write and fread
(why mix man 2 and man 3 functions, I wonder?). I admit
there are situations in which the chances of failure for a file read or
write are slim (file open was checked, plenty of disk space), but of course
testing the return value, and responding appropriately to an unexpected
failure, is always better.
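As a minimal sketch of what testing the return value could look like, here are two small wrappers. The names write_all and fread_all are hypothetical helpers invented for this illustration; they are not part of QDBM or Hyper Estraier.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: write all len bytes to fd, retrying after
   short writes and after interruption by a signal (EINTR).
   Returns 0 on success, -1 on failure (errno is left as set by write). */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    assert(buf != NULL || len == 0);
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing: try again */
            return -1;          /* real failure, e.g. disk full or bad fd */
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: read exactly len bytes from fp.
   Returns 0 on success, -1 on a short read or an error. */
int fread_all(FILE *fp, void *buf, size_t len)
{
    assert(fp != NULL);
    return fread(buf, 1, len, fp) == len ? 0 : -1;
}
```

A caller would then write something like if (write_all(fd, buf, len) != 0) and report or handle the error, instead of silently carrying on.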
There were several warnings similar to the following:
cabin.c: In function ‘cbdatestrwww’:
cabin.c:3066:47: warning: ‘%s’ directive writing up to 63 bytes into a region of size between 0 and 45 [-Wformat-overflow=]
 3066 |   sprintf(date, "%04d-%02d-%02dT%02d:%02d:%02d%s", year, mon, day, hour, min, sec, tzone);
I analysed some of them, and found that the compiler is right (of course!) and overflow can occur in theory. But in practice, it won’t. In the example given, overflow would only occur if the date and time fields were corrupted, and contained much higher integer values than is normal, considering their meaning.
So the compiler is pedantic here, as requested, and this probably isn’t the cause of the segmentation fault that kept me from using this software under Ubuntu Server. However, it is of course better to write code that even a pedantic compiler has no comments about, whenever possible.
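One common way to keep even a pedantic compiler quiet about this kind of call is to use snprintf, which never writes past the given buffer size, and to check its return value for truncation. The sketch below uses the same format string as the warning, but the function format_w3c_date and its parameters are made up for this illustration; it is not how cabin.c is actually written.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: format a date and time with a time zone suffix
   into buf, which has room for bufsiz bytes including the final '\0'.
   Returns 0 on success, -1 if the result did not fit or formatting failed. */
int format_w3c_date(char *buf, size_t bufsiz,
                    int year, int mon, int day,
                    int hour, int min, int sec, const char *tzone)
{
    int n;

    assert(buf != NULL && tzone != NULL);
    n = snprintf(buf, bufsiz, "%04d-%02d-%02dT%02d:%02d:%02d%s",
                 year, mon, day, hour, min, sec, tzone);
    /* snprintf returns the length it wanted to write; anything that is
       not smaller than bufsiz means the output was truncated. */
    return (n >= 0 && (size_t)n < bufsiz) ? 0 : -1;
}
```

With corrupted date fields the output would then be cut off and reported as an error, rather than overflowing the buffer.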
Then there’s this:
In file included from villa.h:26,
                 from villa.c:19:
In function ‘vlleafaddrec’,
    inlined from ‘vlput’ at villa.c:299:7:
./cabin.h:1308:16: warning: argument 2 range [18446744071562067968, 18446744073709551615] exceeds maximum object size 9223372036854775807 [-Wallo>
 1308 |   (((CB_ptr) = realloc((CB_ptr), (CB_size))) ? (CB_ptr) : cbmyfatal("out of memory"))
      |                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
./cabin.h:1514:7: note: in expansion of macro ‘CB_REALLOC’
 1514 |       CB_REALLOC((CB_list)->array, (CB_list)->anum * sizeof((CB_list)->array[0])); \
      |       ^~~~~~~~~~
villa.c:2241:11: note: in expansion of macro ‘CB_LISTPUSHBUF’
 2241 |           CB_LISTPUSHBUF(recp->rest, tbuf, tsiz);
I translated the long decimal numbers into hexadecimal, which makes them clearer:
18446744071562067968 = FFFFFFFF80000000
18446744073709551615 = FFFFFFFFFFFFFFFF (64 bits)
 9223372036854775807 = 7FFFFFFFFFFFFFFF
Is this an issue of signed and unsigned values? Or 32 versus 64 bits?
Or both?
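What can be said for certain is that the two bounds of the range are exactly what the negative 32-bit integers become when converted to a 64-bit unsigned type. The sketch below only demonstrates that arithmetic; whether this is really what happens in the QDBM macros is an assumption that I have not verified. The function names int_as_size and checked_bytes are hypothetical, and checked_bytes merely shows the kind of guard that would rule such sizes out.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Show what a signed 32-bit value becomes as a 64-bit unsigned size:
   it is sign-extended first, so negative values end up near 2^64. */
uint64_t int_as_size(int32_t n)
{
    return (uint64_t)(int64_t)n;
}

/* Hypothetical guard: compute count * elem_size as a size_t, returning 0
   when the count is not positive or the product would overflow. */
size_t checked_bytes(int count, size_t elem_size)
{
    assert(elem_size > 0);
    if (count <= 0)
        return 0;
    if ((size_t)count > SIZE_MAX / elem_size)
        return 0;
    return (size_t)count * elem_size;
}
```

Here int_as_size(INT32_MIN) gives 18446744071562067968 (FFFFFFFF80000000) and int_as_size(-1) gives 18446744073709551615 (FFFFFFFFFFFFFFFF), the two bounds from the warning. So "both" seems a plausible answer: a signed 32-bit quantity ending up in a 64-bit unsigned size.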
The FreeBSD 11.2 and 12.2 systems on which Hyperestraier did and still
does work without dumping core are also 64-bit. But perhaps the compilation
and executable are not? I checked using the file command, and no, all of
/usr/local/bin/est* are listed as
“ELF 64-bit LSB executable, x86-64”. So why is there a segmentation
fault under Ubuntu and not under FreeBSD? Well, implementation details,
as I mentioned before.
Due to all the macros and typedef’d structures I find it hard, even though I have been programming in C since 1985, of which professionally between 1990 and 2004, to see what is really going on here, whether this is really dangerous, and whether it could explain the segmentation fault. Well, somebody else’s code is always harder to understand than one’s own.
During the compilation, there are several occurrences of this type of warning. Not reassuring.
The source file estraier.c produces several warnings like this:
estraier.c: In function ‘est_aidx_attr_narrow’:
estraier.c:7574:10: warning: comparison with string literal results in unspecified behavior [-Waddress]
 7574 |   if(cop == ESTOPSTROREQ && sign && !sval){
where the variable ‘cop’ is of type const char * and in estraier.h it says #define ESTOPSTROREQ "STROREQ".
Where and how the compiler stores string literals, and whether duplicates
are combined or stored separately, per source file or for the whole program,
all of that is implementation dependent. So I agree with the compiler that
comparisons of this type are unwise. Yet, I suspect in practice this doesn’t
cause any real problems here.
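The usual way to avoid depending on where the compiler stores the literal is to compare the contents of the strings with strcmp, rather than their addresses. A minimal sketch follows; the macro is the one quoted from estraier.h, but the function is_stroreq is a hypothetical name for this illustration, and I have not checked whether the original code relies on pointer identity on purpose.

```c
#include <assert.h>
#include <string.h>

#define ESTOPSTROREQ "STROREQ"  /* as quoted from estraier.h */

/* Hypothetical helper: returns 1 if cop holds the text "STROREQ",
   wherever that text happens to be stored, and 0 otherwise. */
int is_stroreq(const char *cop)
{
    assert(cop != NULL);
    return strcmp(cop, ESTOPSTROREQ) == 0;
}
```

With this, a copy of the string in a local array compares equal too, which a comparison with == would not guarantee.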
The compiler warnings I would have swallowed, if only Hyperestraier had worked. But it doesn’t. Already in the database checking phase of the installation, and also later in hyperestraier itself, it runs into a segmentation fault:
rm -rf casket*
LD_LIBRARY_PATH=.:/lib:/usr/lib:/usr/local/lib:/home/rudhar/lib:/usr/local/lib \
  ./odtest write casket 500 50 5000
<Writing Test>
  name=casket dnum=500 wnum=50 pnum=5000 ibnum=-1 idnum=-1 cbnum=-1 csiz=-1
......make[1]: *** [Makefile:311: check] Segmentation fault
I used gdb (GNU’s debugger) to find out where it happens, and the
result was:
Program received signal SIGSEGV, Segmentation fault.
0x000000000040d795 in cblistpush (list=0x53adb0, ptr=ptr@entry=0x7fffffffe0a0 "00000192", size=8, size@entry=-1) at cabin.c:780
780       CB_MALLOC(list->array[index].dptr, (size < CB_DATUMUNIT ? CB_DATUMUNIT : size) + 1);
Via the macro definition, this is indeed a malloc. It isn’t easy
to see how a malloc could cause a segmentation fault. A realloc
could, if an invalid pointer was passed. Therefore I suspect that the actual
problem occurred already before the malloc, and that some of the
memory areas not intended for use by application code, but for internal
bookkeeping purposes, have been overwritten and corrupted.
The cause, and so the remedy, of bugs of this type can be hard to find. I am not the person who is going to do it.
I’m not the only one experiencing this segmentation fault. I found
this Japanese site, which quotes the same segmentation fault
from ./odtest write casket 500 50 5000 that I had,
and from the translation by Google Translate, I learnt that it says:
“When I searched for various information, there was a similar
report. Apparently it happens with gcc 7 and not with gcc 6.”
It quotes some other site (now non-existent) that said:
“estcmd built with gcc-7.2.0 caused segfault.
I sent mails to the author, but I couldn't get a reply.”
That night of the 2nd of July 2021, I was so fed up with the whole situation, of local search engines that don’t properly install, don’t properly work, or work a few years, but not on all platforms, and that are not properly maintained, that I took the brave step: I decided to write one myself. Couldn’t be so hard, if you keep it simple.
More on that in the next episodes.
Copyright © 2021 by R. Harmsen, all rights reserved.