Technical opinion: comparing Java vs. C/C++ efficiency differences to interpersonal differences

The relative efficiency of Java programs is much discussed today, particularly in comparison to well-established implementation languages such as C or C++. Java is often considered very slow and memory intensive. However, most benchmarks compare only a single implementation of a program in, say, C++, to one implementation in Java, neglecting the possibility that alternative implementations might compare differently. In contrast, this article presents a comparison of 40 different implementations of the same program, written by 38 different programmers (two of whom each contributed two Java implementations). For one particular programming task, the data compare the average relative performance between languages as well as the performance differences from one programmer to another within a group of programs written in the same language. As it turns out, these interpersonal differences are larger than those between the languages.

Origin of the Data

The 40 program implementations investigated were created by graduate students during the course of a controlled experiment on a different question (L. There are 24 programs written in Java, 11 in C++, and 5 in C. Each program was written by a single person. These programmers had an average of 8 years of programming experience and estimated that they had previously written an average of 100 KLOC each (median: 20 KLOC).

All programs implement the same functionality, namely a conversion of telephone numbers into word strings. The program first loads a dictionary of 73,113 words into memory from a flat text file (one word per line, 93 KB overall). Then it reads "telephone numbers" from another file, converts them one by one, and prints the results.
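As a rough illustration of the program structure just described, and of the conversion rule spelled out in the next paragraph, the following Python sketch indexes the dictionary by digit string and then enumerates all encodings of a number recursively. It is not any participant's implementation. All function names are hypothetical, the grouping of the listed letters into the digits 0 to 9 is an assumption, and so is the reading of "a single digit" as forbidding two fallback digits in a row. Input numbers are assumed to contain digits only.

```python
from collections import defaultdict

# Letters in the order given by the task; the group boundaries and the
# assignment of groups to digits 0..9 are an assumption of this sketch.
GROUPS = ["e", "jnq", "rwx", "dsy", "ft", "am", "civ", "bku", "lop", "ghz"]
CHAR_TO_DIGIT = {ch: str(d) for d, letters in enumerate(GROUPS) for ch in letters}


def word_to_digits(word):
    """Map a word to its digit string, ignoring case and non-letters."""
    return "".join(CHAR_TO_DIGIT[c] for c in word.lower() if c in CHAR_TO_DIGIT)


def build_index(words):
    """Group dictionary words by the digit string they encode."""
    index = defaultdict(list)
    for word in words:
        key = word_to_digits(word)
        if key:
            index[key].append(word)
    return index


def encodings(digits, index, start=0, prev_was_digit=False):
    """Yield every encoding of digits[start:] as a list of words and digits."""
    if start == len(digits):
        yield []
        return
    found_word = False
    # Try every dictionary word whose digit string matches at this position.
    for end in range(start + 1, len(digits) + 1):
        for word in index.get(digits[start:end], ()):
            found_word = True
            for rest in encodings(digits, index, end, False):
                yield [word] + rest
    # Fallback: keep one digit, but only if no word fits at this position
    # and (by assumption) the previous element was not itself a digit.
    if not found_word and not prev_was_digit:
        for rest in encodings(digits, index, start + 1, True):
            yield [digits[start]] + rest
```

With the toy dictionary `["am", "ma", "fee"]`, the number `55400` yields the two solutions `am fee` and `ma fee`, `554` yields `am 4` and `ma 4`, and `5540` has no solution at all. A real program would read the dictionary and the numbers from files and print one line per solution.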
The conversion is defined by a fixed mapping of characters to digits as follows:

    e | j n q | r w x | d s y | f t | a m | c i v | b k u | l o p | g h z
    0 |   1   |   2   |   3   |  4  |  5  |   6   |   7   |   8   |   9

The task of the program is to find a sequence of words such that the sequence of characters in these words exactly corresponds to the sequence of digits in the phone number. All possible solutions must be found and printed. The solutions are created word by word; if no word from the dictionary can be inserted at some point during that process, a single digit from the phone number can appear in the result at that position. Many phone numbers have no solution at all.

Here is an example of the program output for …