Benchmark 1



Speed comparison of various number crunching packages (version 1)

Speed of execution is an important criterion when choosing data analysis software. Since it can vary by a factor of 10 or more on the same computer, it can make the difference between a quick-reacting package and one that seems to take hours to calculate!

The benchmark tests we used are adapted from Stephan Steinhaus' benchmark v. 2. The major changes are: (1) the size of each test is adjusted so that it takes about 1 sec in the reference software, Matlab 6 R12, on our reference computer (see below), (2) only tests that run on all the software checked are kept, (3) they are placed in two categories ("matrix calculation" versus "matrix functions"), (4) a "programming" category is added to evaluate how fast the software executes scripts, (5) tests are adapted or optimized for recent versions of the software, and (6) only trimmed geometric means (worst and best results eliminated) are considered inside each category and for the overall index (warning! current scripts calculate trimmed arithmetic mean). Note that Stephan Steinhaus' report also evaluates the "richness" of the packages (which functions are present, and which ones are absent). Here, we only compare software for speed!
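As an illustration of point (6), here is a minimal Python sketch of a trimmed geometric mean. The function name is ours (hypothetical), and the sample timings are the Matlab 5.3 category I values from the results table of this report; this is not the code of the benchmark scripts themselves.

```python
import math

def trimmed_geometric_mean(times):
    """Geometric mean after dropping the single best and single worst value."""
    if len(times) < 3:
        raise ValueError("need at least three values to trim both ends")
    kept = sorted(times)[1:-1]  # eliminate the fastest and the slowest result
    log_sum = sum(math.log(t) for t in kept)
    return math.exp(log_sum / len(kept))

# Category I timings (sec) for Matlab 5.3, tests I.A to I.E
matlab53_cat1 = [1.00, 2.61, 7.91, 4.57, 6.91]
score = trimmed_geometric_mean(matlab53_cat1)
print(f"{score:.2f}")  # prints 4.35
```

With these five values, 1.00 and 7.91 are discarded and the geometric mean of 2.61, 4.57 and 6.91 is taken. A trimmed arithmetic mean of the same values would give about 4.70 instead.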

We have compared:

Matlab 6.0 (R12), our reference (download Matlab benchmark script and accompanying gcd2.m custom function; text file, 10 Kb)
Matlab 5.3 (R11), to show the drastic changes between v. 5.3 and v. 6.0 (same script)
Splus 6 R2, an excellent programmable statistics software (Splus benchmark script; text file, 10 Kb)
R 1.5.1, a free "clone" of Splus (R benchmark script; text file, 12 Kb)
O-Matrix 5.1, a cheap but very fast package, that can run most Matlab scripts (O-Matrix native mode benchmark & O-Matrix Matlab mode benchmarks scripts; text files, 10 Kb & 9 Kb respectively)
Octave 2.1.36, a free "clone" of Matlab 4 (Octave benchmark script; text file, 9 Kb). The version used was compiled with an optimized ATLAS library.
Scilab 2.6, a very complete free software, "not unlike" Matlab (Scilab benchmark script; text file, 10 Kb)
Rlab 2.1, a free package quite similar to Matlab, but whose development seems to be discontinued (Rlab benchmark script; text file, 10 Kb)
Ox 3.00, a very efficient matrix package similar to Gauss and free for academic use (Ox benchmark script; text file, 10 Kb)

Tests are:

I. Matrix calculation: evaluates the ability to perform some common matrix computations.

I.A: creation, transposition, deformation of a 1200x1200 matrix. This test evaluates the ability to create and manipulate matrices.
I.B: creation of a 1250x1250 normally distributed random matrix and taking the 1000th power of all its elements. Evaluates the speed at which a random matrix is processed element by element.
I.C: sorting of 1,100,000 random values. Tests the speed of a sorting operation.
I.D: 550x550 cross-product matrix (b = a' * a). Evaluates matrix operations.
I.E: linear regression over a 700x700 matrix (b = a \ b'). Tests the speed of execution for linear models evaluation.

II. Matrix functions: evaluates speed of some preprogrammed matrix functions.

II.A: fast Fourier transform over 900,000 values. The Fourier transform is a commonly used method in signal processing.
II.B: eigenvalues of a 220x220 random matrix. Eigenvalues are used in multivariate analyses (PCA, ...).
II.C: determinant of a 750x750 random matrix. Calculation of the determinant of a matrix is a common, but unequally optimized, function in matrix calculation packages.
II.D: Cholesky decomposition of a 1000x1000 matrix. Another commonly preprogrammed function.
II.E: inverse of a 500x500 random matrix. A computationally intensive function for which various algorithms exist (with very different performances).

III. Programming: evaluates efficiency in running scripts and custom functions.

III.A: 225,000 Fibonacci numbers calculation. This evaluates the speed of vector calculation.
III.B: creation of a 1500x1500 Hilbert matrix. Evaluates performance of matrix calculation in scripts.
III.C: greatest common divisors of 35,000 pairs. Tests the potential of recursive functions.
III.D: creation of a 220x220 Toeplitz matrix. Checks the speed of execution of loops.
III.E: Escoufier's method on a 22x22 random matrix. Tests various aspects of programming in a single test.
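To give an idea of what tests III.B and III.C exercise, here is a small Python sketch. It is only an illustration under assumptions: the function gcd2 is named after the gcd2.m file mentioned above, but its body (recursive Euclid) is our guess at what such a function looks like, and the range of the random pairs is arbitrary.

```python
import random

def gcd2(a, b):
    """Greatest common divisor by recursive Euclid (assumed implementation)."""
    return a if b == 0 else gcd2(b, a % b)

def hilbert(n):
    """n x n Hilbert matrix as nested lists: H[i][j] = 1 / (i + j + 1), 0-based."""
    return [[1.0 / (i + j + 1) for j in range(n)] for i in range(n)]

# 35,000 random pairs, as in test III.C (the value range is an assumption)
rng = random.Random(0)
pairs = [(rng.randint(1, 1000), rng.randint(1, 1000)) for _ in range(35_000)]
divisors = [gcd2(a, b) for a, b in pairs]

# A small Hilbert matrix; test III.B uses 1500x1500
h = hilbert(4)
print(gcd2(48, 18), h[0][:3])
```

The recursive call in gcd2 is the feature being timed in III.C, which is why a built-in gcd function, where available, would not test the same thing.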

Note that tests III.A-E do not use the most optimized algorithm for each package, but they do test similar features in all of them. For instance, a matrix algorithm for test III.D is often much more efficient, as is a preprogrammed toeplitz() function where one exists. Yet we keep the loop algorithm in all cases, just to test the speed of loop execution in scripts!
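The loop approach kept for test III.D can be sketched as follows in Python. The entry formula abs(j - k) + 1 is an assumption chosen for illustration (it does yield a Toeplitz matrix, since each entry depends only on j - k). The second function builds the same matrix from its first row and stands in for the more compact alternatives mentioned above.

```python
def toeplitz_loop(n):
    """Fill an n x n Toeplitz matrix element by element with two nested loops."""
    b = [[0] * n for _ in range(n)]
    for j in range(n):
        for k in range(n):
            b[j][k] = abs(j - k) + 1  # assumed entry formula
    return b

def toeplitz_from_row(n):
    """Same matrix, built by indexing into its first row."""
    first_row = list(range(1, n + 1))
    return [[first_row[abs(j - k)] for k in range(n)] for j in range(n)]

small = toeplitz_loop(3)
print(small)  # prints [[1, 2, 3], [2, 1, 2], [3, 2, 1]]
```

Both functions return the same result; only the first one stresses explicit loop execution, which is what the test is meant to measure.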

Results

The tests were run three times on a Celeron 500 MHz computer with 256 MB of memory under Windows 2000 Professional, and the mean value was recorded. The following table presents the results:
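The "three runs, mean recorded" procedure can be sketched as a small timing harness in Python. The helper name mean_elapsed and the reduced workload (sorting 100,000 values instead of the 1,100,000 of test I.C) are our assumptions, not part of the original scripts.

```python
import random
import time

def mean_elapsed(func, repeats=3):
    """Run func() `repeats` times and return the mean wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return sum(timings) / len(timings)

def sort_test():
    """Scaled-down stand-in for test I.C: sort random values."""
    values = [random.random() for _ in range(100_000)]
    values.sort()

elapsed = mean_elapsed(sort_test)
print(f"mean over 3 runs: {elapsed:.3f} sec")
```

Note that this sketch also times the creation of the random values. A stricter harness would generate the data outside the timed region.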

Test (sec)  Matlab 6.0  Matlab 5.3  Splus 6 r2  R 1.5.1  O-Matrix native  O-Matrix Ml mode  Octave 2.1.36  Scilab 2.6  Rlab 2.1  Ox 3.00
I. Matrix calculation
I.A         0.95        1.00        4.11        2.10     1.03             1.35              4.30           2.95        1.92      1.21
I.B         0.99        2.61        3.94        2.90     2.60             2.48              3.03           2.32        2.38      2.84
I.C         1.04        7.91        3.61        1.26     1.09             1.10              8.03           2.72        4.00      3.47
I.D         0.98        4.57        1.21        1.90     0.85             0.86              0.95           14.84       8.92      1.79
I.E         1.07        6.91        5.59        5.66     0.72             0.96              4.37           7.67        6.31      5.60
Score       1.00        4.35        3.88        2.26     0.98             1.13              3.85           3.95        3.92      2.60
II. Matrix functions
II.A        0.99        4.04        4.78        2.76     2.21             2.16              3.09           4.06        2.23      3.83
II.B        0.90        1.41        0.83        1.10     0.42             0.42              2.25           2.94        3.09      1.09
II.C        1.06        7.96        3.64        5.82     0.88             0.92              5.42           9.14        7.50      3.34
II.D        1.29        4.59        7.22        3.42     0.96             0.96              1.19           7.81        6.64      2.66
II.E        1.10        6.18        6.15        5.63     0.75             0.76              3.92           7.61        7.65      2.89
Score       1.05        4.86        4.75        3.76     0.86             0.88              3.01           6.22        5.36      2.95
III. Programming
III.A       1.04        0.97        0.84        0.55     0.52             0.39              0.80           0.60        0.49      0.51
III.B       0.98        2.03        1.68        1.29     1.79             2.09              0.74           1.92        1.21      1.05
III.C       1.03        0.92        0.86        1.12     0.38             0.43              1.05           1.23        6.03      0.84
III.D       0.92        0.78        22.54       2.52     0.20             0.26              9.27           4.52        0.63      0.13
III.E       1.00        1.06        14.48       0.50     0.22             0.24              2.04           1.32        0.73      0.16
Score       1.00        0.98        2.76        0.92     0.35             0.35              1.20           1.46        0.82      0.41
Total       15.34       52.95       81.48       38.51    14.62            15.38             50.46          71.65       59.73     31.41
Overall     1.01        2.52        3.46        1.99     0.77             0.81              2.51           3.64        3.23      1.75

Comments

The higher the result (in seconds), the slower the test executes; low values thus mean higher performance. Results lower than 0.90 (faster than the reference) are in green; results larger than 5.00 (more than five times slower than the reference) are in violet. We immediately see that Matlab 6.0 tends to be faster (as an overall trend) than the others in categories I and II, but slower in category III (programming).

Matlab 6.0 uses more recent and better optimized matrix calculation libraries than Matlab 5.3. Consequently, Matlab 6.0 is more than twice as fast as Matlab 5.3! With its much better user interface as well, it is worthwhile to upgrade to Matlab 6.x if you still use an older version! On the other hand, scripts are not faster (unless they use optimized preprogrammed functions, of course). Matlab is a well-recognized standard in many fields requiring matrix computation, such as signal processing. Because it is one of the fastest, richest and most commonly used packages, and has one of the best user interfaces, we chose Matlab 6.0 as the reference for matrix programming software. However, it is quite expensive (compared to the open source alternatives), and often has to be expanded with optional toolboxes!

Splus is another well-recognized standard, this time in statistics, a field where Matlab is comparatively weaker, even with additional toolboxes. Splus is also very expensive, but it excels in almost all fields of statistics. It is appreciated by professionals for its versatility, and for the ease of exploring statistical models in its environment. Its limits are reached when working with huge datasets. In this case, SAS (not evaluated here) is considered to be faster, and thus more efficient. In our tests, Splus is somewhat slower than Matlab 6.0, especially in loop programming, where it is desperately slow! However, Splus proposes alternatives: the For() function for optimized loops, and the apply() family of functions that "vectorize" loops. If it is slower than Matlab, it is also because many functions calculate extra diagnostics useful for statistical analysis. This is not taken into account in our raw benchmark, but could turn out to be a decisive advantage in practice, depending on the use made of the calculation.

R is developed by a very active community of statisticians and is evolving quickly. At the moment, it proposes about the same range of functions as Splus does... and it is totally free! It also runs on almost all platforms (Windows, Macintosh, Unix/Linux) and it does not have the "loop problem" of Splus (and also provides apply() and the like to accelerate loops). However, it does not propose the same nice user interface with menus and dialog boxes (GUI) as Splus 6 does (though many professionals do not care about that, because they prefer to use scripts and the command line for finer control over their calculations). It is also somewhat slower than Matlab 6.0, but compares to Matlab 5.3. Some serious performance weaknesses (noted with this test in earlier versions) are now corrected, and it behaves more homogeneously in all tests. It is object-oriented, meaning it is a little harder to learn than Matlab, but much more flexible once one understands its object-oriented features. The same remarks as for Splus also apply to R.

It is also possible to recompile R to use an optimized linear algebra library (ATLAS) instead of the standard BLAS library used by default. On the test computer, with ATLAS optimized for winNT, PII with SSE1, gains are mainly visible in test I.D (cross-product), which drops from 1.90 sec to 0.59 sec. To a lesser extent, test II.C (determinant) is also optimized (from 5.82 sec to 4.53 sec). Although significant, these improvements do not change its position in the midrange, about halfway between the fastest and slowest software. R is an excellent choice for those who are looking for statistical analysis because it offers features similar to those of Splus, and it is free and faster! It compares with Matlab 5.3 in most tests, both in matrix computation and programming ability.
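Expressed as speed-up factors, the two ATLAS gains quoted above work out as follows. This is a short Python sketch using only the timings given in the text; the helper name speedup is ours.

```python
def speedup(before, after):
    """Ratio of the old time to the new time (values above 1 mean faster)."""
    return before / after

cross_product = speedup(1.90, 0.59)  # test I.D, standard BLAS vs ATLAS
determinant = speedup(5.82, 4.53)    # test II.C, standard BLAS vs ATLAS
print(f"{cross_product:.2f}x, {determinant:.2f}x")  # prints 3.22x, 1.28x
```

So the cross-product runs a little more than three times faster with ATLAS, while the determinant gains less than 30%.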

Being a well-recognized standard, Matlab has several contenders that propose similar features for a lower price (O-Matrix, Octave, Scilab, Rlab). Among them, only one also competes with Matlab 6.0 on performance: O-Matrix. Overall, O-Matrix is the fastest matrix computation package we have tested. It is much less expensive than Matlab, and it provides reasonable compatibility. However, O-Matrix does not propose the same range of specialized toolboxes and it runs only on Windows.

The three others (Octave, Scilab & Rlab) are all free open source software. Their performance is somewhat lower than that of Matlab 6.0 and compares better with Matlab 5.3. Octave aims to be fully compatible with the base version of Matlab 4.2. Its strongest point among free Matlab "clones" is its language compatibility. It is also the fastest where, presumably, the ATLAS library is used. One should note that Octave runs under the cygwin emulation of Unix in Windows, and this probably has some negative impact on its performance. The native Unix/Linux version should run comparatively faster.

Scilab proposes many more functions than Octave, but it is not 100% compatible with the Matlab language. Rlab is neither fully compatible, nor very rich. Moreover, its development seems to have stopped. However, since its source code is distributed and modifiable, someone else could continue the project (although it is perhaps wiser to contribute to ongoing open source projects: either Octave or Scilab).

Ox stands a little apart. It is the only package that does not claim compatibility with one of the two standards previously cited: Matlab or Splus. However, it is partly compatible with Gauss, another high quality commercial matrix calculation software regarded as a standard in econometrics (not evaluated here, but you will find detailed tests in Stephan Steinhaus' report). Its performance in categories I and II is closer to that of Matlab 5.3 (although a little better) than to that of Matlab 6.0. However, regarding category III, programming, it is more than twice as fast as Matlab 6.0! As it is a lightweight console application that can easily run scripts in batch mode, Ox is an excellent choice for running matrix calculation scripts from within various kinds of applications. O-Matrix is even quicker, but it is unfortunately not embeddable in a custom application.

Conclusions

Choosing data analysis software is a difficult task. Among the candidates, "matrix languages" (like all the software we evaluated here) are very flexible because they are programmable and able to work very efficiently with matrices (by definition!), which are widely used in data analysis. However, they differ from each other in terms of price, richness (the number of functions provided), usability (including the quality of their user interface, their status as an established standard or not, the quality of their support, their availability on different platforms like Windows, Macintosh, Unixes or Linux), and finally, in terms of pure performance. We evaluated the latter criterion here by using adapted versions of Stephan Steinhaus' benchmarks. Considering the results obtained with our benchmark tests (but beware of the limits of this survey: only a few features were tested, and solely on a Windows platform!), one can conclude:

Matlab 6.0 deserves its status as the standard in matrix languages.
Splus and especially R are good alternatives with stronger potential in statistics.
O-Matrix is the fastest matrix language we have tested on Windows.
Currently, no free "clone" of Matlab is as fast as Matlab 6.0 itself.
Octave is language compatible with Matlab, but not a top performer on Windows.
Scilab is a free alternative to Matlab for "richness" more than for performance.
Ox is a very efficient matrix language, especially for batch processing of scripts.

Last update: 08/03/2003