Using the Mathematical Acceleration Subsystem (MASS)

The MASS libraries consist of a library of scalar functions, described in Using the scalar library, and a set of vector libraries tuned for specific architectures, described in Using the vector libraries. The compiler calls the functions in both the scalar and vector libraries automatically at certain levels of optimization, but you can also call them explicitly in your programs. Note that accuracy and exception handling might not be identical in MASS functions and system library functions.

Compiling and linking a program with MASS describes how to compile and link a program that uses the MASS libraries, and how to selectively use the MASS scalar library functions in concert with the regular libm.a scalar functions.

Using the scalar library

The MASS scalar library, libmass.a, contains an accelerated set of frequently used math intrinsic functions in the AIX math library. When you compile programs with any of the following options:

the compiler automatically uses the faster MASS functions for all math library functions (with the exception of atan2, dnint, sqrt, and rsqrt). In fact, the compiler first tries to "vectorize" calls to math library functions by replacing them with the equivalent MASS vector functions; if it cannot do so, it uses the MASS scalar functions. When the compiler performs this automatic replacement of math library functions, it uses versions of the MASS functions contained in the system library libxlopt.a; you do not need to add any special calls to the MASS functions in your code, or to link to the libxlopt library.

If you are not using any of the optimization options listed above, and/or want to explicitly call the MASS scalar functions, you can do so by:

  1. Providing the prototypes for the functions (except dnint), by including math.h in your source files.
  2. Providing the prototypes for dnint, by including mass.h in your source files.
  3. Linking the MASS scalar library libmass.a with your application. For instructions, see Compiling and linking a program with MASS.

The MASS scalar functions accept double-precision parameters and return a double-precision result, and are summarized in Table 21.

Table 21. MASS scalar functions
Function  Description                                     Prototype
sqrt      Returns the square root of x                    double sqrt (double x);
rsqrt     Returns the reciprocal of the square root of x  double rsqrt (double x);
exp       Returns the exponential function of x           double exp (double x);
expm1     Returns (the exponential function of x) - 1     double expm1 (double x);
log       Returns the natural logarithm of x              double log (double x);
log1p     Returns the natural logarithm of (x + 1)        double log1p (double x);
sin       Returns the sine of x                           double sin (double x);
cos       Returns the cosine of x                         double cos (double x);
tan       Returns the tangent of x                        double tan (double x);
atan      Returns the arctangent of x                     double atan (double x);
atan2     Returns the arctangent of x/y                   double atan2 (double x, double y);
sinh      Returns the hyperbolic sine of x                double sinh (double x);
cosh      Returns the hyperbolic cosine of x              double cosh (double x);
tanh      Returns the hyperbolic tangent of x             double tanh (double x);
dnint     Returns the nearest integer to x (as a double)  double dnint (double x);
pow       Returns x raised to the power y                 double pow (double x, double y);

The trigonometric functions (sin, cos, tan) return NaN (Not-a-Number) for large arguments (abs(x)>2**50*pi).
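
If large arguments can occur, one option is to test the argument before calling sin, cos, or tan. The following is a sketch only; the function name mass_trig_in_range is hypothetical and is not part of MASS.

```c
#include <math.h>

/* Hypothetical guard: returns 1 when |x| <= 2**50 * pi, the range outside
   of which the MASS sin, cos, and tan are documented to return NaN.
   Returns 0 for larger magnitudes and for NaN input. */
int mass_trig_in_range(double x)
{
    const double pi = 3.14159265358979323846;
    const double limit = ldexp(pi, 50);   /* pi scaled by 2**50 */
    return fabs(x) <= limit;
}
```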

Note:
In some cases the MASS functions are not as accurate as the libm.a library, and they might handle edge cases differently (sqrt(Inf), for example).

Using the vector libraries

When you compile programs with any of the following options:

the compiler automatically attempts to vectorize calls to system math functions by calling the equivalent MASS vector functions (with the exceptions of functions vdnint, vdint, vsincos, vssincos, vcosisin, vscosisin, vqdrt, vsqdrt, vrqdrt, vsrqdrt, vpopcnt4, and vpopcnt8). For automatic vectorization, the compiler uses versions of the MASS functions contained in the system library libxlopt.a; you do not need to add any special calls to the MASS functions in your code, or to link to the libxlopt library.

If you are not using any of the optimization options listed above, and/or want to explicitly call any of the MASS vector functions, you can do so by including the header massv.h in your source files and linking your application with any of the following vector library archives (information on linking is provided in Compiling and linking a program with MASS):

libmassv.a
The general vector library.
libmassvp3.a
Contains some functions that have been tuned for the POWER3 architecture. The remaining functions are identical to those in libmassv.a.
libmassvp4.a
Contains some functions that have been tuned for the POWER4 architecture. The remaining functions are identical to those in libmassv.a. If you are using a PPC970 machine, this library is the recommended choice.
libmassvp5.a
Contains some functions that have been tuned for the POWER5 architecture. The remaining functions are identical to those in libmassv.a.

All libraries can be used in either 32-bit or 64-bit mode.

The single-precision and double-precision floating-point functions contained in the vector libraries are summarized in Table 22. The integer functions contained in the vector libraries are summarized in Table 23. Note that in C and C++ applications, only call by reference is supported, even for scalar arguments.

With the exception of a few functions (described below), all of the floating-point functions in the vector libraries accept three parameters:

The functions are of the form function_name (y,x,n), where y is the target vector, x is the source vector, and n is the vector length. The parameters y and x are assumed to be double-precision for functions with the prefix v, and single-precision for functions with the prefix vs. As an example, the following code:

#include <massv.h>

double x[500], y[500];
int n;
n = 500;
...
vexp (y, x, &n);

outputs a vector y of length 500 whose elements are exp(x[i]), where i=0,...,499.

The integer functions are of the form function_name (x, n), where x is a pointer to a vector of 4-byte (for vpopcnt4) or 8-byte (for vpopcnt8) numeric objects (integral or floating-point), and n is the vector length.
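
To illustrate what the integer functions compute, the following reference loop counts the 1 bits in a vector of 32-bit objects. It is a sketch, not the MASS code; the name ref_popcnt4 is hypothetical, and it only mirrors the (x, n) calling form of vpopcnt4.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical reference: returns the total number of 1 bits in the
   *n 32-bit objects starting at x. The objects may be integral or
   floating-point, so each one is copied into a uint32_t first. */
unsigned int ref_popcnt4(const void *x, const int *n)
{
    const unsigned char *bytes = (const unsigned char *)x;
    unsigned int total = 0;
    for (int i = 0; i < *n; i++) {
        uint32_t v;
        memcpy(&v, bytes + (size_t)i * sizeof v, sizeof v);
        while (v != 0) {
            v &= v - 1;        /* clear the lowest set bit */
            total++;
        }
    }
    return total;
}
```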

Table 22. MASS floating-point vector functions
Double-precision function Single-precision function Description Double-precision function prototype Single-precision function prototype
vacos vsacos Sets y[i] to the arccosine of x[i], for i=0,..,*n-1 void vacos (double y[], double x[], int *n); void vsacos (float y[], float x[], int *n);
vacosh vsacosh Sets y[i] to the arc hyperbolic cosine of x[i], for i=0,..,*n-1 void vacosh (double y[], double x[], int *n); void vsacosh (float y[], float x[], int *n);
vasin vsasin Sets y[i] to the arcsine of x[i], for i=0,..,*n-1 void vasin (double y[], double x[], int *n); void vsasin (float y[], float x[], int *n);
vasinh vsasinh Sets y[i] to the arc hyperbolic sine of x[i], for i=0,..,*n-1 void vasinh (double y[], double x[], int *n); void vsasinh (float y[], float x[], int *n);
vatan2 vsatan2 Sets z[i] to the arctangent of x[i]/y[i], for i=0,..,*n-1 void vatan2 (double z[], double x[], double y[], int *n); void vsatan2 (float z[], float x[], float y[], int *n);
vatanh vsatanh Sets y[i] to the arc hyperbolic tangent of x[i], for i=0,..,*n-1 void vatanh (double y[], double x[], int *n); void vsatanh (float y[], float x[], int *n);
vcbrt vscbrt Sets y[i] to the cube root of x[i], for i=0,..,*n-1 void vcbrt (double y[], double x[], int *n); void vscbrt (float y[], float x[], int *n);
vcos vscos Sets y[i] to the cosine of x[i], for i=0,..,*n-1 void vcos (double y[], double x[], int *n); void vscos (float y[], float x[], int *n);
vcosh vscosh Sets y[i] to the hyperbolic cosine of x[i], for i=0,..,*n-1 void vcosh (double y[], double x[], int *n); void vscosh (float y[], float x[], int *n);
vcosisin (see note 1) vscosisin (see note 1) Sets the real part of y[i] to the cosine of x[i] and the imaginary part of y[i] to the sine of x[i], for i=0,..,*n-1 void vcosisin (double _Complex y[], double x[], int *n); void vscosisin (float _Complex y[], float x[], int *n);
vdint Sets y[i] to the integer truncation of x[i], for i=0,..,*n-1 void vdint (double y[], double x[], int *n);
vdiv vsdiv Sets z[i] to x[i]/y[i], for i=0,..,*n-1 void vdiv (double z[], double x[], double y[], int *n); void vsdiv (float z[], float x[], float y[], int *n);
vdnint Sets y[i] to the nearest integer to x[i], for i=0,..,*n-1 void vdnint (double y[], double x[], int *n);
vexp vsexp Sets y[i] to the exponential function of x[i], for i=0,..,*n-1 void vexp (double y[], double x[], int *n); void vsexp (float y[], float x[], int *n);
vexpm1 vsexpm1 Sets y[i] to (the exponential function of x[i])-1, for i=0,..,*n-1 void vexpm1 (double y[], double x[], int *n); void vsexpm1 (float y[], float x[], int *n);
vlog vslog Sets y[i] to the natural logarithm of x[i], for i=0,..,*n-1 void vlog (double y[], double x[], int *n); void vslog (float y[], float x[], int *n);
vlog10 vslog10 Sets y[i] to the base-10 logarithm of x[i], for i=0,..,*n-1 void vlog10 (double y[], double x[], int *n); void vslog10 (float y[], float x[], int *n);
vlog1p vslog1p Sets y[i] to the natural logarithm of (x[i]+1), for i=0,..,*n-1 void vlog1p (double y[], double x[], int *n); void vslog1p (float y[], float x[], int *n);
vpow vspow Sets z[i] to x[i] raised to the power y[i], for i=0,..,*n-1 void vpow (double z[], double x[], double y[], int *n); void vspow (float z[], float x[], float y[], int *n);
vqdrt vsqdrt Sets y[i] to the fourth root of x[i], for i=0,..,*n-1 void vqdrt (double y[], double x[], int *n); void vsqdrt (float y[], float x[], int *n);
vrcbrt vsrcbrt Sets y[i] to the reciprocal of the cube root of x[i], for i=0,..,*n-1 void vrcbrt (double y[], double x[], int *n); void vsrcbrt (float y[], float x[], int *n);
vrec vsrec Sets y[i] to the reciprocal of x[i], for i=0,..,*n-1 void vrec (double y[], double x[], int *n); void vsrec (float y[], float x[], int *n);
vrqdrt vsrqdrt Sets y[i] to the reciprocal of the fourth root of x[i], for i=0,..,*n-1 void vrqdrt (double y[], double x[], int *n); void vsrqdrt (float y[], float x[], int *n);
vrsqrt vsrsqrt Sets y[i] to the reciprocal of the square root of x[i], for i=0,..,*n-1 void vrsqrt (double y[], double x[], int *n); void vsrsqrt (float y[], float x[], int *n);
vsin vssin Sets y[i] to the sine of x[i], for i=0,..,*n-1 void vsin (double y[], double x[], int *n); void vssin (float y[], float x[], int *n);
vsincos vssincos Sets y[i] to the sine of x[i] and z[i] to the cosine of x[i], for i=0,..,*n-1 void vsincos (double y[], double z[], double x[], int *n); void vssincos (float y[], float z[], float x[], int *n);
vsinh vssinh Sets y[i] to the hyperbolic sine of x[i], for i=0,..,*n-1 void vsinh (double y[], double x[], int *n); void vssinh (float y[], float x[], int *n);
vsqrt vssqrt Sets y[i] to the square root of x[i], for i=0,..,*n-1 void vsqrt (double y[], double x[], int *n); void vssqrt (float y[], float x[], int *n);
vtan vstan Sets y[i] to the tangent of x[i], for i=0,..,*n-1 void vtan (double y[], double x[], int *n); void vstan (float y[], float x[], int *n);
vtanh vstanh Sets y[i] to the hyperbolic tangent of x[i], for i=0,..,*n-1 void vtanh (double y[], double x[], int *n); void vstanh (float y[], float x[], int *n);
Notes:
  1. By default, vcosisin and vscosisin use the _Complex data type, which is only available for AIX 5.2 and later, and will not compile on older versions of the operating system. To get an alternate prototype for these functions, compile with -D__nocomplex. This will define the functions as: void vcosisin (double y[][2], double *x, int *n); and void vscosisin (float y[][2], float *x, int *n);

Table 23. MASS integer vector library functions
Function Description Prototype
vpopcnt4 Returns the total number of 1 bits in the concatenation of the binary representation of x[i], for i=0,..,*n-1, where x is a vector of 32-bit objects unsigned int vpopcnt4 (void *x, int *n);
vpopcnt8 Returns the total number of 1 bits in the concatenation of the binary representation of x[i], for i=0,..,*n-1, where x is a vector of 64-bit objects unsigned int vpopcnt8 (void *x, int *n);

The functions vdiv, vsincos, vpow, and vatan2 (and their single-precision versions, vsdiv, vssincos, vspow, and vsatan2) take four parameters. The functions vdiv, vpow, and vatan2 take the parameters (z,x,y,n). The function vdiv outputs a vector z whose elements are x[i]/y[i], where i=0,..,*n-1. The function vpow outputs a vector z whose elements are x[i]**y[i] (x[i] raised to the power y[i]), where i=0,..,*n-1. The function vatan2 outputs a vector z whose elements are atan(x[i]/y[i]), where i=0,..,*n-1. The function vsincos takes the parameters (y,z,x,n), and outputs two vectors, y and z, whose elements are sin(x[i]) and cos(x[i]), respectively, where i=0,..,*n-1.

In vcosisin(y,x,n), x is a vector of *n double elements and the function outputs a vector y of *n double complex elements of the form (cos(x[i]),sin(x[i])). If -D__nocomplex is used (see note in Table 22), the output vector is declared as y[][2] and holds y[i][0] = cos(x[i]) and y[i][1] = sin(x[i]), where i=0,..,*n-1.

Overlap of input and output vectors

In most applications, the MASS vector functions are called with disjoint input and output vectors; that is, the two vectors do not overlap in memory. Another common usage scenario is to call them with the same vector for both input and output parameters (for example, vsin (y, y, &n)). For other kinds of overlap, be sure to observe the following restrictions, to ensure correct operation of your application:
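
The two common patterns named above (disjoint vectors, or the same vector for input and output) can be checked before a call. The following is a conservative sketch with a hypothetical function name: it does not encode the restrictions for other kinds of overlap and simply rejects any partial overlap. It assumes n is not negative.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 when the n-element vectors y and x are
   either the same vector or completely disjoint in memory, and 0 for
   any partial overlap. */
int same_or_disjoint(const double *y, const double *x, int n)
{
    uintptr_t ys = (uintptr_t)y;
    uintptr_t xs = (uintptr_t)x;
    uintptr_t bytes = (uintptr_t)n * sizeof(double);

    if (ys == xs)
        return 1;                         /* in-place call */
    return ys + bytes <= xs || xs + bytes <= ys;
}
```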

Consistency of MASS vector functions

The accuracy of the vector functions is comparable to that of the corresponding scalar functions in libmass.a, though results might not be bitwise-identical.

In the interest of speed, the MASS libraries make certain trade-offs. One of these involves the consistency of certain MASS vector functions. For certain functions, it is possible that the result computed for a particular input value will vary slightly (usually only in the least significant bit) depending on its position in the vector, the vector length, and nearby elements of the input vector. Also, the results produced by the different MASS libraries are not necessarily bit-wise identical.

The following functions are consistent in all versions of the library: vcbrt, vscbrt, vrcbrt, vsrcbrt, vlog, vsin, vssin, vcos, vscos, vsexp, vacos, vasin, vrqdrt, vsqdrt, vsrqdrt, vacosh, vsacosh, vasinh, vsasinh, vtanh, vstanh. The following functions are consistent in libmassvp3.a, libmassvp4.a, and libmassvp5.a: vsqrt, vrsqrt. The following functions are consistent in libmassvp4.a and libmassvp5.a: vrec, vsrec, vdiv, vsdiv, vexp. The following function is consistent in libmassv.a and libmassvp5.a: vsrsqrt.

Older, inconsistent versions of some of these functions are available on the MASS Web site, at http://www.ibm.com/software/awdtools/mass/aix/. If consistency is not required, there may be a performance advantage to using the older versions. For more information on consistency and avoiding inconsistency with the vector libraries, as well as performance and accuracy data, see the MASS Web site.

Compiling and linking a program with MASS

To compile an application that calls the functions in the MASS libraries, specify mass and massv (or massvp3, massvp4 or massvp5) on the -l linker option. For example, if the MASS libraries are installed in the default directory, you could specify:

xlc progc.c -o progc -lmass -lmassv

The MASS functions must run in the round-to-nearest rounding mode and with floating-point exception trapping disabled. (These are the default compilation settings.)

Using libmass.a with libm.a

If you wish to use the libmass.a scalar library for some functions and the normal math library libm.a for other functions, follow this procedure to compile and link your program:

  1. Create an export list (this can be a flat text file) containing the names of the desired functions. For example, to select only the fast tangent function from libmass.a for use with the C program sample.c, create a file called fasttan.exp with the following line:
    tan
  2. Create a shared object from the export list with the AIX ld command, linking with the libmass.a library. For example:
    ld -bexport:fasttan.exp -o fasttan.o -bnoentry -lmass -bmodtype:SRE
    If libmass.a is not in the linker's default library search path, also specify -Ldirectory, where directory is the location where libmass.a is installed.
  3. Archive the shared object into a library with the AIX ar command. For example:
    ar -q libfasttan.a fasttan.o
  4. Create the final executable using xlc, specifying the library containing the MASS functions before the standard math library, libm.a. This links only the functions exported from that library (in this example, the tan function) from MASS, and the remainder of the math functions from the standard math library. For example:
    xlc sample.c -o sample -Ldir_containing_libfasttan.a -lfasttan -lm
Note:
The MASS cos function is automatically linked if you export MASS sin; MASS atan2 is automatically linked if you export MASS atan.
