-qipa

Description

Turns on or customizes a class of optimizations known as interprocedural analysis (IPA).

Compile-time syntax

>>- -qipa--+-----------------+---------------------------------><
           |    .-object---. |
           '-=--+-noobject-+-'

where:

-qipa Compile-time Options Description
-qipa Activates interprocedural analysis with the following -qipa suboption defaults:
  • inline=auto
  • level=1
  • missing=unknown
  • partition=medium
-qipa=object

-qipa=noobject

Specifies whether to include standard object code in the object files.

Specifying the noobject suboption can substantially reduce overall compile time by not generating object code during the first IPA phase.

If the -S compiler option is specified with noobject, noobject is ignored.

If compilation and linking are performed in the same step, and neither the -S nor any listing option is specified, -qipa=noobject is implied by default.

If any object file used in linking with -qipa was created with the -qipa=noobject option, any file containing an entry point (the main program for an executable program, or an exported function for a library) must be compiled with -qipa.

Link-time syntax

        .-noipa------------------------------------------------.
>>- -q--+-ipa--+---------------------------------------------+-+-><
               |    .-:------------------------------------. |
               |    |   .-noclonearch------------.         | |
               |    |   |               .-,----. |         | |
               |    V   |               V      | |         | |
               '-=----+-+-clonearch--=----arch-+-+-------+-+-'
                      | .-nocloneproc------------.       |
                      | |               .-,----. |       |
                      | |               V      | |       |
                      +-+-cloneproc--=----name-+-+-------+
                      |           .-,----.               |
                      |           V      |               |
                      +-exits--=----name-+---------------+
                      +-inline--+----------------------+-+
                      |         |    .-auto----------. | |
                      |         '-=--+-noauto--------+-' |
                      |              | .-,---------. |   |
                      |              | V           | |   |
                      |              +---suboption-+-+   |
                      |              +-threshold=num-+   |
                      |              | .-,----.      |   |
                      |              | V      |      |   |
                      |              '---name-+------'   |
                      |              .-,----.            |
                      |              V      |            |
                      +-noinline--=----name-+------------+
                      |                     .-,----.     |
                      |                     V      |     |
                      +-infrequentlabel--=----name-+-----+
                      |              .-,----.            |
                      |              V      |            |
                      +-isolated--=----name-+------------+
                      |           .-1-.                  |
                      +-level--=--+-0-+------------------+
                      |           '-2-'                  |
                      |          .-a.lst-.  .- short-.   |
                      +-list--=--+-------+--+--------+---+
                      |          '-name--'  '- long--'   |
                      |             .-,----.             |
                      |             V      |             |
                      +-lowfreq--=----name-+-------------+
                      | .-malloc16---.                   |
                      +-+-nomalloc16-+-------------------+
                      |             .-unknown--.         |
                      +-missing--=--+-safe-----+---------+
                      |             +-isolated-+         |
                      |             '-pure-----'         |
                      |               .-medium-.         |
                      +-partition--=--+-small--+---------+
                      |               '-large--'         |
                      | .-nopdfname----------------.     |
                      +-+-pdfname--+-------------+-+-----+
                      |            '-=--filename-'       |
                      | .-nothreads------.               |
                      +-+-threads-+----+-+---------------+
                      |           '-=N-'                 |
                      |                 .-,----.         |
                      |                 V      |         |
                      +-+-pure----+--=----name-+---------+
                      | +-safe----+                      |
                      | '-unknown-'                      |
                      '-filename-------------------------'

where:

-qipa Link-time Options Description
-qnoipa Deactivates interprocedural analysis.
-qipa Activates interprocedural analysis with the following -qipa suboption defaults:
  • inline=auto
  • level=1
  • missing=unknown
  • partition=medium

Suboptions can also include one or more of the forms shown below.

Note:
C++ only: For all suboptions that specify function names, you must use the mangled names of the functions. The original source function names are not valid.
Link-time Suboptions Description
clonearch=arch{,arch}

noclonearch

Specifies the architectures for which multiple versions of the same procedure are produced.

During the IPA link phase, the compiler generates a generic version of a procedure targeted for the default architecture setting and then, if appropriate, produces another version that is optimized for the specified architectures. At run time, the program dynamically determines which architecture it is running on and executes the corresponding version of the function. Using this option, your program can achieve compatibility across different PowerPC architectures.

arch is a comma-separated list of architectures. The supported clonearch values are pwr4, pwr5 and ppc970. If you specify no value, an invalid value or a value equal to the -qarch setting, no function versioning will be performed for this option.

Notes:
  1. To ensure compatibility across multiple platforms, the -qarch value must be a subset of the architecture specified by -qipa=clonearch.
  2. When -qcompact is in effect, -qipa=clonearch is disabled.
  3. For information on allowed clonearch values for different -qarch settings, see Table 42, Allowable clonearch values.
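
The run-time selection described for clonearch can be pictured with a small C sketch. This is only a conceptual illustration, not the code the compiler generates: the arch_t enumeration, the function names, and the dispatch through a function pointer are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical architecture identifiers; not part of any compiler API. */
typedef enum { ARCH_GENERIC, ARCH_PWR4, ARCH_PWR5 } arch_t;

typedef long (*sum_fn)(const int *, size_t);

/* Generic version: stands in for code built for the default -qarch setting. */
static long sum_generic(const int *v, size_t n) {
    assert(v != NULL || n == 0);
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Stand-in for a clone tuned for one architecture: same result,
   different code shape (two accumulators, loop unrolled by two). */
static long sum_pwr5(const int *v, size_t n) {
    assert(v != NULL || n == 0);
    long s0 = 0, s1 = 0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        s0 += v[i];
        s1 += v[i + 1];
    }
    if (i < n)
        s0 += v[i];
    return s0 + s1;
}

/* Choose a version from the architecture detected at run time;
   anything without a dedicated clone falls back to the generic code. */
sum_fn select_sum(arch_t detected) {
    return detected == ARCH_PWR5 ? sum_pwr5 : sum_generic;
}
```

Both versions must return the same result for the same input; only the instruction selection and scheduling are expected to differ between clones.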
cloneproc=name{,name}

nocloneproc=name{,name}

Specifies the names of the functions to clone for the architectures specified by the clonearch suboption, where name is a comma-separated list of function names.
Note:
If you do not specify -qipa=clonearch, or if you specify -qipa=noclonearch, -qipa=cloneproc=name{,name} and -qipa=nocloneproc=name{,name} have no effect.
exits=name{,name} Specifies names of functions which represent program exits. Program exits are calls which can never return and can never call any procedure which has been compiled with IPA pass 1.
infrequentlabel=name{,name} Specifies a list of user-defined labels that are likely to be reached infrequently during a program run.
inline=auto

inline=noauto

Enables or disables automatic inlining only. The compiler still accepts user-specified functions as candidates for inlining.
inline[=suboption] Same as specifying the -qinline compiler option, with suboption being any valid -qinline suboption.
inline=threshold=num Specifies an upper limit for the number of functions to be inlined, where num is a non-negative integer. This argument is implemented only when inline=auto is on.
inline=name{,name} Specifies a comma-separated list of functions to try to inline, where functions are identified by name.
noinline=name{,name} Specifies a comma-separated list of functions that must not be inlined, where functions are identified by name.
isolated=name{,name} Specifies a list of isolated functions that are not compiled with IPA. Neither isolated functions nor functions within their call chain can refer to global variables.
level=0

level=1

level=2

Specifies the optimization level for interprocedural analysis. The default level is 1. Valid levels are as follows:
  • Level 0 - Does only minimal interprocedural analysis and optimization.
  • Level 1 - Turns on inlining, limited alias analysis, and limited call-site tailoring.
  • Level 2 - Performs full interprocedural data flow and alias analysis.
list

list=[name] [short|long]

Specifies that a listing file be generated during the link phase. The listing file contains information about transformations and analyses performed by IPA, as well as an optional object listing generated by the back end for each partition. This option can also be used to specify the name of the listing file.

If listings have been requested (using either the -qlist or -qipa=list options), and name is not specified, the listing file name defaults to a.lst.

The long and short suboptions can be used to request more or less information in the listing file. The short suboption, which is the default, generates the Object File Map, Source File Map and Global Symbols Map sections of the listing. The long suboption causes the generation of all of the sections generated through the short suboption, as well as the Object Resolution Warnings, Object Reference Map, Inliner Report and Partition Map sections.

lowfreq=name{,name} Specifies names of functions which are likely to be called infrequently. These will typically be error handling, trace, or initialization functions. The compiler may be able to make other parts of the program run faster by doing less optimization for calls to these functions.
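
As a rough illustration of the kinds of functions the lowfreq and exits suboptions are meant for, consider the C sketch below. The name trace_error is borrowed from the example at the end of this section; fatal, sum_nonnegative, and errors_logged are hypothetical names introduced only for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

int errors_logged = 0;

/* Rarely executed error path: a candidate for lowfreq=trace_error. */
void trace_error(const char *msg) {
    assert(msg != NULL);
    errors_logged++;
    fprintf(stderr, "error: %s\n", msg);
}

/* Never returns and calls only C library routines:
   a candidate for exits=fatal. */
void fatal(const char *msg) {
    fputs(msg, stderr);
    fputc('\n', stderr);
    exit(EXIT_FAILURE);
}

/* Hot path: sums the non-negative entries and reports the others. */
long sum_nonnegative(const int *v, size_t n) {
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (v[i] < 0)
            trace_error("negative input skipped");
        else
            total += v[i];
    }
    return total;
}
```

Following the colon-separated form shown in the link-time syntax diagram, these functions could be named at link time as -qipa=lowfreq=trace_error:exits=fatal. Because this sketch is C, the names are used as written; for C++ the mangled names would be required.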

malloc16

nomalloc16

Informs the compiler that the dynamic memory allocation routines will return 16-byte aligned memory addresses. The compiler can then optimize the code based on that assertion.

In 64-bit mode, AIX always returns 16-byte aligned addresses and therefore by default -qipa=malloc16 is in effect. You can use -qipa=nomalloc16 to override the default setting.

Note:
You must make sure that executables generated with -qipa=malloc16 run in an environment in which dynamic memory allocations return 16-byte aligned addresses; otherwise, wrong results can be generated.
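
One way to check the malloc16 assumption in a given environment is to test the addresses that the allocator actually returns. The sketch below is a minimal C helper for that purpose; is_aligned16 and checked_malloc are hypothetical names, and a check on a few allocations is evidence rather than a guarantee.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 if p lies on a 16-byte boundary, 0 otherwise. */
int is_aligned16(const void *p) {
    return ((uintptr_t)p % 16u) == 0;
}

/* Allocates n bytes with malloc and records in *aligned whether the
   returned block is 16-byte aligned. The caller frees the block. */
void *checked_malloc(size_t n, int *aligned) {
    assert(aligned != NULL);
    void *p = malloc(n);
    *aligned = (p != NULL) && is_aligned16(p);
    return p;
}
```

If checked_malloc reports an unaligned block in the target environment, -qipa=nomalloc16 is the safer setting.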
missing=attribute Specifies the interprocedural behavior of procedures that are not compiled with -qipa and are not explicitly named in an unknown, safe, isolated, or pure suboption.

The following attributes may be used to refine this information:

  • safe - Functions which do not indirectly call a visible (not missing) function either through direct call or through a function pointer.
  • isolated - Functions which do not directly reference global variables accessible to visible functions. Functions bound from shared libraries are assumed to be isolated.
  • pure - Functions which are safe and isolated and which do not indirectly alter storage accessible to visible functions. pure functions also have no observable internal state.
  • unknown - The default setting. This option greatly restricts the amount of interprocedural optimization for calls to unknown functions. Specifies that the missing functions are not known to be safe, isolated, or pure.
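
The differences between these attributes can be made concrete with a few small C functions. The sketch below is one reading of the definitions above, assuming the functions live in a library that is not compiled with -qipa; all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

int call_count = 0;   /* global that the IPA-compiled program can also see */

/* pure: the result depends only on the argument; no globals, no hidden state. */
int square(int x) { return x * x; }

/* safe but not isolated: updates a global, never calls back into the program. */
void note_call(void) { call_count++; }

/* safe and isolated but not pure: keeps internal state between calls. */
int next_id(void) {
    static int id = 0;
    return ++id;
}

/* unknown: invokes a caller-supplied function pointer, which may refer
   to a visible function compiled with -qipa. */
void for_each(int *v, size_t n, void (*visit)(int *)) {
    assert(visit != NULL);
    for (size_t i = 0; i < n; i++)
        visit(&v[i]);
}

/* Stand-in for a visible function that the program passes as a callback. */
void double_in_place(int *x) { *x *= 2; }
```

Under this reading, square could be listed in a pure suboption, note_call and next_id in safe, and for_each would have to stay unknown because the compiler cannot rule out calls back into the program.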
partition=small

partition=medium

partition=large

Specifies the size of each program partition created by IPA during pass 2.
nopdfname

pdfname

pdfname=filename

Specifies the name of the profile data file containing the PDF profiling information. If you do not specify filename, the default file name is ._pdf.

The profile is placed in the current working directory or in the directory named by the PDFDIR environment variable. This lets you do simultaneous runs of multiple executables using the same PDFDIR, which can be useful when tuning with PDF on dynamic libraries.

nothreads

threads

threads=N

Specifies the number of threads the compiler assigns to code generation.

Specifying nothreads is equivalent to running one serial process. This is the default.

Specifying threads allows the compiler to determine how many threads to use, depending on the number of processors available.

Specifying threads=N instructs the compiler to use N threads. Though N can be any integer value in the range of 1 to MAXINT, N is effectively limited to the number of processors available on your system.

pure=name{,name} Specifies a list of pure functions that are not compiled with -qipa. Any function specified as pure must be isolated and safe, and must not alter the internal state nor have side-effects, defined as potentially altering any data visible to the caller.
safe=name{,name} Specifies a list of safe functions that are not compiled with -qipa and do not call any other part of the program. Safe functions can modify global variables, but may not call functions compiled with -qipa.
unknown=name{,name} Specifies a list of unknown functions that are not compiled with -qipa. Any function specified as unknown can make calls to other parts of the program compiled with -qipa, and modify global variables and dummy arguments.
filename Gives the name of a file which contains suboption information in a special format.

The file format is the following:

# ... comment 
attribute{, attribute} = name{, name}
clonearch=arch{,arch}
cloneproc=name{,name}
missing = attribute{, attribute} 
exits = name{, name} 
lowfreq = name{, name} 
inline [ = auto | = noauto ] 
inline = name{, name} [ from name{, name}] 
inline-threshold = unsigned_int
inline-limit = unsigned_int
list [ = file-name | short | long ] 
noinline 
noinline = name{, name} [ from name{, name}]
level = 0 | 1 | 2 
prof [ = file-name ] 
noprof 
partition = small | medium | large | unsigned_int

where attribute is one of:

  • clonearch
  • cloneproc
  • exits
  • lowfreq
  • unknown
  • safe
  • isolated
  • pure
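
As an illustration of the format above, a small suboption file might look like the following. The function names are placeholders, and only forms listed in the format description are used.

```
# Hypothetical IPA suboption file; function names are placeholders
lowfreq = trace_error, debug_dump
exits = fatal
missing = safe
level = 2
partition = large
```

Such a file would be named on the link step through the filename suboption, in the form -qipa=filename.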

Notes

In the case that suboptions are specified for -qipa=clonearch and -qarch that do not match the target architecture, the compiler will generate instructions based on the suboption that most closely matches the system on which the application is currently running.

The following table shows the allowed clonearch values for different -qarch settings.

Table 42. Allowable clonearch values

-qarch setting                                      Allowed clonearch value
com, ppc, pwr3, ppc64, ppcgr, ppc64gr, ppc64grsq    pwr4, pwr5, ppc970
pwr4                                                pwr5, ppc970
ppc64v                                              ppc970
pwr5, ppc970                                        N/A


Regular expression syntax can be used when specifying a name for the following suboptions.

Syntax rules for specifying regular expressions are described below:

Expression Description
string Matches any occurrence of the character sequence specified in string. For example, test will match testimony, latest, and intestine.
^string Matches the pattern specified by string only if it occurs at the beginning of a line.
string$ Matches the pattern specified by string only if it occurs at the end of a line.
str.ing The period ( . ) matches any single character. For example, t.st will match test, tast, tZst, and t1st.
string\special_char The backslash ( \ ) can be used to escape special characters. For example, assume that you want to find lines ending with a period. Simply specifying the expression .$ would show all lines that had at least one character of any kind in it. Specifying \.$ escapes the period ( . ), and treats it as an ordinary character for matching purposes.
[string] Matches any of the characters specified in string. For example, t[a-g123]st matches tast and test, but not t-st or tAst.
[^string] Does not match any of the characters specified in string. For example, t[^a-zA-Z]st matches t1st, t-st, and t,st but not test or tYst.
string* Matches zero or more occurrences of the pattern specified by string. For example, te*st will match tst, test, and teeeeeest.
string+ Matches one or more occurrences of the pattern specified by string. For example, t(es)+t matches test, tesest, but not tt.
string? Matches zero or one occurrences of the pattern specified by string. For example, te?st matches either tst or test.
string{m,n} Matches between m and n occurrence(s) of the pattern specified by string. For example, a{2} matches aa, and b{1,4} matches b, bb, bbb, and bbbb.
string1 | string2 Matches the pattern specified by either string1 or string2. For example, s | o matches both characters s and o.
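
The rules in this table can be tried out with the POSIX regular expression routines. The sketch below assumes that the compiler's matching behaves like POSIX extended regular expressions, which this description does not state explicitly; name_matches is a hypothetical helper, not part of the compiler.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Returns 1 if name matches pattern, 0 if it does not,
   and -1 if the pattern itself is not a valid expression. */
int name_matches(const char *pattern, const char *name) {
    assert(pattern != NULL && name != NULL);
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    int rc = regexec(&re, name, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

For example, name_matches("^test", "latest") reports no match because of the anchor, while name_matches("test", "latest") reports a match.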

The necessary steps to use IPA are:

  1. Do preliminary performance analysis and tuning before compiling with the -qipa option, because the IPA analysis uses a two-pass mechanism that increases compile and link time. You can reduce some compile and link overhead by using the -qipa=noobject option.
  2. Specify the -qipa option on both the compile and the link steps of the entire application, or as much of it as possible. Use suboptions to indicate assumptions to be made about parts of the program not compiled with -qipa. During compilation, the compiler stores interprocedural analysis information in the .o file. During linking, the -qipa option causes a complete recompilation of the entire application.

Note: If a severe error occurs during compilation, -qipa returns RC=1 and terminates. Performance analysis also terminates.

Example

To compile a set of files with interprocedural analysis, enter:

xlc++ -c -O3 *.C -qipa
xlc++ -o product *.o -qipa 

Here is how you might compile the same set of files while improving both the speed of the first compile step and the optimization performed in the second step. Assume that there exist two functions, trace_error and debug_dump, which are rarely executed.

xlc++ -c -O3 *.C -qipa=noobject
xlc++ -o product *.o -qipa=lowfreq=trace_error,debug_dump
