Alternative VE Offloading
2.7.1
VE Offloading framework (VEO) is a framework to provide accelerator-style programming on Vector Engine (VE).
SX-Aurora TSUBASA provides VE Offloading framework (VEO) for the accelerator programming model. In this model, parallelized and/or vectorized numeric code, such as matrix operations, runs on the accelerators, while the main code that controls the accelerators and performs I/O runs on the host.
The Alternative VE Offloading framework (AVEO) is a faster, much lower-latency replacement for the previous VEO implementation. It adds multi-VE support, simultaneous debugging of the VE and VH sides, and API extensions.
You can migrate to AVEO from the previous VEO implementation by installing the AVEO packages and re-linking your program with AVEO. No modification of makefiles is required.
To run programs, please install veoffload-aveo, veoffload-aveorun, and the runtime packages of the compiler (2.2.2 or later).
To install the packages to run programs by yum, execute the following command as root:
You do not need to uninstall veoffload and veoffload-veorun, the packages of the previous VEO implementation, so you can still execute programs linked with the previous VEO implementation.
To develop programs, veoffload-aveo-devel, veoffload-aveorun-devel, and the development packages of the compiler (2.2.2 or later) are also required.
To install the packages to develop programs by yum, execute the following command as root:
veoffload-devel and veoffload-veorun-devel will be uninstalled automatically, because they conflict with veoffload-aveo-devel and veoffload-aveorun-devel.
Then, you can link your program with AVEO. If you want to link your program with the previous VEO implementation, please install its packages on another machine.
First, let's try a "Hello, World!" program on VE.
VEO requires HugePages for data transfer. The required number of HugePages is 32 per VEO thread context.
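The HugePages requirement scales linearly with the number of VEO thread contexts. A minimal sketch of that arithmetic follows; the macro and the helper name `veo_required_hugepages` are hypothetical and not part of the VEO API.

```c
#include <stddef.h>

/* HugePages needed per VEO thread context, as stated in the text above. */
#define VEO_HUGEPAGES_PER_CONTEXT 32

/* Hypothetical helper (not a VEO API): total HugePages needed
 * for n_contexts VEO thread contexts. */
size_t veo_required_hugepages(size_t n_contexts)
{
    return n_contexts * VEO_HUGEPAGES_PER_CONTEXT;
}
```

For example, a program that opens four VEO thread contexts would need 128 HugePages under this rule.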
Code to run on VE is shown below. Standard C functions are available, so you can use printf(3).
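A minimal sketch of such VE code is given here. The function name `hello` and the message text are assumptions chosen for illustration; the only requirement taken from this document is the 64-bit unsigned integer return type.

```c
#include <stdio.h>
#include <stdint.h>

/* VE-side function: prints a greeting with printf(3) and returns
 * a 64-bit unsigned integer, as VEO expects from a called function. */
uint64_t hello(void)
{
    printf("Hello, world\n");
    return 0;
}
```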
Save the above code as libvehello.c.
A function on VE called via VEO must return a 64-bit unsigned integer. It can take arguments, as described later.
VEO can call a function in an executable or in a shared library.
To execute a function on VE using VEO, compile and link a source file into a binary for VE.
To build an executable with the functions statically linked, execute as follows:
To build a shared library with the functions for dynamic loading, execute as follows:
The main routine on the VH side that runs the VE program is shown here.
A program using VEO needs to include "ve_offload.h". In the header, the prototypes of VEO functions and constants for VEO API are defined.
The example VH program to call a VE function in a statically linked executable:
Save the above code as hello.c.
To call a VE function in a statically linked executable:
The example VH program to call a VE function in a dynamic library with VEO:
To call a VE function in a dynamic library with VEO:
Compile source code on VH side as shown below.
The headers for VEO are installed in /opt/nec/ve/veos/include; libveo, the shared library of VEO, is in /opt/nec/ve/veos/lib64.
Execute the compiled VEO program.
VE code is executed on VE node 0, as specified by veo_proc_create_static() or veo_proc_create().
You can pass one or more arguments to a function on VE. Arguments are specified through a VEO arguments object, which is created by veo_args_alloc(). A newly created VEO arguments object is empty and holds no arguments. Even if a VE function has no arguments, a VEO arguments object is still necessary.
VEO provides functions to set arguments of various types.
To pass an integer value, the following functions are used.
You can also pass a floating-point argument.
For instance: suppose that proc is a VEO process handle and func(int, double) is defined in a VE library whose handle is handle.
In this case, func(1, 2.0) is called on VE.
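For reference, the VE-side counterpart could look like the following sketch. Only the signature `func(int, double)` comes from the text above; the body and the return value are assumptions made for illustration.

```c
#include <stdio.h>
#include <stdint.h>

/* VE-side function matching func(int, double) from the example above.
 * Illustrative body: prints both arguments and returns the integer one. */
uint64_t func(int i, double d)
{
    printf("func(%d, %f)\n", i, d);
    return (uint64_t)i;
}
```

With the arguments set as described, the VE function would receive `i == 1` and `d == 2.0`.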
Arguments of non-basic types and arguments passed by reference are placed on a stack. VEO supports such stack arguments.
To set a stack argument to a VEO arguments object, call veo_args_set_stack().
The third argument specifies whether the argument is for input and/or output.
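To illustrate why the input/output intent matters, here is a sketch of a VE-side function that takes an argument by reference and modifies it in place. The function name `scale_in_place` and its behavior are hypothetical; a buffer like `values` would be passed as a stack argument intended for both input and output, and the VH-side call is not shown.

```c
#include <stdint.h>

/* Hypothetical VE-side function: doubles each element of an array that
 * the caller passes by reference, and returns the sum of the new values.
 * The array is read (input) and written back (output). */
uint64_t scale_in_place(int64_t *values, int64_t count)
{
    uint64_t sum = 0;
    for (int64_t i = 0; i < count; i++) {
        values[i] *= 2;
        sum += (uint64_t)values[i];
    }
    return sum;
}
```

If such an argument were marked as input only, the modified values would not be expected to reach the VH side.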
You can create a VEO context with a specified attribute.
The only available attribute is the stack size of the VEO context.
For instance: suppose that proc is a VEO process handle.
In this case, a VEO context with a 256 MB stack is created.
Code written in Fortran to run on VE is shown below.
Save the above code as libvefortran.f90.
To build an executable with the functions statically linked, execute as follows:
To build a shared library with the functions for dynamic loading, execute as follows:
The main routine on the VH side that runs the VE program written in Fortran is shown here.
The example VH program to call a VE Fortran function in a statically linked executable:
Save the above code as fortran.c.
To pass arguments to a VE Fortran function, use veo_args_set_stack() to pass them as stack arguments. However, for dummy arguments declared with the VALUE attribute in the Fortran function, pass the arguments by value in the same way as for a VE C function.
To call a VE Fortran function by name with veo_call_async_by_name(), convert the name of the Fortran function to lowercase and append "_" to it.
Taking libvefortran.f90 and fortran.c as an example, pass "sub1_" as an argument to veo_call_async_by_name() in fortran.c when calling the Fortran function named "SUB1" in libvefortran.f90.
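The naming rule above (lowercase, then a trailing "_") can be sketched as a small helper. The function `fortran_symbol_name` is hypothetical and not part of the VEO API; it only builds the string that would be passed as the function name.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a VEO API): converts a Fortran procedure name
 * to the symbol name described above, i.e. lowercase plus a trailing '_'.
 * Returns 0 on success, or -1 if out_size is too small. */
int fortran_symbol_name(const char *name, char *out, size_t out_size)
{
    size_t len = strlen(name);

    /* Room is needed for the name, the '_' and the terminating '\0'. */
    if (len + 2 > out_size)
        return -1;

    for (size_t i = 0; i < len; i++)
        out[i] = (char)tolower((unsigned char)name[i]);
    out[len] = '_';
    out[len + 1] = '\0';
    return 0;
}
```

Calling it with "SUB1" and a sufficiently large buffer yields "sub1_", matching the example above.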
The VH main program is compiled and run in the same way as for a C program.
Compile source code on VH side as shown below. This is the same as the compilation method described above.
Execute the compiled VEO program. This is also the same as the execution method described above.
The following is an example of VE code using OpenMP written in C.
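A minimal sketch of such OpenMP code is given here. The function name `hello_omp` and the message are assumptions for illustration. The `_OPENMP` guards are an addition so that the file also builds without OpenMP, in which case the parallel region runs on a single thread.

```c
#include <stdio.h>
#include <stdint.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* VE-side function: each OpenMP thread prints a greeting.
 * Returns the number of threads that ran the parallel region. */
uint64_t hello_omp(void)
{
    uint64_t nthreads = 0;

#pragma omp parallel reduction(+:nthreads)
    {
        int tid = 0;
#ifdef _OPENMP
        tid = omp_get_thread_num();
#endif
        printf("Hello, world! I'm thread %d\n", tid);
        nthreads += 1;
    }
    return nthreads;
}
```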
Save the above code in libomphello.c.
The following shows the example written in Fortran.
Save the above code in libompfortran.f90.
To use OpenMP parallelization, specify -fopenmp at compilation and linking.
Here is an example of building VE code written in C. To build a statically linked binary, execute as follows:
To build a shared library, execute as follows:
To build code written in Fortran, change the compiler to nfort.
To generate an ftrace.out file, specify the "-ftrace" option when compiling and linking the VE code. The ftrace.out file is generated when the VH main program invokes veo_proc_destroy(). Here are examples of building VE code written in C, with and without OpenMP.
To build a statically linked binary without OpenMP for ftrace, execute as follows:
To build a shared library for ftrace, execute as follows:
To build code written in Fortran, change the compiler to nfort.
To build a statically linked binary with OpenMP for ftrace, execute as follows:
To build a shared library for ftrace, execute as follows:
To build code written in Fortran, change the compiler to nfort.
Relinking veorun for newer compilers is required to dynamically load a shared library that uses OpenMP and is written in Fortran.
To link a veorun that can load a shared library that uses OpenMP and is written in Fortran, execute as follows.
If you need to generate an ftrace.out file, please add the "-ftrace" option to mk_veorun_static.
To use the newly created veorun, set the environment variable VEORUN_BIN.
Then execute a VEO program.
Set the environment variable VEO_LOG_DEBUG to some value and execute a VEO program. The log is written to standard output.