FFTW3 Interface (For Fortran)

Introduction

The FFTW3 interface is a library whose API is mostly compatible with that of FFTW version 3.x (some routines and constants are not supported). By replacing the include file provided by FFTW with the one provided by ASL, user programs that use FFTW can employ ASL's Fourier transform routines, which are highly optimized for the Vector Engine. The original FFTW is an open-source Fourier transform library available at the FFTW Official site[1]. Refer to the FFTW documentation[2] for details.

How to use the FFTW3 interface

The include file of the FFTW3 interface has a different name from the include file provided by the original FFTW. To use the FFTW3 interface, replace the include file name as follows:

For Fortran 2003 programs

include "fftw3.f03"  ⇒  include "aslfftw3.f03"
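
As a rough illustration of the substitution above, the following minimal sketch plans, executes, and destroys a 1-D complex transform through the Fortran 2003 interface. It assumes that, as in the original FFTW, a plan is held in a type(C_PTR) variable and that the routines take the same arguments as their FFTW counterparts; the program name and the array size are made up for the example.

```fortran
program asl_fftw3_f03_sketch
  use, intrinsic :: iso_c_binding
  implicit none
  include "aslfftw3.f03"   ! was: include "fftw3.f03"

  integer(C_INT), parameter :: n = 8          ! illustrative size
  complex(C_DOUBLE_COMPLEX) :: in(n), out(n)
  type(C_PTR) :: plan                         ! assumed plan type, as in FFTW
  integer :: i

  ! Create a forward 1-D complex-to-complex plan.
  plan = fftw_plan_dft_1d(n, in, out, FFTW_FORWARD, FFTW_ESTIMATE)

  ! Fill the input array after the plan has been created.
  do i = 1, n
    in(i) = cmplx(real(i, C_DOUBLE), 0.0_C_DOUBLE, kind=C_DOUBLE_COMPLEX)
  end do

  ! Execute the plan on the arrays, then release it.
  call fftw_execute_dft(plan, in, out)
  print *, "out(1) =", out(1)

  call fftw_destroy_plan(plan)
end program asl_fftw3_f03_sketch
```

Apart from the include line, nothing in this sketch is specific to ASL, which is the point of the interface.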

For legacy Fortran programs

include "fftw3.f"  ⇒  include "aslfftw3.f"
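
A corresponding sketch for the legacy interface is shown below. It is written in free source form and assumes, as in the original FFTW legacy interface, that a plan is stored in an 8-byte integer and that every routine is called as a subroutine with the plan as its first argument. The dfftw_ prefix follows the naming rule given in the Remarks; the program name and array size are illustrative only.

```fortran
program asl_fftw3_legacy_sketch
  implicit none
  include "aslfftw3.f"     ! was: include "fftw3.f"

  integer, parameter :: n = 8      ! illustrative size
  integer(8) :: plan               ! assumed plan type, as in FFTW
  complex(8) :: in(n), out(n)
  integer :: i

  ! Double-precision routines use the dfftw_ prefix in legacy Fortran.
  call dfftw_plan_dft_1d(plan, n, in, out, FFTW_FORWARD, FFTW_ESTIMATE)

  ! Fill the input array after the plan has been created.
  do i = 1, n
    in(i) = cmplx(real(i, 8), 0.0d0, kind=8)
  end do

  ! Execute the plan on the arrays, then release it.
  call dfftw_execute_dft(plan, in, out)
  print *, "out(1) =", out(1)

  call dfftw_destroy_plan(plan)
end program asl_fftw3_legacy_sketch
```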

See Compiling and Linking for how to compile and link with the FFTW3 interface library.

Thread Safety

Because the ASL FFTW3 Interface internally uses the ASL Unified Interface, its thread safety follows that of the ASL Unified Interface. (See the ASL Unified Interface documentation for details on its thread safety.)

  • The routines of the ASL FFTW3 Interface are not thread-safe when linked with the sequential version or the distributed-memory parallel (MPI) version of the ASL libraries. You must ensure that no other thread calls the library while a thread is creating or destroying a plan. Provided this requirement is met, two or more threads can call the library simultaneously only if each thread uses a different plan.
    To call the routines in a parallel region, do both of the following:
    1. Create and destroy the plans used by the respective threads outside the parallel region. Inside the parallel region, ensure that each thread uses a different plan, selected by the thread number that omp_get_thread_num() returns.
    2. Link the sequential version or the distributed-memory parallel version of the ASL library. (See Compiling and Linking for how to link the library.)
    Note that this technique cannot be applied to automatically parallelized loops.
  • The routines of the ASL FFTW3 Interface are partially thread-safe and already multithreaded internally when linked with the shared-memory parallel version or the hybrid parallel (shared-memory + MPI) version of the ASL libraries. Two or more threads can create and destroy plans simultaneously.
    • NEC Numeric Library Collection version 2.3.0 or later
      A plan can be shared among threads. However, it is strongly recommended to use plans outside multithreaded parts of your program (e.g., parallel regions), because the FFT execution routines are already multithreaded internally.
    • NEC Numeric Library Collection version 2.2.0 or earlier
      Do not share the same plan among threads. It is strongly recommended to use plans outside multithreaded parts of your program (e.g., parallel regions), because the FFT execution routines are already multithreaded internally.
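
The procedure described in the first item (sequential or MPI version of the library, OpenMP parallel region) might be sketched as follows. This is only an illustration: the program name, array sizes, and the per-thread column layout of the buffers are assumptions, and the plan is assumed to be a type(C_PTR) as in the original FFTW.

```fortran
program asl_fftw3_per_thread_plans
  use, intrinsic :: iso_c_binding
  use omp_lib
  implicit none
  include "aslfftw3.f03"

  integer(C_INT), parameter :: n = 64     ! illustrative transform length
  integer :: nthreads, tid, t
  type(C_PTR), allocatable :: plans(:)    ! one plan per thread
  complex(C_DOUBLE_COMPLEX), allocatable :: in(:,:), out(:,:)

  nthreads = omp_get_max_threads()
  allocate(plans(0:nthreads-1))
  allocate(in(n, 0:nthreads-1), out(n, 0:nthreads-1))

  ! Step 1: create every plan outside the parallel region.
  do t = 0, nthreads - 1
    plans(t) = fftw_plan_dft_1d(n, in(:, t), out(:, t), &
                                FFTW_FORWARD, FFTW_ESTIMATE)
  end do
  in = (1.0_C_DOUBLE, 0.0_C_DOUBLE)

  ! Inside the parallel region each thread touches only the plan
  ! and the buffer columns indexed by its own thread number.
  !$omp parallel private(tid)
  tid = omp_get_thread_num()
  call fftw_execute_dft(plans(tid), in(:, tid), out(:, tid))
  !$omp end parallel

  ! Destroy the plans outside the parallel region as well.
  do t = 0, nthreads - 1
    call fftw_destroy_plan(plans(t))
  end do
end program asl_fftw3_per_thread_plans
```

Step 2 of the procedure is a link-time choice and therefore does not appear in the source code.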

The source code of the FFTW3 interface

The source code of the FFTW3 interface is stored in the following compressed file:

/opt/nec/ve/nlc/X.X.X/src/aslfftw3-Y.Y.tar.gz
(X.X.X is the version number of NEC Numeric Library Collection)
(Y.Y is the version number of FFTW3 interface)

You can freely use it under the BSD license.

Available routines

The available routines are listed in FFTW3 interface routines.

Remarks
  • If any of the following flags is set in the "flags" argument of a plan creation routine, it is ignored and FFTW_ESTIMATE is applied instead.
    • FFTW_MEASURE
    • FFTW_PATIENT
    • FFTW_EXHAUSTIVE
    • FFTW_WISDOM_ONLY
  • When a plan cannot be created, the plan creation routine returns 0.
  • Interface routines other than those listed in FFTW3 interface routines are not supported.
  • The names of the interface routines for legacy Fortran differ from those for Fortran 2003. If you use the FFTW3 interface for legacy Fortran, change the prefixes of the routine names as follows:
    For double precision routines
    "fftw_"  ⇒  "dfftw_"
    For single precision routines
    "fftwf_"  ⇒  "sfftw_"
  • If you create a plan for a multi-dimensional FFT by using the FFTW3 interface for Fortran 2003, the order of the array dimensions must be reversed compared with the FFTW3 interface for legacy Fortran, as follows:
    For Fortran 2003
     complex(C_DOUBLE_COMPLEX), dimension(NX, NY, NZ) :: in, out
     plan = fftw_plan_dft_3d(NZ, NY, NX, in, out, FFTW_FORWARD, FFTW_ESTIMATE)
    
    For legacy Fortran
     complex(8), dimension(NX, NY, NZ) :: in, out
     call dfftw_plan_dft_3d(plan, NX, NY, NZ, in, out, FFTW_FORWARD, FFTW_ESTIMATE)
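
Two of the remarks above, the reversed dimension order in the Fortran 2003 interface and the return value 0 on failure, can be combined in one short sketch. It assumes that the plan is a type(C_PTR), so that a returned 0 corresponds to a null C pointer that c_associated() can detect; the program name and the dimension values are illustrative.

```fortran
program asl_fftw3_3d_sketch
  use, intrinsic :: iso_c_binding
  implicit none
  include "aslfftw3.f03"

  integer(C_INT), parameter :: NX = 4, NY = 6, NZ = 8   ! illustrative sizes
  complex(C_DOUBLE_COMPLEX), allocatable :: in(:,:,:), out(:,:,:)
  type(C_PTR) :: plan                                    ! assumed plan type

  allocate(in(NX, NY, NZ), out(NX, NY, NZ))

  ! The array is declared as (NX, NY, NZ), but the Fortran 2003
  ! interface takes the dimensions in reverse order: NZ, NY, NX.
  plan = fftw_plan_dft_3d(NZ, NY, NX, in, out, FFTW_FORWARD, FFTW_ESTIMATE)

  ! Assumption: a failed plan creation shows up as a null pointer.
  if (.not. c_associated(plan)) then
    print *, "plan creation failed"
    stop 1
  end if

  in = (1.0_C_DOUBLE, 0.0_C_DOUBLE)
  call fftw_execute_dft(plan, in, out)
  call fftw_destroy_plan(plan)
end program asl_fftw3_3d_sketch
```

In the legacy interface the plan is an integer argument, so the equivalent check would presumably compare it with 0 directly.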
    

FFTW3 interface routines

Plan Creation

Sequential / Shared Memory Parallel
  • fftw_plan_dft_1d()
  • fftwf_plan_dft_1d()
  • fftw_plan_dft_2d()
  • fftwf_plan_dft_2d()
  • fftw_plan_dft_3d()
  • fftwf_plan_dft_3d()
  • fftw_plan_dft()
  • fftwf_plan_dft()
  • fftw_plan_dft_r2c_1d()
  • fftwf_plan_dft_r2c_1d()
  • fftw_plan_dft_r2c_2d()
  • fftwf_plan_dft_r2c_2d()
  • fftw_plan_dft_r2c_3d()
  • fftwf_plan_dft_r2c_3d()
  • fftw_plan_dft_r2c()
  • fftwf_plan_dft_r2c()
  • fftw_plan_dft_c2r_1d()
  • fftwf_plan_dft_c2r_1d()
  • fftw_plan_dft_c2r_2d()
  • fftwf_plan_dft_c2r_2d()
  • fftw_plan_dft_c2r_3d()
  • fftwf_plan_dft_c2r_3d()
  • fftw_plan_dft_c2r()
  • fftwf_plan_dft_c2r()
  • fftw_plan_many_dft()
  • fftwf_plan_many_dft()
  • fftw_plan_many_dft_r2c()
  • fftwf_plan_many_dft_r2c()
  • fftw_plan_many_dft_c2r()
  • fftwf_plan_many_dft_c2r()
  • fftw_plan_r2r_1d()
  • fftwf_plan_r2r_1d()
  • fftw_plan_r2r_2d()
  • fftwf_plan_r2r_2d()
  • fftw_plan_r2r_3d()
  • fftwf_plan_r2r_3d()
  • fftw_plan_r2r()
  • fftwf_plan_r2r()
  • fftw_plan_many_r2r()
  • fftwf_plan_many_r2r()
  • fftw_plan_guru_dft()
  • fftwf_plan_guru_dft()
  • fftw_plan_guru_split_dft()
  • fftwf_plan_guru_split_dft()
  • fftw_plan_guru64_dft()
  • fftwf_plan_guru64_dft()
  • fftw_plan_guru64_split_dft()
  • fftwf_plan_guru64_split_dft()
  • fftw_plan_guru_dft_r2c()
  • fftwf_plan_guru_dft_r2c()
  • fftw_plan_guru_split_dft_r2c()
  • fftwf_plan_guru_split_dft_r2c()
  • fftw_plan_guru64_dft_r2c()
  • fftwf_plan_guru64_dft_r2c()
  • fftw_plan_guru64_split_dft_r2c()
  • fftwf_plan_guru64_split_dft_r2c()
  • fftw_plan_guru_dft_c2r()
  • fftwf_plan_guru_dft_c2r()
  • fftw_plan_guru_split_dft_c2r()
  • fftwf_plan_guru_split_dft_c2r()
  • fftw_plan_guru64_dft_c2r()
  • fftwf_plan_guru64_dft_c2r()
  • fftw_plan_guru64_split_dft_c2r()
  • fftwf_plan_guru64_split_dft_c2r()
  • fftw_plan_guru_r2r()
  • fftwf_plan_guru_r2r()
  • fftw_plan_guru64_r2r()
  • fftwf_plan_guru64_r2r()

Distributed Memory Parallel
  • fftw_mpi_plan_dft_2d()
  • fftwf_mpi_plan_dft_2d()
  • fftw_mpi_plan_dft_3d()
  • fftwf_mpi_plan_dft_3d()
  • fftw_mpi_plan_dft()
  • fftwf_mpi_plan_dft()
  • fftw_mpi_plan_dft_r2c_2d()
  • fftwf_mpi_plan_dft_r2c_2d()
  • fftw_mpi_plan_dft_r2c_3d()
  • fftwf_mpi_plan_dft_r2c_3d()
  • fftw_mpi_plan_dft_r2c()
  • fftwf_mpi_plan_dft_r2c()
  • fftw_mpi_plan_dft_c2r_2d()
  • fftwf_mpi_plan_dft_c2r_2d()
  • fftw_mpi_plan_dft_c2r_3d()
  • fftwf_mpi_plan_dft_c2r_3d()
  • fftw_mpi_plan_dft_c2r()
  • fftwf_mpi_plan_dft_c2r()
  • fftw_mpi_plan_many_dft()
  • fftwf_mpi_plan_many_dft()
  • fftw_mpi_plan_many_dft_r2c()
  • fftwf_mpi_plan_many_dft_r2c()
  • fftw_mpi_plan_many_dft_c2r()
  • fftwf_mpi_plan_many_dft_c2r()
  • fftw_mpi_plan_r2r_2d()
  • fftwf_mpi_plan_r2r_2d()
  • fftw_mpi_plan_r2r_3d()
  • fftwf_mpi_plan_r2r_3d()
  • fftw_mpi_plan_r2r()
  • fftwf_mpi_plan_r2r()
  • fftw_mpi_plan_many_r2r()
  • fftwf_mpi_plan_many_r2r()

Plan Execution

Sequential / Shared Memory Parallel
  • fftw_execute()
  • fftwf_execute()
  • fftw_execute_dft()
  • fftwf_execute_dft()
  • fftw_execute_dft_r2c()
  • fftwf_execute_dft_r2c()
  • fftw_execute_dft_c2r()
  • fftwf_execute_dft_c2r()
  • fftw_execute_r2r()
  • fftwf_execute_r2r()
  • fftw_execute_split_dft()
  • fftwf_execute_split_dft()
  • fftw_execute_split_dft_r2c()
  • fftwf_execute_split_dft_r2c()
  • fftw_execute_split_dft_c2r()
  • fftwf_execute_split_dft_c2r()

Distributed Memory Parallel
  • fftw_execute()
  • fftwf_execute()
  • fftw_mpi_execute_dft()
  • fftwf_mpi_execute_dft()
  • fftw_mpi_execute_dft_r2c()
  • fftwf_mpi_execute_dft_r2c()
  • fftw_mpi_execute_dft_c2r()
  • fftwf_mpi_execute_dft_c2r()
  • fftw_mpi_execute_r2r()
  • fftwf_mpi_execute_r2r()

Plan Destruction

  • fftw_destroy_plan()
  • fftwf_destroy_plan()

Utility Routines (common)

  • fftw_alignment_of()
  • fftwf_alignment_of()
  • fftw_init_threads()
  • fftwf_init_threads()
  • fftw_plan_with_nthreads()
  • fftwf_plan_with_nthreads()
  • fftw_cleanup_threads()
  • fftwf_cleanup_threads()
  • fftw_export_wisdom_to_filename()
  • fftwf_export_wisdom_to_filename()
  • fftw_export_wisdom_to_file()
  • fftwf_export_wisdom_to_file()
  • fftw_export_wisdom_to_string()
  • fftwf_export_wisdom_to_string()
  • fftw_export_wisdom()
  • fftwf_export_wisdom()
  • fftw_import_system_wisdom()
  • fftwf_import_system_wisdom()
  • fftw_import_wisdom_from_filename()
  • fftwf_import_wisdom_from_filename()
  • fftw_import_wisdom_from_file()
  • fftwf_import_wisdom_from_file()
  • fftw_import_wisdom_from_string()
  • fftwf_import_wisdom_from_string()
  • fftw_import_wisdom()
  • fftwf_import_wisdom()
  • fftw_forget_wisdom()
  • fftwf_forget_wisdom()

Utility Routines (sequential / shared memory parallel)

  • fftw_cleanup()
  • fftwf_cleanup()

Utility Routines (distributed memory parallel)

  • fftw_mpi_init()
  • fftwf_mpi_init()
  • fftw_mpi_local_size_2d()
  • fftwf_mpi_local_size_2d()
  • fftw_mpi_local_size_2d_transposed()
  • fftwf_mpi_local_size_2d_transposed()
  • fftw_mpi_local_size_3d()
  • fftwf_mpi_local_size_3d()
  • fftw_mpi_local_size_3d_transposed()
  • fftwf_mpi_local_size_3d_transposed()
  • fftw_mpi_local_size()
  • fftwf_mpi_local_size()
  • fftw_mpi_local_size_transposed()
  • fftwf_mpi_local_size_transposed()
  • fftw_mpi_local_size_many()
  • fftwf_mpi_local_size_many()
  • fftw_mpi_local_size_many_transposed()
  • fftwf_mpi_local_size_many_transposed()
  • fftw_mpi_cleanup()
  • fftwf_mpi_cleanup()

External Links

  1. FFTW Official site
  2. FFTW documentation (pdf format download)

Version Information

  • The API version this manual page targets: 1.3
  • This manual page version: 2.2.0-201225