P3DFFT User Guide
Parallel Three-Dimensional Fast Fourier Transforms, dubbed P3DFFT, is a library for scientific computing in a wide range of fields, such as physics, climatology, and chemistry. It was developed at SDSC by Dmitry Pekurovsky as a product of a Strategic Applications Collaborations (SAC) project.
- Current Version: 2.2; last updated 4/8/2008
- Subscribe to update announcements; improvements to this product are ongoing. Enter your e-mail address and select Parallel 3-D FFT from the list of subscriptions.
- Download P3DFFT
- See bottom of this page for package information.
Requirements
The latest version of the P3DFFT library is available on the SDSC Teragrid systems Datastar (IBM Power4), Blue Gene, and the IA-64 TG cluster, in the directory /usr/local/apps/p3dfft. To use P3DFFT on one of these systems, all that is required is an SDSC user account. Alternatively, P3DFFT can be installed on any system that provides either the IBM Engineering and Scientific Software Library (ESSL) or the Fastest Fourier Transform in the West (FFTW).
Features
- Parallel implementation with 2D data decomposition, overcoming an important limitation to scalability of other 3D FFT libraries implementing 1D, or slab, decomposition.
- Optimized for parallel communication and single-CPU performance.
- Built on top of well-optimized and flexible 1D FFT libraries.
P3DFFT uses 2D, or pencil, decomposition. This overcomes an important limitation to the scalability of other 3D FFT libraries, which implement 1D, or slab, decomposition: with slabs, the number of processors cannot exceed N, whereas with pencils it can be as large as N^2, where N is the linear problem size. P3DFFT has shown good scalability up to 32768 Blue Gene processors when integrated into a Direct Numerical Simulation (DNS) turbulence application (see scaling analysis presentation).
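The processor limits for the two decompositions can be illustrated with a short sketch. It is written in Python for readability and is not part of P3DFFT; the function names, the m1 x m2 processor grid, and the rule for distributing leftover points are assumptions made for illustration only.

```python
def max_tasks(n, decomposition):
    """Upper bound on the number of tasks for an n x n x n grid."""
    if decomposition == "slab":
        # 1D decomposition: the grid is cut along one axis only.
        return n
    if decomposition == "pencil":
        # 2D decomposition: the grid is cut along two axes.
        return n * n
    raise ValueError("decomposition must be 'slab' or 'pencil'")


def split(n, parts):
    """Sizes of `parts` nearly equal chunks covering n points.

    Assumption: leftover points go to the first chunks. The actual
    distribution rule used by P3DFFT is not stated in this guide.
    """
    base, extra = divmod(n, parts)
    return [base + (1 if i < extra else 0) for i in range(parts)]


def x_pencil_shape(n, m1, m2, row, col):
    """Local block held by task (row, col) of a hypothetical m1 x m2 grid.

    An X pencil keeps the full X extent and splits Y and Z.
    """
    return (n, split(n, m1)[row], split(n, m2)[col])


if __name__ == "__main__":
    n = 256
    print("slab limit:  ", max_tasks(n, "slab"))
    print("pencil limit:", max_tasks(n, "pencil"))
    print("local X pencil on task (0, 0) of a 16 x 32 grid:",
          x_pencil_shape(n, 16, 32, 0, 0))
```

For N = 256, for example, a slab decomposition stops at 256 tasks, while a pencil decomposition could in principle use up to 65536.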
P3DFFT is written in Fortran with MPI. It is optimized with regard to parallel communication and single-processor performance. It is built on top of a serial 1D FFT library: either ESSL on IBM systems or FFTW, an open source library. One of these libraries must be installed in order to build P3DFFT, and the one to use is chosen through compile options. Compile options also select single or double precision, and can request 1D decomposition instead of the default 2D decomposition in cases where the processor count does not exceed N.
In the forward transform, the input is a 3D array of real values and the output is a 3D complex array of Fourier coefficients. Note that only half of the complex coefficients are returned, since the other half can be restored by conjugate symmetry. In the backward transform, the input is the half-sized complex array and the output is the full-sized real array. Starting with X pencils, any given call requires two all-to-all transposes so that the Y and Z transforms can be processed in turn.
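The conjugate-symmetry property can be checked on a single line of data. The sketch below is plain Python with a naive O(n^2) DFT, not P3DFFT code. It keeps the first n/2 + 1 coefficients, which is a common convention for real-input transforms; whether P3DFFT stores its half-sized array in exactly this layout is an assumption here. All function names are hypothetical.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform of a sequence (for illustration only)."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def half_spectrum(x):
    """Keep coefficients 0 .. n//2 of the transform of a real sequence."""
    return dft(x)[: len(x) // 2 + 1]


def restore_full(half, n):
    """Rebuild all n coefficients using X[k] = conj(X[n - k]) for real input."""
    full = list(half)
    for k in range(len(half), n):
        full.append(half[n - k].conjugate())
    return full


if __name__ == "__main__":
    x = [1.0, 2.0, 0.5, -1.0, 3.0, 0.0, -2.5, 4.0]
    full = dft(x)
    restored = restore_full(half_spectrum(x), len(x))
    worst = max(abs(a - b) for a, b in zip(full, restored))
    print("coefficients kept:", len(half_spectrum(x)), "of", len(x))
    print("largest reconstruction error:", worst)
```

In a 3D real-to-complex transform the same symmetry holds across the whole array, so only one dimension needs to be stored at roughly half length.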


