PyOFTK can now perform the Split-Step Fourier algorithm on the GPU via the PyOFTK.ssfgpu() function. The first draft of the code do not take into accound the Raman response time. The implementation was made very easy with the pyfft python package done by Bogdan Opanchuk which is the Apple’s FFT implementation ported for PyCuda and PyOpenCL.
Here some performance results. The first chart shows the execution time for a Soliton propagation simulation on 1000 steps for different problem sizes (temporal resolution). The performance of ssfgpu() running on a NVIDIA GTX 260 was compared with the plain SSF version done in C and the numpy version running on a intel core @ 3.2Ghz. The second chart shows the overall speedup for the different problem sizes.
Like the underlying FFT gpu implementation, the ssfgpu() start to really scream on large problem. I manage to keep the data on the device memory between each step in order to avoid the transfer bottleneck so the performance scaling is essentially the same as pyfft.
PyOFTK.ssfgpu() is not avalaible yet. I will push it to the github repository soon.
Here is an example of similariton interaction computed using ssfgpu().
from numpy import * from scipy import * import matplotlib.pyplot as plt from numpy.fft import fftshift import PyOFTK from pycuda.tools import make_default_context import pycuda.driver as cuda cuda.init() context = make_default_context() lambdaZero = 1.03e-6 T = 60.0 nt = pow(2,12) dt = T/float(nt) t = linspace(-T/2, T/2, nt) C = 2.99792458e-4 nz = 4000 dz = 0.01 betap = array([0.0 , 0.0 ,0.024]) gamma = 0.065 # Gain profil omega = PyOFTK.wspace(T,nt) vs = fftshift(omega/(2*pi)) wavelength = (1/((vs/C)+1/(lambdaZero)))*1e9 alpha = array([-0.2]) gainBandwidth = 50.0 alphaVec = PyOFTK.gainProfil(wavelength, alpha[0], gainBandwidth) dureePulse = 0.5 puissanceCrete = 2.0 u_ini_g = PyOFTK.gaussianPulse(t[0:nt/2],dureePulse,-7.5,puissanceCrete,1,0) u_ini_d = PyOFTK.gaussianPulse(t[nt/2:nt],dureePulse,7.5,puissanceCrete,1,0) u_ini = r_[u_ini_g, u_ini_d] [u_out, i_arch] = PyOFTK.ssfgpuFull(u_ini, dt, dz, nz, alpha, betap, gamma, context, 500, 1e-5) context.pop() plt.imshow(i_arch, aspect='normal') plt.show() |



