GPUE-group.github.io/index.json at master · GPUE-group/GPUE-group.github.io · GitHub

[
{
	"uri": "https://gpue-group.github.io/data_analysis/output/",
	"title": "GPUE Output Format",
	"tags": [],
	"description": "",
	"content": " GPUE outputs data in a simple ASCII format in which every array is flattened and written element-by-element as a 1 dimensional array. For 1 dimensional simulations, these values can be plotted immediately with external plotters like gnuplot. For 2 and 3 dimensional simulations, the data must first be reshaped; this is straightforward and is described below.\nGPUE will automatically output the following files unless the -f flag is used:\n V_0: The potential of the first timestep K_0: The momentum-space components used in the split-operator method Ax_0, Ay_0, Az_0: The gauge field in the x, y, and z directions Bx_0, By_0, Bz_0: The magnetic field in the x, y, and z directions  Note that if the -J flag is used, the magnetic fields will be output in cylindrical coordinates.\nIf the -W flag is used, the wavefunction will be output every printstep, which is set by the -p flag. For example:\n./gpue -e 1000 -W -p 100  will run GPUE in real-time for 1000 steps and output the wavefunction every 100 steps. In 2 dimensions, this will also output vortex tracking values, and in 3 dimensions, this will output an additional Edges_n file, which is the Sobel filter of the wavefunction density. The Edges_n file can be used for vortex highlighting in 3 dimensions.\nThe -d flag sets the data directory relative to the gpue executable. If the data directory does not exist, it will be created.\nReshaping arrays for usage in various scripting languages If GPUE is run in 2 or 3 dimensions and it is necessary to analyze the data, it may be worthwhile to reshape the data. For example, in Python, this might look like:\n data = data_dir + \u0026quot;/\u0026quot; + val lines = np.loadtxt(data) A = np.reshape(lines, (xDim,yDim,zDim)) return A  This will create a 3 dimensional array that can be read as A[i,j,k]. 
A similar procedure must be done in 2 dimensions, simply dropping the k index and zDim from the array indexing and reshaping function, respectively.\nParams.dat file GPUE holds many variables in an unordered map for usage within the code itself. This allows us to keep a name associated with each variable for dynamic equation parsing and also allows us to output each variable into a single file for data analysis later. All of these variables can be found in the Params.dat file in the data directory. If the user needs a single variable for data analysis (such as xDim or dx), they can find it there. An example can be found here:\n[Params] dpz=125664 sepMinEpsilon=0 y0_shift=0 a0z=7.62574e-06 x0_shift=0 dy=1.95313e-07 fudge=1 winding=0 mask_2d=0.00015 yMax=2.5e-05 laser_power=0 gdt=0.0001 omegaX=6.283 xMax=2.5e-05 gammaY=1 omega=1 z0_shift=0 omegaZ=6.283 dpy=125664 DX=0 box_size=2.5e-05 interaction=1 thresh_const=1 omegaY=6.283 mass=1.44316e-25 a_s=4.76e-09 angle_sweep=0 a0x=7.62574e-06 a0y=7.62574e-06 dpx=125664 gDenConst=4.6095e-51 dt=0.0001 Rxy=0.36659 zMax=2.5e-05 pyMax=1.6085e+07 pxMax=1.6085e+07 pzMax=1.6085e+07 dz=1.95313e-07 dx=1.95313e-07 plan_dim3=5 plan_3d=6 plan_other2d=2 yDim=256 kill_idx=-1 gsteps=1 esteps=1 zDim=256 plan_dim2=4 plan_2d=1 atoms=1 xDim=256 dimnum=3 printSteps=100 kick_it=0 ramp_type=1 gSize=16777216 device=2 charge=0 result=0 plan_1d=3  "
},
{
	"uri": "https://gpue-group.github.io/intro/",
	"title": "Introduction to GPUE",
	"tags": [],
	"description": "",
	"content": "   \n  GPUE is a Gross\u0026ndash;Pitaevskii equation solver that is accelerated on GPU hardware with CUDA. Though the project began as a general method to study vortex dynamics in 2 dimensional Bose\u0026ndash;Einstein Condensates (BECs), it has since grown into a general-purpose BEC simulator using the Split-Operator method for 1, 2, and 3 dimensions and allows for dynamic gauge fields and potentials. The purpose of this documentation is to provide a general introduction to running GPUE for various purposes and provide a detailed guide for those wishing to develop GPUE in the future.\nGPUE has been developed primarily for HPC with Tesla-series Nvidia compute devices and should run on various Linux distributions. Though other platforms may run GPUE, they are not officially supported.\nPurpose of GPUE GPUE is GPU accelerated software to solve the Gross\u0026ndash;Pitaevskii equation for superfluid simulations of Bose\u0026ndash;Einstein Condensates (BECs) with a particular emphasis on vortex simulations generated via rotation or gauge fields. The Gross\u0026ndash;Pitaevskii equation is a nonlinear Schr\u0026ouml;dinger equation:\n$$ i\\hbar\\frac{\\partial\\Psi(\\mathbf{r},t)}{\\partial t} = \\left( \\frac{(p-mA)^2}{2m} + V(\\mathbf{r}) + g|\\Psi(\\mathbf{r},t)|^2\\right)\\Psi(\\mathbf{r},t) $$\nwhere $\\Psi(\\mathbf{r},t)$ is the many-body wavefunction of the quantum system, $m$ is the atomic mass, $p = -i\\hbar\\nabla$ is the standard momentum operator, $A$ is a gauge field to induce rotational effects, $V(\\mathbf{r})$ is a potential to trap the atomic system, $g = \\frac{4\\pi\\hbar^2a_s}{m}$ is a coupling factor, and $a_s$ is the scattering length of the atomic species. Here, the GPE is shown in 1 dimension, but it can easily be extended to 2 or 3 dimensions, if necessary.\nFrom this equation, we can describe how superfluid BEC systems will behave in an experimental setting. 
As such, the Gross-Pitaevskii equation is a powerful tool that allows theoretical quantum physicists to better understand superfluid dynamics and explore areas like quantum turbulence in a straightforward computational system.\nCurrently, there are standard software packages such as GPELab, written in MATLAB, and TrotterSuzuki, which uses GPU acceleration. These software packages are either too slow or too narrowly focused on other areas of superfluid simulations for simulating vortex dynamics quickly and efficiently. As such, we have designed GPUE to provide an experimentally realistic model of how a BEC could behave in superfluid turbulence simulations.\nIn addition, GPUE is able to simulate a wide variety of other BEC and standard Schr\u0026ouml;dinger equation experiments and also allows for the modification of potentials and gauge fields during both imaginary and real-time dynamics.\nThe simulation method GPUE uses the split-step (also called the split-operator) method to perform its simulations. Because of this, the simulation is split into two distinct parts: 1. Imaginary time evolution to find the lowest energy (ground) state of the quantum system 2. Real time evolution to determine how the BEC would behave in an actual experimental set-up\nIn most cases, GPUE simulations will consist of creating a standard wavefunction \u0026ldquo;guess\u0026rdquo; that is close to the ground state, followed by a small number of steps in imaginary time to find the ground state, and then subsequent simulation in real time.\n"
},
{
	"uri": "https://gpue-group.github.io/functionality/running_gpue/",
	"title": "Running GPUE",
	"tags": [],
	"description": "",
	"content": "  After building GPUE, it is important to run the unit tests with the ./gpue -u command and make sure all tests pass. If they do not, there could be a failure in the building process or a problem with running GPUE on the provided hardware.\nAfter the unit tests have passed, you are free to use GPUE to your heart\u0026rsquo;s content. The easiest way to run GPUE is by using the provided help menu with the ./gpue -h command and creating a command from that; however, a parameter file can also be used for this purpose if it is easier.\nThis section provides an introduction to the unit tests and a description of the help menu along with some example simulations to run.\nUnit tests GPUE provides a suite of unit tests for the following operations:\n Math Operator Test: A suite of tests for all in-built mathematical operators Double2 Functions Test: A suite of tests for all in-built operators on cufftDoubleComplex or double2 types Grid Test: This test ensures the GPU is functional by sending data back and forth to the GPU using a specified number of threads Parallel Summation Test: This tests the parallel summation routine necessary for imaginary-time evolution FFT Test: This is a simple test of the cuFFT library and makes sure that all plans used for FFT operations work Dynamic Test: This is a test of all Abstract Syntax Tree (AST) functions necessary for dynamic fields and potentials Bessel Test: This is a test of the polynomial approximation to the Bessel functions used in the AST data structure Make Complex Test: A test of the make_complex function that turns potentials into operators for evolution cMultPhi_test: Test of the cMultPhi kernel for evolution with an imprinted Phi parameter Evolution Test: Test of imaginary- and real-time evolution in 1, 2, and 3 dimensions. 
This first runs the simulation in imaginary time to ensure the appropriate energy for a single particle in the ground state of an n-dimensional harmonic oscillator and then runs the simulation in real-time to ensure the energy does not vary from this value.  These are all done with the ./gpue -u command. Please run this command after building GPUE to test your installation.\nThe most important of these tests is the Evolution Test, which checks the physical accuracy of the results: so long as the appropriate energy is found in the ground state of the harmonic oscillator, the code should be working as intended for simple evolution. If any test fails, please create an issue on GitHub.\nHelp menu To run GPUE, simply use the ./gpue command with appropriate flags for the desired simulation. To determine which flags should be used, please use the ./gpue -h command, which will output the following:\n GPUE Graphics-Processing Unit Gross--Piteavskii Equation solver Options: -A rotation Set gauge field mode [beta] -a Set flag to graph [deprecated] -b 2.5e-5 Set box size (xyzMax in 3d) -C 0 Set device (Card) for GPU computing -c 2 Set coords (1, 2, 3 dimensions) -D 0.0 Set's offset for kill (-K flag) distance radially -d data Set data directory - where to store / read data (here in data/) -E Perform energy calculation every writing step -e 1 Set esteps, number of real-time evolution steps -f Unset write to file flag -G 1 Set GammaY, ratio of omega_y to omega_x -g 1 Set gsteps, number of imaginary-time evolution steps -H phrase Print help menu, searching for \u0026quot;phrase\u0026quot; -h Print help menu, exit GPUE -I \u0026quot;param.cfg\u0026quot; Set Input parameter file -i 1 Set interaction strength between particles -j 1 Set threshold multiplier for easier vortex detection -J Set cylindrical coordinate output for B-field (no letters) -K 0 Selects vortex with specified ID to be killed/flipped/mutliply-charged -k 0 Set kick_it, kicking for Moire lattice simulations 0 = 
off, 1 = periodic kicking, 2 = single kick -L 0 Set l, vortex winding [2*pi*L] -l Set ang_mom flag to use angular momentum -m Set 2d_mask for vortex tracking -n 1 Set N, number of particles in simulation -O 0 Set angle_sweep, kicking potential rotation angle -P 0 Set laser_power, strength of kicking [hbar * omega_perp] -p 100 Set printSteps, frequency of printing -Q 0 Set z0_shift, z shift of kicking potential (bad flag, I know) -q 0 Set the vortex winding for imprinting/annihilation/flipping during real-time. The value is the charge of the new vortex -R 1 Set ramping flag for imaginary time evolution 1 for ramping up, 0 for ramping down -r Set read_wfc to read wavefunction from file -S 0 Set sepMinEpsilon, kicking potential lattice spacing scaling -s Set gpe, flag for using Gross-Pitaevskii Equation -T 1e-4 Set gdt, timestep for imaginary-time -t 1e-4 Set dt, timestep for real-time -U 0 Set x0_shift, x shift of kicking potential, shift of the imprinting vortex position defined by K and q -u Performs all unit tests -V 0 Set y0_shift, y shift of kicking potential, shift of the imprinting vortex position defined by K and q -v Set potential [beta] -W Set write_it, flag to write to file -w 0 Set Omega_z, rotation rate relative to omega_perp This acts as a value to multiply the gauge fields by -X 6.283 Set omega_x -x 256 Set xDim, dimension in X (power of 2, please!) -Y 6.283 Set omega_y -y 256 Set yDim, dimension in Y (power of 2, please!) -Z 6.283 Set omega_z, confinement in z-dimension -z 256 Set zDim, dimension in Z (power of 2, please!) Notes: - Parameters specified above represent the default values for the classic linear Schrodinger equation and may need to be modified to match your specific problem. - We use real units with an Rb87 condensate - You may generate a simple vortex lattice in 2d with the following command: ./gpue -x 512 -y 512 -g 50000 -e 1 -p 5000 -W -w 0.5 -n 1e5 -s -l -Z 10 - Thanks for using GPUE!  
If you would like to search this for a particular flag, please use the ./gpue -H search_variable command. For example, if I wanted to change omega_x I might search for it like so:\n./gpue -H omega  which will provide the following:\n-G 1 Set GammaY, ratio of omega_y to omega_x -P 0 Set laser_power, strength of kicking [hbar * omega_perp] -w 0 Set Omega_z, rotation rate relative to omega_perp -X 6.283 Set omega_x -Y 6.283 Set omega_y -Z 6.283 Set omega_z, confinement in z-dimension  This will make it easier to find the flags you might want for the simulation.\nSpecial considerations There are several flags that need special consideration here:\n -r: This flag will read in a wavefunction from file and will by default look for it in the data/wfc_load and data/wfci_load files. If these files do not exist, GPUE will ask for appropriate filenames. If the file lengths do not match the dimensions of the simulation, it will ask for new filenames again. -A: This flag will read in a gauge field, which can be rotation, constant, test, or file. If the gauge field is being read in from a file, it will look for the field in the data/Axgauge, data/Aygauge, and data/Azgauge files. If they do not exist or are of the wrong dimensions, GPUE will prompt for a new file. -v: This flag sets the potential, which defaults to harmonic (a harmonic oscillator); the torus option can be used in 3D. -I: This flag will read in a configuration file for dynamic fields, which will be described later in the documentation. -d: This flag will specify a data directory. All configuration files are expected to be in this directory. If the directory does not exist at the start of the simulation, GPUE will create the directory.  
Example simulation In addition to the ./gpue -u command, the help menu recommends running the following command:\n./gpue -x 512 -y 512 -g 50000 -e 1 -p 5000 -W -w 0.6 -n 1e5 -s -l -Z 100  Here, we will run a slightly modified version of this:\n./gpue -x 512 -y 512 -g 50001 -e 1 -p 5000 -W -w 0.6 -n 1e5 -s -l -Z 100 -d data_2d_example  This will simulate a BEC in imaginary time influenced by a typical angular momentum field and will provide several quantized vortices. Here are the parameters of this simulation:\n -x 512 -y 512: 512x512 resolution -g 50001: 50001 imaginary time steps. Note that we start counting at 0, so adding the 1 guarantees evolving through the 50000th step -e 1: 1 real time step. -p 5000: printing to terminal every 5000 steps -W: writing to file when printing to terminal -w 0.6: rotational omega value of 0.6 (1 is the maximum) -n 1e5: 100000 atoms -s: using nonlinear Schr\u0026ouml;dinger equation -l: turning on angular momentum (gauge fields) -Z 100: strongly confined in the Z direction with a trapping frequency of 100 -d data_2d_example: outputting data into the data_2d_example directory  If more ground state evolution timesteps were requested by increasing the value after the -g flag, a triangular vortex lattice can be generated.\nThe data can be plotted with the plot.py file in the py/ directory with the following command:\npython plot.py i wfc r 0 50000 5000 d data_2d_example  Here, we are plotting the wavefunction (i wfc) from 0 to 50,000 in steps of 5000 (r 0 50000 5000) in the data_2d_example directory (d data_2d_example).\nThis will provide a series of images that should look something like this:\n"
},
{
	"uri": "https://gpue-group.github.io/build/",
	"title": "Building GPUE",
	"tags": [],
	"description": "",
	"content": "   \n  GPUE is designed with both a traditional Makefile and CMake, both of which may be used for different purposes, depending on the hardware available.\nDependencies and Hardware We have attempted to maintain a small dependency list for GPUE, and so only require the following dependencies:\n CUDA (version $\\geq$7.5.18) CUFFT (bundled with CUDA) GCC (version $\\geq$4.9) or clang (version $\\geq$3.9) CMake (version $\\geq$3.8; optional)  GPUE performs computation almost entirely on GPU hardware, and the sm_2.0 (Fermi) to sm_7.0 (Volta) architectures are all supported. A list of all cards within these ranges can be found on wikipedia.\nCMake CMake is the preferred building system for GPUE; however, a CMake version $\u0026gt;=$ 3.8 must be used. In this case, run\ncmake .  in the primary GPUE directory and then run make as in the Makefile example. make clean will clean the directory for rebuilding.\nMakefile If you wish to build without CMake, a sample makefile is provided that may be modified to suit the CUDA paths on your test system. 
Here is a slightly modified GPUE Makefile:\nCUDA_HOME = /path/to/cuda/ GPU_ARCH = sm_XX OS:= $(shell uname) ifeq ($(OS),Darwin) CUDA_LIB = $(CUDA_HOME)/lib CUDA_HEADER = $(CUDA_HOME)/include CC = $(CUDA_HOME)/bin/nvcc -ccbin /usr/bin/clang --ptxas-options=-v CFLAGS = -g -std=c++11 -Wno-deprecated-gpu-targets else CUDA_LIB = $(CUDA_HOME)/lib64 CUDA_HEADER = $(CUDA_HOME)/include CC = $(CUDA_HOME)/bin/nvcc --ptxas-options=-v --compiler-options -Wall CHOSTFLAGS = #-fopenmp CFLAGS = -g -O3 -std=c++11 -Xcompiler '-std=c++11' -Xcompiler '-fopenmp' endif CUDA_FLAGS = -lcufft CLINKER = $(CC) RM = /bin/rm INCFLAGS = -I$(CUDA_HEADER) LDFLAGS = -L$(CUDA_LIB) EXECS = gpue # BINARY NAME HERE DEPS = ./include/constants.h ./include/ds.h ./include/edge.h ./include/evolution.h ./include/fileIO.h ./include/init.h ./include/kernels.h ./include/lattice.h ./include/manip.h ./include/minions.h ./include/node.h ./include/operators.h ./include/parser.h ./include/split_op.h ./include/tracker.h ./include/unit_test.h ./include/vort.h ./include/vortex_3d.h ./include/dynamic.h OBJ = fileIO.o kernels.o split_op.o tracker.o minions.o ds.o edge.o node.o lattice.o manip.o vort.o parser.o evolution.o init.o unit_test.o operators.o vortex_3d.o dynamic.o %.o: ./src/%.cc $(DEPS) $(CC) -c -o $@ $(INCFLAGS) $(CFLAGS) $(LDFLAGS) -Xcompiler \u0026quot;-fopenmp\u0026quot; -arch=$(GPU_ARCH) $\u0026lt; %.o: ./src/%.cu $(DEPS) $(CC) -c -o $@ $(INCFLAGS) $(CFLAGS) $(LDFLAGS) $(CUDA_FLAGS) -Xcompiler \u0026quot;-fopenmp\u0026quot; -arch=$(GPU_ARCH) $\u0026lt; -dc gpue: $(OBJ) $(CC) -o $@ $(INCFLAGS) $(CFLAGS) $(LDFLAGS) $(CUDA_FLAGS) -Xcompiler \u0026quot;-fopenmp\u0026quot; -arch=$(GPU_ARCH) $^ clean: @-$(RM) -f r_0 Phi_0 E* px_* py_0* xPy* xpy* ypx* x_* y_* yPx* p0* p1* p2* EKp* EVr* gpot wfc* Tpot 0* V_* K_* Vi_* Ki_* 0i* k s_* si_* *.o *~ PI* $(EXECS) $(OTHER_EXECS) *.dat *.eps *.ii *.i *cudafe* *fatbin* *hash* *module* *ptx test* vort* v_opt*;  To use the Makefile, the first 2 lines must be modified to reflect 
your desired cuda/ path and the architecture of your GPU device. After this, simply run make to build the code. If rebuilding is necessary, run make clean then make.\nIf you are developing GPUE, this file will need to be modified as new code is developed.\nTesting GPUE Once GPUE is built, please run unit tests with ./gpue -u and make sure everything passes. If there is a failure in the build, please create an issue on GitHub.\nExperimental Docker support Given the recent interest in containerised HPC software, we have provided some support for using GPUE within a Docker environment. Taking advantage of this requires installing the Nvidia CUDA runtime for Docker (see here for details). For a successfully installed (and working) Docker environment, a local GPUE Docker image can be built using:\ncd gpue/ docker build . After the build, the container may be run as\ndocker run --runtime=nvidia \u0026lt;IMAGE TAG\u0026gt; /gpue/gpue -u The provided GPUE path assumes it has been installed within the root (/) directory on the container. Additionally, an automated build is available for the latest changes of the GPUE source code using the Dockerhub mlxd/gpue:latest, wherein a build is triggered by a commit hook against the GPUE master branch.\nPlease note that native builds are expected to offer greater performance.\n"
},
{
	"uri": "https://gpue-group.github.io/data_analysis/python/",
	"title": "Python Scripts",
	"tags": [],
	"description": "",
	"content": " GPUE provides a number of Python scripts in the py/ directory. This page explains what each script is and its intended use.\nAs a note: GPUE is primarily a simulation program. Data analysis must be done on a case-by-case basis, and as such, these scripts only intend to provide basic functionality for plotting the output wavefunction and additional variables. If more advanced functions are required, please create these on your own or discuss them with us on GitHub.\nplot.py This file is a stand-alone plotting script for 2 dimensional data and can be run like so:\npython plot.py i wfc g 256 256 r 0 1000 100 d data  This will plot the wavefunction from imaginary-time evolution using a grid of 256x256 from values 0 to 1000 in steps of 100. The script requires these data files to exist in the data directory before plotting. The script will default to plotting only a single 512x512 data file from the data directory if no gridsize, range, or directory is provided.\nAll 2 dimensional variables can be plotted in this way. If real-time dynamics are desired, simply use the wfc_ev flag instead of wfc.\ngen_data.py This file is a list of functions necessary to analyze 3 dimensional data, and an example of how it might be used can be found in the vis_scripts.py file.\nHere are the primary functions and their intended usage:\nwfc_density(xDim, yDim, zDim, data_dir, pltval, i)  This function outputs an item that corresponds to the wavefunction density. This item can be turned into a .bvox or .vtk file with the to_bvox() and to_vtk() functions described later. The pltval argument can read either wfc for imaginary-time evolution or wfc_ev for real-time evolution.\nwfc_phase(xDim, yDim, zDim, data_dir, pltval, i)  This function outputs an item that corresponds to the wavefunction phase. This item can be turned into a .bvox or .vtk file with the to_bvox() and to_vtk() functions described later. 
The pltval argument can read either wfc for imaginary-time evolution or wfc_ev for real-time evolution.\nproj_phase_2d(xDim, yDim, zDim, data_dir, pltval, i)  This slices the wavefunction along the z-axis and outputs the phase as a wfc_ph_i file, where i corresponds to the value read into this function. The pltval argument can read either wfc for imaginary-time evolution or wfc_ev for real-time evolution. This file can be plotted with plot.py, for example.\nproj_2d(xDim, yDim, zDim, data_dir, pltval, i)  This slices the wavefunction along the z-axis and outputs the density as a wfc_i file, where i corresponds to the value read into this function. The pltval argument can read either wfc for imaginary-time evolution or wfc_ev for real-time evolution. This file can be plotted with plot.py, for example.\nvar(xDim, yDim, zDim, data_dir, pltval)  This function outputs an item that corresponds to the variable to be plotted. This item can be turned into a .bvox or .vtk file with the to_bvox() and to_vtk() functions described later. The pltval argument can be any existing file in the data directory that is formatted for 3 dimensional output.\nproj_var2d(xdim, yDim, zDim, data_dir, pltval, file_string)  This slices the 3 dimensional variable along the z-axis and outputs the phase as a file with the name file_string. This file can be plotted with plot.py, for example.\nproj_var1d(xdim, yDim, zDim, data_dir, pltval, file_string)  This slices the 3 dimensional variable along the z and y axes and outputs the phase as a file with the name file_string. This file can be plotted with any standard plotter, like gnuplot.\nto_bvox(item, xDim, yDim, zDim, nframes, filename)  This takes an item output from the wfc_phase(), wfc_density(), or var() functions and turns it into a .bvox file with name filename to be read by Blender. The nframes argument corresponds to the number of frames Blender will plot. For almost all cases, this should be 1. 
The .bvox file can be used further with the visualize_3d.py file.\nto_vtk(item, xDim, yDim, zDim, data_dir, filename)  This takes an item output from the wfc_phase(), wfc_density(), or var() functions and turns it into a .vtk file with name filename. This can be directly read into 3 dimensional plotters like ParaView and is the preferred method to visualize data in 3 dimensions.\nvisualize_3d.py This file works with the Blender bpy library to create a 3 dimensional image from a .bvox file. The voxelfile and outfile variables must be modified depending on the file input and output.\nIt is run with the following command:\nblender -b -P visualize_3d.py  To open up the Blender GUI with the density plotted, run:\nblender -P visualize_3d.py  en.py This is an example script of how to find the energy of a 2 dimensional simulation from GPUE. As a note, this functionality is already completely implemented in GPUE by outputting the energy every timestep with the -E flag; however, this script provides an example of how this can be done with Python.\nvort.py This file enables ordering and trajectory calculation of the vortices that are output from GPUE. All available vort_arr_XYZ files are loaded from the GPUE simulation output, and a linked-list data structure is created, wherein each vortex is assigned a unique ID (uid). These uid values are then ordered and swapped between subsequent timesteps to the most-likely candidate after each time-step of evolution. The resulting data, once ordered, is then output into a new file format vort_ord_XYZ.csv, with each vortex uid constant over the time-series. This allows for the creation of vortex trajectories, as is performed by matlab/vtxTrajectory.m. 
Some additional details are also given on the 2D GPUE functionality page.\nA typical workflow involving this file is as follows:\ngpue \u0026lt;simulation parameters\u0026gt; cd \u0026lt;data_directory\u0026gt; python py/vort.py mpi_vis.py This script enables MPI-distributed wavefunction density generation for 2D data only. $|\\Psi|^2$ is plotted over all available timesteps, and is distributed using mpi4py. To ensure this works, mpi4py MUST be built against the MPI library running on the local machine (be it a laptop, or a supercomputer). Some useful instructions on this are provided by NERSC under the \u0026ldquo;Building MPI4PY\u0026rdquo; subtitle.\nOnce set-up, this script can be run to generate images both with and without the colour-bar, and enumerating the specific vortices as analysed by vort.py, as follows:\nNote: to ensure this script runs correctly, vort.py should be run first to generate ordered vortex lists as vort_ord_XYZ.csv.\nmpirun -n 32 python py/mpi_vis.py true true #first true is for colour-bar, second true is for vortex enumeration in plot The result will be n PNG images, one for each wavefunction dataset.\nstats.py [deprecated] This file was originally developed for calculating statistical quantities over the vortex data. However, much of the functionality has been added to other segments of the code-base, and all that remains here is a sample script for the vortex least-squares position estimation, as discussed here. The example can be run as:\npython py/stats.py 0 1000 #where 0 and 1000 represent the start and end values for which to calculate the least-squares refinement. observables.py [deprecated] This file offers a re-write of the MATLAB code for calculation of the kinetic energy spectra, as well as the ability to calculate values for some known observables, such as the quadrupole mode, breathing mode, and angular momentum as key examples. 
This script expects the wavefunctions and Params.dat to be available in the directory it is run from. Values are calculated and plotted over the available timesteps, and output as PDF files.\n"
},
{
	"uri": "https://gpue-group.github.io/functionality/",
	"title": "GPUE functionality",
	"tags": [],
	"description": "",
	"content": "GPUE is a general-purpose BEC simulator and can be used for a number of different superfluid simulations. The basic usage of GPUE will be described here, including:\n Basic usage of GPUE for 1, 2, and 3 dimensional simulations Gauge fields in 2 and 3 dimensions Vortex tracking in 2 dimensions Dynamic fields and equation parsing  "
},
{
	"uri": "https://gpue-group.github.io/functionality/gauge/",
	"title": "Gauge Fields",
	"tags": [],
	"description": "",
	"content": "One notable feature of GPUE is its ability to use gauge fields to create rotational effects in BEC simulations. Gauge fields are somewhat tricky to understand and this guide is not meant to provide a full physical understanding of what the fields are or how they work. If you wish to learn more about the physical interpretation of these fields, this review by Jean Dalibard is relatively in-depth.\nThis section will highlight how gauge fields are implemented in GPUE and how to use them on your own as a user of the codebase. If you wish to provide a new default field in GPUE itself, please go to the developer guide for this later in the documentation and provide an appropriate PR to GPUE on GitHub.\n Introduction to gauge fields For the purposes of the GPUE simulations, we will be solving the non-linear Schr\u0026ouml;dinger equation:\n$$ i\\hbar\\frac{\\partial \\Psi(\\mathbf{r},t)}{\\partial t} = \\left( \\frac{(p-m\\mathbf{A})^2}{2m} + V_0 + g|\\Psi(\\mathbf{r},t)|^2 \\right)\\Psi(\\mathbf{r},t), $$\nwhere $\\Psi(\\mathbf{r},t)$ is a complex-valued wavefunction, $r$ is a real-space grid, $p$ is a momentum-space grid, $m$ is the mass of the atomic system, $V_0$ is a real-valued trapping potential, $g$ is a coupling constant, and $A$ is a real-valued gauge field. As mentioned, the gauge field is hard to physically interpret; however, it can be more easily understood through the artificial magnetic field, $B = \\nabla \\times A$. Here, it appears that rotation occurs around the magnetic field lines, so vortices in a BEC will follow the direction of the magnetic field at any moment in time.\nGauge fields are implemented in GPUE by splitting the operator up in position and momentum space. 
Because $p$ is completely in momentum space, while $A$ is in real space, the $\\frac{(p-mA)^2}{2m}$ term will have three different types of values after expanding:\n$$ \\frac{(p-mA)^2}{2m} = \\frac{p^2 + (mA)^2 - 2mpA}{2m} $$\nHere, $\\frac{p^2}{2m}$ will be in momentum-space, $\\frac{mA^2}{2}$ will be completely in position-space, and $pA$ will be in momentum-space for one dimension and position space for all other dimensions. For example, one possible gauge field can be created with the angular momentum operator as $xp_y - yp_x$.\nBecause of this, we need to perform a 1 dimensional FFT on our wavefunction in order to apply our gauge fields to the system. This means that when we run the simulation with gauge fields, we need to perform an additional FFT every timestep for each dimension of our system. In addition, we need to read in a field for each dimension. For example, with $pA = xp_y - yp_x$, we will use the following field:\n$$ \\begin{align} A_x \u0026amp;= -y \\newline A_y \u0026amp;= x \\newline A_z \u0026amp;= 0 \\end{align} $$\nTo apply this field, we first FFT our wavefunction in the x-direction, multiply by $A_x$, and iFFT back; we then do the same for $A_y$ and $A_z$. This is a costly process, which is why you should only use gauge fields if you need them.\nAs mentioned, this documentation does not intend to provide a full in-depth description of gauge fields and artificial magnetic fields. It instead attempts to provide the basics for users that may need more information about what gauge fields are and how they influence the simulation. Basically, if you want to do vortex simulations, you will need gauge fields.\nUsing gauge fields in GPUE GPUE comes packaged with a few in-built gauge fields:\n Rotation: This is used with the ./gpue -A rotation flag. Here, we apply the rotational operation to the wavefunction as $\\Omega(xp_y - yp_x)$, where $\\Omega$ is some rotation frequency. 
If a rotation above some critical rotation frequency is used, vortices will appear in the BEC, eventually creating a triangular vortex lattice. Constant: This is used with the ./gpue -A constant flag. Here, we simply read in $A_x = A_y = A_z = 0$. This is useful for debugging in certain cases. Test: This is used with the ./gpue -A test flag. Here, we apply a constant field in $A_y$ and $A_z$ and provide a sinusoidal field in $A_x$. This is another testing field that does not use standard rotation. File: This is used with the ./gpue -A file flag. Here, we read the fields in from file. By default, this will look for the fields in data/Axgauge, data/Aygauge, and data/Azgauge. If these files are not found, it will ask for an appropriate filename. If the file length does not match the dimensions of the system, it will also ask for a new file.  To run the simulation with gauge fields, the -l flag is necessary and will turn on the part of the simulation with the additional FFTs for applying the necessary gauge field. No gauge field will be applied without the -l flag!\nGauge fields can also be used with the dynamic fields to be covered in the next section!\n"
},
{
	"uri": "https://gpue-group.github.io/data_analysis/matlab/",
	"title": "Matlab Scripts",
	"tags": [],
	"description": "",
	"content": " GPUE provides a number of MATLAB scripts in the matlab/ directory. This page explains what each script is and its intended use. Please note that these scripts were verified to work under MATLAB r2016b, but have not been tested with newer releases. Some in-built function changes may require updating, but it is expected that most scripts work as standard. MATLAB development has ceased on this project, and any newer funcionality required should be performed in Julia, Python, or C++/CUDA directly.\nAs with the Python scripts, data analysis of this nature must be done on a case-by-case basis, and as such, these scripts only intend to provide basic functionality for data examination of the simulation results. If more advanced functions are required, please create these on your own or discuss them with us on GitHub\nbinData.m Bin the data returned from g6_struct into equi-separated partitions\nfunction [g6B,bin] = binData(g6C,binMax,bins) % g6C: Matrix containing the g6 values output from g6_struct % binMax: The maximum binning value % bins: The number of bins to take %Returns % g6B: The binned g6 data % bin: The bin values of g6B findNN.m Returns values and indices of nearest neighbours to location pos in positions X,Y within radius.\nfunction [neighbours,locations] = findNN(pos,X,Y,radius) getAngle.m Returns angle between 2 points on a plane.\nfunction [ang] = getAngle(P1,P2) kinSpec.m This function is used to process a large number of wavefunctions to examine the kinetic energy spectra. The function first transforms the position-space wavefunction into momentum space, and decomposes the kinetic energy into compressible (phonons) and incompressible (vortices) components (quantum pressure term assumed to be neglible for these routines).\nfunction [EKc,EKi] = kinSpec(wfc_str,dimSize,dx,mass,qSpec,init,incr,fin) % wfc_str is the string of the wavefunction name. 
For time-ev this is % usually wfc_ev; wfc_0_const or wfc_0_ramp for ground-states % dimSize is a vector with the dimension size, so that the wavefunctions % can be appropriately reshaped when loaded % dx is the increment along x. dx==dy % qSpec enables classical (0) or quantum (1) variant of energy spectra % init,incr,fin are the loop values for specifying the range of the data loadVtx.m Loads the processed vortices from the output file produced from vort.py, and structures them into MATLAB structs for vortex position and index. Forms the basis of analysing all vortex data output from GPUE.\nfunction [vorts] = loadVtx(start, step, fin, lostCheck) % start step and fin are the initial dataset, step size, and final % dataset to load for the vortex positions % lostCheck turns on the ability to test for vortex losses and ensure % that these vortices do not contribute to the dataset. % The Data should be in the format of [X Y Sign UID IsLost]  psi6.m Calculate the orientational order parameter defined at point pos between the points (X,Y)\n[nn,idx] = findNN(pos,X,Y,radius); %find number of neighbours and indices of neighbours for (X,Y) % pos: Defines the location to calculate the orientational order parameter % X,Y: Vector of entire X,Y range of points % radius: Radius over which to determine neighbouring points %Returns % psi6_pos: The value of orientational order psi_6 at position pos % nn: Number of nearest neighbours uniqPairIdx_precalc.m Calculate all unique pairings of the orientational correlations from the orientational order parameters, and create a struct to return that holds the distance between paired elements, and the respective orientational order values.\nfunction [S] = uniqPairIdx_precalc(R,psi6p) % R: Vectors of R=[X Y] values for points. 
% psi6p: Orientational order values defined over the range of points %Return: % S: Struct of orientational order values cor0,cor1 and distance % between elements voronoi2dCellColour.m Determine the Voronoi diagram of the input vortex position data, and colour the cells with the value of the orientational correlations defined at each site (colScheme=1), or with the area of the cell (colScheme=0).\nfunction [p6cp6, area, avg_area, num_edges, var]=voronoi2dCellColour(x,y,radius,edgeCol,dx,colScheme) %vorCellColour %x,y: are the locations of the sites %radius: is the area over which to perform the triangulation. This should % avoid too large values to ensure the triangulation does not run off to % infinity %edgeCol: Defines the Voronoi diagram edge colours for edge indexed pair. edgeCol\u0026lt;0 is % red, edgeCol==0 is black, edgeCol\u0026gt;0 is white %dx: this is the increment size of the data %colScheme: defines the colouring scheme, 1 for orientational correlations on the site % with 6-fold symmetry, 0 for cell area % Returns: p6cp6 = local values of correlations % area = cell areas % avg_area = average area of cells % num_edges = cell edges count % var = variance Testcase: Generate a 2D grid centred on 0 and calculate the Voronoi diagram:\nx=linspace(-1,1,20); y=x; radius = 0.5; dx = 1; q = ones(length(x)*length(y),1); voronoi2dCellColour(kron(x\u0026#39;,ones(length(y),1)),kron(ones(length(x),1),y\u0026#39;),0.75,zeros(length(x)^2,1),1,0); vtxTrajectory.m Generates trajectory plots of 2D vortex data from the postprocessed vortex positions as loaded by loadVtx.m. Since the processed data assumes an initial starting position, the data at 0 is named vort_arr_0, while postprocessed data is vort_ord_xxx. 
Plotting assumes the data to be centered at 0, spanning half of $dimSize$ in each direction.\nfunction [vorts] = vtxTrajectory(start, step, fin, initial,threeD, dx, dimSize, dt) % Start step and fin are the initial, stepsize and final datasets % initial indicates if you wish to highlight the % threeD also plots a spacetime (XYT) diagram in 3D. % dx is the stepsize along one dimension and assumes the same over x,y % dimSize is the length of the dimensions, and assumes x==y % dt is the time increment for the spacetime diagram. defectTriangulation.m Plots the Delaunay triangulation of input data points, with (5,7) (3,9) and (4,8) dislocations indicated. Can be used to calculate the defect numbers with no plotting by setting plotit to 0\nfunction [h,DefCount] = defectTriangulation(x, y, dx, dimSize, plotit, radius) % radius defines the distance from 0,0, for which the defects will be %considered. This can be used to avoid defects that arise naturally from %the edge of the triangulation % h is a handle for generated plot % DefCount is a vector of the total number of defects counted for each % type with the index representing the number of connected edges g6_struct.m Determines the orientational correlation function between every pairing of points, and sorts the results based on distance between points. This gives $g_6(r)$.\nfunction [g6C] = g6_struct(X,Y,rad) % X,Y: Vectors of x,y values for points. % rad: Radius in which to examine for neighbouring points. %Return: % g6C: Matrix of g6 values and parameters GPE_2d.m This script is an example code for simulating a rotating BEC using the split-operator based Gross\u0026ndash;Pitaevskii equation. This simulation code is merely a test-case for simple functionality tests of the Gross\u0026ndash;Pitaevskii equation.\nlatexFig.m LaTeXify the plot axes for an existing figure. 
This sets the plotting axis on the supplied figure to be LaTeX formatted.\nfunction latexFig(gca, fontSize, cbar, xtick, ytick, xticklabels, yticklabels) % {x,y}tick can be used to overwrite the automatically defined tick % locations % {x,y}ticklabels overwrites the labels at the positions defined by % {x,y}tick values % This assumes that the Latin Modern fonts are installed % Fontsize sets the size of the text % cbar enables the colorbar if needed Example usage:\npcolor(rand(10,10)); latexFig(gca, 30, 1, 2:2:10, 1:2:10, {\u0026#39;a1\u0026#39;,\u0026#39;b1\u0026#39;,\u0026#39;c1\u0026#39;,\u0026#39;d1\u0026#39;,\u0026#39;e1\u0026#39;}, {\u0026#39;a2\u0026#39;,\u0026#39;b2\u0026#39;,\u0026#39;c2\u0026#39;,\u0026#39;d2\u0026#39;,\u0026#39;e2\u0026#39;}) psi6_DT.m psi6.m variant for use in Delaunay triangulation/Voronoi cell calculation routines. Use of psi6.m is preferred.\nquKineticSpec.m Evaluates the quantum kinetic energy spectrum (see O\u0026rsquo;Riordan, 2017 for specific details). Used by kinSpec.m for evaluation of both quantum and classical kinetic energy spectra in compressible and incompressible cases.\nfunction [varargout] = quKineticSpec(Psir,m,dx,q,avg,iii) % Ek = quKineticSpec(Psir,m,dx,q,avg,iii), % Psir N-dim wavefunction in r space with grid spacing dx and mass m % q = include phase for quantum spectrum, 1=yes,0=no; % avg = perform angled average. Likely always want this to be yes, 1 % iii output index for naming the figure files. Useful when looped with % kinSpec script velField.m Calculates the velocity field of wavefunction, and plots the magnitude and direction of the velocity field for the given wavefunction in 2D.\nfunction [] = velField(wfc0,dx,m,incr,x,y,normed, lims) % dx is the increment along x, and assumes dx==dy % m is the mass of the atomic species % incr is the increment over which to calculate field direction. Too low % and it may be very dense, too high and very sparse. 
Start at 1, and % increase until happy % x,y are the grid spacings along the x and y axis % normed specifies whether to normalise the vector directions. 1 if yes, 0 otherwise % lims is [xMin xMax yMin yMax]. Hides the edge garbage. Otherwise, [] VtxCorr.m Plot the Voronoi diagram over the range of values spanned by dataSteps, and print the results to file. Essentially wraps voronoi2dCellColour and saves the resulting plots, with mean, variance and standard deviation values.\nfunction [] = VtxCorr(dataSteps, dx, dimSize, radius, printIt, edgeCol) % %dataSteps: The range of values to be plotted (e.g. [1e3 1e4 5e6]) %dx: this is the spatial increment size of the data %dimSize: The number of elements along one dimension. Used to calculate % max and min spatial values. Assumes x==y %radius: defines the bounded region to calculate the Voronoi diagram and all % quantities. Useful to avoid cells tending to infinity. % Assumes a circularly symmetric dataset. %printIt: 1 if plots are to be saved, 0 otherwise. %edgeCol: Defines the Voronoi diagram edge colours for edge indexed pair. % edgeCol==0 is black, edgeCol\u0026gt;0 is white % Returns %%Testcase: Generate diagrams for data at steps 1e3 1e4 and 1e5 % dx=1e-4; dimSize=1024; radius = 200*dx; printIt = 0; edgeCol = 1; % VtxCorr([1e3 1e4 1e5], dx, dimSize, radius, printIt, edgeCol)"
},
{
	"uri": "https://gpue-group.github.io/functionality/vortex_2d/",
	"title": "2D Vortex Tracking",
	"tags": [],
	"description": "",
	"content": " One of the major components of GPUE is the ability to track and manipulate vortices in Bose-Einstein condensates. While we can work with creating vortices in 3D, the majority of the vortex creation and manipulation framework exists solely in 2D. This is due to the 2D vortex code being developed for the project on vortex dynamics, published in PRA here (arXiV version here).\nTo allow us to track and manipulate vortices in a (2D) condensates, we require some numerical techniques and tools:\n Vortex detection. Vortex position refinement Vortex unique identification and tracking. Lattice model creation and arbitrary phase control.  1. Vortex detection To find a vortex in the condensate, we can rely on several methods, such as tracking the condensate density minima. However, given the nature of the wavefunction (a complex valued scalar field), and a quantum vortex (topological defect of the wavefunction), we can examine the phase of the condensate, $\\phi$, such that $\\Psi=|\\Psi|\\exp\\left(i \\phi\\right)$. Every vortex in the condensate will have integer winding in the complex plane, wherein the phase has a singular point around which it winds through $2\\pi$. By examining the condensate for these phase windings, we can identify the presence of a vortex to a location on our numerical grid.\nThe above image show the phase of a condensate containing four vortices. The zoomed region shows the numerical values of the sampled grid near this vortex core. By following a closed path around the dotted green line the the vortex core can be determined when the value is $\\pm 2\\pi$, positive for vortices, negative for anti-vortices (ie, vortices rotating in the opposite direction). In this instance, $$ \\displaystyle\\sum\\limits_{i=1}^{4} \\phi_i = 3.14 + 3.14 +2.76 -2.76 = 2\\pi. $$ From the above plaquette, we can say that a vortex core has been identified, and as a result, we can keep note of the planar indices corresponding to this.\n2. 
Vortex position refinement With the above formalism, we can identify the vortex core to the region of a $2\\times 2$ grid plaquette. However, we may also determine a better sub-grid resolution position of the core, by realising that a vortex core corresponds with a zero-crossing in the wavefunction for both the real and imaginary components. Following the formalism discussed in O\u0026rsquo;Riordan, 2017, we can calculate the 2D deviation from 0 for the phase within this region, $\\Delta \\mathbf{r} = (\\Delta x, \\Delta y)$, and use this value to update the position as $\\mathbf{r} \\rightarrow \\mathbf{r} - \\Delta\\mathbf{r}$.\nFor this we use the least-squares formalism, and attempt to minimise the function $$ S(\\mathbf{r}) = \\displaystyle\\sum_i |b_i - \\displaystyle\\sum_j A_{ij} r_j |^2 $$ with our observations as $$ \\mathbf{b} = \\left( \\begin{array}{cccc} \\Psi(x_0,y_0) \u0026amp; \\Psi(x_0,y_1) \u0026amp; \\Psi(x_1,y_0) \u0026amp; \\Psi(x_1,y_1) \\end{array} \\right)^{T}. $$ The subscripts are the $(x,y)$ indices in our plaquette (assuming either clockwise or counter-clockwise ordering is fine, provided we are consistent). Our $\\mathbf{A}$ matrix represents the $(x,y,c)$ data points, such that $x+y=-c$, on which we sample our wavefunction, and can be defined as $$ \\mathbf{A} = \\left( \\begin{array}{ccc} 0 \u0026amp; 0 \u0026amp; 1 \\\\\n0 \u0026amp; 1 \u0026amp; 1 \\\\\n1 \u0026amp; 0 \u0026amp; 1 \\\\\n1 \u0026amp; 1 \u0026amp; 1 \\end{array}\\right). $$\nThe minimisation problem can be solved by setting $\\partial_\\mathbf{r}S(\\mathbf{r})=0$. For a given $\\mathbf{A}$ and $\\mathbf{b}$, with an unknown position $\\mathbf{r}$, and an approximate vortex plaquette position, $\\mathbf{r}_0$, we can determine the unknown $\\mathbf{r}$ as \\begin{align} \\mathbf{A}^{T}\\mathbf{A} \\mathbf{r} \u0026amp;= \\mathbf{A}^{T}\\mathbf{b}, \\\\\n\\mathbf{r} \u0026amp;= (\\mathbf{A}^{T}\\mathbf{A})^{-1}\\mathbf{A}^{T}\\mathbf{b}. 
\\end{align}\nThe resulting $\\mathbf{r}$ then becomes \\begin{align} \\mathbf{r} \u0026amp;= \\left(\\begin{array}{c} x \\\\\ny \\\\\nc \\end{array}\\right) = \\left( \\begin{array}{c} {\\left( -\\Psi(x_0,y_0) - \\Psi(x_0,y_1) + \\Psi(x_1,y_0) + \\Psi(x_1,y_1) \\right)}{/2} \\\\\n{\\left( -\\Psi(x_0,y_0) + \\Psi(x_0,y_1) - \\Psi(x_1,y_0) + \\Psi(x_1,y_1) \\right)}{/2} \\\\\n{\\left( 3\\Psi(x_0,y_0) + \\Psi(x_0,y_1) + \\Psi(x_1,y_0) - \\Psi(x_1,y_1) \\right)}{/4} \\end{array}\\right). \\end{align}\nFrom this, we can determine the correction to the vortex plaquette position using $\\mathbf{r} = \\mathbf{r}_0 - \\Delta\\mathbf{r}$, and by solving the linear system $$ \\left(\\begin{array}{cc} \\Re(x) \u0026amp; \\Re(y) \\\\\n\\Im(x) \u0026amp; \\Im(y) \\\\\n\\end{array}\\right) \\left( \\begin{array}{c} \\Delta x \\\\\n\\Delta y \\end{array}\\right) = - \\left( \\begin{array}{c} \\Re(\\mathrm{c}) \\\\\n\\Im(\\mathrm{c}) \\end{array}\\right), $$ which allows us to examine the zero crossings in both the real and imaginary planes for both $x$ and $y$ dimensions. Using this approach our vortex positions are now significantly more accurate, and can be used for trajectory calculations and statistical properties. When tracking vortices, GPUE outputs their discovered positions into a CSV file vort_arr_XYZ, one for every printed wavefunction time-point XYZ. A sample output of the CSV file format is\n# X, X_refined, Y, Y_refined, Winding 485,4.858494e+02,485,4.858132e+02,1 485,4.858717e+02,538,5.381655e+02,1 538,5.381728e+02,485,4.858345e+02,1 538,5.381506e+02,538,5.381441e+02,1 for a given four vortex condensate. The values for positions are given in terms of the numerical grid coordinates. To determine actual positions, for a given simulation, it is necessary to examine Params.dat, and check the grid-size ($\\textrm{xDim}$), grid-increments ($\\textrm{dx}$), and normalise the values from 0. As an example, $X_\\textrm{pos} = (X_\\textrm{refined}-\\textrm{xDim}/2)*\\textrm{dx}$. 
For the above simulation, $\\textrm{xDim}=1024$, $\\textrm{dx}=6.84732\\times 10^{-7}~\\textrm{(m)}$, giving a position of $X_\\textrm{pos} = -1.7906\\times 10^{-5}$ for a condensate centered on $(0,0)$ at grid point $[511,511]$ (or $[512, 512]$ if you are a 1-based indexing person).\n3. Vortex unique identification and tracking. If we are interested in statistical quantities such as vortex count, average separation, or distribution of windings (rotation directions), we can use the above methods. However, sometimes we wish to identify and track unique vortices over the course of a simulation. As we are employing a field-based simulation in GPUE, namely solving the Gross-Pitaevskii equation for the wavefunction at points in time, we cannot maintain knowledge of our vortices between timesteps: each newly simulated wavefunction contains no particle-like objects that can be easily tracked. Instead, we must re-run the above tracking methods for each wavefunction at each point in time to determine the vortex positions.\nGiven these newly determined vortex positions, it is important to identify which vortices at previous timepoints correspond to vortices in the current timepoints. For this, we rely on the unordered vortex positions file vort_arr_XYZ, and the Python script py/vort.py. This script reads all available vort_arr_XYZ files, creates a vortex with a unique index for each entry, and maintains these in a linked list (the ability to move and link vortices becomes important for larger condensates). For every vortex, at timepoint $t_i$, the identified vortices at $t_{i-1}$ are compared, with a distance threshold set to determine if the vortex within a given radius could have moved through the set range between the timesteps. 
The vortex that is closest between timesteps is identified as the successor of the previous time-point\u0026rsquo;s unique index, and the list is updated to reflect this new information.\nUpon processing this data, the script outputs a new file vort_ord_XYZ.csv, which can be used to examine vortex trajectories over time. The new CSV file changes the output format to # X_refined, Y_refined, Winding, UID, onFlag. The onFlag in this instance is used to indicate whether a vortex which was found at an earlier time point exists at later time points. Given certain perturbations can remove vortices from the condensate, this flag is a useful indicator of the physical behaviour. It is worth noting that the first 2 timepoints of analysis do not output files, as these are used to set up the ordering of the later observed vortex data.\n4. Lattice model creation and arbitrary phase control. For condensates with multiple vortices, it is well known that the vortices arrange themselves into an ordered lattice. Therefore, it can be useful to have a lattice model for the vortices in rotating condensates. The C++ files include/lattice.h, include/node.h, and include/edge.h define the function signatures that create lattices of vortices, and maintain this structure as a graph. Each vortex represents a node having a unique index and the previously determined positions, the nearest vortices are connected as edges determined by nearest-neighbour distances, and the overall structure is maintained as a lattice. This data structure allows us to address specific vortices in the condensate, read their properties, and even manipulate them before output to a vort_arr_XYZ file.\nGiven the ability to index a vortex, we can also implement functions that operate on the vortices at given positions. For a vortex at position $\\mathbf{r}_1 = (x_1,y_1)$, one such application of the lattice model is that we can modify the phase at this position. 
For the previously mentioned work, we chose to annihilate and add vortices to the condensate at pre-defined positions by applying a phase profile that matches that of a vortex core. As such, the lattice model allows one to read this position directly for a vortex of arbitrary UID, and use the resulting position for further manipulations. Such functions to apply arbitrary phase profiles are defined in include/manip.h.\n"
},
{
	"uri": "https://gpue-group.github.io/data_analysis/",
	"title": "Data analysis",
	"tags": [],
	"description": "",
	"content": "GPUE outputs data in a basic ASCII format so it is easy for users to read the data into an auxiliary program and analyze it as necessary; however, GPUE also provides a series of scripts for 2 and 3D analysis with the following functionality:\n Plotting in 2 dimensions Generation of 2 dimensional slices of 3 dimensional data Generation of .vbox or .vtk files in 3 dimensional for plotting in blender and paraview, respectively Generation of images that can later be concatenated into a video with standard tools (ffmpeg, ImageMagick, etc.)  GPUE is primarily a simulation program and thus provides only limited tools necessary to visualize the data. If the user requires more advanced data analysis, these must be further developed by the user for their specific research purpose.\n"
},
{
	"uri": "https://gpue-group.github.io/functionality/dynamic_fields/",
	"title": "Dynamic Fields",
	"tags": [],
	"description": "",
	"content": "GPUE allows for users to input their own custom, dynamic fields that can vary with time, space, and auxiliary parameters. These fields are parsed during run-time and allow users to run GPUE with multiple different distributions without recompiling. This was done for the following reasons:\n GPUE hardware is inherently limited in memory. After some development, we found that there simply was not enough room on older GPU hardware for the wavefunction, gauge fields, and auxiliary operators for evolution. If we wanted to change these fields during evolution, that is very difficult to do on the GPU because transferring between the CPU host and GPU device is a slow process \u0026ndash; even with recent progress in NVLink technology! We cannot assume that all users are also developers. Adding new operators to GPUE was once a very difficult process that would take a few hours on the part of the developer. This obviously meant that users could only use GPUE for a few small-scale simulations and could not use it for general use. By allowing users to input their own fields during run-time, GPUE can be used for a much larger number of cases.   Using dynamic fields All dynamic fields must be specified in some configuration file to be read by the ./gpue -I file.cfg flag.\nAn example is provided in the src/example.cfg file:\nrad = 10 omegaR = 7071 V = rad*omegaR*x*y*z  Here, we specify whatever variables we like. This also allows us to redefine parameters for the simulation. The terms x, y, z, t, Ax, Ay, Az, V, and K are all reserved and recognized by the parsing scheme. All standard mathematical operations are also supported.\nAs a note: using the dynamic parser in this way saves on memory usage; however, it also slows down the simulation slightly. Dynamic fields should only be used if your simulation requires time-dependent potentials or gauge fields or if you need to save on memory.\n"
},
{
	"uri": "https://gpue-group.github.io/development/",
	"title": "Development",
	"tags": [],
	"description": "",
	"content": " Here is the DOXYGEN generated documentation.\nThis section is devoted to particularly difficult or tricky parts of the GPUE codebase that deserve special attention for future development. More information will come as features are developed.\nCuFFT usage with angular momentum/gauge fields To implement angular momentum in the split-operator based pseudo-spectral methods, we must take special care of these evolution operators. As outlined in O\u0026rsquo;Riordan, 2017, th non-commutative nature of the position and momentum space operators required for angular momentum present a challenge \u0026mdash; we cannot truly implment a numerical model without a certain degree of error. This is well documented in the above literature, though what is not discussed is an implementation of this method using CUDA and the CuFFT library.\nLet\u0026rsquo;s assume that we have a 3D dataset, with the data layout given by the following figure:\n        Fig. 1: Data layout for a $2\\times 2\\times 2$ grid.    For simplicity we assume a $2\\times 2\\times 2$ grid, giving us 8 data elements. Here we assume indexing is performed along $(x,y,z)$, where $x$ is the fastest axis and $z$ is the slowest axis.\nFor rotation about a specific axis, we assume that the applied operator is applied in a planar (2D) manner that is constant along this axis. In other words, $\\Omega_z = x\\p_y - yp_x$. For our split-operator method, we attain the $k$-space basis along a specific axis by performing a Fourier transform along this axis, and with appropriate scaling the resulting momentum space basis.\nLet us now look at the layout of the data in memory for the above $2^3$ cube. In the given layout, keeping $y,z$ constant, all $x$ data elements are adjacent. This is optimal for computational performance, as CUDA loads data in the width of warps at any instance. Data loads are expensive, and to allow for the best performance we must keep loads (aka reads) to a minimum. 
To act along the $x$ axis, given that the data is always adjacent, Fourier transforms can be nicely batched to transform each respective $x$, for each combination of $y,z$.\nFig. 2: Memory locations and connections for data along each respective axis. $x$-axis data is adjacent, and hence is the fastest to process as data is read in chunks of the CUDA warp size. The dashed straight lines represent the individual divisions between independent data for transforming. The curved lines show the links between data of the same (dependent) transforms.\nIf we wish to transform along one of the other axes, say $y$, this becomes a challenge, as the data layout is no longer adjacent. To calculate $\\Omega_z$ we require the calculation of $p_x$ and $p_y$, thus we must deal with this non-adjacency for any rotation operations. The CuFFT API supports 1D, 2D, 3D, and n-D transforms, but only assumes transforms are performed over all dimensions of a dataset (ie, there are no functions for selective transformation of a single axis). One might say that this is appropriately solved by a transpose in 2D, or a permutation in 3+ dimensions. As of the time of writing, no open-source in-place memory transpose/permutation implementation exists, and out-of-place is not an option if 3D data sets are to be stored on the GPU. CuFFT has internal transpose operations, but none that are publicly exposed (these functions are called as part of the FFT transform calls, and observable in cuda-gdb or nvprof).\nAn initial implementation to calculate the appropriately transformed data using the in-built CuFFT functions can be performed by batching 1D FFTs for the adjacent ($x$) data set, and 2D forward-1D inverse (2DF-1DI) for the non-adjacent ($y$) data. This transforms the data along a 2D slice, and inverts the transform of the adjacent data set, so the remaining data set is the Fourier transformed non-adjacent data. However, this method is extremely inefficient, as it performs many more transforms than necessary. 
We can instead manipulate the data by appropriately batched CuFFT transforms, making use of the manual control provided by cufftPlanMany, wherein we can control the stride, distance, offsets and size of the data being transformed. While not as performant as an FFT over adjacent data, the prior constraint of no out-of-place transposes/permutes is maintained as we are strictly using CuFFT function calls, leaving the underlying implementation to manipulate the data transforms.\nParam: Property\ninembed: Number of elements in input data to be transformed\nonembed: Number of elements in output data following transform\nidist: Spacing between consecutive input data sets in memory for batched transforms\nodist: Spacing between consecutive output data sets in memory following batched transforms\nistride: Spacing between input data elements in the same dataset in memory\nostride: Spacing between output data elements in the same dataset in memory\nbatch: Number of transforms to perform in batched mode\nThe following script demonstrates the use of the above parameters to control the transforms in 2D and 3D. The transform of the fast-axis ($x$) data is simply performed by using batched 1D transforms using cufftPlan1d, with the comparative result obtained also from the cufftPlanMany method using the above parameters. To transform the non-adjacent ($y$) data, first we examine the 2DF-1DI result as previously mentioned, and compare the result with the cufftPlanMany method. In 2D this is natively supported by giving the above parameters the appropriate values for the data set. In 3+ dimensions it is worth noting that for this to work we must shift the pointer of the data by the product of the lengths of all faster axes. The slowest axis can then simply be treated without any need for pointer shifting. 
The earlier figure shows dotted lines wherever the transform boundaries do not cross.\n#include \u0026lt;stdio.h\u0026gt;\n#include \u0026lt;cstdlib\u0026gt;\n#include \u0026lt;cmath\u0026gt;\n#include \u0026lt;cufft.h\u0026gt;\n#include \u0026lt;cuda.h\u0026gt;\n#include \u0026lt;iostream\u0026gt;\n#include \u0026lt;assert.h\u0026gt;\n // ********************************************************* // // Error checking macros // ********************************************************* //  #define ERR_CHECK(err_val) { \\  cudaError_t err = err_val; \\ if (err != cudaSuccess) { \\ fprintf(stderr, \u0026#34;Error %s at line %d in file %s\\n\u0026#34;, \\ cudaGetErrorString(err), __LINE__, __FILE__); \\ exit(1); \\ } \\ } #define FFT_ERR_CHECK(err_val) { \\  cufftResult err = err_val; \\ if (err != CUFFT_SUCCESS) { \\ fprintf(stderr, \u0026#34;Error %d at line %d in file %s\\n\u0026#34;, \\ err, __LINE__, __FILE__); \\ exit(1); \\ } \\ } // ********************************************************* // // FT params for ...many plans // ********************************************************* //  struct FTParams{ int numTranforms; int numLoops; int stride; int dist; int offset; }; // ********************************************************* // // Sample cuda kernels // ********************************************************* //  __host__ __device__ double retVal(double val) { return val; } __global__ void copyVal(double *inData, double *outData) { outData[threadIdx.x] = inData[threadIdx.x]; } // ********************************************************* // // FT test functions // ********************************************************* //  void fftIt2D(){ cudaDeviceReset(); int dimSize = 6; int NX = dimSize*dimSize; int numTransform = std::round(sqrt(NX)); int sqrtNX = std::round(sqrt(NX)); int dims1D[] = {(NX)}; int dims2D[] = {sqrtNX,sqrtNX}; int rank = 1; int inembed[] = {NX}; int onembed[] = {NX}; int istride[] = {1,sqrtNX}; // Consecutive elements, same signal  int ostride[] = {1,sqrtNX}; //  int 
idist[] = {sqrtNX,1}; // Consecutive signals  int odist[] = {sqrtNX,1}; cufftHandle planMany, plan1D, plan2D; cufftDoubleComplex *data_H1DFFT, *data_H2DF1DI, *data_HmanyFFT, *data_H0, *data_D; ERR_CHECK( cudaMalloc( (cufftDoubleComplex**) \u0026amp;data_D, sizeof(cufftDoubleComplex)*NX) ); data_H1DFFT = (cufftDoubleComplex*) malloc(sizeof(cufftDoubleComplex)*NX); data_H2DF1DI = (cufftDoubleComplex*) malloc(sizeof(cufftDoubleComplex)*NX); data_HmanyFFT = (cufftDoubleComplex*) malloc(sizeof(cufftDoubleComplex)*NX); data_H0 = (cufftDoubleComplex*) malloc(sizeof(cufftDoubleComplex)*NX); // ********************************************************* //  // Create the input data  // ********************************************************* //  std::cout \u0026lt;\u0026lt; \u0026#34;INPUT:\\n\u0026#34;; for(int ii=0; ii\u0026lt;sqrtNX; ++ii){ for(int jj=0; jj\u0026lt;sqrtNX; ++jj){ data_H0[jj + ii*sqrtNX].x = (double) ii; data_H0[jj + ii*sqrtNX].y = (double) jj; std::cout \u0026lt;\u0026lt; data_H0[jj + ii*sqrtNX].x \u0026lt;\u0026lt; \u0026#34; + 1i*\u0026#34; \u0026lt;\u0026lt; data_H0[jj + ii*sqrtNX].y \u0026lt;\u0026lt; \u0026#34;\\t\u0026#34;; } std::cout \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; } std::cout \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; ERR_CHECK( cudaMemcpy(data_D, data_H0, sizeof(cufftDoubleComplex) * NX, cudaMemcpyHostToDevice)); // ******************************************************************************** //  // First, check the 1D FFT along the standard dimension  // ******************************************************************************** //  FFT_ERR_CHECK(cufftPlan1d(\u0026amp;plan1D, dims2D[0], CUFFT_Z2Z, numTransform)); FFT_ERR_CHECK(cufftExecZ2Z(plan1D, data_D, data_D, CUFFT_FORWARD)); ERR_CHECK(cudaMemcpy(data_H1DFFT, data_D, sizeof(cufftDoubleComplex) * NX, cudaMemcpyDeviceToHost)); std::cout \u0026lt;\u0026lt; \u0026#34;OUTPUT 1D:\\n\u0026#34;; for(int ii=0; ii\u0026lt;sqrtNX; ++ii){ for(int jj=0; jj\u0026lt;sqrtNX; ++jj){ 
std::cout \u0026lt;\u0026lt; data_H1DFFT[jj + ii*sqrtNX].x \u0026lt;\u0026lt; \u0026#34; + 1i*\u0026#34; \u0026lt;\u0026lt; data_H1DFFT[jj + ii*sqrtNX].y \u0026lt;\u0026lt; \u0026#34;\\t\u0026#34;; } std::cout \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; } //Overwrite GPU data to original values  ERR_CHECK( cudaMemcpy(data_D, data_H0, sizeof(cufftDoubleComplex) * NX, cudaMemcpyHostToDevice)); // ******************************************************************************** //  // ******************************************************************************** //  // Next, check the Many FFT along the same expected dimension  // ******************************************************************************** //  FFT_ERR_CHECK(cufftPlanMany(\u0026amp;planMany, rank, dims2D, inembed, istride[0], idist[0], onembed, ostride[0], odist[0], CUFFT_Z2Z, numTransform)); FFT_ERR_CHECK(cufftExecZ2Z(planMany, data_D, data_D, CUFFT_FORWARD)); ERR_CHECK(cudaMemcpy(data_HmanyFFT, data_D, sizeof(cufftDoubleComplex) * NX, cudaMemcpyDeviceToHost)); std::cout \u0026lt;\u0026lt; \u0026#34;OUTPUT MANY 1D:\\n\u0026#34;; for(int ii=0; ii\u0026lt;sqrtNX; ++ii){ for(int jj=0; jj\u0026lt;sqrtNX; ++jj){ std::cout \u0026lt;\u0026lt; data_HmanyFFT[jj + ii*sqrtNX].x \u0026lt;\u0026lt; \u0026#34; + 1i*\u0026#34; \u0026lt;\u0026lt; data_HmanyFFT[jj + ii*sqrtNX].y \u0026lt;\u0026lt; \u0026#34;\\t\u0026#34;; } std::cout \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; } try { for (int ii=0; ii\u0026lt;NX; ++ii){ /* compare absolute differences so large negative errors are not missed */ assert( fabs(data_H1DFFT[ii].x - data_HmanyFFT[ii].x) \u0026lt; 1e-7 ); assert( fabs(data_H1DFFT[ii].y - data_HmanyFFT[ii].y) \u0026lt; 1e-7 ); } } catch (const char* msg) { std::cerr \u0026lt;\u0026lt; msg \u0026lt;\u0026lt; std::endl; } //Overwrite GPU data to original values  ERR_CHECK( cudaMemcpy(data_D, data_H0, sizeof(cufftDoubleComplex) * NX, cudaMemcpyHostToDevice)); // ******************************************************************************** //  // Check the 2D FFT Forward, 1D FFT back  
// ******************************************************************************** //  FFT_ERR_CHECK(cufftPlan2d(\u0026amp;plan2D, dims2D[0], dims2D[1], CUFFT_Z2Z)); FFT_ERR_CHECK(cufftExecZ2Z(plan2D, data_D, data_D, CUFFT_FORWARD)); FFT_ERR_CHECK(cufftExecZ2Z(plan1D, data_D, data_D, CUFFT_INVERSE)); ERR_CHECK(cudaMemcpy(data_H2DF1DI, data_D, sizeof(cufftDoubleComplex) * NX, cudaMemcpyDeviceToHost)); std::cout \u0026lt;\u0026lt; \u0026#34;OUTPUT 2DF-1DI:\\n\u0026#34;; for(int ii=0; ii\u0026lt;sqrtNX; ++ii){ for(int jj=0; jj\u0026lt;sqrtNX; ++jj){ std::cout \u0026lt;\u0026lt; data_H2DF1DI[jj + ii*sqrtNX].x/sqrtNX \u0026lt;\u0026lt; \u0026#34; + 1i*\u0026#34; \u0026lt;\u0026lt; data_H2DF1DI[jj + ii*sqrtNX].y/sqrtNX \u0026lt;\u0026lt; \u0026#34;\\t\u0026#34;; } std::cout \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; } //Overwrite GPU data to original values  ERR_CHECK( cudaMemcpy(data_D, data_H0, sizeof(cufftDoubleComplex) * NX, cudaMemcpyHostToDevice)); // ******************************************************************************** //  // ******************************************************************************** //  // Lastly, check the Many FFT along the other dimension  // ******************************************************************************** //  FFT_ERR_CHECK(cufftPlanMany(\u0026amp;planMany, rank, dims2D, inembed, istride[1], idist[1], onembed, ostride[1], odist[1], CUFFT_Z2Z, sqrtNX)); FFT_ERR_CHECK(cufftExecZ2Z(planMany, data_D, data_D, CUFFT_FORWARD)); //ERR_CHECK(cudaDeviceSynchronize());  ERR_CHECK(cudaMemcpy(data_HmanyFFT, data_D, sizeof(cufftDoubleComplex) * NX, cudaMemcpyDeviceToHost)); std::cout \u0026lt;\u0026lt; \u0026#34;OUTPUT MANY 1D Other:\\n\u0026#34;; for(int ii=0; ii\u0026lt;sqrtNX; ++ii){ for(int jj=0; jj\u0026lt;sqrtNX; ++jj){ std::cout \u0026lt;\u0026lt; data_HmanyFFT[jj + ii*sqrtNX].x \u0026lt;\u0026lt; \u0026#34; + 1i*\u0026#34; \u0026lt;\u0026lt; data_HmanyFFT[jj + ii*sqrtNX].y \u0026lt;\u0026lt; \u0026#34;\\t\u0026#34;; 
} std::cout \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; } try { for (int ii=0; ii\u0026lt;NX; ++ii){ //std::cout \u0026lt;\u0026lt; ( (data_H2DF1DI[ii].x/sqrtNX - data_HmanyFFT[ii].x) \u0026lt; 1e-7 ) \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;;  assert( fabs(data_H2DF1DI[ii].x/sqrtNX - data_HmanyFFT[ii].x) \u0026lt; 1e-7 ); assert( fabs(data_H2DF1DI[ii].y/sqrtNX - data_HmanyFFT[ii].y) \u0026lt; 1e-7 ); } } catch (const char* msg) { std::cerr \u0026lt;\u0026lt; msg \u0026lt;\u0026lt; std::endl; } //Overwrite GPU data to original values  ERR_CHECK( cudaMemcpy(data_D, data_H0, sizeof(cufftDoubleComplex) * NX, cudaMemcpyHostToDevice)); // ******************************************************************************** //  // ******************************************************************************** //  // Free stuff  // ******************************************************************************** //  cufftDestroy(planMany);cufftDestroy(plan1D);cufftDestroy(plan2D); cudaFree(data_D); free(data_HmanyFFT);free(data_H0); free(data_H2DF1DI);free(data_H1DFFT); } void fftIt3D(){ int dimLength = 5; int NX = dimLength*dimLength*dimLength; int cbrtNX = std::round(std::cbrt(NX)); /* round to avoid truncating e.g. 4.999... to 4 */ int numTransform = cbrtNX*cbrtNX; int paramsMatrix[3][5] = {{cbrtNX*cbrtNX,1,1,cbrtNX,0},{cbrtNX,cbrtNX,cbrtNX,1,cbrtNX*cbrtNX},{cbrtNX*cbrtNX,1,cbrtNX*cbrtNX,1,0}}; FTParams params[3]; for(int ii=0; ii\u0026lt;3; ++ii){ params[ii].numTranforms = paramsMatrix[ii][0]; params[ii].numLoops = paramsMatrix[ii][1]; params[ii].stride = paramsMatrix[ii][2]; params[ii].dist = paramsMatrix[ii][3]; params[ii].offset = paramsMatrix[ii][4]; } int dims[] = {NX}; int dims3D[] = {cbrtNX,cbrtNX,cbrtNX}; int inembed[] = {cbrtNX,cbrtNX,cbrtNX}; int onembed[] = {cbrtNX,cbrtNX,cbrtNX}; int istride[] = {1,cbrtNX,cbrtNX*cbrtNX}; // Indexed value is respective dimensionality of the transform along a specific dimension.  
int ostride[] = {1,cbrtNX,cbrtNX*cbrtNX}; int idist[] = {cbrtNX,1,1}; // [Here][] // The next dimension  int odist[] = {cbrtNX,1,1}; cufftHandle planMany, plan1D, plan3D; cufftDoubleComplex *data_H1DFFT, *data_HmanyFFT, *data_H0, *data_D; ERR_CHECK( cudaMalloc( (cufftDoubleComplex**) \u0026amp;data_D, sizeof(cufftDoubleComplex)*NX) ); data_H1DFFT = (cufftDoubleComplex*) malloc(sizeof(cufftDoubleComplex)*NX); data_HmanyFFT = (cufftDoubleComplex*) malloc(sizeof(cufftDoubleComplex)*NX); data_H0 = (cufftDoubleComplex*) malloc(sizeof(cufftDoubleComplex)*NX); // ******************************************************************************** //  // Create the input data  // ******************************************************************************** //  std::cout \u0026lt;\u0026lt; \u0026#34;INPUT:\\n\u0026#34;; for(int ii=0; ii\u0026lt;cbrtNX; ++ii){ std::cout \u0026lt;\u0026lt; \u0026#34;C(:,:,\u0026#34; \u0026lt;\u0026lt; ii+1 \u0026lt;\u0026lt; \u0026#34;)=[\u0026#34;; for(int jj=0; jj\u0026lt;cbrtNX; ++jj){ for(int kk=0; kk\u0026lt;cbrtNX; ++kk){ data_H0[kk + cbrtNX*(jj + ii*cbrtNX)].x = (double) ii; data_H0[kk + cbrtNX*(jj + ii*cbrtNX)].y = (double) jj;//(double) jj;  std::cout \u0026lt;\u0026lt; data_H0[kk + cbrtNX*(jj + ii*cbrtNX)].x \u0026lt;\u0026lt; \u0026#34; + 1i*\u0026#34; \u0026lt;\u0026lt; data_H0[kk + cbrtNX*(jj + ii*cbrtNX)].y \u0026lt;\u0026lt; \u0026#34;\\t\u0026#34;; } std::cout \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; } std::cout \u0026lt;\u0026lt; \u0026#34;]\\n\u0026#34;; } std::cout \u0026lt;\u0026lt; \u0026#34;\\n--- \\n\u0026#34;; ERR_CHECK( cudaMemcpy(data_D, data_H0, sizeof(cufftDoubleComplex) * NX, cudaMemcpyHostToDevice)); // ******************************************************************************** //  // First, check the 1D FFT along the standard dimension  // ******************************************************************************** //  FFT_ERR_CHECK(cufftPlan1d(\u0026amp;plan1D, cbrtNX, CUFFT_Z2Z, cbrtNX*cbrtNX)); 
FFT_ERR_CHECK(cufftExecZ2Z(plan1D, data_D, data_D, CUFFT_FORWARD)); ERR_CHECK(cudaMemcpy(data_H1DFFT, data_D, sizeof(cufftDoubleComplex) * NX, cudaMemcpyDeviceToHost)); std::cout \u0026lt;\u0026lt; \u0026#34;OUTPUT 1D_1:\\n\u0026#34;; for(int ii=0; ii\u0026lt;cbrtNX; ++ii){ for(int jj=0; jj\u0026lt;cbrtNX; ++jj){ for(int kk=0; kk\u0026lt;cbrtNX; ++kk){ std::cout \u0026lt;\u0026lt; data_H1DFFT[kk + cbrtNX*(jj + ii*cbrtNX)].x \u0026lt;\u0026lt; \u0026#34; + 1i*\u0026#34; \u0026lt;\u0026lt; data_H1DFFT[kk + cbrtNX*(jj + ii*cbrtNX)].y \u0026lt;\u0026lt; \u0026#34;\\t\u0026#34;; } std::cout \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; } std::cout \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; } std::cout \u0026lt;\u0026lt; \u0026#34;\\n--- \\n\u0026#34;; //Overwrite GPU data to original values  ERR_CHECK( cudaMemcpy(data_D, data_H0, sizeof(cufftDoubleComplex) * NX, cudaMemcpyHostToDevice)); // ******************************************************************************** //  // ******************************************************************************** //  // Next, check the Many FFT along the same expected dimension  // ******************************************************************************** //  int tDim = 0; //Transform dimension  int dims2D[] = {cbrtNX,cbrtNX}; FFT_ERR_CHECK(cufftPlanMany(\u0026amp;planMany, 1, dims2D, inembed, cbrtNX, 1, onembed, cbrtNX, 1, CUFFT_Z2Z, cbrtNX)); for (int ii=0; ii\u0026lt;cbrtNX; ++ii){ FFT_ERR_CHECK(cufftExecZ2Z(planMany, \u0026amp;data_D[ii*cbrtNX*cbrtNX], \u0026amp;data_D[ii*cbrtNX*cbrtNX] , CUFFT_FORWARD)); } ERR_CHECK(cudaMemcpy(data_HmanyFFT, data_D, sizeof(cufftDoubleComplex) * NX, cudaMemcpyDeviceToHost)); std::cout \u0026lt;\u0026lt; \u0026#34;OUTPUT MANY 1D:\\n\u0026#34;; for(int ii=0; ii\u0026lt;cbrtNX; ++ii){ for(int jj=0; jj\u0026lt;cbrtNX; ++jj){ for(int kk=0; kk\u0026lt;cbrtNX; ++kk){ std::cout \u0026lt;\u0026lt; data_HmanyFFT[kk + cbrtNX*(jj + ii*cbrtNX)].x \u0026lt;\u0026lt; \u0026#34; + 1i*\u0026#34; 
\u0026lt;\u0026lt; data_HmanyFFT[kk + cbrtNX*(jj + ii*cbrtNX)].y \u0026lt;\u0026lt; \u0026#34;\\t\u0026#34;; } std::cout \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; } std::cout \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; } std::cout \u0026lt;\u0026lt; \u0026#34;\\n--- \\n\u0026#34;; //Overwrite GPU data to original values  ERR_CHECK( cudaMemcpy(data_D, data_H0, sizeof(cufftDoubleComplex) * NX, cudaMemcpyHostToDevice)); // ******************************************************************************** //  // ******************************************************************************** //  // Free stuff (plan3D is never created in this function, so it is not destroyed)  // ******************************************************************************** //  cufftDestroy(planMany);cufftDestroy(plan1D); cudaFree(data_D); free(data_HmanyFFT);free(data_H1DFFT);free(data_H0); } int main(void) { fftIt2D(); fftIt3D(); ERR_CHECK(cudaDeviceReset()); return (0); }  Compiling the above code with nvcc ./multiDFFT.cu -o mdfft -lcufft and running the resulting binary will showcase this method. By performing the FFT in this manner we can achieve an approximately 25% improvement over the original 2DF-1DI method. Making use of the CuFFT internal operations for in-place transforms also reduces memory usage, allowing us to examine much larger system sizes without hitting the memory limit (or to run the 3D code on smaller consumer GPUs).\n"
},
{
	"uri": "https://gpue-group.github.io/",
	"title": "",
	"tags": [],
	"description": "",
	"content": "\nDocumentation website for GPUE. If you are new to GPUE, please go to the introduction to learn what it's about!\nIf you are having trouble building GPUE, please go to the building instructions.\nIf you would like to learn about what GPUE can do, please go to the functionality page.\nAnd if you would like to begin developing for GPUE, please go to the development section.\nAll of these can be found in the navigational side-bar to the left.\nAn overview of GPUE can be found in our Journal of Open Source Software publication (pdf). If you use GPUE for your research, please let us know and cite us as:\n James Schloss and Lee James O'Riordan, GPUE: Graphics Processing Unit Gross–Pitaevskii Equation solver. Journal of Open Source Software, 3(32), 1037 (2018), https://doi.org/10.21105/joss.01037  "
},
{
	"uri": "https://gpue-group.github.io/build/gpue_build/",
	"title": "",
	"tags": [],
	"description": "",
	"content": ""
},
{
	"uri": "https://gpue-group.github.io/data_analysis/gpue_build/",
	"title": "",
	"tags": [],
	"description": "",
	"content": ""
},
{
	"uri": "https://gpue-group.github.io/development/gpue_build/",
	"title": "",
	"tags": [],
	"description": "",
	"content": ""
},
{
	"uri": "https://gpue-group.github.io/development/html/",
	"title": "",
	"tags": [],
	"description": "",
	"content": " GPUE: Main Page GPUE v1.0 GPU Gross-Pitaevskii Equation numerical solver for Bose-Einstein condensates GPUE Documentation Generated by 1.8.14 "
},
{
	"uri": "https://gpue-group.github.io/functionality/gpue_functionality/",
	"title": "",
	"tags": [],
	"description": "",
	"content": ""
},
{
	"uri": "https://gpue-group.github.io/intro/gpue_intro/",
	"title": "",
	"tags": [],
	"description": "",
	"content": ""
},
{
	"uri": "https://gpue-group.github.io/mathjax/",
	"title": "",
	"tags": [],
	"description": "",
	"content": " MathJax.Hub.Config({ tex2jax: { inlineMath: [['$','$'], ['\\\\(','\\\\)']], displayMath: [['$$','$$'], ['\\[','\\]']], processEscapes: true, processEnvironments: true, skipTags: ['script', 'noscript', 'style', 'textarea', 'pre','code'], TeX: { equationNumbers: { autoNumber: \"AMS\" }, extensions: [\"AMSmath.js\", \"AMSsymbols.js\"] } } });  "
},
{
	"uri": "https://gpue-group.github.io/categories/",
	"title": "Categories",
	"tags": [],
	"description": "",
	"content": ""
},
{
	"uri": "https://gpue-group.github.io/tags/",
	"title": "Tags",
	"tags": [],
	"description": "",
	"content": ""
}]