Matlab / CUDA: Ocean Wave Modeling

Question

Matlab / CUDA: Ocean Wave Modeling

I studied "Modeling Ocean Water" in an article by Jerry Tessendorf and tried to program a statistical wave model, but I did not get the correct result, and I do not understand why.

In my program, I tried to create a wave height field at time t = 0 without any further changes in time. After running my program, I didn’t get what I expected: enter image description here

Here is my source code:

 clear all; close all; clc; rng(11); % setting seed for random numbers meshSize = 64; % field size windDir = [1, 0]; % ||windDir|| = 1 patchSize = 64; A = 1e+4; g = 9.81; % gravitational constant windSpeed = 1e+2; x1 = linspace(-10, 10, meshSize+1); x = x1(1:meshSize); y1 = linspace(-10, 10, meshSize+1); y = y1(1:meshSize); [X,Y] = meshgrid(x, y); H0 = zeros(size(X)); % height field at time t = 0 for i = 1:meshSize for j = 1:meshSize kx = 2.0 * pi / patchSize * (-meshSize / 2.0 + x(i)); % = 2*pi*n / Lx ky = 2.0 * pi / patchSize * (-meshSize / 2.0 + y(j)); % = 2*pi*m / Ly P = phillips(kx, ky, windDir, windSpeed, A, g); % phillips spectrum H0(i,j) = 1/sqrt(2) * (randn(1) + 1i * randn(1)) * sqrt(P); end end H0 = H0 + conj(H0); surf(X,Y,abs(ifft(H0))); axis([-10 10 -10 10 -10 10]);

And the phillips function:

 function P = phillips(kx, ky, windDir, windSpeed, A, g) k_sq = kx^2 + ky^2; L = windSpeed^2 / g; k = [kx, ky] / sqrt(k_sq); wk = k(1) * windDir(1) + k(2) * windDir(2); P = A / k_sq^2 * exp(-1.0 / (k_sq * L^2)) * wk^2; end

Is there any source code for modeling for Matlab based simulation that can help me understand my mistakes? A quick Google search did not get any results.

Here is the “correct” result I got from CUDA FFT Ocean Simulation. I haven’t reached this behavior in Matlab yet, but I have stopped surfing in Matlab using the CUDA FFT Ocean Simulation data. Here's what it looks like:

I did an experiment and got an interesting result:

I took the generated h0 from "CUDA FFT Ocean Simulation". Therefore, I have to do ifft to convert from the frequency domain to the spatial domain for plotting. I did this for the same h0 using matlab ifft and using cufftExecC2C from the CUDA library. Here is the result:

CUDA ifft:

Matlab ifft:

Either I don’t understand some aspects of the implementation of cufftExecC2C or cufftExecC2C , and matlab ifft are different algorithms with different results.

By the way, the parameters for creating such a surface:

meshSize = 32
A = 1e-7
patchSize = 80
windSpeed = 10

+5

matlab simulation cuda simulator simulate

Eugene kolesnikov Feb 02 '15 at 13:56

source share

2 answers

Here is the work program.

First of all, the source code:

 clear all; close all; clc; rng(13); % setting seed for random numbers meshSize = 128; % field size windDir = [0.1,1]; patchSize = 200; A = 1e-7; g = 9.81; % gravitational constant windSpeed = 100; timeStep = 1/25; x1 = linspace(-10, 10, meshSize+1); x = x1(1:meshSize); y1 = linspace(-10, 10, meshSize+1); y = y1(1:meshSize); [X,Y] = meshgrid(x,y); % wave field i = 1:meshSize; j = 1:meshSize; % indecies [I,J] = meshgrid(i,j); % field of indecies Kx = 2.0 * pi / patchSize * (-meshSize / 2.0 + I); % = 2*pi*n / Lx, -N/2 <= n < N/2 Ky = 2.0 * pi / patchSize * (-meshSize / 2.0 + J); % = 2*pi*m / Ly, -M/2 <= m < M/2 K = sqrt(Kx.^2 + Ky.^2); % ||K|| W = sqrt(K .* g); % deep water frequencies (empirical parameter) P = zeros(size(X)); % Cant compute P without loops for i = 1:meshSize for j = 1:meshSize P(i,j) = phillips(Kx(i,j), Ky(i,j), windDir, windSpeed, A, g); % phillips spectrum end end H0 = 1/sqrt(2) .* (randn(size(X)) + 1i .* randn(size(X))) .* sqrt(P); % height field at time t = 0 rotate3d on; for t = 1:10000 % 10000 * timeStep (sec) Ht = H0 .* exp(1i .* W .* (t * timeStep)) + ... conj(flip(flip(H0,1),2)) .* exp(-1i .* W .* (t * timeStep)); [az,el] = view; surf(X,Y,real(signCor(ifft2(Ht),meshSize))); axis([-10 10 -10 10 -1e-4 1e-4]); view(az,el); blue = linspace(0.4, 1.0, 25)'; map = [blue*0, blue*0, blue]; %shading interp; % improve shading (remove "faceted" effect) colormap(map); pause(1/60); end

phillips.m: (I tried to vectorize the calculation of the Phillips spectrum, but I ran into a difficulty, which I will show later)

 function P = phillips(kx, ky, windDir, windSpeed, A, g) k_sq = kx^2 + ky^2; if k_sq == 0 P = 0; else L = windSpeed^2 / g; k = [kx, ky] / sqrt(k_sq); wk = k(1) * windDir(1) + k(2) * windDir(2); P = A / k_sq^2 * exp(-1.0 / (k_sq * L^2)) * wk^2; if wk < 0 P = 0; end end end

signCor.m: (This function is absolutely mysterious for me ... I copied it from the CUDA FFT Ocean Simulation implementation. Modeling works much worse without it. And again, I don’t know how to vectorize this function.)

 function H = signCor(H1, meshSize) H = H1; for i = 1:meshSize for j = 1:meshSize if mod(i+j,2) == 0 sign = -1; % works fine if we change signs vice versa else sign = 1; end H(i,j) = H1(i,j) * sign; end end end

The biggest mistake I made was that I used ifft instead of using ifft2 , so CUDA ifft and Matlab ifft did not match.

My second mistake was in these lines of code:

 kx = 2.0 * pi / patchSize * (-meshSize / 2.0 + x(i)); % = 2*pi*n / Lx ky = 2.0 * pi / patchSize * (-meshSize / 2.0 + y(j)); % = 2*pi*m / Ly

I had to write:

 kx = 2.0 * pi / patchSize * (-meshSize / 2.0 + i); % = 2*pi*n / Lx ky = 2.0 * pi / patchSize * (-meshSize / 2.0 + j); % = 2*pi*m / Ly

I played around a bit with the parameters A, meshSize, patchSize, and I came to the conclusion that:

Some plausible parameter of the wave amplitude is A * (patchSize / meshSize), where A is nothing but a scaling factor.

For peace of mind, patchSize / meshSize <= 0.5 .
Behind the tsunami patchSize / meshSize >= 3.0 .

Phillips spectrum vectorization complexity:

I have 2 functions:

 % non-vectorized spectrum function P = phillips1(kx, ky, windDir, windSpeed, A, g) k_sq = kx^2 + ky^2; if k_sq == 0 P = 0; else L = windSpeed^2 / g; k = [kx, ky] / sqrt(k_sq); wk = k(1) * windDir(1) + k(2) * windDir(2); P = A / k_sq^2 * exp(-1.0 / (k_sq * L^2)) * wk^2; if wk < 0 P = 0; end end end % vectorized spectrum function P = phillips2(Kx, Ky, windDir, windSpeed, A, g) K_sq = Kx .^ 2 + Ky .^ 2; L = -g^2 / windSpeed^4; WK = (Kx ./ K_sq) .* windDir(1) + (Ky ./ K_sq) .* windDir(2); P = (A ./ (K_sq .^ 2)) .* ( exp(L ./ K_sq) .* (WK .^ 2) ); P(K_sq == 0) = 0; P(WK < 0) = 0; P(isinf(P)) = 0; end

After calculating P1 using phillips1 and P2 using phillips2 I will draw their difference:

 subplot(2,1,1); surf(X,Y,real(P2-P1)); title('Difference in real part'); subplot(2,1,2); surf(X,Y,imag(P2-P1)); title('Difference in imaginary part');

This perfectly illustrates that there is a huge difference between the two spectra in the real part.

0

Eugene kolesnikov Feb 04 '15 at 0:20

source share

Hoki · Accepted Answer · 2015-02-03T00:19:31+0000

Well, that was a fun exercise. This is a completely rewritten answer, since you have found the problems that you asked yourself.

Instead of deleting my answer, there is still a merit in publishing to help you vectorize and / or explain a few bits of code.

I completely rewrote the GUI that I gave in my previous answer to include your changes and add some options. It began to grow with arms and legs, so I won’t list here, but you can find the full file there:

ocean_simulator.m .

This is completely self-sufficient and includes all the computing functions that I vecturized and listed separately below.

The graphical interface allows you to play with parameters, animate waves, export a GIF file (and several other options, such as a "preset", but they are not smooth yet). A few examples of what you can achieve:

The main

This is what you get with quick default settings and several rendering options. This uses a small grid size and a quick time step, so it works pretty fast on any machine.

I am pretty limited at home (Pentium E2200 32bit), so I could only deal with limited settings. Gui will work even with maxed settings, but it will become slow to really enjoy.

However, when you quickly run ocean_simulator at work (I7 64 bit, 8 cores, 16 GB RAM, 2xSSD in the raid), this makes it a lot more fun! Here are some examples:

_{Despite what was done on a much better machine, I did not use any parallel functions and GPU calculations, so Matlab used only part of these specifications, which means that it could work just as well on any 64-bit system with worthy of ram}

Windy lake

It is a fairly flat water surface, like a lake. Even strong winds do not create waves with high amplitude (but still a lot of mini-bursts). If you are the owner of the wind looking at it from a window on top of a hill, your heart is about to miss a beat, and your next step is to call Dave " Man!" Meet you at five on the water! "

swell

You are watching this from the bridge of your boat in the morning after you have been fighting a storm all night. The storm dissipated, and long waves became the last witness of what was definitely a shaky night (people with sailing experience will find out ...).

T storm

And that was what you were before the night before ... _{second gif made at home, hence the lack of details ... sorry}

Down below:

Finally, gui will let you add a patch around the water domain. In gui, this is transparent so you can add objects under water or a beautiful ocean floor. Unfortunately, the GIF format cannot include an alpha channel, so there is no transparency (but if you export to video, then you should be fine).

In addition, exporting to GIF degrades the image, the connection between the domain border and the surface of the water is perfect if you run it in Matlab. In some cases, it also causes Matlab to degrade the rendering of lighting, so this is definitely not the best option for export, but it allows more things to play in Matlab.

Now for the code:

Instead of listing a complete GUI that would be very long (this post is already long enough), I just listed here the rewritten version of your code and explain the changes.

You should notice a significant increase in speed (an order of magnitude), due to the remaining vectorization, but mainly for two reasons:
(i) Numerous calculations have been repeated. Caching values and reusing them is much faster than recalculating full matrices in loops (during the animation part).
(ii) Notice how I defined the surface graphics. It is defined only once (empty even), then all further calls (in a loop) update the base ZData surface object (instead of re-creating the surface object at each iteration).

Here:

 %% // clear workspace clear all; close all; clc; %% // Default parameters param.meshsize = 128 ; %// main grid size param.patchsize = 200 ; param.windSpeed = 100 ; %// what unit ? [m/s] ?? param.winddir = 90 ; %// Azimuth param.rng = 13 ; %// setting seed for random numbers param.A = 1e-7 ; %// Scaling factor param.g = 9.81 ; %// gravitational constant param.xLim = [-10 10] ; %// domain limits X param.yLim = [-10 10] ; %// domain limits Y param.zLim = [-1e-4 1e-4]*2 ; gridSize = param.meshsize * [1 1] ; %% // Define the grid XY domain x = linspace( param.xLim(1) , param.xLim(2) , param.meshsize ) ; y = linspace( param.yLim(1) , param.yLim(2) , param.meshsize ) ; [X,Y] = meshgrid(x, y); %% // get the grid parameters which remain constants (not time dependent) [H0, W, Grid_Sign] = initialize_wave( param ) ; %% // calculate wave at t0 t0 = 0 ; Z = calc_wave( H0 , W , t0 , Grid_Sign ) ; %% // populate the display panel h.fig = figure('Color','w') ; h.ax = handle(axes) ; %// create an empty axes that fills the figure h.surf = handle( surf( NaN(2) ) ) ; %// create an empty "surface" object %% // Display the initial wave surface set( h.surf , 'XData',X , 'YData',Y , 'ZData',Z ) set( h.ax , 'XLim',param.xLim , 'YLim',param.yLim , 'ZLim',param.zLim ) %% // Change some rendering options axis off %// make the axis grid and border invisible shading interp %// improve shading (remove "faceted" effect) blue = linspace(0.4, 1.0, 25).' ; cmap = [blue*0, blue*0, blue]; %'// create blue colormap colormap(cmap) %// configure lighting h.light_handle = lightangle(-45,30) ; %// add a light source set(h.surf,'FaceLighting','phong','AmbientStrength',.3,'DiffuseStrength',.8,'SpecularStrength',.9,'SpecularExponent',25,'BackFaceLighting','unlit') %% // Animate view(75,55) %// no need to reset the view inside the loop ;) timeStep = 1./25 ; nSteps = 2000 ; for time = (1:nSteps)*timeStep %// update wave surface Z = calc_wave( H0,W,time,Grid_Sign ) ; h.surf.ZData = Z ; pause(0.001); end %% // This block of code is only if you want to generate a GIF file %// be carefull on how many frames you put there, the size of the GIF can %// quickly grow out of proportion ;) nFrame = 55 ; gifFileName = 'MyDancingWaves.gif' ; view(-70,40) clear im f = getframe; [im,map] = rgb2ind(f.cdata,256,'nodither'); im(1,1,1,20) = 0; iframe = 0 ; for time = (1:nFrame)*.5 %// update wave surface Z = calc_wave( H0,W,time,Grid_Sign ) ; h.surf.ZData = Z ; pause(0.001); f = getframe; iframe= iframe+1 ; im(:,:,1,iframe) = rgb2ind(f.cdata,map,'nodither'); end imwrite(im,map,gifFileName,'DelayTime',0,'LoopCount',inf) disp([num2str(nFrame) ' frames written in file: ' gifFileName])

You will notice that I have changed a few things, but I can assure that the calculations are exactly the same. This code calls several subfunctions, but all of them are vectorized, so if you want, you can just copy / paste them here and run all the built-in ones.

The first function is called initialize_wave.m Everything calculated here will be constant later (it does not change with time when you revive the waves later), so it makes sense to put this in your block.

 function [H0, W, Grid_Sign] = initialize_wave( param ) % function [H0, W, Grid_Sign] = initialize_wave( param ) % % This function return the wave height coefficients H0 and W for the % parameters given in input. These coefficients are constants for a given % set of input parameters. % Third output parameter is optional (easy to recalculate anyway) rng(param.rng); %// setting seed for random numbers gridSize = param.meshsize * [1 1] ; meshLim = pi * param.meshsize / param.patchsize ; N = linspace(-meshLim , meshLim , param.meshsize ) ; M = linspace(-meshLim , meshLim , param.meshsize ) ; [Kx,Ky] = meshgrid(N,M) ; K = sqrt(Kx.^2 + Ky.^2); %// ||K|| W = sqrt(K .* param.g); %// deep water frequencies (empirical parameter) [windx , windy] = pol2cart( deg2rad(param.winddir) , 1) ; P = phillips(Kx, Ky, [windx , windy], param.windSpeed, param.A, param.g) ; H0 = 1/sqrt(2) .* (randn(gridSize) + 1i .* randn(gridSize)) .* sqrt(P); % height field at time t = 0 if nargout == 3 Grid_Sign = signGrid( param.meshsize ) ; end

Note that the initial parameter winDir now expressed using a single scalar value representing the "azimuth" (in degrees) of the wind (from 0 to 360). Later it is translated into components X and Y thanks to the pol2cart function.

 [windx , windy] = pol2cart( deg2rad(param.winddir) , 1) ;

This ensures that the norm is always 1 .

The function calls your problem phillips.m separately, but, as said earlier, it even works fully vectorized, so you can copy it back to the line if you want. (don’t worry, I checked the results against your versions => exactly the same). Please note that this function does not output complex numbers, so there is no need to compare the imaginary parts.

 function P = phillips(Kx, Ky, windDir, windSpeed, A, g) %// The function now accept scalar, vector or full 2D grid matrix as input K_sq = Kx.^2 + Ky.^2; L = windSpeed.^2 ./ g; k_norm = sqrt(K_sq) ; WK = Kx./k_norm * windDir(1) + Ky./k_norm * windDir(2); P = A ./ K_sq.^2 .* exp(-1.0 ./ (K_sq * L^2)) .* WK.^2 ; P( K_sq==0 | WK<0 ) = 0 ; end

The next function called by the main program is calc_wave.m . This function completes the calculation of the wave field for a given time. It is definitely worth it in itself, because it is a set of mimimun calculations that will need to be repeated for every given point in time when you want to animate the waves.

 function Z = calc_wave( H0,W,time,Grid_Sign ) % Z = calc_wave( H0,W,time,Grid_Sign ) % % This function calculate the wave height based on the wave coefficients H0 % and W, for a given "time". Default time=0 if not supplied. % Fourth output parameter is optional (easy to recalculate anyway) % recalculate the grid sign if not supplied in input if nargin < 4 Grid_Sign = signGrid( param.meshsize ) ; end % Assign time=0 if not specified in input if nargin < 3 ; time = 0 ; end wt = exp(1i .* W .* time ) ; Ht = H0 .* wt + conj(rot90(H0,2)) .* conj(wt) ; Z = real( ifft2(Ht) .* Grid_Sign ) ; end

The last 3 lines of calculations require a bit of explanation, because they got the biggest changes (all for the same result, but with much greater speed).

Source line:

 Ht = H0 .* exp(1i .* W .* (t * timeStep)) + conj(flip(flip(H0,1),2)) .* exp(-1i .* W .* (t * timeStep));

recount the same thing too many times to be effective:

(t * timeStep) computed twice per line in each cycle, while it is easy to get the correct time value for each line when time initialized at the beginning of the cycle for time = (1:nSteps)*timeStep .
Also note that exp(-1i .* W .* time) same as conj(exp(1i .* W .* time)) . Instead of doing 2 * m * n multiplications to calculate each of them, it is faster to calculate once, and then use the conj() operation, which is much faster. Thus, your only line will become:

 wt = exp(1i .* W .* time ) ; Ht = H0 .* wt + conj(flip(flip(H0,1),2)) .* conj(wt) ;

The last minor touch, flip(flip(H0,1),2)) can be replaced with rot90(H0,2) (also slightly faster).

Please note that since the calc_wave function will be repeated many times, it is definitely worth reducing the number of calculations (as we did above), but also sending it the Grid_Sign parameter (instead of letting the function recalculate it for each iteration). That's why:

Your cryptic signCor(ifft2(Ht),meshSize)) function signCor(ifft2(Ht),meshSize)) , just change the sign of any other Ht element. There is a faster way to achieve this: just multiply Ht by a matrix of the same size ( Grid_Sign ), which is a matrix of alternating +1 -1 ... etc.

therefore signCor(ifft2(Ht),meshSize) becomes ifft2(Ht) .* Grid_Sign .

Since Grid_Sign depends only on the size of the matrix, it does not change for each time in the loop, you only calculate it once (before the loop), and then use it the same way as for every other iteration. It is calculated as follows (vectorized, so you can also add it to your code):

 function sgn = signGrid(n) % return a matrix the size of n with alternate sign for every indice % ex: sgn = signGrid(3) ; % sgn = % -1 1 -1 % 1 -1 1 % -1 1 -1 [x,y] = meshgrid(1:n,1:n) ; sgn = ones( n ) ; sgn(mod(x+y,2)==0) = -1 ; end

Finally, you will notice the difference in how the [Kx,Ky] grids are defined between your version and this. They give a slightly different result, it's just a matter of choice.
To explain with a simple example, consider a small meshsize=5 . Your way of doing things will divide it into 5 values evenly spaced like this:

 Kx(first line)=[-1.5 -0.5 0.5 1.5 2.5] * 2 * pi / patchSize

while my way of creating a grid will produce equally spaced values, but also focus on domain boundaries, for example:

 Kx(first line)=[-2.50 -1.25 0.0 1.25 2.50] * 2 * pi / patchSize

It seems that your comment % = 2*pi*n / Lx, -N/2 <= n < N/2 matches your line more.

I prefer symmetrical solutions (plus it is also a little faster, but it is only calculated once, so it is not very important), so I used my vector method, but this is purely a matter of choice, you can definitely keep your way, it only when- either slightly “biases” the entire matrix of results, but it does not perturb the calculations as such.

_{last remnants of the first answer}
Side program notes: I found that you came from a C / C ++ world or family. In Matlab you do not need to define a decimal with a coma (for example, 2.0 , you used this for most of your numbers). Unless otherwise specified, Matlab by default passes any number to double , which is a 64-bit floating point type. Therefore, writing 2 * pi enough to get maximum accuracy (Matlab will not distinguish pi as a whole ;-)), you do not need to write 2.0 * pi . Although it will still work if you do not want to change your habits.

Also (one of the great advantages of Matlab) is adding . before the operator usually means an operation "over the elements". You can add ( .+ ), Substract ( .- ), multiply ( .* ), Divide ( ./ ) the full matrix element in this way. This is how I got rid of all the loops in your code. This also works for the power operator: A.^2 will return a matrix of the same size as A , with each square of the element.

Matlab / CUDA: Ocean Wave Modeling

The main

Windy lake

swell

T storm

Down below:

Now for the code:

More articles: