This discussion applies in any dimension. For your 3D case, we’ll first talk about computational geometry to understand why part of the area gives NaN from griddata .
The scattered points in your volume make up a convex hull; geometric shape with the following properties:
- The surface is always convex (as the name implies)
- The volume of the form is the smallest possible without breaking the bulge
- The surface (in 3d) is triangulated and closed
A less formally convex hull ( which you can easily calculate with scipy ) is like stretching a balloon over a frame, where the frame angles are the outermost points of your scattered cluster.
At a regular place on the grid inside the balloon, you are surrounded by known points. You can interpolate these locations. Beyond this, you must extrapolate.
Extrapolation is complicated. There is no general rule on how to do this ... this is a problem. In this area, algorithms such as griddata return NaN - this is the safest way to inform the scientist that he / she should choose a reasonable way to extrapolate.
Let's look at some ways to do this.
1. [WORST] Botch it
Assign some scalar value outside the enclosure. In numpy docs you will see that this is done with: s = average (b) bCube = griddata ((x, y, z), b, (X, Y, Z), method = 'linear', fill_value = s)
Cons: This creates a sharp gap in the interpolated field at the housing boundary, strongly distorts the average value of the scalar field and does not take into account the functional form of the data.
2. [NEXT WORST] "Mixed Search"
Suppose you apply some value in the corners of your domain. This may be the average of the scalar field associated with your scattered points.
Sorry, this is pseudo code, as I don't use numpy at all, but it will probably be pretty clear.
Cons: This creates a sharp gap in the gradient of the interpolated field at the border of the case, pretty much distorts the average value of the scalar field and does not take into account the functional form of the data.
3. [STILL PRETTY BAD] Nearest neighbor
For each of the regular NaN points, find the nearest non-NaN and assign this value. It is efficient and stable, but rude, because your field can have patterned functions (for example, stripes or rays emitted from the body) that are often visually unattractive or, even worse, unacceptable in terms of data smoothness
Depending on the data density, you can use the nearest scattered data set instead of the nearest non-NaN regular point. This can be done simply (again, with pseudo-code):
bCube = griddata((x, y, z), b, (X, Y, Z), method = 'linear', fill_value=nan) bCubeNearest = griddata((x, y, z), b, (X, Y, Z), method = 'nearest') indicesMask = isNan(bCube)
Using MATLAB delaunay-based approaches will show more powerful methods for achieving similar ones in single-line space, but numpy looks a bit limited here.
4. [NOT ALWAYS TERRIBLE] Naturally Weighted
Sorry for the poor explanation in this section, I never wrote an algorithm, but I'm sure that some research on the natural neighbor method will take you far.
Use the distance weighting function with some parameter D , which may be similar to, or twice (say) the length of your field. You can customize. For each NaN location, determine the distance to each of the scattered points.
You basically want to get a function that smoothly transitions to the average intensity B at a distance D from the body, but matches the body at the border. Far from the border, it is most heavily weighted at the nearest points.
Pros: well stable and reasonably continuous. Due to weighting, it is more resistant to noise at single data points than the nearest neighbor.
5. [HEROIC ROCKSTAR] Assumption of functional form
What do you know about physics? Suppose that a functional form representing what you expect from physics should make the least squares (or some equivalent) of this form match the scattered data. Use the function to stabilize extrapolation.
Some good ideas that can help you build a function: - Do you expect symmetry or periodicity? - Is a component of a vector field that has some property as zero divergence? - Orientation: do you expect all angles to be the same? Or maybe linear variation in one direction? - field b at a certain point in time - perhaps smoothed measurement timers can be used to create a basic function? - Is there already a known shape, such as Gaussian or quadratic?
Some examples: - b represents the intensity of a laser beam passing through a volume. You expect the input side to be nominally identical to the outlet, and the remaining four boundaries of zero intensity. The intensity will have a concentric Gaussian profile.
b is one component of the velocity field in an incompressible fluid. The liquid should be free of discrepancies, so any field created in the NaN zone should also be divergent free, so you apply this condition.
b represents the temperature in the room. You expect the upper temperature to be higher because hot air rises.
b is an elevator on an aerodynamic surface, tested for three independent variables. You can easily lift the elevator to the stall, so you know exactly what will happen in some parts of the space.
Pros / Cons: Get it right and it will be awesome. Wrong, especially with non-linear functional forms, and it will be very bad and can lead to very unstable results.
Security warning you cannot take a functional form, get good results, and then use them to prove the correctness of the functional form. This is just bad science. The form should be well-behaved and well-known, regardless of your data analysis.