I am a mechanical engineer by training, working in a research environment, mainly extending an existing numerical code base built on 25+ years of C. I recently decided that I want to learn how to create a serious piece of scientific software from scratch.
I spoke with several researchers in the CS department at my university, and it seems to be accepted that the people most likely to be building large-scale numerical applications are in the mechanical / chemical / biology departments. Equally, most of the people who write these applications have had almost no training in software development principles.
Like most engineers, I learn by doing, so I am going to set myself a task: develop an adaptive mesh scheme that locally refines / coarsens based on the location of an arbitrary moving curve. On this grid, solve the heat equation (or another PDE).
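To make the refinement criterion concrete, here is a minimal sketch of what I have in mind, assuming the curve is approximated by a 2D polyline and each cell is described only by its centre and size. The names `point_segment_distance`, `flag_cells` and the `band` parameter are my own placeholders, not taken from any library, and a real implementation would work on a tree or unstructured mesh rather than a flat list.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b (2D tuples)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: both end points coincide.
        return math.hypot(px - ax, py - ay)
    # Parameter of the orthogonal projection, clamped onto the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def flag_cells(cells, curve, band=2.0):
    """Flag each cell as 'refine' or 'coarsen'.

    cells: list of (cx, cy, h) with cell centre (cx, cy) and size h.
    curve: list of (x, y) polyline vertices, at least two.
    band:  assumed tuning parameter; a cell is refined when its centre
           lies closer to the curve than band * h.
    """
    flags = []
    for cx, cy, h in cells:
        d = min(
            point_segment_distance((cx, cy), curve[i], curve[i + 1])
            for i in range(len(curve) - 1)
        )
        flags.append("refine" if d < band * h else "coarsen")
    return flags


if __name__ == "__main__":
    curve = [(0.0, 0.5), (1.0, 0.5)]
    cells = [(0.5, 0.5, 0.1), (0.5, 0.9, 0.1)]
    print(flag_cells(cells, curve))
```

As the curve moves, the flags would be recomputed every few time steps and handed to whatever routine actually splits or merges cells; that part is deliberately left out here.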
Things I would like to include:
- Parallel (I have a little experience with MPI, so I will probably stick to that) - maybe combined with OpenCL (I have no Nvidia cards, so no CUDA)
- Combination of Python and C++ (scripted driver interface in Python, execution in C++)
- Object-oriented, based on design patterns (one part that I really want to learn)
- Unit testing (I have used gtest and will probably stick to it, but I don't know how to approach unit testing in detail; I have read varying recommendations for unit testing scientific code)
- Linux based - don't care about portability at this stage
- Boost libraries are possible
- Use HDF5 or VTK to save the results (I know VTK, but I feel that HDF5 is better)
- Performance profiled
Some questions I'm trying to answer:
- This looks like a gigantic task, which is fine, but what is the general process for breaking it down? Do you start with the basic infrastructure (MPI wrappers, matrix classes, etc.), with the high-level interaction (main controller, user interface, etc.), or somewhere else?
- Is the Python + C++ paradigm compatible with running MPI on a cluster?
- I have not found books that deal with application design in a scientific context - is that because they do not exist, or am I not looking in the right place?
- I know the mantra of "get it running, then profile" for optimization, but I assume that some of the most fundamental design decisions made at the beginning will affect performance. What basics do you need to know up front for high-level design of numerical code?
NB: I'm not sure whether this question fits the Stack Exchange format - if not, I will happily rephrase.