Skewness and kurtosis
For on-line algorithms for skewness and kurtosis (along the same lines as the variance algorithm), see the same wiki page, in the part on parallel algorithms for higher-moment statistics.
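As a rough illustration of what such a one-pass algorithm can look like, here is a minimal sketch of the usual incremental update for the second, third and fourth central moment sums. The class name `OnlineMoments` and its method names are my own choices for this example, and it returns the population (biased) versions of the statistics.

```python
import math


class OnlineMoments:
    """One-pass accumulator for mean, variance, skewness and excess kurtosis.

    Hypothetical helper for illustration; stores only five numbers,
    so it works on streams that do not fit in memory.
    """

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of (x - mean)^2
        self.m3 = 0.0  # sum of (x - mean)^3
        self.m4 = 0.0  # sum of (x - mean)^4

    def push(self, x):
        n1 = self.n
        self.n += 1
        n = self.n
        delta = x - self.mean
        delta_n = delta / n
        delta_n2 = delta_n * delta_n
        term1 = delta * delta_n * n1
        self.mean += delta_n
        # Order matters: m4 uses the old m2 and m3, m3 uses the old m2.
        self.m4 += (term1 * delta_n2 * (n * n - 3 * n + 3)
                    + 6 * delta_n2 * self.m2
                    - 4 * delta_n * self.m3)
        self.m3 += term1 * delta_n * (n - 2) - 3 * delta_n * self.m2
        self.m2 += term1

    def variance(self):
        # Population variance; divide by (n - 1) instead for the sample variance.
        return self.m2 / self.n

    def skewness(self):
        # Undefined (division by zero) if all values are identical.
        return math.sqrt(self.n) * self.m3 / self.m2 ** 1.5

    def excess_kurtosis(self):
        # Undefined (division by zero) if all values are identical.
        return self.n * self.m4 / (self.m2 * self.m2) - 3.0
```

You would call `push(x)` once per value while reading the data and query the statistics at any point.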
Median
The median is tough without sorted data. If you know how many data points you have, in theory you only need to sort partially, for example using a selection algorithm. However, this does not help much with billions of values. I would suggest using frequency counts instead; see the next section.
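For data that does fit in memory, the partial-sort idea could look roughly like the sketch below. It uses QuickSelect with a random pivot and a three-way partition; the function names `quickselect` and `median_by_selection` are made up for this example.

```python
import random


def quickselect(values, k):
    """Return the k-th smallest element (0-based) without fully sorting."""
    items = list(values)  # work on a copy
    if not 0 <= k < len(items):
        raise IndexError("k out of range")
    lo, hi = 0, len(items) - 1
    while True:
        if lo == hi:
            return items[lo]
        pivot = items[random.randint(lo, hi)]
        # Three-way partition of items[lo..hi]: < pivot | == pivot | > pivot
        lt, i, gt = lo, lo, hi
        while i <= gt:
            if items[i] < pivot:
                items[lt], items[i] = items[i], items[lt]
                lt += 1
                i += 1
            elif items[i] > pivot:
                items[i], items[gt] = items[gt], items[i]
                gt -= 1
            else:
                i += 1
        if k < lt:
            hi = lt - 1      # answer lies among the smaller elements
        elif k > gt:
            lo = gt + 1      # answer lies among the larger elements
        else:
            return pivot     # k falls inside the block equal to the pivot


def median_by_selection(values):
    """Median via selection; averages the two middle values for even n."""
    items = list(values)
    n = len(items)
    if n == 0:
        raise ValueError("median of empty data")
    if n % 2 == 1:
        return quickselect(items, n // 2)
    return (quickselect(items, n // 2 - 1) + quickselect(items, n // 2)) / 2
```

The expected running time is linear in the number of values, but everything still has to be held in memory, which is exactly the problem with billions of values.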
Median and mode with frequency counting
If the values are integers, I would count frequencies, possibly cutting off the highest and lowest values beyond some threshold where I am sure they are no longer relevant. For floats (or too many distinct integers) I would probably create buckets / intervals and then use the same approach as for integers. Calculating the (approximate) mode and median then becomes easy, based on the frequency table.
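A minimal sketch of the frequency-table approach, assuming the number of distinct values (or buckets) is small enough to fit in memory. All function names here are hypothetical, and `bucket_index` with a fixed `width` is just one simple way to form intervals for floats.

```python
import math
from collections import Counter


def frequency_table(stream):
    """Count occurrences in one pass; memory grows with distinct values only."""
    counts = Counter()
    for value in stream:
        counts[value] += 1
    return counts


def bucket_index(x, width):
    """Map a float to the index of its interval [i * width, (i + 1) * width)."""
    return math.floor(x / width)


def mode_from_counts(counts):
    """Most frequent value; with ties, whichever was seen first."""
    if not counts:
        raise ValueError("empty frequency table")
    return max(counts, key=counts.get)


def median_from_counts(counts):
    """Median from cumulative counts; averages the two middle values for even n."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("empty frequency table")
    lower_rank = (total - 1) // 2   # 0-based rank of the lower middle value
    upper_rank = total // 2         # 0-based rank of the upper middle value
    seen = 0
    lower_value = None
    for value in sorted(counts):    # sorts distinct values, not the raw data
        seen += counts[value]
        if lower_value is None and seen > lower_rank:
            lower_value = value
        if seen > upper_rank:
            return (lower_value + value) / 2
```

For floats, you could feed `bucket_index(x, width)` into `frequency_table` and translate the resulting bucket back to a value, for example its midpoint `(i + 0.5) * width`; the result is then only accurate to within the bucket width.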
Normally distributed random variables
If the data are normally distributed, I would use the sample mean, variance, skewness, and kurtosis of a small subset as maximum likelihood estimators. You already know the (on-line) algorithms for calculating them. For example, read a few hundred thousand or a million data points, until your estimation error is small enough. Just make sure that you pick randomly from your set (for example, that you do not introduce bias by picking the first 100,000 values). The same approach can also be used to estimate the mode and median in the normal case (the sample mean is an estimator for both).
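One way to pick randomly from a data set that is read sequentially is reservoir sampling (Algorithm R). That is a technique I am swapping in here for illustration, not the only option. The sketch below also reports the standard error of the mean as a simple error estimate; the function names and the `rng` parameter are assumptions made for this example.

```python
import random
import statistics


def reservoir_sample(stream, k, rng=random):
    """Uniform random sample of up to k items from a stream of unknown length."""
    sample = []
    for i, x in enumerate(stream):
        if i < k:
            sample.append(x)          # fill the reservoir first
        else:
            j = rng.randint(0, i)     # inclusive on both ends
            if j < k:
                sample[j] = x         # replace with probability k / (i + 1)
    return sample


def mean_with_standard_error(sample):
    """Sample mean and its estimated standard error (needs at least 2 values)."""
    mean = statistics.fmean(sample)
    std_err = statistics.stdev(sample) / len(sample) ** 0.5
    return mean, std_err
```

You could increase `k` until the reported standard error drops below whatever tolerance you need.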
Additional comments
All of the algorithms described above can be executed in parallel (including many sorting and selection algorithms like QuickSort and QuickSelect), if that helps.
I have assumed throughout (with the exception of the section on the normal distribution) that we are talking about sample moments, median, and mode, not about estimators for theoretical moments given a known distribution.
In general, sampling the data (i.e., looking at only a subset) should work quite well given the amount of data, as long as all observations are realizations of the same random variable (have the same distribution) and the moments, mode, and median actually exist for that distribution. The last caveat is not innocuous. For example, the mean (and all higher moments) of the Cauchy distribution do not exist. In that case, the sample mean of a "small" subset can be far off from the sample mean of the entire data set.
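If you want to see the Cauchy problem for yourself, a small experiment along these lines should do. It draws standard Cauchy values by the inverse-CDF method and compares the mean and median of growing subsets. The function names are made up for this sketch, and the exact numbers depend on the seed.

```python
import math
import random
import statistics


def cauchy_draws(n, rng):
    """n standard Cauchy values via the inverse CDF: tan(pi * (u - 0.5))."""
    return [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]


def compare_mean_and_median(data, sizes):
    """For each size, return (size, mean, median) of the first `size` values."""
    rows = []
    for size in sizes:
        subset = data[:size]
        rows.append((size, statistics.fmean(subset), statistics.median(subset)))
    return rows


if __name__ == "__main__":
    draws = cauchy_draws(100_000, random.Random(42))
    for size, mean, median in compare_mean_and_median(
            draws, (100, 1_000, 10_000, 100_000)):
        print(size, mean, median)
```

The median should settle near 0 as the subset grows, whereas the mean need not settle at all, because a single extreme value can dominate it.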