You're understandably confused. That documentation is terrible. I had to go back to the paper it's based on (Hyndman, R.J.; Fan, Y. (November 1996). “Sample Quantiles in Statistical Packages”. American Statistician 50 (4): 361-365. doi:10.2307/2684934) to understand it. Let's start with the first problem.
where 1 <= i <= 9, (j-m)/n <= p < (j-m+1)/n, x[j] is the jth order statistic, n is the sample size, and m is a constant determined by the sample quantile type. Here gamma depends on the fractional part of g = np+m-j.
The first part comes straight from the paper, but what the documentation writers omitted is that j = int(pn+m), i.e. pn+m rounded down. This means that Q[i](p) depends only on the two order statistics closest to being a fraction p of the way through the (sorted) observations. (For those who, like me, are unfamiliar with the term, the “order statistics” of a series of observations are just the sorted series.)
Also, that last sentence is simply wrong. It should read
Here gamma depends on the fractional part of np+m, g = np+m-j
As for m, that's straightforward. m depends on which of the 9 algorithms was chosen. So, just as Q[i] is the quantile function for algorithm i, m should be thought of as m[i]. For algorithms 1 and 2, m is 0; for 3, m is -1/2; and for the others, it is given in the next part.
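To make the bookkeeping concrete, here is a minimal Python sketch of j and g and of the three discontinuous types. The function names are my own, and the gamma rules for types 1 to 3 are taken from R's ?quantile help rather than from anything quoted above, so treat them as an assumption. It also ignores the floating-point fuzz that R applies when testing g = 0.

```python
import math

# m for the discontinuous types, as stated in the text above.
M = {1: 0.0, 2: 0.0, 3: -0.5}

def locate(p, n, m):
    """Return (j, g) with j = floor(n*p + m) and g = n*p + m - j."""
    h = n * p + m
    j = math.floor(h)
    return j, h - j

def gamma_discontinuous(qtype, j, g):
    # Assumption: rules as listed in R's ?quantile help for types 1-3.
    if g > 0:
        return 1.0
    if qtype == 1:
        return 0.0
    if qtype == 2:
        return 0.5
    # Type 3: gamma is 0 only when g == 0 and j is even.
    return 0.0 if j % 2 == 0 else 1.0

def quantile_discontinuous(xs, p, qtype):
    """Q[i](p) = (1 - gamma) * x[j] + gamma * x[j+1], with 1-based x[]."""
    x = sorted(xs)
    n = len(x)
    j, g = locate(p, n, M[qtype])
    gam = gamma_discontinuous(qtype, j, g)
    # Clamp 1-based indices into [1, n] so the ends of the sample are handled.
    lo = x[min(max(j, 1), n) - 1]
    hi = x[min(max(j + 1, 1), n) - 1]
    return (1 - gam) * lo + gam * hi
```

For example, with n = 10 and p = 0.25, types 1 and 2 have np+m = 2.5, so j = 2 and g = 0.5, while type 3 has np+m = 2.0, so j = 2 and g = 0.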
For the continuous sample quantile types (4 through 9), the sample quantiles can be obtained by linear interpolation between the kth order statistic and p(k):
p(k) = (k - alpha) / (n - alpha - beta + 1), where α and β are constants determined by the type. Further, m = alpha + p(1 - alpha - beta), and gamma = g.
This is really confusing. What the documentation calls p(k) is not the same as the p from before. p(k) is the plotting position. In the paper, the authors write it as p k (p with a subscript k), which helps. It matters especially because, in the expression for m, the p is the original p, so m = alpha + p * (1 - alpha - beta). Conceptually, for algorithms 4–9, the points ( p k , x[k] ) are interpolated to get the solution ( p , Q[i](p) ). The algorithms differ only in how they compute p k .
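The two views of types 4 to 9 (the m and g formula, and interpolation through the plotting positions) can be checked against each other with a short Python sketch. The function names are hypothetical, and the (alpha, beta) pairs per type are taken from R's ?quantile help, not from the text above, so they are an assumption here.

```python
import math

# Assumption: (alpha, beta) per type as listed in R's ?quantile help.
ALPHA_BETA = {
    4: (0.0, 1.0),
    5: (0.5, 0.5),
    6: (0.0, 0.0),
    7: (1.0, 1.0),
    8: (1 / 3, 1 / 3),
    9: (3 / 8, 3 / 8),
}

def quantile_via_m(xs, p, qtype):
    """Use m = alpha + p*(1 - alpha - beta), j = floor(np + m), gamma = g."""
    x = sorted(xs)
    n = len(x)
    alpha, beta = ALPHA_BETA[qtype]
    m = alpha + p * (1 - alpha - beta)
    h = n * p + m
    j = math.floor(h)
    g = h - j
    # Clamp 1-based indices into [1, n] for p near 0 or 1.
    lo = x[min(max(j, 1), n) - 1]
    hi = x[min(max(j + 1, 1), n) - 1]
    return (1 - g) * lo + g * hi

def quantile_via_positions(xs, p, qtype):
    """Interpolate linearly through the points (p_k, x[k])."""
    x = sorted(xs)
    n = len(x)
    alpha, beta = ALPHA_BETA[qtype]
    pk = [(k - alpha) / (n - alpha - beta + 1) for k in range(1, n + 1)]
    if p <= pk[0]:
        return x[0]
    if p >= pk[-1]:
        return x[-1]
    for k in range(n - 1):
        if pk[k] <= p <= pk[k + 1]:
            t = (p - pk[k]) / (pk[k + 1] - pk[k])
            return (1 - t) * x[k] + t * x[k + 1]
```

The two agree because np + m = p(n + 1 - alpha - beta) + alpha is exactly the (fractional) index k at which p k equals p.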
As for the last bit, R is just stating what S uses.
The original paper gives a list of 6 “desirable properties for a sample quantile” function and states a preference for #8, which satisfies all but one of them. #5 satisfies all of them, but the authors don't like it on other grounds (it's more phenomenological than derived from principles). #2 is what non-stat geeks like me would consider the quantiles, and it is what's described on Wikipedia.
By the way, in response to dreeves's answer, Mathematica does things significantly differently. I think I understand the mapping. While Mathematica's approach is easier to understand, (a) it's easier to shoot yourself in the foot with nonsensical parameters, and (b) it can't do R's algorithm #2. (Here's MathWorld's Quantile page, which states that Mathematica can't do #2, but gives a simpler generalization of all the other algorithms in terms of four parameters.)