Why are the results of mad(x) different from what I expect?

I am trying to calculate the mean absolute deviation of a sample ("s") of numbers. The result I get from the mad() function differs from the result I get when I calculate the mean of the absolute deviations one step at a time. Why?

s<- c(100,110,114,121,130,130,160) 

Using the function mad(), I get:

 > mad(s)
 [1] 13.3434

When I break the formula down and perform one operation per step, I get:

 > sum(abs(s-mean(s)))/length(s)
 [1] 14.08163

Why are these results different?

Am I making a mistake when entering my formula? (That would not be surprising; I'm just starting to learn R.) What is wrong with my formula?

Or is the formula that R uses to calculate the mean absolute deviation different from the following (given on Wikipedia)?

MAD = (sum of the absolute values of (each value minus the mean of the sample)) divided by (the number of values in the sample)

(Thanks for the help!)

3 answers

"MAD" is, unfortunately, a term with several meanings: mean absolute deviation from the mean (sometimes simply called MD or mean deviation), median absolute deviation from the median, mean absolute deviation from the median (which arises when estimating the scale of a Laplace distribution), and so on. Wikipedia, although often useful, is not the arbiter of usage; it can sometimes be a little idiosyncratic in its use of terms (which is not particularly a criticism of Wikipedia; it is partly inherent in the nature of things). [Personally, in the absence of further clues, I would generally interpret MAD as the median absolute deviation from the median, and I would expect the mean absolute deviation from the mean, if not written out in full, to be written as "mean deviation" / "MD" or as "mean absolute deviation".]

The question of which one R computes is settled by the simple expedient of typing ?mad :

  mad {stats}                                        R Documentation
  Median Absolute Deviation
  Description
  Compute the median absolute deviation, i.e., the (lo-/hi-) median of the absolute deviations from the median, and (by default) adjust by a factor for asymptotically normal consistency.

As a general suggestion, when you use a function for the first time, do not assume that you know what it does. For example, before I first read the help for mad, I would not have expected it to multiply by this constant by default. (I think that is a bad idea, since by default it does not actually compute anything called MAD; instead it computes a robust estimate of σ for a population whose uncontaminated part is Gaussian. But that is how it works.)
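To make the effect of the default scaling concrete, here is a minimal sketch that reproduces mad(s) by hand. It assumes the defaults documented in ?mad (center = median(x) and constant = 1.4826); the variable names are only illustrative.

```r
# Sample from the question
s <- c(100, 110, 114, 121, 130, 130, 160)

# Unscaled median absolute deviation from the median
raw_mad <- median(abs(s - median(s)))   # 9

# Default scaling constant documented in ?mad (assumed here)
scale_constant <- 1.4826

# Reproduces the value returned by mad(s)
manual_mad <- scale_constant * raw_mad  # 13.3434

print(manual_mad)
print(mad(s))
```

Both printed values should agree, which shows that the 13.3434 in the question is the median absolute deviation (9) multiplied by the consistency constant, not a mean of absolute deviations.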

Most functions will do what you think they do, but some may surprise you. Check the definitions in the help system, see how inputs and outputs are determined, and try examples.

By the way, if you want the median (absolute) deviation from the mean, you can get it with mad(x, mean(x), 1) . But if you want the mean deviation from the mean, I don't know of anything simpler than mean(abs(x-mean(x))) ; it at least has the advantage of being completely explicit.
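As a rough illustration of the two variants just mentioned, the sketch below applies both to the sample from the question. The variable names are hypothetical; the calls use only the center and constant arguments of mad().

```r
# Sample from the question
s <- c(100, 110, 114, 121, 130, 130, 160)

# Median absolute deviation from the mean, with no scaling (constant = 1)
median_dev_from_mean <- mad(s, center = mean(s), constant = 1)

# Mean absolute deviation from the mean, written out explicitly
mean_dev_from_mean <- mean(abs(s - mean(s)))

print(median_dev_from_mean)  # 9.571429
print(mean_dev_from_mean)    # 14.08163
```

The second value matches the step-by-step result in the question, so the formula there was entered correctly; it simply computes a different quantity from the one mad() returns by default.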


As @Glen_b pointed out, mad does more than apply a simple formula: it uses the median rather than the mean, and it also applies a "correction" factor for consistency with the normal distribution.

Check out the examples:

 # with mad
 mad(s)
 mad(s, center = mean(s))
 # using formulas
 sum(abs(s-median(s)))/length(s)
 sum(abs(s-mean(s)))/length(s)

 > mad(s)
 [1] 13.3434
 > mad(s, center = mean(s))
 [1] 14.1906
 >
 > sum(abs(s-median(s)))/length(s)
 [1] 13.71429
 > sum(abs(s-mean(s)))/length(s)
 [1] 14.08163

Alternatively, if you are trying to calculate the median absolute deviation from the median without the scaling constant, enter

 mad(s,constant=1) 