Benford's Law in Java - how to make a math function in Java

I have a quick question. I am trying to build a fraud detection application in Java, based mainly on Benford's law. Benford's law is very cool: it can be interpreted to say that in real financial transactions the first digit is usually 1, 2 or 3 and only rarely 8 or 9. I have not been able to translate the Benford formula into code that runs in Java.

http://www.mathpages.com/home/kmath302/kmath302.htm This link provides more information on what Benford's law is and how it can be used.

I know that I will have to use the Java Math class to get at the natural log function, but I'm not sure how to do this. Any help would be greatly appreciated.

Thank you so much!

+5
3 answers

@Rui mentioned how to calculate the probability distribution function, but that will not help you here.

What you want to use is either the Kolmogorov-Smirnov test or the Chi-squared test. Both compare data against a known probability distribution to determine whether the dataset is likely or unlikely to have that distribution.

Chi-squared is for discrete distributions, and K-S is for continuous distributions.


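To use chi-squared with Benford's law, create a histogram H[N], e.g. with 9 bins N = 1,2,... 9, then iterate over the dataset and check the first digit of each value to count the samples for each of the 9 non-zero digits (or use the first two digits, with 90 bins). Then run the chi-squared test to compare the histogram with the expected counts E[N].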
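The histogram step could be sketched roughly as follows. The class and method names (FirstDigitHistogram, firstDigit, histogram) and the sample values are made up for illustration; reading the digit from the decimal string is just one way to avoid floating-point scaling surprises, not something the answer prescribes.

```java
import java.util.Arrays;

class FirstDigitHistogram {
    /** Returns the leading non-zero decimal digit (1-9) of a non-zero, finite value. */
    static int firstDigit(double value) {
        // Double.toString gives e.g. "28.3", "0.61" or "1.0E10"; for any
        // non-zero finite value the first character in '1'..'9' is the leading digit.
        String text = Double.toString(Math.abs(value));
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c >= '1' && c <= '9') {
                return c - '0';
            }
        }
        // Reached for 0, NaN and infinities, which have no leading digit.
        throw new IllegalArgumentException("no leading digit for " + value);
    }

    /** Counts leading digits: counts[0] is digit 1, ..., counts[8] is digit 9. */
    static long[] histogram(double[] data) {
        long[] counts = new long[9];
        for (double value : data) {
            counts[firstDigit(value) - 1]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        double[] data = {3.02, 1.99, 28.3, 47, 0.61}; // illustrative values only
        System.out.println(Arrays.toString(histogram(data)));
    }
}
```

The result is a long[] so that it can be handed directly to a chi-squared routine that takes observed counts as long[].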

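For example, say you have 100 pieces of data. E[N] can be computed from Benford's law (log here is the base-10 logarithm):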

E[1] = 30.1030 (=100*log(1+1))
E[2] = 17.6091 (=100*log(1+1/2))
E[3] = 12.4939 (=100*log(1+1/3))
E[4] =  9.6910
E[5] =  7.9181
E[6] =  6.6946
E[7] =  5.7992
E[8] =  5.1152
E[9] =  4.5757
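As a sanity check, the table above can be reproduced with a few lines of Java. This is only a sketch; the class name BenfordExpected and the method expectedCounts are invented for the example.

```java
class BenfordExpected {
    /** Expected count of each leading digit 1..9 for a dataset of the given size. */
    static double[] expectedCounts(int sampleSize) {
        double[] expected = new double[9];
        for (int d = 1; d <= 9; d++) {
            // Benford's law: P(first digit = d) = log10(1 + 1/d)
            expected[d - 1] = sampleSize * Math.log10(1.0 + 1.0 / d);
        }
        return expected;
    }

    public static void main(String[] args) {
        double[] e = expectedCounts(100);
        for (int d = 1; d <= 9; d++) {
            System.out.printf("E[%d] = %7.4f%n", d, e[d - 1]);
        }
    }
}
```

The nine probabilities sum to 1, so the expected counts always add up to the sample size.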

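Then compute Χ² = sum((H[k] - E[k])^2 / E[k]) and compare it to a threshold as specified by the test. (Here we have a fixed distribution with no fitted parameters, so the number of parameters s = 0 and p = s + 1 = 1; the number of bins n is 9, so the number of degrees of freedom = n - p = 8*.) Then look the value up in a chi-squared table. For 8 degrees of freedom the significance levels look like this: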

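Χ² > 13.362: less than a 10% chance of a value this large if the dataset follows Benford's law

Χ² > 15.507: less than a 5% chance of a value this large if the dataset follows Benford's law

Χ² > 17.535: less than a 2.5% chance of a value this large if the dataset follows Benford's law

Χ² > 20.090: less than a 1% chance of a value this large if the dataset follows Benford's law

Χ² > 26.125: less than a 0.1% chance of a value this large if the dataset follows Benford's law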

Suppose your histogram yields H = [29,17,12,10,8,7,6,5,6], for a Χ² = 0.5585. That is very close to the expected distribution. (Maybe even too close!)

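Now suppose your histogram yields H = [27,16,10,9,5,11,6,5,11], for a Χ² = 13.89. That exceeds the 10% threshold, so there is less than a 10% chance that a dataset following Benford's law would produce a histogram this far off. I would call the dataset questionable, but not overly so.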
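The two worked examples can be checked with a hand-rolled version of the statistic. This is a minimal sketch that assumes 9 first-digit bins; the class name BenfordChiSquare and the method chiSquare are invented here, and it is not the Apache Commons Math implementation.

```java
class BenfordChiSquare {
    /**
     * Chi-squared statistic of a 9-bin first-digit histogram against Benford's law.
     * observed[0] is the count for digit 1, ..., observed[8] the count for digit 9.
     */
    static double chiSquare(long[] observed) {
        if (observed.length != 9) {
            throw new IllegalArgumentException("expected 9 bins, got " + observed.length);
        }
        long total = 0;
        for (long count : observed) {
            total += count;
        }
        double chi2 = 0.0;
        for (int d = 1; d <= 9; d++) {
            double expected = total * Math.log10(1.0 + 1.0 / d); // E[d]
            double diff = observed[d - 1] - expected;             // H[d] - E[d]
            chi2 += diff * diff / expected;
        }
        return chi2;
    }

    public static void main(String[] args) {
        long[] close = {29, 17, 12, 10, 8, 7, 6, 5, 6};
        long[] suspicious = {27, 16, 10, 9, 5, 11, 6, 5, 11};
        System.out.println(chiSquare(close));
        System.out.println(chiSquare(suspicious));
        // 13.362 is the 10% threshold for 8 degrees of freedom quoted above.
        System.out.println("questionable at 10%: " + (chiSquare(suspicious) > 13.362));
    }
}
```

The two printed statistics should come out close to the 0.5585 and 13.89 quoted above; small differences in the last digits are expected because the table of E[N] is rounded to four decimals.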

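Note that you have to pick the significance level (e.g. 10%/5%/etc.). If you use 10%, expect roughly 1 out of every 10 datasets that really do follow Benford's law to fail the test, even though they are fine. It is a judgment call.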

It looks like Apache Commons Math has a Java implementation of the chi-squared test:

ChiSquareTestImpl.chiSquare(double[] expected, long[] observed)


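* Note on degrees of freedom = 8: this makes sense; you have 9 numbers, but they are subject to 1 constraint, namely that they must add up to the size of the dataset, so once you know the first 8 numbers of the histogram, you can work out the ninth.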


The Kolmogorov-Smirnov test works on continuous distributions. The method works like this:

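  • Compute the cumulative distribution function (CDF) of your probability distribution.
  • Compute the empirical cumulative distribution function (ECDF), which you get by putting your dataset in sorted order.
  • Find D = (approximately) the maximum vertical distance between the two curves.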

Let's go through these in more detail for Benford's law.

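  • CDF for Benford's law: this is just C = log10(x), where x is in the interval [1,10), i.e. including 1 but excluding 10. You can see this from the generalized form of Benford's law: instead of writing it as log(1 + 1/n), write it as log(n + 1) - log(n). In other words, the probability of each bin is a difference of successive values of log(n), so log(n) must be the CDF.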

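  • ECDF: take your dataset and, for each number, make the sign positive, write it in scientific notation, and set the exponent to 0. (I am not sure what to do with a value that is 0; it does not seem to lend itself to Benford's law analysis.) Then sort the numbers in ascending order. The ECDF at x is the fraction of data points <= x, for any valid x.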

  • Calculate the maximum difference D = max(d[k]), where d[k] = max(CDF(y[k]) - (k-1)/N, k/N - CDF(y[k])), y[k] is the k-th sorted value and N is the number of data points.

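Here is an example: suppose the dataset = [3.02, 1.99, 28.3, 47, 0.61]. Then the ECDF is represented by the sorted array [1.99, 2.83, 3.02, 4.70, 6.10], and D is calculated as follows: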

D = max(
  log10(1.99) - 0/5, 1/5 - log10(1.99),
  log10(2.83) - 1/5, 2/5 - log10(2.83),
  log10(3.02) - 2/5, 3/5 - log10(3.02),
  log10(4.70) - 3/5, 4/5 - log10(4.70),
  log10(6.10) - 4/5, 5/5 - log10(6.10)
)

which = 0.2988 (= log10(1.99) - 0).
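The whole K-S calculation for this example could be sketched like so. The names (BenfordKolmogorovSmirnov, significand, dStatistic) are hypothetical, and the floating-point normalization is a simplification that may be off in the last bits for extreme magnitudes.

```java
import java.util.Arrays;

class BenfordKolmogorovSmirnov {
    /** Maps a non-zero, finite value to its significand in [1, 10). */
    static double significand(double value) {
        double x = Math.abs(value);
        if (x == 0.0 || Double.isNaN(x) || Double.isInfinite(x)) {
            throw new IllegalArgumentException("no significand for " + value);
        }
        double s = x / Math.pow(10.0, Math.floor(Math.log10(x)));
        // Guard against rounding pushing the result just outside [1, 10).
        if (s >= 10.0) s /= 10.0;
        if (s < 1.0) s *= 10.0;
        return s;
    }

    /** D statistic of the data against the Benford CDF log10(x) on [1, 10). */
    static double dStatistic(double[] data) {
        int n = data.length;
        if (n == 0) {
            throw new IllegalArgumentException("empty dataset");
        }
        double[] y = new double[n];
        for (int i = 0; i < n; i++) {
            y[i] = significand(data[i]);
        }
        Arrays.sort(y); // the sorted significands represent the ECDF
        double d = 0.0;
        for (int k = 1; k <= n; k++) {
            double cdf = Math.log10(y[k - 1]);
            // d[k] = max(CDF(y[k]) - (k-1)/N, k/N - CDF(y[k]))
            d = Math.max(d, Math.max(cdf - (k - 1) / (double) n, k / (double) n - cdf));
        }
        return d;
    }

    public static void main(String[] args) {
        double[] data = {3.02, 1.99, 28.3, 47, 0.61};
        System.out.println(dStatistic(data));
    }
}
```

For the five-value dataset above this should reproduce the D of about 0.2988 obtained by hand.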

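Finally, you have to use the D statistic: Apache Commons Math has a KolmogorovSmirnovDistributionImpl.cdf() function that takes a calculated D value as input and returns the probability that D would be less than that value. It is probably easier to work with 1-cdf(D), which is the probability that D would be greater than or equal to the value you calculated: if this is 1% or 0.1%, the data probably does not fit Benford's law, but if it is 25% or 50%, it is probably a good match.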

+5

If I understand correctly, you want the Benford formula in Java syntax?

public static double probability(int i) {   
    return Math.log(1+(1/(double) i))/Math.log(10);
}

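If you want the import to be explicit, add

import java.lang.Math;

after your package declaration. It is optional, because classes in java.lang are imported automatically.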

+4

The general formula for a digit at a given position is, I think, something like this:

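// position is zero-based: 0 = first digit (digit must be 1-9), 1 = second digit, ...
double answer = 0;
for (int i = (int) Math.pow(10, position - 1); i <= Math.pow(10, position) - 1; i++) {
    // sum log(1 + 1/(10*i + digit)) over every possible prefix i
    answer += Math.log(1 + (1 / (i * 10 + (double) digit)));
}

// convert from natural log to log base 10
answer *= (1 / Math.log(10));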
+1