Benford's Law Calculator

Analyze digit distribution and test conformity with Benford's Law

Input Data

Enter positive numbers. The calculator will extract the first digit from each number.
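
The calculator's own extraction code is not shown here. As a minimal sketch of one way to get the first digit, assuming that leading zeros and the decimal point are skipped (so 0.0042 yields 4), something like the following could work. The function name first_digit is hypothetical.

```python
from decimal import Decimal

def first_digit(value):
    """Return the first significant digit (1-9) of a positive number.

    Assumption: leading zeros are ignored, so 0.0042 gives 4.
    """
    d = Decimal(str(value))
    if d <= 0:
        raise ValueError("expected a positive number")
    # Scientific notation puts the first significant digit first, e.g. '4.2e-3'.
    return int(f"{d:e}"[0])

digits = [first_digit(x) for x in (1234, 0.0042, "987.6", 9)]
```

Going through Decimal rather than dividing by a power of ten avoids floating-point rounding at digit boundaries.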

Benford's Law

Formula

P(d) = log₁₀(1 + 1/d)

Probability that digit d (1 through 9) appears as the first digit

Expected Distribution

Digit 1: 30.10%
Digit 2: 17.61%
Digit 3: 12.49%
Digit 4: 9.69%
Digit 5: 7.92%
Digit 6: 6.69%
Digit 7: 5.80%
Digit 8: 5.12%
Digit 9: 4.58%
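
The percentages above come directly from P(d) = log₁₀(1 + 1/d). A short sketch that reproduces them (the function name benford_expected is only illustrative):

```python
import math

def benford_expected():
    """Map each leading digit 1..9 to its Benford probability."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

expected = benford_expected()
for digit, p in expected.items():
    print(f"Digit {digit}: {p:.2%}")
```

The nine probabilities sum to 1, because the logarithms telescope to log₁₀(10) - log₁₀(1).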

Applications

  • Fraud detection in financial data
  • Data quality assessment
  • Election data analysis
  • Scientific data validation
  • Accounting data verification

Understanding Benford's Law

What is Benford's Law?

Benford's Law, also known as the first-digit law, states that in many naturally occurring collections of numbers, the leading digit d occurs with probability P(d) = log₁₀(1 + 1/d). This means that the digit 1 appears as the first digit about 30% of the time, while 9 appears only about 4.6% of the time.

When Does It Apply?

  • Data spanning several orders of magnitude
  • Natural, unmanipulated datasets
  • Financial and accounting data
  • Population data, scientific measurements

Mathematical Foundation

The law follows if the logarithms of the numbers are uniformly distributed, or more precisely, if the fractional part of log₁₀ of each number is uniform on [0, 1). On a logarithmic scale, the interval [log₁₀(1), log₁₀(2)] has width of about 0.301, while [log₁₀(9), log₁₀(10)] has width of about 0.046, which is why smaller leading digits appear more frequently.
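
A quick simulation can illustrate this. The sketch below draws numbers whose base-10 logarithm is uniform and compares the observed leading-digit frequencies with the formula. The sample size, the seed, and the range of six orders of magnitude are arbitrary choices made for the example.

```python
import math
import random
from collections import Counter

random.seed(0)  # fixed seed so the run is repeatable
n = 100_000

# log10(x) is uniform on [0, 6), so x spans six orders of magnitude.
samples = [10 ** random.uniform(0, 6) for _ in range(n)]

# '{:e}' formatting puts the first significant digit first, e.g. '3.141593e+02'.
counts = Counter(int(f"{x:e}"[0]) for x in samples)

for d in range(1, 10):
    observed = counts[d] / n
    predicted = math.log10(1 + 1 / d)
    print(f"Digit {d}: observed {observed:.4f}, Benford {predicted:.4f}")
```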

Probability Formula

P(d) = log₁₀(d+1) - log₁₀(d)

P(d) = log₁₀((d+1)/d)

P(d) = log₁₀(1 + 1/d)

Chi-Square Test

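The chi-square goodness-of-fit test compares your observed first-digit counts with the counts expected under Benford's Law. With nine digit categories, the test has 8 degrees of freedom. A low p-value (< 0.05) suggests that your data deviate significantly from Benford's Law; a higher p-value means only that the test found no significant deviation.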
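
As a sketch of how such a test can be computed (not necessarily how this calculator does it), the function below returns the chi-square statistic and its p-value for nine observed digit counts. It assumes 8 degrees of freedom and uses the closed-form survival function that exists for an even number of degrees of freedom, so no external library is needed. The function name and both sets of counts are made up for illustration.

```python
import math

def benford_chi_square(observed_counts):
    """Chi-square goodness-of-fit test against Benford's Law.

    observed_counts: nine counts for leading digits 1..9.
    Returns (statistic, p_value) with 8 degrees of freedom.
    """
    n = sum(observed_counts)
    stat = 0.0
    for d, obs in zip(range(1, 10), observed_counts):
        exp = n * math.log10(1 + 1 / d)  # expected count for digit d
        stat += (obs - exp) ** 2 / exp
    # For even df = 2m: P(X > x) = exp(-x/2) * sum_{k<m} (x/2)^k / k!, here m = 4.
    half = stat / 2
    p_value = math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(4))
    return stat, p_value

# Hypothetical counts: one set close to Benford, one perfectly uniform.
benford_like = [301, 176, 125, 97, 79, 67, 58, 51, 46]
uniform_like = [111] * 9

stat_b, p_b = benford_chi_square(benford_like)
stat_u, p_u = benford_chi_square(uniform_like)
print(f"Benford-like: chi2 = {stat_b:.3f}, p = {p_b:.4f}")
print(f"Uniform:      chi2 = {stat_u:.3f}, p = {p_u:.3g}")
```

If SciPy is available, scipy.stats.chisquare with f_exp set to the expected counts performs the same test.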

Real-World Examples

Data That Follow Benford's Law:

  • Stock prices and market data
  • Population figures
  • Physical constants
  • Fibonacci sequence
  • Tax return data
  • Scientific measurements
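
One entry in the list above, the Fibonacci sequence, is easy to check directly. The sketch below tallies the leading digits of the first 1000 Fibonacci numbers and prints them next to the Benford probabilities. The choice of 1000 terms is arbitrary.

```python
import math
from collections import Counter

def fibonacci(n):
    """Yield the first n Fibonacci numbers, starting 1, 1, 2, 3, ..."""
    a, b = 1, 1
    for _ in range(n):
        yield a
        a, b = b, a + b

n = 1000
# Python integers are exact, so the leading digit can be read from the string form.
fib_counts = Counter(int(str(f)[0]) for f in fibonacci(n))

for d in range(1, 10):
    print(f"Digit {d}: observed {fib_counts[d] / n:.3f}, "
          f"Benford {math.log10(1 + 1 / d):.3f}")
```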

Data That Don't Follow:

  • Telephone numbers
  • Heights and weights
  • Lottery numbers
  • Data with imposed limits
  • Small datasets
  • Artificially generated data