Benford's Law Calculator

Analyze digit distribution and test conformity with Benford's Law

Input Data

Enter Your Data Numbers

Enter positive numbers. The calculator extracts the first significant (nonzero) digit from each number, so 0.0042 counts as a 4 and 1,250 counts as a 1.
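A minimal sketch of how the first-digit extraction could work. This is an illustration only, not the calculator's actual code; the function name `first_digit` is hypothetical, and it assumes each input is a number or a plain numeric string without thousands separators.

```python
from decimal import Decimal

def first_digit(value):
    """Return the leading nonzero digit (1-9) of a positive number."""
    d = Decimal(str(value))
    if not d.is_finite() or d <= 0:
        raise ValueError("Benford analysis needs positive, finite numbers")
    # Scientific notation always starts with the leading nonzero digit,
    # e.g. Decimal('0.0042') formats as '4.2e-3'.
    return int(f"{d:e}"[0])

for sample in (1234.5, "0.0042", 907):
    print(sample, "->", first_digit(sample))
```

Going through `Decimal` rather than string slicing means leading zeros and very small values are handled the same way as ordinary integers.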

Benford's Law

Formula

P(d) = log₁₀(1 + 1/d)

Probability that digit d appears as the first digit

Expected Distribution

Digit 1: 30.10%

Digit 2: 17.61%

Digit 3: 12.49%

Digit 4: 9.69%

Digit 5: 7.92%

Digit 6: 6.69%

Digit 7: 5.80%

Digit 8: 5.12%

Digit 9: 4.58%
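The percentages above can be reproduced directly from the formula. The snippet below is a small sketch; `benford_expected` is a hypothetical helper name, not part of the calculator.

```python
import math

def benford_expected():
    """Map each first digit d (1-9) to its Benford probability log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

for d, p in benford_expected().items():
    print(f"Digit {d}: {p:.2%}")
```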

Applications

🔍 Fraud detection in financial data

📊 Data quality assessment

🏛️ Election data analysis

🧮 Scientific data validation

💰 Accounting data verification

Understanding Benford's Law

What is Benford's Law?

Benford's Law, also known as the first-digit law, states that in many naturally occurring collections of numbers, the leading digit d occurs with probability P(d) = log₁₀(1 + 1/d). This means that the digit 1 appears as the first digit about 30% of the time, while 9 appears only about 4.6% of the time.

When Does It Apply?

✓ Data spanning several orders of magnitude
✓ Natural, unmanipulated datasets
✓ Financial and accounting data
✓ Population data, scientific measurements

Mathematical Foundation

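The law follows from the assumption that the logarithms of the numbers are spread uniformly, or more precisely that the fractional parts of their base-10 logarithms are uniformly distributed on [0, 1). A number has leading digit d exactly when that fractional part lies in [log₁₀(d), log₁₀(d+1)), an interval of width log₁₀(1 + 1/d). The interval [log₁₀(1), log₁₀(2)] has a width of about 0.301, while [log₁₀(9), log₁₀(10)] has a width of about 0.046, which explains why smaller digits appear more frequently.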
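The log-uniform argument can be checked with a short simulation. This is a sketch under stated assumptions: the sample size, the five-decade range, and the fixed seed are arbitrary illustrative choices, and `simulate_leading_digits` is a hypothetical name.

```python
import math
import random
from collections import Counter

def simulate_leading_digits(n=100_000, decades=5, seed=0):
    """Draw x = 10**u with u uniform on [0, decades) and tally leading digits."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n):
        u = rng.uniform(0, decades)
        mantissa = 10 ** (u % 1)  # lies in [1, 10); same leading digit as 10**u
        counts[min(int(mantissa), 9)] += 1  # min() guards against rounding up to 10
    return {d: counts[d] / n for d in range(1, 10)}

freqs = simulate_leading_digits()
for d in range(1, 10):
    print(f"Digit {d}: simulated {freqs[d]:.3f}, Benford {math.log10(1 + 1 / d):.3f}")
```

The simulated frequencies should land close to the Benford values, with small differences due to sampling noise.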

Probability Formula

P(d) = log₁₀(d+1) - log₁₀(d)

P(d) = log₁₀((d+1)/d)

P(d) = log₁₀(1 + 1/d)
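As a quick consistency check, the nine probabilities sum to 1, because the difference form of the formula telescopes:

```latex
\sum_{d=1}^{9} P(d)
  = \sum_{d=1}^{9} \bigl[\log_{10}(d+1) - \log_{10}(d)\bigr]
  = \log_{10}(10) - \log_{10}(1)
  = 1
```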

Chi-Square Test

The chi-square goodness-of-fit test compares the observed first-digit counts in your data with the counts expected under Benford's Law. With nine digit categories, the test has 8 degrees of freedom. A low p-value (< 0.05) suggests that your data deviate significantly from Benford's Law.
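A minimal sketch of the test statistic, using only the standard library. It is not the calculator's own implementation. The function name is hypothetical, and instead of computing a p-value it compares the statistic with 15.507, the standard chi-square critical value for 8 degrees of freedom at the 0.05 level.

```python
import math
from collections import Counter

CRITICAL_95_DF8 = 15.507  # chi-square critical value, df = 8, alpha = 0.05

def chi_square_benford(first_digits):
    """Chi-square statistic of observed first digits against Benford's Law."""
    n = len(first_digits)
    observed = Counter(first_digits)
    stat = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)  # expected count for digit d
        stat += (observed[d] - expected) ** 2 / expected
    return stat

# Illustrative input: every digit equally common, which Benford's Law does not predict.
uniform_digits = [d for d in range(1, 10) for _ in range(100)]
stat = chi_square_benford(uniform_digits)
print(f"chi-square = {stat:.1f}, reject at 5% level: {stat > CRITICAL_95_DF8}")
```

Exceeding the critical value is equivalent to a p-value below 0.05. If SciPy is available, `scipy.stats.chisquare` can return the p-value directly.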

Real-World Examples

Data That Follow Benford's Law:

• Stock prices and market data
• Population figures
• Physical constants
• Fibonacci sequence
• Tax return data
• Scientific measurements
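The Fibonacci entry in the list above is easy to check by hand. The sketch below tallies the leading digits of the first 1,000 Fibonacci numbers; the count of 1,000 and the helper name are illustrative assumptions.

```python
import math
from collections import Counter

def fibonacci_leading_digit_freqs(count=1000):
    """Relative frequency of each leading digit among the first `count` Fibonacci numbers."""
    counts = Counter()
    a, b = 1, 1
    for _ in range(count):
        counts[int(str(a)[0])] += 1  # leading digit of the current Fibonacci number
        a, b = b, a + b
    return {d: counts[d] / count for d in range(1, 10)}

freqs = fibonacci_leading_digit_freqs()
for d in range(1, 10):
    print(f"Digit {d}: observed {freqs[d]:.3f}, Benford {math.log10(1 + 1 / d):.3f}")
```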

Data That Don't Follow:

• Telephone numbers
• Human heights and weights
• Lottery numbers
• Data with imposed minimums or maximums
• Small datasets (too few values to test reliably)
• Artificially generated data