This repository was archived by the owner on Nov 19, 2020. It is now read-only.

"CDF computation generated NaN values" thrown for large dataset in Mann Whitney Wilcoxon Test #1327

@zhouchgh

What would you like to submit? (put an 'x' inside the bracket that applies)

  • question
  • bug report
  • feature request

Issue description
The MannWhitneyWilcoxonTest class throws a "CDF computation generated NaN values" exception when the sample size reaches 50,000 or more.

Here's the code snippet I'm trying to execute:

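// Assumes: using System.Linq; using Accord.Statistics.Testing; using Troschuetz.Random;
var trandomA = new TRandom();
var trandomB = new TRandom();  // was used below without being declared
var A = trandomA.ExponentialSamples(1.0).Take(50000).ToArray();
var B = trandomB.ExponentialSamples(1.0).Take(50000).ToArray();
// The next line throws "CDF computation generated NaN values"
var mannWhitneyWilcoxonTest = new MannWhitneyWilcoxonTest(A, B);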

Running this throws the "CDF computation generated NaN values" exception.

After looking into the source of MannWhitneyWilcoxonTest and MannWhitneyDistribution, I found several places that perform a calculation like this:
NumberOfSamples1 * (NumberOfSamples1 + 1)

Since NumberOfSamples1 is of type int, I suspect the issue is caused by integer overflow.
My hunch: 50,000 * 50,001 = 2,500,050,000, which exceeds 2,147,483,647 (INT_MAX, i.e. int.MaxValue in C#).
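
To make the suspected overflow concrete, here is a minimal standalone sketch of the arithmetic. It does not use Accord.NET at all: the class name OverflowDemo and both helper methods are hypothetical, and whether the library's NaN actually comes from this expression is unconfirmed. It only shows that n * (n + 1) wraps to a negative number in 32-bit int arithmetic for n = 50,000, and that widening one operand to long avoids it.

```csharp
using System;

public static class OverflowDemo
{
    // Hypothetical helper: the product in plain int arithmetic. `unchecked`
    // makes the silent wrap-around explicit (it is also the C# default
    // unless the project is compiled with overflow checking enabled).
    public static int ProductAsInt(int n)
    {
        return unchecked(n * (n + 1));
    }

    // Hypothetical helper: widen one operand to long first, so the
    // multiplication is carried out in 64 bits.
    public static long ProductAsLong(int n)
    {
        return (long)n * (n + 1);
    }

    public static void Main()
    {
        int n = 50000;
        Console.WriteLine(ProductAsInt(n));   // -1794917296
        Console.WriteLine(ProductAsLong(n));  // 2500050000

        // Illustration only: a negative intermediate value turns into NaN
        // as soon as it reaches something like a square root.
        Console.WriteLine(double.IsNaN(Math.Sqrt(ProductAsInt(n))));  // True
    }
}
```

If this is indeed the cause, casting one operand to long (or double) before multiplying in the affected expressions would be the obvious thing to try.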
