Friday, February 4, 2011

How to test randomness (case in point - Shuffling)

First off, this question is ripped out from this question. I did it because I think this part is bigger than a sub-part of a longer question. If it offends, please pardon me.

Assume that you have a algorithm that generates randomness. Now how do you test it? Or to be more direct - Assume you have an algorithm that shuffles a deck of cards, how do you test that it's a perfectly random algorithm?

To add some theory to the problem - A deck of cards can be shuffled in 52! (52 factorial) different ways. Take a deck of cards, shuffle it by hand and write down the order of all cards. What is the probability that you would have gotten exactly that shuffle? Answer: 1 / 52!.

What is the chance that you, after shuffling, will get A, K, Q, J ... of each suit in a sequence? Answer 1 / 52!

So, just shuffling once and looking at the result will give you absolutely no information about your shuffling algorithms randomness. Twice and you have more information, Three even more...

How would you black box test a shuffling algorithm for randomness?

  • Shuffle alot, and then record the outcomes (if im reading this correctly). I remember seeing comparisons of "random number generators". They just test it over and over, then graph the results.

    If it is truly random the graph will be mostly even.

    From Deinumite
  • Have a look at this Coding Horror article, and the comments on it.

    From Ash
  • The only way to test for randomness is to write a program that attempts to build a predictive model for the data being tested, and then use that model to try to predict future data, and then showing that the uncertainty, or entropy, of its predictions tend towards maximum (i.e. the uniform distribution) over time. Of course, you'll always be uncertain whether or not your model has captured all of the necessary context; given a model, it'll always be possible to build a second model that generates non-random data that looks random to the first. But as long as you accept that the orbit of Pluto has an insignificant influence on the results of the shuffling algorithm, then you should be able to satisfy yourself that its results are acceptably random.

    Of course, if you do this, you might as well use your model generatively, to actually create the data you want. And if you do that, then you're back at square one.

    Ed Guiness : What are you sorry about?
    From MegaHAL
  • Statistics. The de facto standard for testing RNGs is the Diehard suite. Alternatively, the Ent program provides tests that are simpler to interpret but less comprehensive.

    As for shuffling algorithms, use a well-known algorithm such as Fisher-Yates (a.k.a "Knuth Shuffle"). The shuffle will be uniformly random so long as the underlying RNG is uniformly random. If you are using Java, this algorithm is available in the standard library (see Collections.shuffle).

    It probably doesn't matter for most applications, but be aware that most RNGs do not provide sufficient degrees of freedom to produce every possible permutation of a 52-card deck (explained here).

    From Dan Dyer
  • I'm not fully following your question. You say

    Assume that you have a algorithm that generates randomness. Now how do you test it?

    What do you mean? If you're assuming you can generate randomness, there's no need to test it.

    Once you have a good random number generator, creating a random permutation is easy (e.g. Call your cards 1-52. Generate 52 random numbers assigning each one to a card in order, and then sort according to your 52 randoms) . You're not going to destroy the randomness of your good RNG by generating your permutation.

    The difficult question is whether you can trust your RNG. Here's a sample link to people discussing that issue in a specific context.

    Tnilsson : Heh. A clarification then. "Assume that you have a algorithm that you believe generates randomness."
    Baltimark : OK. I wasn't trying to be snarky. I don't really know if you're asking "how to test randomness" which can be asked without referring to card shuffling, or if you're asking "how to test if my shuffling alogrithm has screwed up my good RNG."
    From Baltimark
  • First, it is impossible to know for sure if a certain finite output is "truly random" since, as you point out, any output is possible.

    What can be done, is to take a sequence of outputs and check various measurements of this sequence against what is more likely. You can derive a sort of confidence score that the generating algorithm is doing a good job.

    For example, you could check the output of 10 different shuffles. Assign a number 0-51 to each card, and take the average of the card in position 6 across the shuffles. The convergent average is 25.5, so you would be surprised to see a value of 1 here. You could use the central limit theorem to get an estimate of how likely each average is for a given position.

    But we shouldn't stop here! Because this algorithm could be fooled by a system that only alternates between two shuffles that are designed to give the exact average of 25.5 at each position. How can we do better?

    We expect a uniform distribution (equal likelihood for any given card) at each position, across different shuffles. So among the 10 shuffles, we could try to verify that the choices 'look uniform.' This is basically just a reduced version of the original problem. You could check that the standard deviation looks reasonable, that the min is reasonable, and the max value as well. You could also check that other values, such as the closest two cards (by our assigned numbers), also make sense.

    But we also can't just add various measurements like this ad infinitum, since, given enough statistics, any particular shuffle will appear highly unlikely for some reason (e.g. this is one of very few shuffles in which cards X,Y,Z appear in order). So the big question is: which is the right set of measurements to take? Here I have to admit that I don't know the best answer. However, if you have a certain application in mind, you can choose a good set of properties/measurements to test, and work with those -- this seems to be the way cryptographers handle things.

    From Tyler
  • There's a lot of theory on testing randomness. For a very simple test on a card shuffling algorithm you could do a lot of shuffles and then run a chi squared test that the probability of each card turning up in any position was uniform. But that doesn't test that consecutive cards aren't correlated so you would also want to do tests on that.

    Volume 2 of Knuth's Art of Computer Programming gives a number of tests that you could use in sections 3.3.2 (Empirical tests) and 3.3.4 (The Spectral Test) and the theory behind them.

    From Ian G
  • Testing 52! possibilities is of course impossible. Instead, try your shuffle on smaller numbers of cards, like 3, 5, and 10. Then you can test billions of shuffles and use a histogram and the chi-square statistical test to prove that each permutation is coming up an "even" number of times.

  • No code so far, therefore I copy-paste a testing part from my answer to the original question.

      // ...
      int main() {
        typedef std::map<std::pair<size_t, Deck::value_type>, size_t> Map;
        Map freqs;    
        Deck d;
        const size_t ntests = 100000;
    
        // compute frequencies of events: card at position
        for (size_t i = 0; i < ntests; ++i) {
          d.shuffle();
          size_t pos = 0;
          for(Deck::const_iterator j = d.begin(); j != d.end(); ++j, ++pos) 
            ++freqs[std::make_pair(pos, *j)]; 
        }
    
        // if Deck.shuffle() is correct then all frequencies must be similar
        for (Map::const_iterator j = freqs.begin(); j != freqs.end(); ++j)
          std::cout << "pos=" << j->first.first << " card=" << j->first.second 
                    << " freq=" << j->second << std::endl;    
      }
    

    This code does not test randomness of underlying pseudorandom number generator. Testing PRNG randomness is a whole branch of science.

  • Copying my answer from the original topic :

    YXJuLnphcnQ, that's the way I did it too. It's the most obvious.

    But the fact is, that if you write an algorithm, that just shuffles all the cards in the collection one position to the right every time you call sort() it would pass the test, even though the output is not random.

    What does everybody else think ? Maybe I'm wrong.

    Dan Dyer : Most of these simple tests for randomness can be gamed in some way. They don't prove anything either way, but they are useful for detecting sequences that probably aren't random (they're not so good for showing that a sequence *is* random).
    J.F. Sebastian : http://beta.stackoverflow.com/questions/56215/interesting-interview-questions#56501
  • Here's one simple check that you can perform. It uses generated random numbers to estimate Pi. It's not proof of randomness, but poor RNGs typically don't do well on it (they will return something like 2.5 or 3.8 rather ~3.14).

    Ideally this would be just one of many tests that you would run to check randomness.

    Something else that you can check is the standard deviation of the output. The expected standard deviation for a uniformly distributed population of values in the range 0..n approaches n/sqrt(12).

    /**
     * This is a rudimentary check to ensure that the output of a given RNG
     * is approximately uniformly distributed.  If the RNG output is not
     * uniformly distributed, this method will return a poor estimate for the
     * value of pi.
     * @param rng The RNG to test.
     * @param iterations The number of random points to generate for use in the
     * calculation.  This value needs to be sufficiently large in order to
     * produce a reasonably accurate result (assuming the RNG is uniform).
     * Less than 10,000 is not particularly useful.  100,000 should be sufficient.
     * @return An approximation of pi generated using the provided RNG.
     */
    public static double calculateMonteCarloValueForPi(Random rng,
                                                       int iterations)
    {
        // Assumes a quadrant of a circle of radius 1, bounded by a box with
        // sides of length 1.  The area of the square is therefore 1 square unit
        // and the area of the quadrant is (pi * r^2) / 4.
        int totalInsideQuadrant = 0;
        // Generate the specified number of random points and count how many fall
        // within the quadrant and how many do not.  We expect the number of points
        // in the quadrant (expressed as a fraction of the total number of points)
        // to be pi/4.  Therefore pi = 4 * ratio.
        for (int i = 0; i < iterations; i++)
        {
            double x = rng.nextDouble();
            double y = rng.nextDouble();
            if (isInQuadrant(x, y))
            {
                ++totalInsideQuadrant;
            }
        }
        // From these figures we can deduce an approximate value for Pi.
        return 4 * ((double) totalInsideQuadrant / iterations);
    }
    
    /**
     * Uses Pythagoras' theorem to determine whether the specified coordinates
     * fall within the area of the quadrant of a circle of radius 1 that is
     * centered on the origin.
     * @param x The x-coordinate of the point (must be between 0 and 1).
     * @param y The y-coordinate of the point (must be between 0 and 1).
     * @return True if the point is within the quadrant, false otherwise.
     */
    private static boolean isInQuadrant(double x, double y)
    {
        double distance = Math.sqrt((x * x) + (y * y));
        return distance <= 1;
    }
    
    Tnilsson : Me likes. Not the solution to the exact shuffle problem, but a good starting point. Have an upvote :)
    J.F. Sebastian : There is no need for `Math.sqrt()` in `isInQuadrant()`.
    Dan Dyer : YXJuLnphcnQ, good point.
    JoeBloggs : How is this, aside from all the extra processing, different from just counting higher/lower than 50% of the range of a random number?
    From Dan Dyer
  • Pondering it myself, what I would do is something like:

    Setup (Pseudo code)

    // A card has a Number 0-51 and a position 0-51
    int[][] StatMatrix = new int[52][52]; // Assume all are set to 0 as starting values
    ShuffleCards();
    ForEach (card in Cards) {
       StatMatrix[Card.Position][Card.Number]++;
    }
    

    This gives us a matrix 52x52 indicating how many times a card has ended up at a certain position. Repeat this a large number of times (I would start with 1000, but people better at statistics than me may give a better number).

    Analyze the matrix

    If we have perfect randomness and perform the shuffle an infinite number of times then for each card and for each position the number of times the card ended up in that position is the same as for any other card. Saying the same thing in a different way:

    statMatrix[position][card] / numberOfShuffle = 1/52.
    

    So I would calculate how far from that number we are.

    From Tnilsson
  • Most of the answers here seem to me to miss the point. People dork the implementation of the Fisher-Yates-Durstenfeld-Knuth shuffle algorithm all the time, even given a perfect RNG—I've done it myself. My question is simply this: Is there someplace I can find an existing Diehard-like battery of statistical tests specifically for shuffle output?

    The suggested frequency tests are a good start, but clearly much more could be done here. Indeed, many of the tests in Diehard look like they could be adapted to test permutations rather than uniform distributions by adapting the statistics appropriately. I've Googled for something like this with no success; have I missed something?

    "Beware of bugs in the above code; I have only proved it correct, not tried it." —Donald Knuth

    From PO8

0 comments:

Post a Comment