Skip to content

Building a Probability-Based Random Number Generator

Published: at 04:04 PM

Recently, I tackled an interesting coding task; the objective was to implement a method, next_num(), that generates random numbers based on predefined probabilities. Below, I’ll outline the task, describe my approach, and provide a link to the GitHub repository containing the complete code and unit tests.

Task Overview

The task was to implement a method, next_num(), which generates random numbers from a predefined list based on associated probabilities. Here’s a quick summary of the requirements:

  1. Input:
    • A list of random numbers: [-1, 0, 1, 2, 3]
    • A list of corresponding probabilities: [0.01, 0.3, 0.58, 0.1, 0.01]
  2. Output:
    • A random number from the list based on the specified probabilities.
  3. Performance:
    • The method should be well-structured and tested, as if it were to be deployed in a production environment.

Solution Approach

1. Understanding Probability Distribution

The task involves creating a probability distribution over a set of numbers. Each number in the list has a probability associated with it, indicating how likely it is to be chosen. For example:

Importantly, the probabilities must sum to 1. If they do not, we need to normalize them.

2. Normalization

If the given probabilities do not sum to 1, we can normalize them by dividing each probability by the total sum of the probabilities.

Formula for Normalization:

P=PiPP' = \frac{P_i}{\sum P}

where:

3. Cumulative Distribution Function (CDF)

To facilitate random selection based on the given probabilities, we construct a Cumulative Distribution Function (CDF). The CDF is a running total of the probabilities.

For a list of normalized probabilities PP', the CDF can be calculated as follows:

CDFi=j=1iPjCDF_i = \sum_{j=1}^{i} P'_j

where CDFiCDF_i is the cumulative probability up to the ii-th number.

4. Random Selection

To select a number based on the defined probabilities, generate a random number rr in the range [0,1)[0, 1). Compare this random number rr to the CDF:

Why this works?

Over a large number of trials, the relative frequencies of each number should converge to the defined probabilities. For example, if the probability of selecting the number 1 is 0.58, then out of 1000 selections, you would expect approximately 580 of them to be the number 1.

Conclusion

The implementation of the next_num() function serves as a practical example of how to create a probability-based random number generator in Python. The accompanying unit tests provide additional confidence that the function operates correctly and adheres to the specified probabilities.

You can find the complete code and unit tests here on Github

Feel free to reach out with any questions or insights!