### Easter egg or not?

Hi, I'm Marcin, technical support engineer here at NAG and I've recently encountered an interesting problem. I coauthored a recent paper on option pricing with the NAG Library (Option pricing in Excel). My coauthor Jeremy subsequently blogged about it (Extending, imitating and collaborating). Now it’s time for the next step.

While working on a demonstration of NAG random number generators in Excel and their application in option pricing using the Monte Carlo method I noticed a considerable discrepancy in results while using unscrambled Sobol quasirandom numbers. Well, to start with, how do you price a European option when the underlying asset follows geometric Brownian motion? Basically for each random number you simulate what the price of the asset will be in, say T=1 year, moving in one step from the starting time point t=0 to the end point t=T, calculate the payoff, and the average value of all payoffs will be our desired option price (Monte Carlo methods for option pricing). I’m not going to get into the details of Monte Carlo method, that will be covered in a paper I'm currently working on. The idea is that this simulation works very well and gives good convergence to the result obtained using the close-form formula, no surprises here. The stock prices have log-normal distribution, as expected. This is illustrated in the figure below. This histogram groups the stock prices from 20000 simulations.

One could however want to price Asian options, which are path-dependent. The payoff of such options depends not on the final price of the asset (as in European options), but on the average price of the asset throughout the life of the option. We therefore need to generate a path of stock prices, so we’ll move from t=0 to t=T not in one step as previously, but in many steps, say 252 (this is the average number of trading days per year). In practice we generate 252 random numbers, generate a path of the asset price and for the final step we calculate the payoff. This is done for every Monte Carlo simulation, so as you’ve probably noticed we need 252 times more random numbers than in case of the European option. The quantity is not an issue, the Sobol sequence is long enough. What happens is that in case of such pathwise Monte Carlo simulation the distribution is not log-normal, it doesn’t actually resemble any distribution, as can be seen below.

Why is that? Why do the quasirandom numbers fail? The simulation using the same data, but pseudorandom numbers instead of quasirandom, does return stock prices that are log-normally distributed (below).

What is going on here? Well, I know the answer, it took me some time to realise, and it would have taken more time if not the help of my colleagues Jacques and Robert. Do you, dear reader of this blog, know why it is like this? Is this just an easter egg, or an interesting property of quasirandom numbers? Happy Easter!

While working on a demonstration of NAG random number generators in Excel and their application in option pricing using the Monte Carlo method I noticed a considerable discrepancy in results while using unscrambled Sobol quasirandom numbers. Well, to start with, how do you price a European option when the underlying asset follows geometric Brownian motion? Basically for each random number you simulate what the price of the asset will be in, say T=1 year, moving in one step from the starting time point t=0 to the end point t=T, calculate the payoff, and the average value of all payoffs will be our desired option price (Monte Carlo methods for option pricing). I’m not going to get into the details of Monte Carlo method, that will be covered in a paper I'm currently working on. The idea is that this simulation works very well and gives good convergence to the result obtained using the close-form formula, no surprises here. The stock prices have log-normal distribution, as expected. This is illustrated in the figure below. This histogram groups the stock prices from 20000 simulations.

One could however want to price Asian options, which are path-dependent. The payoff of such options depends not on the final price of the asset (as in European options), but on the average price of the asset throughout the life of the option. We therefore need to generate a path of stock prices, so we’ll move from t=0 to t=T not in one step as previously, but in many steps, say 252 (this is the average number of trading days per year). In practice we generate 252 random numbers, generate a path of the asset price and for the final step we calculate the payoff. This is done for every Monte Carlo simulation, so as you’ve probably noticed we need 252 times more random numbers than in case of the European option. The quantity is not an issue, the Sobol sequence is long enough. What happens is that in case of such pathwise Monte Carlo simulation the distribution is not log-normal, it doesn’t actually resemble any distribution, as can be seen below.

Why is that? Why do the quasirandom numbers fail? The simulation using the same data, but pseudorandom numbers instead of quasirandom, does return stock prices that are log-normally distributed (below).

What is going on here? Well, I know the answer, it took me some time to realise, and it would have taken more time if not the help of my colleagues Jacques and Robert. Do you, dear reader of this blog, know why it is like this? Is this just an easter egg, or an interesting property of quasirandom numbers? Happy Easter!

Well, let me take a crack at this egg. The main (only?) difference between pseudo and quasi random numbers seems to be that the latter are more evenly distributed across the interval of interest than the former (there are more words about this, along with a rather nice graphic which illustrates the point here). Perhaps this has something to do with why their distribution is closer-to-normal than that of the pseudo-random sequence?

ReplyDeleteIs it something to do with the fact that quasi random numbers are not statistically independent from each other?

ReplyDeleteThis could due to the poor quality of quasi-random number beyond a certain dimension (say 100). Even if NAG's sobol generator is able to generate up to 50,000 dimensions, there is no guarantee that projection of any two of them is evenly distributed over the unit square.

ReplyDeleteThe fact is the convergence of low discrepancy number is of order n*(log n)^d where d is dimension. With a large d, it barely has any advantage of pseudo-random numbers. This has not taken into account the reliability of Sobol' numbers in high dimension.

Jeremy, Mike and Kai, thank you for your interest in this post of mine. This wasn't an Easter egg, the problem I encountered was real.

ReplyDeleteJeremy, you're right that quasirandoms are more evenly distributed, but when I perform the Monte Carlo using one single step the results are fine. The problem arises in the stepwise MC.

Mike, the numbers are indeed not statistically independent, you're right. Do you have an idea how this could be solved? You can use pseudorandoms of course, but is there a way to this properly with quasirandoms? I actually have an idea how this can be done, but need to try it out. The answer is to use a scrambled sequence...

Kai, I agree with you about the higher dimensions, but in this case I've taken numbers just from the first dimension. There are 20k simulations and 50 steps in each, altogether 1m quasirandom numbers from the first dimension!

Hi,

ReplyDeleteIf in the second graph, you plot the average of Spot, then it is not a log normal distribution (it is well know that sum of log normal random variable is not a log normal random variable).

Now suppose that for each path you generate 252 quasi-random number and take the first one to contruct the S_T (the other is constructed with Brownian Bridge). It is like you consider a 252 dimension problem and consider the projection to the 1st axis.

I guess then 252 is a special dimension of your Sobol manager?

Feyn

Hi Feyn,

ReplyDeleteThanks for your interest in this blog.

In both second and third graphs I plot the prices of Asian options, but only the one where I use pseudorandoms shows a proper distribution. What you say about lognormal distributions is right, but note that I don't really plot an average of MC simulations, but for every simulation I compute average as in Asian options and this is proper way to do it (as seen on the 3rd pic).

I don't need to fill the bits inbetween with numbers from BB, they are not required. In the first example I move straight from S0 to ST and I don't need what's between them. If I calculate the prices between, then I have the 3rd example.

I'm not sure what you mean by projecting the problem to the 1st axis, could you elaborate more on that please?

And then 252 is not a special dimension of the NAG Sobol generator. It can be an arbitrary number. I tried it with 50 steps, with 252 steps and the diestribution in example 2 (using quasirandoms) is still hairy.