Per-minute tick data for USDJPY is available here. Suppose we download this file to USDJPY.txt and then load the closing prices into a NumPy array in Python 3 as follows:

```
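import numpy as np
with open('USDJPY.txt','r') as f:
    data=f.readlines()
# Split each line into its comma-separated fields and drop the first (header) line
data=[x.split(',') for x in data][1:]
# Keep only the closing price of each one-minute bar
jpy=np.array([float(close) for (ticker,yy,time,open,high,low,close,vol) in data])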
```

The per-minute returns in USDJPY, expressed in basis points, will be:

```
# Named djpy because every later snippet refers to the returns by that name
djpy=10000.0*np.diff(jpy)/jpy[0:-1]
```
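
As a quick check of the formula, here is a minimal sketch on a few made-up prices (the `prices` values are hypothetical, not taken from the data file):

```python
import numpy as np

# Hypothetical closing prices, for illustration only
prices = np.array([100.0, 100.5, 100.0])

# Simple return between consecutive bars, scaled to basis points (1 bp = 0.01%)
returns_bp = 10000.0 * np.diff(prices) / prices[:-1]
print(returns_bp)
```

A move from 100.0 to 100.5 is a 0.5% return, i.e. 50 basis points, and the array has one element fewer than the price series.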

Define a histogram function and empirical PDF function as follows:

```
def histc(X,bins):
    # Count how many elements of X fall into each bin
    map_to_bins = np.digitize(X,bins)
    r = np.zeros(bins.shape)
    for i in map_to_bins:
        r[i-1] += 1
    return [r,map_to_bins]

def epdf(S,numIntervals=100):
    # Relative frequency of S on an equally spaced grid x
    minS=np.min(S)
    maxS=np.max(S)
    intervalWidth=(maxS-minS)/numIntervals
    x=np.arange(minS,maxS+intervalWidth/2.,intervalWidth)
    [ncount,ii]=histc(S,x)
    # If the second bin holds more than half the sample, re-bin around the median
    if ncount[1]>len(S)/2:
        medS=np.median(S)
        minS=0.8*medS
        maxS=1.2*medS
        intervalWidth=(maxS-minS)/numIntervals
        x=np.arange(minS,maxS+intervalWidth/2.,intervalWidth)
        [ncount,ii]=histc(S,x)
    relativefreq=ncount/sum(ncount)
    return (x,relativefreq)
```
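
For comparison, a rough equivalent can be sketched with `np.histogram`. This is not the function above: it skips the median-based re-binning and returns bin centers rather than left edges, and the sample `S` here is synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
S = rng.normal(loc=0.0, scale=1.0, size=10_000)  # synthetic sample, illustration only

num_intervals = 100
# Equal-width bins spanning [min(S), max(S)]
counts, edges = np.histogram(S, bins=num_intervals)
centers = 0.5 * (edges[:-1] + edges[1:])
# Normalize counts so the bin probabilities sum to 1
relative_freq = counts / counts.sum()
```

Either way, the result is a vector of bin probabilities on an evenly spaced grid, which is what the rest of this question works with.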

The empirical PDF of USDJPY 1-minute returns (in basis points) is then:

```
(x,rf)=epdf(djpy,numIntervals=1000)
```

which if we plot it

```
from bokeh.plotting import figure, show
from bokeh.io import output_notebook
output_notebook()
simple=[(x[i],int(100*rf[i])) for i in range(rf.shape[0]) if int(100*rf[i]) > 0]
X=np.array([x for x,y in simple])
Y=np.array([y for x,y in simple])
p=figure(plot_width=600,plot_height=200,tools="pan,wheel_zoom,box_zoom,reset,resize")
p.line(X,Y)
show(p)
```

Looks like this:

Now suppose I want to know what could happen in an hour (60 one-minute moves). Following the answer to this question, I could convolve the EPDF above 60 times and I should get the right answer. I think this would look something like this:

```
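def step_pdf(pdf1,pdf2):
    pdf=np.convolve(pdf1,pdf2)
    # Average adjacent pairs of bins, which halves the length
    pdf=(pdf[0:-1:2]+pdf[1::2])/2
    # Prepend a zero bin so the result has the same length as the inputs
    pdf=np.append(np.array([0]),pdf)
    pdf=pdf/pdf.sum()
    return pdf

from functools import reduce
pdf60=reduce(step_pdf,[rf for i in range(60)])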
```
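
For reference, this is how `np.convolve` behaves on a small discrete example, the sum of two fair dice (a textbook case, not my data). The names `faces`, `pmf`, `pmf_sum` and `sums` are just for this illustration:

```python
import numpy as np

faces = np.arange(1, 7)          # outcomes of one die: 1..6
pmf = np.full(6, 1.0 / 6.0)      # fair die

# PMF of the sum of two independent dice
pmf_sum = np.convolve(pmf, pmf)

# The sums run from 1+1 to 6+6, so the result has 2*6-1 = 11 entries
sums = np.arange(2 * faces[0], 2 * faces[-1] + 1)
```

With inputs of length `N`, the output has length `2*N-1` and is indexed by the possible sums rather than by the original outcomes.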

If I then plot the new pdf60 on top of the old pdf

```
p=figure(plot_width=600,plot_height=200,tools="pan,wheel_zoom,box_zoom,reset,resize")
p.line(x,rf,color='red')
p.line(x,pdf60,color='blue')
show(p)
```

I see (call this “Convolution PDF60”):

The blue line is my 60-minute PDF from the 60-fold convolution above. It is smoother, which I expect, but it still covers roughly the same range as the original 1-minute PDF, which I do not expect. So now I will try a more constructive way of generating the 60-minute PDF: I will build as many random 60-minute samples as I have 1-minute samples, each one the sum of 60 moves drawn at random (with replacement) from my original population of 1-minute moves. Then I will compute the empirical PDF of the result. I completely trust this construction, so I will use it as a benchmark for my original one. So:

```
n=djpy.shape[0]
draws=np.random.randint(0,n,size=(n,60))
djpy60=np.array([djpy[draws[i]].sum() for i in range(n)])
(x,pdf60)=epdf(djpy60,numIntervals=1000)
```
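
A self-contained sketch of the same resampling on synthetic data, vectorized with fancy indexing instead of a Python loop. The `moves` array is a made-up stand-in for `djpy`. Because the 60 draws are independent, I would expect the standard deviation of the sums to be roughly `sqrt(60)` times that of the 1-minute moves, which gives a quick check on the width of any 60-minute PDF:

```python
import numpy as np

rng = np.random.default_rng(42)
# Synthetic fat-tailed stand-in for djpy, illustration only
moves = rng.standard_t(df=4, size=20_000)

n = moves.shape[0]
# Each row holds 60 indices sampled with replacement
draws = rng.integers(0, n, size=(n, 60))
# One 60-step sum per row
sums60 = moves[draws].sum(axis=1)

# Compare the observed widening against the sqrt(60) rule of thumb
ratio = sums60.std() / moves.std()
print(ratio, np.sqrt(60))
```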

Now if I plot pdf60:

```
p=figure(plot_width=600,plot_height=200,tools="pan,wheel_zoom,box_zoom,reset,resize")
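# x now holds the 60-minute grid, so recompute the 1-minute grid for rf
(x1,rf)=epdf(djpy,numIntervals=1000)
p.line(x1,rf,color='red')
p.line(x,pdf60,color='blue')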
show(p)
```

I see a much wider distribution of 60-minute moves, which corresponds much more strongly to my intuition (call this “Monte Carlo PDF60”):

**Question:** Why aren’t my Convolution PDF60 and my Monte Carlo PDF60 in agreement?