Author: Lehman Garrison (http://lgarrison.github.io)
However, using ˉρ=NL2 is not quite right. When sitting on a particle, only N−1 particles are available to be in a bin some distance away. The other particle is the particle you're sitting on, which is always at distance 0. Thus, the correct expression is RRi=NAiN−1L2.
The following notebook empirically demonstrates that using N−1 is correct, and that using N introduces bias of order 1N into the estimator. This is a tiny correction for large N problems, but important for small N.
Cross-correlations of two different particle sets don't suffer from this problem; the particle you're sitting on is never part of the set of particles under consideration for pair-making.
This N−1 correction is implemented in the Corrfunc code.
import numpy as np
# Generates a set of N uniform random points in boxsize L
# and returns binned pair counts
# Pair counts are doubled, and self-pairs are counted
def rand_DD(N, L, bins):
# Make a 2D random set of N particles in [0,L)
x = np.random.random((2,N))*L
# Form the periodic distance array
dist = x[:,None] - x[...,None]
dist -= np.round(dist/L)*L
dist = (dist**2).sum(axis=0)**.5
# Count pairs in 2 bins
DD = np.histogram(dist, bins=bins)[0]
return DD
# DD parameters
N = 10
L = 1.
bins = np.linspace(0.0,0.49, 5)
n_iter = 100000 # number of times to repeat the experiment
seed = 42
np.random.seed(seed)
# First, calculate RR analytically
dA = (bins[1:]**2 - bins[:-1]**2)*np.pi # The area of each 2D annular bin
RR = np.empty((2, len(dA)))
RR[:] = N/L**2*dA
# Calculate RR using N-1 and N
RR[0] *= N - 1
RR[1] *= N
# If the first bin includes 0, the random term must include the self-pairs
if bins[0] == 0.:
RR[:, 0] += N
xi_mean = np.zeros((2, len(bins)-1))
for i in xrange(n_iter):
DD = rand_DD(N, L, bins)
xi_mean += DD/RR
xi_mean /= n_iter
print 'DD/RR using density (N-1) = {}'.format(xi_mean[0])
print 'DD/RR using density (N) = {}'.format(xi_mean[1])
DD/RR using density (N-1) = [ 1.00022627 0.99934544 0.99956887 1.00006867] DD/RR using density (N) = [ 0.96817988 0.8994109 0.89961199 0.9000618 ]
As we expected, using N in the density leads to a 1N=10% bias in DDRR, when averaging over many realizations of DD.
The first bin has a smaller bias because we chose to include a bin that starts at 0 separation, and thus includes self-pairs, which are indepdendent of the N or N−1 density term.