# Covariance and the Centrality Bin Width effect

Updated on Wed, 2017-12-13 18:15. Originally created by genevb on 2017-12-13 18:09.

In reading the STAR paper on flow harmonics correlations in Au+Au data, I came upon the Centrality Bin Width effect. The authors of the paper have tried to correct for the effect by narrowing refMult bin widths, but I believe this may fall short of the intent.

As an example, let's take two observables, A and B. Let A have the form:

A = C + f(x) + δA

where C is a constant, and δA represents its statistical fluctuations (such that if f(x) = 0, A would have a mean value of C and a variance of (δA)^{2}), while f(x) is some function of multiplicity.

Let B have the same form:

B = D + g(x) + δB

You can see that for a fixed multiplicity, my observables A and B are independent and should have a covariance of zero.

However, the covariance of these in a finite multiplicity bin will be:

cov(A,B) = E(f(x)*g(x)) - E(f(x)) * E(g(x))

where E() is the "expectation of", and is essentially the mean value over the bin.
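To spell out why only f and g survive, here is a sketch of the expansion. It assumes that δA and δB are independent of each other and of x; that independence is my reading of the setup above rather than something stated explicitly.

```latex
\begin{align}
\mathrm{cov}(A,B)
  &= \mathrm{cov}\big(C + f(x) + \delta A,\; D + g(x) + \delta B\big) \\
  &= \mathrm{cov}(f,g) + \mathrm{cov}(f,\delta B)
     + \mathrm{cov}(\delta A,g) + \mathrm{cov}(\delta A,\delta B) \\
  &= \mathrm{cov}(f,g)
   = E\big(f(x)\,g(x)\big) - E\big(f(x)\big)\,E\big(g(x)\big)
\end{align}
```

The constants C and D never contribute to a covariance, and the three cross terms vanish under the independence assumption.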

It's easy to do the math for this if we take f(x) and g(x) to be linear functions of x, f(x) = Fx, g(x) = Gx, and take the multiplicity to be uniformly distributed across the bin. Then we get:

cov(A,B) = (1/12)*F*G*w^{2}

where w is the multiplicity bin width. As we go to the limit where w approaches 0, the covariance approaches 0. The fact that it is non-zero for finite bins is the bin width effect, whereby a mutual dependence on the variable used for binning contributes to the covariance. This contribution can be minimized by taking smaller bins, which is what is already being done.
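For reference, here is a sketch of where the factor of 1/12 comes from. It assumes x is uniformly distributed over a bin of width w centered at x_0; both the uniform distribution and the symbol x_0 are assumptions made for illustration.

```latex
\begin{align}
E(x)   &= \frac{1}{w}\int_{x_0 - w/2}^{x_0 + w/2} x \, dx = x_0 \\
E(x^2) &= \frac{1}{w}\int_{x_0 - w/2}^{x_0 + w/2} x^2 \, dx
        = x_0^2 + \frac{w^2}{12} \\
\mathrm{cov}(A,B) &= F\,G\,\big[E(x^2) - E(x)^2\big]
        = \frac{1}{12}\,F\,G\,w^2
\end{align}
```

A real multiplicity distribution is not flat within a bin, so the numerical factor would differ somewhat, but the w^2 scaling is the relevant point.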

But...

Let's now replace x with x+δx, so that x is still the multiplicity that we measure, and x+δx is the centrality that A and B may functionally depend upon, modulo a scale factor that we can absorb into f() and g(). In this case, the mean centrality (in a multiplicity bin) is still the multiplicity x, while (δx)^{2} would be the variance in centrality about the mean. Now, the covariance of A and B becomes:

cov(A,B) = E(f(x+δx)*g(x+δx)) - E(f(x+δx)) * E(g(x+δx))

Again, this is easily solved for the linear case that f(x) = Fx, g(x) = Gx, and we get for a multiplicity bin of width w:

cov(A,B) = (1/12)*F*G*w^{2} + F*G*(δx)^{2}

You can see that as w approaches zero, you will get a covariance that becomes independent of the multiplicity bin width (and you will see results that appear to approach a steady state by doing so), but it does not go to zero as we would like it to for these independent observables. There is a term related to the variance of centrality about the mean for a given multiplicity, and it will contribute to your observed correlation functions because δx is clearly nonzero for STAR.
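A quick toy Monte Carlo can illustrate the point. This is only a sketch under assumptions that are not part of the argument above: multiplicity is drawn uniformly within one bin, δx is Gaussian with width `sigma_x`, δA and δB are unit-width Gaussians, and C = 1, D = 2. The function name `binned_covariance` and all of its parameters are made up for this illustration.

```python
import random


def binned_covariance(F, G, w, sigma_x, x0=100.0, n=100_000, seed=1):
    """Toy estimate of cov(A, B) inside one multiplicity bin.

    Hypothetical setup (assumptions for illustration only):
      x       uniform in [x0 - w/2, x0 + w/2]   (measured multiplicity)
      delta_x Gaussian, mean 0, width sigma_x   (centrality smearing)
      A = 1 + F*(x + delta_x) + delta_A
      B = 2 + G*(x + delta_x) + delta_B
    with delta_A and delta_B independent unit-width Gaussians.
    """
    rng = random.Random(seed)
    sum_a = 0.0
    sum_b = 0.0
    sum_ab = 0.0
    for _ in range(n):
        x = rng.uniform(x0 - w / 2, x0 + w / 2)
        centrality = x + rng.gauss(0.0, sigma_x)
        a = 1.0 + F * centrality + rng.gauss(0.0, 1.0)
        b = 2.0 + G * centrality + rng.gauss(0.0, 1.0)
        sum_a += a
        sum_b += b
        sum_ab += a * b
    # cov(A, B) = E(A*B) - E(A)*E(B), estimated from the sample means
    return sum_ab / n - (sum_a / n) * (sum_b / n)


if __name__ == "__main__":
    for width in (6.0, 1.0, 0.01):
        print(width, binned_covariance(1.0, 1.0, width, sigma_x=3.0))
```

With F = G = 1 and `sigma_x` = 3, the formula above predicts (1/12)*w^{2} + 9, so shrinking the bin from w = 6 to w = 0.01 should take the estimate from roughly 12 down toward 9 and no further. Setting `sigma_x` to 0 recovers the first case, where the covariance does vanish as the bin narrows.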

-Gene
