# Work It, Work It


As I discussed in my post "How I Met the Gaussian Distribution," I am assuming that, within a single layer of a quantum dot solar cell, quantum dot sizes follow a Gaussian (bell-curve) distribution. Santra & Kamat confirm both that quantum dot sizes vary and that they center about a face-value, "advertised" size such as 3 or 4 nm in diameter. While these observations do not altogether confirm that the sizes are Gaussian, they lend significant evidence that such a distribution is present.

Thus, to account for this variation in quantum dot sizes within one layer, it can be assumed that the sizes remain centered about the face-value size (i.e., 3 or 4 nm diameter) and follow a normal (Gaussian) distribution, which can be expressed as
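(writing $\mu$ for the face-value diameter and $\sigma$ for the spread in dot sizes, which is the standard notation for the normal density and is assumed here):

$$ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} $$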

Thus, I developed a theoretical algorithm to calculate any absorption point Ai as a function of quantum dot size x, given the assumed Gaussian properties and a quantum dot size range of 0 to infinity, the widest range physically possible.
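In symbols, a Gaussian-weighted average of this kind reads as follows, with $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/(2\sigma^2)}$ as the weight and $A(x)$ the absorption at dot size $x$:

$$ A_i = \frac{\displaystyle\int_0^\infty A(x)\, f(x)\, dx}{\displaystyle\int_0^\infty f(x)\, dx} $$

The denominator renormalizes the weights: truncating the Gaussian at $x = 0$ means it no longer integrates to exactly 1, so dividing by its total remaining mass keeps $A_i$ a true weighted average.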

In retrospect, this algorithm looks a little redundant, even common-sensical, because it LOOKS like the big expressions in the numerator and denominator should cancel, leaving just A(x) for Ai. But A(x) changes for every value of x substituted from 0 to infinity, so the calculation is much more involved than a simple cancellation (and cancelling here would be incorrect).

For the purposes of computation, the algorithm above is impossible to evaluate directly because it involves an integral to infinity, which a computer can NEVER compute exactly! I solved this problem with Riemann sums, which approximate the area under a curve with a series of rectangles defined by increments in x and the function's values at those increments.
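As a rough sketch of that approach, here is a midpoint Riemann sum in Python. The absorption curve `A` is a placeholder argument (the real one would come from the device model), and the infinite upper limit is truncated a few sigma above the mean, where the Gaussian weight is negligible:

```python
import math

def gaussian(x, mu, sigma):
    """Gaussian weight centered on the face-value dot size mu with spread sigma."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def weighted_absorption(A, mu, sigma, n_sigma=5, n_steps=10_000):
    """Midpoint Riemann-sum approximation of the Gaussian-weighted absorption.

    Truncates the 0-to-infinity integral at mu + n_sigma * sigma, since the
    Gaussian weight beyond that point contributes essentially nothing.
    """
    lo = max(0.0, mu - n_sigma * sigma)   # sizes cannot be negative
    hi = mu + n_sigma * sigma
    dx = (hi - lo) / n_steps
    num = den = 0.0
    for i in range(n_steps):
        x = lo + (i + 0.5) * dx           # midpoint of each rectangle
        w = gaussian(x, mu, sigma) * dx   # rectangle of Gaussian weight
        num += A(x) * w
        den += w
    return num / den                      # normalized weighted average

# Illustration only: a made-up absorption curve A(x) = x^2, 3 nm dots, 0.3 nm spread
estimate = weighted_absorption(lambda x: x ** 2, mu=3.0, sigma=0.3)
```

Dividing by the accumulated weight `den` mirrors the denominator in the algorithm, so the truncation and discretization errors largely cancel between numerator and denominator.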

My next steps are programming my algorithms and determining the number of sigma (i.e., the confidence level) needed to define S in the algorithm immediately above. Do you have any recommendations for the number of sigma, or feedback on any other parts of these algorithms? My impression is that the scientific community considers 3 sigma significant, 5 sigma "proof," and 6 sigma CERN Higgs boson-level, but I'd be interested in hearing your take.
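For context on that choice, the number of sigma fixes how much of the Gaussian's mass the truncated range captures, which is a standard result expressible through the error function. A quick Python check (the function name here is mine, not from the post):

```python
import math

def gaussian_coverage(k):
    """Fraction of a Gaussian's probability mass within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2.0))

# Common benchmarks: ~95.45% at 2 sigma, ~99.73% at 3 sigma, ~99.99994% at 5 sigma
for k in (2, 3, 5):
    print(f"{k} sigma: {gaussian_coverage(k):.7f}")
```

So even a 3-sigma cutoff leaves only about 0.27% of the dot-size distribution outside the integration range, and 5 sigma leaves well under a millionth.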

Thanks so much for your time and feedback,
-VD

### Don't judge me too harshly,

Don't judge me too harshly, but I'm used to thinking about statistical significance in relation to biological data. Since biologists deal with messy systems that have a great deal of variability, they can be happy with data that's significant to within just two standard deviations, and they consider 3 sigma fantastic. So you're kind of blowing my mind by even bringing up 5 sigma.
I appreciate your explanation of why the second equation above can't be simplified by cancelling out the seemingly similar numerator and denominator; I was scratching my head at first!

### Haha, I'm not judging at all!

Haha, I'm not judging at all! We've examined psychology studies in Statistics class, so I have definitely heard about the significance issue in different fields of study (and Kenny's opinions on the ethics of researchers who twist their data or their conditions to meet values set by the field). The post "Does 5-sigma = discovery?" (http://www.physicscentral.com/buzz/blog/index.cfm?postid=5248358123737529836) might shed some light on this. I think 3 sigma is pretty good (thank you!), but because my study is mainly theoretical, I may be able to use 5 sigma. Is it ethical, though, to analyze my data before claiming a sigma benchmark? I do not plan to modify my data, nor do I plan to set a condition BEFORE testing, but I am wondering if sigma accuracy can be included in the results and analysis.

### Someday when you look too

Someday when you look too happy, I'll tell you my own dark tales about researchers massaging their data into significance... shudder.
I think it's perfectly reasonable for you to proceed without a set-in-stone benchmark before analysis. You're doing all the right things: you're thinking ahead about how to determine statistical significance, you have ideas about the cutoffs you'll use, and you understand their relative merits and drawbacks.