The sample s is the vector sum of two independent random variables U and C. The distribution of the sum of two independent variables is the convolution of their distributions.
The distribution for U is N(0, 1/2)^n and for C is N(0, d^2/2)^n.
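The convolution claim can be checked numerically: the convolution of two centered normal densities is again a normal density whose variance is the sum of the variances. A minimal sketch in Python, for the 1-D case, with d = 2.0 chosen arbitrarily for illustration:

```python
import math

# Check numerically that the convolution of the densities of
# U ~ N(0, 1/2) and C ~ N(0, d^2/2) equals the density of
# their sum, N(0, (1 + d^2)/2).  d = 2.0 is an arbitrary choice.
d = 2.0

def normal_pdf(x, var):
    return math.exp(-x * x / (2 * var)) / math.sqrt(2 * math.pi * var)

def convolution_at(s, var1, var2, a=-30.0, b=30.0, n=60000):
    # (f * g)(s) = ∫ f(u) g(s − u) du, composite midpoint rule;
    # midpoint is extremely accurate for rapidly decaying integrands.
    h = (b - a) / n
    return sum(normal_pdf(a + (i + 0.5) * h, var1)
               * normal_pdf(s - (a + (i + 0.5) * h), var2)
               for i in range(n)) * h

for s in (0.0, 1.0, 2.5):
    print(s, convolution_at(s, 0.5, d * d / 2),
          normal_pdf(s, (1 + d * d) / 2))
```

The two printed columns agree at each sample point.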
See this note about the N(μ, σ²) notation.
The respective probability densities are:

U: π^(−n/2) e^(−r^2)

C: π^(−n/2) d^(−n) e^(−r^2/d^2)
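As a sanity check, the density for C should integrate to 1. A minimal numerical sketch for n = 1, with d = 3.0 chosen arbitrarily:

```python
import math

# Check that the n = 1 density for C, π^(−1/2) d^(−1) e^(−r^2/d^2),
# integrates to 1 over the real line.  d = 3.0 is arbitrary.
d = 3.0

def density_C(r, d):
    return math.pi ** -0.5 / d * math.exp(-r * r / (d * d))

# Composite midpoint rule on [−50, 50]; the tail beyond is negligible.
N = 200000
a, b = -50.0, 50.0
h = (b - a) / N
total = sum(density_C(a + (i + 0.5) * h, d) for i in range(N)) * h
print(total)  # ≈ 1.0
```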
d/√2 is the standard deviation of each coordinate of the prior distribution for C, the unknown center of the distribution from which our sample comes.
We will let d approach infinity.
If we adopt a coordinate system in which the sample is found at s = <s_0, 0, 0, ...> then our Bayesian estimate for C will be <m, 0, 0, ...> where m is the expected value of z under the posterior density proportional to

e^(−(z−s)^2) e^(−z^2/d^2) = e^(−v(z))

(the first factor is the likelihood, since s − z = U; the second is the prior on C), where

v(z) = (z − s)^2 + z^2 d^(−2)

which is a quadratic in z with a minimum at z_min = s d^2/(d^2 + 1).

v(z) is symmetric about z_min (v(z_min + z) = v(z_min − z)) and thus the centroid of e^(−v(z)) is its center of symmetry. Thus m = s d^2/(d^2 + 1). In the limit, m → s as d → ∞. For large standard deviation of the prior distribution Bayes tells us to ascribe the sample mainly to the large variate, i.e. the prior distribution; in the limit, entirely to C!
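This closed form for m can be verified numerically by integrating against the posterior density e^(−(z−s)^2) e^(−z^2/d^2). A minimal sketch, with the arbitrary illustrative values s = 2.0 and d = 3.0:

```python
import math

# Numerical check of the 1-D posterior mean m = s d^2 / (d^2 + 1)
# for the density proportional to e^(−(z−s)^2) e^(−z^2/d^2).
# s = 2.0 and d = 3.0 are arbitrary illustrative values.
s, d = 2.0, 3.0

def w(z):
    return math.exp(-(z - s) ** 2 - z * z / (d * d))

# Composite midpoint rule; [−30, 30] captures both factors easily.
N = 120000
a, b = -30.0, 30.0
h = (b - a) / N
num = den = 0.0
for i in range(N):
    z = a + (i + 0.5) * h
    f = w(z)
    num += z * f
    den += f
m_numeric = num / den
m_closed = s * d * d / (d * d + 1)
print(m_numeric, m_closed)  # both ≈ 1.8
```

With d = 3 the posterior mean already sits at 90% of the way from the origin to s, illustrating the pull toward the sample as the prior flattens.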
I have been vague concerning the dimensionality of some of these formulae. When we adopted a coordinate system to place the sample conveniently, the two marginal distributions along the x_0 axis are indeed normal distributions.
The reasoning above makes sense for 1-D distributions. If you reinterpret the expression for m, casting z and s as n-dimensional vectors and taking the square operation to be the dot product of a vector with itself, then the expression yields a vector. By symmetry the direction remains the same, but the off-axis contributions to the integrals might change the vector's magnitude and thus its centroid. Here is why they do not.
Each of these vectors can be expressed as the sum of a vector in the x_0 direction of the adopted coordinate system and an (n−1)-dimensional vector in the space orthogonal to that direction. The integrand of each integral is the product of a function of x_0 and a function of the position in the orthogonal space. Both integrals can thus be transformed into iterated integrals, one over the real line and one over the orthogonal space.
Let Y be the space orthogonal to X, the x-axis. The spaces X and Y span the entire space. In the following “∫_X ... dx” denotes integration over the entire x axis while “∫_Y ... dy” is integration over Y. For any vector z we write z = x(z) + y(z) to decompose z into two components, x(z) ∊ X and y(z) ∊ Y.
First, ∫ φ dz = ∫_X ∫_Y φ dy dx under the liberal hypotheses of Fubini’s theorem (which hold here, since our functions become small very quickly for large arguments).
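The payoff of this decomposition is that a separable integrand φ(x, y) = f(x) g(y) integrates to the product of two one-dimensional integrals. A minimal numerical sketch, using the two Gaussian factors as f and g with an arbitrary d = 2.0:

```python
import math

# For a separable integrand φ(x, y) = f(x) g(y), the double integral
# over the plane equals (∫ f dx)(∫ g dy).  f and g are the Gaussian
# factors from the text, with the arbitrary choice d = 2.0.
d = 2.0
f = lambda x: math.exp(-x * x)            # factor depending on the X component
g = lambda y: math.exp(-y * y / (d * d))  # factor depending on the Y component

def integrate(func, a=-15.0, b=15.0, n=3000):
    # composite midpoint rule; ample accuracy for rapidly decaying integrands
    h = (b - a) / n
    return sum(func(a + (i + 0.5) * h) for i in range(n)) * h

# Brute-force double integral on a 2-D grid.
n = 1000
a, b = -15.0, 15.0
h = (b - a) / n
double = 0.0
for i in range(n):
    fx = f(a + (i + 0.5) * h)
    for j in range(n):
        double += fx * g(a + (j + 0.5) * h)
double *= h * h

product = integrate(f) * integrate(g)
print(double, product)  # both ≈ 2π (√π · 2√π)
```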
e^(−z^2/d^2) = e^(−(x(z)^2 + y(z)^2)/d^2) = e^(−x(z)^2/d^2) e^(−y(z)^2/d^2)

e^(−(z−s)^2) = e^(−(x(z) + y(z) − s)^2) = e^(−((x(z) − s)^2 + y(z)^2)) = e^(−(x(z) − s)^2) e^(−y(z)^2)
Note that the coordinates were chosen to make y(s) = 0 and x(s) = s. Also x(z) − s and x(z) are each orthogonal to y(z), which justifies expanding the squares as above. Both of the expressions e^(−z^2/d^2) and e^(−(z−s)^2) are thus factored into a function of the X component times a function of the Y component.
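The conclusion of the factorization argument can be tested numerically in two dimensions: with s placed on the x-axis, the centroid of the posterior density has zero off-axis component, and its on-axis component equals the 1-D value. A minimal sketch for n = 2, with the arbitrary values s_0 = 2.0 and d = 3.0:

```python
import math

# 2-D check: with s = (s0, 0), the centroid of
# e^(−|z − s|^2) e^(−|z|^2/d^2) has zero y-component, and its
# x-component equals the 1-D result s0 d^2/(d^2 + 1).
# s0 = 2.0 and d = 3.0 are arbitrary illustrative values.
s0, d = 2.0, 3.0

def w(x, y):
    return math.exp(-((x - s0) ** 2 + y * y) - (x * x + y * y) / (d * d))

# Composite midpoint rule on a square grid.
n = 800
a, b = -20.0, 20.0
h = (b - a) / n
mx = my = mass = 0.0
for i in range(n):
    x = a + (i + 0.5) * h
    for j in range(n):
        y = a + (j + 0.5) * h
        f = w(x, y)
        mass += f
        mx += x * f
        my += y * f
mx /= mass
my /= mass
print(mx, my)  # ≈ 1.8 and ≈ 0.0
```

The off-axis component vanishes by the symmetry of the Y factors, exactly as the factorization predicts.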