Counting Constrained Patterns

The two small programs introduced here purport to count the exact number of conforming patterns for certain rectangular bit arrays. Here is some of the logic of those programs and speculation how they could be improved.

The program logic computes the exact count by induction on the length of one of the edges of the rectangle. For an m × n rectangle we start with an m × 1 rectangle and proceed by induction on n. The information necessary for the induction step is for each configuration of the bottom row of bits (along which the rectangle is inductively growing), how many constrained m × n patterns end in that configuration. The count is 0, of course, for configurations that do not themselves conform and we save much space by not storing these zero counts. We identify the counts in the following by the number whose binary representation is those edge bits.

The induction step transforms this vector of 2^m counts by multiplying by a matrix A of 0’s and 1’s. A_ij is 1 just in case i and j each lack adjacent ones and the bits of i and j are disjoint. In C jargon
A[i][j] = !(i & (i<<1)) && !(j & (j<<1)) && !(i & j).

The code keeps the count vector which is called old and new. These vectors are substantially compressed by omitting elements necessarily 0 by the nonconformance of their subscript values. There are indeed just F_m elements stored for each vector. If j is an index into a real compressed vector then pats[j] is the value of the index for the theory provided in this page.

Indeed the calculation merely computes Aⁿ one where one is a vector each component of which is 1. This suggests that we are computing the eigen vector of A. I know no better way to do this than what the present code does except that the multi precision counts might be better done with floating point. Even the floating values might need renormalization to avoid overflow. This would lack the charm of an exact count but I don’t think we need that.

Another more serious issue is the boundary conditions along the two edges orthogonal to the growing edge. Those points are adjacent to all zeros outside the computed mesh. The calculation is thus not representative of a much larger mesh as it allows more 1’s among those edges than would be typical in an infinite mesh. For more realistic edge conditions we might establish an artificial exterior boundary that matches statistically, somehow, what would be expected. What is expected is somewhat circular, of course, for that is the whole point of this calculation. Nonetheless it seems plausible to at least put boundary conditions which have the correct density of 1’s, and better conform to the prohibition of repeated 1’s. Better yet, perhaps, we should take the eigen vector, as we compute it, to produce some sort of Markovian process to produce a boundary condition that matches the patterns that the calculation develops along its leading edge. This requires a precise statistical model and how to derive parameters for the model from the current vector.

These ideas get to the heart of the impetus of this calculation: How can multigraph statistics of a bit stream be turned into a probability distribution over bit streams, and how can such distributions be effectively sampled algorithmically?