Here we discuss a simple implicit solution of the 1D heat equation. Now we note how to avoid the recurrence that thwarts mapping this process onto simple SIMD architectures. In the scheme described, each of the two inner loops uses values produced in its own prior iteration. If instead the unknowns are eliminated in the following order, the problem can be parallelized.
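$\phi^1_0,\ \phi^1_4,\ \phi^1_8,\ \ldots,\ \phi^1_{96}$
$\phi^1_1,\ \phi^1_5,\ \phi^1_9,\ \ldots,\ \phi^1_{97}$
$\phi^1_2,\ \phi^1_6,\ \phi^1_{10},\ \ldots,\ \phi^1_{98}$
$\phi^1_3,\ \phi^1_7,\ \phi^1_{11},\ \ldots,\ \phi^1_{99}$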

Eliminating $\phi^1_0$ does not interact with eliminating $\phi^1_4$, so all the eliminations within one group can be carried out with SIMD-style parallelism. Novel placement of data in RAM is probably required. After the first three groups have been eliminated, however, the last group presents the same recurrence dilemma that we saw initially. At least we have solved 3/4 of the problem. We can declare victory, or notice that the remaining problem is exactly equivalent to the original problem except for being only 1/4 as big. We can recurse.
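
As an illustration of the data placement remark, here is a minimal NumPy sketch of one possible layout. It is an assumption, not something the scheme above prescribes: the 100 values are regrouped so that each elimination group occupies one contiguous row. The names `phi` and `groups` are hypothetical, and the stand-in values are chosen only so that the layout is visible.

```python
import numpy as np

n = 100                                # number of unknowns, assumed a multiple of 4
phi = np.arange(n, dtype=float)        # stand-in values: phi[i] == i

# Natural order keeps neighbours adjacent: index 0, 1, 2, 3, 4, ...
# Viewing the array as 25 blocks of 4 and transposing gathers the
# unknowns with the same index mod 4 into one contiguous row.
groups = np.ascontiguousarray(phi.reshape(n // 4, 4).T)   # shape (4, 25)

print(groups[0][:3])    # indices 0, 4, 8 of the first group
print(groups[3][-3:])   # indices 91, 95, 99 of the last group
```

With this layout a single vector operation on `groups[k]` touches every member of group `k` at once.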

After the 25 unknowns have been computed, recursively or recurrently, we must then compute the other 75 values, which is again appropriate for normal SIMD.
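
The whole procedure can be sketched in NumPy as follows. This is a hedged illustration under stated assumptions, not the code of the scheme referred to above: the tridiagonal system is assumed to be given as arrays `a`, `b`, `c`, `d` with `a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i]`, no pivoting is done (reasonable for the diagonally dominant systems an implicit heat step produces), and the function names `thomas` and `solve_by_groups` are hypothetical. The bookkeeping also differs slightly from a literal row-by-row elimination: each eliminated unknown is recorded as an affine function of the two nearest unknowns of the last group, which leads to the same reduced system.

```python
import numpy as np

def thomas(a, b, c, d):
    """Sequential tridiagonal solve; each step needs the previous one."""
    n = len(b)
    cp, dp = np.zeros(n), np.zeros(n)
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):                       # forward recurrence
        den = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / den
        dp[i] = (d[i] - a[i] * dp[i - 1]) / den
    x = np.zeros(n)
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):              # backward recurrence
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

def solve_by_groups(a, b, c, d):
    """Eliminate groups 0, 1, 2 (index mod 4), recurse on group 3."""
    n = len(b)
    if n % 4 != 0:                              # cannot split further
        return thomas(a, b, c, d)
    m = n // 4
    # Row k of A, B, C, D holds the coefficients of group k (index = 4q + k).
    A, B, C, D = (np.asarray(v, dtype=float).reshape(m, 4).T
                  for v in (a, b, c, d))
    # Record u_k = p[k] + l[k]*left + r[k]*(next unknown), one vector op per group.
    p, l, r = np.empty((3, m)), np.empty((3, m)), np.empty((3, m))
    p[0], l[0], r[0] = D[0] / B[0], -A[0] / B[0], -C[0] / B[0]
    for k in (1, 2):
        den = B[k] + A[k] * r[k - 1]
        p[k] = (D[k] - A[k] * p[k - 1]) / den
        l[k] = -A[k] * l[k - 1] / den
        r[k] = -C[k] / den
    # Fold so each u_k depends only on its left and right group-3 neighbours.
    for k in (1, 0):
        p[k] = p[k] + r[k] * p[k + 1]
        l[k] = l[k] + r[k] * l[k + 1]
        r[k] = r[k] * r[k + 1]
    # Group-0 data of the following block, padded with zeros after the last block.
    p0n, l0n, r0n = (np.append(v[0][1:], 0.0) for v in (p, l, r))
    # Reduced tridiagonal system in the m unknowns of group 3.
    ra = A[3] * l[2]
    rb = B[3] + A[3] * r[2] + C[3] * l0n
    rc = C[3] * r0n
    rd = D[3] - A[3] * p[2] - C[3] * p0n
    right = solve_by_groups(ra, rb, rc, rd)     # recurse, or fall back to thomas
    left = np.concatenate(([0.0], right[:-1]))  # first block has no left neighbour
    x = np.empty((4, m))
    x[3] = right
    for k in range(3):                          # the other 3/4, again vector ops
        x[k] = p[k] + l[k] * left + r[k] * right
    return x.T.reshape(-1)                      # back to natural order

# Usage with made-up coefficients of the kind an implicit heat step gives.
n, s = 100, 0.5                                 # s is a hypothetical mesh ratio
a = np.full(n, -s); b = np.full(n, 1 + 2 * s); c = np.full(n, -s)
d = np.linspace(0.0, 1.0, n)
x = solve_by_groups(a, b, c, d)
```

For 100 unknowns the recursion stops after one level, because 25 is not a multiple of 4, and the 25 remaining unknowns are found by the sequential `thomas` routine. A size such as 64 would recurse through 16, 4 and 1.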