This note is about CPU design, code optimizers and tangentially about GF(2^{8}) or the finite field with 256 elements.
This mathematical structure is used heavily in Reed-Solomon codes (Erasure coding), and lightly, possibly in Shamir secret sharing.
Being a field, and a computational field at that, there are add, multiply and divide operations to be provided.
The theory is elsewhere but we examine the programming of multiply here which is usually the bulk of the necessary heavy computing.
The common header file defines a few types and provides a short fast definition of multiplication.
It is in a header file so as to be available for inlining by clients.
Here is the core:

typedef unsigned char uc; typedef unsigned int ui; static const uc irp = 27; static ui fmul(char a, ui b){uc sm=0; while(b) {if(b&1) sm ^= a; a = (a<<1)^((a>>7)&irp); b>>=1;} return sm;}

clang -Wmost null.c -S -O3produces a file

gcc -Wall null.c -S -O3produces a file

clang tmul.c Afmul.c -O3 -Wmostor

gcc tmul.c Afmul.c -O3 -Wall -std=c99verifies field properties of the algebra.

Substituting Bfmul.c for Afmul.c tests a system where we precalculate 768 bytes of tables.

This computes Lagrange interpolation coefficients and moves towards code to restore lost portions.