The terms covariant and contravariant arose first in mathematics, or perhaps mathematical physics. Fairly recently they have been adapted to the theory of types for computer languages. I thought I would explain the connection that I see between these two worlds.
Consider the gradient of some quantity measured in degrees, such as temperature, as it varies with position. If we measure distance in miles, the components of the gradient come out in degrees per mile. If we choose kilometers instead, the number of degrees per kilometer is smaller than the number of degrees per mile. The mile is larger than the kilometer and thus leads to larger vector components; for the same gradient, the components are smaller when the unit is smaller. The components vary in the same direction as the unit, so the gradient, as a vector, is deemed covariant.
By contrast, if we speak of velocity, then a vector component is smaller when the unit is bigger. For the same velocity there are fewer miles per hour than kilometers per hour. The components vary in the opposite direction to the unit, so velocity is deemed contravariant.
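The arithmetic behind the two cases can be sketched in C. This is only an illustration: the function names, the sample figures of 5 degrees per mile and 60 miles per hour, and the KM_PER_MILE constant are my own choices, not part of the argument above.

```c
#include <assert.h>

/* One mile expressed in kilometers (the international mile). */
#define KM_PER_MILE 1.609344

/* Covariant case: a gradient given in degrees per mile, re-expressed in
   degrees per kilometer. The smaller unit yields the smaller component. */
double gradient_per_km(double degrees_per_mile) {
    return degrees_per_mile / KM_PER_MILE;
}

/* Contravariant case: a speed given in miles per hour, re-expressed in
   kilometers per hour. The smaller unit yields the larger component. */
double speed_km_per_hour(double miles_per_hour) {
    return miles_per_hour * KM_PER_MILE;
}

/* Checks both directions with hypothetical sample values. */
void units_demo(void) {
    double g_mile = 5.0;   /* hypothetical: 5 degrees per mile */
    double v_mile = 60.0;  /* hypothetical: 60 miles per hour  */
    assert(gradient_per_km(g_mile) < g_mile);    /* shrinks with the unit   */
    assert(speed_km_per_hour(v_mile) > v_mile);  /* grows as the unit shrinks */
}
```

Calling units_demo() passes both assertions: moving to the smaller unit divides the gradient component by the conversion factor and multiplies the velocity component by it.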
Golf scores are contravariant—small wins. High jumping scores are covariant—large wins.
In C I may write typedef struct{zot q; int p;} tx; to define a type tx. If I change zot to a larger type (one with more values), then the type tx will also be larger, with more values. This kind of production of new types from old is covariant.
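One way to see this is to count values. The sketch below assumes, purely for illustration, that zot and the p field each range over a small number of values, represented as 0 up to some bound; the name count_tx and the bounds 2, 3 and 4 are hypothetical.

```c
#include <assert.h>

/* Stand-in for the struct in the text; q plays the role of the zot field. */
typedef struct { int q; int p; } tx;

/* Count the distinct tx values when q ranges over zot_values values
   and p ranges over p_values values. */
unsigned long count_tx(unsigned zot_values, unsigned p_values) {
    unsigned long n = 0;
    for (unsigned q = 0; q < zot_values; q++) {
        for (unsigned p = 0; p < p_values; p++) {
            tx t = { (int)q, (int)p };  /* one distinct value of tx */
            (void)t;
            n++;
        }
    }
    return n;
}

/* Enlarging zot from 2 values to 3 enlarges tx as well. */
void struct_demo(void) {
    assert(count_tx(2, 4) == 8);
    assert(count_tx(3, 4) == 12);
    assert(count_tx(3, 4) > count_tx(2, 4));
}
```

The count grows with the size of zot, which is the sense in which the struct construction is covariant.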
If I write in C typedef int fun(zot); this means that fun is the type of a function that takes a zot and returns an int. If I modify the definition of zot to include more values, then there will be fewer functions that can cope with the expanded set of possible arguments, i.e. fewer functions that conform to the modified type fun. This form of type construction is therefore called contravariant.
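The same point can be sketched by testing which functions cope with every value of zot. Everything here beyond the typedef is an assumption made for illustration: zot is modeled as the integers 0 up to a bound, UNHANDLED is a made-up marker for an argument a function cannot cope with, and handles_two and handles_three are hypothetical candidates.

```c
#include <assert.h>
#include <stddef.h>

typedef int zot;       /* stand-in: zot values are 0 .. zot_size-1 */
typedef int fun(zot);  /* same shape as the typedef in the text    */

#define UNHANDLED (-1) /* hypothetical marker: argument not handled */

/* Copes only with the zot values 0 and 1. */
int handles_two(zot z) { return (z == 0 || z == 1) ? z * 10 : UNHANDLED; }

/* Copes with the zot values 0, 1 and 2. */
int handles_three(zot z) { return (z >= 0 && z <= 2) ? z * 10 : UNHANDLED; }

/* Returns 1 if f copes with every zot value in 0 .. zot_size-1, else 0. */
int copes_with(fun *f, int zot_size) {
    for (zot z = 0; z < zot_size; z++) {
        if (f(z) == UNHANDLED) {
            return 0;
        }
    }
    return 1;
}

/* Counts how many of the candidate functions conform to fun
   when zot has zot_size values. */
int count_conforming(int zot_size) {
    fun *candidates[] = { handles_two, handles_three };
    int n = 0;
    for (size_t i = 0; i < sizeof candidates / sizeof candidates[0]; i++) {
        n += copes_with(candidates[i], zot_size);
    }
    return n;
}

/* Enlarging zot from 2 values to 3 leaves fewer conforming functions. */
void fun_demo(void) {
    assert(count_conforming(2) == 2);
    assert(count_conforming(3) == 1);
}
```

With two zot values both candidates conform; with three, only handles_three does. The set of conforming functions shrinks as zot grows, the opposite direction from the struct case.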
I have resurrected a sequence of very interesting emails that was once found here, and which was captured thus by the Internet Archive Wayback Machine. It speculates on the history of such terms in the mathematical context.