The first necessary notion is the probability integral transform, which is useful in its own right, so I explore it first. Suppose you have the heights of the students of some high school. You could compute several statistics from this information, but I propose instead to sort the heights into a list and assign indexes to them from the smallest height to the largest. These indexes are numbers from 1 to the population, which is the number of students in the school. Divide each index by the population and you get numbers between 0 and 1. We now have a tabulated function, f, from numbers between 0 and 1 to the heights of the students. f(½) is the median height of the students. f(4/5) is, to within one student, the height of the shortest student in the tallest quintile of heights. It is important to realize that f is not a linear function of its argument, but it is monotonic: x ≤ y → f(x) ≤ f(y).
It may help to visualize the student body lined up on a stage by height, each student occupying the same horizontal space, with a large horizontal ruler that reads 0 at the left end and 1 at the right end. This image constitutes a graph of f. Sometimes the inverse of f is useful, and since f is monotonic the inverse is well defined, at least when no two students have exactly the same height: 0 ≤ f⁻¹(h) ≤ 1. There are (population × f⁻¹(5 ft)) students whose height is five feet or less.
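The tabulated function and its inverse can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: the ten heights (in inches) are made up, the names `f`, `f_inverse`, `table` and `population` are hypothetical, arguments that fall between two table entries are rounded up to the next student, and no two students share a height.

```python
from bisect import bisect_right
from fractions import Fraction
from math import ceil

# Hypothetical heights in inches (assumed data, not taken from the text).
heights = [70, 62, 66, 59, 64, 68, 61, 72, 65, 67]

table = sorted(heights)      # index 1 is the shortest student
population = len(table)

def f(x):
    """Height of the student whose index/population equals x, for 0 < x <= 1.

    An x that falls between two table entries is rounded up to the next
    student; that rounding rule is an assumption of this sketch.
    """
    if not 0 < x <= 1:
        raise ValueError("x must satisfy 0 < x <= 1")
    index = ceil(x * population)
    return table[index - 1]

def f_inverse(h):
    """Fraction of students whose height is at most h."""
    return Fraction(bisect_right(table, h), population)

median_height = f(Fraction(1, 2))
short_count = population * f_inverse(60)   # students of 5 ft (60 in) or less
```

Exact fractions are used for the arguments so that something like 7/10 times the population lands on a whole index instead of a float just above it.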
The copula might be called a cumulative distribution in a box. Each component of the random vector has been replaced, via its own probability integral transform, by another that runs uniformly from 0 to 1. Each individual stat of the population, which had been distributed over some natural range, has been replaced by a normalized stat ranging from 0 to 1. If we visualize the original population, distributed in n-space by their individual stats, as redistributed in this n-box, then the distribution in the box is the copula.
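For two stats the redistribution into the box can be sketched as follows. The (height, weight) pairs are invented for illustration, `normalized_ranks` and `copula` are hypothetical names, and the function computed here is only the empirical version built from a finite population with no ties.

```python
# Hypothetical (height in inches, weight in pounds) pairs; assumed data.
students = [(70, 160), (62, 120), (66, 150), (59, 105), (64, 140), (68, 135)]

def normalized_ranks(values):
    """Replace each value by index/population, its position in sorted order.

    Assumes no two values are equal.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    for position, i in enumerate(order, start=1):
        ranks[i] = position / len(values)
    return ranks

u = normalized_ranks([h for h, _ in students])   # heights moved into (0, 1]
v = normalized_ranks([w for _, w in students])   # weights moved into (0, 1]
box_points = list(zip(u, v))                     # the population in the 2-box

def copula(a, b):
    """Fraction of students whose box point lies in [0, a] x [0, b]."""
    inside = sum(1 for x, y in box_points if x <= a and y <= b)
    return inside / len(box_points)
```

Each coordinate of `box_points` taken alone is evenly spread over 1/6, 2/6, ..., 1, whatever the original heights and weights were; only the way the two coordinates pair up carries information.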
If heights and weights were independently distributed, which means that knowing one gives no clue about the other, then this distribution in the box would be uniform, or flat; and conversely, a flat distribution in the box means the two are independent.
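A coarse way to see the flatness is to cut the box into four quarters and count the points in each. The sketch below uses simulated data, so every number in it is an assumption: heights and weights are drawn from normal distributions with made-up parameters, and `quadrant_fractions` is a hypothetical helper. Equal quarters are necessary for a flat distribution but do not prove it.

```python
import random

def normalized_ranks(values):
    """Replace each value by index/population, its position in sorted order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    for position, i in enumerate(order, start=1):
        ranks[i] = position / len(values)
    return ranks

def quadrant_fractions(xs, ys):
    """Fraction of box points in each quarter of the unit box.

    Keys are (x in upper half, y in upper half).
    """
    u, v = normalized_ranks(xs), normalized_ranks(ys)
    counts = {(False, False): 0, (False, True): 0,
              (True, False): 0, (True, True): 0}
    for a, b in zip(u, v):
        counts[(a > 0.5, b > 0.5)] += 1
    return {key: n / len(xs) for key, n in counts.items()}

rng = random.Random(0)   # fixed seed so the run is repeatable
n = 2000
heights = [rng.gauss(66, 3) for _ in range(n)]
independent_weights = [rng.gauss(140, 20) for _ in range(n)]
dependent_weights = [5 * h + rng.gauss(0, 5) for h in heights]

flat = quadrant_fractions(heights, independent_weights)
lumpy = quadrant_fractions(heights, dependent_weights)
```

With the independent weights each quarter should hold close to a quarter of the points; with weights tied to height the points crowd toward the short-and-light and tall-and-heavy corners.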