The Sturm und Drang surrounding mathematical foundations in the early 20th century still echoed at Berkeley. I found it important and still do. Predicate logic together with the ZFC axioms for sets seemed adequate for the math people knew. I prefer the old because I know it.

Just as a computer designer chooses primitive instructions that together provide general-purpose computing, so proposers of mathematical foundations choose primitive notions to explain conventional mathematics. The engineer tries for efficient computation, and the mathematician tries for simple proofs, built on his foundations, that support familiar math. I think both ZFC and category theory do a pretty good job of this.

Russell’s paradox badly wounded the early foundation builders, and to avoid those problems we got a set theory that abjured the universal set, and indeed many sets that seemed natural but turned out to be so tall that they hit their heads on that paradox. For instance, the pattern of groups is a crown jewel of math, but the set of all groups is too big for the axioms; where we might want to speak of the set of groups we must instead speak only of what it means to be a group. In this context “group”, the word, remains informal. Klein introduced the word “Vierergruppe” in 1884 for the four-element group that is not cyclic. Such a noun was out of fashion in 1955, as there was no mathematical thing for it to denote. It was easy, however, to say formally when two groups were isomorphic. In 1955 many sentences needed such a noun and there was none.
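To make the point about the missing noun concrete, here is a minimal Python sketch. It builds two quite different-looking four-element groups, checks that neither is cyclic, and checks by brute force that they are isomorphic to each other but not to the cyclic group of order four. All of the names (`element_order`, `is_cyclic`, `are_isomorphic`, `v4_pairs`, `units_mod_8`, `z4`) are my own illustrative choices, and trying every bijection is only sensible for tiny groups like these.

```python
from itertools import permutations, product


def element_order(x, op, identity):
    """Smallest n >= 1 such that x combined with itself n times is the identity."""
    n, acc = 1, x
    while acc != identity:
        acc = op(acc, x)
        n += 1
    return n


def is_cyclic(elements, op, identity):
    """A finite group is cyclic when some element has order equal to the group size."""
    return any(element_order(x, op, identity) == len(elements) for x in elements)


def are_isomorphic(g, g_op, h, h_op):
    """Brute force: look for a bijection g -> h that preserves the operation."""
    g, h = list(g), list(h)
    if len(g) != len(h):
        return False
    for image in permutations(h):
        f = dict(zip(g, image))
        if all(f[g_op(a, b)] == h_op(f[a], f[b]) for a in g for b in g):
            return True
    return False


# Presentation 1: pairs of bits under componentwise addition mod 2.
v4_pairs = list(product((0, 1), repeat=2))


def add_pairs(a, b):
    return ((a[0] + b[0]) % 2, (a[1] + b[1]) % 2)


# Presentation 2: the odd residues mod 8 under multiplication.
units_mod_8 = [1, 3, 5, 7]


def mul_mod_8(a, b):
    return (a * b) % 8


# For contrast: the cyclic group of order four.
z4 = [0, 1, 2, 3]


def add_mod_4(a, b):
    return (a + b) % 4


print(is_cyclic(v4_pairs, add_pairs, (0, 0)))                       # False
print(is_cyclic(units_mod_8, mul_mod_8, 1))                         # False
print(are_isomorphic(v4_pairs, add_pairs, units_mod_8, mul_mod_8))  # True
print(are_isomorphic(v4_pairs, add_pairs, z4, add_mod_4))           # False
```

Both `v4_pairs` and `units_mod_8` have an equal claim to being "the" Vierergruppe. The code can say that they are isomorphic, but nothing in it is the single thing that the noun would denote.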

Von Neumann and Bernays extended set theory a bit to include classes, which stand off a bit from sets and which include a class of all sets. No class of all classes is possible, however. I have not seen that theory developed very far. I think there is an opinion that these dodges do not add materially to math, except to improve presentation.

Category theory comes at this from a different angle. Its practitioners have axioms, built on predicate calculus as best I can see, where there is a thing for “Vierergruppe” to denote, even “group”. This certainly improves the flow of mathematical English prose in many places. They take functions as primitive and call them arrows. I have no objection. The first few theorems get them farther towards useful math than the same number of theorems starting with sets. It goes somewhat easier than sets. I have not seen, however, anyone worry whether there are hazards such as Russell’s paradox. I suspect them of not caring. I have not ascertained the foundations on which Mizar is built. I suppose they would worry if someone proved that 0=1.

Just now I see that there are old-school and category-style monoids. It is the same math with different words. I was going to make some negative comments on the latter, but then realized that it is intended to describe how monoids are defined in category theory, not to introduce what they are. That is left to the old-school math.
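The "same math with different words" remark can be sketched in Python. This is only an illustration under my own assumptions: the class names `Monoid` and `OneObjectCategory`, the helpers `as_category` and `as_monoid`, and the restriction to finite examples that can be checked exhaustively are all hypothetical choices, not anyone's standard library. The old-school version speaks of elements, an operation and an identity element; the category-style version speaks of a single object, arrows from that object to itself, composition and an identity arrow.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


def _laws_hold(items, combine, unit):
    """Closure, identity and associativity, checked exhaustively on a finite carrier."""
    closed = all(combine(a, b) in items for a in items for b in items)
    has_unit = all(combine(unit, a) == a and combine(a, unit) == a for a in items)
    associative = all(
        combine(combine(a, b), c) == combine(a, combine(b, c))
        for a in items for b in items for c in items
    )
    return closed and has_unit and associative


@dataclass(frozen=True)
class Monoid:
    """Old-school wording: a set, an associative operation, an identity element."""
    elements: Tuple[Any, ...]
    op: Callable[[Any, Any], Any]
    identity: Any

    def satisfies_laws(self) -> bool:
        return _laws_hold(self.elements, self.op, self.identity)


@dataclass(frozen=True)
class OneObjectCategory:
    """Category wording: one object, arrows from it to itself, composition, identity arrow."""
    arrows: Tuple[Any, ...]
    compose: Callable[[Any, Any], Any]
    identity_arrow: Any

    def satisfies_laws(self) -> bool:
        return _laws_hold(self.arrows, self.compose, self.identity_arrow)


def as_category(m: Monoid) -> OneObjectCategory:
    # Elements become arrows, the operation becomes composition.
    return OneObjectCategory(m.elements, m.op, m.identity)


def as_monoid(c: OneObjectCategory) -> Monoid:
    # Arrows become elements, composition becomes the operation.
    return Monoid(c.arrows, c.compose, c.identity_arrow)


def add_mod_3(a, b):
    return (a + b) % 3


def sub_mod_3(a, b):
    return (a - b) % 3


addition = Monoid((0, 1, 2), add_mod_3, 0)
subtraction = Monoid((0, 1, 2), sub_mod_3, 0)  # deliberately not a monoid

print(addition.satisfies_laws())                # True
print(as_category(addition).satisfies_laws())   # True
print(as_monoid(as_category(addition)) == addition)  # True
print(subtraction.satisfies_laws())             # False
```

The two classes run the very same checks through `_laws_hold`, and the translation in each direction only renames fields.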