I can’t find my Cray X-MP manual, but this is the logic that I recall of the hardware interlock by which multiprocessing programs can coördinate on shared data in memory. There were a few, perhaps four, flip-flops shared among the several processors. One flop would be enough, but a few are rather better. There were short, fast commands to turn these on and off, one at a time. The following rule was enforced: if you turned on a flop that was already on, you stalled until someone else turned it off.
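The rule above can be modelled in software. This is only a sketch of the behaviour as I recall it, not the Cray instruction set: the class `SharedFlops`, its methods `turn_on` and `turn_off`, and the helper `run_demo` are all names made up for illustration, and Python threads stand in for processors.

```python
import threading


class SharedFlops:
    """Toy model of a small bank of one-bit flags shared by all workers.

    Hypothetical names throughout; this imitates the recalled rule only.
    """

    def __init__(self, count=4):
        self._on = [False] * count          # all flops start off
        self._cond = threading.Condition()  # guards the bank, wakes stalled callers

    def turn_on(self, i):
        # If flop i is already on, stall until someone turns it off,
        # then turn it on ourselves.
        with self._cond:
            while self._on[i]:
                self._cond.wait()
            self._on[i] = True

    def turn_off(self, i):
        # Turning a flop off never stalls; it releases any stalled callers.
        with self._cond:
            self._on[i] = False
            self._cond.notify_all()


def run_demo(workers=4, rounds=1000):
    """Have several threads update one shared counter under flop 0."""
    flops = SharedFlops()
    total = [0]

    def worker():
        for _ in range(rounds):
            flops.turn_on(0)    # seize the flop (stalls if already on)
            total[0] += 1       # the protected update to shared data
            flops.turn_off(0)   # let the next worker proceed

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]
```

Calling `run_demo(4, 1000)` should return 4000, since each increment happens while the caller alone holds flop 0. With several flops, unrelated pieces of shared data can each use their own flop and avoid stalling one another, which is presumably why a few were rather better than one.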
Sometimes non-cooperating tasks would occupy the processors, and it was necessary to map these flops so as to protect program logic from cross-contamination. I do not recall how this was done.
The machine was the third Cray machine with substantial pipelining. It would suffice to broadcast a signal to the other processors and arrange that none of your effects on memory happened until you knew that you were the only processor seizing that flop. Perhaps you would also have to cancel all loads subsequent to the turn-on, and redo them when the coast was clear.