An Interview with Chuck Leith[1]

Chuck Leith = CL

George Michael = GAM

GAM: It’s December 7, 1994, and I’m talking with Chuck Leith, who was here very, very early in the Laboratory’s history. In fact, a lot of the interesting things about Chuck might start with the days at Oak Ridge and at graduate school at the University of California in Berkeley. So, Chuck, why don’t you begin by telling us how you got to Berkeley, and how you got to Oak Ridge, and how you came here?

CL: I started working for the Radiation Laboratory in Berkeley in the fall of 1943, right after I had acquired my Bachelor’s degree in the Mathematics Department at Berkeley. Within a couple of months, I was sent along with a number of other people at the Radiation Laboratory in Berkeley—Ernest Lawrence’s Laboratory—to Oak Ridge to work in the Manhattan Project there. The Manhattan Project work ended, of course, at the end of World War II. At that time, I’d been drafted into the Army, although I stayed working with the same group. It took a little longer, therefore, for me to get out of the Army and back to Berkeley than for the other people who were not in that situation.

GAM: While you were at Oak Ridge, you worked with Herb York and others there. Herb was the first Director at Livermore. Can you say what you were doing without breaking security rules?

CL: The work in Oak Ridge, as is well known, was involved in the electromagnetic separation of uranium isotopes, using mechanisms which essentially were devised by the Radiation Laboratory in Berkeley. The calutron,[2] which was used at Oak Ridge, made use of the large magnetic fields that Ernest Lawrence had begun developing before the war for his 184-inch cyclotron.

The work that I did there with Herb York and a few other people from the Laboratory had to do with aspects of the electromagnetic separation process. In fact, my first computational effort dealt with the integration of a simple system of nonlinear equations relevant to details of plant production as depending on various parameters that we had control over. And that integration was carried out by wiring a board on a 601 Multiplier—an IBM accounting machine. As I say, that was my first computer program, crude as it was. But it was satisfactory in that we were proposing a counter-intuitive change based on the integration of this system of equations, and the integration showed us that, in fact, it was a sensible change to make.

As I was saying, I’d returned to Berkeley to start graduate work after the war.

GAM: That was about 1945, then?

CL: Yes, that was around 1945, 1946—in those years. And I was working as an experimental physicist on high-energy particle scattering experiments using the 184-inch Cyclotron, which, by that time, was working. I was not particularly involved in computational activities at that time. I was a graduate student in the Mathematics Department.

Around 1950, a group of us—again under Herb York’s leadership—became involved in a measurements project. It was a largely experimental effort to measure how effective a group at Los Alamos had been in producing a first thermonuclear reaction in a test situation. That involved actually getting more into computational techniques, although at that stage we were not in a position to use any very high-powered computers or computational techniques, because they didn’t exist.

However, during those years John von Neumann, at the Institute for Advanced Study in Princeton, was developing such computers and computational techniques in the first “Johnniac.”[3]

And he was also making some big and critical calculations on thermonuclear reactions. So, I became familiar with the work that he was doing. It turned out that also at the Institute for Advanced Study I became aware of the fact that another interest of von Neumann’s was being carried out—namely, numerical weather prediction.

GAM: Ah—your weather work goes that far back?

CL: Yes, that was around 1950. That’s one of the first things that von Neumann recognized—the fact that the big and complicated nonlinear problems which we were facing both in the weapons business and in the weather business could be tackled with the machine that he was developing at that time.

The Livermore Laboratory was set up in the fall of 1952, and I came along with Herb York, who was the first Director at the Livermore branch of the Laboratory. I came to Livermore with him and a number of other people, as we got things started. So, one of the first things we did was to buy a computer—the UNIVAC 1. We realized that we would need such equipment in order to try to solve the problems that we were dealing with. That was von Neumann’s advice to Edward Teller, who was involved in the creation of the Livermore branch of the Laboratory, and it was advice to which we all agreed. We clearly recognized the role that computers would play in the activity that we were getting involved in.

So it was then, in fact, when that machine was delivered in early 1953, that I first became deeply involved in the issue of how such computers should be used for the simulation of complicated physical processes. In particular, I became mostly interested in the simulation of the explosion of nuclear and thermonuclear devices.

GAM: It got here in April.

CL: Yes, is that right? I’d forgotten the exact month. April of 1953. Well, we had ordered the UNIVAC 1. Some of us had visited the factory in Philadelphia during the winter, trying it out and testing some simple codes that we had developed for those matters, and I was among the group that did this. So, although it didn’t arrive until April, we were actually using it some months earlier than that.

GAM: That was in a sense the origin of the EGG code?

CL: Yes, that was the code that was designed to simulate, in a spherically symmetric configuration, the explosion of nuclear and/or thermonuclear weapons. It was highly simplified in many respects, of course, but it was able to give some reasonable estimates—fairly accurate estimates—of what explosive yield would probably come from such devices. It also provided a lot of the details for what was going on during the explosion process.

I became very much interested in that, as well as the techniques of how to use such a computer for doing that kind of rather complicated problem that involved fluid flow, radiation transport, and the like. It was very similar, in many ways, to the kinds of calculations that people were starting in about that time for modeling stars, except that this was dynamic, and those tended to be stationary configurations.

GAM: Well, do you remember any of the guys who were working with you? It’s always nice to dredge up old names like that. You were working with Harold Harlan Hall at one point?

CL: Yes.

GAM: “Tritium” we used to call him. When did Sid rise up in—?

CL: Sid Fernbach was involved almost from the very beginning. He was certainly involved in the procurement of the UNIVAC 1, and he was the one who spent a lot of time dealing with the Eckert-Mauchly people in Philadelphia in connection with the machine and its delivery. Among the small group of people who first came out to the Livermore Laboratory, he took over—with other people being happy that he had—the whole issue of getting this new computer that we needed but didn’t know too much about how to use.

The UNIVAC, of course, was an interesting machine inasmuch as it had a thousand-word memory. When people would ask von Neumann early on about, roughly speaking, how big a memory one needed for these things, he said, “About a thousand words ought to be enough.” And I think we decided later that maybe that was enough for him, but most of us would have preferred to have a much larger memory. But, in fact, it was a memory into which one could buffer information from tape—reading and writing in 60-word buffers. So, with careful planning, there was fairly easy transfer of information out to a working tape, and back from a working tape into the memory, so that there was, in fact, no particular limitation on the size of the problems that we could carry out. The fact of the matter is that the design was clever enough so that—recognizing that it was natural to write a tape forward—it permitted you to read it backwards.

GAM: Yes, excellent.

CL: It was a very good idea, which I think sort of vanished from sight after the end of that machine.

GAM: Indeed. Yes, there were a couple of others that did it, but they weren’t in the mainstream at all.

CL: This was, of course, aided by the fact that it was a fixed block length of 60 words that one was dealing with in this case. I have recently been wishing that the DEC Alpha chip, which has a thousand-word cache, would in fact have a thousand-word local memory, with a 64-word read-write buffer that I could use in exactly the same way as we handled the input-output (i/o) problem in the UNIVAC 1. Unfortunately, that’s not the case, and it’s not particularly easy to feed the Alpha chip.
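
The tape buffering described above can be sketched in a few lines. This is a minimal illustration, not a reconstruction of any UNIVAC 1 code: the `Tape` class, its method names, and `sweep_out_and_back` are hypothetical, Python lists stand in for the tape and the 60-word buffer, and the sketch assumes that a backward read returns the words of each block in their original order. It shows a data set larger than the thousand-word memory being written forward block by block and then read back in reverse block order with no rewind.

```python
BLOCK_WORDS = 60  # fixed block length mentioned in the interview


class Tape:
    """Hypothetical stand-in for a working tape holding fixed-size blocks."""

    def __init__(self):
        self.blocks = []  # blocks in the order they were written
        self.pos = 0      # head position, counted in blocks

    def write_forward(self, block):
        if len(block) != BLOCK_WORDS:
            raise ValueError("blocks must be exactly 60 words")
        # Writing at the head discards anything beyond it.
        del self.blocks[self.pos:]
        self.blocks.append(list(block))
        self.pos += 1

    def read_backward(self):
        """Return the block just behind the head and move the head back."""
        if self.pos == 0:
            raise EOFError("at start of tape")
        self.pos -= 1
        return self.blocks[self.pos]


def sweep_out_and_back(words):
    """Write `words` forward in 60-word blocks, then read the blocks backward."""
    tape = Tape()
    for i in range(0, len(words), BLOCK_WORDS):
        tape.write_forward(words[i:i + BLOCK_WORDS])
    recovered = []
    while tape.pos > 0:
        recovered.append(tape.read_backward())
    return recovered


data = list(range(3000))           # three times the thousand-word memory
blocks = sweep_out_and_back(data)  # last block written is the first read back
```

Because the head is already at the end of the data after the forward pass, the backward pass can start at once, which is the convenience the interview credits to the design.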

GAM: Yes. Well, that was what the CRAY people decided.

CL: It’s a common problem with microprocessors that they can do arithmetic faster than they can be fed numbers to do it on.

GAM: So, back in 1953, Harold Brown became the center around which people clustered, and we used to have a weekly meeting—the LMG (Livermore Megaton Group) meeting.

CL: Well, he was certainly one of the key people in the beginning years of the Laboratory, and became, after a while, its Director, following Herb York’s departure to go to the Defense Department. But his background was essentially in theoretical physics, and he was strongly supportive of the computational program that people were carrying out.

Software, of course, was pretty primitive in those years. In fact, for the UNIVAC 1, one wrote essentially in absolute assembly language—that is, there was a thousand-word memory, and you addressed memory locations by their number, like 2-3-7, for example. In my mind, one of the greatest boons that made machine programming easier came with the introduction of symbolic assembly language. I consider that a greater benefit than the later introduction of FORTRAN. Of course, FORTRAN has obviously proved to be a useful tool.

GAM: Well, it’s kind of like the crossbow of programming, you know. It made everybody possibly a programmer. But you had to be pretty good to use symbolic assembly, anyway.

CL: But it was a lot better than the earlier absolute machine language that we were using. And it permitted one to write sections of programs that assembled properly.

GAM: Do you remember Meritt Elmore, Tom Wilder, and people like that?

CL: Sort of, vaguely, yes.

GAM: They produced a thing called LMO,[4] a compiler that handled formatting for output on the UNIVAC. Did you use that?

CL: I did not use that—not that I remember.

GAM: I don’t think many people did. I remember Grace Hopper jumping up and down and saying that it was really a great invention.

CL: During the 1950s, I kept being more and more involved in this, and the trouble was that I was not pursuing my graduate work as rapidly as I should have been. So, along about 1957, I took a year off and went into Berkeley, stayed there, and finished my thesis research. I had finished course requirements earlier, but I’d been delayed in getting around to my thesis research because of my getting so deeply involved in the early years of computing, which I found almost fatally attractive in that regard. But then, after having finished that and returning about 1958, I wrote one further big weapons explosion code.

Along about that time, I was beginning to wonder whether there was any particular future in what I was engaged in. This was partly from personal reasons of wanting to get involved in other things. But it was partly also that there was temporarily some sort of geopolitical move toward banning nuclear tests, and there seemed to be some evidence that the world was moving away from interest in this sort of thing. It turned out to be only temporary, but, nonetheless, that made me interested in getting involved in the simulation of other complicated physical systems. And, in particular, I was curious about what was being done on the numerical simulation of the atmosphere—the problem that von Neumann had been tackling with the people at the Institute for Advanced Study in Princeton. I went and talked to some of the people in the atmospheric modeling business who encouraged me to get into it even though I had no particular background in that area except for a background in mathematics and physics.

But in the fall of 1960, we were getting the LARC computer, which was the first transistorized computer at the Livermore Laboratory. And it was thought that it would be about ten times faster than anything that we or others had. So I pointed out to the people I talked to that I didn’t know very much about the atmosphere, but we were about to get a computer that was ten times faster—so didn’t it make sense for me to try to build an atmospheric model on the LARC, and what kind of contribution might that make? I was strongly encouraged to go ahead and do that, and therefore I spent about a year or so before the delivery of the LARC in October 1960 getting ready for it. In fact, during the summer of 1960, I spent some months at the International Institute for Meteorology in Stockholm working on the development of this model code and having access to a library and people who had some familiarity with the nature of the problems encountered in doing this sort of thing.

In the fall, I returned to Livermore, the LARC was delivered, and I started running on it. In fact, I think I was running on it quite a bit sooner than almost anyone else, because I had spent this time getting ready for it.

GAM: Right. That’s why we called it—“the Leith Atmospheric Research Calculator.”

CL: There was this muttering about the LARC standing for it, as you say. It was a useful machine if one programmed it properly. It had, as the next level of working storage, rotating drums. And if you used them properly you could maintain minimum latency on acquiring information from the drums. So it was a machine that could be well balanced in its i/o versus arithmetic capabilities. I was able to build a fairly decent atmospheric model which was later perceived as competitive with two others that were being developed at other centers in the country, one in Princeton and the other at UCLA.

GAM: Oh, UCLA. Yes, I remember that—the Smagorinsky one?

CL: Smagorinsky actually developed his model when the Geophysical Fluid Dynamics Laboratory was still in Washington, D.C. They later moved to Princeton.

But they, in fact, had acquired the STRETCH about those years, which was another first-generation transistorized computer from IBM, as contrasted with the LARC, which was from Eckert-Mauchly. And they had been planning to use that machine for some of their first global atmospheric modeling work.

GAM: So, actually, the LARC was late and so was the STRETCH. So, if they had a running STRETCH at Princeton, it must have been after 1962.

CL: I think that could be. It was a year or two later, I believe, after I got things running on the LARC, which would be 1961.

GAM: Well, you did many, many “first” things on the LARC. Do you remember we had the electronic page recorders—EPRs? And using them, you made some very interesting movies.

CL: Ah, yes. Well, then, of course, you helped me on that, George.

GAM: Well, never mind that. You did a polar projection?

CL: It’s true that with that it was possible, effectively, to do graphics: to look at single images on a northern hemisphere polar projection of, for example, isobars or cloud patterns, things of that sort. And by doing this sequentially, of course, one could generate motion pictures. This was, I think, one of the first of the evolving motion picture displays from a computer-generated atmospheric model.

GAM: We did that.

CL: The way it was done, if you’ll remember, was to print three successive black and white frames of 35mm film, which later in the printing process were printed through filters and superimposed so that we got a three-color image for single printing for every three frames that we made originally. With this display the evolving features of the global atmosphere were readily identified, and it led to a lot more interest in the way these models worked.

GAM: Listen, Chuck, it was nothing short of sensational, okay? It was great!

CL: It’s a fairly common procedure now. I believe it was one of the first that was done.

GAM: Now it is. But you did it thirty-four years ago or so.

CL: In fact, there was a downside to it—some of my colleagues in the atmospheric modeling business accused me of blatant showmanship!

GAM: They were just jealous!

CL: Well, in fact, they later also started making movies.

GAM: Sure!

CL: But I sympathized with them, because I noticed that when I was, in those years, talking about the behavior of the model to any audience, they didn’t seem to care anything about the details of the model. They just wanted to see the movies!

GAM: Now, there was a period where you were working on an IBM 709 with AEOLIS? Do you remember AEOLIS?

CL: That’s right. That was an early model of a sample piece of the atmosphere, before the LARC.

GAM: Yes, I understand, but that was in preparation for the LARC.

CL: Yes, it was done for testing ideas of numerical weather prediction. When I first became interested in modeling the atmosphere, I wanted to try a sample piece of atmosphere to find out whether, in fact, the model of the atmosphere moved anywhere like the way the real atmosphere evolved. And for that reason I acquired the Teletype information from all the weather stations around the United States and set up a rather simple scheme for generating the initial state of the atmosphere in a box over the United States, which then was run in a kind of a forecast mode with this simple, early version of my code.

GAM: You can call it simple, but there was at that point absolutely nothing like it anywhere in the world.

CL: Well, it was comparable to some numerical weather prediction models that had been developed at the Institute for Advanced Study in Princeton.

GAM: And I remember there was another guy we talked to named Cressman.

CL: George Cressman was the man connected with this in the Weather Service.

GAM: Yes, NANWEP or something like that.

CL: Well, there was a Joint Numerical Weather Prediction Unit (JNWPU), which spun out of the von Neumann group’s early work. The JNWPU was joint between the Weather Service, the Navy, and the Air Force. They pooled their resources, and, because it had been shown in principle by earlier work at the Institute for Advanced Study at Princeton that something was probably likely to work here, they got together to try to push this thing through. George Cressman was very much involved in leading that activity, which finally became the numerical weather prediction facility of the Weather Service. This has been carried on ever since, and improved steadily through the years, in Suitland, Maryland.

GAM: But when you were deciphering or pulling information out of these Teletype messages, that was a first. The other people had a human doing it.

CL: Yes, I was doing it by an objective, purely numerical scheme, essentially recording the weather station reports and interpolating between them to generate initial fields for the purposes of this program. That, of course, a bit later was done routinely, but I think that mine perhaps was one of the first relatively crude attempts to automate the whole process from the acquisition of the Teletype reports all the way through the forecast process.

GAM: Really astonishing!

CL: Frankly, I had done that because I wanted to convince myself that there might be some similarity between what the model would project and what actually happened. And I found fairly soon that there was, and so that encouraged me to continue with that process.
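
One simple way to interpolate scattered station reports onto a grid, as described above, is inverse-distance weighting. The sketch below uses that technique purely for illustration. The interview does not say which interpolation scheme was actually used, and the function names, the flat x/y coordinates, and the sample pressures are all made up.

```python
def inverse_distance_value(stations, x, y, power=2.0):
    """Estimate a field value at (x, y) from (sx, sy, value) station reports.

    Each station is weighted by 1 / distance**power, so nearby reports
    dominate. A grid point that coincides with a station takes its value.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in stations:
        dist_sq = (x - sx) ** 2 + (y - sy) ** 2
        if dist_sq == 0.0:
            return value
        weight = 1.0 / dist_sq ** (power / 2.0)
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("need at least one station report")
    return weighted_sum / weight_total


def initial_field(stations, nx, ny):
    """Fill an nx-by-ny grid (unit spacing) with interpolated values."""
    return [[inverse_distance_value(stations, i, j) for i in range(nx)]
            for j in range(ny)]


# Hypothetical surface-pressure reports in millibars at (x, y) positions.
reports = [(0.0, 0.0, 1000.0), (4.0, 0.0, 1020.0)]
field = initial_field(reports, nx=5, ny=1)
```

The resulting grid can then serve as the starting state for a forecast run, with no human analyst drawing the initial map by hand.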

GAM: Well, you can argue that you were at a neat place at a neat time, but nonetheless, you did some incredible stuff. Well, it’s just recognized everywhere. So, we now have the Livermore Atmospheric Model (LAM) running on the LARC. What happened next?

CL: Well, I became interested during the 1960s in the more fundamental aspect of the behavior of the atmosphere. That had to do with treating the large-scale motions of the atmosphere as a turbulent fluid, and trying to understand elements of the turbulence theory associated with this. Now, the turbulence theory involved in this case, because of the size of the global atmosphere and its thinness, is more two-dimensional than it is three-dimensional. So this led to my becoming involved in what’s called two-dimensional turbulence, which we discovered to have properties quite different from those of three-dimensional turbulence, and which served as kind of a prototype for the statistical behavior of the large-scale motions in the atmosphere. And so, much of my interest during the '60s was devoted to these issues of the nature of two-dimensional turbulence.

The 1960s was also when the National Center for Atmospheric Research (NCAR) was set up. About the same time that I was returning to the Livermore Laboratory to run on the LARC computer, other people were moving to Boulder, Colorado, to set up this new national center. And, of course, it’s not surprising that I found myself making more and more frequent visits during the 1960s to the national center, because the work that they were doing there was similar in nature: setting up models of two-dimensional turbulence problems and the like. So, finally, in 1968, I moved there. I left the Laboratory and moved to NCAR, the National Center for Atmospheric Research in Boulder, where I continued to work on these general problems.

GAM: Well, many people say, and I think they were right, that it was absolutely fundamental that you go there. They would have fallen flat on their faces without it.

CL: I think that’s unlikely!

GAM: I read some of the stuff—I think it is not unlikely. I mean, you gave them your years of experience from here and elsewhere.

CL: Well, but I had done that without going there. I had done that during my visits in the '60s, when I had been talking to them about their plans.

After I went to NCAR, within a few years I made a mistake in my own career choices, because I became drawn into administrative activities. I became the Director of the division at NCAR within which the modeling of climate, weather, and ocean circulations was carried out. It was, of course, the activity that I was mostly interested in, and it was strongly dependent on numerical modeling techniques. But that took me away from my own work, because of the responsibilities of being Director of this activity. And I have mixed feelings about it.

GAM: Yes, I understand. I remember. So, you stood that for how long?

CL: That was 1968 that I went, so I was involved in that pretty much during the ’70s. It was toward the end of the '70s—actually 1980—that I was starting to try to extricate myself from administrative responsibilities, and had more time to spend on turbulence issues that I was getting more involved in. But what I was really aiming for was a return, which I finally accomplished in 1983, to the Livermore Laboratory. And that meant that overall I was at NCAR about fifteen years. And, of course, I still keep in close touch with what’s going on at NCAR. I make several visits a year and talk to people there.

GAM: Well, they still talk about you, you know. Do you visit there fairly regularly?

CL: Oh yes. Well, I have been a member of the advisory panel for the Scientific Computing Division at NCAR off and on for about thirty years or so—from before I went to NCAR, while I was at Livermore, during the time that I was at NCAR, and since I’ve been away. And that’s kept me in close touch with their plans for new generations of computers and the like. But, as I say, I returned to Livermore in 1983.

GAM: Well, long before you left here, you were also involved in the development of DAS, and you taught out there.

CL: Well, that’s right. The Department of Applied Science (DAS), Livermore campus, was set up under the leadership of Edward Teller. During the '60s, about a third of my time was spent in teaching a course. It was a graduate department, and the course that I taught dealt with fluid dynamics, turbulence, atmospheric modeling, that sort of thing. And I enjoyed that. In connection with that, I had a number of students who I felt were—

GAM: Pretty good!

CL: Very good, in fact! And that’s always stimulating and exciting to have interactions with really bright students, which I certainly did in that case.

GAM: Well, even before you left to go to NCAR, you had a group here.

CL: Yes, it was a small group of people involved in hydrodynamics problems in general, of maybe twenty or thirty people, something like that. I’ve forgotten the exact number.

GAM: Now, at that point, was there so much administrative interference that you couldn’t get much done?

CL: No, that was much less than what I got involved in at NCAR. At NCAR there were more people involved, more money involved, and times of shrinking budgets, which, of course, is always the hardest part for an administrator.

GAM: I’ve often wondered what happened to all those guys. I remember Maurice Newman. The janitor used to hate him, because he’d walk around his room smoking and putting the cigarette butts out on the floor and stamping them out, and the janitor couldn’t get it clean. And there was a guy by the name of Bob Stinner.

CL: I remember Maurice Newman, because of these problems that you mention, more clearly than almost anyone else from that time.

GAM: Well, going back really far—do you remember, at the LMG meetings, that Edward Teller suggested that everybody should take time out and think about doing something other than bomb physics with the computers? And Bernie Alder and Tom Wainwright came up with this molecular dynamic thing.

CL: Molecular dynamics. These were some of the earliest molecular dynamics computations.

GAM: It seems to me you began fooling around with breeding neurons at that point.

CL: Well, it is true that I did start playing some games, essentially, with simulated neuron collections to look at various ideas about what you might call simulated natural selection, I suppose.

GAM: Well, did you come to any conclusions from that?

CL: Not really anything very profound. It was sort of a game that I played for a while. Neural nodes have exciting and inhibiting connections or synapses. And I was able to find out rather quickly that unless there was a certain ratio of exciting to inhibiting synapses, the thing would either go wild or go dead.

GAM: That’s interesting. Well, I saw your problem through the eyes of Gale Marshall. He was writing some codes for you. I never did hear of this conclusion, though. That’s interesting.

CL: It came about, I think, because there was a machine that we had—

GAM: The 704?

CL: The IBM 704, I think. It was one in which it was relatively easy to do very rapid bit manipulations of the sort that were appropriate for these studies. The neurons were identified essentially by bit collections, which were made up of binary words.

GAM: Well, yes, speaking of binary stuff like that, the LARC and the UNIVAC before it were decimal machines.

CL: They were decimal machines.

GAM: And I think it was a mistake of us here to insist that the LARC be designed as a decimal machine.

CL: I’m not sure that we did insist, but at least—well, the UNIVAC was a decimal machine and the LARC was upward compatible.

GAM: Right.

CL: So the mistake was made on the UNIVAC.

GAM: Well, we didn’t make a mistake there—we just bought it.

CL: We just bought it.

GAM: I remember people insisting that, “By God, when you divide 1 by 10, you’re supposed to get 0.1 for an answer!” Now, in retrospect, you’ve been deeply involved in both kinds of machines—there isn’t any question now, right? The binary machine is the proper design technique?

CL: I don’t care that much, although I’m used to using it. The problem, of course, is that now that we’ve all moved into a time in which we depend on compilers anyway, it doesn’t make much difference.

Well, when I returned to the Livermore Laboratory, I was still interested in computing technology, and within a few years, in particular the issue came up about what, if anything, the Laboratory should do about these massively parallel computers which seemed to be on the horizon. And in the spring and summer of 1988, a Laboratory-wide committee examined this question, and in the fall of 1988 turned out a report saying, yes, the Laboratory should move in this direction. I was chairman of that activity, and so I had to pull together the information that went into that final report. And I believed it. It was only afterwards that it became slowly evident that our initial optimism was not quite borne out by what actually happened.

GAM: Well, maybe premature is perhaps the best—

CL: Perhaps too early. But, of course, by now many people have had a lot of experience with these machines. It was, I think, interesting that it was initially perceived that the problem on parallel computers was going to be the communication of information from one computing node to another computing node. But we became fairly adept at doing that. That’s not where the basic bottleneck appears to be. The basic bottleneck appears to be that the individual microprocessors on a node cannot be fed fast enough. That is, you cannot get information into the registers on a microprocessor fast enough to keep up with the speed at which the arithmetic can be done.

GAM: Seymour Cray discovered that in the mid-70s.

CL: Well, our original argument was that it was going to be more cost-effective to do your computing with mass-produced microprocessors than it was with processors hand-made by Seymour Cray. And I think in principle that was right, but at the time we were counting on greater speeds out of the microprocessors than in fact we are getting yet. For example, the new T3D parallel computer has at its nodes the DEC Alpha chip—a 150-MHz chip—out of which, if we’re lucky, we’re getting on the order of 20 megaflops with the compiled code.

GAM: Oh, God!

CL: So, that is the current challenge—to try to understand how that situation can be improved so that we can get closer to what’s referred to as a peak speed. That is always kind of an ephemeral goal on these things that you’ll never get to, but at least you should be able to get closer than we have.
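
To put the 20-megaflop figure next to a peak speed, the arithmetic below compares it with a nominal peak. The peak rate is an assumption: it supposes the 150-MHz chip can complete one floating-point operation per clock cycle, which the interview does not state.

```python
CLOCK_MHZ = 150.0        # clock rate quoted in the interview
FLOPS_PER_CYCLE = 1.0    # assumption: one floating-point result per cycle
ACHIEVED_MFLOPS = 20.0   # "on the order of 20 megaflops" with compiled code

peak_mflops = CLOCK_MHZ * FLOPS_PER_CYCLE
fraction_of_peak = ACHIEVED_MFLOPS / peak_mflops
cycles_per_flop = CLOCK_MHZ / ACHIEVED_MFLOPS

print(f"peak {peak_mflops:.0f} Mflops, achieved {fraction_of_peak:.0%} of peak")
print(f"about {cycles_per_flop:.1f} clock cycles per floating-point operation")
```

Under that assumption the compiled code reaches roughly an eighth of peak, which is the gap the interview describes as the current challenge.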

GAM: I should hope, yes. Well, I think that this whole area, if it can be separated from a religious aspect, would be an interesting thing to argue, because I’m not so sure that small processors a la the Butterfly 2 and so forth can ever meet your needs. It’s largely because they are mass produced.

CL: The arguments that we made before were that they are mass produced; therefore, there’s much more cost-effective arithmetic. They are mass-produced for an industrial market, incidentally, not for scientific computing, which is not much of a market. And, therefore, you can get cheaper arithmetic on these, but they’re not that strong, and so, therefore, you’ve got to link a lot of them together in parallel architectures in order to get the overall computing power that you need for the big problems that we’re trying to solve.

GAM: But you’ve just finished telling us that arithmetic isn’t the real bottleneck.

CL: Well, eventually you are trying to get arithmetic done. You’re not getting it done on the microprocessors, because the memory bandwidth to the microprocessors is not big enough yet, or people haven’t yet figured out how to use the local cache on the chips. That’s why I call it a “cache-flow” problem. It’s not simply the cache-flow problem; it is also an issue of the loading to and storing from the vector registers, or the registers on the chip, which tends to be slower than carrying out floating-point multiplication, for example.

GAM: Right! Well, at some point I think it would be fun to argue about this whole thing, but I remain more or less convinced that Seymour has the right idea—that you have to design these things all the way. You can’t take parts off a shelf and put them together, and say they’re going to meet the balanced need. Remember, you used that word “balance,” and that was the first—

CL: It is true. For example, if we solve this cache-flow problem, we’ll then have the problem with the internodal communication. It’s an issue of balance. It turns out right now it’s the arithmetic that we’re not getting done fast enough. But if we fix that by a factor of 3 or 4, then we’ll be back in the situation where we’ll start worrying about the communication between nodes. This typically now is taking about fifteen or twenty percent of the total time.
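Editorial illustration: the balance argument above can be checked with a little arithmetic. The sketch below assumes, as a simplification not stated in the interview, that run time splits cleanly into arithmetic and internodal communication, and that speeding up the arithmetic leaves the communication time unchanged. The function names are hypothetical.

```python
def communication_share(comm_fraction, arithmetic_speedup):
    """Share of run time spent communicating after the arithmetic is sped up.

    comm_fraction is the share of the original run time spent on internodal
    communication; everything else is assumed to be arithmetic.
    """
    arithmetic_fraction = 1.0 - comm_fraction
    new_total = comm_fraction + arithmetic_fraction / arithmetic_speedup
    return comm_fraction / new_total


def overall_speedup(comm_fraction, arithmetic_speedup):
    """Whole-program speedup when only the arithmetic part gets faster."""
    arithmetic_fraction = 1.0 - comm_fraction
    return 1.0 / (comm_fraction + arithmetic_fraction / arithmetic_speedup)


# Figures from the interview: communication at 15 to 20 percent of the time,
# arithmetic improved by a factor of 3 or 4.
for comm in (0.15, 0.20):
    for factor in (3.0, 4.0):
        share = communication_share(comm, factor)
        gain = overall_speedup(comm, factor)
        print(f"comm {comm:.0%}, arithmetic x{factor:.0f}: "
              f"communication becomes {share:.0%} of run time, "
              f"overall speedup {gain:.2f}x")
```

Under these assumptions, a communication share of 20 percent grows to half of the run time once the arithmetic is four times faster, which is why communication between nodes would then become the thing to worry about.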

GAM: Do you feel that there are enough people working on this here?

CL: Oh, yes. There are a lot of people doing it, and quite effectively. In atmospheric and climate modeling, the Department of Energy a few years ago initiated a program to move climate models onto massively parallel architectures, so that they could benefit from the increased computing power that was perceived at that time, and is still perceived, to be available from such machines. This so-called CHAMMP program has been supporting a large number of people, not only at the Livermore and Los Alamos laboratories but in the university community and at NCAR, to carry out these programs. And as far as the software development and/or the model conversion (that is, the numerical engineering aspects) are concerned, it has been remarkably successful, it seems to me. The disappointing aspect has been the fact that, as I said before, the machines are not as fast as we hoped they would be at this time. That is what it really comes down to. And another factor of 3 or 4 would be very helpful at this point.

GAM: Now, this factor of 3 or 4—it’s not across the board. You need better access to registers, and you need better techniques for interprocessor communication.

CL: Well, at this stage, you could pick up a factor of 2 or 3 if you got better access to the registers, that is, higher bandwidth between the local cache memory and the registers on the chips.

And it’s possible. On the Alpha chip, for example, for certain loops which don’t make memory references, I have gotten about 40 megaflops, and somebody else writing some assembly-language loops is getting 80. Most people are typically getting 20. So, there are possibilities, which have already been shown, for more effective use of that chip.
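Editorial illustration: the rates quoted here can be set against the 150-MHz clock mentioned earlier in the interview to see how many floating-point operations are completed per clock cycle. This is plain arithmetic on the interview's own figures; the function name and labels are hypothetical.

```python
CLOCK_MHZ = 150.0  # clock rate quoted in the interview for the Alpha chip


def flops_per_cycle(megaflops, clock_mhz=CLOCK_MHZ):
    """Average floating-point operations completed per clock cycle."""
    return megaflops / clock_mhz


# Sustained rates quoted in the interview, in megaflops.
quoted_rates = [
    ("typical compiled code", 20.0),
    ("loops without memory references", 40.0),
    ("hand-written assembly-language loops", 80.0),
]

for label, rate in quoted_rates:
    print(f"{label}: {flops_per_cycle(rate):.2f} operations per cycle")
```

The three cases come out at roughly 0.13, 0.27, and 0.53 operations per cycle, so typical compiled code completes a floating-point operation only about once every seven or eight cycles.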

GAM: Well, yes, that’s true, but the number of times you can really benefit from it in a real physics problem is small.

CL: Not necessarily. It’s a little like the issue of vectorization that we faced years ago, which could speed things up by a factor of 2 or 3, and in particular, of course, the use of STACKLIB is the way we did it on the 6600 and the 7600. And it’s conceivable that STACKLIB could give us, as it did before, a factor of 2 or so, and maybe more, in increased speed.

GAM: Let’s see, now, you came up with the STACKLIB idea after your experiences on the 6600, right?

CL: Yes.

GAM: Come to think of it, though, you did some other stuff on a 6600, which the CDC people were surprised at. You had run something twice rather than believe it.

CL: Well, when the 6600 was first delivered in 1964, it kept making errors in arithmetic fairly frequently. Well, about once every hundred thousand times, but that’s a lot. I recognized this, as a lot of other people did, and of course the CDC engineers were trying to track down what the source of the trouble was. But what I did for a simple piece of an atmospheric model that I was working on at that time was to recompute a time step until two successive passes would agree. These were intermittent errors, so only occasionally did it make a third pass. Almost always, it ran two times. I lost a factor of 2 in speed, but I was running happily while everybody else was scratching their heads trying to figure out how to find the source of the trouble.
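Editorial illustration: the recompute-until-agreement idea can be sketched in a few lines. This is not the original program; the time step, the scripted error, and all names below are hypothetical stand-ins chosen only to show the control flow.

```python
def step_until_agree(step, state, max_passes=10):
    """Recompute one time step until two successive passes give the same result.

    Returns the agreed result and the number of passes that were needed.
    """
    previous = step(state)
    passes = 1
    while passes < max_passes:
        current = step(state)
        passes += 1
        if current == previous:
            return current, passes
        previous = current
    raise RuntimeError("no two successive passes agreed")


def make_flaky_step(bad_calls):
    """Build a hypothetical time step that returns a wrong answer on the
    listed call numbers, imitating an intermittent arithmetic error."""
    count = 0

    def step(state):
        nonlocal count
        count += 1
        result = [0.5 * x + 1.0 for x in state]
        if count in bad_calls:
            result[0] += 1.0  # simulated hardware error
        return result

    return step


# A clean run needs two passes; an error on the first pass forces a third.
print(step_until_agree(make_flaky_step(bad_calls=set()), [2.0, 4.0]))
print(step_until_agree(make_flaky_step(bad_calls={1}), [2.0, 4.0]))
```

The price is the factor of 2 mentioned above, since every step is computed at least twice. The scheme also assumes that the same error does not occur identically on two successive passes, which is reasonable only when errors are rare and intermittent.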

However, the CDC engineers noted that I was every so often telling them that they were having particular difficulties with a certain part, because I could identify that whenever it occurred. And they asked me for the copy of the program card deck (required in those days) which I was running. They made a copy and took it back to the factory, and so they checked out every new 6600 with this particular diagnostic package, which happened to be—though they may not have realized it—a single-column calculation for the behavior of the atmosphere radiation transport and the like.

GAM: That’s great!

CL: That meant that every 6600 that came out of the factory could run my program, but that didn’t help an awful lot because it might not run other people’s.

The errors had to do with the sequence of operations through the stack. Some sequences would tend to give errors; other sequences would not. Compilers by nature generate only a relatively small number of such sequences, and so you could get the compiler-written code working. But if somebody came along and wrote assembly-language code, they might generate a different sequence from which such errors would crop up. Norm Hardy and I, thinking about this once, recognized that if we set up a test sweep that would run through every possible sequence of operations in the stack, it would be fine; it would presumably check out the machine. But it looked as if it might take a full year to run, so it would not be a very useful approach.

GAM: I suppose. So, how early in the use of the 6600 did you and Fred Andrews start talking about STACKLIB?

CL: Well, when Fred and I discussed this, we recognized that you could go in and write stack loops for inner loops for your program, and it would help. But it was a hard job. It was really a serious effort. And the game that you played was an interesting one. But, if I remember correctly, if you finished one of these loops arranging the issuing of instructions and so on to fill up every possible clock cycle, then a week later you’d look back and you couldn’t see what it was that you’d done, because it was such a complicated intermingling of prefetching for the next cycle, and doing arithmetic for this cycle, and storing stuff from the last cycle—all intermingled in the sequence of operations—that it was very hard to recognize what it was that you’d done and/or to know how you could make changes in it.

What Fred did then was to look at certain simple vector operations, such as D = A × B + C, where D, A, B, and C were all vector quantities, write that in the stack loops, which would be very efficient, and then, essentially, package that as a macro operation. Then, when you’re writing a program, you could essentially call one after another of these macro operations to do your arithmetic in a vector mode as it were. And this is a technique that picked up a factor of 2 or more in speed over the other, simpler ways of doing things. It was a technique which was then transported to the people at NCAR, and they rewrote their climate models using the STACKLIB to pick up a factor of 2 in speed on the 6600, which later carried over to the 7600.
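Editorial illustration: the idea of packaging an elementwise vector operation once and then building a computation out of calls to it can be sketched as follows. The routine name `vmuladd` is hypothetical and is not STACKLIB's actual interface, and plain Python of course shows only the programming style, not the hand-scheduled inner loop that gave the speedup.

```python
def vmuladd(a, b, c):
    """Packaged vector operation: d[i] = a[i] * b[i] + c[i] for every i.

    In a library of this kind, the loop body is the part that would be
    hand-tuned once and then reused by every caller.
    """
    if not (len(a) == len(b) == len(c)):
        raise ValueError("vectors must have the same length")
    return [x * y + z for x, y, z in zip(a, b, c)]


# The caller strings packaged operations together instead of writing loops.
a = [1.0, 2.0, 3.0]
b = [4.0, 5.0, 6.0]
c = [0.5, 0.5, 0.5]

d = vmuladd(a, b, c)   # d = a * b + c
e = vmuladd(d, a, b)   # e = d * a + b
print(d)
print(e)
```

The design point is that the difficult scheduling work is done once inside the packaged routine, so the calling program stays readable and can be changed without redoing that work.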

But when the Cray 1 came, with its vectorization capability, much of the benefit of this was lost—was not particularly appropriate for the Cray 1. Those people who had not gone through this tough earlier step picked up a bigger increase in speed, therefore, when they moved to the Cray 1 than did those who had gotten ahead of the game by this particular program.

The STACKLIB was moved to the Cray 1, but it provided less benefit there, because it was competing with fairly decent vectorizing compilers at that stage.

GAM: But now you’re thinking that it’s going to be influential or potentially influential on a T3D?

CL: Well, that’s right. It may be the sort of approach that could be useful again for the microprocessors and the cache-flow problem: the question of getting the most efficient usage, of filling up every clock cycle on these microprocessors in an effective way. Of course, the timing’s a little different on different microprocessors. But there may be enough commonality that, although I don’t know this yet, the exact same sequence that was used before for the 6600 and the 7600 might conceivably give you some benefit on the Alpha chip, for example. After all, it takes a certain amount of time to load, it takes a certain amount of time to store, it takes a certain amount of time to multiply, and the ratio of all those times may not be that much different for the Alpha chip than it was for the 6600. Of course, the absolute times are much faster by a large factor; you know, we’re talking about nanoseconds instead of microseconds.

GAM: It has been a marvelous interview, Chuck, and I’d like to think that we will revisit some of your remarks after seeing those of other persons. I’d like to thank you for these memories.

CL: Okay.

Last modified on March 29, 1996, by Catherine M. Williams. For information about this page contact:
George A. Michael — gam@llnl.gov


UCRL-MI-123855