This document describes the evolving set of Java coding guidelines identified by Agorics' developers to:
All developers at Agorics are required to adhere to the coding guidelines described in this paper. These guidelines are periodically reviewed by the developers to ensure that they continue to meet the objectives identified above. Suggestions for improvements are always welcome.
All projects at Agorics hold regularly scheduled code reviews in which, among other activities, developers evaluate exceptions to the guidelines, and help each other develop skill in this style of coding. By having standard coding practices, software can be maintained by any of the technical staff. This allows people to move easily among projects and work in areas which are of most interest to them.
Last modified: September 30, 2004
This work is licensed under a Creative Commons License.
The first section focuses on syntactic guidelines, including formatting, variable naming, etc. The second section focuses on design issues.
Category | Rule |
---|---|
Required | These must be followed by production code. Code that violates required guidelines should not be incorporated into any Agorics product. |
Important | These should be followed by production code. Code that violates important guidelines should be justified and commented before incorporation into any Agorics product. |
Guideline | These should be followed by production code unless they strongly conflict with other design constraints. Code that violates guidelines should be examined in review. |
Tradition | These should be followed by production code whenever possible. Code that violates traditions should be examined in review. |
Trial | If feasible, these should be followed by production code. Code that violates trial guidelines should be examined to help review the guideline. |
For non-production code, the above categories also apply, but the rules indicated above are applied much less formally.
Each of the guidelines below will be labelled with a section name and number. If guidelines are rearranged, the numbers will remain the same, for ease of reference, so there may be gaps in the numbering.
Indent code using only tabs. Tabs should be set to four (4) spaces. Most modern IDEs on Mac and Windows platforms are designed for this. They are not designed for combinations of spaces and tabs.
For any control structure that allows braces (while, for, if, etc.), always use braces to enclose the controlled statements. Omitting optional braces is a common source of bugs during later revision, particularly during rapid feature enhancement and maintenance. The problem is that a single statement in the original version of the code gets changed into multiple statements during debugging, feature enhancement, code reuse (by copying), etc.
    // GOOD
    if (x < 0) {
        reportError();
    }
    doSomething(x);

    // BAD
    if (x < 0)
        reportError();
    doSomething(x);
    if (a == b) {
        ...
    } else if (a < b) {
        ...
    } else {
        ...
    }

    for (int x = 0; x < m_count; x++) {
        ...
    }

    switch (val) {
    case FOO:
        ...
        break;
    case BAR:
        ...
        break;
    default:
        ...
        break;
    }

    public class Foo extends Bar {
        ...
    }

    public int foo(int bar) {
    }
    // GOOD
    switch (val) {
    case FOO:
        x = 3;
        break;
    case BAR: {
        int z;    // a local variable
        z = 4;
        x = z;
        break;
    }
    case BAZ:
        ...       // could not access "z" accidentally
        break;
    default:
        ...
        break;
    }
Most debuggers only debug at the line granularity, so a separate line for each statement allows finer-grain stepping through code.
    // Also notice how the continuing comments just
    // continue at the same indentation level, but
    // continuing statements are indented in one more
    // level.
    static final String LONG_STRING =    // "LONG_STRING" is a placeholder name
        "gee this is one super-long string.";
Code is often printed for reviews, examined on small screens, etc. Known-width lines are essential for code readability. Code that breaks at arbitrary syntactic positions is very hard to follow. The leftover ends of unbroken lines break up the flow of the indentation which carries an enormous implicit weight. We prefer to have reasonable layout chosen by the author.
Package declarations, import statements, and the declaration of the outermost class or interface in a file start at the left margin. The opening brace of the class or interface is at the right end of the line or lines containing its declaration. The code inside the class starts out indented 4 spaces from the left margin, and is indented further from that point as called for below.
When a construct is continued after a newline, indent the second line one extra 4-space tab position. If the construct is followed by a brace, the brace is placed on its own line after the continued lines.
    // GOOD
    public int sna(int foo)
        throws myException
    {
        ...
    }
Indent 4 spaces or to the syntactically relevant alignment point for line continuations.
    public void reintegrateParticipants(school.house.Headmaster head,
                                        school.house.Teacher teacher,
                                        school.house.Student[] students)
        ...
or
    if (teacher instanceof school.house.Professor
        && teacher.age() < 30)
    {
        // Indicate surprise...
public class Set { class Set_Enumerator implements Enumerator {
Examples include enumerators for collection classes.
Java 1.1 will not allow such references. Note that references are allowed to non-public classes defined in files named after them.
This discourages the common practice of always adding setters for each getter, resulting in more coherent object abstractions.
Constants (not labelled) | |
Static variables | // CLASS VARIABLES /////////////////// |
Instance variables | // STATE ///////////////////////////// |
Constructors, factory methods, static/inst init | // CREATION ////////////////////////// |
Queries - no change to object's state | // QUERIES /////////////////////////// |
Manipulation - what the object does | // MANIPULATION ////////////////////// |
Internal - implementation details, local classes, ... | // INTERNAL ////////////////////////// |
The order is intended to present the implementation starting with the most global information, and proceeding to the most local. The class's contract should be described in the javadoc comments. Other than STATE, when a category is empty, the label should be omitted.
Use slashes to mark the separators: two slashes before the label, and slashes all the way to the right margin (currently 79 characters), to serve as a visual cue for aligning the code.
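As a rough illustration of the ordering and labels described above, the sketch below lays out a small class section by section. The class `Counter` and all of its members are hypothetical, invented only for this example, and the separator lines are only approximately as wide as the right margin.

```java
/** A hypothetical counter, used only to illustrate the section ordering. */
class Counter {
    static final int DEFAULT_STEP = 1;    // constants come first, unlabelled

    // CLASS VARIABLES ////////////////////////////////////////////////////////

    private static int ms_instanceCount = 0;

    // STATE //////////////////////////////////////////////////////////////////

    private int m_count;
    private final int m_step;

    // CREATION ///////////////////////////////////////////////////////////////

    Counter(int step) {
        m_count = 0;
        m_step = step;
        ms_instanceCount++;
    }

    static Counter withDefaultStep() {
        return new Counter(DEFAULT_STEP);
    }

    // QUERIES ////////////////////////////////////////////////////////////////

    int count() {
        return m_count;
    }

    static int instanceCount() {
        return ms_instanceCount;
    }

    // MANIPULATION ///////////////////////////////////////////////////////////

    void advance() {
        m_count = checkedAdd(m_count, m_step);
    }

    // INTERNAL ///////////////////////////////////////////////////////////////

    /** Add two ints, throwing ArithmeticException on overflow. */
    private static int checkedAdd(int a, int b) {
        return Math.addExact(a, b);
    }
}
```

A reader who opens this file meets the shared state and instance state before any behavior, which is the effect the ordering is meant to produce.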
Constants should be rare. If there are more than a few, consider making a separate Interface to hold them, particularly if they are used throughout a package (or even more widely). When there aren't very many, it's cheap and valuable to put them first. When there are more than a few, this placement may serve as incentive to look for a better representation.
Static variables should be rare and should be prominent since they represent state shared among instances. (A singleton instance would also be declared here. In this case, state isn't shared among instances, but the fact that the class has only a single instance is just as important to make prominently visible.)
They are called CLASS VARIABLES because the obvious choice in Java, STATICS, looks too much like STATE and doesn't provide as good a reminder of the broader scope.
Grouping instance variables together makes them easy to find. Putting them at the beginning of the class is important. Understanding what state is maintained is a crucial part of understanding the implementation. Once you're familiar with a class, the methods are usually of more interest, but those learning about the class benefit from repeated exposure. Since they are kept together it's relatively easy to move on to the code.
When there are no instance variables, that fact is noteworthy: follow the category marker with a comment saying instances have no state. (In an Abstract class or Interface describe the state that implementors are expected to maintain, and how the base class will access it.)
It's much easier to verify that a class follows our approach to object creation (link to appropriate sections) if all the constructors and factory methods are together. In addition, it's far easier to ensure that all instance and static state is correctly initialized if there's a limited area in which the appropriate code can be found.
Other possible names were CONSTRUCTORS, FACTORY, or INITIALIZATION. CREATION seems superior, since it is broader and includes the others. LIFECYCLE was another possibility, but destructors and finalizations appear so seldom that it doesn't seem necessary to emphasize this aspect.
These guidelines encourage the use of immutable objects whenever possible, because their semantics are much easier to reason about. Making these methods be prominent supports this design direction. The practice is also useful with mutable objects: knowing which methods do not modify the state is helpful in understanding what the object represents. (link to the section on naming convention suggestions for accessors.)
Static methods that don't modify class or instance variables belong in this section. They don't have the priority that constants, shared state, or instance state do.
QUERIES is intended to remind us that the methods don't have side-effects, and that the object (or class) is responsible for deciding how to respond.
Methods that serve a common purpose should be grouped together for readability. Methods that are only called in one place should be defined near the method that calls them when possible.
Place internal classes and any internal methods intended to be used throughout the class together at the end of the class definition. Label this section INTERNAL.
Segregating this code makes it straightforward for developers who are new to the code to find the documentation for the code that maintains invariants. It is usually more useful for a first-time reader to see the methods that invoke the object's behavior than methods that do the detailed work. That way they find the purpose of the object first. The code that takes care of the details is best left for later.
If it seems that an internal class (other than anonymous internal classes declared in-line) is too important to hide at the end, consider promoting it to a top-level class with its own file.
Internal methods and classes should be described in javadoc. Explain what invariants they are responsible for maintaining and how they should be used. When external documentation of the class or package is created, it will not include descriptions of private or package-restricted methods or internal classes. Other programmers need to understand the constraints and invariants in order to be able to modify the code; if they are reading the javadoc as html, they'll generate it with the internal information.
//GOOD public int size() {
The reason for this is so that the names of the methods are lined up, allowing easier visual searching.
    switch (c) {
    case 2:
    case 3:
        comSmall();
        break;
    default:
        ...
    }
    switch (val) {
    case foo:
        ...
    case bar:
        ...
    default:
        throw new AssertionFailedException("bad val");
    }
This helps prevent bugs when code is modified later, and catches some cases in which "break;" is left out of cases.
High precedence operators (primary and unary) never have spaces; low precedence operators (conditional, assignment, separators, bitwise, relational, logical) always have spaces; medium precedence operators (arithmetic) may or may not have spaces.
The primary operators ".", "()", and "[ ]" get no spaces before or after them.
    data.m_strValue;
    buf[i] = 'g';
Contents of parentheses (which are primary operators) are not separated from ( and ). Nor should the name of a function be separated from its arguments.
comCheckRequest(reqHandle, idata);
The unary operators !, ~, ++, --, +, and - do not have spaces before or after them.

    ++i;
    if (!isDoorOpen)
The assignment operators (=, +=, -=, etc.) and the ternary operator ?: have a space on either side:

    i = 2;
    i += 3;
    i = (j < MAX) ? MAX : 0;
Comma and semicolon separators are followed by one space.
    for (i = 0; i < MAX; i++, j++) {
        ...
    initData(data, defaults);
Spaces around arithmetic operators are optional if the variable names are short.
    i = arrSize * 2 + 1;         // Preferred
    i = (j+1) - (k+3) - 5;
    j = k*3;                     // Acceptable

The bitwise operators &, |, ~, ^, <<, >> should use parentheses when combined with other medium precedence operators. Use spaces on both sides of bitwise operators.
if ((status & MASK) != SET) { ...
This is so they don't look like procedure calls.
    if (charsPut < limit) {...
    while (charsPut < limit) {...
    for (i = 0; i < MAX; i++) {...
    switch (openCode) {...
    return 0;    // note: these have no spaces before the semicolon.
    break;
    continue;
    return;
    node = (InOrderTraversalNode) currNode.nextInOrder();
In the following code, which converts an array of alternating keys and values into a dictionary mapping, the second "i++" is hidden in the middle of the dictionary code, so it's easy to miss while reading and maintaining the code.

    // BAD
    for (int i = 0; i < keysAndValues.length; i++) {
        dict.put(keysAndValues[i++], keysAndValues[i]);    // note the "i++".
    }
The code should instead read:
    // GOOD
    for (int i = 0; i < keysAndValues.length; i += 2) {
        dict.put(keysAndValues[i], keysAndValues[i + 1]);
    }
to make it clear that the loop steps by two elements.
Side-effects are extremely important actions in programming, particularly in the context of concurrency. Side-effects should be statements, so their order of execution is clearly specified, the side-effecting behavior can be more easily debugged, and they don't interfere with the normal parsing of operator expressions as mathematical, side-effect-free computations.
i.e. do not code:
    if (foo) {
        ... some code ...
    } /* end if (foo) */
Instead just end the construct with the right brace.
It becomes very difficult to maintain such comments. It also makes it much more difficult to copy existing code and use it to start the implementation of more code (because all the comment references need to be changed as well).
    // BAD
    if (isValid = isReady) {
This prevents confusion between "=" and "==".
There seem to be two good reasons and one circumstance that cause developers to sometimes include empty catch blocks. The good reasons are that the logic of the program guarantees that a declared exception can't be thrown in the present situation, or that the exception may sometimes be thrown but doesn't require processing for some reason. The circumstance that leads to empty catch blocks is that someone is in a hurry, wants to get something to compile or run, and wants to deal with the error cases later. This case is the one we most want to defend ourselves against. If all three cases are required to be clearly labelled, we won't assume that unlabelled cases are intentional.
If the exception needs to be caught, but no code is necessary in the catch block, it's imperative that a comment be added explaining why the exception isn't a problem, and why falling through to the code after the catch is correct. Otherwise some later programmer may rearrange the code to get rid of the try-catch, or to attempt to handle some other problem that the exception might signify. Usually a // NOTE comment is sufficient in this case.
If the logic of the program guarantees that the exception won't actually be thrown, and you don't want the exception to propagate beyond the present method, throw a runtime exception of some sort, and add a comment detailing what it is about the circumstances that guarantees that the exception is impossible. If it can't happen, it doesn't cost anything to insert the throw. This protects you from re-use of the code in other situations where the present guarantees don't hold.
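The two acceptable kinds of catch block described above might look like the following sketch. The class `CatchExamples` and its method names are hypothetical; only `Integer.parseInt` and the `String(byte[], String)` constructor are standard library calls.

```java
import java.io.UnsupportedEncodingException;

class CatchExamples {
    /** Parse a port number, keeping the default when the text is malformed. */
    static int portOrDefault(String text, int defaultPort) {
        int port = defaultPort;
        try {
            port = Integer.parseInt(text.trim());
        } catch (NumberFormatException e) {
            // NOTE: malformed input is expected here. Falling through
            // leaves "port" at defaultPort, which is the intended result.
        }
        return port;
    }

    /** Decode bytes as UTF-8 without exposing a checked exception. */
    static String decodeUtf8(byte[] bytes) {
        try {
            return new String(bytes, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Cannot happen: every Java platform is required to support
            // UTF-8. The throw guards against reuse with another charset.
            throw new IllegalStateException("UTF-8 unsupported: " + e);
        }
    }
}
```

In the first method the comment explains why doing nothing is correct; in the second, the throw makes a broken guarantee visible instead of silently continuing.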
Sometimes, during development, it's useful to continue coding without defining the detailed exceptions that will be needed later. In these situations, it's important to leave some warning that the code is incomplete. An explicit note is best, but it's acceptable to add

    // HACK
    throw new Error("unimplemented");

so it's clear that you didn't finish.
It should be acceptable for customers, clients, partners, consultants and other third parties to read our code. In particular, comments on workarounds for third-party software should be strictly factual.
Public class documentation should include detailed information about the contract after the summary sentence.
Public method documentation should include in the following order: @param tags for each parameter, in the same order as the call; an @return tag; and an @exception tag for each exception which can be raised.
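A minimal sketch of the tag ordering follows, written in the comment layout used elsewhere in this document. The `Account` class and its methods are hypothetical and exist only to give the tags something to describe.

```java
class Account {
    private long m_balance = 0;

    /*****************************************************************
        Add an amount to this account.

        @param amount the amount to add; must not be negative.
        @exception IllegalArgumentException if amount is negative.
    *****************************************************************/
    public void deposit(long amount) {
        if (amount < 0) {
            throw new IllegalArgumentException("negative: " + amount);
        }
        m_balance += amount;
    }

    /*****************************************************************
        Withdraw an amount from this account.

        @param amount the amount to withdraw; must not be negative.
        @return the balance remaining after the withdrawal.
        @exception IllegalArgumentException if amount is negative or
            exceeds the current balance.
    *****************************************************************/
    public long withdraw(long amount) {
        if (amount < 0 || amount > m_balance) {
            throw new IllegalArgumentException("bad amount: " + amount);
        }
        m_balance -= amount;
        return m_balance;
    }
}
```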
Interfaces deserve substantial documentation if they are the locus of polymorphism. It is useful to document both what the users of the interface can count on and what implementors are obligated to provide. Since there is no separate place to record implementation comments (e.g. facilities intended to support subclasses, constraints to be maintained) we currently recommend separating these comments explicitly, and if they are substantial (as should be common) placing them on a separate web page (javadoc supports placement of arbitrary html in doc-files/ in the source directory) that is linked from the package doc, but clearly separated from interface comments.
Comments that are provided for methods in interfaces are automatically repeated on the methods that implement them if additional commentary isn't provided for the subclass method.
This is the class template.
    /**********************************************************
        Describe the overview of the class's function, along with
        relationships to instances of other classes.
        <p>
        Describe in more detail the contract of the class.
        Paragraphs are separated by the HTML tag <p> and an extra
        line; each line is indented one tab.

        @see the other sections for information on the specific tags.
    **********************************************************/
    public class Foo ...
Example of class and interface javadoc formatting for the CommandScheduler class.
    package agorics.commands;

    /*****************************************************************
        Serialize the execution of asynchronously scheduled commands
        to provide concurrency without threads. Objects only accessed
        through commands within the same CommandScheduler do not
        require synchronization. Commands scheduled through a
        particular CommandScheduler will be invoked in the order in
        which they were queued. Different scheduling operations queue
        their commands at different times. The basic CommandLoop
        provides for immediately queuing commands. Subclasses support
        time-based queuing, etc. It is not possible to unqueue
        Commands except by unscheduling them. Commands may be queued
        more than once. When a command reaches the head of its queue,
        it is removed and executed.

        @see Command Design Overview
        @see Command
        @see CommandTimer
    *****************************************************************/
    public interface CommandScheduler {
        ...
Example of method Javadoc formatting
    /*****************************************************************
        Schedule a command to be executed as soon as possible. The
        command is immediately enabled.

        Synchronization: All requests must be thread-safe because
        CommandSchedulers are used to coordinate activities between
        threads.

        @param command the command to be scheduled (and enabled).
    *****************************************************************/
    public void schedule(Command command);

Standard editors and IDEs are much better at handling formatting in the above comment syntax (rather than starting each line with an asterisk).
Exceptions are javadoc comments and commented out code blocks. This allows blocks of code to be easily commented out.
This wasn't possible before JDK 1.1, but it makes it possible for the compiler to enforce the fact that the value doesn't change after initialization or first assignment. Unfortunately, some compilers don't support the new form. In these cases, it's acceptable to drop the "final" declaration.
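Assuming the passage above refers to final variables that are assigned once rather than initialized at their declaration, a minimal sketch looks like this. The `Rectangle` class is hypothetical.

```java
class Rectangle {
    // Final fields with no initializer: each must be assigned exactly
    // once in every constructor, and the compiler enforces this.
    private final int m_width;
    private final int m_height;

    Rectangle(int width, int height) {
        m_width = width;
        m_height = height;
    }

    int area() {
        final int area = m_width * m_height;
        return area;
    }
}
```

Adding a second assignment to `m_width` anywhere in the class, or leaving it unassigned in a constructor, is rejected at compile time.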
Some recommended keywords are as follows.
//TODO
Identifies code which needs enhancement or fixing in the near future.
//BUG [bugid] topic
Means there's a known bug here, explain it and give the bug ID or reference number if applicable.
//HACK
Identifies code that works around a problem in the implementation infrastructure (such as the Java runtime or standard libraries, for instance), or that for temporary expedience violates required coding guidelines, low-level design rules, etc. in order to make progress on other parts of the code. All HACK notations should be reviewed regularly, and only infrastructure work-arounds are allowed in released code.
//NOTE
Tells somebody that the following code is tricky, obeys some special constraint, or is otherwise different than they might expect.
It should be acceptable for customers, clients, partners, consultants and other third parties to read our code. In particular, workarounds for third-party software should be strictly factual.
Use the following naming convention:
packagenames | all lowercase |
Interface/ClassNames | capitalized - short with MidCaps |
CONSTANT_VALUES | uppercase separated by underscore (static final variables) |
ms_staticVariables | lowercase with midCaps, prefixed by "ms_" |
m_instanceVariables | lowercase with midCaps, prefixed by "m_" |
methodNames | lowercase with midCaps |
localVariables/argNames | lowercase with midCaps |
Using midCaps shortens the names while still allowing the reader to easily separate the words.
Capitalization will allow these important roles to be easily distinguished, and is consistent with current Java style (e.g., Point, Panel, Vector).
Starting local variables and member variables with lower case letters will allow their role to be easily distinguished. E.g.
    int arrayIndex = 0;
    boolean m_isValid = false;

    boolean m_validity = false;    // BAD - Does false mean valid???
    boolean m_isValid = false;     // GOOD - This is clear.
End factory class names with "Factory".
End exception class names with "Exception".
Don't add "interface" to interfaces.
Instance variables (and static variables) are fundamentally different in character than the more numerous local variables. Lexically distinguishing them with a prefix enhances readability and maintainability of the program. The use of _ in the prefix above sets off the instance variable name so that the prefix does not interfere with reading the instance variable's name. The "m_" stands for "member" or "my".
As with instance variables, static variables are different from the more numerous local variables, and thus should be lexically identified. The prefix "ms_" stands for "member, static".
Iteration methods should be named using an appropriate attribute in the plural form and return an object of type Enumeration. Preferably the method should be called "keys" or "elements".
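A small sketch of such an iteration method follows. The `Roster` class is hypothetical; `Vector.elements()` is the standard library method that returns an `Enumeration`.

```java
import java.util.Enumeration;
import java.util.Vector;

class Roster {
    private final Vector<String> m_students = new Vector<String>();

    void enroll(String name) {
        m_students.addElement(name);
    }

    /** Enumerate the enrolled students in enrollment order. */
    Enumeration<String> elements() {
        return m_students.elements();
    }
}
```

A caller walks the result with `hasMoreElements()` and `nextElement()` without learning how the roster stores its contents.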
If there is an instance variable m_foo, then the reader method is foo(), and the setter method is void setFoo().
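For instance, with a hypothetical instance variable `m_target`, the accessor pair would be named as in this sketch.

```java
class Thermostat {
    private int m_target;

    Thermostat(int target) {
        m_target = target;
    }

    /** Reader: named after the variable, without a "get" prefix. */
    int target() {
        return m_target;
    }

    /** Setter: "set" followed by the capitalized variable name. */
    void setTarget(int target) {
        m_target = target;
    }
}
```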
Provide initial (and preferably final) values for local variables in their definitions. As a corollary, avoid assignments even to local variables when they are unnecessary. Exception: when the value is assigned in the branches of an "if" or inside a "try/catch."
Instead of:
    // BAD: uninitialized local variable
    Button newButton;
    add(newButton = uifactory.makeButton("OK"));
    newButton.addHandler(myOKHandler);
    add(newButton = uifactory.makeButton("Cancel"));
    newButton.addHandler(myCancelHandler);
use:
    // GOOD: locals are always clearly defined
    // OK Button
    Button okButton = uifactory.makeButton("OK");
    add(okButton);
    okButton.addHandler(myOKHandler);

    // Cancel Button
    Button cancelButton = uifactory.makeButton("Cancel");
    add(cancelButton);
    cancelButton.addHandler(myCancelHandler);
Because the code falls into more separable code sequences, the following possible clarification becomes more evident:
    // BETTER: code sequences captured in private method
    addControl(uifactory.makeButton("OK"), myOKHandler);
    addControl(uifactory.makeButton("Cancel"), myCancelHandler);
By initializing variables in the declaration, they never have invalid values, and the name is associated with a value (a good semantic connection) instead of more loosely with a slot that may eventually play a role in the code. Modern compilers can easily determine when a local is no longer used, and not preserve its value beyond that point, so reusing locals does not provide any optimization.
for (int i = 0; i < MAX; i++) { ...
This makes it obvious that the variable isn't used again later.
When the initial value for a variable is explicitly provided, it should be one of the values that the variable is allowed to have; i.e., a value that the program is prepared for.
As part of this guideline, avoid the common C/C++ style of initializing pointers to null. In C++, this was a good idea because it prevented garbage values in uninitialized variables. In Java, this is a bad idea because it prevents the compiler from checking for proper initialization. Object variables should only be initialized to null if null is a legal value for them.
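The compiler check mentioned above can be seen in a sketch like the following, where the local is declared without a dummy value. The `Greeting` class and its constants are hypothetical.

```java
class Greeting {
    static final int NOON = 12;
    static final int EVENING = 18;

    static String greetingFor(int hour) {
        // No dummy initializer: the compiler verifies that every path
        // assigns "greeting" before it is read.
        final String greeting;
        if (hour < NOON) {
            greeting = "Good morning";
        } else if (hour < EVENING) {
            greeting = "Good afternoon";
        } else {
            greeting = "Good evening";
        }
        return greeting;
    }
}
```

Had `greeting` been initialized to null, deleting one of the branches would still compile and the method could return null; as written, the same mistake is a compile-time error.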
Public instance variables expose implementation details and so interfere with software maintenance.
Declare instance variables with either the private keyword so that only the instance can access them, or with no protection category. No protection category means they are package private, and thus other friends in the package can access them. Package private variables should only be accessed by other, helper classes within the same file.
    class QueueLink {
        private QueueLink m_next;
        Object m_value;
    }
Only these access rules provide real encapsulation because protected instance variables can be accessed in any other package through subclassing. Package private access is the way to implement the C++ concept of "friend" in Java.
Pure "getter" methods don't need to be synchronized because they atomically access state. However, if they test for uninitialized and perform initialization, then they may need to be synchronized because they access the variable twice, and they assign it. Therefore, lazy initialization increases synchronization costs. For similar reasons, lazy initialization also increases the costs of persistence and distribution.
The best alternative is to initialize objects explicitly upon creation. When lazy initialization is necessary due to the cost of initializing some typically unneeded value, move the initialization into a separate method (so that at least the assignment can be synchronized separately).
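One possible reading of this advice is sketched below: the cheap value is initialized at creation, and the expensive one is filled in by a separate synchronized method. The `Report` class and the stand-in "rendering" are hypothetical.

```java
class Report {
    private final String m_title;    // cheap: initialized upon creation
    private String m_rendered;       // costly: null means "not yet rendered"

    Report(String title) {
        m_title = title;
    }

    /** Pure getter: needs no synchronization. */
    String title() {
        return m_title;
    }

    /** Lazy getter: delegates the test-and-assign to one method. */
    String rendered() {
        return ensureRendered();
    }

    /** The test and the assignment happen under a single lock. */
    private synchronized String ensureRendered() {
        if (m_rendered == null) {
            m_rendered = "== " + m_title + " ==";    // stands in for costly work
        }
        return m_rendered;
    }
}
```

Here null is a legal value for `m_rendered`, so initializing it implicitly to null does not conflict with the guideline on initial values.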
Static variables are essentially global variables. If more than one independent copy of the object is needed in the future, the use of static variables will have to be rethought.
If necessary, initialize a local variable with the value of the parameter, and then modify the local variable.
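Assuming the guideline here is that parameters should not be reassigned, the suggested workaround might look like this sketch. The `Digits` class is hypothetical.

```java
class Digits {
    static final int DECIMAL_BASE = 10;

    /** Sum the decimal digits of a number, ignoring its sign. */
    static int digitSum(int number) {
        // Working copy: "number" itself is never reassigned, so it
        // still holds the caller's value anywhere in the method.
        int remaining = Math.abs(number);
        int sum = 0;
        while (remaining > 0) {
            sum += remaining % DECIMAL_BASE;
            remaining /= DECIMAL_BASE;
        }
        return sum;
    }
}
```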
All numbers which need to be embedded in the code should be defined in a public static final value.
e.g. final int MASK = 0xfff;
Allow the compiler to help check that these values are really constant.
Even though all members declared in an interface are implicitly public, these declarations get copied into class definitions which then provide implementations. Therefore, the declarations should be as close as possible to how a class would need to declare them. A consequence of this is that methods declared in interfaces should not be declared "abstract" (which is also implicit) because the implementations need to not be "abstract".
Interface fields are implicitly static, public, and final. They cannot be overridden by implementations. Interfaces are often switched to abstract classes during design. Declaring their properties explicitly leads to fewer mistakes when the class they are in is changed to something with similar semantics.
    // GOOD
    public interface Alignment {
        public static final int LEFT = 1;

    // BAD
    public interface Alignment {
        int LEFT = 1;
Previous versions of this standard recommended declaring interface constants with no keywords. The explanation above gives the reason for the change.
These declarations provide a compiler-verified sanity check on possible error conditions.
Overloading on type is based on compile-time type checking, which is different from normal method dispatch. If packages are compiled separately, the wrong method can be resolved. (See "The Java Language Specification"; Gosling, Joy, and Steele; Addison Wesley; Section 13.4.22, page 257.)
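The difference between overload resolution and normal dispatch can be shown with a short sketch. All of the class and method names below are hypothetical.

```java
class Shape { }

class Circle extends Shape { }

class Describer {
    static String describe(Shape shape) {
        return "shape";
    }

    static String describe(Circle circle) {
        return "circle";
    }

    static String describeViaShapeReference() {
        Shape shape = new Circle();
        // The overload is chosen from the declared type of "shape"
        // (Shape), not from the run-time class of the object (Circle).
        return describe(shape);
    }
}
```

`describeViaShapeReference()` returns "shape" even though the object is a `Circle`, which is the surprise the guideline is warning about.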
Exceptions will wreak havoc if they escape a finalize method.
Use the variable directly instead of using a public accessor method unless the accessor provides a specific advantage that is required internally as well (like lazy initialization). Internal methods are permitted to make assumptions about the variable which are not permitted when using an accessor method.
    // BAD
    class Foo {
        public Foo() {}
    }
The redundant constructor prevents the compiler from detecting errors if additional constructors are added later.
This makes them easier to understand, name, use, and re-use. It's almost always appropriate to take a method that does its job in two steps and convert it to calls on two new methods. Java makes this harder than languages like Smalltalk, which means we should look harder for opportunities to make methods smaller.
The main exception is if reliability or consistency requires that whenever one of the methods is called, the other should be called as well. In this case, it's still possible to make the two methods private, and have the method that calls both in the correct order be public.
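The arrangement described for that exception could be sketched as follows: two private steps, with a single public method that always calls them in the required order. The `Ledger` class is hypothetical.

```java
class Ledger {
    private int m_total = 0;

    /** Record an amount. Validation always runs before the update. */
    public void record(int amount) {
        checkAmount(amount);
        addToTotal(amount);
    }

    public int total() {
        return m_total;
    }

    /** Step one: reject amounts the ledger is not prepared for. */
    private void checkAmount(int amount) {
        if (amount <= 0) {
            throw new IllegalArgumentException("not positive: " + amount);
        }
    }

    /** Step two: only ever reached with a validated amount. */
    private void addToTotal(int amount) {
        m_total += amount;
    }
}
```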
There's nothing wrong with methods with a single line of code if they get something coherent done. Methods that are longer than 40 lines should usually be refactored. (Shorter is usually better.) Good style is usually found at the lower end. Dealing with complex environments is often what drives code to longer methods. This problem can sometimes be addressed by building simple abstractions for the underlying environment, though some of the time this just moves the complexity into the implementation of the abstractions. When implementing a reasonably clean abstraction (which is the goal) small methods almost always get the job done more clearly.
Long methods are hard to read and maintain. The reader has to keep more state in mind in order to figure out what variables get re-used in distant places. The usual rule of thumb is that locality of reference to local variables determines the reasonable scope of a method.
Clarity of the code is one of the highest goals. (Usually a higher priority than ease of typing for the author when they conflict.) Simple invocations are easier to read and understand. If the method name and the arguments describe their role in the method clearly, and all fit on a single line (including any assignment of results), then readers of the code will find everything they need on one line. When a method invocation gets too long for a single line, there are several approaches to making everything fit the line length limit without sacrificing clarity. The easiest thing to do is often to simply add line breaks, but the writer's ease isn't the paramount concern: we want the code to be as easy to read as possible.
    // GOOD
    public void requestIntro(Context transCtx, User introduce,
                             Set addrs, MailEnvelope env)
    {
        Address requestor = env.getSender();
        if (addrs.isEmpty()) {
            logAndNotifyEmptyAddress(transCtx, env);
        } else if (isKeyAdmin(requestor)) {
            introRequestFromKeyAdmin(transCtx, requestor, addrs, env);
        } else if (statusOf(requestor).isLocal()) {
            introRequestFromUser(transCtx, requestor, addrs, env);
        } else {
            warnKeyAdminRequestFromCorrespondent(transCtx, env);
        }
    }
Long lines often result from abstractions that are not crisp. This weakness produces methods that require too many arguments, intermediate values that don't have obvious names, etc. This rule causes a format change when lines get long, which calls visual attention to areas that may deserve more effort in order to simplify the code.
If the arguments to a method are long because they involve calculations, clarity can sometimes be improved by naming intermediate results. Notice that the decision to pull out intermediate results provides another opportunity for the author to identify stages in the computation that are semantically interesting. These can often be turned into separate reusable methods. Pulling out intermediate results often makes the opportunities for separating reusable methods more visible, since the shared computations are no longer hidden inside other invocations.
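As an illustration of naming intermediate results, the sketch below computes a price in named stages rather than in one nested expression. The `Pricing` class and its parameters are hypothetical.

```java
class Pricing {
    static final int PERCENT = 100;

    /** Total price in cents after a percentage discount and tax. */
    static long totalCents(long unitCents, int quantity,
                           int discountPercent, int taxPercent)
    {
        // Each stage has a name, so each is a candidate for its own
        // method if it turns out to be needed elsewhere.
        final long subtotal = unitCents * quantity;
        final long discount = subtotal * discountPercent / PERCENT;
        final long taxable = subtotal - discount;
        final long tax = taxable * taxPercent / PERCENT;
        return taxable + tax;
    }
}
```

The single-expression equivalent would bury `taxable`, the one value used twice, inside two separate subexpressions.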
Exception:
When a method invocation has multiple visually complex arguments, it is acceptable to put each argument on a separate line, even when they could fit on a single line. The most common cause is that more than one of the arguments involves infix operators (+ for string append, relational operators, etc.). The commas separating arguments from one another can get visually lost in those cases.
The first argument should appear on the same line as the name of the method and the open paren so the list of arguments has an obvious visual relation to the method name. Each of the other arguments should start in the same column as the first argument (directly after the open parenthesis). This makes it visually obvious that they all fall within the scope of those parentheses. A close parenthesis with semicolon MUST go on the end of the line containing the final argument. (Not on a separate line.)
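The layout described above might look like the following sketch. The ArgumentLayout class and its methods are hypothetical and exist only to show the formatting.

```java
// Hypothetical example: names are illustrative only.
class ArgumentLayout {
    static String describe(String label, boolean underLimit, int remaining) {
        return label + ":" + underLimit + ":" + remaining;
    }

    static String report(String first, String last, int used, int limit) {
        // Several arguments contain infix operators, so each argument goes
        // on its own line. The first argument follows the open paren, the
        // others start in the same column, and ");" ends the final line.
        return describe(first + " " + last,
                        used < limit,
                        limit - used);
    }
}
```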
Assign the results to a temporary variable, and use the (shorter) variable name instead.
An opaque reference is one that is specific to a type, but that does not require any particular property of that type. In the example below, Vectors are created and manipulated (the size() message), so this code depends on the Vector protocol and would be affected by a change to it; Vector must therefore be imported. Conversely, though Images seem pretty central, no part of the code below sends a message to any Image. Images are only passed in and passed on, so the code needs to designate the Image type, but does not depend on the Image protocol at all. Thus, the reference to Image is opaque, and the inability to mention the messages supported by Images is an encapsulation improvement.
    // BAD
    import java.awt.Dimension;  // BAD: not even mentioned!
    // BAD: nothing depends on Image's protocol
    import java.awt.Image;

    public class ImagesHolder {
        // BAD: non-opaque use of Vector without importing
        java.util.Vector m_frames = new java.util.Vector();
        Image m_picture;

        public ImagesHolder(Image picture) {
            m_picture = picture;
        }

        public Image picture() {
            return m_picture;
        }

        public int frameCount() {
            return m_frames.size();
        }
    }

    // GOOD
    import java.util.Vector;

    public class ImagesHolder {
        Vector m_frames = new Vector();
        java.awt.Image m_picture;

        public ImagesHolder(java.awt.Image picture) {
            m_picture = picture;
        }

        public java.awt.Image picture() {
            return m_picture;
        }

        public int frameCount() {
            return m_frames.size();
        }
    }
While this guideline can result in awkwardly long variable declarations, it allows automated tools to recompute "make" dependencies, generate class dependency diagrams, etc.
    // BAD
    import java.util.*;
Group imports can result in a previously legal source file becoming incorrect because another package adds a class. It is also easy to pick up a class from the wrong package by accident this way. Finally, explicit imports make it possible to find all referenced classes from the static contents of the source file.
Exception:
Test components, and components that by definition use all or most of the files in a package, may use the package import mechanism if the number of classes to import would otherwise be unmaintainable.
The names of private classes should begin with the name of their public class.
    public Dictionary
    private DictionaryEnumerator
This ensures that private classes in different files within a package won't conflict, and makes it clear where to go for source.
Concurrency control is far more complicated than the thread primitives provided by Java indicate. Even advanced books on Java show trivial code examples that subtly lead to deadlock. Deadlock is an emergent phenomenon caused by interacting components. Many subtle issues must be addressed when using any synchronization mechanism. The concurrency control guidelines in this section reduce, but cannot eliminate, the deadlock possibilities inherent in multithreading. As a result, the primary recommendation for multithreading is to avoid it.
The "synchronized" keyword can be used to coordinate multi-threaded access to objects and data.
The "synchronized" keyword introduces substantial performance overhead and potential for deadlock. While important for abstractions designed specifically for multi-threaded control, it is expensive when used otherwise.
Synchronized methods are equivalent to using synchronized blocks on the entire body of the method with this as the object synchronized upon. Instead, use synchronized blocks explicitly, and use an internal object rather than this.
Synchronized methods are defined as in

    // BAD
    public synchronized int size() {
        ...
    }

The same method using blocks could be written as

    public int size() {
        synchronized (m_array) {
            ...
        }
    }
Synchronized methods are a misleading convenience. Deadlock management often requires only portions of methods to be synchronized, but the method attribute makes it too easy to just synchronize the entire method, thus introducing potential deadlock. An example is when an object changes (concurrently) and broadcasts to its Observers or Reactors. This broadcast must be executed outside the synchronization. This also makes the key object explicit, so that using this is an explicit statement that external clients can cause the receiver to lock.
Use of an internal object also generalizes to allow shared coordination among several cooperating objects.
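The following sketch shows one way to combine these points: a private lock object, a synchronized block that covers only the state change, and a broadcast that runs after the lock is released. Counter, CounterObserver, and their methods are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example: Counter and CounterObserver are illustrative only.
interface CounterObserver {
    void countChanged(int newCount);
}

class Counter {
    // Internal lock object: external clients cannot lock it.
    private final Object m_lock = new Object();
    private final List<CounterObserver> m_observers =
        new ArrayList<CounterObserver>();
    private int m_count = 0;

    void addObserver(CounterObserver observer) {
        synchronized (m_lock) {
            m_observers.add(observer);
        }
    }

    int increment() {
        int newCount;
        List<CounterObserver> snapshot;
        // Only the state change and the copy of the observer list are
        // synchronized.
        synchronized (m_lock) {
            m_count = m_count + 1;
            newCount = m_count;
            snapshot = new ArrayList<CounterObserver>(m_observers);
        }
        // The broadcast runs outside the synchronization, so an observer
        // that calls back into this Counter does not hold up the lock.
        for (CounterObserver observer : snapshot) {
            observer.countChanged(newCount);
        }
        return newCount;
    }
}
```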
Finalization raises concurrency issues. Its semantics are also expected to change in future versions of Java.
This means only for methods that the superclass needs from the subclass, not methods the superclass provides for the subclasses' use.
There are three reasons not to use protected for encapsulation:
- the actual protection rules are subtle and confusing.
- other packages can construct a subclass, thus giving them access to the protected resource.
- the protected interface elements are exposed to public clients, cluttering their interface and making it harder to understand.
Use the abstract keyword in the declaration of semantically abstract classes. This is essential if the class has default implementations for all its methods.
    abstract class TreeNode {
        public boolean isLeaf() {
            return false;  // The default
        }
    }
Explicitly identifying abstract classes allows the compiler to ensure that the classes are never instantiated. This results in a more robust program, because if all methods happen to have default implementations, the compiler cannot know to prevent instantiation without the class declaration.
Factories are objects whose primary purpose is to create other objects. Instances are created by sending messages to a factory object, rather than through static factory methods or constructors.
m_ok = uiFactory.makeButton("OK", TheOKCommand);
rather than
m_ok = new MyButton("OK", TheOKCommand);
Factories allow the implementation to be varied at run-time and can implement complex strategies such as caching. They provide variation of implementations at the scope of a package, for example.
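As a rough sketch of how a factory object lets the implementation vary at run-time, consider the following. All of the names here (Button, UIFactory, PlainUIFactory, FancyUIFactory, Dialog) are hypothetical, and the command argument from the fragment above is left out to keep the example short.

```java
// Hypothetical example: every type here is illustrative only.
interface Button {
    String label();
}

interface UIFactory {
    Button makeButton(String label);
}

class PlainUIFactory implements UIFactory {
    public Button makeButton(final String label) {
        return new Button() {
            public String label() { return label; }
        };
    }
}

class FancyUIFactory implements UIFactory {
    public Button makeButton(final String label) {
        return new Button() {
            public String label() { return "[" + label + "]"; }
        };
    }
}

class Dialog {
    private final Button m_ok;

    // The client names no concrete button class; the factory it is handed
    // decides which implementation gets created.
    Dialog(UIFactory uiFactory) {
        m_ok = uiFactory.makeButton("OK");
    }

    String okLabel() {
        return m_ok.label();
    }
}
```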
If factories are not appropriate for instance creation (such as when creating factories), use a static member (usually called make). This technique will still allow implementation of complex strategies such as caching.
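A static make member with a simple caching strategy could be sketched as follows. The Symbol class is hypothetical, and the sketch assumes single-threaded use.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example: Symbol is illustrative only.
// Assumes single-threaded use; the cache is not synchronized.
final class Symbol {
    private static final Map<String, Symbol> CACHE =
        new HashMap<String, Symbol>();

    private final String m_name;

    // Private constructor: clients must go through make().
    private Symbol(String name) {
        m_name = name;
    }

    // Returns the one shared Symbol for each distinct name.
    static Symbol make(String name) {
        Symbol symbol = CACHE.get(name);
        if (symbol == null) {
            symbol = new Symbol(name);
            CACHE.put(name, symbol);
        }
        return symbol;
    }

    String name() {
        return m_name;
    }
}
```

Because clients never call the constructor directly, the caching could later be changed or removed without touching any caller.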
Objects returned from construction (via factories, factory methods, or constructors) should be completely constructed and should meet their contract. This means not relying on clients to properly invoke setup methods to ensure the object's correctness.
The following program illustrates a common problem caused by violating this rule:
    // BAD: This example can raise a null pointer exception.
    public class Foo {
        private Bar m_parent;  // expected to always be a Bar

        // CREATION
        public Foo() {}

        // QUERIES
        public String filename() {
            // BAD: m_parent could be null!
            return m_parent.filename();
        }

        // MODIFIERS
        // BAD: exposes internal state
        void setParent(Bar parent) {
            m_parent = parent;
        }
    }
As programs evolve, messages may get sent to objects before they get "initialization" information from other objects. These messages would then be in error. The example above allows clients to cause a NullPointerException. The correct code:
    // GOOD: only initialized instances are exposed
    public class Foo {
        private final Bar m_parent;  // expected to always be a Bar

        // CREATION
        public Foo(Bar parent) {
            m_parent = parent;
        }

        // QUERIES
        public String filename() {
            return m_parent.filename();
        }
    }
results in only correct instances being provided to clients. It also changes an instance variable to a constant, improving the clarity, correctness, and concurrency aspects of the code.
Exceptions:
When creating circular structures, it may be necessary to have some of the objects be invalid until all the objects are valid. Often this can be hidden by a factory method that creates all the circular objects, and hooks them up.
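One way to hide the temporarily invalid state is sketched below. The RingNode class and its makeRing method are hypothetical. The nodes are incomplete only inside the factory method, so clients never see a node whose link is missing.

```java
// Hypothetical example: RingNode is illustrative only.
final class RingNode {
    private final String m_name;
    private RingNode m_next;  // null only while makeRing() is running

    private RingNode(String name) {
        m_name = name;
    }

    // Creates both nodes and hooks them up before either one is exposed.
    static RingNode makeRing(String firstName, String secondName) {
        RingNode first = new RingNode(firstName);
        RingNode second = new RingNode(secondName);
        first.m_next = second;
        second.m_next = first;
        return first;
    }

    String name() {
        return m_name;
    }

    RingNode next() {
        return m_next;
    }
}
```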
Hiding inherited methods can interfere with upwardly compatible contracts.
Using assertions allows subtle conditions to be checked. Because such subtle sections are more delicate than most code, having such a small regression test built into the code itself can reduce chances of introducing subtle bugs during future development.
Third-party libraries are changing and being extended at a furious pace. When possible, subclassing instead of editing provides more insulation from changes by the third party.
Separate by groups that would evolve together. Use functional cohesion to logically group global data into smaller separate classes.
See the section on polymorphs in the Conceptual Glossary. Subclasses should be compatible with the contract of their superclasses.
Multiple inheritance makes it more difficult to understand exactly what a class is for and what it does. In Java, avoiding multiple inheritance means avoiding implementing more than one interface in a class.
Implement methods that would be protected in a separate object in the same package. The separate object should then be provided to appropriate implementors through other mechanisms.
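There are three reasons not to use "protected" for encapsulation:
- the actual protection rules are subtle and confusing.
- other packages can construct a subclass, thus giving them access to the protected resource.
- the protected interface elements are exposed to public clients, cluttering their interface and making it harder to understand.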
Design objects and methods so that changes to the object's internal state are minimized. The extreme of this is to use immutable data structures (such as collections) and create mostly immutable objects.
Side-effects introduce substantial complexity into programs: they are the source of concurrency synchronization issues; they interfere with many compiler optimizations, and they obscure the relationships and logic in the programming. By avoiding them when possible, the overall quality of the code is improved, and the side-effects that are essential to the program logic are highlighted.
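A small sketch of an immutable object is shown below. The Point class is hypothetical. Instead of modifying the receiver, the "modifying" operation returns a new instance, so no client can observe a Point changing underneath it.

```java
// Hypothetical example: Point is illustrative only.
final class Point {
    private final int m_x;
    private final int m_y;

    Point(int x, int y) {
        m_x = x;
        m_y = y;
    }

    int x() {
        return m_x;
    }

    int y() {
        return m_y;
    }

    // No side-effect on the receiver: the result is a new Point.
    Point translatedBy(int dx, int dy) {
        return new Point(m_x + dx, m_y + dy);
    }
}
```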
The state exposed through "getters" of an object should only include data which is part of the object's semantics. For example, the user of a hash table may need to know the number of elements, but not the number of buckets. This reduces the potential implementation dependencies, making code easier to maintain.
These methods allow external side-effects when the abstraction may not need them. It is better to set the initial values of instance variables as part of instance creation.
There are several design concepts which motivate many of these guidelines.
There should be a clear contract between an object and its users. Objects should state what they do and how they behave in exceptional circumstances. All behaviors should be specified in the contract. As in real life, good contracts make for good relations.
The principal advantage of object oriented design is the separation between use and implementation. This separation allows one to change the implementation without affecting users of that implementation. So long as the new implementations implement the stated contract, they can be freely substituted. Good encapsulation implies that the only exposed public interfaces are those which are necessary to meet the object's contract. In general, instance variables should be private.
Facets further enhance encapsulation by introducing the idea of need to know. Consider the producer/consumer relation. Between the producers and consumers is an object which enqueues and dequeues the data being passed. Since no one object both enqueues and dequeues data, the enqueue method can be placed in a separate object from the dequeue method. The enqueue object is passed to the producers and the dequeue object is passed to the consumers, further isolating them from the implementation. The enqueue and dequeue objects are separate facets of the queue object. One way to implement facets in Java is to use separate methods in the same package.
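The producer/consumer facets might be sketched as follows. This is only one possible approach, using small interfaces and anonymous inner classes; SimpleQueue, Enqueuer, and Dequeuer are hypothetical names, and the sketch assumes single-threaded use.

```java
import java.util.LinkedList;

// Hypothetical example: all names are illustrative only.
// Assumes single-threaded use.
interface Enqueuer {
    void enqueue(Object item);
}

interface Dequeuer {
    Object dequeue();  // returns null when the queue is empty
}

final class SimpleQueue {
    private final LinkedList<Object> m_items = new LinkedList<Object>();

    // Facet handed to producers: it can only add items.
    Enqueuer enqueuer() {
        return new Enqueuer() {
            public void enqueue(Object item) {
                m_items.addLast(item);
            }
        };
    }

    // Facet handed to consumers: it can only remove items.
    Dequeuer dequeuer() {
        return new Dequeuer() {
            public Object dequeue() {
                return m_items.isEmpty() ? null : m_items.removeFirst();
            }
        };
    }
}
```

A producer that holds only an Enqueuer has no way to remove items, and a consumer that holds only a Dequeuer has no way to add them.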
Consider the class hierarchy:

                  Object
                    |
          --------------------
          |                  |
         Set               Table
          |                  |
       HashSet        ----------------
                      |              |
                  HashTable        Array
Now assume there is an "add" method in several of the classes: Set.add(), HashSet.add(), Table.add(), Array.add(), and HashTable.add(). However, class Object does not have an add method. We call the add methods in Table, Array, and HashTable polymorphs of each other. Their contract is defined by class Table. Classes Array and HashTable only specify compatible refinements of that basic contract. However, the add method in Set has a separate contract. It is not a part of the polymorph rooted in Table because the common ancestor, Object, does not have an add method. The coincidence of naming does not create the polymorph.
The reactor pattern is similar to the Observer pattern. It differs from that pattern in having a separate method for each specific kind of change.
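The difference can be sketched in a few lines. The Account example and all of its names are hypothetical; the AccountObserver interface is shown only for contrast.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example: all names are illustrative only.

// Observer style, for contrast: one catch-all notification method.
interface AccountObserver {
    void update(Object changed);
}

// Reactor style: a separate method for each specific kind of change.
interface AccountReactor {
    void deposited(int cents);
    void withdrawn(int cents);
}

class Account {
    private final List<AccountReactor> m_reactors =
        new ArrayList<AccountReactor>();
    private int m_balanceCents = 0;

    void addReactor(AccountReactor reactor) {
        m_reactors.add(reactor);
    }

    void deposit(int cents) {
        m_balanceCents += cents;
        for (AccountReactor reactor : m_reactors) {
            reactor.deposited(cents);
        }
    }

    void withdraw(int cents) {
        m_balanceCents -= cents;
        for (AccountReactor reactor : m_reactors) {
            reactor.withdrawn(cents);
        }
    }

    int balanceCents() {
        return m_balanceCents;
    }
}
```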
@see TypeOrMethod
The tag generates a "See Also:" hyperlink reference in the generated documentation. The argument TypeOrMethod to @see can be a package, class, or method. If the argument is a class in the same package, or is imported into the current file, the unqualified classname is sufficient to identify it. Otherwise, the fully qualified form is necessary. With methods, the argument types must be supplied as well.
    /******************************************************************
     Factory for Transforms that do PGP decryption.
     @see Transform
    ******************************************************************/

or

    /******************************************************************
     Specific informative tags that can appear with userIDs.
     @see PGPPublicKeyDescParser#processNameOrTag(PGPPublicKeyDesc, Ascii)
    ******************************************************************/
@param parameterName description
This tag adds the specified parameter and description to the "Parameters:" section of the current method. The description may extend over more than one line. There should be one @param tag for each parameter, and they should occur in the same order as the parameters themselves are declared.
@return description
This tag adds a "Returns:" section with the description to the documentation for the current method. There may be at most one @return tag. [A @return tag should always be present unless the value would be void.]
@exception full-classname description
This tag adds to the documentation for the current method a "Throws:" entry with the full-classname of the exception and the description. There should be an @exception tag for each exception thrown or propagated by the method.
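Taken together, the @param, @return, and @exception tags might be used as in the sketch below. The Averager class and its mean method are hypothetical; the comment layout follows the rows-of-asterisks style of the other examples in this document.

```java
// Hypothetical example: Averager is illustrative only.
final class Averager {
    /******************************************************************
     Compute the integer mean of a group of samples.

     @param total the sum of all the samples
     @param count the number of samples; must be positive
     @return the mean of the samples, truncated toward zero
     @exception java.lang.IllegalArgumentException if count is not
                positive
    ******************************************************************/
    static int mean(int total, int count) {
        if (count <= 0) {
            throw new IllegalArgumentException("count must be positive");
        }
        return total / count;
    }
}
```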
@version description
This tag adds a "Version:" entry. This tag should not be used.
@author description
This tag adds an "Author:" entry with the text to the documentation for the class. This tag should not be used.
@link TypeOrMethod visibleText
@link tags add a link to a Class, Method or Package in the middle of other text. (As distinct from @see, which adds a separate line at the end of the description.) @link must always be surrounded by braces ("{" and "}"). It can take one or two arguments. The first argument is a description of what to link to. It can be a local reference, or a fully qualified reference outside the present package. If there are two arguments within the braces, the second one is used as the text to display. Sun's javadoc documentation describes @link in more detail.
Here's an example of package doc:
    /******************************************************************************
     SimpleKeyExchMgr manages keys, tracking their status through various
     transitions, and providing an interface that a KeyAdmin (either a class,
     or a person) can use to manipulate the keys. We only deal with pending
     and active keys.<p>

     The SimpleKeyExchMgr uses mail messages to communicate with the KeyAdmin,
     and stores all its keys in KeyStores. It defers decisions on format to
     store the keys to {@link KeyHandle}.
    *****************************************************************************/
    public class SimpleKeyExchMgr extends KeyExchMgr {
In this case, the link would display as "KeyHandle". In the following example, the link would display as "Policy":
    /**************************************************************************
     Build a {@link com.agorics.mailroom.process.MailPolicy Policy} entry.
    **************************************************************************/
In general, Agorics doesn't follow the Sun suggested standard of using "box comments" (which, in addition to rows of asterisks before and after the javadoc, include columns of asterisks to the left and right, making a box). These are hard to maintain in some development environments. Unfortunately, javadoc relies on the boxes in one situation: when you want to include formatted code in the javadoc as a comment, and the indentation matters.
The easiest way to do this is using <pre> and </pre>, but javadoc ignores any whitespace at the beginning of a line that doesn't follow an asterisk. So, when you want to include formatted code or other examples in javadoc, place the column of asterisks at the left.
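A comment laid out this way might look like the following sketch. The Doubler class is hypothetical. The column of asterisks at the left keeps the indentation of the lines between the pre tags.

```java
// Hypothetical example: Doubler is illustrative only.
final class Doubler {
    /******************************************************************
     * Double every element of an array in place. Typical use:
     * <pre>
     *     int[] values = { 1, 2, 3 };
     *     Doubler.doubleAll(values);
     *     // values is now { 2, 4, 6 }
     * </pre>
     *
     * @param values the array to modify
     ******************************************************************/
    static void doubleAll(int[] values) {
        for (int i = 0; i < values.length; i++) {
            values[i] = values[i] * 2;
        }
    }
}
```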