This comprehensive deep dive into Java classes is as much history as it is education. In Part 1, we focus on class makeup to make classes more intuitive for developers.
Since Sun Microsystems released Java in 1995, countless tutorials and articles have been written about Java classes: how they work, how to write them, best practices, worst practices, and useful tricks. Apart from all these important topics, though, the heart of understanding Java classes is gaining an intuitive understanding of classes in general. This requires building a gut feeling about what classes are and how to use them, and then applying this insight to the Java ecosystem.
In order to complete this daunting task, we will explore the world of classes in Java in four parts:
- Objects and the History of Classes
- Classes and Objects in Java
- Polymorphism and Dependency Injection
Part 1 of this series explores classes in great depth, but the aim of this series is not to provide a technical definition, with formal specifications and mathematical logic. Instead, by the end of this journey, the reader will gain an intuitive understanding of classes and how they apply to Java. This goes beyond black-and-white explanations and includes concepts that are ingrained in Java class development, including polymorphism and dependency injection. Our goal is not to create technical experts, but pragmatic developers.
In order to do so, we must first answer a simple question: What is a class?
The Concept of a Class
Before we can define what a class is, we first have to understand why we need classes to begin with. The purpose of any software system is to model things in the real world, both physical and abstract, and perform some useful work using this model. For example, we can create a cloud-based software system by modeling the servers (physical entities) and the channels and network connections (abstract concepts) and flowing data through this model. Without a mechanism by which to capture these concepts into discrete representations and expedite interactions between these representations, we cannot solve a problem or complete a desired goal in a systematic (algorithmic) manner.
Following this basic premise, we as developers need to use techniques and languages that facilitate this design and modeling process. The object-oriented paradigm, in particular, is based on the capturing of physical and abstract entities into classes. Using this simple capture mechanism, nearly any conceivable idea or concept can be captured and, more importantly, made to do useful work.
What Is a Class?
A class (short for classification) is simply a representation of some entity or thing which is composed of state and behavior. The state of a class is the data associated with it, while the behaviors of a class are the actions that the class can perform. For example, we can classify a simple vehicle by saying that it has a manufacturer name, model name, and production year associated with it. Likewise, we can say that our simple vehicle can accelerate and decelerate. Therefore, we can define state and behavior using the following rule:
The state of a class is what the class has, while the behavior of a class is what the class can do
Following our vehicle example, we can create a simple specification for our class:
Note that almost all programming languages prohibit spaces in state entry names and behavior names. For example, manufacturer name is not a valid state entry name, but manufacturerName is. This latter name is written in camel case (so called for its resemblance to camel humps), where the first word in the name is in lower case and the subsequent words are capitalized (e.g. theDogJumpedOverTheMoon). In class names, by contrast, every word is conventionally capitalized, including the first (e.g. TheWhiteHouse). Since these are long-established conventions, especially in Java, we will follow suit throughout this article.
Progressing beyond conventions, we have achieved a basic description of a vehicle, although it lacks enough detail to be useful. For example, what action is performed when we want to accelerate the vehicle? In this case, we have declared the behaviors of our class but we have yet to define them. A behavior declaration is simply a statement about what behavior can be performed, while a behavior definition is a statement about how the behavior is executed. For example, if we add a new state entry called currentSpeed, we can then define our accelerate and decelerate behaviors:
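As a rough illustration, the two behavior definitions might look like the following in Java. This is only a sketch: Java requires a type for every state entry (a topic taken up later in this article), so the use of int and String here, and the step of 1 per call, are assumptions made for the example.

```java
// Sketch only: the field types and the step size of 1 are assumptions.
class Vehicle {
    // State: what the class has.
    String manufacturerName;
    String modelName;
    int productionYear;
    int currentSpeed = 0;

    // Behavior definition: how accelerating alters the state.
    void accelerate() {
        currentSpeed = currentSpeed + 1;
    }

    // Behavior definition: how decelerating alters the state.
    void decelerate() {
        currentSpeed = currentSpeed - 1;
    }
}
```

Note that both definitions do nothing more than alter the currentSpeed state entry.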
With our behavior definitions, we are able to do some useful work with our class. Notice that the purpose of the behavior definitions is to alter the state of our class. This leads to a general rule about the behavior of a class:
The purpose of the behavior associated with a class is to either access or alter the state of the class
Note that this rule is of such importance that a metric has been devised to measure it: Cohesion. Cohesion is the degree to which the behavior and state of a class relate to one another and work towards a common purpose.
Although we now know how our vehicle will accelerate and decelerate, we have only provided a very simplistic definition which does not account for unexpected action. For example, what if we start with a current speed of 0 and decelerate? In general, the current speed should never drop below 0. This rule about the state of our class is called an invariant: An invariant must remain true before and after the completion of a behavior. We can support this logic by augmenting our existing decelerate definition:
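One way to express this guard in Java is sketched below. The int type and the step of 1 are assumptions carried over for illustration; the important part is the check that keeps currentSpeed from dropping below 0.

```java
// Sketch only: the int type and the step size of 1 are assumptions.
class Vehicle {
    int currentSpeed = 0;

    void accelerate() {
        currentSpeed = currentSpeed + 1;
    }

    // Invariant: currentSpeed >= 0 holds before and after this behavior.
    void decelerate() {
        if (currentSpeed > 0) {
            currentSpeed = currentSpeed - 1;
        }
        // When currentSpeed is already 0, the state is left unchanged.
    }
}
```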
We now have a more mature specification for our class, but it lacks one major element. This deficiency becomes evident when we examine the current speed state entry: In what units is the current speed measured? Is it 1 mile per hour? 1 kilometer per hour? For that matter, we have not even restricted its value to be a measurement of distance per time. Given our specification, the current speed of our vehicle could be in units of sloths. In order to better specify our class, we must associate a type with each state entry.
Before we supply types for our state entries, there is a critical connection that must be made: A type and a class are synonymous. For example, if we were to say that the type of our current speed is a number, a number can itself be represented by a class: A number has a value and can perform actions, such as adding another number to itself or multiplying itself with another number. In essence, we could create a class specification for a number if we wished (albeit a simplified one):
With an understanding of the need for types, we can now update our vehicle class to include properly typed state entries (we will use the notation type stateName, where the name of the type precedes the name of the state entry):
Note that a string is simply a sequence of characters and that we will ignore the units of our current speed for now. For the moment, it will suffice to restrict our current speed to be a number. In the process of describing types, we have run into two issues that must be addressed before we continue adorning and maturing our vehicle class: Primitive types and parameters to behavior.
As we declare the state entries of our class in terms of other classes, we will eventually reach a point where we can no longer reference another class without creating a cyclical hierarchy. For example, if we try to assign a type of Number to our currentSpeed state, how then do we define the class Number? If we try to define it in terms of other classes, how then do we define those classes? Since we cannot allow for a type system that accepts this infinite regression, we must declare a value or a set of values to simply exist. These axiomatic types are called primitive types.
Most programming languages share a basic set of primitive types, including integers, decimal values (such as single-precision floating point values or double-precision floating point values), characters, strings, and boolean values (true or false values). These types exist as a collection of bits without the need to define a regressive class structure. These primitive types can then be used as building blocks to create more complex types in the form of classes. For the sake of our discussion, we will assume that the types
The second issue we must address is that of passing information to our class when we execute behaviors. For example, if we wish to accelerate to 80 miles per hour (assuming that our units are in miles per hour), we would have to successively execute the accelerate behavior of our vehicle class 80 times. This is unwieldy for any real system. Instead, we should be able to instruct our vehicle how much to accelerate by. For example, executing “accelerate by 30” followed by “accelerate by 15” should produce a current speed of 45 miles per hour.
In order to do this, we must be able to declare that our accelerate behavior accepts a value which represents the increased speed as a parameter. For this, we will surround the parameter with parentheses and associate a type with the same notation as for the state of our class (with the same camel case convention). Note that the parameter must be named, or else we would be unable to access or use the parameter within our definition of the behavior. Also note that if a behavior does not have any parameters, we will use an empty set of parentheses. For example, if the behavior drive does not take any parameters, it will be declared as drive(). This results in the following class for our vehicle:
If we now instruct our vehicle to accelerate by 30 (written as accelerate(30)), we would be able to increase the current speed of the vehicle by 30. Since the value of 30 is passed to our behavior at execution, it is termed an argument. Although the terms parameter and argument are closely related, there is an important distinction: A parameter is a value that is referenced in the definition of a behavior while an argument is the actual value passed to the behavior when it is executed.
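The distinction between a parameter and an argument can be sketched in Java as follows. The int type stands in for the article's Number, the clamp in decelerate is one possible way to keep the earlier invariant, and AccelerateDemo is a hypothetical name used only for this example. The new Vehicle() expression creates an object, which is the topic of the next part of this series.

```java
// Sketch only: int stands in for the article's Number type.
class Vehicle {
    int currentSpeed = 0;

    // "amount" is a parameter: a named value referenced in the definition.
    void accelerate(int amount) {
        currentSpeed = currentSpeed + amount;
    }

    // Keeps the invariant that currentSpeed never drops below 0.
    void decelerate(int amount) {
        currentSpeed = Math.max(0, currentSpeed - amount);
    }
}

// Hypothetical demo class, named only for this example.
class AccelerateDemo {
    public static void main(String[] args) {
        Vehicle vehicle = new Vehicle();
        vehicle.accelerate(30); // 30 is an argument: the actual value passed
        vehicle.accelerate(15); // 15 is another argument
        System.out.println(vehicle.currentSpeed); // prints 45
    }
}
```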
The selection of our execution notation is deliberate because it maps the arguments supplied to the declared parameters. For example, accelerate(30) maps the value 30 to the parameter amount (of type Number) during execution. This notation can also be expanded to more than one parameter by enumerating the parameters in a comma-separated list of the form doSomething(Number valueOne, Number valueTwo). The same argument notation can be used, where the arguments are mapped positionally. For example, doSomething(15, 37) will map 15 to valueOne and 37 to valueTwo.
Although we are progressing well in our maturity of the vehicle class, it seriously lacks a true reflection of reality. For example, a real vehicle has an engine and transmission. Our class specification is lackluster if it cannot account for these real-world concepts. In order to remedy this situation, we can create new classes for an engine and transmission:
Since, as previously described, a class is interchangeable with a type, we can update our vehicle class to include state entries that correspond to our engine and transmission. For example, our vehicle class could resemble the following:
By referencing other classes from our vehicle class, we created an association between the vehicle class and those classes referenced within our vehicle class. An association is simply a special type of dependency, where the vehicle class specification depends on the specifications for both the engine and transmission classes.
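A minimal Java sketch of such an association is shown below. The state entry gearRatio is taken from the discussion that follows in this article, while rpm, the chosen types, and the inline creation of the engine and transmission are assumptions for illustration.

```java
// Sketch only: the field names and types inside Engine and
// Transmission are assumptions.
class Engine {
    int rpm; // current revolutions per minute
}

class Transmission {
    double gearRatio; // current gear ratio
}

class Vehicle {
    String manufacturerName;
    String modelName;
    int productionYear;

    // Association: Vehicle depends on the Engine and Transmission
    // specifications because it uses them as the types of its state.
    Engine engine = new Engine();
    Transmission transmission = new Transmission();
}
```

If either the Engine or the Transmission class were removed, the Vehicle class would no longer compile, which is the dependency made concrete.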
Our vehicle class is now starting to realistically model a vehicle, but we are cheating when calculating the current speed of the vehicle. Currently, we have a state entry that simply records the current speed, but that is not how the current speed of the vehicle is obtained. Instead, the current speed of a vehicle is computed as the product of the current revolutions per minute (RPMs) of the engine, the current gear ratio of the transmission, and the circumference of the wheels (there are other factors, such as the final drive ratio, but for the sake of simplicity, we will ignore them).
In order to better define our vehicle class, we should use this more realistic calculation. At the moment, though, we have no way to produce a result from a behavior. For example, if we instruct the vehicle to compute its current speed, we must be able to pass the result to the entity that requested it. To solve this, we introduce a return statement. A return statement allows us to return the result of a computation to the entity that requested it. For example, we can return the result of our current speed computation.
When we return a result, we must know the type of the result. In keeping with the notation used for the state entries, we will place the return type of our behavior before the behavior name. Note that not all behaviors return a result; if a behavior does not return a result, we will specify its return type as void (note the lowercase type name, used to differentiate it from the name of a class that we created, which follows the capitalization convention). Incorporating these updates, we end up with the following class specification for our vehicle:
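In Java, return types and the return statement might look like the sketch below. It follows the simplified product described above; the double type, the field name rpm, and the idea that accelerating raises the engine's RPMs are assumptions made for the example, and units are still ignored.

```java
// Sketch only: types, the name rpm, and the accelerate logic are assumptions.
class Engine {
    double rpm;
}

class Transmission {
    double gearRatio;
}

class Vehicle {
    Engine engine = new Engine();
    Transmission transmission = new Transmission();
    double wheelCircumference;

    // The return type (double) precedes the behavior name.
    double getCurrentSpeed() {
        return engine.rpm * transmission.gearRatio * wheelCircumference;
    }

    // void: this behavior alters state but returns no result.
    void accelerate(double amount) {
        engine.rpm = engine.rpm + amount;
    }
}
```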
Accessing State and Behavior
In defining our getCurrentSpeed behavior, we implicitly accessed the state of our vehicle class. For example, return gearRatio of the transmission denotes that we are accessing the gearRatio state of transmission, which is itself part of the state of our Vehicle class. We can reduce the wordiness of this state access by placing a period between the different levels of state access. For example, return gearRatio of the transmission can be reduced to return transmission.gearRatio.
In the case of the state of the Vehicle class itself, we have no starting point from which to reference. For example, if we try to access wheelCircumference, our notation would devolve to .wheelCircumference. Instead, we will use the term this to denote the current class. For example, accessing the wheel circumference can be done with the notation this.wheelCircumference (for the sake of explicitness, we will also prefix other state accesses with this, e.g. this.engine, in order to show the entire access hierarchy). Updating our class to use this enhanced notation, we obtain the following:
Visibility and Encapsulation
Using our notation, we expose a previously hidden issue with our class specifications: Any state in our classes is accessible from any other class. Although this may appear to be extraneous, there is a serious problem that arises as our classes become more complex. In general, we do not want other classes to directly change the state of our class; instead, we want them to only access the state of our class if and when we allow them.
For example, in the case of our Vehicle class, changing the gear ratio of the transmission and the RPMs of the engine is a delicate process, and if done wrong, could result in the destruction of the transmission or engine. If we allow any class to alter the transmission underneath us, then we lose control over the state of our class. Instead, we need to limit the visibility of our state to other classes.
To do this, we will create two visibility modifiers for each of our state entries: Public and private. Public visibility means that any class, including our own, can access the state entry. Private, on the other hand, denotes that only our own class can access the state entry. Using these qualifications, we can restrict other classes from changing the state of our vehicle class:
In the same way that we qualify state, we likewise have to qualify our behavior definitions with visibility modifiers. In the case of our Vehicle class, we want all other classes to be able to access the behavior of our class; therefore, we set the visibility of these behaviors to public.
By setting the state of our class to private and setting the behavior of our class to public, we essentially reduced our class to the following (from the perspective of other classes):
Notice that other classes access our behavior without knowing its definition. For example, accessing accelerate(10) does not change, even if the definition of the accelerate behavior changes. From the perspective of other classes, the vehicle is accelerating by 10, irrespective of the means by which the vehicle class performs this action.
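Java's private and public keywords correspond directly to the two visibility levels described here. The sketch below is deliberately simplified: it stores the speed in a single private field rather than computing it from the engine and transmission, and Driver is a hypothetical external class introduced only to show what an outside caller can and cannot do.

```java
// Sketch only: a simplified Vehicle with one private state entry.
class Vehicle {
    private int currentSpeed = 0; // hidden from every other class

    public void accelerate(int amount) {
        currentSpeed = currentSpeed + amount;
    }

    public void decelerate(int amount) {
        currentSpeed = Math.max(0, currentSpeed - amount);
    }

    public int getCurrentSpeed() {
        return currentSpeed;
    }
}

// Hypothetical external class that sees only the public behaviors.
class Driver {
    static int drive(Vehicle vehicle) {
        vehicle.accelerate(10); // allowed: accelerate is public
        // vehicle.currentSpeed = 500; // would not compile: currentSpeed is private
        return vehicle.getCurrentSpeed();
    }
}
```

The Driver class keeps working unchanged no matter how accelerate is defined inside Vehicle.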
The reduced appearance of our class to other classes is called the interface of our class. Interfaces are one of the most important concepts in any object-oriented language and provide the means through which other classes interact with our class (i.e. other classes interact with our class through its interface). This concept is illustrated in the figure below.
The combination of the hidden state and the externally visible interface ensures that our class maintains its encapsulation, that is, access to the private portions of the class is possible only through the public portions. Seen from a different perspective, a well-encapsulated class is one in which outside classes can only interact with the class through a well-defined and minimal interface. In general, it is beneficial to hide as much as possible from outside classes, ensuring that these external actors do not change the internal state of the class without the permission of the class. We can sum this up in a simple rule:
Be distrustful of exposing state and behavior to other classes: Exposing state entries or behaviors to other classes should be done deliberately
Proper encapsulation also has an important corollary: We can change the internal logic or state of our class without affecting dependent classes. For example, when we removed the currentSpeed state from the vehicle class, we forced any dependent classes to stop referencing this state. If, instead, we had originally created a behavior that returned the current speed of the vehicle class and hidden the currentSpeed state, no change to any dependent class would have been necessary when the currentSpeed state was removed. For example, if we originally had the following (other details removed for brevity)…
…we could have switched to the following class specification…
…without changing any dependent classes, since the interface of our class did not change (i.e. other classes could have obtained the current speed of our vehicle by accessing getCurrentSpeed() just as they did before, unaware that the actual logic of the getCurrentSpeed behavior changed).
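The before-and-after situation can be sketched in Java as two versions of the same class. The names VehicleBefore and VehicleAfter are hypothetical and exist only so both versions can sit side by side; in practice it would be a single Vehicle class edited over time. The field names and the hard-coded values are likewise assumptions for illustration.

```java
// Sketch only: "before" version, which stores the speed directly.
class VehicleBefore {
    private double currentSpeed = 60.0;

    public double getCurrentSpeed() {
        return currentSpeed;
    }
}

// Sketch only: "after" version, which computes the speed instead.
class VehicleAfter {
    private double rpm = 120.0;
    private double gearRatio = 0.25;
    private double wheelCircumference = 2.0;

    // Same name, same return type, no parameters: the interface is unchanged.
    public double getCurrentSpeed() {
        return rpm * gearRatio * wheelCircumference;
    }
}
```

A caller written against getCurrentSpeed() needs no changes when the stored state entry is replaced by a computation.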
With our state entries now privatized, how can we allow other classes to access or change this information? One of the most common techniques to resolve this problem is the introduction of getters and setters. A getter is a behavior that accesses the value of some internal state from a class, while a setter is a mutator that allows the internal state of a class to be changed. For example, a pair of getters and setters would resemble the following for the manufacturer name of our vehicle (all other details removed for brevity):
Since no external classes can directly access our manufacturerName state, we have guarded how our manufacturerName state is changed. For example, an external class can only change the value of manufacturerName by using our setter. This ensures that we have full control over how and when this state is changed. If another class provides an invalid value (such as an empty name), then we can reject the change, ensuring our manufacturerName state remains in a stable state. If we allowed for direct access to this state, any class could have set the value of manufacturerName to any value it pleases, destroying the stability of the internal state of the Vehicle class. This is where the criticality of encapsulation appears: We create a barrier between the internal state of our class and the outside world (other classes).
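A guarded setter of this kind could be sketched in Java as follows. Rejecting the value by throwing IllegalArgumentException is one common choice rather than the only one (silently ignoring the value is another), and the default name "Unknown" is an assumption for the example.

```java
// Sketch only: the default value and the rejection strategy are assumptions.
class Vehicle {
    private String manufacturerName = "Unknown";

    // Getter: read access to the hidden state.
    public String getManufacturerName() {
        return manufacturerName;
    }

    // Setter: the only way for other classes to change the state.
    public void setManufacturerName(String manufacturerName) {
        if (manufacturerName == null || manufacturerName.isEmpty()) {
            // Reject invalid values so the state remains stable.
            throw new IllegalArgumentException("Manufacturer name must not be empty");
        }
        this.manufacturerName = manufacturerName;
    }
}
```

Because the check runs before the assignment, a rejected call leaves the previous name untouched.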
As an interesting side note, the hidden state of our Vehicle class can be made read-only, write-only, readable-writable, or unreadable-unwritable, depending on whether a setter or getter is provided. For example, if we want our state to be read-only, we provide only a getter and no setter. If we want our state to be unreadable-unwritable, we provide neither a getter nor a setter. The possibilities for readability and writability using getters and setters are enumerated in the table below:
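|READABILITY/WRITEABILITY|INCLUDE GETTER|INCLUDE SETTER|
|---|---|---|
|Read-only|Yes|No|
|Write-only|No|Yes|
|Readable-writable|Yes|Yes|
|Unreadable-unwritable|No|No|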
Notice that our Vehicle class directly accesses the state of both the Transmission and Engine classes. In order to better encapsulate these classes, we will provide getters and setters for the Transmission and Engine classes and update our Vehicle class to access the state of the Transmission and Engine through the newly added getters and setters:
The last topic we must address before completing our basic class specification is that of extension. With our basic Vehicle class, we have described a general vehicle, but in the real world, there are specializations, such as a car, truck, or bike, that derive from this general class. Each has all of the basic state and behavior of a vehicle but includes more than that basic specification or has intricacies that alter the existing behavior of the general class. In this case, we want a mechanism that allows us to take all of the vehicle specification and extend it to add more functionality or specialize existing functionality. To do this, we will use the extends notation to denote that we are extending an existing class:
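Java uses the same extends keyword. In the sketch below, the kickstand state and its behaviors are hypothetical additions invented to show a specialization adding functionality; the Vehicle class is reduced to a single state entry for brevity.

```java
// Sketch only: a reduced Vehicle and a hypothetical Bike specialization.
class Vehicle {
    private int currentSpeed = 0;

    public void accelerate(int amount) {
        currentSpeed = currentSpeed + amount;
    }

    public int getCurrentSpeed() {
        return currentSpeed;
    }
}

// Bike has everything Vehicle exposes, plus its own additions.
class Bike extends Vehicle {
    private boolean kickstandDown = true; // hypothetical extra state

    public void raiseKickstand() {
        kickstandDown = false;
    }

    public boolean isKickstandDown() {
        return kickstandDown;
    }
}
```

A Bike can be asked to accelerate(12) even though the Bike class never defines that behavior itself.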
It is important to note that a Bike is a Vehicle. This relationship is called inheritance and is an integral part of any object-oriented language. It is also one of the most powerful features that a programmer can use when developing code (the importance of this topic will be explained further later in this series). For the purpose of demonstration, our Bike class can be viewed as the following (by nature of its extension of the Vehicle class):
Although we have introduced a powerful feature to our specification, we have created a serious problem. In our discussion of visibility, we only defined two levels: Public and private. As previously stated, it is important to keep state hidden from the outside world, so we have declared all of the state entries in our Vehicle class as private. While this ensures that no external class can change the internal state of the Vehicle class, it also bars our Bike class from accessing that state. For example, if we wanted to access the engine state entry of our Vehicle from the Bike class, we would be unable to do so, because that state is only accessible to the Vehicle class itself.
In order to remedy this situation, we must introduce a new level of visibility: Protected. A protected state entry or behavior is accessible only to the class itself or to any class that extends that class. This allows us to create state or behavior that cannot be used by any external classes, while maintaining the ability for classes that extend the class, or subclasses, to access the state or behavior. With this new mechanism, we are now able to create a specification for the Vehicle that can be extended by the Bike class:
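A hedged Java sketch of protected visibility follows. One caveat: in Java, protected members are also accessible to other classes in the same package, which is slightly broader than the definition used in this article. The behaviors revEngine and currentRpm, and the rpm state of Engine, are hypothetical names chosen for the example.

```java
// Sketch only: rpm, revEngine, and currentRpm are hypothetical names.
class Engine {
    private int rpm;

    public int getRpm() {
        return rpm;
    }

    public void setRpm(int rpm) {
        this.rpm = rpm;
    }
}

class Vehicle {
    protected Engine engine = new Engine(); // visible to subclasses
    private String manufacturerName;        // visible to Vehicle only
}

class Bike extends Vehicle {
    public void revEngine(int rpm) {
        this.engine.setRpm(rpm); // allowed: engine is protected
        // this.manufacturerName = "Acme"; // would not compile: private to Vehicle
    }

    public int currentRpm() {
        return this.engine.getRpm();
    }
}
```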
At this point, we have matured our class specification enough to model a vehicle relatively well, but we have been discussing this specification in an abstract sense: We have not touched on how a computing system would actually process this specification and do some useful work. That is the topic of Part 2 of this series: Objects. In the next article, we will delve into objects and how they allow us to take our classes and do useful work in a computing environment. We will also take a look at the history of classes and objects and provide some insight into the circumstances and thought processes that brought about the object-oriented revolution.
Article taken from: DZone